Author: Yeh-Liang Hsu, Shu-Gen Wang, Chia-Chieh Yu (20030819);
recommended: Yeh-Liang Hsu (20040322).
Note: This paper is published in Engineering Optimization, Vol. 35, No. 5, October 2003, pp. 489-511.
A sequential approximation
method using neural networks for engineering design optimization problems
Abstract
There are three
characteristics in engineering design optimization problems: (1) the design
variables are often discrete physical quantities; (2) the constraint functions
often cannot be expressed analytically in terms of design variables; (3) in
many engineering design applications, critical constraints are often
“pass-fail”, “0-1” type
binary constraints. This paper presents a sequential approximation method
specifically for engineering optimization problems with the three
characteristics. In this method a backpropagation neural network is trained to
simulate a rough map of the feasible domain formed by the constraints using a
few representative training data. A training data point consists of a discrete design
point and whether this design point is feasible or infeasible. Function values
of the constraints are not required. A search algorithm then searches for the
optimal point in the feasible domain simulated by the neural network. This new
design point is checked against the true constraints to see whether it is
feasible, and is then added to the training set. The neural network is trained
again with this added information, in the hope that the network will better
simulate the boundary of the feasible domain of the true optimization problem.
Then a further search is made for the optimal point in this new approximated feasible
domain. This process continues in an iterative manner until the approximate
model locates the same optimal point in consecutive iterations. A restart
strategy is also employed so that the method may have a better chance to reach
a global optimum. Design examples with large
discrete design spaces and implicit constraints are solved to demonstrate the
practicality of this method.
Keywords: sequential approximation, neural network, nonlinear
discrete-variable optimization, implicit constraint.
1.
Introduction
Engineering
design optimization aims to ensure the quality, performance, and cost of a product during the design phase using optimization algorithms. Engineering design optimization problems have several particular characteristics that distinguish them from general mathematical programming problems:
(1) Their design variables are often discrete physical quantities. In
many engineering applications, variables of the optimization problem must be
selected from a table or a graph of discrete or integer values for practical reasons. For example, mechanical components may only be available in standard
sizes.
(2) The constraint functions in engineering design optimization problems
often cannot be expressed analytically in terms of design variables. Such constraints
are called implicit constraints [1]. Typical examples of implicit constraints are those on structural responses such as stress, displacement,
and natural frequency, and constraints involving manufacturing parameters,
measurement data, etc. They are usually evaluated by computer or by physical experiments.
Thus the evaluation of the implicit constraints is usually the major cost of
the optimization process. Moreover, the analytical forms of the first derivatives
of the implicit constraints with respect to the design variables are usually
not available. Sensitivity of such constraints often has to be calculated
through finite difference schemes, which require many function evaluations, and
often have problems with accuracy.
(3) In many engineering design applications, the final design has to
pass certain critical constraints. These are also implicit constraints, and they are often “pass-fail”, “0-1” type
binary constraints.
In summary, an
engineering design optimization problem can usually be formulated as below:
min. f(x)
s.t. g_{i}(x) ≤ 0, i = 1, 2, …, p
h_{j}(x) ≤ 0, j = 1, 2, …, q
b_{k}(x) = “pass”, k = 1, 2, …, r
x_{i} ∈ D_{i}, i ∈ I_{d} (1)
where x is the vector of design variables, f(x) is the objective function, g_{i}(x) are the explicit constraints, h_{j}(x) are the implicit constraints which cannot be expressed in terms of the design variables, and b_{k}(x) are the “pass-fail” type binary constraints; I_{d} is the set of design variables that can take only discrete values, and D_{i} is the set of allowable discrete values for variable x_{i}.
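The structure of this formulation can be expressed in code. The sketch below is illustrative only: the class, its field names, and the toy constraints are our own assumptions, not part of the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

@dataclass
class DesignProblem:
    """Hypothetical container mirroring Equation (1)."""
    objective: Callable[[Sequence[float]], float]
    explicit: List[Callable] = field(default_factory=list)  # g_i(x) <= 0, cheap to evaluate
    implicit: List[Callable] = field(default_factory=list)  # h_j(x) <= 0, evaluated by simulation/experiment
    binary: List[Callable] = field(default_factory=list)    # pass-fail checks; True means "pass"

    def is_feasible(self, x) -> bool:
        # A design is feasible only if every constraint type is satisfied.
        return (all(g(x) <= 0 for g in self.explicit)
                and all(h(x) <= 0 for h in self.implicit)
                and all(b(x) for b in self.binary))

# Toy instance: minimize x1 + x2 with one explicit and one binary constraint.
toy = DesignProblem(
    objective=lambda x: x[0] + x[1],
    explicit=[lambda x: 10 - x[0] * x[1]],  # g(x) = 10 - x1*x2 <= 0
    binary=[lambda x: x[0] >= 2],           # a pass-fail check
)
print(toy.is_feasible((2, 5)))  # feasible: g = 0 and the binary check passes
```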
There has been
considerable interest in the area of nonlinear discrete optimization. Some review/survey articles on the algorithms for nonlinear optimization problems with mixed discrete variables have been published. Huang and Arora [2] categorized mixed continuous-discrete variable nonlinear optimal design problems into 6
types, depending on the type of variables and problem functions. They are: (1) continuous design variables with functions
twice continuously differentiable; (2) mixed design variables with functions
twice continuously differentiable while discrete variables can have
nondiscrete values during the solution process; (3) mixed design variables
with functions nondifferentiable while discrete variables can have
nondiscrete values during the solution process; (4) mixed design variables
with functions that may or may not be differentiable, while some of the discrete variables must take only discrete values during the solution process; (5) mixed design variables with functions that may or may not be differentiable, while some of the discrete variables are linked to others; (6) nondifferentiable discrete combinatorial problems. The engineering optimization problem in Equation (1) is close to type (3) or (4) stated above, but the existence of the binary constraints further complicates the problem.
Arora and Huang [3] classified the numerical methods for
solving mixeddiscrete nonlinear programming into 6 types: (1) branch and bound
methods; (2) simulated annealing; (3) sequential linearization method followed
by branch and bound, integer programming, simulated annealing, and others; (4)
penalty function approach; (5) Lagrangian relaxation
methods; and (6) other methods such as rounding-off, cutting plane, heuristics, genetic algorithms, pure discrete techniques, etc. Thanedar and Vanderplaats [4] presented a survey of the available methods for discrete variable structural optimization, and divided them into three principal categories: the classical branch and bound method, mixed linearization or approximation methods, and ad-hoc methods, including simulated annealing and genetic algorithms.
Among these methods, the branch and bound method, simulated annealing, and genetic algorithms are suitable for problems with nondifferentiable functions. However, these methods require many function evaluations, which makes them less suitable for engineering optimization problems with implicit constraints.
One important
category of numerical optimization algorithms is the sequential approximation
methods. The basic idea of sequential approximation methods is to use a
“simple” subproblem to approximate the hard, exact problem. By a “simple” subproblem is meant the type of problem that can be readily solved by
existing numerical algorithms. For example, linear programming subproblems are
widely used in sequential approximation methods. The solution point of the
simple subproblem is then used to form a better approximation to the hard,
exact problem for the next iteration. In an iterative manner, it is expected
that the solution point of the simple approximate problem will get closer to
the optimum point of the hard exact problem. One major disadvantage of the
existing sequential approximation methods is that they are usually
derivativebased approximation methods, which require at least the first
derivatives of the constraints with respect to the design variables.
This paper
presents a sequential approximation method specifically for engineering
optimization problems with the three characteristics previously mentioned. A
preliminary concept of this method was first proposed by Hsu et al. [5]. In this method, instead of using first
derivatives of constraints for approximation, a backpropagation neural network
is trained to simulate a rough map of the feasible domain formed by the
constraints using a few representative training data. A training data point consists
of a discrete design point and whether this design point is feasible or
infeasible. Function values of the constraints are not required. A search
algorithm then searches for the “optimal point” in the feasible domain
simulated by the neural network. This new design point is checked against the
true constraints to see whether it is feasible, and is then added to the
training set. The neural network is trained again with this added information,
in the hope that the network will better simulate the boundary of the feasible
domain of the true optimization problem. Then a further search is made for the optimal point in this new approximated feasible domain. This process continues in an
iterative manner until the approximate model locates the same optimal point in
consecutive iterations.
Hopfield and
Tank [6] first extended neural network application to optimization problems. In
the Hopfield model, the optimization problem is converted into a problem of
minimizing an energy function formulated by the design restraints. Watta [7]
utilized similar ideas to propose a coupled gradient network approach,
consisting of two interacting recurrent networks for a class of constrained
mixed-integer optimization problems. The energy function consists of the objective function plus a penalty function to enforce the constraints. A multi-start mode may easily be incorporated into the computation, producing results that are within a few percent of the globally optimal solution.
Lee and Chen [8]
also explored the idea of applying neural network technology in design
optimization. In their approach, neural networks were used to simulate the
constraint functions. Renaud et al. [9] used neural networks to approximate the
discrete system space and repeatedly evaluate the network instead of the actual
design space during optimization. The input/output pairs produced during the
subspace optimizations are used to update the network training, and the optimum
of the network approximation is used as the starting value for the next
iteration. Hajela [10] provided a general
overview of issues pertinent to using nongradient methods in the multidisciplinary design optimization problem, which is characterized by the presence
of a large number of discrete design variables and constraints. He pointed out that the neural network is a special
form of response surface where the response function is a nested squashing
function; the interconnection weights correspond to the regression parameters
in a typical regression model.
Instead of
simulating the function values of objective functions and constraints, the
sequential approximation method using neural networks (the SNA method)
developed in this research only simulates whether a data point is feasible (1)
or infeasible (0) in the optimization problem. One key point of the SNA method
is that the computational cost of the training of the neural network has to be
relatively small compared with that of the function value and sensitivity
calculations of the implicit constraints. Otherwise, using a neural network in optimization will have little computational advantage over traditional numerical optimization algorithms in those cases where traditional algorithms are still applicable. The clear 0-1 pattern of the training data makes the training process relatively fast. A restart strategy is also employed so that the method may
have a better chance to reach a global optimum.
Section 2 of
this paper describes the algorithm of the SNA. Section 3 presents two
verification examples, in which the SNA method finds the global minimum
successfully. In Section 4, three design examples with large discrete design
spaces and implicit constraints are solved by the SNA method to demonstrate its
practicality.
2.
The sequential approximation
method using neural networks
2.1 The structure of the method
As discussed
earlier, the basic idea of sequential approximation methods is to use a simple
subproblem to approximate the hard, exact problem, such as Equation (1). The feasible
domain of the optimization model in Equation (1) can be simulated using a few
representative data to form the “simple” approximated model below:
min. f(x)
s.t. NN(x)=1 (2)
where the binary constraint NN(x)=1 approximates the feasible
domain of the optimization model in Equation (1). If NN(x)=1, the
discrete point x is feasible; if NN(x)=0, the discrete point x
is infeasible. The discrete variable constraints are embedded in the format of
the input nodes of the neural network, as will be described later. Throughout
this paper, the optimization model, Equation (1), will be denoted M_{real}
and the approximate model in Equation (2) will be denoted M_{NN} for
convenience. M_{NN} is a simple approximated model because the
computational cost of evaluating NN(x) is much less than that of
evaluating the implicit constraints in M_{real}.
Figure 1
outlines the sequential approximation method using neural networks. The set of
initial training data S_{0} with at least one feasible point is
given. A backpropagation neural network is trained to simulate a rough map of
the feasible domain, formed by the implicit constraints using S_{0}
to form the initial optimization model M_{NN}. There has to be at least
one feasible point in S_{0}, such that M_{NN} has a
feasible domain.
Starting from the
feasible point s, a search algorithm is used to search
for the solution point x_{j} of M_{NN} in the jth
iteration. The search algorithm discussed later has to start from a feasible
design point. There can be several feasible design points in S_{0},
and the one with lowest objective value is used first. This solution point x_{j}
is then evaluated to see whether it is feasible in M_{real}, and
whether x_{j} = x_{j-1}.
If not, this new solution point x_{j} is added to the set
of training points S, and
the neural network is trained again with the updated data set. Then searching
commences for the solution point in the updated M_{NN} again, using x_{j}
as the starting point if x_{j} is feasible. With the
increasing number of training points, it is expected that the feasible domain
simulated by the neural network in M_{NN} will get closer to that of
the exact model M_{real}, and the solution point of M_{NN} will
get closer to the optimum point.
If the feasible
solution point x_{j} generated in the current iteration is
the same as x_{j-1}, no new training point is generated. The set of training points S, and consequently M_{NN}, is not updated. The current search has to be terminated because the same
solution point will be reached.
To ensure a
better chance to reach a global optimum in the search domain, the next step is
to restart the search from another feasible point in S_{0}, which
has the second lowest objective value. The iterations continue until all
feasible points in S_{0} have been used as initial starting
points. Finally the feasible design point with the lowest objective value in S
is output as the optimal solution.
Figure 1. Flow chart of the algorithm
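The overall loop can be summarized in the following sketch. The function arguments `train`, `search`, and `evaluate_real` are hypothetical stand-ins for the network training, the discrete search, and the expensive evaluation of the true constraints in M_{real}; the details are our own assumptions.

```python
def sna_loop(S0, train, search, evaluate_real, objective):
    """Sketch of the iterative loop.  S0 is a list of (point, feasible?)
    initial training pairs; `train` builds M_NN from labelled points,
    `search` finds the solution point of M_NN from a feasible start, and
    `evaluate_real` performs one expensive check against M_real."""
    S = list(S0)
    model = train(S)                          # initial M_NN
    # Restart strategy: every feasible initial point, cheapest first.
    for start in sorted((x for x, ok in S0 if ok), key=objective):
        x_prev, x = None, start
        while True:
            x_new = search(model, x)          # optimum of the approximate model
            if x_new == x_prev:               # same point in consecutive
                break                         # iterations: stop this search
            feasible = evaluate_real(x_new)   # the only expensive evaluation
            S.append((x_new, feasible))
            model = train(S)                  # retrain with the added point
            x_prev = x_new
            if feasible:                      # searches restart only from
                x = x_new                     # feasible points
    # The best feasible point found in S is reported as the optimum.
    return min((x for x, ok in S if ok), key=objective)
```

In use, `train` and `search` would wrap the neural network and the discrete search; any simple classifier and minimizer can stand in for them to exercise the loop.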
2.2 Defining the set of initial training data using the
three-level orthogonal array
Figure 2 shows a
simple three-bar truss problem that is commonly used in the structural optimization literature to demonstrate various optimization techniques [11]. Here it is
used as an example to illustrate the detailed procedure of the SNA method. The
objective of this problem is to find the optimal values of three
cross-sectional areas A_{1}, A_{2} and A_{3} of the bars that minimize the weight of the structure, subject to maximum stress constraints on the three bars. The horizontal force P can act either to the right or to the left, thus we have a symmetric structure in which A_{1} equals A_{3}.
Figure 2. Three-bar truss
Substituting the values of the parameters r, E, P, L, and s_{max} given in the example, and assuming that all three bars have square cross sections and that the sizes of the raw material available from manufacturers range from 1×1 mm^{2}, 2×2 mm^{2}, and so on up to 17×17 mm^{2}, we have the discrete-variable optimization problem below:
min. f(A_{1}, A_{2})
s.t. g_{1}(A_{1}, A_{2}) ≤ 0
g_{2}(A_{1}, A_{2}) ≤ 0
A_{i} ∈ {1, 4, 9, …, 289}, i = 1, 2. (3)
Equation (3) represents
M_{real} of this problem. For illustrative purposes, g_{1},
g_{2} are assumed to be implicit constraints, which will be
simulated by a backpropagation neural network to form M_{NN}.
Referring to
Figure 1, a set of initial training points S_{0} is used to
train the neural network so that the neural network can simulate a rough map of
the whole design domain. The set of initial training points should reasonably represent
the whole design domain. Various types of matrices are commonly used for
planning experiments to study several input variables. Orthogonal arrays that
were first proposed by Plackett and Burman [12] and Rao [13] are highly popular
because they are geometrically balanced in coverage of the experiment region
with just a few representative experiments. Orthogonal arrays are also adopted
in this research to define the set of initial training data S_{0}.
The numbers of design variables and the defined levels of the variables distinguish a series of orthogonal arrays. In this research, three-level orthogonal arrays are preferred to two-level orthogonal arrays because three-level orthogonal arrays have a higher resolution, especially for nonlinear evaluation. Three-level orthogonal arrays also cover the middle range of the design domain more properly. The number of data points in the three-level orthogonal array is generally determined by the maximum number of
variables in the form L_{i}(3^{j}), where index i denotes the number of data points and j denotes the maximum number of
variables. For example, L_{9}(3^{4}) array is suitable for no
more than 4 variables, and 9 data points are needed in this array; the L_{18}(2^{1}×3^{7}) array is suitable
for not more than 7 variables, and 18 data points are needed in this array. Similarly
L_{27}(3^{13}) array is suitable for not more than 13
variables, and 27 data points are needed in this array.
For the three-bar truss problem illustrated in the previous section, an L_{9}(3^{4}) array is used. The three levels are made by
both end points and one middle point of each variable within its design domain.
Therefore the 9 initial training points in this example are (1, 1), (1, 81),
(1, 289), (81, 1), (81, 81), (81, 289), (289, 1), (289, 81) and (289, 289).
Among the 9 initial training points, (81, 289) and (289, 289) are feasible.
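The construction of these initial training points can be sketched as follows. For two variables, the nine level combinations reduce to the full 3×3 factorial; the helper names are our own.

```python
from itertools import product

def three_levels(values):
    """Both end points and the middle point of a variable's discrete range."""
    return [values[0], values[len(values) // 2], values[-1]]

# Discrete cross-sectional areas of the truss example: squares of 1..17 mm.
areas = [s * s for s in range(1, 18)]   # 1, 4, 9, ..., 289
levels = three_levels(areas)            # [1, 81, 289]

# With two design variables, the nine level combinations reproduce the
# initial training points (1, 1), (1, 81), ..., (289, 289) of the example.
S0 = list(product(levels, repeat=2))
print(S0)
```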
2.3 Training the neural network
Figure 3 shows
the architecture of the threelayer network. The size of the input layer
depends on the number of variables and the number of discrete values of each
variable. In this example, there are 2 design variables, with 17 discrete
values each. Thus a total of 2×17 neurons are used in the input layer. Each
neuron in the input layer has value 1 or 0 to represent the discrete value each variable takes, as will be explained later. There is only a single neuron in
the output layer to represent whether this design point is feasible (the output
neuron has value 1) or infeasible (the output neuron has value 0).
A training point
is represented as shown in Figure 4, where a cross represents a “0” in the node and a circle a “1”. For illustrative purposes,
the 34 input nodes are arranged into two rows in Figure 4 to represent two
design variables. The first 9 nodes of the first row have value 1 to represent
that the first variable A_{1} has the discrete value 81. The
first 9 nodes of the second row also have value 1 to represent that the second
variable A_{2} also has the discrete value 81. Checking with M_{real},
we can determine that (A_{1}, A_{2})=(81, 81) is
an infeasible design point. So the single node in the output layer has the
value of 0 to represent that this point is infeasible. Similarly, Figure 5
shows another initial training point (A_{1}, A_{2})=(289,
289), which is a feasible design point. Note that in this representation,
discrete values of the variables are embedded and neighboring nodes have very
similar input patterns.
Figure 3. The architecture of the neural network
Input layer
Output layer
(infeasible)
Figure 4. Representation of an
infeasible training point {(81, 81), infeasible}.
Input layer
Output layer
(feasible)
Figure 5. Representation of a feasible training
point {(289, 289), feasible}.
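The representation of Figures 4 and 5 can be sketched as a thermometer-style 0-1 encoding; the function name below is our own.

```python
def encode(x, discrete_values):
    """Thermometer-style 0-1 input pattern of Figures 4 and 5: for each
    variable, the first k of its input nodes are set to 1, where the
    variable takes the k-th allowable discrete value.  Neighboring
    discrete values thus produce very similar input patterns."""
    pattern = []
    for value, allowed in zip(x, discrete_values):
        k = allowed.index(value) + 1                  # rank of the discrete value
        pattern += [1] * k + [0] * (len(allowed) - k)
    return pattern

# The 17 allowable areas of the truss example: 1, 4, 9, ..., 289.
areas = [s * s for s in range(1, 18)]
p = encode((81, 81), [areas, areas])                  # the point of Figure 4
print(sum(p[:17]), sum(p[17:]))                       # 9 ones in each row of 17
```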
The number of
neurons in the hidden layer depends on the number of variables in the input
layer. There are 12 neurons in the hidden layer in this example. The transfer
functions used in the hidden and output layer of the network are both
log-sigmoid functions. The neuron in the output layer has a value range [0, 1].
After the training is completed, a threshold value 0.25 is applied to the
output layer when simulating the boundary of the feasible domain. In other
words, given a discrete design point in the search domain, the network output will be 0 if the output neuron’s value is less than the given threshold, or 1 otherwise, to indicate whether this discrete design point is infeasible or feasible.
The
computational effort required in neural network training is critical. If the
computational cost required in neural network training is larger than that of evaluating the implicit constraints, then the SNA method will lose its advantage. Here all the training data are represented in a clear 0-1 pattern, which makes the training process relatively fast. A quasi-Newton algorithm, which is based on Newton’s method but does not require calculation of second derivatives, is used for the training. In our numerical experience, the error goal of 1e-6 is usually met within 100 epochs, even for cases with many training points.
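A rough sketch of this training step is given below. The 12 hidden neurons, the log-sigmoid activations, the quasi-Newton optimizer, and the 0.25 threshold follow the text, but the weight layout, initialization, optimizer options, and the use of SciPy's BFGS routine are our own assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_network(X, y, n_hidden=12, seed=0):
    """Fit a single-hidden-layer network with log-sigmoid activations by a
    quasi-Newton (BFGS) method, so no second derivatives are needed."""
    n_in = X.shape[1]
    n_w = n_hidden * (n_in + 1) + (n_hidden + 1)      # weights plus biases

    def forward(w, Xq):
        W1 = w[:n_hidden * n_in].reshape(n_hidden, n_in)
        b1 = w[n_hidden * n_in:n_hidden * (n_in + 1)]
        W2 = w[n_hidden * (n_in + 1):-1]
        b2 = w[-1]
        h = sigmoid(Xq @ W1.T + b1)                   # hidden layer
        return sigmoid(h @ W2 + b2)                   # single output neuron in [0, 1]

    def loss(w):                                      # mean-squared error
        return np.mean((forward(w, X) - y) ** 2)

    w0 = np.random.default_rng(seed).normal(scale=0.5, size=n_w)
    res = minimize(loss, w0, method="BFGS", options={"maxiter": 200})
    return lambda xq: forward(res.x, np.atleast_2d(xq))

def is_feasible_nn(net, x, threshold=0.25):
    """Apply the output threshold: 1 (feasible) if output >= threshold."""
    return bool(net(x)[0] >= threshold)
```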
After this
initial training is completed, the mathematical optimization model M_{real}
is transformed into M_{NN}, in which the
feasible domain of the optimization model is simulated by NN(x). Figure
6 shows the feasible domain of the threebartruss example simulated by M_{NN}
with 9 initial training points defined by the L_{9} orthogonal array (nodes
enclosed by square brackets in the Figure). There are 289 possible discrete
combinations as shown in the figure. A circle represents that the design is
feasible in M_{NN}, and a cross represents that the design is
infeasible.
It is possible that none of the points in the orthogonal array is
feasible. In that case the whole search domain will be infeasible in M_{NN}.
Therefore at least one feasible point has to be included in the set of initial training data.
Figure 6. The feasible domain simulated by M_{NN}
with 9 initial training points
2.4 Searching in the feasible domain
A search for the
optimum point of M_{NN} is then started. Figure 7 is the flow chart of
this discrete search algorithm. The search algorithm has to start from a
feasible design point. As discussed in the previous section, if a new design
point from the previous iteration is a feasible design point in M_{real},
it is used as the starting point in the current search. If the new design point
is infeasible in M_{real}, then the same starting point used in the
previous iteration is used again in the current search. As discussed earlier, there
has to be at least one feasible design point in the set of initial training
points, so the search algorithm can always start from a feasible design point.
As shown in
Figure 7, a usable set U_{j} is formed from neighboring discrete
nodes of x_{j}. There are m = 3^{n}-1
nodes in U_{j}. Then the components in U_{j} are
sorted by the “distances” to x_{j} in a descending order,
thus U_{j}={u_{1}, u_{2}, u_{3},
…, u_{m}}. Note that by “distance” is not meant the absolute distance |u_{k}-x_{j}|. Instead, it measures the number of variables in u_{k} that have different discrete values from x_{j}. Therefore this sorting is the same for all discrete points, given the total number of design variables, and does not require any computation. The node that has a larger “distance”
to x_{j} has a higher priority to be searched first.
A node u_{k} is picked from U_{j}. Two conditions are checked: “is u_{k}
feasible in M_{NN}?” and “is f(u_{k})<f(x_{j})
?”. If both conditions are satisfied, u_{k}
is a better feasible point than x_{j}, and the algorithm
moves on to u_{k} and starts the next iteration (j=j+1).
On the other hand, if either of the two conditions is not satisfied, then the
next node in U_{j }(k=k+1) is checked. When the
end of U_{j} (k=m) is reached, there is no better
feasible point in U_{j}, and x_{j} is
output as the optimum (at least locally) point.
Figure 7. Flow chart of the search steps
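The search steps can be sketched as below. The feasibility function passed in is a stand-in for the M_{NN} prediction, and the bound-type feasibility test in the usage note is a hypothetical illustration, not the paper's constraints.

```python
from itertools import product

def neighbors(x, discrete_values):
    """U_j: the neighboring discrete nodes of x (each variable stays or moves
    to its adjacent allowable value), sorted so that nodes differing from x
    in more variables are tried first.  At interior points |U_j| = 3^n - 1."""
    choices = []
    for value, allowed in zip(x, discrete_values):
        i = allowed.index(value)
        lo, hi = max(i - 1, 0), min(i + 1, len(allowed) - 1)
        choices.append(sorted({allowed[lo], allowed[i], allowed[hi]}))
    U = [u for u in product(*choices) if u != tuple(x)]
    U.sort(key=lambda u: sum(a != b for a, b in zip(u, x)), reverse=True)
    return U

def discrete_search(x, f, feasible, discrete_values):
    """From a feasible x, repeatedly move to the first neighbor that is
    feasible (in M_NN) and has a lower objective value; stop when none."""
    while True:
        for u in neighbors(x, discrete_values):
            if feasible(u) and f(u) < f(x):
                x = u
                break
        else:                                 # no better feasible neighbor
            return x
```

With a hypothetical stand-in feasibility test requiring A_{1} ≥ 25 and A_{2} ≥ 196 (mimicking the true feasible region near the optimum of this example), a search started from (81, 289) descends to (25, 196).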
This search
algorithm is used to search for the solution point in the feasible domain formed
by the L_{9} initial training points as shown in Figure 6. There are
two feasible design points (81, 289) and (289, 289) in the set of the 9 initial
training points. The feasible design point (81, 289) has a lower objective
value and is first used as the starting point in the search algorithm. The
search algorithm terminates at a new design point (16, 289). Figure 8 shows the
path of the search, from point (81, 289), through (64, 256), (49, 225), (36, 256), (25, 289) to point (16, 289). This point is infeasible when checked with M_{real} in Equation (3). This new design point is added to the set of training points,
and the neural network is trained again. Figure 9 shows the updated domain.
Comparing Figure 9 with Figure 8, the feasible domain of M_{NN} is “pushed”
to the right by the new point {(16, 289), infeasible}.
The search
algorithm is started again to search for the solution point in the updated
feasible domain shown in Figure 9 and terminates at another new design point
(36, 289). The path of this search is from point (81, 289), through (64, 256), (49, 289) to point (36, 289) as shown in Figure 9. This point is feasible when checked with M_{real} in Equation (3). Similarly this new design point
is also added to the set of training points, and the neural network is trained
again. Figure 10 shows the next updated feasible domain simulated by the neural
network. The new feasible design point pushes the feasible domain of M_{NN}
downwards.
Since the
solution point (36, 289) in the previous iteration is a feasible design point,
it is used as the starting point when searching in the feasible domain shown in
Figure 10. As shown by the arrow in Figure 10, the next solution point (25,
289) is obtained in just one step.
Figure 8. The path of the search from point (81, 289) to point (16, 289)
Figure 9. The updated feasible
domain
Figure 10. The next updated domain
2.5 The restart strategy
The iteration
continues with one more training point added into the set of training points in
each iteration. The first search process terminates at a feasible point (25,
196) after 6 iterations, when the same design point is found twice and no new
training point is generated. The objective function value at this point is 2323.6 g, or 48.3% of that of the initial design point (81, 289).
To ensure a
better chance to reach a global optimum, the searching process is restarted
from another feasible point (289, 289) in the set of initial training data. In
this example, the search process terminates at the same feasible point (25,
196) in one iteration. Figure 11 compares the feasible domain of M_{NN}
at the end of the process with the true feasible domain of M_{real} in
Equation (1). As shown in Figure 11, (25, 196) is also the global optimum in M_{real}.
In this example 6 more discrete points are evaluated (nodes enclosed by square
brackets in the figures) besides the 9 initial training points in Figure 6. The
feasible domain of M_{NN} in the neighborhood of the optimum point is
exactly the same as that of M_{real}, because most evaluation points
fall in this region. Figure 12 shows the iteration history of this example; the
points represented by blank circles are the feasible design points. Note that a
total of 15 points out of the 17×17=289 possible combinations
(5.2%) are evaluated to obtain this optimum design. No gradient calculation is
required.
Figure 11(a). The feasible domain of M_{NN}
Figure 11(b). The feasible domain of M_{real}
Figure 12. Iteration history of the three-bar truss
example
3.
Verification examples using the
SNA method
Two design
examples are presented to verify the solution found by the SNA method. These
examples are on the optimal design of mechanical components and are widely used in the literature to demonstrate various algorithms for discrete variable optimization. Although there are no implicit constraints in the examples presented
here, it is useful to verify the results obtained from the SNA method by treating
all constraints as implicit constraints. In both examples, the SNA method finds
the global minimum.
3.1 I-beam design
The minimum
cross-sectional area design of a welded I-beam, subject to bending and compression loads, is considered. As shown in Figure 13, the design variables are the web height h, web thickness t_{w}, and the area of the flange a_{f}.
Figure 13. The cross-sectional area of an I-beam
The objective
function can be formulated as
min. f
= ht_{w} + 2a_{f} (4)
There are two constraints.
The first constraint is that the stress caused by bending and compression loads
should be lower than the ultimate stress of the material:
g_{1} = M/W + N/A - R_{u} ≤ 0 (5)
where M/W is the stress caused by the bending load M, W is the section modulus, N/A is the stress caused by the compression load N, A is the total area of the cross-section, and R_{u} is the ultimate stress.
The constraint
due to web buckling, g_{2}, may be expressed as
g_{2} ≤ 0 (6)
This problem was
discussed by Farkas [14], Yuan et al. [15], and Farkas and Jarmai [16]
to demonstrate various algorithms for nonlinear discrete variable optimization.
For comparison with the results in the literature, the same parameters M = 320 kNm, N = 128 kN, and R_{u} = 200 MPa are used. Table 1 shows the discrete
values for each variable. The set of initial training data according to the L_{9}(3^{4})
orthogonal array is also used in this example. The 9 initial training points in
this example are (74, 0.9, 18), (74, 0.7, 14), (74, 0.5, 22), (70, 0.9, 14),
(70, 0.7, 22), (70, 0.5, 18), (66, 0.9, 22), (66, 0.7, 18) and (66, 0.5, 14).
Among the 9 initial training points, (74, 0.9, 18), (70, 0.7, 22) and (66, 0.9,
22) are feasible, and (70, 0.7, 22) has the
lowest objective value and is first used as the starting point of the search.
Table 1. Discrete values of the cross-section of the I-beam example

           1     2     3     4     5     6     7     8     9
 h        66    68    70    72    74
 t_{w}   0.5   0.6   0.7   0.8   0.9
 a_{f}    14    15    16    17    18    19    20    21    22

The SNA method terminates after 21 iterations, including 2
restarts. Figure 14 shows the
iteration history. Note that a total of 30 design points out of the 5×5×9 = 225 possible combinations (13.3%) are evaluated. The final optimal design point (70, 0.6, 18) was found in the 2nd restart. This is the same design point found in the literature. An exhaustive search of all 225 design points shows that this design point is indeed the global minimum.
Figure 14. Iteration history of the I-beam example
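The exhaustive check mentioned above can be sketched as an enumeration of the 5×5×9 grid of Table 1. The true constraints (5) and (6) involve parameters not fully reproduced here, so the `feasible` function below is a hypothetical placeholder; with the real g_{1} and g_{2} substituted, this loop performs the verification.

```python
from itertools import product

# Grid of Table 1.
h_vals = [66, 68, 70, 72, 74]
tw_vals = [0.5, 0.6, 0.7, 0.8, 0.9]
af_vals = [14, 15, 16, 17, 18, 19, 20, 21, 22]

def area(h, tw, af):
    """Objective of Equation (4): cross-sectional area f = h*t_w + 2*a_f."""
    return h * tw + 2 * af

def feasible(h, tw, af):
    """Hypothetical placeholder for the checks g_1 <= 0 and g_2 <= 0."""
    return True

candidates = list(product(h_vals, tw_vals, af_vals))
best = min((c for c in candidates if feasible(*c)), key=lambda c: area(*c))
print(len(candidates), best, area(*best))
```

With the placeholder always returning True, the minimum is simply the smallest point of the grid; the real constraint functions would exclude the infeasible part of the 225 candidates.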
3.2 Helical compression spring design
The minimum volume
design of a helical compression spring, subject to an axially guided constant compression
load, is now considered. As shown in Figure 15, the ends of the spring are
ground and squared. The design variables are the winding coil diameter d_{w}, wire diameter d, and the number of spring coils, n. The springs
are usually manufactured from commercial wire in stock, and therefore the wire
diameter can only be in a set of discrete values. The number of spring coils
has to be an integer, and the winding diameter is also assumed to be in a set of
discrete values. Siddall [17], Sandgren [18], Lu and Siddall [19], and Kannan
and Kramer [20] used this problem to demonstrate algorithms for nonlinear
discrete variable optimization.
Figure 15. The spring design example
The objective is to minimize the volume of spring wire, which can be formulated as

min. F = π^{2}d_{w}d^{2}(n+2)/4   (7)
The design limitations placed on this model are specified in the following 8
constraints: (1) The shear stress caused by the compression load must not
exceed the allowable maximum shear stress S of the material. (2) The free
length of the spring must not exceed the maximum specified value L_{max}.
(3) The wire diameter must not be less than the specified minimum diameter
D_{min}. (4) The outside diameter of the coil must not exceed the specified
maximum diameter D_{max}. (5) The inner coil diameter must be at least three
times the wire diameter, so that the spring is not too tightly wound. (6) The
deflection under preload δ must be less than the specified maximum deflection
under preload D_{pm}. (7) The combined deflection must be consistent with the
coil free length L_{free}. (8) The deflection from preload to maximum load
must be greater than the specified working deflection D_{w}.
Buckling is not considered because the spring is axially guided. The
derivation of the equations used in formulating the constraints can be found
in Siddall [17]. This problem can be formulated as follows:
min. f = π^{2}d_{w}d^{2}(n+2)/4
s.t. g_{1} = 8KP_{max}d_{w}/(πd^{3}) – S ≤ 0
 g_{2} = 8P_{max}d_{w}^{3}n/(Gd^{4}) + 1.05(n+2)d – L_{max} ≤ 0
 g_{3} = D_{min} – d ≤ 0
 g_{4} = (d + d_{w}) – D_{max} ≤ 0
 g_{5} = 3 – (d_{w} – d)/d ≤ 0
 g_{6} = δ – D_{pm} ≤ 0
 g_{7} = 8P_{max}d_{w}^{3}n/(Gd^{4}) + (P_{max} – P_{load})/K + 1.05(n+2)d – L_{free} ≤ 0
 g_{8} = D_{w} – (P_{max} – P_{load})/K ≤ 0   (8)
where the spring stress correction factor K = (4C–1)/(4C–4) + 0.615/C, the
spring index C = d_{w}/d, and δ is the deflection under preload. P_{max} is
the maximum design load; E and G are the Young's modulus and shear modulus of
the spring material, respectively. The values of the parameters are defined
below [18]: S = 189,000 psi, E = 30×10^{6} psi, G = 11.5×10^{6} psi,
P_{max} = 1000 lb, L_{free} = 14 in, D_{min} = 0.2 in, D_{max} = 3 in,
P_{load} = 300 lb, D_{pm} = 6 in, D_{w} = 1.25 in.
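As a quick sanity check, the objective of Equation (7) and the explicit constraints that need no additional quantities (the shear stress, minimum wire diameter and maximum outside diameter limits) can be evaluated at the optimum reported below. The following Python sketch uses our own variable names; units are lb, in and psi:

```python
import math

# Spring problem parameters (lb, in, psi)
S, P_MAX, D_MIN, D_MAX = 189000.0, 1000.0, 0.2, 3.0

def spring_volume(n, d, dw):
    """Objective of Equation (7): wire volume pi^2 * dw * d^2 * (n + 2) / 4."""
    return math.pi ** 2 * dw * d ** 2 * (n + 2) / 4.0

def explicit_constraints(n, d, dw):
    C = dw / d                                    # spring index
    K = (4 * C - 1) / (4 * C - 4) + 0.615 / C     # stress correction factor
    g1 = 8 * K * P_MAX * dw / (math.pi * d ** 3) - S   # shear stress limit
    g3 = D_MIN - d                                # minimum wire diameter
    g4 = (d + dw) - D_MAX                         # maximum outside diameter
    return g1, g3, g4

vol = spring_volume(7, 0.283, 1.13)
ok = all(g <= 0 for g in explicit_constraints(7, 0.283, 1.13))
print(round(vol, 2), ok)   # → 2.01 True
```

The printed volume matches the minimum volume of 2.010 in^{3} reported for this example.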
Table 2 shows
the discrete values of each variable in the literature. The 9 initial training
points (L_{9}(3^{4}) orthogonal array) in this example are (7,
0.207, 1.13), (7, 0.307, 1.38), (7, 0.438, 2), (10, 0.207, 2), (10, 0.307,
1.13), (10, 0.438, 1.38), (13, 0.207, 1.38), (13, 0.307, 2), (13, 0.438, 1.13).
Among the 9 initial training points, (7, 0.307, 1.38), (7, 0.438, 2), (10,
0.307, 1.13) and (10, 0.438, 1.38) are feasible, and (7, 0.307, 1.38) has the
lowest objective value and is first used as the starting point of the search.
Table 2. Discrete values of each variable

                1      2      3      4      5      6      7      8      9     10     11
 n (no.)        7      8      9     10     11     12     13
 d (in)     0.207  0.225  0.244  0.263  0.283  0.307  0.331  0.362  0.375  0.394  0.438
 d_{w} (in)  1.13   1.18   1.23   1.28   1.33   1.38   1.43   1.53   1.68   1.85   2.00
Figure 16 shows the iteration history using the SNA method, including 3
restarts. The optimal solution point (7, 0.283, 1.13), with a minimum volume
of 2.010 in^{3}, was found in the first restart. During this process, 22 out
of the 7×11×11 = 847 possible combinations (2.6%) are evaluated. An exhaustive
search of those 847 design points finds that the design point (7, 0.283, 1.13)
obtained by the SNA method is the global optimum.
Figure 16. Iteration history with multiple restarts in ascending objective
values
4.
Design examples
Three design examples with large discrete design spaces and implicit
constraints are solved by the SNA method to demonstrate its practicality. The
first example is a pressure vessel design problem that is widely used in the
literature to present various algorithms for discrete variable engineering
optimization. This example does not have implicit constraints; it is used here
to compare the solution obtained by the SNA method, treating all constraints
as implicit, with those obtained by other discrete programming algorithms in
the literature. The second example is the optimal sizing design of a five-step
cantilever beam. This problem has 10 design variables and is used to
demonstrate the limitation of the SNA method when the number of design
variables is large. The third example is the optimal structural design of a
reinforced cylindrical body. This example has two implicit constraints that
have to be evaluated by finite element analysis.
4.1 Pressure vessel design
Figure 17 shows a cylindrical pressure vessel capped at both ends by
hemispherical heads. This compressed air tank has a working pressure of
3,000 psi and a minimum volume of 750 ft^{3}, and is designed in accordance
with the ASME boiler and pressure vessel code. The total cost, including the
material, forming, and single 60-degree welding costs, is to be minimized. The
design variables are the thickness of the shell, the thickness of the head,
and the inner radius and length of the cylindrical section, denoted x_{1},
x_{2}, x_{3}, and x_{4}, respectively. The thickness of the shell must not be
less than 1.1 inches, the thickness of the head must not be less than 0.6
inch, and neither thickness may exceed 2 inches.
Figure 17. The pressure vessel design example
Sandgren [18] first solved this problem using the branch and bound method for
nonlinear integer and discrete variable optimization; Kannan and Kramer [20]
later solved it with an augmented Lagrange multiplier method for mixed
integer, discrete, and continuous optimization problems. The variables are
expressed in inches, as in the literature, so that the results can be
compared. This problem can be formulated as follows:
min. f = 0.6224x_{1}x_{3}x_{4} + 1.7781x_{2}x_{3}^{2} + 3.1661x_{1}^{2}x_{4} + 19.84x_{1}^{2}x_{3}
s.t. g_{1} = –x_{1} + 0.0193x_{3} ≤ 0
 g_{2} = –x_{2} + 0.00954x_{3} ≤ 0
 g_{3} = –πx_{3}^{2}x_{4} – (4/3)πx_{3}^{3} + 1,296,000 ≤ 0
 g_{4} = x_{4} – 240 ≤ 0   (9)
In Equation (9), the objective function is the total cost: the first two terms
express the material and forming costs, and the last two terms the welding
cost. The ASME code imposes limits on the minimum shell thickness (g_{1}) and
spherical head thickness (g_{2}). There are also constraints on the minimum
required volume (g_{3}) and the maximum shell length of the tank (g_{4}).
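The cost and the four constraints of this benchmark, with the constants standard in the literature [18, 20], can be evaluated with a short Python sketch (variable and function names are ours). Here it checks the design point reported by the SNA method later in this section:

```python
import math

# Cost and constraints of the pressure vessel benchmark (inches, dollars).
def cost_and_constraints(x1, x2, x3, x4):
    f = (0.6224 * x1 * x3 * x4 + 1.7781 * x2 * x3**2
         + 3.1661 * x1**2 * x4 + 19.84 * x1**2 * x3)
    g1 = -x1 + 0.0193 * x3                       # minimum shell thickness
    g2 = -x2 + 0.00954 * x3                      # minimum head thickness
    g3 = (-math.pi * x3**2 * x4                  # minimum enclosed volume
          - 4.0 / 3.0 * math.pi * x3**3 + 1296000.0)
    g4 = x4 - 240.0                              # maximum shell length
    return f, (g1, g2, g3, g4)

f, g = cost_and_constraints(1.1875, 0.625, 59, 40)
print(round(f, 2), all(gi <= 0 for gi in g))   # → 7442.02 True
```

The cost matches the objective value of 7442.015 reported below, and all four constraints are satisfied at this point.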
In the literature, design variables x_{1} and x_{2} are integer multiples of
0.0625 inch, the standard thickness increment of rolled steel plates, while
x_{3} and x_{4} are continuous variables. Here all four variables are assumed
to be discrete, and each variable can take the discrete values listed in
Table 3. Note that the ranges of the variables differ from those in the
literature.
Table 3. Discrete values of the design variables in the pressure vessel example

 x_{1} (in): 1.125, 1.1875, 1.25, 1.3125, 1.375, 1.4375, 1.5, 1.5625, 1.625, 1.6875, 1.75, 1.8125, 1.875, 1.9375, 2 (15 values)
 x_{2} (in): 0.625, 0.6875, 0.75, 0.8125, 0.875, 0.9375, 1, 1.0625, 1.125, 1.1875, 1.25, 1.3125, 1.375, 1.4375, 1.5, 1.5625, 1.625, 1.6875, 1.75, 1.8125, 1.875, 1.9375, 2 (23 values)
 x_{3} (in): 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60 (21 values)
 x_{4} (in): 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120 (17 values)
The 9 initial training points (L_{9}(3^{4}) orthogonal array) in this example
are (1.125, 0.625, 40, 40), (1.125, 1.3125, 50, 80), (1.125, 2, 60, 120),
(1.5625, 0.625, 50, 120), (1.5625, 1.3125, 60, 40), (1.5625, 2, 40, 80), (2,
0.625, 60, 80), (2, 1.3125, 40, 120) and (2, 2, 50, 40). Among the 9 initial
training points, (1.5625, 0.625, 50, 120), (1.5625, 1.3125, 60, 40) and (2,
0.625, 60, 80) are feasible, and (1.5625, 0.625, 50, 120) has the lowest
objective value and is first used as the starting design point of the search.
The optimal point is obtained after 37 iterations, including two restarts. The
optimal design point (1.1875, 0.625, 59, 40) was found in the first restart,
and the objective function value at this point is 7442.015. In the
optimization process, 45 out of the 15×23×21×17 = 123,165 possible
combinations (0.037%) are evaluated. Finally, Table 4 compares the results
obtained here with the solutions in the literature. This comparison is for
reference only, because the ranges of the variables used here differ from
those in the literature, and variables x_{3} and x_{4} are continuous in the
literature.
Table 4. Comparison of the results with those reported in the literature

           Continuous    Discrete solution   Mixed discrete solution   Mixed discrete solution
           solution      SNA method          Kannan & Kramer [20]      Sandgren [18]
 x_{1}     1.125         1.1875              1.125                     1.125
 x_{2}     0.625         0.625               0.625                     0.625
 x_{3}     58.290        59                  58.291                    48.97
 x_{4}     43.693        40                  43.69                     106.72
 g_{1}     0.000         –0.049              +0.000                    –0.180
 g_{2}     –0.069        –0.062              –0.069                    –0.158
 g_{3}     +0.000        –0.001              0.000                     +0.000
 g_{4}     –196.307      –200                –196.31                   –133.28
 f         7198.01       7442.02             7198.04                   8129.14
4.2 Structural design of a stepped
cantilever beam with rectangular cross sections
Figure 18 shows a five-stepped cantilever beam with a total length L of
500 cm; the elastic modulus E of the material is 200 GPa. A concentrated load
of 50,000 N is applied at the tip of the beam. This problem was presented by
Thanedar and Vanderplaats [4]. The height and width of the beam in each of the
five steps are design variables, and the volume of the beam is to be
minimized. There are 11 constraints in this problem: five bending stress
constraints requiring the stress in each step to be less than the allowable
stress (14,000 N/cm^{2}), one displacement constraint requiring the tip
deflection to be less than the allowable deflection (2.7 cm), and five
constraints requiring the height-to-width aspect ratio of each cross section
not to exceed 20. Substituting the parameters, this problem can be formulated
as follows:
min. f = 100(b_{1}h_{1} + b_{2}h_{2} + b_{3}h_{3} + b_{4}h_{4} + b_{5}h_{5})
s.t. g_{1} = 1.5×10^{8}/(b_{1}h_{1}^{2}) – 14,000 ≤ 0
 g_{2} = 1.2×10^{8}/(b_{2}h_{2}^{2}) – 14,000 ≤ 0
 g_{3} = 9×10^{7}/(b_{3}h_{3}^{2}) – 14,000 ≤ 0
 g_{4} = 6×10^{7}/(b_{4}h_{4}^{2}) – 14,000 ≤ 0
 g_{5} = 3×10^{7}/(b_{5}h_{5}^{2}) – 14,000 ≤ 0
 g_{6} = 0.01[6.1×10^{7}/(b_{1}h_{1}^{3}) + 3.7×10^{7}/(b_{2}h_{2}^{3}) + 1.9×10^{7}/(b_{3}h_{3}^{3}) + 7×10^{6}/(b_{4}h_{4}^{3}) + 10^{6}/(b_{5}h_{5}^{3})] – 2.7 ≤ 0
 g_{7} = h_{1} – 20b_{1} ≤ 0
 g_{8} = h_{2} – 20b_{2} ≤ 0
 g_{9} = h_{3} – 20b_{3} ≤ 0
 g_{10} = h_{4} – 20b_{4} ≤ 0
 g_{11} = h_{5} – 20b_{5} ≤ 0   (10)
Figure 18. Stepped cantilever beam
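The stress, deflection and aspect-ratio limits described above can be checked numerically with standard beam formulas: the bending stress at the root of step i is 6M_i/(b_i h_i^2), and the tip deflection is obtained by integrating M(x)x/(EI) piecewise over the five steps. The following Python sketch (our own variable names; units N and cm) evaluates the final design reported below:

```python
P, E, L_SEG = 50000.0, 2.0e7, 100.0   # tip load (N), modulus (N/cm^2), step length (cm)

def evaluate(b, h):
    """Volume, max bending stress, tip deflection and max aspect ratio."""
    volume = L_SEG * sum(bi * hi for bi, hi in zip(b, h))
    # Stress at the root of step i (step 1 at the support): 6*M_i/(b_i*h_i^2),
    # with moment arms 500, 400, ..., 100 cm from the tip.
    stresses = [6.0 * P * (500.0 - 100.0 * i) / (b[i] * h[i] ** 2)
                for i in range(5)]
    # Tip deflection: integrate P*x^2/(E*I) piecewise, x measured from the tip,
    # with I_i = b_i*h_i^3/12 on each step.
    deflection = sum(
        P * ((500.0 - 100.0 * i) ** 3 - (400.0 - 100.0 * i) ** 3)
        / (3.0 * E * (b[i] * h[i] ** 3 / 12.0))
        for i in range(5))
    ratios = [h[i] / b[i] for i in range(5)]
    return volume, max(stresses), deflection, max(ratios)

vol, smax, dtip, rmax = evaluate([3.0, 3.0, 2.8, 2.6, 1.8], [60, 54, 50, 46, 35])
print(round(vol), smax <= 14000, dtip <= 2.7, rmax <= 20)   # → 66460 True True True
```

At this design the tip deflection is about 2.67 cm and the largest stress about 13,890 N/cm^{2}, so the deflection and root-step stress constraints are both nearly active.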
Table 5 shows the discrete values of each variable. The ranges of the
variables are selected from an engineering viewpoint and are therefore not
exactly the same as those in the literature. The L_{27} orthogonal array is
used to form the set of initial training data for the five-step cantilever
beam design problem. The set of initial training data contains three feasible
design points: (3.0, 3.4, 2.6, 2.6, 2.0, 60, 56, 50, 46, 37), (3.4, 3.0, 3.0,
2.6, 1.8, 58, 58, 48, 46, 35), and (3.4, 3.4, 3.0, 2.6, 2.0, 60, 54, 48, 42, 37).
Table 5. Possible discrete values of the design variables

 b_{1}   3.0   3.2   3.4   3.6   3.8
 b_{2}   3.0   3.2   3.4   3.6   3.8
 b_{3}   2.2   2.4   2.6   2.8   3.0
 b_{4}   2.2   2.4   2.6   2.8   3.0
 b_{5}   1.6   1.7   1.8   1.9   2.0
 h_{1}   58    59    60    61    62
 h_{2}   54    55    56    57    58
 h_{3}   48    49    50    51    52
 h_{4}   42    43    44    45    46
 h_{5}   33    34    35    36    37
The feasible design point (3.0, 3.4, 2.6, 2.6, 2.0, 60, 56, 50, 46, 37) has
the lowest objective value, 69,400 cm^{3}, and is used as the starting point
of the search. The SNA method terminates after 57 iterations, including two
restarts. The final design point is (3.0, 3.0, 2.8, 2.6, 1.8, 60, 54, 50, 46,
35), with a total volume of 66,460 cm^{3}. Only 81 design points (including
the 27 points in the initial training set) out of the 5^{10} = 9,765,625
possible combinations are evaluated (0.00083%).
4.3 Design of a reinforced
cylindrical body of a marine structure
Figure 19 shows the cross section of a quarter of a symmetrical cylindrical
body. It is a major marine structure reinforced by a series of ring-shaped
ribs of different widths, equally spaced along its length. The outer and inner
diameters of the cylinder are both limited by its surrounding space, and the
cylinder has to withstand the water pressure at a certain depth on its outer
surface. The cylinder should have low weight and high structural strength for
better performance under water. There are four sizing variables b, c, d, and
t; the first three are the widths of three different ribs, and t is the
thickness of the cylinder. The cylinder has to sustain a water pressure of
40 N/mm^{2}, a uniform payload of 975 N, and its self-weight. The objective is
to minimize the volume of the cylinder, while the maximum stress induced under
the loading and boundary conditions must be less than the allowable value of
the specified material, and the maximum deformation must be less than the
thickness. The cylinder is made of the high strength aluminum alloy 7075 with
an allowable yield strength σ_{allow} of 620 N/mm^{2}. The optimization
problem can be formulated as Equation (11). Note that the constraints on
stress and deformation are both implicit constraints and have to be evaluated
by finite element analysis. Table 6 shows the discrete values of each variable
available from engineering practice.
min. f (b, c, d, t) = π[535^{2} – (535 – 2t)^{2}](405 + b + c + 2d)
s.t. σ (b, c, d, t) < σ_{allow}
 disp (b, c, d, t) < t   (11)
Figure 19. The reinforced cylindrical body
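Unlike the two implicit constraints, the objective of Equation (11) is explicit and can be checked directly. A minimal Python sketch (the function name is ours) evaluates it at the optimum reported below:

```python
import math

# Objective of Equation (11): material volume of the reinforced cylinder (mm^3).
# The stress and displacement constraints are implicit (finite element
# analysis) and are not reproduced here.
def cylinder_volume(b, c, d, t):
    return math.pi * (535.0**2 - (535.0 - 2.0 * t) ** 2) * (405.0 + b + c + 2.0 * d)

print(f"{cylinder_volume(41, 33, 20, 10):.3e}")   # → 3.424e+07
```

This agrees with the objective value of 3.42×10^{7} reported for the optimal design point.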
Table 6. Discrete values of the reinforced cylindrical body variables (unit: mm)

        1    2    3    4    5    6    7
 b     41   43   45   47   49   51   53
 c     33   35   37   39   41   43   45
 d     16   18   20   22   24   26   28
 t      4    5    6    7    8    9   10
The 9 initial
training points (L_{9}(3^{4}) orthogonal array) in this example
are (41, 33, 16, 4), (41, 39, 22, 7), (41, 45, 28, 10), (47, 33, 22, 10), (47,
39, 28, 4), (47, 45, 16, 7), (53, 33, 28, 7), (53, 39, 16,10), (53, 45, 22, 4).
Among the 9 initial training points, (41, 45, 28, 10) and (47, 33, 22, 10) are
feasible, and (47, 33, 22, 10) has the lower objective value and is first used
as the starting design point of the search. The optimal point is obtained after
8 iterations, including one restart. The optimal design point is (41, 33, 20,
10), and the objective function value at this point is 3.42×10^{7} mm^{3}. In
the optimization process, 17 out of the 7^{4} = 2,401 possible combinations
(0.7%) are evaluated.
5.
Discussion and Conclusions
When dealing with engineering design optimization problems, it is often
difficult to find a numerical optimization algorithm that is readily
applicable because of the existence of discrete variables, implicit
constraints, or even binary constraints. The SNA method presented in this
paper is designed specifically for this type of engineering design
optimization problem. In this method, the constraints are modeled as
pass-fail functions. The costly function evaluations of the implicit
constraints are accumulated to progressively form a better approximation of
the feasible domain using the neural network. Only one evaluation of the
implicit constraints is needed in each iteration, to check whether the current
design point is feasible. No precise function value or sensitivity calculation
is required; thus the SNA method does not require continuity or
differentiability of the problem functions.
A clear 0-1 binary pattern is used in the input and output layers of the
neural network, so the computational cost of training the network is
relatively small compared with that of using a neural network to simulate the
function values of the implicit constraints. As discussed earlier, in our
numerical experience the error goal of 10^{-6} is usually met within 100
epochs, even for cases with many variables and training points. A restart
strategy is also employed so that the method has a better chance of reaching a
global optimum.
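The pass/fail idea can be illustrated with a minimal backpropagation sketch: a one-hidden-layer network maps scaled design points to a 0/1 feasibility label. The nine points and the feasibility rule below are invented for illustration only, and the layer sizes, learning rate and epoch count are not the settings used in the paper:

```python
import numpy as np

# Nine scaled design points and a toy pass/fail rule (both invented here).
X = np.array([[i, j, 0.5] for i in (0.0, 0.5, 1.0) for j in (0.0, 0.5, 1.0)])
y = (X.sum(axis=1) > 1.5).astype(float).reshape(-1, 1)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 1, (3, 8)), np.zeros(8)   # 3 inputs -> 8 hidden units
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros(1)   # 8 hidden -> 1 output
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):                     # plain full-batch gradient descent
    h = sig(X @ W1 + b1)                  # hidden activations
    p = sig(h @ W2 + b2)                  # predicted feasibility in (0, 1)
    dout = (p - y) * p * (1 - p)          # squared-error delta at the output
    dhid = (dout @ W2.T) * h * (1 - h)    # backpropagated hidden delta
    W2 -= 0.5 * h.T @ dout;  b2 -= 0.5 * dout.sum(axis=0)
    W1 -= 0.5 * X.T @ dhid;  b1 -= 0.5 * dhid.sum(axis=0)

pred = sig(sig(X @ W1 + b1) @ W2 + b2) > 0.5
acc = float((pred == (y > 0.5)).mean())
print(acc)                                # training accuracy on the 9 points
```

In the SNA method such a network is retrained each iteration as new pass/fail points are appended, so its decision boundary gradually approaches the boundary of the true feasible domain.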
One potential problem of the SNA method is that, when the number of design
variables and the number of discrete values each variable can take are large,
the search space becomes large, which affects the efficiency of the search
algorithm. When the number of discrete values a variable can take is large,
the "resolution" of the discrete grid of the search space can be controlled:
instead of taking all discrete values at once, a few discrete values can be
chosen first so that the size of the search space remains acceptable. When the
SNA method terminates at a discrete design point, the search space can be
reduced to the neighborhood of that point with an increased grid resolution,
and the iterations started again.
In the search algorithm of the SNA method, a branching criterion is used to
cut down the number of points searched. However, the number of design points
searched in each iteration is still on the order of 3^{n}–1, where n is the
number of design variables. This can be very expensive for a large number of
design variables, even though the function evaluation for each design point is
cheap, and it poses a limitation on the SNA method. As demonstrated by the
examples presented in this paper, the SNA method can handle up to 10 design
variables.
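The size of this per-iteration neighborhood is easy to see concretely: each variable may move down one grid step, stay, or move up one step, giving 3^{n} combinations minus the current point itself. A short Python sketch (our own function name; points are represented by grid indices) enumerates it:

```python
from itertools import product

def neighbors(index_point, grid_sizes):
    """Yield all 3^n - 1 neighboring grid points of a discrete design point.

    index_point: tuple of grid indices, one per design variable.
    grid_sizes:  number of discrete values each variable can take.
    Moves falling outside the grid are discarded.
    """
    for step in product((-1, 0, 1), repeat=len(index_point)):
        if all(s == 0 for s in step):
            continue                      # skip the current point itself
        cand = tuple(i + s for i, s in zip(index_point, step))
        if all(0 <= c < n for c, n in zip(cand, grid_sizes)):
            yield cand

# An interior point of a 5 x 5 x 9 grid has the full 3^3 - 1 = 26 neighbors.
print(len(list(neighbors((2, 2, 4), (5, 5, 9)))))   # → 26
```

At the boundary of the grid the count shrinks (a corner point of the same grid has only 7 neighbors), but the worst case grows exponentially with n, which is the limitation noted above.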
References
[1] Hsu, Y. L., Sheppard, S. D., and Wilde, D. J., 1996, "Explicit approximation method for design optimization problems with implicit constraints," Engineering Optimization, Vol. 27, No. 1, pp. 21-42.
[2] Huang, M. W., and Arora, J. S., 1997, "Optimal design with discrete variables: some numerical experiments," International Journal for Numerical Methods in Engineering, Vol. 40, pp. 165-188.
[3] Arora, J. S., and Huang, M. W., 1994, "Methods for optimization of nonlinear problems with discrete variables: a review," Structural Optimization, Vol. 8, pp. 69-85.
[4] Thanedar, P. B., and Vanderplaats, G. N., 1995, "Survey of discrete variable optimization for structural design," Journal of Structural Engineering, Vol. 121, No. 2, pp. 301-305.
[5] Hsu, Y. L., Dong, Y. H., and Hsu, M. S., 2001, "A sequential approximation method using neural networks for nonlinear discrete-variable optimization with implicit constraints," JSME International Journal, Series C, Vol. 44, No. 1, pp. 103-112.
[6] Hopfield, J., and Tank, D., 1985, "Neural computation of decisions in optimization problems," Biological Cybernetics, Vol. 52, pp. 141-152.
[7] Watta, P. B., 1996, "A coupled gradient network approach for static and temporal mixed-integer optimization," IEEE Transactions on Neural Networks, Vol. 7, No. 3, pp. 578-593.
[8] Lee, S. J., and Chen, H., 1991, "Design optimization with back-propagation neural networks," Journal of Intelligent Manufacturing, Vol. 2, pp. 293-303.
[9] Renaud, J., Sellar, R., Batill, S., and Kar, P., 1994, "Design driven coordination procedures for concurrent subspace optimization in MDO," Proceedings of the 35th AIAA/ASME Structures and Materials Conference.
[10] Hajela, P., 1999, "Nongradient methods in multidisciplinary design optimization - status and potential," Journal of Aircraft, Vol. 36, No. 1, pp. 255-265.
[11] Haftka, R. T., and Gurdal, Z., 1992, Elements of Structural Optimization, 3rd edition, Kluwer Academic Publishers.
[12] Plackett, R. L., and Burman, J. P., 1946, "The design of optimum multifactorial experiments," Biometrika, Vol. 33, pp. 305-325.
[13] Rao, C. R., 1947, "Factorial experiments derivable from combinatorial arrangements of arrays," Journal of the Royal Statistical Society, Series B, Vol. 9, pp. 155-162.
[14] Farkas, J., 1984, Optimum Design of Metal Structures, 1st edition, Ellis Horwood Publishers.
[15] Yuan, K. Y., Liang, C. C., and Ma, Y. C., 1990, "The estimation of the accuracy and efficiency of the backtrack programming method for discrete-variable structural optimization problems," Computers & Structures, Vol. 79, No. 2, pp. 155-162.
[16] Farkas, J., and Jarmai, K., 1997, Analysis and Optimum Design of Metal Structures, 1st edition, Balkema Publishers.
[17] Siddall, J. N., 1982, Optimal Engineering Design: Principles and Applications, Marcel Dekker, New York.
[18] Sandgren, E., 1988, "Nonlinear integer and discrete programming in mechanical design," Proceedings of the ASME Design Technology Conference, Kissimmee, FL, pp. 95-105.
[19] Lu, P., and Siddall, J. N., 1993, "A Boolean local improvement method for general discrete optimization problems," Journal of Mechanical Design, Transactions of the ASME, Vol. 115, pp. 776-783.
[20] Kannan, B. K., and Kramer, S. N., 1994, "An augmented Lagrange multiplier based method for mixed integer discrete continuous optimization and its applications to mechanical design," Journal of Mechanical Design, Vol. 116, pp. 405-411.