Author: Yeh-Liang Hsu, Yu-Hsin Dong, Ming-Sho Hsu
(2000-09-13); recommended: Yeh-Liang Hsu (2001-02-16).
Note: This paper is published in the JSME International Journal,
Series C, Vol. 44, No. 1, April 2001, pp. 103-112.
A Sequential Approximation Method Using Neural Networks for Nonlinear Discrete-Variable Optimization with Implicit Constraints
ABSTRACT
This paper presents a sequential approximation method that combines a backpropagation neural network with a search algorithm for nonlinear discrete-variable engineering optimization problems with implicit constraints. The method is an iterative process. A backpropagation neural network is trained to simulate the feasible domain formed by the implicit constraints. A search algorithm then searches for the "optimal point" in the feasible domain simulated by the neural network. This new design point is checked against the true constraints to see whether it is feasible, and is added to the training set. The neural network is then trained again.
With more design points in the training set, information about the whole search domain accumulates, progressively forming a better approximation of the feasible domain. This iterative process continues until the approximate model insists on the same "optimal" point in consecutive iterations. In each iteration, only one evaluation of the implicit constraints is needed, to see whether the current design point is feasible. No precise function value or sensitivity calculation is required. Several engineering design examples are used to demonstrate the practicality of this approach.
Introduction
Engineering design optimization problems have two characteristics that distinguish them from general mathematical programming problems: (1) their design variables are often discrete physical quantities, and (2) their constraints often cannot be expressed analytically in terms of the design variables. Such constraints are called implicit constraints. An engineering design optimization problem can usually be formulated as follows:
min. f(x)
s.t. g_j(x) ≥ 0, j = 1, ..., m
     h_k(x) ≥ 0, k = 1, ..., l
     x_i ∈ D_i, i ∈ I_d  (1)
where x is the vector of design variables, f(x) is the objective function, g_j(x) and h_k(x) are constraints; I_d is the set of indices of design variables that can take only discrete values, and D_i is the set of allowable discrete values for variable x_i.
There are two types of constraints in this formulation: g_j(x) ≥ 0 are the explicit constraints, which can be expressed explicitly in terms of the design variables (e.g., the upper and lower bounds of the design variables); h_k(x) ≥ 0 are the implicit constraints. In an engineering design problem, typical examples of implicit constraints are those on structural responses such as stress, displacement, and natural frequency, as well as constraints involving manufacturing parameters, measurement data, etc.
The implicit constraints are usually evaluated by computer simulation, such as finite element analysis, or by experiments. Thus the evaluation of the implicit constraints is usually the major cost of the optimization process. Moreover, analytical forms of the gradients of the implicit constraints with respect to the design variables are usually not available. The sensitivity of such constraints often has to be calculated through finite difference schemes, which require many function evaluations and often have accuracy problems.
One important
category of numerical optimization algorithms is the sequential approximation methods.
The basic idea of sequential approximation methods is to use a “simple”
subproblem to approximate the hard, exact problem. By “simple” subproblem, we
mean the type of problems that can be readily solved by existing numerical
algorithms. For example, linear programming subproblems are widely used in
sequential approximation methods. The solution point of the simple subproblem
is then used to form a better approximation to the hard, exact problem for the
next iteration. In an iterative manner, it is expected that the solution point
of the simple approximate problem will get closer to the optimum point of the
hard, exact problem. For continuous nonlinear optimization problems, the well-known sequential linear programming method, which utilizes linear programming subproblems, and sequential quadratic programming, which utilizes quadratic programming subproblems, both fall into this category.
One major disadvantage of applying numerical sequential approximation methods to engineering design optimization problems is that these are usually derivative-based approximation methods, which require at least the first derivatives of the implicit constraints with respect to the design variables. Moreover, only the local numerical information from the solution point of the previous iteration is used to form the next approximation. Knowledge about the whole search domain is not accumulated for formulating better approximations, or for better understanding of the optimization model. Precious function evaluations and sensitivity calculations of the implicit constraints are often not utilized most efficiently.
Nonlinear, discrete-variable optimization is a well-studied field. Lu and Siddall [1993] pointed out that two general approaches can be identified in research work on discrete programming problems. The first attempts to find methods that treat the variables as discrete from the beginning. The second first approximates an optimum by treating all variables as continuous, and then uses some rounding technique to find the best feasible discrete point in the region of the continuous-variable optimum. Both approaches tend to be adaptations of continuous-variable methods. Thanedar and Vanderplaats [1995] further classified discrete-variable optimization methods into three categories: branch and bound, approximation using branch and bound, and ad hoc methods, including simulated annealing and genetic algorithms.
This paper presents a "Sequential Neural-network Approximation Method (the SNA method)," which handles discrete-variable optimization problems with nonlinear implicit constraints. It is also a sequential approximation method, one that combines backpropagation neural networks with a search algorithm. In this method, a backpropagation neural network is first trained to simulate a rough map of the feasible domain formed by the implicit constraints, using just a few training data. Each training datum consists of a design point and a label indicating whether this design point is feasible or infeasible. A search algorithm then searches for the "optimal point" in the feasible domain simulated by the neural network. This new design point is checked against the true constraints to see whether it is feasible, and is then added to the training set. The neural network is trained again with this added information, in the hope that the network will better simulate the boundary of the feasible domain of the true optimization problem. Then we search for the "optimal point" in this new approximate feasible domain again. This process continues in an iterative manner until the approximate model insists on the same "optimal point" in consecutive iterations.
Hopfield and Tank [1985] first extended neural network applications to optimization problems. In the Hopfield model, the optimization problem is converted into the problem of minimizing an energy function formulated from the design constraints. Lee and Chen [1991] further explored the idea of applying neural network technology to design optimization. In their approach, neural networks were used to simulate the constraint functions. Oda and Mizukami [1993] developed an application of a "mutual combination-type" neural network to the optimum design of truss structures. In their work, the Hopfield model and a quadratic energy function are used to solve the design problems, and the method is able to obtain feasible solutions for several optimum design problems.
The SNA method presented in this paper is attractive in two ways compared with numerical sequential approximation methods. First, in each iteration, only one evaluation of the implicit constraints is needed, to see whether the current design point is feasible. No precise function value or sensitivity calculation is required. It is therefore especially suitable for engineering optimization problems with expensive implicit constraints. Secondly, in the SNA method, precious function evaluations of the implicit constraints are accumulated and utilized repeatedly to form better approximations of the feasible domain.
One key point of the SNA method is that the computational cost of training the neural network has to be relatively small compared with that of the function value and sensitivity calculations of the implicit constraints. Otherwise, using a neural network in optimization will not have much computational advantage over traditional numerical optimization algorithms, in cases where traditional algorithms are still applicable. As will be illustrated in this paper, in the SNA method all training data in the input-output pairs have values of either 0 or 1. This clear 0-1 pattern makes the training process relatively fast.
Arora et al. [1994] also reviewed methods for nonlinear problems with discrete variables. Their conclusion was that none of the methods are guaranteed to produce the global minimizer for nonlinear problems, but "good and practical" solutions can be obtained. The optimality of the "optimal" design point obtained by the SNA method cannot be proved. The quality of this design point depends on the quality of the feasible domain simulated by the neural network. However, several engineering design examples are used in this paper to demonstrate the practicality of this approach. The following sections further elaborate on the basic ideas of the SNA method.
Sequential Approximation Using Neural Networks
The basic concept of the SNA method presented here is exactly the same as that of the numerical sequential approximation techniques discussed in the previous section. Eq.(1) in the previous section can be thought of as a hard, exact problem because of the existence of the implicit constraints. The idea is to use a neural network to simulate the boundary of the feasible domain formed by the implicit constraints. The hard, exact problem in Eq.(1) is approximated by a "simple" approximate problem below:
min. f(x)
s.t. g_j(x) ≥ 0, j = 1, ..., m
     NN(x) = 0  (2)
where the binary constraint NN(x) = 0 approximates the feasible domain of the hard, exact implicit constraints h_k(x) ≥ 0. If NN(x) = 0, the design point x is feasible; if NN(x) = 1, the design point x is infeasible. The discrete-variable constraints are embedded in the format of the input nodes of the neural network, as will be described in the next section. For convenience, the optimization model in Eq.(1) will be denoted M throughout the paper, and the approximate model in Eq.(2) will be denoted AM.
Figure 1 is a flow chart of the SNA method. A backpropagation neural network is trained to simulate a rough map of the feasible domain formed by the implicit constraints using just a few training data (the set of initial training data X_0), to form the initial approximate model AM. AM is a "simple" model because, once training is completed, the computational cost of evaluating NN(x) is much less than that of evaluating the implicit constraints. A search algorithm is used to search for the solution point of AM. This solution point is then evaluated to see whether it is feasible in M, and whether AM generates the same solution point as in the previous iteration. If the feasible solution point is the same as that of the previous iteration, no new training point is generated and the procedure stops. If not, the new solution point is added to the set of training points X, and the neural network is trained again. With an increasing number of training points, it is expected that the feasible domain simulated by the neural network in AM will get closer to that of the exact model M, and the solution point of AM will get closer to the true optimum point.
Figure 1. Flow chart of the algorithm
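The loop in Figure 1 can be sketched in a few lines of Python. This is an illustrative sketch only: `train_network`, `search_optimum`, and `check_true_constraints` are hypothetical stand-ins for the neural-network training, the discrete search described in a later section, and the (expensive) evaluation of the implicit constraints.

```python
def sna_optimize(initial_points, train_network, search_optimum,
                 check_true_constraints, max_iter=100):
    """Sketch of the Sequential Neural-network Approximation loop.

    initial_points: list of (design_point, feasible) pairs; at least one
    must be feasible, since the search starts from a feasible point.
    """
    training_set = list(initial_points)
    previous = None
    for _ in range(max_iter):
        nn = train_network(training_set)              # approximate the feasible domain
        candidate = search_optimum(nn, training_set)  # optimum of the approximate model
        if candidate == previous:                     # same "optimal" point twice: stop
            return candidate
        feasible = check_true_constraints(candidate)  # the one expensive evaluation
        training_set.append((candidate, feasible))    # accumulate domain knowledge
        previous = candidate
    return previous
```

Only `check_true_constraints` touches the implicit constraints, and it is called exactly once per iteration.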
Figure 2 shows a simple three-bar-truss problem which is commonly used in the structural optimization literature to demonstrate various optimization techniques [Haftka and Gurdal, 1992]. It is used here as an example to illustrate the detailed procedure of the SNA method. The objective of this problem is to find the optimal values of the cross-sectional areas of the bars, A_1, A_2, A_3, which minimize the weight of the structure, subject to maximum stress constraints on the three bars. The horizontal force P can act either to the right or to the left, thus we have a symmetric structure in which A_1 must equal A_3.
Figure 2. Three bar truss
Substituting the values of the parameters ρ, E, P, L, and σ_0 assumed for the example, this optimization problem can be formulated as below:
min. f(x)
s.t. g_1(x) ≥ 0
     g_2(x) ≥ 0  (3)
where constraint g_1 is the maximum stress constraint for bars 1 and 3, and constraint g_2 is the maximum stress constraint for bar 2. Also assume that all three bars have square cross sections, and that the sizes of the raw material available from manufacturers range from 1x1mm^2, 2x2mm^2, ... to 17x17mm^2. Therefore we have the discrete-value constraints below:
x_i ∈ {1, 4, 9, ..., 289} mm^2, i = 1, 2.  (4)
Eq.(3) and (4) form the exact model M of this problem. For illustrative purposes, g_1 and g_2 are viewed as implicit constraints. In the approximate model AM of this problem, a backpropagation neural network is trained to simulate the boundary of the feasible domain.
Figure 3 shows the architecture of the three-layer network. The size of the input layer depends on the number of variables and the number of discrete values each variable can take. In this example, there are 2 design variables, with 17 discrete values each. Thus a total of 2 x 17 = 34 neurons are used in the input layer. Each neuron in the input layer has value 0 or 1 to represent the discrete value the corresponding variable takes, as will be explained later in this section. There is only a single neuron in the output layer, representing whether the design point is feasible (the output neuron has value 0) or infeasible (the output neuron has value 1).
Figure 3. The architecture of the neural network
In the three-bar-truss example, the center point of the search domain, (81, 81), is used as one of the initial training points. This training point is represented as shown in Figure 4, where a blank circle represents a "0" in the node and a solid circle represents a "1" in the node. For illustrative purposes, the 34 input nodes are arranged into two rows in Figure 4 to represent the two design variables. The first 9 nodes of the first row have value 1 to represent that the first variable x_1 has the discrete value 81 (the 9th allowable value). The first 9 nodes of the second row also have value 1 to represent that the second variable x_2 also has the discrete value 81. Checking against the true constraints, we can determine that (81, 81) is an infeasible design point. So the single node in the output layer has the value 1 to represent that this point is infeasible. Similarly, Figure 5 shows another initial training point, (289, 289), which is a feasible design point. There has to be at least one feasible design point in the set of initial training points, as the search algorithm described later has to start from a feasible design point.
Figure 4. Representation of an infeasible training point, {(81, 81), infeasible}.
Figure 5. Representation of a feasible training point, {(289, 289), feasible}.
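The 0-1 input coding of Figures 4 and 5 can be sketched as follows. This is an illustrative sketch; the names `encode_point` and `allowable` are introduced here, not taken from the paper.

```python
def encode_point(point, allowable):
    """Encode a discrete design point as the network's 0/1 input vector.

    For each variable there is one input node per allowable discrete value;
    the first k nodes are set to 1 when the variable takes its k-th
    allowable value (e.g. 81 is the 9th value of {1, 4, ..., 289},
    giving nine 1s followed by eight 0s).
    """
    bits = []
    for value, values in zip(point, allowable):
        k = values.index(value) + 1                    # 1-based position
        bits.extend([1] * k + [0] * (len(values) - k))
    return bits

squares = [i * i for i in range(1, 18)]                # {1, 4, 9, ..., 289} mm^2
x0 = encode_point((81, 81), [squares, squares])        # 34-node input of Figure 4
```

With 17 allowable values per variable, the vector has 2 x 17 = 34 entries, matching the input layer of Figure 3.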
The number of neurons in the hidden layer depends on the number of neurons in the input layer. There are 16 neurons in the hidden layer in this example. The transfer functions used in the hidden and output layers of the network are both log-sigmoid functions. The neuron in the output layer has the value range [0, 1]. After training is completed, a threshold value is applied to the output layer when simulating the boundary of the feasible domain. In other words, given a discrete design point in the search domain, the network always outputs 0 (if the output neuron's value is less than the given threshold) or 1 (otherwise) to indicate whether this discrete design point is feasible or infeasible.
All training data are in this clear 0-1 pattern, which makes the training process relatively fast. A conjugate gradient algorithm (Powell-Beale restarts) is used for the training. In this example, the error goal of 1e-5 is usually met within 100 epochs, even for cases with many training points.
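A minimal version of such a network can be sketched with NumPy. This sketch uses plain gradient descent on a squared-error goal as a stand-in for the paper's Powell-Beale conjugate-gradient training; the architecture (one hidden layer, log-sigmoid transfer functions, a single output neuron thresholded to 0/1) follows the description above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_feasibility_net(X, y, hidden=16, lr=2.0, epochs=5000, seed=0):
    """Train a one-hidden-layer log-sigmoid network on 0/1 feasibility data.

    X: (n_samples, n_inputs) array of 0/1 input encodings
    y: (n_samples,) array of 0/1 labels (0 = feasible, 1 = infeasible)
    Plain gradient descent is used here in place of Powell-Beale restarts.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    y = y.reshape(-1, 1)
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)                 # hidden layer (log-sigmoid)
        out = sigmoid(h @ W2 + b2)               # output neuron in [0, 1]
        d_out = (out - y) * out * (1 - out)      # squared-error backprop
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out / len(X)
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * X.T @ d_h / len(X)
        b1 -= lr * d_h.mean(axis=0)

    def predict(x, threshold=0.5):
        """Return 0 (feasible) or 1 (infeasible) after thresholding."""
        h = sigmoid(np.atleast_2d(x) @ W1 + b1)
        return int(sigmoid(h @ W2 + b2)[0, 0] >= threshold)
    return predict
```

The returned `predict` plays the role of NN(x) in Eq.(2): a cheap surrogate for the implicit constraints once training is done.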
Searching in the Feasible Domain
As described in the previous section, two design points are used as the initial training points. We could also use several evenly distributed design points as initial training points to learn a rough map of the feasible domain. After this initial training is completed, the mathematical optimization model M is transformed into the approximate model AM, in which the feasible domain of the optimization model is simulated by NN(x).
Figure 6. Flow chart of the search algorithm
A search for the optimum point of AM is then started. Figure 6 is a simple flow chart of this discrete search algorithm. The search algorithm has to start from a feasible design point. If the new design point generated in the previous iteration is a feasible design point in M, it is used as the starting point in the current search. If the new design point is infeasible in M, the starting point of the previous iteration is used again in the current search. As stated in the previous section, there has to be at least one feasible design point in the set of initial training points, so that the search algorithm can always start from a feasible design point.
From the
feasible starting point, the search algorithm then moves one variable at a time
to the neighboring discrete value. This next design point should be feasible
and have a lower objective function value. Moving one variable at a time, the
search terminates when all neighboring points of the current design point are
infeasible or have higher objective function values.
As shown in
Figure 6, at the beginning of the search all design variables form a “usable
set”, which is denoted U. The
gradient of the objective function with respect to the design variables is
calculated at the starting point. Note that the assumption of the optimization
model (Eq.(1) or Eq.(2)) is that its objective function is an explicit
function, whose gradient has an analytical form and can be easily calculated.
Then we have to pick one variable to move to a neighboring discrete value. From all variables x_i in U, the variable x_i that has the largest gradient magnitude |∂f/∂x_i| is picked. Since we are minimizing the objective function, if ∂f/∂x_i < 0, then x_i is moved to the higher neighboring discrete value; on the contrary, if ∂f/∂x_i > 0, then x_i is moved to the lower neighboring discrete value.
Before accepting the move, the new design point is checked against NN(x) to see if it is feasible in AM. If the new design point is feasible, the move is accepted, its gradient is calculated, and we go back to the outer iteration loop as shown in Figure 6. If the new design point is infeasible in AM, the move is discarded and the variable x_i is no longer usable, so we update the usable set, U = U - {x_i}, then go back to the inner iteration loop as shown in Figure 6. Another variable, the one with the largest gradient magnitude, is picked from the usable set, and we proceed with the same procedure. The search process terminates when U is empty, since all moves from the current design point toward lower objective function values are then infeasible in AM. Therefore the current design point is the optimum design point (at least locally) of AM.
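The search of Figure 6 can be sketched as follows; `grad_f` and `nn_feasible` are hypothetical stand-ins for the gradient of the explicit objective and the trained network's feasibility prediction, and a move in the descent direction of the gradient is taken as the "lower objective value" move.

```python
def discrete_search(start, grid, grad_f, nn_feasible):
    """Greedy discrete search of Figure 6 (illustrative sketch).

    start: feasible starting point (tuple of discrete values)
    grid: per-variable sorted lists of allowable discrete values
    grad_f: gradient of the explicit objective at a point
    nn_feasible: feasibility of a point as predicted by NN(x)
    """
    x = list(start)
    while True:
        usable = list(range(len(x)))                  # usable set U
        g = grad_f(tuple(x))
        moved = False
        while usable:
            # pick the usable variable with the largest |df/dx_i|
            i = max(usable, key=lambda j: abs(g[j]))
            values = grid[i]
            k = values.index(x[i])
            # descend: negative gradient -> step up, positive -> step down
            k_new = k + 1 if g[i] < 0 else k - 1
            if 0 <= k_new < len(values):
                trial = list(x)
                trial[i] = values[k_new]
                if nn_feasible(tuple(trial)):
                    x = trial                         # accept the move
                    moved = True
                    break
            usable.remove(i)                          # U = U - {x_i}
        if not moved:
            return tuple(x)                           # U empty: local optimum
```

Each accepted move restarts the outer loop with a fresh gradient and a full usable set, mirroring the two nested loops in Figure 6.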
Back to our three-bar-truss example, Figure 7 shows the feasible domain simulated by NN(x) using the two initial training points {(81, 81), infeasible} and {(289, 289), feasible}. The search algorithm described above then searches for the optimum point in the feasible domain in Figure 7. The feasible design point (289, 289) is used as the starting point, and the search algorithm terminates at a new design point, (81, 289).
Figure 7. The feasible domain simulated by NN(x) with two initial training points
This new design point is then evaluated using the true constraints of M in Eq.(3), and is determined to be feasible. Thus a new training point, {(81, 289), feasible}, is added to the set of training points, and the neural network is trained again. After training is completed, NN(x) is updated, and the search algorithm is started again to search for the optimum point in the feasible domain simulated by NN(x) with three training points.
The iteration continues, with one more training point added to the set of training points after each iteration. The whole process terminates when we get the same design point repeatedly and no new training point is generated. In our three-bar-truss example, the SNA method terminates after 13 iterations. The final design point obtained by the SNA method is the same as the true global minimum.
(a) The feasible domain simulated by NN(x). (b) The true feasible domain of M.
Figure 8. The feasible domains of the approximate model and the exact model.
Figure 9. Iteration history of the three-bar-truss example
Figure 8 compares the feasible domain simulated by NN(x) at the end of the process with the true feasible domain of M in Eq.(3). The feasible domain of AM in the neighborhood of the optimum point is exactly the same as that of M, because most training points (nodes enclosed by square brackets in the figures) fall in this region. Figure 9 shows the iteration history of this example; the points enclosed by circles are the feasible design points. Note that only 13 training points are used; that is, only 13 evaluations of the implicit constraints are needed, out of the 17 x 17 = 289 possible combinations. No gradient calculation is required.
Design Examples
Two design examples are presented to demonstrate the practicality of the SNA method. The first is a pressure vessel design example, widely used in the literature to demonstrate various algorithms for discrete-variable optimization. This example does not have implicit constraints. It is used here to compare the result obtained by the SNA method, treating all constraints as implicit constraints, with those obtained by other discrete programming algorithms in the literature. The second example is the optimal design of a cross member of a vehicle. This example has an implicit constraint that has to be evaluated by finite element analysis. The result obtained by the SNA method is also compared with that obtained by sequential linear programming.
Pressure Vessel Design
Figure 10. The pressure vessel design
A cylindrical pressure vessel is capped at both ends by hemispherical heads, as shown in Figure 10. The total cost, including the cost of material and the cost of forming and welding, is to be minimized. The design variables are the thickness of the shell, the thickness of the head, and the inner radius and length of the cylindrical section. These variables are denoted x_1, x_2, x_3, and x_4, respectively.
This problem was discussed by Sandgren (1988) and by Kannan and Kramer (1994) to demonstrate algorithms for nonlinear discrete-variable optimization. The units of the variables used in the literature are inches. Here we also use inches so that we can compare our result with those in the literature. This problem can be formulated as follows:
min. f(x) = 0.6224 x_1 x_3 x_4 + 1.7781 x_2 x_3^2 + 3.1661 x_1^2 x_4 + 19.84 x_1^2 x_3
s.t. g_1(x) = x_1 - 0.0193 x_3 ≥ 0
     g_2(x) = x_2 - 0.00954 x_3 ≥ 0
     g_3(x) = π x_3^2 x_4 + (4/3) π x_3^3 - 1296000 ≥ 0
     g_4(x) = 240 - x_4 ≥ 0  (5)
In the literature, variables x_1 and x_2 are integer multiples of 0.0625 inch, the available thicknesses of rolled steel plates; x_3 and x_4 are continuous variables. Here we assume all four variables are discrete, and each variable can take 11 possible discrete values, as listed below. Note that the ranges of the variables are smaller than those in the literature.
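The objective and constraint values reported in Tables 2 and 3 can be reproduced from this formulation. The sketch below assumes the standard pressure-vessel cost and constraint expressions of Sandgren's formulation:

```python
from math import pi

def cost(x):
    """Total cost of material, forming and welding (Sandgren's model)."""
    x1, x2, x3, x4 = x
    return (0.6224 * x1 * x3 * x4 + 1.7781 * x2 * x3**2
            + 3.1661 * x1**2 * x4 + 19.84 * x1**2 * x3)

def constraints(x):
    """Constraint values g_i(x); all must be >= 0 for a feasible design."""
    x1, x2, x3, x4 = x
    return (x1 - 0.0193 * x3,                                   # shell thickness
            x2 - 0.00954 * x3,                                  # head thickness
            pi * x3**2 * x4 + 4.0 / 3.0 * pi * x3**3 - 1296000.0,  # volume limit
            240.0 - x4)                                         # length limit

start = (1.25, 0.625, 50.0, 120.0)   # initial design point of Table 2
final = (1.0, 0.5, 48.5, 112.0)      # the SNA result
```

Evaluating `cost(start)` gives 9589.925 and `cost(final)` gives about 6788.99, matching the first and last rows of Table 2 and the F row of Table 3.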
Table 1. Possible discrete values of the design variables

       1.     2.      3.     4.      5.     6.      7.     8.      9.     10.     11.
x_1    0.625  0.6875  0.75   0.8125  0.875  0.9375  1      1.0625  1.125  1.1875  1.25
x_2    0.0    0.0625  0.125  0.1875  0.25   0.3125  0.375  0.4375  0.5    0.5625  0.625
x_3    45     45.5    46     46.5    47     47.5    48     48.5    49     49.5    50
x_4    100    102     104    106     108    110     112    114     116    118     120
The design point {(1.25, 0.625, 50, 120), feasible} is used as the initial training point. This is the same starting point as in the literature. The SNA method terminates after 21 iterations. Table 2 shows the iteration history. Note that only 20 design points are evaluated, out of 11^4 = 14641 possible combinations. The final design point, (1, 0.5, 48.5, 112), was visited 3 times during the iterations.
Table 2. Iteration history on the pressure vessel design example

Iteration  x_1     x_2     x_3   x_4  Weight  Feasibility
0          1.25    0.625   50    120  9589.9  Y
1          0.625   0       45    100  2222.9  N
2          0.625   0.0625  46.5  118  2881.0  N
3          0.625   0.375   48.5  118  4316.5  N
4          0.625   0.625   45    120  4848.2  N
5          0.6875  0.625   49    118  5778.5  N
6          0.9375  0.625   50    120  7485.1  N
7          1.0625  0.625   49    118  8011.1  Y
8          1       0.4375  49    118  6812.3  N
9          1       0.625   49    118  7612.7  Y
10         0.9375  0.625   49    118  7224.9  N
11         1       0.5625  48.5  118  7250.5  Y
12         1       0.5     46.5  118  6633.6  N
13         1       0.5625  45    100  6035.6  N
14         1       0.5     48.5  116  6922.4  Y
15         1       0.375   48.5  116  6399.6  N
16         1       0.5     48    108  6569.2  N
17         1       0.5     48.5  112  6789.0  Y
18         1       0.5     48.5  108  6655.6  N
19         1       0.5     48.5  110  6772.3  N
20         1       0.5     48.5  112  6789.0  Y
21         1       0.5     48.5  112  6789.0  Y

Table 3 compares the results obtained here with those in the literature. The design point obtained by the SNA method has a lower objective function value than both results in the literature.
Table 3. Comparing the results with those in the literature

         Initial design point  The SNA method  Kannan and Kramer  Sandgren
x_1      1.25                  1               1.125              1.125
x_2      0.625                 0.5             0.625              0.625
x_3      50.0                  48.5            58.291             47.7
x_4      120                   112             43.69              117.701
g_1(x)   0.285                 0.064           0                  0.204
g_2(x)   0.148                 0.037           0.069              0.170
g_3(x)   170076.571            9533.333        21.222             54.225
g_4(x)   120                   128             196.310            122.300
F        9589.925              6788.988        7198.043           8129.104

Cross Member Design
Figure 11. The finite element model of a cross
member of a vehicle
Figure 11 shows the finite element model of a cross member of a vehicle. Only half of the model is shown because of symmetry. The cross member is bolted to the frame of the vehicle at points P and Q. The suspension of the vehicle exerts a force on the cross member at point R, and a pressure loading on the top of the lower barrel of part A. There are four sizing variables, t_1, t_2, t_3, t_4, the thicknesses of parts A, B, C and D, to be decided. The objective is to minimize the weight of the cross member, while keeping the maximum stress of the cross member under the given loading and boundary conditions below an allowable value. The cross member is made of steel. The thicknesses of the steel plates available from manufacturers are also restricted.
This optimization problem can be formulated as below:
min. f(t) = a t_1 + b t_2 + c t_3 + d t_4
s.t. σ_max(t) ≤ σ_a
     t_i ∈ D_i, i = 1, ..., 4  (6)
where the coefficients a, b, c, d directly relate to the areas of parts A, B, C, D, with a = 0.5525, b = 0.7606, c = 0.4484, d = 0.2272. The allowable stress σ_a = 450 MPa, and each variable can take six allowable discrete thickness values. There are a total of 1296 (= 6^4) possible combinations of the discrete values. Note that the stress constraint is an implicit constraint, since there is no analytical form of the stress with respect to the design variables.
The SNA method is used to solve this discrete-variable optimization problem. The design points (2, 2, 6, 6) and (2, 2, 7, 7) are used as the initial training points; (2, 2, 6, 6) is the middle point of the search domain, and (2, 2, 7, 7) is the upper bound of the search domain. The SNA method terminates after 24 iterations. The final design point is (1, 2, 6, 5); the weight of this final design is 6.18 kg, and the maximum stress of this design is 437.95 MPa. Table 4 shows the iteration history of this example.
The same problem was also solved using a sequential linear programming (SLP) algorithm. From the same starting point, SLP terminates at a continuous solution after 20 iterations. The weight at this continuous design point is 6.058 kg, and the maximum stress is 449.26 MPa. Note that the discrete solution obtained by the SNA method is exactly the neighboring discrete point of the continuous solution.
For the SLP algorithm, each iteration requires 5 finite element analyses in this case: 1 analysis at the current design point, and 4 analyses for the sensitivity calculation. Therefore SLP needs a total of more than 100 analyses to reach the continuous solution. More analyses may be required to obtain the discrete solution from the continuous solution. On the other hand, the SNA method requires only one analysis in each iteration, and no sensitivity analysis is required. Thus only 25 analyses were required to reach the final discrete solution.
Table 4. Iteration history on the cross member design example

Iteration  t_1  t_2  t_3  t_4  Weight (kg)  Smax (MPa)  Feasibility
0          2    2    6    6    5.88         506.52      N
0          2    2    7    7    7.22         368.08      Y
1          2    2    5    6    5.61         613.94      N
2          2    2    7    6    6.93         380.69      Y
3          2    2    6    6    6.29         457.54      N
4          2    2    7    6    6.68         393.5       Y
5          1    2    6    6    6.26         443.15      Y
6          1    1    6    5    5.67         504.44      N
7          2    2    6    7    6.65         439.26      Y
8          2    2    5    6    5.94         597.28      N
9          2    2    6    6    6.51         436.02      Y
10         1    2    6    5    5.93         499.93      N
11         2    2    6    6    6.48         420.95      Y
12         2    2    6    5    6.23         495.57      N
13         2    2    7    5    6.58         492.19      N
14         2    2    7    5    6.66         487.13      N
15         2    2    7    6    6.75         391.11      Y
16         1    2    6    5    6.03         451.93      N
17         2    1    6    6    6.1          465.67      N
18         2    2    7    5    6.69         431.76      Y
19         1    2    5    6    5.82         528.18      N
20         2    2    6    6    6.47         450.23      N
21         2    2    6    6    6.59         416.74      Y
22         2    2    6    5    6.35         438.11      Y
23         1    2    6    5    6.18         437.95      Y
24         1    2    6    5    6.18         437.95      Y

Discussion and Conclusions
This paper presents a framework for applying neural networks to discrete-variable engineering optimization problems. A backpropagation neural network is used to approximate the feasible domain of the implicit constraints. A clear 0-1 pattern is used in the input and output layers of the neural network, so the computational cost of training the neural network is small compared with that of evaluating the implicit constraints. In each iteration, only one evaluation of the implicit constraints is needed, to see whether the current design point is feasible. No precise function value or sensitivity calculation is required. When applied to several engineering design problems, the SNA method demonstrated the potential to be a more practical and efficient approach than derivative-based numerical approximation algorithms.
One potential problem of the SNA method is that when the number of design variables and the number of discrete values each design variable can take are large, the search space will be large. This might affect the efficiency of training the neural network, because the number of input neurons will be large. One way to overcome this potential problem is to control the "resolution" of the discrete grid of the search space. Instead of taking all discrete values at once, we can choose a few discrete values first, so that the size of the search space is acceptable. When the SNA method terminates at a discrete design point, we can then reduce the search space to the neighborhood of this point, increase the resolution of the discrete grid, and start the iterations again.
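This coarse-to-fine strategy can be sketched as below; `sna` is a hypothetical stand-in for one complete run of the SNA method on a given discrete grid, and the `levels`, `coarse`, and `window` parameters are illustrative choices, not values from the paper.

```python
def refine(sna, full_values, levels=3, coarse=5, window=2):
    """Run SNA on progressively finer discrete grids (illustrative sketch).

    full_values: per-variable sorted lists of all allowable discrete values
    coarse: roughly how many evenly spaced values to keep at the first level
    window: half-width (in grid indices) of the neighborhood kept around
            the previous solution at each refinement
    """
    # level 0: a coarse, evenly spaced subset of each variable's values
    grids = [vals[::max(1, len(vals) // coarse)] for vals in full_values]
    best = sna(grids)
    for _ in range(levels - 1):
        # restrict the full grid to a neighborhood of the current solution
        grids = []
        for vals, b in zip(full_values, best):
            k = min(range(len(vals)), key=lambda j: abs(vals[j] - b))
            grids.append(vals[max(0, k - window): k + window + 1])
        best = sna(grids)
    return best
```

Each refinement keeps the number of input neurons, and hence the training cost, roughly constant while the effective resolution of the search space increases.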
Extending this concept to the extreme, the SNA method can be applied to continuous-variable optimization problems with implicit constraints. The continuous variables and functions are "digitized" into discrete points. The SNA method can then be applied repeatedly until the resolution of the digitization reaches the desired accuracy. The discrete design point obtained from this framework can also be used as the starting point for other continuous-variable search algorithms.
Acknowledgment
This research is partially supported by the National Science Council, Taiwan, ROC, grant number NSC-83-0401-E-155-023. This support is gratefully acknowledged.
References
Arora, J.S., Huang, M.W., and Hsieh, C.C., 1994, "Methods for optimization of nonlinear problems with discrete variables: A review," Structural Optimization, Vol. 8, No. 2-3, pp. 69-85.
Haftka, R.T., and Gurdal, Z., 1992, Elements of Structural Optimization, 3rd edition, Kluwer Academic Publishers.
Hopfield, J.J., and Tank, D.W., 1985, "Neural Computation of Decisions in Optimization Problems," Biological Cybernetics, Vol. 52, pp. 141-152.
Kannan, B.K., and Kramer, S.N., 1994, "An Augmented Lagrange Multiplier Based Method for Mixed Integer Discrete Continuous Optimization and Its Applications to Mechanical Design," Journal of Mechanical Design, Vol. 116, pp. 405-411.
Lee, S.J., and Chen, H., 1991, "Design optimization with back-propagation neural networks," Journal of Intelligent Manufacturing, Vol. 2, pp. 293-303.
Lu, P., and Siddall, J.N., 1993, "A Boolean Local Improvement Method for General Discrete Optimization Problems," Journal of Mechanical Design, Transactions of the ASME, Vol. 115, pp. 776-783.
Oda, J., and Mizukami, T., 1993, "Technique for optimum design of truss structures by neural network," Transactions of the Japan Society of Mechanical Engineers, Part A, Vol. 59, No. 557, pp. 273-278.
Sandgren, E., 1988, "Nonlinear Integer and Discrete Programming in Mechanical Design," Proceedings of the ASME Design Technology Conference, Kissimmee, FL, pp. 95-105.
Thanedar, P.B., and Vanderplaats, G.N., 1995, "Survey of Discrete Variable Optimization for Structural Design," Journal of Structural Engineering, Vol. 121, No. 2, pp. 301-305.