Author: Yeh-Liang Hsu, Shu-Gen Wang, Chia-Chieh Yu (2003-08-19); recommended: Yeh-Liang Hsu (2004-03-22).
Note: This paper is published in Engineering Optimization, Vol. 35, No. 5, October 2003, pp. 489-511.

A sequential approximation method using neural networks for engineering design optimization problems

Abstract

There are three characteristics of engineering design optimization problems: (1) the design variables are often discrete physical quantities; (2) the constraint functions often cannot be expressed analytically in terms of the design variables; (3) in many engineering design applications, critical constraints are often “pass-fail”, “0-1” type binary constraints. This paper presents a sequential approximation method specifically for engineering optimization problems with these three characteristics. In this method a back-propagation neural network is trained to simulate a rough map of the feasible domain formed by the constraints, using a few representative training data. A training data point consists of a discrete design point and whether this design point is feasible or infeasible. Function values of the constraints are not required. A search algorithm then searches for the optimal point in the feasible domain simulated by the neural network. This new design point is checked against the true constraints to see whether it is feasible, and is then added to the training set. The neural network is trained again with this added information, in the hope that the network will better simulate the boundary of the feasible domain of the true optimization problem. A further search is then made for the optimal point in this new approximated feasible domain. This process continues in an iterative manner until the approximate model locates the same optimal point in consecutive iterations. A restart strategy is also employed so that the method may have a better chance of reaching a global optimum. Design examples with large discrete design spaces and implicit constraints are solved to demonstrate the practicality of this method.

Keywords: sequential approximation, neural network, nonlinear discrete-variable optimization, implicit constraint.

1.     Introduction

Engineering design optimization applies optimization algorithms during the design phase to ensure the quality and performance, and to reduce the cost, of a product. Engineering design optimization problems have several particular characteristics that distinguish them from general mathematical programming problems:

(1)     Their design variables are often discrete physical quantities. In many engineering applications, variables of the optimization problem must be selected from a table or a list of discrete or integer values for practical reasons. For example, mechanical components may only be available in standard sizes.

(2)     The constraint functions in engineering design optimization problems often cannot be expressed analytically in terms of design variables. Such constraints are called implicit constraints [1]. Typical examples of implicit constraints are those on structural responses such as stress, displacement, and natural frequency, and constraints involving manufacturing parameters, measurement data, etc. They are usually evaluated by computer or by physical experiments. Thus the evaluation of the implicit constraints is usually the major cost of the optimization process. Moreover, the analytical forms of the first derivatives of the implicit constraints with respect to the design variables are usually not available. Sensitivity of such constraints often has to be calculated through finite difference schemes, which require many function evaluations, and often have problems with accuracy.

(3)     In many engineering design applications, the final design has to pass certain critical constraints. These are also implicit constraints, and they are often “pass-fail”, “0-1” type binary constraints.

In summary, an engineering design optimization problem can usually be formulated as below:

        min.  f(x)

        s.t.   gi(x) ≤ 0,  i = 1, 2, …, p        (explicit constraints)

               hj(x) ≤ 0,  j = 1, 2, …, q        (implicit constraints)

               bk(x) = 1,  k = 1, 2, …, r        (“pass-fail” binary constraints)

               xi ∈ Di,  i ∈ Id                                                    (1)

where x is the vector of design variables, f(x) is the objective function, gi(x), i = 1, …, p, are the explicit constraints, hj(x), j = 1, …, q, are the implicit constraints which cannot be expressed analytically in terms of the design variables, and bk(x), k = 1, …, r, are the “pass-fail” type binary constraints; Id is the set of indices of the design variables that can take only discrete values, and Di is the set of allowable discrete values for variable xi.

There has been considerable interest in the area of nonlinear discrete optimization, and several review/survey articles on algorithms for nonlinear optimization problems with mixed discrete variables have been published. Huang and Arora [2] categorized mixed continuous-discrete variable nonlinear optimal design problems into six types, depending on the types of the variables and problem functions: (1) continuous design variables with functions twice continuously differentiable; (2) mixed design variables with functions twice continuously differentiable, where discrete variables may take non-discrete values during the solution process; (3) mixed design variables with non-differentiable functions, where discrete variables may take non-discrete values during the solution process; (4) mixed design variables with functions that may or may not be differentiable, where some of the discrete variables must take only discrete values during the solution process; (5) mixed design variables with functions that may or may not be differentiable, where some of the discrete variables are linked to others; (6) non-differentiable discrete combinatorial problems. The engineering optimization problem in Equation (1) is close to type (3) or (4) above, but the existence of the binary constraints bk(x) further complicates the problem.

Arora and Huang [3] classified the numerical methods for solving mixed-discrete nonlinear programming problems into six types: (1) branch and bound methods; (2) simulated annealing; (3) sequential linearization followed by branch and bound, integer programming, simulated annealing, or other methods; (4) penalty function approaches; (5) Lagrangian relaxation methods; and (6) other methods such as rounding-off, cutting plane, heuristics, genetic algorithms, pure discrete techniques, etc. Thanedar and Vanderplaats [4] presented a survey of the available methods for discrete variable structural optimization and divided them into three principal categories: the classical branch and bound method, linearization or approximation methods, and ad-hoc methods including simulated annealing and genetic algorithms.

Among these methods, the branch and bound method, simulated annealing, and genetic algorithms can handle problems with non-differentiable functions. However, these methods require many function evaluations, which makes them less suitable for engineering optimization problems with implicit constraints.

One important category of numerical optimization algorithms is the sequential approximation methods. The basic idea of sequential approximation methods is to use a “simple” sub-problem to approximate the hard, exact problem. By a “simple” sub-problem is meant the type of problem that can be readily solved by existing numerical algorithms; for example, linear programming sub-problems are widely used in sequential approximation methods. The solution point of the simple sub-problem is then used to form a better approximation to the hard, exact problem for the next iteration. In this iterative manner, it is expected that the solution point of the simple approximate problem will get closer to the optimum point of the hard, exact problem. One major disadvantage of the existing sequential approximation methods is that they are usually derivative-based, requiring at least the first derivatives of the constraints with respect to the design variables.

This paper presents a sequential approximation method specifically for engineering optimization problems with the three characteristics previously mentioned. A preliminary concept of this method was first proposed by Hsu et al. [5]. In this method, instead of using first derivatives of constraints for approximation, a back-propagation neural network is trained to simulate a rough map of the feasible domain formed by the constraints, using a few representative training data. A training data point consists of a discrete design point and whether this design point is feasible or infeasible. Function values of the constraints are not required. A search algorithm then searches for the “optimal point” in the feasible domain simulated by the neural network. This new design point is checked against the true constraints to see whether it is feasible, and is then added to the training set. The neural network is trained again with this added information, in the hope that the network will better simulate the boundary of the feasible domain of the true optimization problem. A further search is then made for the optimal point in this new approximated feasible domain. This process continues in an iterative manner until the approximate model locates the same optimal point in consecutive iterations.

Hopfield and Tank [6] first extended neural network applications to optimization problems. In the Hopfield model, the optimization problem is converted into the problem of minimizing an energy function formulated from the design constraints. Watta [7] used similar ideas to propose a coupled gradient network approach, consisting of two interacting recurrent networks, for a class of constrained mixed-integer optimization problems. The energy function consists of the objective function plus a penalty function that enforces the constraints. A multi-start mode may easily be incorporated into the computation, producing results that are within a few percent of the globally optimal solution.

Lee and Chen [8] also explored the idea of applying neural network technology in design optimization. In their approach, neural networks were used to simulate the constraint functions. Renaud et al. [9] used neural networks to approximate the discrete system space and repeatedly evaluated the network instead of the actual design space during optimization. The input/output pairs produced during the subspace optimizations are used to update the network training, and the optimum of the network approximation is used as the starting value for the next iteration. Hajela [10] provided a general overview of issues pertinent to using non-gradient methods in multidisciplinary design optimization problems, which are characterized by the presence of a large number of discrete design variables and constraints. He pointed out that a neural network is a special form of response surface in which the response function is a nested squashing function; the interconnection weights correspond to the regression parameters of a typical regression model.

Instead of simulating the function values of the objective function and constraints, the sequential approximation method using neural networks (the SNA method) developed in this research only simulates whether a data point is feasible (1) or infeasible (0) in the optimization problem; this clear 0-1 pattern makes the training process relatively fast. One key point of the SNA method is that the computational cost of training the neural network has to be relatively small compared with that of the function value and sensitivity calculations of the implicit constraints. Otherwise, using a neural network in optimization will have little computational advantage over the traditional numerical optimization algorithms in those cases where traditional algorithms are still applicable. A restart strategy is also employed so that the method may have a better chance of reaching a global optimum.

Section 2 of this paper describes the SNA algorithm. Section 3 presents two verification examples, in which the SNA method finds the global minimum successfully. In Section 4, three design examples with large discrete design spaces and implicit constraints are solved by the SNA method to demonstrate its practicality.

2.     The sequential approximation method using neural networks

2.1 The structure of the method

As discussed earlier, the basic idea of sequential approximation methods is to use a simple sub-problem to approximate the hard, exact problem, such as Equation (1). The feasible domain of the optimization model in Equation (1) can be simulated using a few representative data to form the “simple” approximated model below:

        min.  f(x)

        s.t.    NN(x)=1                                                                                         (2)

where the binary constraint NN(x)=1 approximates the feasible domain of the optimization model in Equation (1). If NN(x)=1, the discrete point x is feasible; if NN(x)=0, the discrete point x is infeasible. The discrete variable constraints are embedded in the format of the input nodes of the neural network, as will be described later. Throughout this paper, the optimization model, Equation (1), will be denoted Mreal and the approximate model in Equation (2) will be denoted MNN for convenience. MNN is a simple approximated model because the computational cost of evaluating NN(x) is much less than that of evaluating the implicit constraints in Mreal.

Figure 1 outlines the sequential approximation method using neural networks. A set of initial training data S0 containing at least one feasible point is given. A back-propagation neural network is trained on S0 to simulate a rough map of the feasible domain formed by the implicit constraints, giving the initial optimization model MNN. There has to be at least one feasible point in S0 so that MNN has a feasible domain.

Starting from a feasible point s, a search algorithm is used to search for the solution point xj of MNN in the j-th iteration. The search algorithm, discussed later, has to start from a feasible design point; there can be several feasible design points in S0, and the one with the lowest objective value is used first. The solution point xj is then evaluated to see whether it is feasible in Mreal, and whether xj = xj-1. If not, this new solution point xj is added to the set of training points S, and the neural network is trained again with the updated data set. The search for the solution point in the updated MNN then starts again, using xj as the starting point if xj is feasible. With an increasing number of training points, it is expected that the feasible domain simulated by the neural network in MNN will get closer to that of the exact model Mreal, and the solution point of MNN will get closer to the optimum point.

If the feasible solution point xj generated in the current iteration is the same as xj-1, no new training point is generated. The set of training points S, and consequently MNN, is not updated. The current search then has to be terminated, because the same solution point would be reached again.

To ensure a better chance to reach a global optimum in the search domain, the next step is to restart the search from another feasible point in S0, which has the second lowest objective value. The iterations continue until all feasible points in S0 have been used as initial starting points. Finally the feasible design point with the lowest objective value in S is output as the optimal solution.

Figure 1. Flow chart of the algorithm
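The loop of Figure 1 can be sketched in code. This is an illustrative sketch, not the authors' implementation: all names are ours, the `NearestLabel` surrogate below merely memorizes labels and predicts from the nearest labeled point (standing in for the back-propagation network), and `greedy_search` is a 1-D stand-in for the discrete search of Section 2.4.

```python
def sna_loop(f, feasible_real, surrogate, search, s0):
    """Sketch of the SNA outer iteration with restarts.
    s0: list of (point, feasible) training pairs; surrogate offers
    fit(points, labels) / predict(point); search(start, surrogate, f)
    returns the surrogate-optimal point from a feasible starting point."""
    data = dict(s0)                                  # point -> feasibility label
    starts = sorted((p for p, ok in s0 if ok), key=f)
    for s in starts:                                 # restart strategy
        surrogate.fit(list(data), [data[p] for p in data])
        x_prev, x = None, s
        while True:
            x = search(x if data.get(x) else s, surrogate, f)
            if x == x_prev:                          # same optimum twice: stop
                break
            data[x] = feasible_real(x)               # one true evaluation
            surrogate.fit(list(data), [data[p] for p in data])
            x_prev = x
    return min((p for p, ok in data.items() if ok), key=f)


class NearestLabel:
    """Toy stand-in for the neural network (1-D points only):
    predicts the label of the nearest labeled point."""
    def fit(self, points, labels):
        self.mem = dict(zip(points, labels))
    def predict(self, x):
        return self.mem[min(self.mem, key=lambda p: abs(p - x))]


def greedy_search(start, surrogate, f, lo=0, hi=10):
    """Move to the surrogate-feasible neighbor with lower objective."""
    x = start
    while True:
        better = [u for u in (x - 1, x + 1)
                  if lo <= u <= hi and f(u) < f(x) and surrogate.predict(u)]
        if not better:
            return x
        x = min(better, key=f)
```

On a toy problem (minimize f(x) = x over integers 0..10, feasible iff x >= 3, initial data {(10, feasible), (0, infeasible)}), the loop converges to x = 3 with only a handful of true feasibility evaluations, mirroring the behaviour described above.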

2.2 Defining the set of initial training data using the three-level orthogonal array

Figure 2 shows a simple three-bar-truss problem that is commonly used in the structural optimization literature to demonstrate various optimization techniques [11]. Here it is used as an example to illustrate the detailed procedure of the SNA method. The objective of this problem is to find the optimal values of the three cross-sectional areas A1, A2 and A3 of the bars that minimize the weight of the structure, subject to maximum stress constraints on the three bars. The horizontal force P can act either to the right or to the left; thus we have a symmetric structure in which A1 equals A3.

Figure 2. Three-bar trusses

Substituting the values of the parameters ρ, E, P, L, and σmax given in the example, and assuming that all three bars have square cross sections whose sizes, set by the raw material available from manufacturers, range from 1×1 mm2, 2×2 mm2, and so on up to 17×17 mm2, we obtain the discrete-variable optimization problem below:

        min.  f(A1, A2)

        s.t.   g1(A1, A2) ≤ 0

               g2(A1, A2) ≤ 0

               Ai ∈ {1, 4, 9, …, 289},  i = 1, 2.                                  (3)

Equation (3) represents Mreal of this problem. For illustrative purposes, g1 and g2 are assumed to be implicit constraints, which will be simulated by a back-propagation neural network to form MNN.

Referring to Figure 1, a set of initial training points S0 is used to train the neural network so that it can simulate a rough map of the whole design domain. The set of initial training points should therefore reasonably represent the whole design domain. Various types of matrices are commonly used for planning experiments that study several input variables. Orthogonal arrays, first proposed by Plackett and Burman [12] and Rao [13], are highly popular because they are geometrically balanced, covering the experimental region with just a few representative experiments. Orthogonal arrays are also adopted in this research to define the set of initial training data S0.

A series of orthogonal arrays is distinguished by the number of design variables and the number of levels defined for each variable. In this research, three-level orthogonal arrays are preferred over two-level orthogonal arrays because they have a higher resolution, especially for nonlinear behavior, and because they cover the middle range of the design domain more properly. A three-level orthogonal array is denoted Li(3^j), where the index i is the number of data points and j is the maximum number of variables. For example, the L9(3^4) array is suitable for no more than 4 variables and requires 9 data points; the L18(2^1×3^7) array is suitable for no more than 7 variables and requires 18 data points; similarly, the L27(3^13) array is suitable for no more than 13 variables and requires 27 data points.

For the three-bar-truss problem illustrated above, an L9(3^4) array is used. The three levels are the two end points and the middle point of each variable within its design domain. The 9 initial training points in this example are therefore (1, 1), (1, 81), (1, 289), (81, 1), (81, 81), (81, 289), (289, 1), (289, 81) and (289, 289). Among them, (81, 289) and (289, 289) are feasible.
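As a sketch, these initial points can be generated from the three levels of each variable. For two variables the full three-level grid coincides with the nine L9 points quoted above; for more variables a proper orthogonal array would select only a fraction of the full grid. Function names are ours, for illustration only.

```python
from itertools import product

def three_level_points(domains):
    """Three levels per variable: both end points and the middle point of
    each discrete domain, combined into candidate training points."""
    levels = [(d[0], d[len(d) // 2], d[-1]) for d in domains]
    return list(product(*levels))

# Square cross sections 1x1 .. 17x17 mm^2 give areas 1, 4, ..., 289
areas = [i * i for i in range(1, 18)]
pts = three_level_points([areas, areas])   # the 9 initial training points
```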

2.3 Training the neural network

Figure 3 shows the architecture of the three-layer network. The size of the input layer depends on the number of variables and the number of discrete values of each variable. In this example, there are 2 design variables, with 17 discrete values each. Thus a total of 2×17 = 34 neurons are used in the input layer. Each neuron in the input layer has value 1 or 0 to represent the discrete value each variable takes, as will be explained later. There is only a single neuron in the output layer to represent whether this design point is feasible (the output neuron has value 1) or infeasible (the output neuron has value 0).

A training point is represented as shown in Figure 4, where a cross represents “0” in the node, a circle represents a “1” in the node. For illustrative purpose, the 34 input nodes are arranged into two rows in Figure 4 to represent two design variables. The first 9 nodes of the first row have value 1 to represent that the first variable A1 has the discrete value 81. The first 9 nodes of the second row also have value 1 to represent that the second variable A2 also has the discrete value 81. Checking with Mreal, we can determine that (A1, A2)=(81, 81) is an infeasible design point. So the single node in the output layer has the value of 0 to represent that this point is infeasible. Similarly, Figure 5 shows another initial training point (A1, A2)=(289, 289), which is a feasible design point. Note that in this representation, discrete values of the variables are embedded and neighboring nodes have very similar input patterns.
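This input representation is a “thermometer” code and can be sketched as follows (illustrative names, not the authors' code):

```python
def encode(point, domains):
    """Encode a discrete design point as the network's 0/1 input vector:
    for each variable, the first k nodes are 1 when the variable takes its
    k-th allowable discrete value, and the remaining nodes are 0.  Nearby
    discrete values thus produce similar input patterns."""
    bits = []
    for value, domain in zip(point, domains):
        k = domain.index(value) + 1              # 1-based rank of the value
        bits.extend([1] * k + [0] * (len(domain) - k))
    return bits

areas = [i * i for i in range(1, 18)]            # 17 allowable areas per bar
x = encode((81, 81), [areas, areas])             # 34 input nodes, 9 ones per row
```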

Figure 3. The architecture of the neural network

Input layer

Output layer

(infeasible)

Figure 4. Representation of an infeasible training point {(81, 81), infeasible}.

Input layer

Output layer

(feasible)

Figure 5. Representation of a feasible training point {(289, 289), feasible}.

The number of neurons in the hidden layer depends on the number of variables in the input layer; there are 12 neurons in the hidden layer in this example. The transfer functions used in the hidden and output layers of the network are both log-sigmoid functions, so the neuron in the output layer has a value in the range [0, 1]. After training is completed, a threshold value of 0.25 is applied to the output layer when simulating the boundary of the feasible domain. In other words, given a discrete design point in the search domain, the network output is taken as 0 if the output neuron's value is less than the threshold and as 1 otherwise, indicating whether this discrete design point is infeasible or feasible.

The computational effort required for neural network training is critical: if it is larger than the cost of evaluating the implicit constraints, the SNA method loses its advantage. Here all the training data are represented in a clear 0-1 pattern, which makes the training process relatively fast. A quasi-Newton algorithm, which is based on Newton's method but does not require the calculation of second derivatives, is used for the training. In our numerical experience, an error goal of 1e-6 is usually met within 100 epochs, even for cases with many training points.
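A minimal stand-in for this training step is sketched below. The text specifies a quasi-Newton algorithm; for simplicity the sketch substitutes plain full-batch gradient descent on the sum-of-squares error, keeping the layer structure and the 0.25 output threshold described above. All names are ours.

```python
import numpy as np

def logsig(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_net(X, y, hidden=12, epochs=2000, lr=0.5, seed=0):
    """Toy three-layer log-sigmoid network trained by full-batch gradient
    descent (a stand-in for the paper's quasi-Newton training).  Returns a
    0/1 classifier using the 0.25 threshold, plus the loss history."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1));          b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        H = logsig(X @ W1 + b1)              # hidden layer activations
        out = logsig(H @ W2 + b2)            # single output neuron in [0, 1]
        err = out - y
        losses.append(float((err ** 2).sum()))
        d2 = err * out * (1.0 - out)         # backprop through output sigmoid
        d1 = (d2 @ W2.T) * H * (1.0 - H)     # backprop through hidden sigmoid
        W2 -= lr * (H.T @ d2); b2 -= lr * d2.sum(axis=0)
        W1 -= lr * (X.T @ d1); b1 -= lr * d1.sum(axis=0)
    def predict(xrow):                       # 1 = feasible, 0 = infeasible
        o = logsig(logsig(xrow @ W1 + b1) @ W2 + b2)
        return int(o[0] >= 0.25)             # threshold from the text
    return predict, losses
```

Trained on a few thermometer-coded points, the loss decreases and the thresholded output reproduces the 0/1 feasibility labels.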

After this initial training is completed, the mathematical optimization model Mreal is transformed into MNN, in which the feasible domain of the optimization model is simulated by NN(x). Figure 6 shows the feasible domain of the three-bar-truss example simulated by MNN with 9 initial training points defined by the L9 orthogonal array (nodes enclosed by square brackets in the Figure). There are 289 possible discrete combinations as shown in the figure. A circle represents that the design is feasible in MNN, and a cross represents that the design is infeasible.

It is possible that none of the points in the orthogonal array is feasible, in which case the whole search domain would be infeasible in MNN. Therefore at least one feasible point has to be included in the set of initial training data.

Figure 6. The feasible domain simulated by MNN with 9 initial training points

2.4 Searching in the feasible domain

A search for the optimum point of MNN is then started. Figure 7 is the flow chart of this discrete search algorithm. The search algorithm has to start from a feasible design point. As discussed in the previous section, if a new design point from the previous iteration is a feasible design point in Mreal, it is used as the starting point in the current search. If the new design point is infeasible in Mreal, then the same starting point used in the previous iteration is used again in the current search. As discussed earlier, there has to be at least one feasible design point in the set of initial training points, so the search algorithm can always start from a feasible design point.

As shown in Figure 7, a usable set Uj is formed from the neighboring discrete nodes of xj. There are m = 3^n - 1 nodes in Uj. The components of Uj are sorted by their “distances” to xj in descending order, so that Uj = {u1, u2, u3, …, um}. Note that “distance” here does not mean the absolute distance ||uk - xj||; instead, it is the number of variables in uk whose discrete values differ from those of xj. This sorting is therefore the same for all discrete points, given the total number of design variables, and requires no computation. A node with a larger “distance” to xj has a higher priority to be searched first.

A node uk is picked from Uj and two conditions are checked: “is uk feasible in MNN?” and “is f(uk) < f(xj)?”. If both conditions are satisfied, uk is a better feasible point than xj, and the algorithm moves to uk and starts the next iteration (j = j+1). If either condition is not satisfied, the next node in Uj (k = k+1) is checked. When the end of Uj (k = m) is reached, there is no better feasible point in Uj, and xj is output as the (at least locally) optimal point.
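The usable-set construction and the move step of Figure 7 can be sketched as follows (illustrative names; the bookkeeping of Figure 1 is omitted):

```python
from itertools import product

def neighbors(x, domains):
    """Usable set U of the 3^n - 1 neighboring discrete nodes of x, sorted by
    'distance' = number of variables whose discrete value differs from x,
    largest distance first."""
    idx = [d.index(v) for v, d in zip(x, domains)]
    moves = []
    for deltas in product((-1, 0, 1), repeat=len(x)):
        if all(d == 0 for d in deltas):
            continue                                   # skip x itself
        js = [i + d for i, d in zip(idx, deltas)]
        if all(0 <= j < len(dom) for j, dom in zip(js, domains)):
            moves.append(tuple(dom[j] for j, dom in zip(js, domains)))
    return sorted(moves, key=lambda u: -sum(a != b for a, b in zip(u, x)))

def search(x, domains, f, nn_feasible):
    """Move to the first neighbor that is NN-feasible and improves f;
    stop when no neighbor qualifies."""
    while True:
        for u in neighbors(x, domains):
            if nn_feasible(u) and f(u) < f(x):
                x = u
                break
        else:
            return x                                   # locally optimal node
```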

Figure 7. Flow chart of the search steps

This search algorithm is used to search for the solution point in the feasible domain formed by the L9 initial training points, as shown in Figure 6. There are two feasible design points, (81, 289) and (289, 289), in the set of 9 initial training points. The feasible design point (81, 289) has the lower objective value and is used first as the starting point of the search algorithm. The search algorithm terminates at a new design point (16, 289). Figure 8 shows the path of the search, from point (81, 289) through (64, 256), (49, 225), (36, 256) and (25, 289) to point (16, 289). This point is infeasible when checked against Mreal in Equation (1). The new design point is added to the set of training points, and the neural network is trained again. Figure 9 shows the updated domain. Comparing Figure 9 with Figure 8, the feasible domain of MNN is “pushed” to the right by the new point {(16, 289), infeasible}.

The search algorithm is started again in the updated feasible domain shown in Figure 9 and terminates at another new design point, (36, 289). The path of this search is from point (81, 289) through (64, 256) and (49, 289) to point (36, 289), as shown in Figure 9. This point is feasible when checked against Mreal in Equation (1). This new design point is likewise added to the set of training points, and the neural network is trained again. Figure 10 shows the next updated feasible domain simulated by the neural network. The new feasible design point pushes the feasible domain of MNN downwards.

Since the solution point (36, 289) in the previous iteration is a feasible design point, it is used as the starting point when searching in the feasible domain shown in Figure 10. As shown by the arrow in Figure 10, the next solution point (25, 289) is obtained in just one step.

Figure 8. The path of the search from point (81 289) to point (16 289)

Figure 9. The updated feasible domain

Figure 10. The next updated domain

2.5 The restart strategy

The iterations continue, with one more training point added to the set of training points in each iteration. The first search process terminates at a feasible point (25, 196) after 6 iterations, when the same design point is found twice and no new training point is generated. The objective function value at this point is 2323.6 g, or 48.3% of that of the initial design point (81, 289).

To ensure a better chance of reaching a global optimum, the search process is restarted from the other feasible point, (289, 289), in the set of initial training data. In this example, the search process terminates at the same feasible point (25, 196) in one iteration. Figure 11 compares the feasible domain of MNN at the end of the process with the true feasible domain of Mreal in Equation (1). As shown in Figure 11, (25, 196) is also the global optimum of Mreal. In this example 6 more discrete points are evaluated (nodes enclosed by square brackets in the figures) besides the 9 initial training points in Figure 6. The feasible domain of MNN in the neighborhood of the optimum point is exactly the same as that of Mreal, because most evaluation points fall in this region. Figure 12 shows the iteration history of this example; the points represented by open circles are the feasible design points. Note that a total of 15 points out of the 17×17 = 289 possible combinations (5.2%) are evaluated to obtain this optimum design. No gradient calculation is required.

Figure 11(a). The feasible domain of MNN

Figure 11(b). The feasible domain of Mreal

Figure 12. Iteration history of the three-bar-truss example

3.     Verification examples using the SNA method

Two design examples are presented to verify the solutions found by the SNA method. These examples concern the optimal design of mechanical components and are widely used in the literature to demonstrate various algorithms for discrete variable optimization. Although there are no implicit constraints in these examples, it is useful to verify the results obtained from the SNA method by treating all constraints as implicit constraints. In both examples, the SNA method finds the global minimum.

3.1 I-beam design

The minimum cross-sectional area design of a welded I-beam, subject to bending and compression loads, is considered. As shown in Figure 13, the design variables are the web height h, web thickness tw, and the area of the flange af .

Figure 13. The cross-sectional area of an I-beam

The objective function can be formulated as

        min. f = htw + 2af                                                                                    (4)

There are two constraints. The first constraint is that the stress caused by the bending and compression loads should be lower than the ultimate stress of the material:

        g1 = σb + σc - Ru ≤ 0                                                      (5)

where σb = M/W is the stress caused by the bending load M, W is the section modulus, σc = N/A is the stress caused by the compression load N, A = htw + 2af is the total area of the cross-section, and Ru is the ultimate stress.

The constraint due to web buckling, g2, may be expressed as

        g2(h, tw) ≤ 0                                                              (6)

This problem was discussed by Farkas [14], Yuan et al. [15], and Farkas and Jarmai [16] to demonstrate various algorithms for nonlinear discrete variable optimization. For comparison with the results in the literature, the same parameters M = 320 kNm, N = 128 kN, and Ru = 200 MPa are used. Table 1 shows the discrete values for each variable. The set of initial training data according to the L9(3^4) orthogonal array is also used in this example. The 9 initial training points are (74, 0.9, 18), (74, 0.7, 14), (74, 0.5, 22), (70, 0.9, 14), (70, 0.7, 22), (70, 0.5, 18), (66, 0.9, 22), (66, 0.7, 18) and (66, 0.5, 14). Among them, (74, 0.9, 18), (70, 0.7, 22) and (66, 0.9, 22) are feasible; (70, 0.7, 22) has the lowest objective value and is used first as the starting point of the search.

Table 1. Discrete values of the cross-section area of the I-beam example

              1      2      3      4      5      6      7      8      9
        h     66     68     70     72     74
        tw    0.5    0.6    0.7    0.8    0.9
        af    14     15     16     17     18     19     20     21     22

The SNA method terminates after 21 iterations, including 2 restarts. Figure 14 shows the iteration history. Note that a total of 30 design points out of the 5×5×9 = 225 possible combinations (13.3%) are evaluated. The final optimal design point (70, 0.6, 18) was found in the 2nd restart. This is the same design point found in the literature. An exhaustive search of all 225 design points shows that this design point is indeed the global minimum.

Figure 14. Iteration history of the I-beam example on cross-section area

3.2 Helical compression spring design

The minimum volume design of a helical compression spring, subject to an axially guided constant compression load, is now considered. As shown in Figure 15, the ends of the spring are ground and squared. The design variables are the winding coil diameter dw, the wire diameter d, and the number of spring coils n. Springs are usually manufactured from commercial wire in stock, so the wire diameter can take only a set of discrete values; the number of spring coils has to be an integer, and the winding diameter is also assumed to take a set of discrete values. Siddall [17], Sandgren [18], Lu and Siddall [19], and Kannan and Kramer [20] used this problem to demonstrate algorithms for nonlinear discrete variable optimization.

Figure 15. The spring design example

The objective function is to minimize the volume of the spring and can be formulated as

        min. F = π²dwd²(n+2)/4                                                          (7)

The design limitations placed on this model are specified in the following 8 constraints: (1) The design shear stress caused by the compression load must be lower than the allowable maximum shear stress S of the material. (2) The free length of the spring must be lower than the maximum specified value Lmax. (3) The wire diameter must not be less than the specified minimum diameter Dmin. (4) The outside diameter of the coil must be lower than the specified maximum diameter Dmax. (5) The spring index C = dw/d must be at least three to avoid a too tightly wound spring. (6) The deflection under preload δ must be less than the specified maximum deflection under preload Dpm. (7) The combined deflection must be consistent with the coil free length Lfree. (8) The deflection from preload to maximum load must be greater than the specified working deflection Dw.

Buckling load is not considered due to the fact that the spring is guided. The derivation of the equations used in formulating the constraints can be found in the paper by Siddall [17]. This problem can be formulated as follows:

        min. f = π²dwd²(n+2)/4

        s.t.    g1 = 8Cf·Pmax·dw/(πd³) – S ≤ 0

                g2 = 8Pmax·dw³n/(Gd⁴) + 1.05(n+2)d – Lmax ≤ 0

                g3 = Dmin – d ≤ 0

                g4 = (d + dw) – Dmax ≤ 0

                g5 = 3 – dw/d ≤ 0

                g6 = δ – Dpm ≤ 0

                g7 = 8Pmax·dw³n/(Gd⁴) + (Pmax – Pload)/K + 1.05(n+2)d – Lfree ≤ 0

                g8 = Dw – (Pmax – Pload)/K ≤ 0                                          (8)

where the stress correction factor Cf = (4C–1)/(4C–4) + 0.615/C, the spring index C = dw/d, and the spring rate K = Gd⁴/(8dw³n). Pmax is the maximum design load; E and G are respectively the Young's modulus and the shear modulus of rigidity of the spring material. The values of all the parameters are defined below [18]: S = 189,000 psi, E = 30×10⁶ psi, G = 11.5×10⁶ psi, Pmax = 1000 lb, Lfree = 14 in, Dmin = 0.2 in, Dmax = 3 in, Pload = 300 lb, Dpm = 6 in, Dw = 1.25 in.
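Because the SNA method needs only a pass-fail answer from the constraints, a feasibility check can return a single boolean. A minimal sketch of such a check (illustrative, with hypothetical function names; only the stress and geometric constraints g1, g3, g4 and g5 are implemented here, using the Wahl-type correction factor defined above; the deflection constraints follow the same pattern):

```python
import math

# Parameters from the text (psi, lb, in)
S = 189000.0          # allowable shear stress
P_MAX = 1000.0        # maximum design load
D_MIN, D_MAX = 0.2, 3.0

def volume(n, d, dw):
    """Spring wire volume, Eq. (7): F = pi^2 * dw * d^2 * (n + 2) / 4."""
    return math.pi**2 * dw * d**2 * (n + 2) / 4.0

def is_feasible(n, d, dw):
    """Pass-fail check of constraints g1, g3, g4, g5 of Eq. (8)."""
    C = dw / d                                        # spring index
    Cf = (4*C - 1) / (4*C - 4) + 0.615 / C            # stress correction factor
    g1 = 8 * Cf * P_MAX * dw / (math.pi * d**3) - S   # shear stress limit
    g3 = D_MIN - d                                    # minimum wire diameter
    g4 = (d + dw) - D_MAX                             # maximum outside diameter
    g5 = 3.0 - C                                      # minimum spring index
    return all(g <= 0 for g in (g1, g3, g4, g5))
```

At the reported optimum (n, d, dw) = (7, 0.283, 1.13), the volume evaluates to about 2.010 in³ and the four checks pass.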

Table 2 shows the discrete values of each variable in the literature. The 9 initial training points (L9(3^4) orthogonal array) in this example are (7, 0.207, 1.13), (7, 0.307, 1.38), (7, 0.438, 2), (10, 0.207, 2), (10, 0.307, 1.13), (10, 0.438, 1.38), (13, 0.207, 1.38), (13, 0.307, 2), (13, 0.438, 1.13). Among the 9 initial training points, (7, 0.307, 1.38), (7, 0.438, 2), (10, 0.307, 1.13) and (10, 0.438, 1.38) are feasible, and (7, 0.307, 1.38) has the lowest objective value and is first used as the starting point of the search.

Table 2. Discrete values of each variable

          1      2      3      4      5      6      7      8      9      10     11
n (no.)   7      8      9      10     11     12     13
d (in)    0.207  0.225  0.244  0.263  0.283  0.307  0.331  0.362  0.375  0.394  0.438
dw (in)   1.13   1.18   1.23   1.28   1.33   1.38   1.43   1.53   1.68   1.85   2.00

Figure 16 shows the iteration history using the SNA method, including 3 restarts. The optimal solution point (7, 0.283, 1.13) with a minimum volume of 2.010 in³ was found in the first restart. During this process, 22 out of the 7×11×11 = 847 possible combinatorial combinations (2.6%) are evaluated. An exhaustive search of all 847 design points confirms that the design point (7, 0.283, 1.13) obtained by the SNA method is the global optimum.

Figure 16. Iteration history with multiple restart in ascending objective values

4.     Design examples

Three design examples with large discrete design spaces and implicit constraints are solved by the SNA method to demonstrate its practicality. The first example is a pressure vessel design problem that is widely used in the literature to present various algorithms for discrete variable engineering optimization. This example does not have implicit constraints; it is used here to compare the solution obtained by the SNA method, treating all constraints as implicit constraints, with those obtained by other discrete programming algorithms in the literature. The second example concerns the optimal sizing design of a five-step cantilever beam. This problem has 10 design variables and is used to demonstrate the limitation of the SNA method when the number of design variables is large. The third example concerns the optimal structural design of a reinforced cylindrical body. This example has two implicit constraints that have to be evaluated by finite element analysis.

4.1 Pressure vessel design

Figure 17 shows a cylindrical pressure vessel capped at both ends by hemispherical heads. This compressed air tank has a working pressure of 3,000 psi and a minimum volume of 750 ft³, and is designed in accordance with the ASME boiler and pressure vessel code. The total cost, including a combination of single 60-degree welding cost, material cost and forming cost, is to be minimized. The design variables are the thickness of the shell, the thickness of the head, and the inner radius and length of the cylindrical section, denoted x1, x2, x3 and x4, respectively. The thickness of the shell must not be less than 1.1 inches, the thickness of the head must not be less than 0.6 inch, and neither thickness may exceed 2 inches.

Figure 17. The pressure vessel design example

Sandgren [18] first discussed this problem using the branch and bound method for nonlinear integer and discrete variable optimization; Kannan and Kramer [20] later solved it with an augmented Lagrange multiplier method for mixed integer, discrete, and continuous variable optimization problems. The variables in the literature are given in inches, and inches are used here as well so that the results can be compared. This problem can be formulated as follows:

        min. f = 0.6224x1x3x4 + 1.7781x2x3² + 3.1661x1²x4 + 19.84x1²x3

        s.t.    g1 = –x1 + 0.0193x3 ≤ 0

                g2 = –x2 + 0.00954x3 ≤ 0

                g3 = –πx3²x4 – (4/3)πx3³ + 1,296,000 ≤ 0

                g4 = x4 – 240 ≤ 0                                                       (9)

In Equation (9), the objective function is the total cost, including the material and forming costs expressed by the first two terms, and the welding cost in the last two terms. The ASME codes impose limits on the minimum shell thickness (g1) and spherical head thickness (g2). There are also constraints on the minimum volume required (g3) and the maximum shell length of the tank (g4).
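A direct transcription makes the cost and feasibility easy to check. The sketch below uses the standard expressions from the literature [18, 20] (dimensions in inches; the minimum-volume constant is 750 ft³ = 1,296,000 in³):

```python
import math

def cost(x1, x2, x3, x4):
    """Total cost: material/forming (first two terms) plus welding (last two)."""
    return (0.6224 * x1 * x3 * x4 + 1.7781 * x2 * x3**2
            + 3.1661 * x1**2 * x4 + 19.84 * x1**2 * x3)

def constraints(x1, x2, x3, x4):
    """Constraint values of Eq. (9); g <= 0 means feasible."""
    g1 = -x1 + 0.0193 * x3                    # minimum shell thickness (ASME)
    g2 = -x2 + 0.00954 * x3                   # minimum head thickness (ASME)
    g3 = (-math.pi * x3**2 * x4
          - (4.0/3.0) * math.pi * x3**3
          + 1296000.0)                        # minimum enclosed volume
    g4 = x4 - 240.0                           # maximum shell length
    return (g1, g2, g3, g4)
```

At the SNA solution (1.1875, 0.625, 59, 40) the cost evaluates to about 7442.02, and g1 ≈ –0.049 and g2 ≈ –0.062, matching the values reported in Table 4.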

In the literature, design variables x1 and x2 are integer multiples of 0.0625 inch, the standard sizes of the available thicknesses of rolled steel plates, while x3 and x4 are continuous variables. Here all four variables are assumed to be discrete, and each variable can take a different number of possible discrete values, as listed in Table 3. Note that the ranges of the variables are different from those in the literature.

Table 3. Discrete values of the design variables in the pressure vessel example

        1       2       3       4       5       6       7       8       9       10      11      12
x1      1.125   1.1875  1.25    1.3125  1.375   1.4375  1.5     1.5625  1.625   1.6875  1.75    1.8125
x2      0.625   0.6875  0.75    0.8125  0.875   0.9375  1       1.0625  1.125   1.1875  1.25    1.3125
x3      40      41      42      43      44      45      46      47      48      49      50      51
x4      40      45      50      55      60      65      70      75      80      85      90      95

        13      14      15      16      17      18      19      20      21      22      23
x1      1.875   1.9375  2
x2      1.375   1.4375  1.5     1.5625  1.625   1.6875  1.75    1.8125  1.875   1.9375  2
x3      52      53      54      55      56      57      58      59      60
x4      100     105     110     115     120

The 9 initial training points (L9(3^4) orthogonal array) in this example are (1.125, 0.625, 40, 40), (1.125, 1.3125, 50, 80), (1.125, 2, 60, 120), (1.5625, 0.625, 50, 120), (1.5625, 1.3125, 60, 40), (1.5625, 2, 40, 80), (2, 0.625, 60, 80), (2, 1.3125, 40, 120), (2, 2, 50, 40). Among the 9 initial training points, (1.5625, 0.625, 50, 120), (1.5625, 1.3125, 60, 40) and (2, 0.625, 60, 80) are feasible, and (1.5625, 0.625, 50, 120) has the lowest objective value and is first used as the starting design point of the search. The optimal point is obtained after 37 iterations, including two restarts. The optimal design point (1.1875, 0.625, 59, 40) was found in the first restart, and the objective function value at this point is 7442.015. In the optimization process, 45 out of the 15×23×21×17 = 123,165 possible combinatorial combinations (0.037%) are evaluated. Finally, Table 4 compares the results obtained here with the solutions obtained in the literature. This comparison is for reference only, because the ranges of the variables used here differ from those in the literature, and variables x3 and x4 are continuous variables in the literature.

Table 4. Comparison of the results with those reported in the literature

      Continuous   Discrete solution,   Mixed discrete solution,   Mixed discrete solution,
      solution     SNA method           Kannan & Kramer [20]       Sandgren [18]
x1    1.125        1.1875               1.125                      1.125
x2    0.625        0.625                0.625                      0.625
x3    58.290       59                   58.291                     48.97
x4    43.693       40                   43.69                      106.72
g1    -0.000       -0.049               +0.000                     -0.180
g2    -0.069       -0.062               -0.069                     -0.158
g3    +0.000       -0.001               -0.000                     +0.000
g4    -196.307     -200                 -196.31                    -133.28
f     7198.01      7442.02              7198.04                    8129.14

4.2 Structure design of a stepped cantilever beam with rectangular shape

Figure 18 shows a five-step cantilever beam with a total length L of 500 cm; the elasticity modulus E of the material is 200 GPa. A concentrated load of 50,000 N is applied at the tip of the beam. This problem was presented by Thanedar and Vanderplaats [4]. The height and width of the beam in all five steps are design variables, and the volume of the beam is to be minimized. There are 11 constraints in this problem: 5 bending stress constraints requiring the stress in each of the five steps to be less than the allowable stress (14,000 N/cm²), one displacement constraint requiring the tip deflection to be less than the allowable deflection (2.7 cm), and 5 geometric constraints maintaining a specified aspect ratio of 20 between the height and width of each of the 5 cross sections. Substituting the parameters, this problem can be formulated as follows:

        min. f = 100(b1h1 + b2h2 + b3h3 + b4h4 + b5h5)

        s.t.    g1 = 1.5×10⁸/(b1h1²) – 14,000 ≤ 0

                g2 = 1.2×10⁸/(b2h2²) – 14,000 ≤ 0

                g3 = 0.9×10⁸/(b3h3²) – 14,000 ≤ 0

                g4 = 0.6×10⁸/(b4h4²) – 14,000 ≤ 0

                g5 = 0.3×10⁸/(b5h5²) – 14,000 ≤ 0

                g6 = 10⁴[61/(b1h1³) + 37/(b2h2³) + 19/(b3h3³) + 7/(b4h4³) + 1/(b5h5³)] – 2.7 ≤ 0

                g7 = h1 – 20b1 ≤ 0

                g8 = h2 – 20b2 ≤ 0

                g9 = h3 – 20b3 ≤ 0

                g10 = h4 – 20b4 ≤ 0

                g11 = h5 – 20b5 ≤ 0                                                     (10)
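The substituted form can be checked numerically. A minimal sketch, assuming the standard expressions for a stepped cantilever (stress 6P·x/(b·h²) with moment arms 500 cm down to 100 cm from the root step to the tip step, and tip deflection obtained by unit-load integration over each 100 cm step, P = 50,000 N, E = 2×10⁷ N/cm²):

```python
def beam_volume(b, h):
    """Total beam volume in cm^3; each of the five steps is 100 cm long."""
    return 100.0 * sum(bi * hi for bi, hi in zip(b, h))

def beam_constraints(b, h):
    """Eleven constraint values, g <= 0 feasible: 5 stress, 1 deflection, 5 aspect."""
    P = 50000.0
    # Bending stress at the root of each step; moment arm measured from the tip.
    stresses = [6.0 * P * x / (bi * hi**2) - 14000.0
                for x, bi, hi in zip((500, 400, 300, 200, 100), b, h)]
    # Tip deflection: coefficients 61, 37, ..., 1 come from integrating the
    # unit-load bending moment over each step; the prefactor collapses to 10^4.
    coeff = (61, 37, 19, 7, 1)
    tip = 1.0e4 * sum(c / (bi * hi**3) for c, bi, hi in zip(coeff, b, h)) - 2.7
    aspect = [hi - 20.0 * bi for bi, hi in zip(b, h)]
    return stresses + [tip] + aspect
```

At the reported final design, widths (3.0, 3.0, 2.8, 2.6, 1.8) and heights (60, 54, 50, 46, 35), this gives a volume of 66,460 cm³ with all eleven constraints satisfied and the tip deflection close to its 2.7 cm limit, consistent with the result quoted below.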

Figure 18. Stepped cantilever beam

Table 5 shows the discrete values of each variable. The ranges of the variables are selected from an engineering viewpoint and therefore are not exactly the same as those in the literature. The L27 orthogonal array is used to form the set of initial training data for the five-step cantilever beam design problem. There are three feasible design points, (3.0, 3.4, 2.6, 2.6, 2.0, 60, 56, 50, 46, 37), (3.4, 3.0, 3.0, 2.6, 1.8, 58, 58, 48, 46, 35), and (3.4, 3.4, 3, 2.6, 2, 60, 54, 48, 42, 37), in the set of initial training data.

Table 5. Possible discrete values of the design variables

b1    3.0   3.2   3.4   3.6   3.8
b2    3.0   3.2   3.4   3.6   3.8
b3    2.2   2.4   2.6   2.8   3.0
b4    2.2   2.4   2.6   2.8   3.0
b5    1.6   1.7   1.8   1.9   2.0
h1    58    59    60    61    62
h2    54    55    56    57    58
h3    48    49    50    51    52
h4    42    43    44    45    46
h5    33    34    35    36    37

The feasible design point (3.0, 3.4, 2.6, 2.6, 2.0, 60, 56, 50, 46, 37) has the lowest objective value, 69,400 cm³, and is used as the starting point of the search. The SNA method terminates after 57 iterations, including two restarts. The final design point is (3.0, 3.0, 2.8, 2.6, 1.8, 60, 54, 50, 46, 35) with a total volume of 66,460 cm³. Only 81 design points (including the 27 points in the initial training set) out of 5^10 = 9,765,625 possible combinatorial combinations are evaluated (0.00083%).

4.3 Design of a reinforced cylindrical body of a marine structure

Figure 19 shows the cross section of a quarter of a symmetrical cylindrical body. It is a major marine structure reinforced by a series of ring-shaped ribs of different widths, equally distributed along the length. The outer and inner diameters of this cylinder are both limited by its installation space, and the cylinder has to withstand the water pressure on its outer surface at a certain depth. The cylinder should have low weight and high structural strength for better performance under water. There are four sizing variables b, c, d and t; the first three are the widths of 3 different ribs, and t is the thickness of the cylinder. The cylinder has to sustain a water pressure of 40 N/mm², a uniform payload of 975 N, and its self-weight. The objective is to minimize the volume of the cylinder, while the maximum stress induced in the cylinder under the loading and boundary conditions must be less than the allowable value for the specified material, and the maximum deformation must be less than the cylinder thickness. The cylinder is made of a high strength aluminum alloy 7075 with an allowable yield strength σallow of 620 N/mm². The optimization problem can be formulated as Equation (11). Note that the constraints on stress and deformation are both implicit constraints and have to be evaluated through finite element analysis. Table 6 shows the discrete values of each variable available from engineering practice.

        min.  f(b, c, d, t) = π[535² – (535 – 2t)²](405 + b + c + 2d)

        s.t.    σ(b, c, d, t) < σallow

                disp(b, c, d, t) < t                                                    (11)
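The two constraints require finite element runs, but the objective is explicit: an annular shell cross section of outer radius 535 mm times the reinforced length. A quick check of the volume expression (dimensions in mm):

```python
import math

def cylinder_volume(b, c, d, t):
    """Objective of Eq. (11): annular cross-section area times reinforced length."""
    annulus = 535**2 - (535 - 2*t)**2   # pi factored out below
    return math.pi * annulus * (405 + b + c + 2*d)
```

At the optimum reported below, (b, c, d, t) = (41, 33, 20, 10), this evaluates to about 3.42×10⁷ mm³, matching the quoted objective value.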

Figure 19. The reinforced cylindrical body

Table 6. Discrete values of the reinforced cylindrical body variables (unit: mm)

      1     2     3     4     5     6     7
b     41    43    45    47    49    51    53
c     33    35    37    39    41    43    45
d     16    18    20    22    24    26    28
t     4     5     6     7     8     9     10

The 9 initial training points (L9(3^4) orthogonal array) in this example are (41, 33, 16, 4), (41, 39, 22, 7), (41, 45, 28, 10), (47, 33, 22, 10), (47, 39, 28, 4), (47, 45, 16, 7), (53, 33, 28, 7), (53, 39, 16, 10), (53, 45, 22, 4). Among the 9 initial training points, (41, 45, 28, 10) and (47, 33, 22, 10) are feasible, and (47, 33, 22, 10) has the lower objective value and is first used as the starting design point of the search. The optimal point is obtained after 8 iterations, including one restart. The optimal design point is (41, 33, 20, 10), and the objective function value at this point is 3.42×10⁷ mm³. In the optimization process, 17 out of the 7⁴ = 2,401 possible combinatorial combinations (0.7%) are evaluated.

5.     Discussion and Conclusions

When dealing with engineering design optimization problems, it is often difficult to find a numerical optimization algorithm that is readily applicable because of the existence of discrete variables, implicit constraints, or even binary constraints. The SNA method presented in this paper is designed specifically for this type of engineering design optimization problem. In this method, the constraints are modeled as pass-fail functions. The precious function evaluations of the implicit constraints are accumulated to progressively form a better approximation of the feasible domain using the neural network. Only one evaluation of the implicit constraints is needed in each iteration, to check whether the current design point is feasible. No precise function value or sensitivity calculation is required, so the SNA method does not require continuity or differentiability of the problem functions.

A clear 0-1 binary pattern is used in the input and output layers of the neural network. Thus the computational cost of training the neural network is relatively small compared with that of using a neural network to simulate the function values of the implicit constraints. As discussed earlier, in our numerical experience the error goal of 1e-6 is usually met within 100 epochs, even for cases with many variables and training points. A restart strategy is also employed so that the method has a better chance of reaching a global optimum.

One potential problem of the SNA method is that, when the number of design variables and the number of discrete values each variable can take are large, the search space becomes large. This may affect the efficiency of the search algorithm of the SNA method. When the number of discrete values a design variable can take is large, the “resolution” of the discrete grid of the search space can be controlled. Instead of taking all discrete values at once, a few discrete values can be chosen first so that the size of the search space remains acceptable. When the SNA method terminates at a discrete design point, the search space can be reduced to the neighborhood of this point with an increased grid resolution, and the iterations started again.
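This coarse-to-fine idea can be sketched in a few lines (an illustrative helper, not part of the original method; the variable grid below is the wire-diameter grid from Table 2):

```python
def refine(values, center, radius=2):
    """Return the discrete values within `radius` grid positions of `center`."""
    i = values.index(center)
    return values[max(0, i - radius): i + radius + 1]

# Full 11-value grid for the wire diameter d; a coarse pass uses every third
# value, then the grid is refined around the point the coarse pass returned.
d_all = [0.207, 0.225, 0.244, 0.263, 0.283, 0.307,
         0.331, 0.362, 0.375, 0.394, 0.438]
coarse = d_all[::3]            # 4 candidate values instead of 11
fine = refine(d_all, 0.263)    # full-resolution neighborhood of the coarse result
```

Each refinement pass keeps the per-iteration search space small while still allowing the method to reach any grid point after a few passes.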

In the search algorithm of the SNA method, a branching criterion is used to cut down the number of points searched. However, the number of design points searched in each iteration is still of the order of 3^n – 1, where n is the number of design variables. This can be very expensive for a large number of design variables, even though the function evaluation at each design point is cheap, and it poses a limitation on the SNA method. As demonstrated by the examples presented in this paper, the SNA method can handle up to 10 design variables.
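The 3^n – 1 figure comes from the one-step moves on the discrete grid: each of the n variables can move down one grid position, stay, or move up, excluding the all-stay move. A sketch of this neighborhood (the clipping to the grid bounds is a hypothetical detail added here for completeness):

```python
from itertools import product

def neighbors(idx, sizes):
    """All grid points one step away from index tuple `idx` (up to 3^n - 1),
    clipped to the grid given by `sizes`; the point itself is excluded."""
    out = set()
    for move in product((-1, 0, 1), repeat=len(idx)):
        if any(move):  # skip the all-stay move
            cand = tuple(min(max(i + m, 0), s - 1)
                         for i, m, s in zip(idx, move, sizes))
            if cand != idx:
                out.add(cand)
    return out
```

For an interior point of a 3-variable grid this enumerates 3³ – 1 = 26 distinct neighbors; at a grid boundary, clipping reduces the count.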

References

[1]   Hsu, Y. L., Sheppard, S. D., and Wilde, D. J., (1996) Explicit approximation method for design optimization problems with implicit constraints, Engineering Optimization, 27(1), 21-42.

[2]   Huang, M. W., and Arora, J.S., 1997, “Optimal design with discrete variables: some numerical experiments,” International Journal for Numerical Methods in Engineering, Vol. 40, pp. 165-188.

[3]   Arora, J. S., and Huang, M.W., 1994. “Method for optimization of nonlinear problems with discrete variables: a review,” Structural Optimization, Vol. 8, pp. 69-85.

[4]   Thanedar, P.B., and Vanderplaats, G.N., 1995, “Survey of discrete variable optimization for structural design,” Journal of Structural Engineering, Vol. 121, No. 2, pp. 301-305.

[5]   Hsu, Y. L., Dong, Y. H., and Hsu, M. S., 2001. “A sequential approximation method using neural networks for nonlinear discrete-variable optimization with implicit constraints,” JSME International Journal, Series C, Vol. 44, No. 1, pp. 103-112.

[6]   Hopfield, J. and Tank, D., 1985. “Neural computation of decisions in optimization problems,” Biol. Cybern., Vol. 52, pp. 141-152.

[7]   Watta, P.B., 1996, “A coupled gradient network approach for static and temporal mixed-integer optimization”, IEEE Transactions on Neural Networks, Vol. 7, No. 3, pp. 578-593.

[8]   Lee, S. -J., and Chen, H., 1991, “Design optimization with back-propagation neural networks,” Journal of Intelligent Manufacturing, Vol. 2, pp. 293-303.

[9]   Renaud, J., Sellar, R., Batill, S., and Kar, P., 1994, “Design driven coordination procedures for concurrent subspace optimization in MDO.” Proceedings of the 35th AIAA/ASME Structures and Materials Conference.

[10]   Hajela, P., 1999, “Nongradient methods in multidisciplinary design optimization - status and potential”, Journal of Aircraft, Vol. 36, No. 1, pp. 255-265.

[11]   Haftka, R.T., and Gurdal, Z., 1992, Elements of Structural Optimization, 3rd edition, Kluwer Academic Publishers.

[12]   Plackett, R. L., and Burman, J. P., 1946. “The Design of Optimum Multifactorial Experiments,” Biometrika, Vol. 33, pp. 305-325.

[13]   Rao, C. R., 1947. “Factorial Experiments Derivable from Combinatorial Arrangements of Arrays,” Journal of the Royal Statistical Society, Series B, Vol. 9, pp. 155-162.

[14]   Farkas, J., 1984, Optimum Design of Metal Structures, 1st edition, Ellis Horwood Limited.

[15]   Yuan, K. Y., Liang, C. C., and Ma, Y. C., 1990. “The estimation of the accuracy and efficiency of the backtrack programming method for discrete-variables structural optimization problems,” Computers & Structures, Vol. 79, No. 2, pp. 155-162.

[16]   Farkas, J., and Jarmai, K., 1997, Analysis and Optimum Design of Metal Structures, 1st edition, Balkema Publishers.

[17]   Siddall, J. N., 1982, Optimal Engineering Design: Principles and Applications, Marcel Dekker, New York.

[18]   Sandgren, E., 1988, “Nonlinear integer and discrete programming in mechanical design,” Proceedings of the ASME Design Technology Conference, Kissimmee, FL, pp. 95-105.

[19]   Lu, P., and Siddall, J.N., 1993, “A boolean local improvement method for general discrete optimization problems,” Journal of Mechanical Design, Transactions of ASME, Vol. 115, pp. 776-783.

[20]   Kannan, B. K., and Kramer, S. N., 1994, “An augmented Lagrange multiplier based method for mixed integer discrete continuous optimization and its applications to mechanical design,” Journal of Mechanical Design, Vol. 116, pp. 405-411.