°> u o > 0 In (3.33) I O q and to^ are positive auxiliary variables satisfying (3.33a) and (3.33b). They are introduced to put the subproblem into the standard format and to eliminate the parameters A^"^ and from constraint (3.33a). i s a sufficiently positive constant to insure that the l e f t side of (3.33b) i s positive. The minimization problem posed in (3.33) i s a piecewise "pseudo" signomial program. One way to solve i t i s as follows: (i) Use the scheme proposed in [36] to condense (3.33a) to a monomial, ( i i ) Simultaneously condense the signomials in (3.33b) and (3.33c) to posynomials; (ii i ) S o l v e the resulting geometric program, (iv) Repeat steps (i) - ( i i i ) u n t i l some convergence criterion is satisfied. The advantage of this technique i s that the size of the original problem remains the same. A major weakness is the numerical d i f f i c u l t y that can result from either or $ m + 1 being very small. When this 35. condition occurs, exponent overflow or gross clipping errors destroy the convergence of the iterative solution. The second method, named Method II, exploits the quadratic character of the penalty terms in the subproblem's objective function. The subproblem (3.9) i s recast as follows: (^2) min u Q p+k* q s.t. (a) 4 + ^ 4*<2> ± "0 (b) X ; L + C + I (A k - 2 K ) g p + k ( x ) < K c o 2 (3.34) ( c) x e X , o>1 > 0, CDQ > 0 where C i s large enough to make the l e f t side of (3.34b) positive. Since a primal approach to solving problem (T^ ) has been adopted, the positive and negative terms of a l l the signomial inequality constraints must be d i s t i n -guished. Such a distinction among the terms in (3.34b) and (3.34c) i s straightforward. In (3.34a) because of the quadratic exponents, the separation of positive and negative terms can also be easily carried out by replacing gp + k(x) with Pp + k(x) - Q + k 0 0 a n c* expanding the squares. Then the constraint (3.12a) can be written as 4+ 1 f p p + k ( ^ + Q p + k ^ ) ] - 2 V k ^ v ^ 1 "0 ( 3 - 3 5 ) k=l k _ i 2 Being posynomials, P + k ( x ) and Q p + k(x) are positive for x e X. Hence, 10^ and the f i r s t summation constitute the positive terms while the rest are the negative terms. The specific primal algorithm selected to solve (3.34) i s the Avriel-Williams algorithm explained in Section 2.6. The algorithm requires that each signomial inequality constraint of (3.34) be rewritten as a ratio of positive terms to one plus the negative terms, with the ratio bounded from above by unity. The solution to problem (TT^) (3.34) is then obtained by solving a sequence of geometric programs each of which is obtained by condensing the denominator of each ratio constraint at a feasible point of ( T T 2 ) . The condensation of the ratio version of (3.35) merits special attention. Written as a ratio of posynomials, (3.35) becomes ^ < 1 (3.3. If condensation is viewed as an application of the arithmetic-geometric inequality, then the multiplication in each squared or product term has to be carried out. But this i s the precise procedure that i s unwanted. How-ever, i f condensation is interpreted as exponentiation following the f i r s t -order Taylor approximation of the logarithm of a posynomial, then either the numerator or the denominator of (3.36) can be condensed directly in the manner of (2.7) - (2.8). The geometric programs approximating the problem (T^) have been solved by Dembo's cutting-plane method [12], [16]. The numerical exper-ience acquired suggests that convergence is attained rather slowly. A closer analysis of the results shows that a good part of the computing time i s spent on approximating the constraint (3.34a). This discovery leads to the adoption of the third method detailed in Chapter V following the dis-cussion of an intimately related algorithm in the next chapter. Method III differs from Method II in the following ways: 1) The objective function in (3.9) is not incorporated into the constraint set, as i t i s done in (3.34) 2) Only the set X is convexified by monomial condensation. 3) The nonconvex objective function of (3.9), not i t s approximation, is directly minimized over the convex approximant of X. The numerical results obtained with Method III are detailed in Chapter VI. As a whole, the results indicate that Method III is preferred. 3.7 Conclusion In this chapter an algorithm has been synthesized to solve signomial programs with equality constraints. The development of the algo-rithm has been shaped by three main concepts: the method of multipliers viewed as a primal-dual algorithm, par t i a l duality, and the retention of the structure of signomial programs for convexification. The algorithm replaces the original problem with a sequence of inequality-constrained signomial programs. However, i f each subproblem is written out e x p l i c i t l y in the standard format of a signomial program, inconvenience in data preparation and computational d i f f i c u l t y due to large problem size are both very l i k e l y . To circumvent these problems, three indirect methods based on monomial condensation have been numerically explored, and one has been singled out as computationally promising. These methods require no or limited increase in variables and constraints, and the effort for data entry i s the necessary minimum. Finally, the convergence of the proposed algorithm is related to that of the method of multipliers and the A v r i e l -Williams algorithm. The proposed algorithm, as stated in Section 3.4.3, serves as a broad framework for specific methods. The conditions for the algorithm's convergence permit wide latitude of arbitrariness in casting the algorithm into a more concrete form for implementation in a computer. One major arbitrary area, the algorithm for solving the subproblem (3.9), has been discussed and w i l l be continued in Chapter V. The other important un-specified step in the proposed algorithm is the updating rules of X. and K. The discussion of these rules i s deferred to Chapter V. In both parts, the selected procedures not only must satisfy the basic conditions for convergence, but also should exhibit experimental efficiency in solving problems. In Section 3.6, the computational performance of the three indirect methods has been assessed on the basis of numerical experience acquired in the process of selecting a suitable method. In the subsequent chapters, further computational considerations are discussed and more numerical experimental results are offered. IV. A PROPOSED ALGORITHM FOR INEQUALITY-CONSTRAINED SIGNOMIAL PROGRAMS 4.1 Introduction The Method II discussed in Section 3.6 for solving the subproblem (3.9) i s effectively the application of the Avriel-Williams algorithm to transform a nonconvex signomial program to a sequence of convex geometric programs via posynomial condensation. Each of the convex geometric programs, in turn, is solved by Dembo's [12], [16] cutting plane algorithm. In this method, the signomial objective function is treated as a constraint through the use of an additional variable. The computational experience associated with Method II indicated that convergence is slow, that many cuts may be re-quired, and that most of required cuts are to satisfy the constraint derived from the objective function. A suspected reason why the objective function turned constraint requires so many linearizations i s the two-stage approxi-mation of i t s many terms with a single monomial term. To overcome this probable block on achieving acceptable convergence, an algorithm i s proposed in which only the nonconvex feasible set of an inequality-constrained signomial program is convexified and then outer linearized. The nonconvex objective function remains unchanged in the whole iterative process. As in any approximation-based strategy, the approximating problem is useful only i f i t can be handled with ease. In the problem of interest, an effective algorithm for minimizing a nonconvex function subject to linear con-straints must be selected. A promising candidate i n this regard i s the reduced gradient method f i r s t suggested by Wolfe [37]. Its generalized version [38] for nonlinear constraints was ranked as among the best of the techniques tested in Colville's study [39]. The reduced gradient method i s a hybrid of 40. simplex-type algorithms and those that are gradient-based. Thus, i f the solu-tion point i s an interior point, the method zeroes i n towards the solution point with the characteristic rate of the steepest descent method. This feature is a potential aid to accelerate the solution of the approximating convex program either by reaching an interior solution without too many cuts, or by ending at a point at which a deep cut can be made. The algorithm proposed in this chapter is, i n principle, applicable to a l l inequality-constrained signomial programs. For this reason the problem solved by the algorithm is formulated in Section 4.2 as a general signomial program. In Section 4.3, the algorithm i s developed and compared with related algorithms. Important computational considerations are elaborated in Section 4.4. Section 4.5 reports the computational experience with the algorithm. In the last two sections, the inclusion of simple upper-bound constraints and-the extension to non-signomial objective functions are considered. 4.2 Problem Statement The problem to be solved i s (T) min g Q(x) ( 4 > 1 ) S . t . X E F where F = {x: x_ E Rm; 0 < £ x; g i , ^ £ 1> k = 1, 2, ...,p} and g (x) are signomials. Note that F differs from X of (3.1) only i n that there i s no simple upper bound constraint on x. in F. Consideration of such type of constraint i s taken up in Section 4.6. A primal approach i s adopted to solve (4.1). Such an approach i s favorable i f the problem has a high degree of d i f f i c u l t y and/or slack primal 41. constraints. The algorithm to be discussed is particularly useful i f the objective function g 0(x) possesses so many terms that approximating i t with a monomial i s not preferred. 4.3 The Proposed Algorithm 4.3.1 Preview Consider the problem (T) in (4.1). It is a nonconvex program whose nonconvex feasible set F can be convexified by posynomial condensation. In the Avriel-Williams algorithm, the objective function g Q(x) is f i r s t trans-formed into a constraint and added to the set F before convexification. The whole problem is thus replaced with an approximating convex program. In the algorithm proposed in this section, the set F is not augmented with gQ(x.). Hence, only F i s convexified while g Q(x) remains unchanged. The original prob-lem i n this case i s approximated by a subproblem, say (ST), involving the min-imization of the original nonconvex cost over an approximating convex feasible set. The subproblem i s then solved by a combined reduced gradient and cutting plane (CRGCP) algorithm to be detailed later in the chapter. The proposed method has two major steps: the convexification of the set F and the solution of the resultant subproblem (ST) by a CRGCP algorithm. The discussion of how F can be convexified by posynomial condensation may be found in Section 2.6, and is not dealt with here. The reduced gradient method as proposed by Wolfe [37] is stated in the next section for completeness and for explaining subsequent algebraic manipulations and computational considera-tions. 4.3.2 The Reduced Gradient Method Assume that the problem to be solved is 42. min M(x) s.t. Ax = b, x _> 0 (4.2) where A is an m x n matrix and m < n. Let x. be partitioned into basic (depen-dent) variables x^ = (x^ , ..., \ )' and nonbasic (independent) variables —B B m x^ = ( x ^ , .. . , x^ )'. Without any loss of generality, a nonsingular square m x m matrix B can be defined as that consisting of the columns of A associated with Xg> Then = -B _ 1Cx T + B _ 1b (4.3) 0 r i % = ~ ^ % where C are the columns of A after those belonging to B have been deleted. The reduced gradient, i.e., dK(x)/d^, i s dM(x)/dx^ - V M(x) - C^B')" 1 V x M(x) (4.4) The strategy of the reduced gradient method is to decrease M(x) by maneuvering only in the subspace of x^ subject to the simple constraint x^ >_ 0. The equality constraint Ax = b is satisfied by (4.3) and the positivity of x —B is assured by suitable restriction on the step size i n x^. Furthermore, whenever a basic variable vanishes, a simplex pivot operation is done to inter-change the vanishing basic variable with a nonzero nonbasic variable. Algo-rithm 4.1 states formally the reduced gradient method for solving (4.2). Algorithm 4.1 (Wolfe's Reduced Gradient Mehtod) Step 1: Set k = 0; x° satisfies Ax° = b_. k Step 2: Compute VM(x ) and v k =V M(xk) + C Tu k k T -1 k where u = - (B ) V M(x ) ^B k Step 3: Define the descent direction s as i) s k. = 0 i f v =0 and v k > 0 Ni Wi l 43. ' ' P I K K Else, s... = -v. Nx x xx) s B - -B C % Step 4: If ||s^|| < t , e > 0 and small, stop. Else go to Step 5. k k k k Step 5: Find i) A„ - min {-x./s. : s. < 0, i = 1, 2, ..., n} c — M . x i l V k k k k i i ) X A minimizing M(x + Xs_ ) , .X e (0,X ] Step 6: Set X k = X k k+1 k . ,k k x_ = x + X s_ k+1 If X g > 0, set k = k+1 and go to Step 2. Else go to Step 7. Step 7: Perform a pivot operation on B * to interchange the vanishing basic variable with a nonvanishing nonbasic variable. Update the indices of the basic and nonbasic variables. Set k = k+1 and go to Step 2. From the foregoing summary of the reduced gradient method, four important requirements have to be met i f Algorithm 4.1 is to be applied. They are 1) the constraint structure of (4.1) has to be satisfied; 2) an i n i t i a l feasible point has to be available; 3) an i n i t i a l basis matrix and i t s i n -verse has to be determined; 4) the gradient of the objective function can be calculated. The next section gives the necessary algebraic manipulations for casting the outer-linearized version of the subproblem (ST) into the form of (4.1). Consideration of the other requirements i s found in Section 4.4. 4.3.3 Algorithm Development Consider the problem (4.1). Let g^ Cx) = ^ ( i ) ~ » k = 1» 2> p, where P^ and are posynomials. The feasible set F can be rewritten as F = {x : x e Rm; 0 < x L £ x; P (x)/(l + Q k(x)) < 1, k = 1, 2, ..., p} (4.5) 44. Assuming that a point x E F is known, the set F can be approximated with a smaller set F by applying posynomial condensation (see Section 2.2) to the denominator of Qfc(x) at x to yield the monomial Q k(x,x). The set F is then defined as 7 = {x : x e Rm; 0 < x L < x; P k(x)/Q (x, x) < 1, k = 1, 2, ...,p} (4.6) The approximate problem (ST) is (ST) min g (x) s.t. x E F o - (4.7) Both problems (T) and (ST) are nonconvex programs because F and F are gener-all y nonconvex for x j> x L > 0, x E R W . But i f the logarithmic transformation z^= In x^ is used to replace x with £, the problem equivalent to (T) and (ST), respectively, may be stated as follows: (T) min g Q(z) s.t. ze Z (4.8) where Z - {z : z £ Rm; z L < z; P k ( z ) / U + Q k(z)) < 1, k = 1, 2, ..... p}(4.9) (4.10) rm L c — \z_ : z_ £ f and (ST) Z min g 0(z) s.t. z_ E Z where Z = (z_ : z E Rm; z L £ z^ P k (O/Qk(z_, z) <_ 1, k = 1, 2, .. ., p} (4.11) Note that now problem (ST) Z is convex. Another formulation of (ST)^ equiva-lent to (4.10) - (4.11) i s (ST) min g (z) s.t. z £ \ (4.12) where Z L = ^ : £ e R 5 £ 1 G k(z, z) £ 0, k = 1, 2, .. . , p} (4.13) a n d % (z, £) = ^(Py^O/QkCz, z)) (4.14) In (4.12) - (4.14), the following statements are true: 1. G,(jz>J) i s a convex function V _z E Rm, <_ ^ 4 > . 2 . ZT = Z 3. ZT is a convex set. ti In Problem (ST) , a nonconvex function i s minimized over a convex set. To solve this problem, a combined reduced gradient and cutting plane algorithm i s used. The algorithm's strategy requires 1) the outer li n e a r i z -ation of the set Z , 2) the casting of the problem formulation into that of (4.2), 3) the use of Algorithm 4.1, 4) i f required, the improvement of the ap-proximation of Z with cutting planes. To outer linearize Z^ > replace the convex function Gk(z,2) with i t s m first-order Taylor approximation Gk(a,2) + (z - a) * VG^(a_,z) for some a_ e R satisfying a_ _> . Define the resultant polyhedron Z^ C R™ as Z L = (z : z e Rm; z L < z./^Ca.z) + (z-a) * VGj^a.z) £ 0, k=l,...,p} (4.15) Because of the convexity of G, (z ^|) , Z cr ZT . Next let = £ - and £ = [C £ ] be the vector of slack variables. Then define the hyperplane s i sp H c R ^ a s H = {(?', c ') : (?', c ') E R ^ ; 0 < (?', c ') ; [J : I](C\ ? ') - b}(4.16) — —S — — S — ~ — 8 S where J i s the p x m Jacobian matrix [ 3 G , (a,j|)/9z. ], I i s the pth-order ident-i t y matrix, and b_ is the p x 1 vector - [ ( z L - a.)' VGk(a,^) + Gk(a,2) J. The problem suitable for solution by Algorithm 4.1 i s (1ST) min gQ(0 (4.17) s.t. (?, ? ) e H — —s * * Let the solution to (LST) be (or correspondingly, some z_ ) . If * — z_ s Z- J proceed to the next convex approximation of Z. Otherwise, add a cutting plane to Z^. The cutting plane i s obtained by linearizing the most — * violated constraint of Z at jz . Li 46. 4.3.4 The Algorithm In this section the proposed algorithm for solving inequality-con-strained signomial programs of the type (4.1) is formally stated as Algorithm 4.2. Algorithm 4.2 (k) Iteration 0: Set k = 1. Assume that a point z e Z is known (k) Iteration k:; Step 1: Condense at 2 to produce problem (ST) Z Step 2: Set i = 1. Set a ( k ) = z^ k ) — (k) Step 3: Linearize at a . and set up problem (LST). Step 4: Solve (LST) by Algorithm 4.1. Let the solution (in the z - domain) be z ^ . Step 5: If z ^ e Z go to Step 7. Otherwise determine the most violated constraint, linearize i t at z , and use the linearization as the next cutting plane. Proceed to the next step. Step 6: Determine a new basis and a new feasible point for prob-lem (LST). Set i = i+1 and go to Step 4. Step 7: If z ^ satisfies the termination criterion, stop. Else, set k = k+1, z ( k ) = z ( l ) , and go to Step 1. Step 6 has to be included because before Algorithm 4.1 can be applied a feasible point and a basis must be available. How Step 6 is carried out is among the practical computational considerations taken up i n Section 4.4 4.3.5 The Algorithm in Perspective Algorithm 4.2 has been developed to solve signomial programs character-ized by a high degree of d i f f i c u l t y and by an objective function with many terms. The main basis of the algorithm is that a set defined by signomial inequalities 47. can be readily approximated by a convex set via posynomial condensation. In this regard, the algorithm is similar to the Avriel-Williams algorithm. There i s , however, a major difference between the two. Unlike the Avriel-Williams algorithm, the approach taken here does not incorporate the objective func-tion into the constraint set. It is hoped that by minimizing the original generally nonconvex objective function over the approximating convex set, convergence towards the original solution can be attained faster. The decision to approximate only the feasible set and leave the objective function untouched means that an efficient method for minimizing a nonconvex function over a convex set must be employed. A promising method i s the novel algorithm proposed in this chapter. The algorithm merges the a b i l i t y of Kelly's [40] cutting plane method to handle convex constraints with the efficiency of the reduced gradient method to minimize a general nonlinear func-tion subject to linear constraints. To tackle the issue of f e a s i b i l i t y , the algorithm exploits the con-vexity property by adopting the strategy of relaxing the constraints by outer linearization and progressively tightening the constraints with cutting planes. The reduction of the nonconvex cost over the set defined by the relaxed con-straint (a convex polyhedron) is achieved by the reduced gradient method. Thus the algorithm translates into a sequence of programs each solved by the reduced gradient method. In contrast, Kelley's cutting plane method for convex programs requires the solution of a sequence of linear programs. The l i k e l y benefit of Algorithm 4.2 is that because maneuvers into the interior of the feasible set are allowed, an interior solution point can be reached sooner, or i f the s o l -ution i s a boundary point, f e a s i b i l i t y can be achieved with deeper cuts. Another interesting comparison that can be made here is with Abadie and Carpentier's [38] generalized reduced gradient (GRG) algorithm for handling 48. nonlinear constraints. Because GRG is designed for general nonlinear constraints, f e a s i b i l i t y i s maintained in each iteration by a restoration phase i n which a system of nonlinear equations (the constraints) is solved by Newton's method. In the problem of interest in this chapter, because of the convexity of the feasible set, the need to maintain f e a s i b i l i t y in each iteration i s dispensed with and i s replaced with a relaxation strategy [41]. In GRG, the inverse of the Jacobian of the constraints with respect to the basic variables has to be either exactly or approximately updated at each point or when the basic variables are changed. This step can become a heavy computational load. In the suggested combined approach, because the relaxed constraints are just linear equations, the basis' inverse i s constant and i s easily updated by simple pivot operations when the basis i s changed. 4.4 Considerations for Implementation Together, Sections 4.3.2 and 4.3.4 spell out clearly the major steps of the proposed technique for solving signomial programs. However, i f the proposed method i s to be implemented i n a computer, some important practical questions need to be answered. How can the point c Z be found? How can the i n i t i a l basis and the i n i t i a l feasible point be derived before a c a l l to the reduced gradient method (Algorithm 4.1)? How can the basis' inverse be updated? How should the one-dimensional minimization required in Step 5 of Algorithm 4.1 be carried out? What are the termination c r i t e r i a and toler-ances? These are the questions to which this section is addressed. 4.4.1 Finding a feasible Point of a Signomial Program A requisite for approximating the set F defined in (4.1) with a smaller disguised convex set F is a point x e F. To find x, Dembo's [16] Phase I method i s used. In this method, a sequence of programs each of the 49. form p (PHSI) min ^ o>k s . t . P k ( x ) / ( 1 + Q k(x)) < (4.18) 0 < x J J _ < x , l _ < t o k , k = l , . . . , p i s solved. Each problem (PHSI) i s solved by Avriel-William's algorithm in. conjunction with Kelley's c u t t i n g plane method. (See [12], [16]). The se-quence i s terminated e i t h e r i f w, = 1, V" k, or some w, - ^ 1 when the optimal s o l u t i o n to (PHSI) has been obtained. 4.4.2 Getting an I n i t i a l Basic Feasible Solution of Prolbem (LST) - No Cut Added From (4.16), the problem (LST) i s defined as min g D(£) t s . t . (a) [ J : I ] [—]•=' b s (4.19) (b) 0 < c, 0 < c — --s where j; = z - z", J = [ 9G k(a,£) /dz ± ] , b = - [ ( z L - a)' VGfc (a ,J) (a, z) ], z e Z, and a^ e R m s a t i s f y i n g a _ 2 l ^ * A convenient i n i t i a l f e a s i b l e s o l u t i o n to (4.19) can be obtained by the following lemma. Lemma 4.1 Assume that jz e Z and l e t £ = _z - . I f a = z and = b - j£_ f then the point (|_, £^) i s a f e a s i b l e s o l u t i o n of problem (LST) . Proof: By the hypothesis and Lemma 2.1, Gk(!>D = 1" (Pk(A)/Qk(i»A)> = l n ( P k ( l ) / ( 1 + Qk(D) ) . n (4.20) But _z e Z. I t follows that ln(P k(£)/(! + Qk(z)))< 0 k^ (4.21) Or for some £ > 0, —s — G k(z, £) + = 0 • k^ (4.22) —s 50. Adding to both sides (4.22) the quantity j£ leads to the desired conclusion. Q.E.D. While the point ) is feasible for (4.18), i t s nonzero basic v a r i -s ables s t i l l have to be identified. A basis of (4.19a) has to be specified. Or equivalently, a set of m linearly independent columns of the composite mx(m+p) matrix [J:I] has to be known. A procedure to identify a set of such columns is explained in the next paragraph. Let the index sets V and W be defined as (4.23) W= {k^gk > °) = a32' UL2* (4.24) Clearly, L^ + L2 = P> Reorder the rows of (4.19a) such that the f i r s t L^ rows are indexed by V and the last L 2 rows, by W. Let the reordered equation with V-'{k:C 8 k-'0> = {v 2, v 2, v L i> C =? and C = ?, be written as s s Q. s ^ 1 i 0 0 i ^2 = b (4.25) It is assumed that the components of £ have been appropriately recorded. Then a basic solution to (4.25) consists of L^ variables selected from the nonzero elements of and the last L 2 nonzero slack variables. This basic solution i s feasible by Lemma 4.1. The L^ variables from must correspond to l i n -early independent columns selected from [-~] or simply from Q. An algo-us rithm such as Gaussian elimination used for the determination of a matrix's rank can be used to pick the desired L^ columns. Let the L^ linearly independent columns of [-—] be [-§-] where 0 i s a L x L n matrix and S i s s S ' 1 i a L£ x L j matrix. Then the i n i t i a l basis B of (4.25) i s I I Q B = Q s L 2 (4.26) 51. Since Q i s nonsingular by construction, the i n i t i a l inverse of the basis i s 1 _ 5 - i 0 ~ ~_1 1 (4.27) - S Q | Hence, in general, only a smaller matrix 6 need to be inverted to obtain the f u l l basis inverse. To summarize, the following steps are undertaken to obtain an i n i -t i a l basic feasible solution to the problem (LST) with no cut added: 1. Obtain a point z e Z and set a = z, ? = z^ - _zL . 2. Generate the matrix J and the vector b. Set £ = Jc, - b . — ~s — — 3. Reorder J, b_ and C such that the f i r s t L 1 rows are indexed by V while the last rows are indexed by W. 4. Identify L.^ linearly independent columns in Q. Let these columns form the L^ x L.^ matrix Q . 5. Invert Q and generate B ^ according to (4.27). 4.4.3 Getting an I n i t i a l Basic Feasible Solution of Problem (LST) - After the Addition of a Cut At the end of one application of the reduced gradient algorithm (i.e. the completion of Step 4 of Algorithm 4.2), l e t the solution be (i,£ ). —s Suppose that at this solution point, at least one of the constraints of the convex set Z^ is violated. Let the most violated constraint's index be r. Then with z = £ + z^, G (z,z) > 0. The cut to be added to the polyhedron ZL i s the supporting plane of Gr(z,J) at z_ = z. Expressed in terms of the equation of the new cut is 6 ' ? + Y (4.28) 52. where = !G^(.z,z) , c = -\{z - z) ' VGr(z,-2) + G r(z , 2 ) ] and y is the new slack variable associated with the new cut. Equation (4.25) becomes • - r ~ S i 0 - r -l* i 0 L2 (4.29) Let y = c - j? Clearly, (_£, ? , y) is a basic solution to (4.29). However, i t i s not feasible since by the assumption that Gr(z,_2) > 0, y < 0. A method to drive Y into the feasible set defined by (4.29) and the positivity constraint on (£, Z ,y) i s to solve the following auxiliary problem (AUX) (AUX) min -y (4.30) s.t. (4.29) , ? > 0, c > 0 —s — As i t is formally stated, the problem (AUX) is a linear program i n a form ame-nable to the use of a composite algorithm [42] to obtain an i n i t i a l basic feasible solution. But because the reduced gradient method allows nonzero nonbasic variables, the point X ? , , ^ , ? ) cannot be used as the i n i t i a l basic feasible point for the LP solution of (AUX). Using a completely different starting point would not only require a Phase I operation but also would des-troy the continuity of the original iterative process. A very convenient answer to this d i f f i c u l t y i s to solve problem (AUX) by a slightly modified ver-sion of the reduced gradient method. The only modifications are the objective function and the definition of the bound on the step size variable A . The usual bound i s defined as A . , = min {-x. /s. : s. < 0, Vs i} (4.31) For solving problem (AUX), (4.31) is changed to the following 53. X M = min [min {-x±/s± : x± > 0, s± < 0 , ^ i > , {-Y/S^: y < 0, > 0 } ] ( 4 . 3 2 ) where is the component of the direction vector s_ along the new dimension of y. The criterion specified by ( 4 . 3 2 ) ensures no new i n f e a s i b i l i t y , and elimin-ates the i n f e a s i b i l i t y due to y < 0 as soon as i t is possible. If both A and s^ are negative, then the constraints are inconsistent. Once y >_ 0, the bound given by ( 4 . 3 1 ) is used, and the original nonconvex objective function g Q(£) i s i n force. The i n i t i a l basic feasible solution to the problem (AUX) is simply (_?,£,y). The starting basis' inverse is ,-1 B -1 -1 B V B-l 0 1 ( 4 . 3 3 ) where B " i s the basis' inverse at the termination of Step 4 in Algorithm 4.2, and w' is the vector of the components of 6' that are associated with the basic variables in Note that once y > 0, B serves as the starting inverse for the next execution of Step 4 in Algorithm 4.2. Thus, from a computational viewpoint, using the modified reduced gradient method does not perturb too much the sequence of solution points obtained by Algorithm 4.2 . Furthermore, the additional programming effort required to implement the method i s very small since the method is s t i l l within the algorithmic framework of the main reduced gradient method. 4.4.4 A Linear Search Method Like other gradient-based algorithms, the reduced gradient method requires in each of i t s iterations the solution of the following one-dimensional problem: f(A k) = min f(A) = min g o U k + As k) s.t. A (0, A M] ( 4 . 3 4 ) 54. The conceptual algorithm calls for an exact linear search. In practice, how-ever, exact linear searches often consume an exorbitant part of the total com-puting time of a problem. Generally, approximate linear minimization i s found adequate to achieve convergence at a reasonable rate. The inexact approach i s adopted in this section. Inexact linear search methods generally f a l l under the following categories: direct searches involving function values only, low-order curve f i t t i n g requiring both function and/or gradient values, and nonoptimal step methods. The f i r s t two types are effectively iterative schemes which produce a sequence converging to the true minimizer of g(A). The nonoptimal step method subscribes to a different viewpoint, namely, that linear search i s a step imbedded in a larger algorithm. The convergence of the larger algorithm can be, and i s , maintained as long as the cost function is decreased s u f f i c -iently in each linear search. Nonoptimal step methods are preferred because they are operationally well defined and are e f f i c i e n t . Examples of nonoptimal step methods are the step size rules of Goldstein [43] and Armijo [44]. Other strategies were outlined by Bard [45]. In the problem (LST) that i s of interest, the signomial objective function gQ(.C) is continuous and d i f f e r e n t i a b l e over H. Furthermore, the evaluation of the derivative at a t r i a l point requires very l i t t l e additional . effort once the function has been evaluated. This observation suggests that a low-order polynomial f i t , such as the cubic f i t scheme derived by Davidon [46] is suitable. However, a curve f i t t i n g scheme by i t s e l f may s t i l l require numerous repetitions i f the termination criterion i s the usual relative change k k in f ( X ) or i n X . An appealing approach is to couple Armijo's test for s u f f i c -ient descent to the cubic f i t method. The strategy is to use Armijo's step size 55. rule except when i t i s clear that a relative minimum exists in the interval (0, X M ] . This condition i s confirmed, when df/dX < 0 at X = 0 and df/dX > 0 at X^. The proposed linear search method, stated as Algorithm 4.3, is thus a synthesis of the cubic f i t method and Armijo's step size rule. Algorithm 4.3 Step 0: Given a e (0,1), g e (0,1), y e (0,1), and X q > 0. Let X^ be determined by Step 5 o f Algorithm 4.1. Set X^ = min [X 0,X M] and X^ = 0. Calculate f(X ) and f';(X ). Step 1: Set X = X^, and evaluate f(X) and f'(X). A Step 2: If f*(X) > 0, go to Step 4. Else go to Step 3. Step 3: Set X = X i f f(X ) < f(X.) and go to Step 5. Else, X = YX and go m m x, r m m to Step 1. Step 4: Apply Algorithm 4.4 to f i t a cubic through (X„, f(X.)) and (X , f(X ) ) . x x m m Let the minimum estimate be X. Step 5: If f(X) - f U ^ ) < -a|x - xj |f\(X )|, Xfc = X stop. Else continue. Step 6: If X = X , go to Step 7. Else, set X = X i f f'(X) < 0, or X = X m Z m i f f'.(X) > 0. Go to Step 4. Step 7: X = g A . Go to Step 1. c — m m ^ Algorithm 4.4 (Cubic Fit) Step 0: Given X , A , f(X ), f(X ), f'(X.) and f'(X ). Assume f'(X„) < 0 and f'(X ) > 0. m — Step 1: Calculate u = f (A ) + f (X ) - 3(f(X.) - f(X )) / (X - X ) J- x m x m x m u 2 = [ u 2 - f ' ( X , ) f ' ( X m ) ] 1 / 2 Step_2: X = X m - (X^ - ( f ' ( X J + - u j / (f' ( X J - f ' (X £) + 2u£) 56. Algorithm 4.3 combines the useful features of a cubic f i t and Armijo's step size rule. The rule assures convergence by stipulating sufficient descent before a t r i a l point i s accepted as the next solution point. But the ith t r i a l point i s not always obtained by the relation = P** m^ »• a s x t i s the case in Armijo's algorithm. When the situation warrants i t , the a b i l i t y of a cubic f i t to rapidly pick a good interior estimate is exploited to produce a t r i a l point. 4'.4.5 Optimality Cri t e r i a , Feasibility Tolerance, and Updating the Basis' Inverse • . The main problem (T) posed i n (4.1) i s a nonconvex program. Solving (T) numerically with an algorithm generally leads to at best a local minimum at which the Kuhn-Tucker optimality conditions must be satisfied. In practice, unless a Lagrangian-type algorithm is used, testing the first-order necessary optimality conditions i s l i k e l y not possible since the optimal Lagrange multi-pliers may not be known. Even i f the multipliers are known, checking the Kuhn-Tucker conditions at the end of each iteration may create too large a compu-tational load. The whole question of knowing when optimality i s reached be-comes a trade-off between r e l i a b i l i t y and extravagance. For the sake of com-putational efficiency, simple but imperfect stopping rules for terminating the iterative process have to be adopted. The rule adopted in this chapter i s : Terminate i f the relative change in g Q (C.) £ £ and the relative change in £ <. e, where e is a small preassigned positive number. If either g Q or £ -*• 0, replace relative change with absolute change. Depending on which quantity is of i n -terest, the rule may be relaxed in two ways. If only g Q is important (e.g. only cost, not the design parameters, is sought), the test on the relative change in _£ may be dropped. Analogously, i f only the accuracy of i s emphasized, the test on g Q may be avoided. An excellent operational check on the answer obtained 57. by the previous stopping rules i s to perturb the s o l u t i o n and resolve the problem with the perturbed point as the s t a r t i n g point. Another important issue that must be faced i n implementing a con-ceptual algorithm i s the d e f i n i t i o n of f e a s i b i l i t y , or more generally, the d e f i n i t i o n of i n e q u a l i t y and e q u a l i t y . T h e o r e t i c a l l y , whether a and b are s c a l a r s or vectors, the expression a <^ b, a _> b, and a = b are unambiguous. However, i n the implemented ve r s i o n of an algorithm, because numbers are repre-sented i n f i n i t e lengths, i n t e r p r e t i n g i n e q u a l i t i e s and e q u a l i t i e s i n t h e i r s t r i c t mathematical sense could j u s t spawn a myriad of nuisances such as un-expected e a r l y program termination, undesired negative q u a n t i t i e s , d i v i s i o n by zero and other d i f f i c u l t i e s . For t h i s reason, i n the implementation of t h i s t h e s i s ' algorithms, i n e q u a l i t i e s and e q u a l i t i e s are replaced with s - i n e q u a l i -t i e s and e - e q u a l i t i e s . This means that given s c a l a r s a and b, and a small £ > 0 , a £ b + e replaces a <_ b; a>_b - E, a>_b; and |a - b| <_ e replaces a = b. If a. and b_ are vectors, then a__* K . Then Vd„ (X) and Vd„ (X) are 1 o l ~ o K l — N> ~ gradients of different dual functionals. Clearly, they cannot be used to interpolate either d^ (A) or dj, (A) • A common functional i s needed i f interpolation i s to be used and the penalty constant K can s t i l l be changed. A solution i s to use AQ(A)> i.e., d Q(X) = min £(x, A' °) (5.11) x e X (5.11) i s just the minimization of the ordinary Lagrangian. Observe that d Q a + 2Kh(x(A,K))) = d ^ a ) (5.12) Hence to approximate dg(A) ^y passing a low-order polynomial f i t through two points, dg(A_ + 2Kh(x(A>*0)) has to be evaluated for any tT.?o pairs of A and K. This i s the rationale of the following algorithm whose local convergence was proved by Bertsekas [27]. However, no comparative 83. numerical evidence was provided. Algorithm 5.1 Step 0; Set k = 1. Assume A^ 2 k ^ ' the sequence {K^} are specified. Step 1; Obtain x&k-l) by minimizing i l(x, _A ( 2 k _ 1 ) , K ( 2 k - 1 ) ) s.t. x e X. Step 2: Pk)= P k - 1 )+ 2 K ( 2 k - 1 ) h ( x ( 2 k _ 1 ) ) . It follows from (5.12) and Step 2 that d Q(A* 2 k )) and Vd Q(A ( 2 k )) are known. Step 3: Obtain x/ 2 k^ by minimizing JL(x, A_^2k\ K^ 2 k )) s.t. x e X. Hence d 0 ( A ( 2 k ) + 2K ( 2 k ) h ( x ( 2 k ) ) ) and i t s gradient are known. (2k) (2k) Step 4; Approximate dQ(A) by f i t t i n g a cubic through (Av , dQ(A. )) and ( A ( 2 k> + 2K ( 2 k> h ( x ( 2 k ) ) , d0(A<2k> + 2K ( 2 k ) h ( x ( 2 k > ) ) ) . Let the estimated step size be a Step 5; Set the f i n a l step choice a^ 2 k^ according to the following rule: \ K ( 2 k ) . f 4 K ( 2 k ) < o(2k) -(2k) I a ((2k) . f 2 K(2k) < a(2k) < 4 K ( 2 k ) 2 K(2k) . f o(2k) K 2 K(2k) c a c '„ .(2k+l) ,(2k) . ^ (2k) (2k). . . . , . Step 6: Set _A = _A + a h(x ) , k = k + 1, and go to Step 1. In Algorithm 5.1 , cubic f i t i s used every second iteration, i.e., after every even iteration. The Hestenes-Powell step size rule is (2k) used in every odd iteration. In Step 5, the estimated step size a (2k) (2k) is bounded within [2K , 4K ] because the rate of convergence ana-ly s i s by Bertsekas shows such an interval provides linear local conver-gence. 84. 5.3.3 Updating the Penalty Constant K As discussed in Section 3.6, the penalty constant K must be sufficiently large to assure convergence. Another view of this condi-tion on K i s that K must be greater than some lower bound K* in order that a local convex structure exists. The latter condition, in turn, implies that local duality theory can be used. Furthermore, as K increases, the dual convergence rate improves, provided the i n i t i a l value of X_ is close enough to the optimal Lagrange multiplier vector Xf1. In practice, however, i t i s d i f f i c u l t , i f not impossible, to determine K*. If K i s too large, the primal problem may be d i f f i c u l t to solve because of i l l -conditioning. This situation is particularly l i k e l y i f the i n i t i a l X_ i s not in the neighborhood of If seems that the only practical recourse is to start with a modest value of K, and let i t increase in some way so that both the convergence and the fast rate of convergence promised by the method of multipliers can be attained with high probability. This approach seems to be not different from the ordinary penalty method. There i s , however, a crucial difference - the consolation that K does not have to approach a for convergence to occur. (k) Two rules for choosing the sequence {K } are considered here. Their performance in numerical experiments is detailed in Section 5 . 4 . The two rules are stated below. Rule 1; K ( k + 1 ) = K ( k ) i f || h ( x ( k + 1 ) ) | | < 9 ||h(x ( k ))|| , 9 < 1.0. Else, K ( k + 1 ) = cj) K ( k ), eb > 1.0. Rule 2: K ( k + 1 ) = k K, •> 1.0. , Rule 1 increases the penalty constant only when a prescribed rate of con-vergence is not met. Rule 2 divorces the increase in K from the current 8 5 . condition of the problem. In both rules, the penalty constant i s i n -creased in a geometric fashion. The geometric increase, however, is a convenient arbitrary realization of the requirement that there should be an increase. 5 . 4 Numerical Experiments 5 . 4 . 1 Introduction The schemes discussed in Section 5 . 3 for updating A. and K are combined and tested in a series of numerical experiments. The experi-ments are designed to compare the convergence and the rate of convergence of each of the four possible combinations of updating rules. The numeri-cal study also looks at the sensitivity of each scheme's performance to the key varying parameter of the updating rule for K. Section 5 . 4 . 2 explains the software implementation and other related details. The test functions used in the study are given in Section 5 . 4 . 3 while the experiment plan is described in Section 5 . 4 . 4 . The results of the experiments are presented and discussed in Section 5 . 4 . 5 . 5 . 4 . 2 Software Details The software needed to perform the numerical study requires the computer code SP developed in Chapter IV and an additional subroutine incorporated into the code. The functions of this subroutine are to monitor the violations of the equality constraints and, i f required, to update _A and/or K. At the entry to this subroutine, the subproblem (irj) has been approximately solved. Since the objective function of (TTJ) i s the augmented Lagrangian £(x, _A> K) which involves a l l the equality con-straints, these constraints need not be calculated inside the subroutine i t s e l f . They are a l l implicitly calculated, in the last evaluation of £(x, _X, K) prior to the c a l l of the subroutine. The violations of the equality constraints h^.(x) = P -Qp+k(x) - 1, k = 1, 2, ..., q, can be measured by various distance func-tions. For the study of this chapter, the measure of the equality con-straint violations i s given by the Euclidean distance function <5(N) de-fined as q 5(N) = [ I h \ ( x ^ ) ] 1 / 2 (5.11) k=l K (N) where N i s the dual iteration count and x i s the solution of the Nth subproblem (T^). The equality constraints are said to be sat i s f i e d i f 6(N) < e , where e is a prescribed small positive number. Another possible measure is the Chebyshev distance max I^ QE)! u s e d by k=l,...,q fc Powell [26]. However, the Euclidean distance yields an error region smaller than that of the Chebyshev distance. Two other important tolerances are the tolerance for termina-ting the subproblem (TTj) and the f e a s i b i l i t y tolerance of the inequality constraints. Both are defined as in Section 4.4.5. Computational ex-perience acquired during the preparatory phase of the experiments re-vealed that i f the threshold for terminating the subproblem is small -4 (e.g. < 10 ) and uniform for a l l (dual) iterations, an Inordinate amount of computing time would be consumed, especially in the f i r s t couple of iterations, without improving much the overall performance. Faster con-vergence could be achieved with earlier updating of X_ and/or K. Hence a heuristic rule to accelerate convergence through controlling the thres-hold for terminating a subproblem is included in the subroutine. The rule can be described as follows: Let the threshold for terminating a subproblem be e r r p and the iteration count be k. Step 0: Set = 10 2 . 87. Step 1; If S(k) < 10 or k > 2, Anl = e^/lO.O. Step 2: If ""* Gux Lux 6(k) < 1 0 " ( 1 + J ) , 1 < j < 5, e j j * = ejJ^/lO^. The rule links the reduc-tion of the threshold to the closeness of the approximate solution to the true solution. 5.4.3 Test Problems Four signomial programs of varying sizes serve as the test pro-blems in the numerical experiments. The s t a t i s t i c s of the problems are summarized in Table 5.1. The details of each problem are described in this section. Note that the starting value of _X in a l l cases i s the zero vector with the appropriate dimension. Test Problem No. of Variables No. of Ineq. Const. No. of Eq. Const. Total No. of Terms* Degree of D i f f i c u l t y A 2 1 3 5 12 B 4 1 1 9 4 C 3 2 2 20 16 D 4 1 2 30 25 * Excludes lower and upper bounds on variables. Table 5.1 Summary of Characteristics of Test Problems. 88. The following are the four test problems: Test Problem A -2 , v -2 mm g Q(x) = x 1x 2 - 0.5 x 2 s.t. g-^x) = 0.5 x 1 1 , 8 x 2 < 1.0 g 2(x) = 2x 1 2 + 2x £ 2 + 5.0 - 6x x - 2x £ = 1.0 g 3(x) = (2/3)x± + (1/3)x2 =1.0 g 4(x) = 4x 1 2 + x 2 2 + 10.0 - 12x1 - 2x 2 = 1.0 0.01 < x < 4.0 ~ o -0.01 < x_. < 10.0, j= 1, 2 Solution: X q = 0.5, x^ ^ = 1.0, - 1.0 X± = 0.48793, X2 = 0.075643, X3 = 0.0189 Starting point: X q = 0.5, x^ = 0.5, x 2 = 1.0 Test Problem B mxn g Q(x) = 2.0 - x 1x 2x 3 s.t. g (x) = 0.25 x. + 3.75 x 0x„ + 0.375 x„x, < 1.0 J - — j 15 3 4 " g„(x) = x.. + 2x. + 2x„ + 1.0 - x, = 1.0 ^ — 1 / 3 4 0.01 < x < 2.0 o ~ 0.01 < X j < 1.0, j = 1, 2, 3 0.01 < x. < 2.0 4 -Solution: X q = 52/27, x± = 2/3, x £ = 1/3, x 3 = 1/3, x 4 = 2.0 \ = 1/9 Starting point: X q = 2.0, x± = 0.5, x 2 = 0.5, x g = 0.25, x 4 = 1.5 89. Test Problem C min g Q(x) = x J ' 6 x 2 + X 2 X 3 ° ' 5 + 1 5 - 9 8 x i + 9.0824X2, - 60.72625x, s.t. -2 -2 g 1(x) = x 1x 2 + 1.48 - x 2 x3<_1.0 g 2(x) = ^ ' 2 5 x 3 + x 2 + 6.75 - xJ' 5x 3-° <_1.0 g 3(x) = x 2 + 4.0x2 + 2.0x3 - 57.0 = 1.0 -12 5 2 g^(x) = x 1x 2 x 3' + x 2x 3- x 2 - 15.55 =1.0 1.0 < x < 1000.0 _ o _ 0.1 < x.. < 10.0, j = 1, 2, 3 Solution: X q = 319.4, x^ = 1.0, x 2 = 2.5, x 3 = 4.0 X± = -3.0, A2 = 1.5 Starting Point: X q = 558.0, x± =2.0, x 2 = 2.0, x 3 = 8.0 Test Problem D m m s.t. g Q(x) = 3x1 + 2x 2 + x 3 + x 4 + 7x 1 + 565 - 39x2 - 17x 3 § 1 ( x ) = (l/6)x 2 + (l/18)x 2 + (l/9)x 2 + (1/18)^ -(l/9)x 3 - (l/18)x 2 < 1.0 2 2 2 2 g„(x) = x.. + x„ + 2.5x, + x„ + 9.5 - 4x. - x, = 1.0 2 — 1 3 4 3 Z 4 2 2 A 2 g 3(x) = x 2 + 3x 3 + x^ + x 2 - x^ - x^ - x^ - 2.0 = 1.0 1.0 < x < 1,000.0 - o -0.1 < x j < 10.0, j = 1, ..., 4 Solution: X q = 505.0, x± = 2.0, x 2 = 2.0, x 3 = 1.0, x 4 =1.0 A 1 = -1.0, A 2 = 3.0 Starting Point: X q = 700.0, x± = 1.0, x 2 - 1.0, x 3 = 3.0, x 4 = 2.0 90. In Test Problem A, the inequality constraint g^(x) < 1.0 i s loose at the solution. The same i s true with the constraint g^(x) < 1.0 of Test Problem D. A l l the other inequality constraints of a l l four test pro-blems are tight at the quoted solutions. 5.4.4 Experiment Plan Four schemes are considered in the experiments. The f i r s t two, Schemes I and II, result from combining the Hestenes-Powell step size rule and Rules 1 and 2, respectively, for updating K. Schemes III and IV are the combination of the same two rules for updating K and the use of Algorithm 5.1 to obtain the step size. When Rule 1 i s used to update K, the i n i t i a l values K/°^ tested are 0.1, 1.0, and 10.0 . The reduction factor 8 i s 1.0 while the expansion factor *