"Applied Science, Faculty of"@en . "Civil Engineering, Department of"@en . "DSpace"@en . "UBCV"@en . "Yassien, Hassen Ali"@en . "2009-04-08T20:16:34Z"@en . "1994"@en . "Doctor of Philosophy - PhD"@en . "University of British Columbia"@en . "The mathematical equations used in civil engineering design procedures are predomi\r\nnantly nonlinear. Most civil engineering design optimization problems would therefore\r\nrequire the use of nonlinear programming (NLP) techniques for their solution. Those\r\nNLP packages with the ability to handle practical sizes of problems, and have been\r\navailable on mainframe computers for many years, are only now becoming available on\r\nmicrocomputers. On top of this, these existing NLP techniques, which are dominated\r\nby the gradient methods, do not guarantee global solutions. As a consequence suitable\r\noptimization methods for civil engineering design are not being enjoyed by practitioners.\r\nIn this thesis, the level set optimization method, whose theory was initially presented\r\nin \u00E2\u0080\u009CIntegral global optimization\u00E2\u0080\u009D by [Chew & Zheng, 1988] was further developed to\r\naddress, in particular, practical engineering problems. It was found that Level Set Pro\r\ngramming (LSP), offers a viable alternative to existing nonlinear optimization methods.\r\nWhile LSP does not radically alter the computational effort involved it has some unique\r\ncharacteristics which appear to be significant from the engineering users point of view.\r\nLSP which is classified as a direct search method of optimization, utilizes the set\r\ntheory concept of a level set. It uses estimates of moments of the objective function\r\nvalues at the confirmed points within a level set to control the search advance and as a\r\nmeasure of convergence on the global optimum.\r\nThe reliability and efficiency of LSP was verified by comparing its results with pub\r\nlished results for both mathematical and engineering test problems. 
In addition to the\r\npublished test problems, a new parametrically adjustable mathematical test problem was\r\ndesigned to test global optimization methods in general and to explore the strengths and weaknesses of LSP in particular. Experience with these test problems showed that LSP\r\ngave similar results to those cited in the literature as well as improved results or more\r\ncomplete sets of global solution.\r\nThe large number of solutions developed at each iteration of LSP permits meaningful\r\ngraphical displays of the progressive reduction in the level set boundaries as the global\r\nsolution is approached. Other displays were also found to provide insights into the\r\nsolution process and a basis for diagnosing search difficulties."@en . "https://circle.library.ubc.ca/rest/handle/2429/6965?expand=metadata"@en . "4824523 bytes"@en . "application/pdf"@en . "A LEVEL SET GLOBAL OPTIMIZATION METHODFOR NONLINEAR ENGINEERING PROBLEMSByHassen Au YassienB. Sc. in Civil Engineering, Addis Ababa University, Ethiopia, 1979Diploma in Hydrology, Free University of Brussels, Belgium 194Masters in Hydrology, Free University of Brussels, Belgium 1985A THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE. OFDOCTOR OF PHILOSOPHYinTHE FACULTY OF GRADUATE STUDIESDEPARTMENT OF CIVIL ENGINEERINGWe accept this thesis as conformingto the required standardDecember 1993\u00C2\u00A9 Hassen Au Yassien, 1993THE COLUMBIAIn presenting this thesis in partial fulfilment of the requirements for an advanceddegree at the University of British Columbia, I agree that the Library shall make itfreely available for reference and study. I further agree that permission for extensivecopying of this thesis for scholarly purposes may be granted by the head of mydepartment or by his or her representatives. 
It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Civil Engineering
The University of British Columbia
Vancouver, Canada

Abstract

The mathematical equations used in civil engineering design procedures are predominantly nonlinear. Most civil engineering design optimization problems therefore require the use of nonlinear programming (NLP) techniques for their solution. Those NLP packages able to handle problems of practical size, which have been available on mainframe computers for many years, are only now becoming available on microcomputers. Moreover, the existing NLP techniques, which are dominated by gradient methods, do not guarantee global solutions. As a consequence, suitable optimization methods for civil engineering design are not being enjoyed by practitioners.

In this thesis, the level set optimization method, whose theory was initially presented in "Integral global optimization" [Chew & Zheng, 1988], was further developed to address practical engineering problems in particular. It was found that Level Set Programming (LSP) offers a viable alternative to existing nonlinear optimization methods. While LSP does not radically alter the computational effort involved, it has some unique characteristics which appear to be significant from the engineering user's point of view.

LSP, which is classified as a direct search method of optimization, utilizes the set theory concept of a level set. It uses estimates of moments of the objective function values at the confirmed points within a level set to control the search advance and as a measure of convergence on the global optimum.

The reliability and efficiency of LSP were verified by comparing its results with published results for both mathematical and engineering test problems.
In addition to the published test problems, a new parametrically adjustable mathematical test problem was designed to test global optimization methods in general and to explore the strengths and weaknesses of LSP in particular. Experience with these test problems showed that LSP gave similar results to those cited in the literature, as well as improved results or more complete sets of global solutions.

The large number of solutions developed at each iteration of LSP permits meaningful graphical displays of the progressive reduction in the level set boundaries as the global solution is approached. Other displays were also found to provide insights into the solution process and a basis for diagnosing search difficulties.

Table of Contents

Abstract
List of Tables
List of Figures
Acknowledgement

1 INTRODUCTION
1.1 Nonlinear Programming (NLP)
1.1.1 Direct Search Methods
1.1.2 Gradient based methods
1.2 Limitations of existing NLP methods
1.3 Global optimization
1.4 Optimization needs for the professional
1.5 Earlier work on level set optimization
1.6 Level Set Programming (LSP)
1.7 Thesis outline

2 LEVEL SET PROGRAMMING
2.1 Basic theory of Level Set Programming (LSP)
2.2 Overview of the LSP algorithm - one dimensional case
2.3 LSP Implementation
2.3.1 Cuboid approximation of level set boundary
2.3.2 Alternative methods of sample point generation
2.3.3 Revision of level set value-c
2.3.4 Detailed description of the LSP Algorithm - general case
2.3.5 Alternate termination criteria
2.3.6 Handling Constraints in LSP
2.4 Summary

3 LSP PERFORMANCE IMPROVEMENT
3.1 Partition sets and cluster analysis
3.1.1 Cluster analysis methods
3.1.2 General implementation of cluster analysis methods
3.1.3 A clustering routine for LSP
3.2 Penalty functions
3.2.1 Penalty parameters and LSP
3.3 Inequality and equality constraint relaxation
3.3.1 Tightening of relaxed constraints
3.4 Cuboid stretching
3.5 Skewness adjustment
3.6 Search domain modification

4 TEST PROBLEMS
4.1 Coding of mathematical test problems
4.2 Mathematical problems
4.2.1 The Road Runner function
4.3 Engineering test problems
4.3.1 Basin design [Problem 3-16-1-3]
4.3.2 Pipe network design [Problem 32-1-40-34]
4.3.3 10-member truss design [Problem 10-2-0-40]
4.3.4 38-member truss design [Problem 38-1-0-77]
4.3.5 Unit Hydrograph ordinate determination [Problem 19-1-21-0]
4.3.6 Air pollution control design [Problem 4-11-0-40]
4.3.7 Irrigation system design [Problem 6-7-1-4]
4.3.8 Alkylation process [Problem 10-1-7-0]
4.3.9 Heat exchanger network configuration [Problem 16-1-15-0]
4.3.10 Power generation using fuel oil [Problem 4-11-0-5]
4.3.11 Chemical equilibrium [Problem 10-1-3-0]
4.4 Model fitting

5 EXPERIENCE WITH LSP
5.1 Intermediate output, its presentation, and interpretation
5.1.1 Plots of xi, xj pairs
5.1.2 Plots of cumulative function evaluations-Nf against iteration number-I
5.1.3 Plots of level set value-c against the iteration number-I
5.1.4 Further generalization of the Nf-I and c-I plot interpretations
5.2 Adjusting LSP parameters
5.2.1 Number of confirmed points in the acceptance set-Nkeep
5.2.2 Termination criterion-VM
5.2.3 Cluster criterion
5.2.4 Cuboid stretching parameters
5.2.5 Skewness adjustment parameters
5.2.6 Penalty and constraint relaxation and tightening parameters
5.3 Use of a rhombohedron shaped search domain
5.4 Relationship between initial cuboid volume and its location, and Nf
5.5 Some specific difficulties encountered while implementing LSP
5.6 Observations and recommendations

6 EVALUATION OF NLP PERFORMANCE
6.1 Existing evaluation criteria
6.2 Limitations of Schittkowski's performance criteria
6.3 Recommended set of evaluation criteria
6.4 Evaluation example: LSP versus GRG2

7 LSP COMPUTER IMPLEMENTATION
7.1 Subroutines
7.2 Confirmed point generation strategy
7.3 Intermediate results presentation
7.4 Final output
7.5 Alternative presentation of scatter plots
7.5.1 Static plots
7.5.2 Dynamic plots - Alternagraphics

8 SENSITIVITY ANALYSIS
8.1 Sensitivity with LSP
8.1.1 Sensitivity interpretation of the confirmed points at convergence plotted in the initial cuboid
8.1.2 Response of acceptance set to changes in the level set value-c
8.2 Other approaches to obtaining sensitivity information

9 CONCLUSION
9.1 Introduction
9.2 Evaluating LSP performance using test problems
9.3 LSP performance improvements
9.4 Graphical output
9.5 Civil engineering design problems and LSP

Bibliography

Appendices
A MATHEMATICAL TEST PROBLEMS
B DIRECT SEARCH METHODS

List of Tables

3.1 Reduced number of function evaluations due to cluster analysis methods
4.1 Summary of test problems reported in Appendix A
4.2 Optimal results for the basin design problem
4.3 Input data for the pipe network
4.4 Available pipe diameters and unit prices
4.5 Optimal design for the pipe network given in ASCE, 1977
4.6 Optimal design for the pipe network found with LSP
4.7 Detailed results for the 10-member truss design
4.8 Comparison of truss weights for the 10-member truss design
4.9 Alternate optimal designs for the 38-member truss
4.10 Rainfall-Runoff data for Little Walnut Creek
4.11 Optimal hydrograph for the Little Walnut Creek
4.12 Stack and Emission data
4.13 Optimal design cited in the literature and that found with LSP
4.14 Constants for the irrigation system design
4.15 Optimal solution for the irrigation system design
4.16 Bounds for the variables involved in the alkylation process
4.17 Optimal solution for the Alkylation process
4.18 Stream data
4.19 Match data
4.20 Optimal solution to the heat exchanger configuration
4.21 Constants in fuel consumption equations
4.22 Optimal power generation cited in the literature
4.23 Optimal power generation found with LSP
4.24 Free energy constants (w8)
4.25 Optimal solution to the Chemical equilibrium problem
5.1 Function evaluations using cuboid and rhombohedron
6.1 Schittkowski's NLP performance criteria
6.2 Recommended NLP performance criteria
6.3 Comparison of GRG2 and LSP using proposed criteria set
7.1 LSP's problem definition and operations
7.2 Comparison of Nf for two different point confirmation strategies
7.3 Comparison of Nf for different strategies for a constrained problem
A.1 Data for m = 4 and n = 3
A.2 Experimental data used to fit the nonlinear regression model
A.3 Optimal solution for the regression model
A.4 Data for m = 5 and n = 4
A.5 Experimental data used to fit the nonlinear regression model

List of Figures

2.1 Sequential improvement of level set value-c
2.2 Cuboid derived from limited number of confirmed points
2.3 Cuboid defined by the smallest objective function values
2.4 Nf for cuboids defined by different level set value criteria
3.1 Connected and partition sets
3.2 Level set "volume" in one dimensional problem with and without partitioning
3.3 Single and multiple linkage clustering methods
3.4 Stepwise dendrogram construction
3.5 Cluster analysis approach for a two variable problem
3.6 Modified penalty function
3.7 Relaxation of constraints
3.8 Shift of cuboid with skewness adjustment
3.9 Contrast between cuboid and rhombohedron in two variable space
3.10 Point generation in a rhombohedron search domain
4.1 The LSP algorithm as implemented in this thesis
4.2 The Road Runner function in one dimension to show the influence of a
4.3 The Road Runner function in two dimensions
4.4 Treatment basin
4.5 Two-looped water supply system
4.6 10-member truss
4.7 Detailed results for Venkayya's 10-member truss problem
4.8 38-member truss
4.9 Rainfall components
4.10 Optimal unit hydrographs for Little Walnut Creek
4.11 Alkylation process
4.12 Two-boiler Turbine-Generator combination
5.1 Local optima suggested by the disappearance of a point cluster
5.2 Nf-I plots
5.3 Rastrigin function-Implied local optima by Nf-I plot (Actual screen dump)
5.4 Screen display for the Road Runner function, two dimensional case (Actual screen dump)
5.5 Level set value-c versus iteration-I
5.6 Low Nkeep search for the Road Runner function. Demonstrates discontinuity in c-I plot (Actual screen dump)
5.7 Interpretation of simple c-I and Nf-I curves, unimodal and multi-modal objective functions
5.8 Interpretation of more complex Nf-I and c-I plots
5.9 The crescent shaped feasible region of Subrahmanyam's problem [Problem 2-12-0-4, Appendix A]
5.10 Acceptance set volume change after an iteration while cuboid remains constant
7.1 Deletion of a point in dynamic graphics
7.2 Linking of points for a 4 variable problem [Problem 4-1-0-3]
7.3 Points in the brush highlighted in all the panels [Becker, 1987]
8.1 Plots of confirmed points after convergence criterion is met
8.2 Plots of confirmed points at convergence within the initial cuboid for a 6 variable problem (Actual screen dump)
8.3 Plots of confirmed points within the initial cuboid for different c values
8.4 Points in the level set, for c = c* and c = 1.01 * c* [Problem 4-6-1-2] (Actual screen dump)

Acknowledgement

I wish to express my most sincere gratitude to Dr. William F. Caselton, my teacher and supervisor, who had a profound and positive impact on my academic and professional attitudes. I greatly appreciate his advice, encouragement, guidance and support throughout my graduate studies. This thesis would not exist without his patient efforts and valuable suggestions. I am deeply indebted for his interest in this work, for his invaluable guidance, for the many hours spent on valuable discussions, for his meticulous proof reading, and for his diligence during the research.

My special thanks go to my supervisory committee members, Dr. Alan D. Russell, Dr. S. O. Denis Russell and Dr. Donald W. Thompson, for their valuable suggestions during the course of the research. In spite of their busy schedules, they always found time to review the progress of my work and communicate back with me. They have contributed numerous clarifications of the research.
Their efforts in reviewing this thesis are greatly appreciated.

I would also like to acknowledge my family members, who have always given me moral support and warm affection across the miles. My thanks go to my wife Munira, for the patience and affection she has shown me over the past few years and the understanding she has shown during the travails of my research. Her presence in my life gave added meaning to this whole endeavour.

The financial support of CIDA covering the expenses of my studies, and the role of Associated Engineering International Ltd. in facilitating my stay in Vancouver, are also appreciated.

Above all, I praise the Almighty Allah for all my achievements.

Chapter 1

INTRODUCTION

1.1 Nonlinear Programming (NLP)

The equations used by engineers to mathematically describe engineering problems and physical behaviour are predominantly nonlinear. When an explicit optimization objective is involved, a mathematical formulation of the optimization problem will conform to the conventional nonlinear programming problem. The mathematical formulation of such an optimization problem can be written as

Minimize or Maximize f(x)

Subject to:
g_i(x) ≤ 0 for i = 1, ..., m
h_j(x) = 0 for j = 1, ..., l

where x is an n-tuple vector of independent variables and f(x), g_1(x), g_2(x), ..., g_m(x), h_1(x), h_2(x), ..., h_l(x) are nonlinear functions defined on the Euclidean n-space.

The function f(x) is known as the objective function, while the other two equation types, the g_i(x) ≤ 0 and the h_j(x) = 0, are the set of constraints. The space or region defined by these constraints is known as the feasible space or the feasible region.

The nonlinearity of any of these equations precludes the use of the well established and reliable Linear Programming methods. Approximating the nonlinear equations with linear or quadratic expressions might facilitate a solution being found but can substantially alter the problem being solved.
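To make this standard form concrete, the short sketch below codes a small hypothetical problem in exactly this shape: one nonlinear objective f(x), one inequality constraint g(x) ≤ 0 and one equality constraint h(x) = 0. The functions themselves are invented for illustration and are not drawn from this thesis.

```python
# Minimal sketch of the general NLP form: minimize f(x) subject to
# g_i(x) <= 0 and h_j(x) = 0.  Hypothetical example, not a thesis test problem.

def f(x):                    # nonlinear objective
    return (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2

def g(x):                    # inequality constraint: feasible when g(x) <= 0
    return x[0] ** 2 - x[1]

def h(x):                    # equality constraint: feasible when h(x) = 0
    return x[0] + x[1] - 2.0

def is_feasible(x, tol=1e-6):
    """A point is feasible when every constraint is satisfied (to a tolerance)."""
    return g(x) <= tol and abs(h(x)) <= tol

print(f([1.0, 1.0]), is_feasible([1.0, 1.0]))
```

Any candidate solution can be screened with a feasibility test of this kind before its objective value is compared with others; this is the feasible-region definition used throughout the chapter.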
Finding a satisfactory solution generally necessitates the use of true nonlinear approaches that preserve the characteristics of the problem. Currently the methods used in nonlinear programming are categorized as direct search or gradient search [Reklaitis et al., 1983]. These two method classes are discussed in the next sections.

1.1.1 Direct Search Methods

Direct search methods range from simple simultaneous point sampling techniques to more elaborate procedures that couple sequential sampling with heuristic hill climbing methods [Reklaitis et al., 1983]. None of the direct search methods require the use of derivatives. Some of the most frequently cited direct search methods in the literature are: Blind search; Grid search; Univariate search; Conjugate direction methods; Powell's conjugate direction methods; the Simplex method; and the Complex method.

Many of these methods have elements in common with each other, and the level set method described in this thesis is no exception. Descriptions of each of the above listed methods are included in Appendix B of this thesis for reference purposes.

The blind search, which is also known as simultaneous search, is quite inefficient in its use of function evaluations [Leon, 1966; Reklaitis et al., 1983]. In this strategy, Nf random points are generated and the objective function evaluated at each point. The Nf objective function values are compared and the lowest value is selected as the optimum value.

The size of Nf depends upon the desired probability of successfully finding the global optimum. For a bounded but otherwise unconstrained problem involving n variables and a length of d units on each side of the feasible region, the volume of the feasible region would be d^n. Take a small fraction of the feasible volume, α = (δ/d)^n, where δ is a small length on each side of the bounded feasible region.
The size of α is so small that variation of the objective function value within that region can be tolerated. Then, for a purely random search, the probability of a single trial point falling outside of α would be 1 − α. Assume the probability of having at least one point in α is F; then, with Nf points,

F = 1 − (1 − α)^Nf

Nf = log(1 − F) / log(1 − α)

and for small α,

Nf ≈ −(2.3/α) log10(1 − F)

For F = 0.90, which corresponds to a 0.90 confidence of obtaining the global optimum, the computational effort reflected by Nf would be 2.3 (d/δ)^n [Reklaitis et al., 1983]. With a value of δ/d = 2.5%, a ten variable problem would need about 2.4 × 10^16 function evaluations for 0.90 confidence. This number is prohibitively large from a computational standpoint. The optimal solution can be obtained with probability 1.0 only as the number of sample points approaches infinity. Because of the high number of function evaluations involved, a blind search is not recommended for solving problems of even moderate size.

1.1.2 Gradient based methods

Gradient based methods make use of derivatives to determine the search direction for optimization. One of the common gradient based methods, the steepest descent method, uses only the first derivatives of the objective function for unconstrained problems [Edgar & Himmelblau, 1988]. The gradient vector at a point gives the direction of the greatest decrease in f(x). For a minimization problem the search direction is specified by the negative of the gradient, i.e.

s^k = −∇f(x^k)
where s^k is the search direction at point k, and ∇f(x^k) is the gradient of f(x) at x^k. At any stage of minimization the transition from one point to the next is given by

x^(k+1) = x^k + Δx^k = x^k + λ s^k = x^k − λ ∇f(x^k)

where Δx^k is the vector of increments from x^k to x^(k+1), and λ is a scalar quantity that determines the step length in direction s^k; it can be either a predefined fixed value or its value can be optimized at each iteration to improve the speed of the convergence on the optimum point.

In some cases it is possible to simplify a nonlinear function by making use of the quadratic approximation at any point x^k,

f(x) ≈ f(x^k) + ∇f(x^k)^T Δx^k + 0.5 (Δx^k)^T H(x^k) Δx^k

where H(x^k) is the Hessian matrix of f(x). Some methods then use this second-order approximation of the objective function and the information obtained from the second partial derivatives of f(x) with respect to the independent variables.

The quadratic approximation of f(x) at x^k is differentiated with respect to each independent variable and the resulting expression equated to zero to define a stationary point:

∇f(x) = ∇f(x^k) + H(x^k) Δx^k = 0    (1.1)

The next search point, at iteration k + 1, is obtained from this expression as

x^(k+1) − x^k = Δx^k = −[H(x^k)]^(−1) ∇f(x^k), or

x^(k+1) = x^k − [H(x^k)]^(−1) ∇f(x^k)

where [H(x^k)]^(−1) is the inverse of the Hessian matrix H(x^k). Both the step length and the direction of search are determined from the same expression, Equation 1.1. Only one step would be necessary to reach the optimum if f(x) were a true quadratic equation.
But, since in general this is not the case, there would be multiple steps.

1.2 Limitations of existing NLP methods

For simple two variable problems, elementary direct search methods are often quite satisfactory, but they are neither efficient nor reliable for the higher dimensionality problems which typically arise in engineering [Edgar & Himmelblau, 1988].

Gradient procedures that use second order information are superior to those that use first order information only. But the danger is that the second order information may be only approximate, as it is usually based not on second derivatives but on approximations to second derivatives, with the consequence that it is no longer the original problem that is being solved [Edgar & Himmelblau, 1988].

All gradient procedures start their search from a point and follow a single path until the convergence criterion is met. The point at which the convergence criterion is met is regarded as the optimum point. If a different starting point is used to perform the search again, it is possible that it can lead to a different optimum point. This reveals that solutions found with gradient based optimization methods are often local optima. For a solution point x* to be a local optimum (a local minimum in this case), the objective function value at that point should be the least of all the objective function values in its neighbourhood. That is,

f(x*) ≤ f(x) for all x such that ‖x − x*‖ ≤ ε

where ε is a small value.

Unfortunately, except in certain mathematically simple cases, there is no mechanism which can confirm the global nature of the optimum point obtained with gradient methods. The global minimum is defined as the smallest of all the local minima in the region
But in engineering design we areoften confronted with problems which involve un-differentiable functions, discontinuousfunctions and/or discrete variables. In such cases the use of gradient based methods islimited.Nonlinear optimization problems may have more than one global optimum solution.Furthermore there may be other solutions which, while not globally optimal, producevalues of the objective function close to the global optimum value and are of considerablepractical interest. Unfortunately almost all existing NLP algorithms are incapable ofidentifying the full set of optimal points and provide little or no indication of near optimalsolutions.Information obtained as a byproduct of gradient searches, primarily in the form ofLagrange multipliers, Hessian and Jacobean matrices, does provide information but onlyin the immediate vicinity of the optimal solution. But this information can also bedifficult to interpret, especially in the context of the original engineering problem andwhen the global nature of the solution is in question. It provides no clues concerningmultiple global optima or near optimal solutions. In the event of an NLP method failingto confirm even a local optimum, the user is left with no suggestionfor his next move.Much of the information we get from the intermediate stages of gradient based NLPsearches does not have pertinence to the engineering problem being solved. It simply indicates the outcome of some mathematical manipulations which are incidental to the realworld problem and therefore have no real relevance to an engineering practitioner. Mostof the existing NLP methods will perform well on certain specific types of problems butthese are rarely representative of the more general nonlinear optimization problems facedby engineers. Thus, in general, existing gradient based or direct search NLP methodsChapter 1. 
have limitations in addressing real world problems.

Other optimization techniques, which represent departures from the two classes described above, have also emerged in the last decade. Genetic Optimization, for example, which "mutates" binary strings, has already appeared in the engineering literature [Goldberg, 1989], while Random Tunnelling and Simulated Annealing have been proposed in the last few years. These methods have some limitations in their application; for instance, Random Tunnelling converges rapidly for problems involving only one variable but good results are not expected for higher dimensional problems [Kan and Timmer, 1989], and Simulated Annealing is reported to be inefficient [Torn & Zilinskas, 1988], [Corana et al., 1987] and [Kan & Timmer, 1989].

1.3 Global optimization

When dealing with nonlinear optimization problems, it is possible that the problem possesses many local optima. Because of the highly nonlinear nature of the equations often involved, this phenomenon is quite common in engineering system designs [Luus & Jaakola, 1973]. For general engineering optimization purposes, a method is required that can identify the overall optimal solution among the many alternative local solutions. Such a method is known as a global optimizer. Its aim is to identify the smallest of all the minima in the region of interest; it need not necessarily evaluate all of the local minima.

A sufficient gradient related criterion to positively confirm that a global optimum value has been achieved at a point does not exist. The only way to confirm that a point is a global solution is to evaluate the objective function at that point and compare it with the objective function values at all other points.

In any optimization method the search for the optimum solution is performed by evaluating the objective function at some trial points within the search domain.
A number of points, obviously more than one, have to be used to reach the final solution. The distribution of these trial points within the search domain is influenced by two distinct optimization goals: the global reliability goal and the local refinement goal [Torn & Zilinskas, 1988].

The global reliability goal is based on the assumption that the global optimum could be located anywhere in the search domain, and therefore assumes that no part of the search domain can be neglected if the global optimum is sought. Without any refinement in strategy, it leads to search procedures where the trial points are distributed uniformly over the entire search domain. On the other hand, the local refinement goal is based on the assumption that the probability of detecting an improved solution is higher in the neighbourhood of a point with a relatively low objective function value than in the neighbourhood of a point with a relatively high objective function value. Again, without refinement, this goal has a natural tendency to generate sequences of points with decreasing function values that converge to local minima.

Most global optimization methods focus on exploring the whole of the search domain, but use local search strategies to numerically refine solutions. The various global optimization methods differ in how the search strategy shifts from one goal to the other.

The performance evaluation of local and global minimization methods differs in principle. Local methods are evaluated on the basis of how often and how efficiently they converge on a confirmable optimal point of any kind. But a global method is evaluated on two distinct qualities: the first addresses the same issues as for local optimization methods, while the second, called reliability, measures the capacity to identify the true global optimum [Wang & Luus, 1978].
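The interplay of the two goals can be sketched in a few lines of code: uniform random sampling over the whole domain serves the global reliability goal, and a crude descent from the best sampled points serves the local refinement goal. The multistart sketch below is a hypothetical illustration on an invented one-variable multimodal function; it is not the LSP algorithm of this thesis, and all names and parameter values are assumptions.

```python
import math
import random

def f(x):
    # invented multimodal objective: a quadratic bowl plus a ripple
    return (x - 1.0) ** 2 + 0.3 * math.sin(8.0 * x)

def local_refine(x, step=0.1, iters=200):
    """Local refinement goal: simple pattern search that halves its step on failure."""
    for _ in range(iters):
        if f(x + step) < f(x):
            x += step
        elif f(x - step) < f(x):
            x -= step
        else:
            step *= 0.5
    return x

def multistart(lo=-3.0, hi=3.0, n_samples=200, n_refine=5, seed=0):
    rng = random.Random(seed)
    # global reliability goal: trial points spread uniformly over the domain
    pts = [rng.uniform(lo, hi) for _ in range(n_samples)]
    pts.sort(key=f)
    # refine only the most promising points, then keep the best result
    return min((local_refine(p) for p in pts[:n_refine]), key=f)

x_star = multistart()
print(x_star, f(x_star))
```

How the scheme shifts effort between the two goals is governed here by n_samples and n_refine; the various global methods discussed below differ precisely in how this shift is managed.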
Clearly, because of these dual requirements and the complex nonlinear problems addressed, global techniques must be tested more comprehensively than gradient methods to prove their effectiveness.

Rather than strictly following mathematically formulated procedures, global optimization methods are more inclined to adopt heuristic approaches than are gradient methods. For example, if the global optimum is difficult to find, it may be attractive to deal with an approximation of the objective function which yields a computational enumeration advantage over the original objective function. A second example of a heuristic approach is subdividing the search domain into smaller regions at some stage of the search and then performing the search in each of the subregions. Justifications for introducing heuristics into optimization methods are discussed in [Zanakis & Evans, 1981], where they state the following:

"The need for good heuristics in both academia and business will continue increasing fast. When confronted with real world problems, a researcher in academia experiences at least once the painful disappointment of seeing his product, a theoretically sound and mathematically 'respectable' procedure, not used by its ultimate user. This has encouraged researchers to develop new improved heuristics and rigorously evaluate their performance, thus spreading further their usage in practice, where heuristics have been advocated for a long time."

1.4 Optimization needs for the professional

In addition to the limitations of the existing methods in dealing with nonlinear optimization problems, mentioned in section 1.2, many of them have until recently been confined to implementation on mainframe computers. The fact that access to mainframe computers has been severely limited for most civil engineers has deterred practitioners from using the existing NLP optimization methods. But the substantial improvement in access to computers in recent years, through the introduction of personal computers
But the substantial improvement in access to computers in recent years, through the introduction of personal computers and their rapid development, has not yet led to a corresponding growth in the use of optimization methods in civil engineering practice.

As stated previously, the mathematical equations used in civil engineering design procedures are predominantly nonlinear; therefore most civil engineering design optimization problems require the use of nonlinear programming techniques for their solution. Those NLP methods with the ability to handle practical sizes of problems that have been available on mainframe computers for many years are only just beginning to become available on microcomputers. But there are other problems still facing the practitioner which will not be solved simply by improved access.

None of the nonlinear optimization methods which have been widely implemented to date offers any absolute assurance of finding the global optimum except under very ideal conditions. Generally they rely on the user to initiate the search from a number of different starting points so that the chances of stumbling onto the global optimum are enhanced. No systematic way of selecting starting points to ensure finding the global optimum is available, and the responsibility for the overall global search strategy and its success is left entirely to the user.

Perhaps most seriously of all, a practising civil engineer is not generally trained or skilled in numerical analysis. He therefore faces a considerable investment in time and money to confidently embark on nonlinear optimization to solve a real world problem. If he makes this investment then he still faces the shortcomings of existing methods described in the two previous paragraphs. Although not an uncommon topic in the civil engineering research literature, there has been evidence of only rare and limited use of optimization methods in civil engineering practice.
Clearly existing nonlinear optimization methods have had insufficient appeal to convince civil engineers that their time should be invested in that direction. This prompted the research, reported in this thesis, into a methodology based on the exploration of level sets, which has received very little attention. Amongst many of the other positive attributes explained subsequently in this thesis, it has the potential of being a far more appealing nonlinear optimization methodology to the person who is not a specialist in numerical analysis.

1.5 Earlier work on level set optimization

The theory discussed in [Chew & Zheng, 1988] under “Integral global optimization” opens a new alternative to global optimization based on level set theory. But some other authors have expressed opinions that the level set method is not designed for solving practical problems [Torn & Zilinskas, 1988]. This is because, from the practitioner’s point of view, there are certain issues, important in engineering problems, which Chew and Zheng did not address. As described below, these issues deal with the superiority of the method, computational effort, multiple optima and the significance of intermediate outputs.

Even though the authors hint that there had been improvements made in their work, the improvements are neither enumerated nor documented. This remains a hindrance in identifying their contribution to optimization methodology and assessing the superiority of the level set method over other existing methods.

A general statement was given stating that less computational effort is expended with the level set method compared to the pure random search method. Apart from this statement there was no systematic comparison between the level set method and the
predominant (gradient) methods. Lack of comparative performance tests with respect to efficiency, reliability or global convergence provided no incentive for others to adopt level set optimization.

The authors admitted that only problems which have a single global optimum had been fully addressed, and made only a minor comment on the multiple global optima case. There was no discussion of recognizing near global optimum solutions. But the identification of multiple optima and near optimal solutions are some of the important benefits to practitioners which are identified and investigated in this thesis.

The significance of the level set method output values was also not recognized in [Chew & Zheng, 1988] and consequently the benefits of a graphical interface were overlooked.

1.6 Level Set Programming (LSP)

The formal mathematical presentation of the methodology upon which LSP is based has been presented under the title “Integral Global Optimization” [Chew & Zheng, 1988]. They claim equal or better performance when compared with some of the best “conventional” gradient search NLP methods. Their experience in solving a series of NLP problems supported this finding.

Level Set Programming (LSP) is classified as a direct search method of optimization. Like any other direct search optimization method, LSP avoids gradient evaluations and relies solely on evaluations of the objective and constraint functions at solution points. In comparison with most direct search methods it carries a much larger number of solutions at any one time, and these solutions are dispersed over larger regions of the design space.

LSP utilizes the simple set theory concept of a level set, which will be defined in Chapter 2. It uses estimates of statistical moments of a level set to assess the level set properties, to guide the search algorithm, as well as to measure convergence on an optimal solution.
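For equally weighted sample points, the moment estimates just mentioned (formalized in Chapter 2, Equations 2.7 and 2.8) reduce to simple averages over the points confirmed in the level set. A minimal sketch, with function and variable names chosen for illustration:

```python
def level_set_moments(f_values, c):
    """Estimate the mean M(f, c), variance V(f, c) and modified
    variance VM(f, c) of the objective over the sample points whose
    objective values lie in the level set H_c = {x | f(x) <= c}."""
    in_set = [fx for fx in f_values if fx <= c]
    n = len(in_set)
    mean = sum(in_set) / n                               # M(f, c)
    var = sum((fx - mean) ** 2 for fx in in_set) / n     # V(f, c)
    mod_var = sum((fx - c) ** 2 for fx in in_set) / n    # VM(f, c)
    return mean, var, mod_var
```

Since every confirmed point satisfies f(x) ≤ c, the modified variance can never be smaller than the variance (VM = V + (M − c)²), which is one reason it provides the stronger convergence condition discussed in Chapter 2.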
LSP adopts a global strategy where, at least in theory, the whole of the search domain is explored in the search for the global minimum.

The most significant features of LSP from a conventional engineering standpoint are: LSP is a global search method; the search neither follows a single path nor is influenced by a single point; it covers the whole search domain; its reliability in identifying the global solution is high; the method is conceptually simple; all the computations and numerical results generated during the search are meaningful; and an elementary, and therefore fast, graphical interface can display much of the useful information at any stage of the search. Based on our experience to date, its computational burden is of the same order as gradient based search methods, while its assurance of a successful global search is substantially better.

The name Level Set Programming is adopted to emphasize the central level set feature. The word programming, as appended here, appears to be justified in that the implementation of the methodology has an algorithmic structure with a rigorous theoretical criterion for convergence on the global optimum.

Although the theory of level set optimization already existed prior to this research, its full implementation and testing had never been documented. In this research, a comprehensive level set optimization algorithm named Level Set Programming (or LSP) has been developed.

Refinements to the level set approach, which helped increase the chances of identifying the global solution, increased computational efficiency and enabled the method to handle a wider range of problems, were investigated and implemented. These refinements included the use of cluster analysis, penalty functions and constraint relaxation.
It was also demonstrated that conventional engineering routines which are necessary for evaluating the objective function or constraints (such as structural analysis methods and the Hardy Cross pipe network analysis method) could be readily embedded into the optimization method. A set of control parameters was also introduced into the algorithm and their optimal values investigated. Furthermore, it was found that these parameters could be adjusted during the course of the search to increase the reliability and efficiency of the search.

Graphical displays of intermediate and final search results were also developed during the research, and these have opened up a new perspective on the process of optimization. The human-machine interaction during optimization was also significantly improved with these graphical displays, which guide the user towards search parameter adjustments for increased efficiency of search as well as facilitate a better understanding of the optimization process as it unfolds.

The strength of LSP has been demonstrated by solving a wide range of mathematical problems (about 200) and many difficult and realistic engineering problems (about 20) which have been presented in the literature as being challenging for optimization methods. In addition, a parametrically adjustable mathematical test problem was developed during the course of the research to further strengthen confidence in LSP’s global optimization capabilities and to challenge global optimization methods in general.

1.7 Thesis outline

The theory behind level set programming and its implementation is explained in Chapter 2. Refinements in the LSP scheme and the introduction of some heuristic methods are then explained in Chapter 3.

Chapter 4 explains the general strategy and performance of LSP in solving mathematical and engineering problems. The formulation and solutions of the engineering test problems are presented in the same chapter.
The formulation and solutions of the mathematical problems are documented in Appendix A.

The experience gained in developing LSP and in solving test problems is explained in Chapter 5. This chapter explains the interpretation of intermediate results, discusses the use and adjustment of LSP parameters and the relationship between the volume of the search domain and computational overhead, and finally identifies the principal difficulties which might be encountered while implementing LSP.

The established approaches for evaluating NLP methods and some new proposed criteria are given in Chapter 6. The most important factors in the field of optimization from an engineering practitioner’s point of view are also enumerated in the same chapter and contrasted with the established approaches to evaluating NLP methods.

Chapter 7 explains the operational use of LSP. The subroutines in the computer program developed and its input/output procedures are discussed.

Chapter 8 discusses sensitivity analysis with LSP. The contrast between the definition of sensitivity analysis in the classical methods and in the LSP context is presented. Interpretations of sensitivity information from LSP plots are also included. Finally, Chapter 9 gives the general conclusions concerning LSP’s effectiveness and suitability as an engineering tool.

Chapter 2

LEVEL SET PROGRAMMING

This chapter outlines the basic theory of LSP and then considers extensions and variations of the algorithm which were investigated in the course of this research.

2.1 Basic theory of Level Set Programming (LSP)

This section summarizes the essential theory for the implementation of LSP as described in this thesis. More extensive theory is provided in [Chew & Zheng, 1988] and includes details such as the properties of higher moments (third, fourth, etc.)
of the level set, which do not appear to have any bearing on engineering implementation, and are therefore omitted here.

The general formulation of an optimization problem is

    Minimize  f(x)
    subject to
        g_i(x) ≤ 0    for i = 1, 2, ..., m
        h_j(x) = 0    for j = 1, 2, ..., l
        x ∈ X                                          (2.1)

where

    f(x) is a single valued objective function,
    g_i(x) are inequality constraints,
    h_j(x) are equality constraints,
    X is a subset of E^n, and
    x is a vector of n components x_1, ..., x_n,

and all functions are defined on Euclidean n-space, E^n.

The optimization problem given in Equation 2.1 is a nonlinear programming problem when any or all of the constraints or the objective function are nonlinear. In the applications of interest here, x corresponds to a set of design or decision variables.

For an unconstrained but bounded minimization problem the goal is to identify x*, which is called the global optimum point, or a set of global optimal points, and c*, which denotes the global optimum value f(x*), where the relation f(x*) ≤ f(x) would be true for all feasible points x. Thus the minimization problem is to find the minimum of f(x) over S_b, where S_b is the region defined by the upper and lower bounds of each variable x_i, so that

    c* = min{ f(x) | x ∈ S_b }                         (2.2)

A level set, which is at the heart of the optimization methodology in this thesis, is defined as the set of points which provide an objective function value below or equal to some specified value. If this specified value is represented by c, then the associated level set H_c can be defined by

    H_c = { x | f(x) ≤ c }.                            (2.3)

Bounds on each component of x are implicit, but constraints are not considered in this definition.

Certain first and second moments of the objective function values of the points contained in the level set play important roles in LSP and are defined below. The following
expressions are for continuous integrable functions on S_b only, adapted from [Chew & Zheng, 1988].

Mean value over the level set H_c

    M(f, c) = (1/μ(H_c)) ∫_{H_c} f(x) dμ               (2.4)

and for c ≥ c*,

    M(f, c) ≤ c

while for constants c_2 ≥ c_1,

    M(f, c_2) ≥ M(f, c_1)

where Chew & Zheng defined μ as a measure of the level set, and c_1 and c_2 represent any two constants greater than or equal to the global optimum value.

Variance over the level set H_c

Similarly, Chew and Zheng specified the variance and a modified variance over the level set H_c as

    V(f, c)  = (1/μ(H_c)) ∫_{H_c} [f(x) − M(f, c)]² dμ

    VM(f, c) = (1/μ(H_c)) ∫_{H_c} [f(x) − c]² dμ       (2.5)

The modified variance is slightly easier to compute and has the same general properties as the variance.

For the discrete sampling case, Equations 2.4 and 2.5 can be written in the form of summations, with weights a_j, to provide estimates of M(f, c) and VM(f, c):

    M(f, c)     = (1/Σ_{x_j ∈ H_c} a_j) Σ_{x_j ∈ H_c} a_j f(x_j)

    V(f, c; S)  = (1/Σ_{x_j ∈ H_c} a_j) Σ_{x_j ∈ H_c} a_j [f(x_j) − M(f, c)]²

    VM(f, c; S) = (1/Σ_{x_j ∈ H_c} a_j) Σ_{x_j ∈ H_c} a_j [f(x_j) − c]²        (2.6)

In practice, there would be only a limited number of points in a level set; therefore, Equations 2.4 and 2.5 are rewritten to reflect this as follows.

    M(f, c)  = (1/N_keep) Σ_{j=1}^{N_keep} f(x_j)                              (2.7)

    V(f, c)  = (1/N_keep) Σ_{j=1}^{N_keep} [f(x_j) − M(f, c)]²

    VM(f, c) = (1/N_keep) Σ_{j=1}^{N_keep} [f(x_j) − c]²                       (2.8)

where N_keep, the number of points at which f(x) is calculated for a discrete sampling scheme, is the number of points in the level set, and plays the role of the measure μ(H_c) in Equation 2.4. The equivalent moments for the integer variable or discontinuous function cases are computed by evaluating the objective function at discrete points.

Convergence on global optimum

Take a large real number c_0 such that the level set defined by this number over the function f(x) is non-empty, that is
    H_{c_0} = { x | f(x) ≤ c_0 } ≠ ∅

Now take a decreasing sequence of level set values

    c_0 > c_1 > ... > c_k > c_{k+1} > ... > c*

where c* is the global minimum value of f(x). Now define a decreasing sequence of level sets H_{c_k} as

    H_{c_k} = { x | f(x) ≤ c_k }

[...] level set value c. Generate as many confirmed points as those discarded in step 8 of the previous iteration. That is, restore the number of points in the acceptance set to a total of N_keep.

5. Evaluate the mean of the objective function values, and assign this mean value to the new level set value c′ for the next iteration. That is,

    c′ = M(f, c)

6. Calculate V(f, c), the variance of the objective function values,

    V(f, c) = (1/N_keep) Σ_{j=1}^{N_keep} [f(x_j) − M(f, c)]²

If the variance is less than the pre-specified convergence tolerance, ε, terminate the search, otherwise continue to the next step.

7. Discard points whose objective function values are greater than c′.

8. Set c = c′.

9. Go back to step 4 and redefine a new cuboid.

Let the value of c at termination be represented by c_t. V(f, c_t) ≤ ε is satisfied, so that c_t ≈ M(f, c_t). The solution is presented in the form of a final acceptance set H_{c_t} and will contain points in the immediate vicinity of the global optimum or optima.

2.3.5 Alternate termination criteria

Because of its unique search mechanism, LSP’s requirements for a termination criterion differ from other NLP search methods. In addition to the termination criterion described in section 2.3.4, three other alternatives were considered and are described below.

i) Size of cuboid
In the special case where the problem is unimodal, the cuboid at c = c* is theoretically a single point, and cuboid volume can be exploited to provide a convergence criterion.

If Vol = Π_{i=1}^{n} d_i defines the volume of the cuboid at any stage of the LSP search in an n dimensional problem, where d_i is the length of the side of the cuboid measured on the i-th variable axis, then this volume tends to zero as convergence is approached. The search could be terminated when the volume is less than a pre-specified small value. Since the length of each side of the cuboid gets closer to zero as the search progresses, the fact that Vol → 0 as d_i → 0 for i = 1, ..., n is obvious. But the danger is that the convergence criterion could be met even if only one or two sides approach zero length, and a premature or pseudo-convergence could result. To avoid this, a criterion requiring that the sum of the cuboid side lengths, Σ_{i=1}^{n} d_i, be close to zero could be used in addition to the volume criterion. As a further alternative, the size of each individual cuboid side length in the final cuboid could be compared to a pre-specified termination criterion. This last mentioned criterion is similar to that used in many local optimization methods [Archetti & Schoen, 1984; Betro & Schoen, 1987; Boender & Kan, 1987], which stipulate that if x^c is the current estimate of the minimum point and x^p is the last predicted minimum point, then convergence is considered to have been achieved if

    |x_i^c − x_i^p| < ε_0    for i = 1, 2, ..., n

where ε_0 is a pre-specified small value.

ii) Difference between consecutive level set values

The difference between the objective function values at consecutive iterations is used as a convergence criterion by some direct search methods. For instance, the
following simultaneous termination criteria are used for Newton’s direct search method [Edgar and Himmelblau, 1988]:

    |f(x^{k+1}) − f(x^k)| / |f(x^k)| < ε_1

or, as f(x^k) → 0,

    |f(x^{k+1}) − f(x^k)| < ε_2

    |x_i^{k+1} − x_i^k| / |x_i^k| < ε_3

or, as x_i^k → 0,

    |x_i^{k+1} − x_i^k| < ε_4

[...] If c ≥ c*, then

    M(f, c; S) ≤ c
    M(f, c*; S) = c*
    V(f, c*; S) = 0
    VM(f, c*; S) = 0

For a discrete sampling (or integer problem), Equation 2.18 can be rewritten as

    M(f, c; S)  = (1/Σ_{x_j ∈ H_c ∩ S} a_j) Σ_{x_j ∈ H_c ∩ S} a_j f(x_j)

    V(f, c; S)  = (1/Σ_{x_j ∈ H_c ∩ S} a_j) Σ_{x_j ∈ H_c ∩ S} a_j [f(x_j) − M(f, c)]²

    VM(f, c; S) = (1/Σ_{x_j ∈ H_c ∩ S} a_j) Σ_{x_j ∈ H_c ∩ S} a_j [f(x_j) − c]²      (2.19)

Reduction method

Whenever there are equality constraints, the number of variables in the problem can theoretically be reduced by the number of equality constraints. In effect, reduction can be performed by solving a system of simultaneous nonlinear equations. Most of the time, handling equality constraints in this fashion is difficult, if not impossible, since in practice nonlinear constraint functions are seldom simple enough to be able to equate the function with a single variable.

Penalty method

When the set of constraints defines a very small or narrow feasible region, or one governed by equality constraints, generating feasible points can be difficult. In this case the objective function can be modified by adding some terms which penalize (i.e. increase) the value of the objective function when any sample point outside the feasible region is evaluated. The existence of points that violate the constraints is permitted within the acceptance set but is discouraged later in the search. This approach is common to all penalty function methods.

Let S be a closed subset of X.
Consider the constrained minimization problem

    c* = min [ f(x) + α p(x) ]

where α is a positive value.

Definition: a function p on X is a penalty function for the constraint set S if

    p(x) ≥ 0    for all x ∈ X
    p(x) = 0    if and only if x ∈ S
    p(x) > 0    for all x ∉ S

Suppose c_k, the level set value at the k-th iteration, with c_k ≥ c*, is decreasing and tends to c* as k → ∞, and α_k is a positive increasing sequence which tends to infinity as k → ∞. Let

    H_k = { x | f(x) + α_k p(x) ≤ c_k }                 (2.20)

then

    lim_{k→∞} H_k = ∩_k H_k = H ∩ S

and

    lim_{k→∞} μ(H_k) = μ(H ∩ S)

As defined earlier, the mean value of f(x) over its level set H within a closed feasible region defined by the constraint set S is given by

    M(f, c; S) = (1/μ(H ∩ S)) ∫_{H ∩ S} f(x) dμ

Because c* is a lower bound of c,

    lim_{k→∞} (1/μ(H_k)) ∫_{H_k} f(x) dμ = M(f, c; S)

The limit does not depend on the choices of sequences {c_k} and {α_k}.

Suppose c > c*; then the limit

    M(f, c; p) = lim_{k→∞} (1/μ(H_k)) ∫_{H_k} f(x) dμ

is called the penalty mean value of f(x) over the level set H_k. Suppose c* = f(x*) is the global minimum value of f(x) over S; then

    c* = M(f, c*; S) = M(f, c*; p)

The penalty mean value, variance and modified variance of f(x) have the same convergence properties as the constrained mean value, variance and modified variance; therefore they share the same properties [Chew & Zheng, 1988].

2.4 Summary

The general theoretical description of the use of level sets for nonlinear optimization was laid down by Chew and Zheng [Chew & Zheng, 1988]. In this chapter, the theory has been extended, the potential importance elaborated and some practical application issues addressed.

One of the main ideas explored is the relationship between the reliability of LSP solutions and the revision of the level set value at each iteration.
The choice of revised level set value was found to affect the reliability of the solutions and the overall performance of LSP, but the use of M(f, c) was confirmed as being appropriate for general applications.

Different methods of sample point generation were also investigated. The exploration of alternative methods has helped to give more insight into a key numerical and computational aspect of this implementation of level set theory. Even though many methods were examined, only one of the methods, the cuboid approach, is recommended for general application.

Various search termination criteria were also examined. Any of these criteria can be used in practice without affecting the theoretical global convergence characteristics, though the best choice is dependent upon the nature of the problem being solved.

In practical implementations of LSP, the modified variance, VM, is adopted as a termination criterion instead of the variance evaluated around the mean. The reason is that the modified variance is larger and therefore provides a stronger convergence condition, and is, as well, easier to evaluate. Moreover, the modified variance is more sensitive to the entry of points into the acceptance set which have the lowest values of the objective function. This becomes a significant benefit in the final convergence stages of the search. In addition to using variance as a convergence criterion, a plot of level set value c against each iteration is displayed to give a visual indication of the typically asymptotic convergence of the level set value c on c*.

How closely the final value of the convergence criterion VM(f, c) approaches zero in practice is a question of numerical precision and the practical needs of the problem. It is governed primarily by the number of sample points maintained in the level set and the accuracy of the computational device used in evaluating the various nonlinear functions involved in the problem.
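The iterative scheme described in this chapter can be sketched in outline as follows. This is a simplified illustration only (a single connected cuboid, equal point weights, no constraints, and variance around the new level value as the stopping test); the function and parameter names are chosen for the sketch and are not from the thesis code.

```python
import random

def lsp_search(f, bounds, n_keep=60, eps=1e-8, max_iter=300, seed=0):
    """Sketch of the basic LSP iteration: keep n_keep confirmed points,
    set the next level set value c to the mean objective value M(f, c),
    and terminate when the variance of the kept values falls below eps."""
    rng = random.Random(seed)
    pts = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_keep)]
    c = max(f(p) for p in pts)                  # initial level set value
    for _ in range(max_iter):
        # Cuboid: smallest box enclosing the current acceptance set.
        box = [(min(p[i] for p in pts), max(p[i] for p in pts))
               for i in range(len(bounds))]
        # Restore the acceptance set: confirm new points with f(x) <= c.
        while len(pts) < n_keep:
            x = [rng.uniform(lo, hi) for lo, hi in box]
            if f(x) <= c:
                pts.append(x)
        fvals = [f(p) for p in pts]
        c_new = sum(fvals) / n_keep             # next level value M(f, c)
        v = sum((fx - c_new) ** 2 for fx in fvals) / n_keep
        if v < eps:                             # convergence test
            break
        pts = [p for p, fx in zip(pts, fvals) if fx <= c_new]
        c = c_new
    return min(pts, key=f)
```

On a simple unimodal test such as f(x) = x₁² + x₂² over [−5, 5]², this sketch converges to a point near the origin; the partitioned, constrained and penalty-augmented variants described elsewhere in the thesis build on this same loop.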
In general, the most important aspect of LSP is fixing the number of points in an acceptance set. Too few points increases the chances of missing very localized but significant fissures in the objective function surface; too many points places an unnecessarily high computational burden. For a fixed number, N_keep, of points in the acceptance set, the density of points per unit length on the variable axes increases with the number of iterations.

The search process is not infallible: a very localized fissure in the f(x) surface which contains the global optimum can be overlooked, while the VM(f, c) value still converges on zero over one or more local optima. Such problems are more likely to emerge when the number of sampling points in a level set is too low. Problems which are peculiarly difficult for LSP are thus possible, and some of these are discussed in Chapter 5, section 5.5.

Chapter 3

LSP PERFORMANCE IMPROVEMENT

Optimization based purely on the exploration of level sets has been described by some authors as a theoretical approach with little practical significance [Torn & Zilinskas, 1988], since they felt that the method lacks strong convergence characteristics. There have been few attempts to implement pure level set methods in any general way. Those implementations that have been made are poorly documented, and there is little existing literature on enhancing the performance of LSP type methods. A variety of approaches to improve the efficiency of the LSP search have therefore been investigated in this research. The more successful techniques, dealing with partitioning of the search domain, relaxation of constraints and redefining of the search domain, are discussed in this chapter. In addition to improving the computational efficiency of LSP, these techniques also tend to produce more accurate convergence on optimal solutions and enhance the reliability of detection of boundary solutions.
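Of the constraint-handling ideas mentioned, the penalty treatment (formalized in Chapter 2) admits a compact sketch. The function names and the quadratic form of the penalty are illustrative assumptions; the theory only requires p(x) = 0 on the feasible set and p(x) > 0 outside it.

```python
def make_penalized(f, ineqs, eqs, alpha):
    """Return the penalty-modified objective f(x) + alpha * p(x), where
    p(x) = 0 exactly on the feasible set and p(x) > 0 outside it.
    Inequalities are taken in the form g(x) <= 0, equalities h(x) = 0;
    the quadratic penalty used here is an illustrative choice."""
    def fp(x):
        p = sum(max(0.0, g(x)) ** 2 for g in ineqs)   # g(x) <= 0 violated
        p += sum(h(x) ** 2 for h in eqs)              # h(x) = 0 violated
        return f(x) + alpha * p
    return fp
```

Increasing alpha over the iterations (the positive increasing sequence α_k of Chapter 2) progressively discourages infeasible points from surviving in the acceptance set, while still allowing them early in the search.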
Further discussion of actual experience when implementing these techniques and setting the values of the various LSP parameters is provided in Chapter 5.

3.1 Partition sets and cluster analysis

The volume occupied by the level set is always less than or equal to any containing cuboid volume. The ratio of level set volume to cuboid volume has a direct influence on the efficiency of generating new points in the level set; when this ratio is low, considerable computational effort will be expended on unsuccessful point generation. Very low ratios can result from problems involving widely separated, multiple global optima, or from a long, narrow level set with its principal axis at an oblique angle (e.g. 45°) to the variable axes.

For a typical unimodal problem, the ratio of level set to cuboid volume remains both high and nearly constant throughout the search. But if the optimization problem has multiple global optima, then the cuboid would include a large amount of non level set space. This results in a very low level set to cuboid volume ratio. Moreover, the level set could consist of discontinuous sets within the cuboid.

Some new terms are introduced here to facilitate discussion of the computational implications when the level set becomes discontinuous.

Two sets B and C, where B ≠ C, define a disjoint set if B ∩ C = ∅, that is, if B and C do not have any element in common.
A class A of sets is called a disjoint class of sets if each pair of distinct sets in A is disjoint [Lipschutz, 1965].

A class A of non empty subsets of A is called a partition of A if and only if each a ∈ A belongs to some member of A and the members of A are mutually disjoint [Lipschutz, 1965].

A space A is connected if, whenever it is decomposed as the union B ∪ C of two non empty subsets, then B̄ ∩ C ≠ ∅ or C̄ ∩ B ≠ ∅, where the upper bar designates the complement set and ∅ designates the null set [Armstrong, 1983].

If a level set defines a region where all points in the level set belong to a single set, then the level set is called connected, as in Figure 3.1(a). Henceforth in this thesis a connected subset of the partition will be referred to simply as a “subset”.

If a level set defines a set of small subregions and there is no common point between any two subregions, then the level set is called a partition level set.

When the subsets of the partition are close together they will, in practice, be treated as if they were a connected set. Figure 3.1(b) shows a partition level set in two dimensions.

Sub-cuboid will be defined here as the cuboid enclosing an individual subset of a partition level set.

Minimal cuboid enclosure will be defined in this thesis as the set of sub-cuboids which enclose the level set in a specific cuboid. The minimal cuboid enclosure would be equal to the cuboid itself if the level set is connected; otherwise it would be a collection of more than one sub-cuboid.

As the iterative process of LSP unfolds, the size of the region occupied by the acceptance set gets smaller. However, this region may contain a single connected level set in the special unimodal case, or it can enclose a partition level set forming separate discontinuous regions.
Points generated in any of these subregions would be accepted if they meet the level set condition f(x) ≤ c and do not violate any of the constraints. The efficiency of generating an acceptable sample point in the cuboid would be directly proportional to the ratio of the acceptance region volume to the cuboid volume.

For a multiple optima problem, the chances of generating an acceptable point become smaller as the search approaches the global optimum value of the objective function. This is because the volume of the acceptance region becomes extremely small near convergence whereas, at this stage of the search, the cuboid typically maintains a more or less constant volume. This is particularly true when the global optimum points are at stationary points in the objective function surface. Although an inefficient search may eventually converge, the time required to obtain a solution may be prohibitive. To overcome the inefficiency due to the presence of multiple optima, it is possible to subdivide the cuboid into a set of smaller cuboids without sacrificing the integrity of the search. This is provided, of course, that no subregion containing a global optimum is omitted from the smaller cuboid set. If the existence of distinct clusters of points is suspected during a search, statistical cluster analysis can be used as an aid to identifying the clusters, and hence the sub-cuboid boundaries. Details on cluster identification and classification will be discussed in section 3.1.1. The current level set value c would be applicable to all of the sub-cuboids at the point of dividing the cuboid into smaller sub-cuboids.

[Figure 3.1: Connected and partition sets. (a) Connected set; (b) Partition set.]

After the clusters have been identified, their individual sub-cuboid dimensions are defined.
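Defining the enclosing sub-cuboid for each identified cluster is straightforward once the clusters themselves are known. A minimal sketch; the relative margin parameter is an illustrative safeguard, reflecting the requirement that cluster boundaries must not inadvertently exclude a global optimum:

```python
def sub_cuboids(clusters, margin=0.05):
    """Build one sub-cuboid (a list of per-variable (lo, hi) bounds)
    around each cluster of confirmed points, padded by a small relative
    margin so that boundary points are not inadvertently excluded."""
    boxes = []
    for pts in clusters:
        dims = len(pts[0])
        box = []
        for i in range(dims):
            lo = min(p[i] for p in pts)
            hi = max(p[i] for p in pts)
            pad = margin * (hi - lo)
            box.append((lo - pad, hi + pad))
        boxes.append(box)
    return boxes
```

The actual grouping of points into clusters is left to a clustering routine of the kinds surveyed in section 3.1.1.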
The combined sum of the volumes of all sub-cuboids would be smaller than the volume of the cuboid before subdividing into smaller subregions. Thus the total volume of search is reduced relative to the volume of the level set.

Once sub-cuboids are defined, the optimization problem is then treated as a separate problem in each subregion and each is solved independently. The process of performing cluster analysis and subdividing the cuboid into smaller sub-cuboids can be repeated at subsequent stages of the LSP search, but will result in solving an ever increasing number of sub-problems. Such an exhaustive division of the search domain leads to an inefficient search and is generally unjustified.

An example to show the merits of using smaller cuboids at some stage of the search for the unconstrained but bounded one dimensional case is given in Figure 3.2. Here the cuboid is one dimensional and its “volume” is represented by the length of its single side. At the k-th iteration, the chance of generating a feasible point fulfilling f(x) ≤ c_k is roughly 0.5; that is, the level set designated as H_{c_k} in the figure occupies about half the cuboid length. At the next iteration, the level set value c is lowered to c_{k+1} and the corresponding level set volume becomes so small that the efficiency of generating feasible points fulfilling f(x) ≤ c_{k+1} drops to below 0.25. After this stage of the search it would be advantageous to work with two distinct cuboids, where the chance of generating points fulfilling the level set condition would approach 1.0, depending on how precisely the cuboid boundaries are estimated.

Recognizing such benefits from partitioning, it is almost essential that a cluster analysis subroutine be incorporated into LSP. From the implementation point of view, the precision needs for the identification of clusters and cluster bounds are not critical.
Except, of course, that the cluster boundaries, and hence the cuboid boundaries, must not inadvertently exclude a global optimum.

[Figure 3.2: Level set "volume" in a one dimensional problem with and without partitioning. H_ck = {x | f(x) ≤ c_k}, H_ck+1 = {x | f(x) ≤ c_k+1}; the cuboid side at the (k+1)th iteration is much shorter than the cuboid side at the kth iteration.]

3.1.1 Cluster analysis methods

A cluster can be defined as a set of points that possess some common characteristic which is not shared by members of another group, or as a set of points that bear some relationship to each other but not to those outside that particular set. In Euclidean space, where the relationship between points is a function of their distance from each other, clusters are defined as points which are located close to each other. In some cases a similarity criterion, which expresses the common characteristics shared between points, is used instead of distance to classify points into clusters. When partitioning level sets, Euclidean distance is the appropriate measure of the relationship between points in the decision space. Cluster analysis is a technique which allocates points to a cluster in such a way as to maximize the distance between clusters and minimize the distance between points within a cluster. Cluster analysis procedures can be divided into two major categories, which are briefly explained here.

• Hierarchical methods: These produce a tree-like taxonomic system where, at one end of the tree every individual point is a cluster and at the other end all points are included in a common cluster.
At intermediate levels, clusters are formed by aggregating lower clusters. The hierarchical cluster configuration is usually represented graphically as a dendrogram, as shown in Figure 3.4. This graphical structure is a common tool used to express the results of a clustering analysis. Hierarchical methods can be divided into agglomerative and divisive methods.

i) Agglomerative methods: These are methods where similar data points are sequentially aggregated to form a single cluster. The methods construct the hierarchical tree from individual points at the branch tips to a single root. Specific agglomerative methods differ according to their definition of distance between a point and a cluster, or between two clusters [Mezzich & Solomon, 1980].

ii) Divisive clustering methods: These methods subdivide the aggregate set of data points into smaller subgroups by partitioning, such that the variance within each group is minimized and the variance among the groups, their mutual separation, is maximized without breaking up natural groupings. One such technique is Automatic Interaction Detection, AID, which employs a series of discrete variable splits, each dictated by the maximum reduction in the empirical unexplained variance [Jenson, 1977].

• Non-hierarchical methods: In contrast to hierarchical methods, these produce configurations that do not present rankings in which lower order clusters must become members of larger, more inclusive clusters. These methods essentially produce the final partition in a single step. The major approaches used in non-hierarchical methods are as follows [Mezzich & Solomon, 1980].

i) Total enumeration of partitions methods: In this method an attempt is made to enumerate all clustering possibilities, and then select the best cluster arrangement. A quantitative clustering index is used to choose the best alternative.
Since this method involves exhaustive searches, it appears to be computationally unattractive even for modest data sets.

ii) Nearest centroid sorting methods: The basic feature of this method is the selection of seed points to be used as cluster nuclei around which the set of points can be grouped. When the number of clusters is fixed, the locations of the centres of the clusters are updated after each full cycle of allocation of points. Given the number of clusters, the method allocates points so that the within-cluster sums of squares are minimized [Hartigan, 1977]. For a variable number of clusters, points are allocated on the basis of the nearest centroid sorting process. The number of clusters changes during the allocation process by partitioning those clusters which have a large within-cluster sum of squares to form smaller clusters, and joining any two clusters with a small between-cluster sum of squares to form a larger cluster. This is controlled by using certain parameters for "coarsening" and "refinement" set by the user [Mezzich & Solomon, 1980].

iii) Reallocation methods using variance-covariance criteria: The basic procedure in this category is to reallocate points among a set of clusters in such a way as to optimize some overall discriminant function or variance-covariance criterion.

iv) Density search methods: These methods use the allocation of points in a metric space, looking for regions of high point density separated by regions of low density to identify clusters.

From the experience in this research, the agglomerative hierarchical clustering method was found to be the easiest to implement, with the fewest complications in theory and application, while providing acceptable precision for LSP purposes.
This method is discussed in detail in the next section.

3.1.2 General implementation of cluster analysis methods

Suppose there are m points, in n dimensional Euclidean space, which we wish to arrange into a hierarchical classification. The data set for the m points forms an m × n matrix. The units of measurement for the m × n raw data matrix are standardized, which means that all variables are transformed to a single unit, before computing distance measurements. This ensures that each variable is weighted equally; otherwise the Euclidean distance d_ij will be influenced most strongly by the variable which has the greatest magnitude [Davis, 1973]. The Euclidean distance between two points is computed using equation 3.1:

d_ij = [ Σ_{k=1}^{n} (X_ik − X_jk)² ]^(1/2)    (3.1)

where X_ik denotes the kth variable measured at point i and X_jk is the kth variable measured at point j. In all, n variables are measured at each point and d_ij is the distance between point i and point j. The distances between all possible pairs of points form an m × m symmetric matrix. The points are arranged into a hierarchy based upon the magnitude of d_ij. Initially all pairs of points with mutually short distances are grouped together to form the beginnings of clusters. Then those points which are not in any of the clusters formed at the initial stage, together with the small clusters already formed, are regrouped to form larger clusters. Small clusters are treated in the same way as points, and are lumped together on the basis of their mutual distances. The process of regrouping is repeated until all of the points have been placed into a complete classification scheme.

The first step in clustering by the pair-group method is to find the mutually closest distance in the matrix; the corresponding data pairs form the centres of subsequent clusters.
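Equation 3.1, together with the standardization step, can be sketched in a few lines of Python. This is an illustration only; the helper names are invented.

```python
import math

def standardize(data):
    """Scale each variable to zero mean and unit standard deviation so that
    no variable dominates the distance purely through its magnitude."""
    m, n = len(data), len(data[0])
    means = [sum(row[k] for row in data) / m for k in range(n)]
    # The `or 1.0` guards against a constant column (zero standard deviation).
    sds = [math.sqrt(sum((row[k] - means[k]) ** 2 for row in data) / m) or 1.0
           for k in range(n)]
    return [[(row[k] - means[k]) / sds[k] for k in range(n)] for row in data]

def distance_matrix(data):
    """Symmetric m x m matrix of Euclidean distances d_ij (equation 3.1)."""
    m, n = len(data), len(data[0])
    return [[math.sqrt(sum((data[i][k] - data[j][k]) ** 2 for k in range(n)))
             for j in range(m)] for i in range(m)]
```

In practice the raw data would be passed through standardize() before distance_matrix(), exactly as the text prescribes.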
After the centres have been identified, all the other points are connected to the centres one by one to form a dendrogram.

The most common and simple clustering method is the single linkage method, which employs a predefined arbitrary critical distance as a criterion for clustering. Clustering will only occur between pairs of points and small clusters if the distance between them is less than the critical distance. If all pairwise distances are greater than the critical distance no clustering can occur. After the initial clusters have been formed, the distances between the remaining points and the closest point within a cluster are computed one by one. Those points whose distance from a cluster is less than the critical distance are entered into the cluster. The procedure continues until either no more points can enter any cluster, or all points agglomerate into a single cluster.

A similar method, referred to as the multiple linkage or complete linkage method, uses the distance between an unclustered point and the farthest point in a cluster as the measure for clustering. The distance between an unclustered point and the most distant point within a cluster is checked against the critical distance to determine whether the point can be a member of that cluster. Figure 3.3 shows two different results of cluster analysis obtained from a single data set using the single and multiple linkage cluster analysis methods.

Another type of cluster analysis method, weighted average cluster analysis, uses the equally weighted average distance between the cluster and a newly connected point. This method gives equal importance to the set of points which have already formed a cluster and the new point which is to enter the cluster, hence the greater influence of the latter. In this method, once the first grouping is completed, the distance matrix must be recalculated, treating each set of grouped points as a single point.
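The single linkage rule described above, where a point joins a cluster when its distance to the nearest member falls below the critical distance, can be sketched as follows. This is an illustrative fragment with invented names, not the thesis subroutine.

```python
def single_linkage(points, critical):
    """Group points into clusters: a point joins a cluster if its distance to
    the *nearest* member is below the critical distance (single linkage)."""
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
    clusters = []
    for p in points:
        # All existing clusters that p links to at the critical distance.
        hits = [c for c in clusters if any(dist(p, q) < critical for q in c)]
        # p bridges every cluster it touches, so merge them all with p.
        merged = [p] + [q for c in hits for q in c]
        clusters = [c for c in clusters if c not in hits] + [merged]
    return clusters
```

Replacing the `any(...)` test with `all(...)` would turn this into the complete (multiple) linkage rule, where the farthest cluster member is the one checked against the critical distance.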
Then the next step is to regroup the small clusters already formed, on the basis of the shortest mutual distance, to form a larger cluster.

[Figure 3.3: Single and multiple linkage clustering methods. (a) Single linkage; (b) Multiple linkage. Clusters are formed with respect to a critical distance.]

A simple and effective way to calculate the distance between the grouped points is to use the arithmetic mean of the distances between every pair of points, each point taken from a different cluster. For example, suppose two clusters, the first containing points A and B and the second containing points C and D, are to be regrouped to form a single larger cluster. The distance between the two small clusters is calculated as (AC + AD + BC + BD)/4, where each pair of letters designates the distance between the corresponding two points. In general the distance between two clusters, D_IJ, is expressed as

D_IJ = (1 / (n_I n_J)) Σ_{i∈I, j∈J} d_ij    (3.2)

where
D_IJ is the distance between cluster I and cluster J,
n_I and n_J are the numbers of points in clusters I and J respectively, and
d_ij is the distance between points i and j.

Because points in a cluster are treated as a single point, averaging the distances in the cluster introduces distortion into the dendrogram. Points located far from the centre of the cluster have the greatest influence on the structure. The distortion is introduced when two clusters with unequal numbers of points join to form a bigger cluster. For example, if two points A and B are in a cluster, and a third point C is to enter the cluster, then the distance between the cluster and the third point is calculated as the average of the distance between points A and C and the distance between points B and C. Thus point C is involved twice while the other points are each used only once.
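Equation 3.2 is simply the mean of all between-cluster pairwise distances, as in this illustrative sketch (names invented):

```python
def cluster_distance(cluster_i, cluster_j):
    """Equation 3.2: D_IJ = (1 / (n_I * n_J)) * sum over i in I, j in J of d_ij,
    the mean of all pairwise distances between the two clusters."""
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
    total = sum(dist(p, q) for p in cluster_i for q in cluster_j)
    return total / (len(cluster_i) * len(cluster_j))
```

For the A, B versus C, D example in the text, this computes (AC + AD + BC + BD)/4 directly.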
The distortion becomes increasingly apparent as successive levels of clusters are averaged together. The severity of this distortion can be evaluated by examining the matrix of cophenetic values, which describes the apparent distances within the dendrogram. The cophenetic values and the original distance matrix can be correlated to reveal the overall degree of distortion.

Unweighted average clustering methods are similar to the weighted average clustering methods except that they assume an equal number of points in each cluster. In these methods late entries into a large cluster have almost no influence on the distance calculations of the clusters.

Figure 3.4 demonstrates how to construct a dendrogram using 9 data points in a two dimensional space. Figure 3.4(a) shows the original data set in two dimensions. The matrix of distances is calculated from the ordinates of each point. The pairs of points with mutually closest distances are grouped and connected together as shown in Figure 3.4(b). Figures 3.4(c), 3.4(d) and 3.4(e) demonstrate, step by step, how the distance matrices are calculated for already clustered points, how new points are entered into the structure, and how small clusters are sequentially agglomerated to form the final dendrogram.

Original data for Figure 3.4:

Point   x1     x2
A       2.00   2.00
B       6.00   4.00
C       7.00   2.00
D       5.00   3.00
E       2.00   3.00
F       3.00   4.00
G       3.00   3.00
H       7.00   3.00
I       5.00   4.00
[Figure 3.4: Stepwise dendrogram construction. (a) The original data plotted in the x1-x2 plane; (b) to (e) the successive distance matrices between points and clusters, and the progressive agglomeration of points A to I into the final dendrogram.]

There are many ways of allocating points into taxonomic classifications. The choice of method depends mostly on how the analyst views the needs of his problem and wishes to trade off computational effort against the desire for accuracy of cluster identification.

3.1.3 A clustering routine for LSP

A cluster analysis subroutine which performs a weighted average cluster analysis was incorporated into LSP. This method was chosen for its simplicity and the fact that it needs only a few user selected parameters. Even though the method might not be the most accurate, it is good enough for LSP purposes since the search does not need a precise clustering method for its solution. With LSP the potential for overall reduction in computational load and gain in reliability from sophisticated clustering methods is not obvious.

A user selected clustering criterion, which is expressed as the ratio of the minimum distance between two clusters to the distance between the mutually farthest points within a cluster, is used as an input for the analysis. The subroutine checks whether the clustering criterion is met, and subdivides the data set into a number of clusters when it is. The bounds of each cluster, or subgroup, are identified and then used to define the initial cuboids for the respective clusters.
This subroutine is adapted from [Zupan, 1982].

The computational time the search spends in the clustering routine depends on the strength of the clustering criterion. In this thesis, a clustering criterion is referred to as strong when the ratio of the distance between two clusters to the distance between the farthest points within a single cluster is high. A strong criterion implies the expectation of very distinct clusters.

If a very strong clustering criterion is used, it might take longer to meet the criterion, but once the clusters are identified and the problem is subdivided into the true set of distinct sub-problems, the subsequent LSP search becomes faster. The increase in efficiency is demonstrated in Table 3.1 through an example in the latter part of this section. Since aggregating points into clusters and identifying their bounds does not need high precision in LSP, the clustering criterion does not in fact need to be strong. But if a weak criterion is used for problems whose optimal points are at extreme corners of the initial cuboid, the imperfect cluster identification might result in missing the global optima. In practice, the probability of generating a sample point at an extreme boundary corner is almost nil for finite sampling schemes. A heuristic method has been developed to avoid this situation and is discussed in section 3.5.

The following two variable, unconstrained but bounded, problem with two global optima demonstrates the danger of premature clustering and the computational benefit of introducing cluster analysis into LSP.

Minimize f(x) = 100 − (x1 + x2 − 10)²
Subject to:
0 ≤ x1 ≤ 10
0 ≤ x2 ≤ 10

The true global optimum is f(x*) = 0.0 and the optimal points are located at (0, 0) and (10, 10), which are diagonally opposite corners of the initial cuboid.

Figures 3.5(a) and 3.5(b) show this optimization problem in three and two dimensions respectively.
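The demonstration problem can be written down directly; this fragment simply verifies the two stated optima, with the function name chosen for the example.

```python
def f(x1, x2):
    """Demonstration problem: f = 100 - (x1 + x2 - 10)^2 on 0 <= x1, x2 <= 10."""
    return 100.0 - (x1 + x2 - 10.0) ** 2

# Both diagonal corners of the initial cuboid attain the global optimum f* = 0,
# while the interior ridge x1 + x2 = 10 attains the maximum value 100.
corners = [f(0.0, 0.0), f(10.0, 10.0)]
```

Because the two optima sit in opposite corners, any single shrinking cuboid must straddle the whole box, which is why this problem rewards partitioning into two sub-cuboids.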
Figures 3.5(c) and 3.5(d) show plots of the points in the level set in the x1-x2 plane at consecutive iterations. In both cases the two small rectangles around the clusters of points inside the initial cuboid show the dimensions of the new sub-cuboids if the problem is to be partitioned into two sub-problems. Figure 3.5(c) demonstrates how premature clustering can exclude the true global optimal points. However, if clustering is performed at the next iteration, the global optimal points would be included in the new sub-cuboids, as shown in Figure 3.5(d). Premature clustering can be avoided by adopting a stronger clustering criterion.

Table 3.1 shows the number of function evaluations required to meet the LSP convergence criterion for the above example, with and without the cluster analysis method. These results are from a single LSP run for each case. The number of points maintained in the acceptance set for each sub-problem was the same as for the acceptance set without the cluster analysis.

            Number of function evaluations
Iteration   Without cluster analysis   With cluster analysis
5           3,024                      3,243
6           4,236                      4,940
13          1,420,092                  -
14          3,686,648                  -
15          8,018,142                  -
37          -                          6,550

Table 3.1: Reduced number of function evaluations due to cluster analysis methods

An N_keep value of 30 was used for this example. This is higher than the recommended value of N_keep = 20 but appropriate in known multiple global optima cases. Without cluster analysis, this example had not met the convergence criterion VM ≤ 0.0001 at the 15th iteration, when the number of function evaluations was over 8.0 × 10^6. In the second case, where the cluster analysis method was used, the problem was partitioned into two different sub-cuboids immediately after the sixth iteration. With N_keep = 30 in each of the two sub-problems, it took 37 iterations to meet the convergence criterion.
The number of iterations cited is the sum of the iterations expended before dividing the problem into smaller regions plus the iterations expended to meet the convergence criterion for both sub-problems. Even though the number of LSP iterations is high when the problem is handled in two separate subregions, the total number of function evaluations is only 6,550.

[Figure 3.5: Cluster analysis approach for a two variable problem. (a) and (b) the objective function in three and two dimensions; (c) premature clustering; (d) proper clustering (iteration 6, M = 4940).]

As was experienced in a number of other examples, a high number of LSP iterations does not necessarily imply inefficiency of the overall search; the relevant measure is the cumulative number of function evaluations. In this instance, partitioning of the problem permitted efficient point generation and lowered the number of function evaluations by a factor of more than a thousand.

3.2 Penalty functions

Generally, feasible sample points are randomly generated with LSP for constrained optimization problems. In some cases the feasible region defined by the set of inequality constraints might be so small compared with the initial variable bounds that acceptable point generation requires an impractically large number of trial points. Similarly, other problems might have equality constraints involving nonlinear equations which cannot be used to reduce the number of variables by elimination. Such difficulties can be overcome by introducing penalties into the objective function.

One conventional penalty approach converts the constrained problem to an equivalent unconstrained problem so that methods developed for unconstrained problems can then be applied [Bazaraa, 1979].
Suppose there is an optimization problem with only equality constraints, given as follows:

Minimize f(x)
Subject to: h_i(x) = 0 for i = 1, 2, ..., l
x ∈ R^n

This constrained problem can be transformed to an equivalent unconstrained form as

Minimize f(x) + β Σ_{i=1}^{l} (h_i(x))²
Subject to: x ∈ R^n

where β is the penalty parameter, a non-negative large number.

A similar modification is made to problems with inequality constraints. If the original problem is

Minimize f(x)
Subject to: g_j(x) ≤ 0 for j = 1, 2, ..., m
x ∈ R^n

then the transformed problem would be expressed as

Minimize [f(x) + β Σ_{j=1}^{m} maximum {0, g_j(x)}]
Subject to: x ∈ R^n.

For general optimization problems involving both equality and inequality constraints, given as

Minimize f(x)
Subject to: h_i(x) = 0 for i = 1, 2, ..., l
g_j(x) ≤ 0 for j = 1, 2, ..., m
x ∈ R^n

a penalty function α(x) is defined as

α(x) = Σ_{j=1}^{m} φ[g_j(x)] + Σ_{i=1}^{l} ψ[h_i(x)]    (3.3)

where φ and ψ are functions satisfying

φ(y) = 0 if y ≤ 0, φ(y) > 0 otherwise
ψ(y) = 0 if y = 0, ψ(y) > 0 otherwise

and y is a dummy variable representing h_i(x) and g_j(x) respectively.
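Equation 3.3 in its power form, and the auxiliary function of equation 3.5 below, translate directly into code. This is an illustrative sketch with invented names, not the thesis implementation.

```python
def penalty(x, eqs=(), ineqs=(), p=2):
    """Penalty term alpha(x) of equation 3.3 in power form: each violated
    inequality g(x) <= 0 contributes max(0, g(x))**p and each equality
    h(x) = 0 contributes abs(h(x))**p."""
    a = sum(max(0.0, g(x)) ** p for g in ineqs)
    a += sum(abs(h(x)) ** p for h in eqs)
    return a

def auxiliary(f, x, beta, eqs=(), ineqs=(), p=2):
    """Auxiliary function: f(x) + beta * alpha(x)."""
    return f(x) + beta * penalty(x, eqs, ineqs, p)
```

With p = 2 this is the quadratic penalty; with p = 1 it becomes the exact penalty discussed next.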
Usually, φ and ψ are used in their power form as

φ(y) = [max{0, y}]^p and ψ(y) = |y|^p    (3.4)

where p is a positive integer [Bazaraa, 1979].

The modified objective function, which is also called the auxiliary function, is then written

f(x) + β α(x)    (3.5)

and the transformed problem would be finally expressed as

Minimize f(x) + β α(x)
Subject to: x ∈ R^n

In the above expressions the penalty parameter β could be a fixed value or a sequence tending to infinity such that for each iteration k, β_k ≥ 0 and β_k+1 > β_k. With the quadratic penalty function, where α(x) = Σ_{j=1}^{m} [max{0, g_j(x)}]² + Σ_{i=1}^{l} [h_i(x)]², the penalty parameter β in equation 3.5 must become infinite in order to achieve complete convergence on x* [Gill et al., 1989].

In the case of an exact penalty function, where α(x) = Σ_{j=1}^{m} max{0, g_j(x)} + Σ_{i=1}^{l} |h_i(x)|, the penalty parameter β in equation 3.5 takes a constant value, β > 0. There is then a specific value of β so that x*, the unconstrained problem minimizer, is also the solution to the original constrained problem [Gill et al., 1989].

The meaning of "large value" in connection with specifying β is not very clear for a particular problem, and selecting an inappropriate value of β can lead to computational difficulty. If β is too small, the penalty function may be unbounded below and as a consequence produce a solution point far from the feasible region. If β is too large, then it induces an ill conditioned Hessian matrix, which can imply slow convergence for many NLP algorithms. The condition of a matrix is defined as the ratio of its largest and smallest eigenvalues, and a matrix is ill conditioned if this ratio is large.

With large values of β more emphasis is placed on feasibility and most procedures for unconstrained optimization will move quickly towards a feasible point. But it is possible for search termination to occur on a feasible point even though the point is far from the optimum. This is especially true in the presence of nonlinear equality constraints [Bazaraa, 1979].

There is no clear indication that progressively increasing β has any advantage over a fixed large β in saving computational effort. But using several values of β, together with their corresponding solution points, to predict the solution point with the next β value is a classical search procedure.
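The distinction matters in practice: with an exact (absolute value) penalty, a finite constant β can already yield the constrained solution. A small illustration on a coarse grid follows; all names and parameter values are invented for the example.

```python
def exact_penalty_minimizer(f, g, beta, grid):
    """Minimize f(x) + beta * max(0, g(x)) over a grid of candidate points.
    With the exact penalty, a sufficiently large *finite* constant beta makes
    the unconstrained minimizer satisfy the constraint g(x) <= 0."""
    aux = lambda x: f(x) + beta * max(0.0, g(x))
    return min(grid, key=aux)

# Minimize f(x) = -x subject to x <= 2, scanned on a coarse grid:
grid = [i * 0.5 for i in range(11)]  # 0.0, 0.5, ..., 5.0
x_star = exact_penalty_minimizer(lambda x: -x, lambda x: x - 2.0,
                                 beta=5.0, grid=grid)
```

Here beta = 5 already exceeds the slope of f, so the penalized minimum lands exactly on the constraint boundary x = 2; a quadratic penalty with the same finite beta would settle slightly inside the infeasible region.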
Interpolation is sometimes used to predict the next β from the past values [Gill et al., 1989].

3.2.1 Penalty parameters and LSP

Generally LSP responded best to the use of an exact penalty function as opposed to the quadratic penalty function. The gentle slopes induced in the objective function surface by exact penalty functions encourage LSP to form larger volume cuboids at every iteration when compared with the surface produced by the quadratic penalty function.

In practice, LSP starts finding the solution for the auxiliary function with a large β, and reduces the value of β by 10%-30% at every iteration. In this research a sequential reduction of the penalty parameter has been found to provide the greatest enhancement to the efficiency of the LSP search. This approach is exactly the opposite of the traditional NLP penalty function strategy, where the value of β increases as the optimum is approached.

The advantage of sequentially reducing the penalty parameter is that the decrease brings about the acceptance of more infeasible points into the current cuboid. With the introduction of more infeasible points into the cuboid, the cuboid volume is larger than in the constant penalty parameter case. This in effect retards the rate of reduction of cuboid volume over consecutive iterations. Although slow cuboid reduction requires more iterations to reach convergence, its compensating advantage is that the newly generated points are retained for a greater number of iterations than with the constant penalty parameter strategy. Furthermore, because points farther out of the true feasible region are tolerated with a lower penalty parameter value, fewer function evaluations per acceptable point are required. The process of reducing β continues until α(x), and hence the penalty term βα(x), goes to zero.
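The sequential reduction used with LSP, starting with a large β and cutting it by a fixed fraction each iteration but never below a predetermined lower bound, can be sketched as follows. The parameter values are illustrative, not the thesis settings.

```python
import itertools

def beta_schedule(beta0, reduction=0.2, beta_min=1.0):
    """Generate the sequentially reduced penalty parameter: start large and
    cut beta by a fixed fraction (10%-30%) each iteration, never dropping
    below a predetermined lower bound."""
    beta = beta0
    while True:
        yield beta
        beta = max(beta_min, beta * (1.0 - reduction))

# First five values of a 20%-per-iteration schedule starting at 1000.
betas = list(itertools.islice(beta_schedule(1000.0, reduction=0.2), 5))
```

Note the direction of the schedule: it decreases, the opposite of the classical NLP strategy in which β grows toward infinity.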
This must occur before convergence so that all of the unmodified constraints are satisfied and the original problem is finally solved.

Figure 3.6 demonstrates the effect on the level set boundaries of modifying the penalty parameter for a one variable problem with two constraints. The objective function is a straight line. The problem can be formulated as follows:

Min f(x) = kx
Subject to:
x ≥ 3
x ≤ 3.1

where k is a constant.

A penalty is applied to just one of the constraints, i.e. to x ≥ 3 only. The lines designated as P1(x), P2(x), etc. indicate plots of the auxiliary functions, that is the objective function plus the penalty term, at the first, second, etc. iterations. The c1, c2, etc. indicate the level set c-values established in the LSP search at the corresponding iterations. The plot of the intersection points of the level set values and the corresponding auxiliary function values represents the objective function of an equivalent unconstrained problem.

[Figure 3.6: Modified penalty function.]

The whole idea of reducing the penalty parameter is to generate new points more efficiently and at the same time smoothly contract the level set boundary so that the search eventually excludes any infeasible space. The full benefits of penalty modification may not be achieved unless a near optimum rate of reduction is used, otherwise the following problems may arise:

• If the reduction of β between consecutive iterations is large in the initial iterations, then the influence of the penalty term on the objective function must be small in the later iterations as β approaches its lower bound value. Then many infeasible points generated at the later iterations would be present in the acceptance set. In that case the search is not directed towards the feasible region and may converge on an infeasible point.
A large reduction of β allows more infeasible points to remain in the level set for a longer period. If the problem is divided into subregions because of suspected multiple optima, it is possible that a subregion may contain only infeasible points. The search in that subregion would then obviously lead to an infeasible solution point.

• A very small reduction of β does not allow many infeasible points into the acceptance set, and hence produces an inefficient search.

In order to overcome the problems mentioned, two measures are taken:

i) The reduction of β between consecutive iterations is kept small (between 0.1 and 0.25 of β).

ii) A lower bound for the parameter β is predetermined. The upper bound can be as high as required, though extremely high values would lead to an inefficient search.

Typical values of β and its reduction are given in Chapter 5, section 5.2.6.

3.3 Inequality and equality constraint relaxation

When the feasible region defined by the set of constraints is very small or very narrow, or defined by equality constraints, the generation of acceptable points can be inefficient and even difficult. To overcome such difficulties and speed up the search, the constraint functions can be initially relaxed and the relaxation eliminated in the later iterations. The relaxation method may also be applied to the bounds of the original cuboid if it is felt that the bounds are too tight for efficient point generation. Penalties are associated with the relaxation of constraints, so that infeasible points are discouraged from staying in the acceptance set. As a result of the penalties introduced into the problem, the final solution is likely to be found in the feasible region.

The idea of constraint relaxation in this research was first investigated using slack variables as introduced in the reduced gradient method [Lasdon & Waren, 1982].
But the introduction of these new variables increased the dimensionality of the problem without any compensating gain in performance. Instead, the constraints are allowed to relax by simply adding (or subtracting) a constant term to the right hand side of the constraints. Equality constraints are replaced by two constraints, with the constant added to and subtracted from their right hand sides to form an interval. The constraint relaxation raises the volume of the feasible region, which in turn raises the ratio of the feasible region volume to the cuboid volume; as a result the efficiency of generating acceptable points is increased. Figures 3.7(a) and (b) demonstrate how equality and inequality constraints, respectively, are relaxed for a two variable problem.

The number of function evaluations at the first iteration, N(1), gives a general indication of how tight the constraints are. For a highly constrained problem, N(1) would be very large when compared with the value of 2 × N_keep expected for an ideal unconstrained problem.

Setting the magnitude of the relaxation is a trial and error process. The constraints are relaxed by an initial value and N(1) is compared with 2 × N_keep. If the response to this initial relaxation does not seem to be promising, the search should be interrupted during or after the first iteration and the constraint relaxation adjusted until the user feels efficient point generation has been achieved. While this process is unquestionably heuristic, initial trial relaxation values which appeared to be "sensible" in the context of the problem being solved were often successful or were easily corrected from the feedback provided by N(1).

3.3.1 Tightening of relaxed constraints

The initially relaxed constraints are progressively tightened so that the level set will ultimately be confined to the feasible region.
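Tightening the relaxation by a fixed multiplicative factor each iteration can be sketched as follows; the 0.75 factor is the typical value discussed below, and the function name is invented.

```python
def tighten(relax0, factor=0.75, iterations=16):
    """Relaxation remaining at each iteration when the initial constraint
    relaxation relax0 is multiplied by `factor` every iteration.
    With factor = 0.75, sixteen iterations leave about 0.75**16, roughly 1%,
    of the initial relaxation."""
    return [relax0 * factor ** k for k in range(iterations + 1)]

residual = tighten(1.0)[-1]  # fraction of the relaxation left after 16 steps
```

A larger factor tightens slowly and costs iterations; a smaller one risks collapsing the search back into the original, hard-to-sample feasible region.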
Ideally the relaxed constraints will be tightened back to their original form before the convergence criterion is met. This is achieved by reducing the relaxation by a fixed fraction at each iteration. How far the constraints should be tightened at each iteration is a compromise between the smallness of the feasible region and how fast the search is expected to achieve convergence. If the relaxation reduction per iteration is high, the search might contract into a very small region and the original difficulty will again be faced. On the other hand, a very small reduction at each iteration can lead to a high number of iterations. Reducing the relaxation to a typical value of 0.75 of its value at the previous iteration would lead to a reduction to about 0.01 (i.e. 0.75^16) of the initial relaxation after the 16th iteration. The ideal value of the tightening parameter is related to the complexity of the constraint functions and the size of the feasible region, but was found to be robust in the region of 0.75 for virtually all of the test problems where relaxation was appropriate.

[Figure 3.7: Relaxation of constraints. (a) Relaxed equality constraint h(x) = 0, showing the relaxed band; (b) Relaxed inequality constraint.]

3.4 Cuboid stretching

At any stage of the search, the acceptance set is expected to be a subset of the cuboid. The minimum and maximum coordinate values of the confirmed points are used to locate the cuboid sides. Due to limited sampling, the true bounds of the level set may actually extend beyond the observed points. Therefore, the confirmed points always tend to produce a biased underestimate of the bounds of the true level set. This bias is corrected by extending the cuboid faces outwards by some prescribed amount. This is referred to as cuboid stretching.
The ideal size of this correction is not known in practice, but overcorrection only incurs a small penalty in the efficiency of point generation, while undercorrection can have the serious consequence of omitting a global solution. Overcorrection is therefore preferred, and a good heuristic estimate can be made.

The following expressions to stretch the bounds of the cuboid were proposed in [Chew & Zheng, 1988]. For a one dimensional case, let ξ be a random variable on the interval [a, b] with distribution function F, and F is assumed to be a uniform distribution. Assume that t points are sampled, that is ξ1, ξ2, ..., ξt are confirmed points in the acceptance set and t is the number of confirmed points. We wish to construct unbiased estimators of a and b from the sampled points. Suppose a = 0 and b = 1, and let

B_l = min{ξ1, ..., ξt},  B_u = max{ξ1, ..., ξt}

Then the distribution functions φ_l and φ_u of B_l and B_u are respectively

φ_l(y) = 0 for y ≤ 0;  φ_l(y) = 1 − [1 − F(y)]^t for 0 < y < 1;  φ_l(y) = 1 for y ≥ 1

φ_u(y) = 0 for y ≤ 0;  φ_u(y) = [F(y)]^t for 0 < y < 1;  φ_u(y) = 1 for y ≥ 1

The expected mean values for each bound are given as

E[B_l] = ∫ (1 − F(y))^t dy, which for uniformly distributed F simplifies to 1/(t + 1), and

E[B_u] = 1 − ∫ (F(y))^t dy, which simplifies to t/(t + 1).

The corrected new bounds are estimated as

B_l′ = B_l − (B_u − B_l)/(t − 1)

B_u′ = B_u + (B_u − B_l)/(t − 1)

where B_l′ and B_u′ are the unbiased estimators of the end points.

The consequence of this bias correction is that each cuboid side is stretched by 2/(n − 1) times the length of the original side at every iteration, where n is the number of confirmed points surviving in the acceptance set.

3.5 Skewness adjustment

Skewness adjustment is a combination of a shift of cuboid location and cuboid stretching at the end of each iteration, and was also proposed in [Chew & Zheng, 1988].
It was confirmed in this research to be one of the most important heuristic devices for speeding up an LSP search and increasing the reliability of convergence on optimal points.

Since the adopted point generation scheme for new points in a cuboid utilizes a uniform random distribution along the axis of each variable, the possibility of missing an important point is high when the optimal point lies at a boundary, and even greater if it lies at a corner of the feasible region formed by multiple boundaries. The danger of missing an important boundary point can be lowered either by substantially raising N_keep, which would in turn increase the total number of function evaluations, or by shifting the cuboid to ensure that it straddles the most promising region. When the acceptance set at any iteration is skewed, that is when the point with the lowest objective value is close to the boundary of the cuboid, an adjustment can be made to both the size and location of the cuboid so that the minimum point would be shifted towards the centre of the cuboid. A slight shift of the cuboid about the current minimum point helps more points to be generated around the current minimum point.

With a skewness adjustment the shift or translation of the cuboid is governed by the location of the single best confirmed point. This introduces a bias where more points are generated close to the current best point. But, since the magnitudes of the shift are controlled by certain user selected parameters, there is a risk that the detection of multiple optima can be undermined with imperfect skewness parameters.

The magnitude of the shift is somewhat arbitrary. A shift which is too large may lead to missing an optimal point, while a shift which is too small neither speeds up the search nor directs the search towards the optimum significantly.
Experience in this research suggests that there is a range of shift which results in a very low chance of missing the global optimum, and this range is in the order of 4% to 10% of each current cuboid side length.

Figure 3.8 demonstrates the principle of skewness adjustment in a two dimensional problem. There, the global optimum is located at x′, while g1 and g2 are two constraint functions, and the objective function and its gradient are represented by the f(x) isoquants and the arrows. The rectangular boxes represent the cuboids at consecutive iterations, the largest at the earliest iteration and the smallest at the last iteration. In the iterative process of LSP, points generated close to the right corner of the cuboid would be discarded earlier since they would not fulfil the level set condition. The remaining points would tend to cluster towards the global optimum point. But because of the limited sample point generation and deficiency of sampling on the boundary itself, points around the lower left corner of each cuboid may not be sampled. An unadjusted cuboid at any iteration except the first excludes the lower left corner of the previous cuboid. Therefore the location of the current cuboid gets further away from the global optimum with each iteration. Consequently, convergence without skewness adjustment would occur at a non-optimal point x″.

When the skewness adjustment is implemented, as shown in Figure 3.8(b), the cuboid is shifted towards the global optimum at every iteration and eventually convergence occurs at the true global optimum.

[Figure 3.8: Shift of cuboid with skewness adjustment. (a) No skewness adjustment; (b) skewness adjusted.]

Skewness adjustment can be used for problems of any dimensionality and the adjustment is applied independently in each dimension.
The procedure in each dimension is as follows.

Define the skewness δ at a single iteration as

δ = [(x′ − B_l) − (B_u − x′)] / (B_u − B_l)

Defining X_r = B_u − B_l, then

δ = [2x′ − (B_l + B_u)] / X_r    (3.6)

where B_l and B_u are the lowest and highest observed points measured on this variable's axis, x′ is the coordinate of the minimum confirmed solution point in the current cuboid and X_r is the length of the cuboid side in this dimension. Note that δ can be +ve or −ve in the interval −1 ≤ δ ≤ 1.

Adopt three skew parameters: δ0, which acts as a skewness activator threshold; and δ1 and δ2, which limit the size of the skewness adjustment. Their specific values will be discussed in Chapter 5, section 5.2.5, but must be bounded as follows.

0 ≤ δ0 ≤ 1

0 ≤ δ2 ≤ δ1 ≤ 1

Adjustments to the cuboid bounds are made in relation to the size of the skew and the skew parameters, and applied only if δ > δ0 or δ < −δ0 [Chew & Zheng, 1988].

If δ ≥ δ0:  B_l′ = B_l + δ δ2 X_r,  B_u′ = B_u + δ δ1 X_r

If δ ≤ −δ0:  B_l′ = B_l + δ δ1 X_r,  B_u′ = B_u + δ δ2 X_r

where B_l′ and B_u′ are the new lower and upper bounds of the cuboid.

Such an adjustment results in both stretching and shifting of the cuboid. The stretching of the cuboid Δ_s lies between δ0(δ1 − δ2)X_r and (δ1 − δ2)X_r, whereas the shift of the centroid of the cuboid Δ_c ranges from δ0(δ1 + δ2)X_r/2 up to (δ1 + δ2)X_r/2, depending upon the degree of the skew, that is

δ0(δ1 − δ2) ≤ Δ_s/X_r ≤ (δ1 − δ2),  δ0(δ1 + δ2)/2 ≤ Δ_c/X_r ≤ (δ1 + δ2)/2    (3.7)

Experience in this research revealed that skewness adjustment is not a good approach for discrete variable problems, especially when the sides of the cuboid become small relative to the discretization interval.
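The per-dimension rule can be sketched as follows; the parameter values δ0 = 0.05, δ1 = 0.10 and δ2 = 0.04 are purely illustrative (the thesis defers the actual values to Chapter 5, section 5.2.5):

```python
def adjust_dimension(b_lo, b_hi, x_best, d0=0.05, d1=0.10, d2=0.04):
    """Skewness adjustment for one dimension (a sketch). d0 is the
    activation threshold; d1 >= d2 limit the adjustment size.
    Returns the new (lower, upper) bounds of the cuboid side."""
    x_r = b_hi - b_lo                              # current side length X_r
    delta = (2.0 * x_best - (b_lo + b_hi)) / x_r   # skewness, -1 <= delta <= 1
    if delta >= d0:    # best point near the upper bound: shift up, stretch
        return b_lo + delta * d2 * x_r, b_hi + delta * d1 * x_r
    if delta <= -d0:   # best point near the lower bound: shift down, stretch
        return b_lo + delta * d1 * x_r, b_hi + delta * d2 * x_r
    return b_lo, b_hi  # not skewed enough: leave the bounds alone

# Best confirmed point at 0.9 in the side [0, 1]: delta = 0.8, so the
# cuboid is shifted towards the upper bound and slightly stretched.
print(adjust_dimension(0.0, 1.0, 0.9))
```

Note that the bound nearer the best point moves by δδ1X_r and the far bound by δδ2X_r, so the side both translates and lengthens by δ(δ1 − δ2)X_r.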
For a discrete variable problem, the shift and stretch of the cuboid introduced by skewness adjustment is not just a small fraction of the cuboid dimensions. If the cuboid side length for an integer variable were, say, 3 units, then the minimum shift along that axis (one discretization interval) amounts to 33% of the total length. Such a significant alteration in the location of the cuboid can lead to the exclusion of important points.

3.6 Search domain modification

The efficiency of generating acceptable points in a cuboid declines as the volume ratio between the region occupied by the acceptance set and the cuboid gets smaller. One of the typical cases is when the x_i vs x_j scatter diagram of points in the acceptance set forms a cluster around a diagonal of the current cuboid. In such cases generating points within the cuboid boundary is inefficient. Two different techniques were considered in this thesis to improve on this situation. The first technique was the rotation of the axes of the cuboid so that one of the sides of the new cuboid would be almost parallel to the line which passes through the centre of the cluster. This technique requires substantial extra computational effort since it involves transforming all points to the new rotated axes, constructing the cuboid and generating new points, and then transforming each new point back to the initial axes to check feasibility and compute the objective function. The second technique, which is discussed in detail in this section, uses a non-right angle parallelepiped, as shown in Figure 3.9, instead of a cuboid as a search domain.
The non-right angle parallelepiped is referred to here as a rhombohedron. A rhombohedron is defined as a parallelepiped whose faces are parallelograms in which the angles are oblique and adjacent sides are unequal. The use of a rhombohedron increases the search efficiency by the ratio of the volume of the (minimal) cuboid to the volume of the (minimal) rhombohedron.

[Figure 3.9: Contrast between cuboid and rhombohedron in two variable space, showing the confirmed points, the cuboid and the rhombohedron.]

The shift from the cuboid to the rhombohedron search domain can be instigated when the displayed scatter diagrams of the current acceptance set indicate a strong correlation between a pair of variables. Alternatively, a statistical procedure to calculate a correlation matrix of all variables can be initiated at any stage of the search. If the absolute value of any of the correlation coefficients is greater than a pre-specified value, a linear regression analysis can be performed between the highly correlated variables. The choice of the dependent and independent variables is arbitrary.

Point generation in a rhombohedron is demonstrated in Figure 3.10. Initially a point is randomly generated on the independent variable axis within the current cuboid, for example point a in Figure 3.10. The corresponding value of the dependent variable is then calculated on the regression line, i.e. at point b in the figure. The generated point to be considered in the level set would be randomly chosen around the regression line plus or minus a pre-specified band width. That is to say, the candidate point would be generated randomly on the line joining points c and d. Finally, the dependent variable value at point b is randomly perturbed up or down to produce a generated point.
The limits of this perturbation were defined in this work by ±3 × se, where se is the standard error of the estimate returned by the regression analysis. This ensured that the random generation of new points occurred within a rhombohedron large enough to virtually eliminate the risk of excluding any of the acceptance set.

[Figure 3.10: Point generation in a rhombohedron search domain, showing the regression line, the range of perturbation and the current cuboid.]

Chapter 4

TEST PROBLEMS

The merits of an optimization method lie in its ability to solve a wide range of problems within the required bounds of accuracy and reliability. A newly developed method cannot be evaluated solely on its theoretical details but has to be tested numerically. Ideally the same standardized testing procedure could be applied to all optimization methods, but such a procedure has not yet evolved. The practical alternative is to apply the optimization method to a wide variety of suitably challenging test problems. Another equally important alternative is to use test problems which provide insights into the strengths and weaknesses of a particular method.

A major component of this research effort was expended on exploring the viability of LSP in solving both mathematical and practical engineering problems. This involved solving some practical engineering optimization problems where the dimensions and the nonlinearities of the problem challenge existing optimization methods. In addition, a new test problem, designed specifically to explore perceived weaknesses with LSP, was developed.

About 200 nonlinear optimization problems with published solutions were collected from the mathematical literature. A selection of 78 of the more challenging of the 200 mathematical test problems solved with LSP are listed in Appendix A. Solutions cited in the literature and the LSP solution(s) are given for each problem. The problems documented in the Appendix were chosen for two major reasons.
First, for the challenge they have presented to many of the existing NLP methods. Second, because they were found to provide useful insights into LSP performance.

A second group of test problems consists solely of engineering problems. The problems in this group are published practical problems known to be difficult for many NLP methods and of interest to engineers. Some of these publications present specific NLP approaches developed to solve just one of these problems.

A third group of test problems involves parameter fitting for a specified model. The values of the model parameters are established by optimizing various nonlinear best fit criteria. This is an area of application where LSP provides a unique and potentially attractive perspective. Some of these models are used to demonstrate the capacity of LSP to refine estimates of parameters previously established with nonlinear regression methods.

In every one of the problems, the problem formulation used with LSP was tested for numerical agreement with the published solutions. Except for very few cases, all the solutions in the source literature were verified. Any deviation from the original solution is reported and the discrepancy discussed.

Whenever possible, the widest possible variety of problems was addressed to explore the robustness of LSP. However, the main difficulty experienced was in finding suitably challenging mathematical and engineering test problems. Most of the published test problems uncovered in the literature were felt to be inadequate for testing all qualities of LSP. In response, one new, parametrically adjustable, mathematical test problem was developed during this research. Significantly, many of the existing NLP methods fail to solve this new test problem unless a favourable starting point is selected.
Details of this new test problem are provided in section 4.2.1 of this chapter.

The algorithm used in addressing all of the test problems in this research is presented in Figure 4.1 as a flow chart. The algorithm performs all tasks automatically except those indicated as optional in the figure. The optional tasks can be made functional at any stage of the search by the user as the need arises, or may be triggered automatically when default conditions occur. These optional tasks can be computationally intensive and are often not crucial to finding a solution.

[Figure 4.1: The LSP algorithm as implemented in this thesis. The flow chart steps are: relax constraints; generate replacement points (optional); evaluate objective at each point; evaluate moments; set the level set value c = Max{f(X)}; discard bad points; check for linear correlation (optional); check for distinct clusters (optional); define new cuboid; stretch cuboid bounds; make skewness adjustment.]

4.1 Coding of mathematical test problems

The mathematical problems documented in Appendix A are coded in such a way that the number of variables and constraints involved can be easily recognized from the code. Each problem coding takes the form

[Problem A-B-C-D, Location in thesis]

where

A designates the number of variables in the problem

B designates the index number within that particular dimensionality group

C designates the number of equality constraints

D designates the number of inequality constraints.

For instance, [Problem 3-7-2-1, Appendix A] would mean the seventh problem within the set of 3 variable problems, and the problem has two equality and one inequality constraint. It is documented in Appendix A of this thesis.

Most practical optimization problems assume upper and lower bounds on each variable, but these bounds are not counted here as constraints.
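The coding scheme can be unpacked mechanically; the small helper below (hypothetical, not part of the thesis software) parses a problem code into its four components:

```python
def parse_problem_code(code):
    """Parse a code of the form '[Problem A-B-C-D, Location]' into its
    four components: variables, index, equalities, inequalities."""
    label = code.strip("[]").replace("Problem", "").split(",")[0].strip()
    n_vars, index, n_eq, n_ineq = (int(t) for t in label.split("-"))
    return {"variables": n_vars, "index": index,
            "equalities": n_eq, "inequalities": n_ineq}

print(parse_problem_code("[Problem 3-7-2-1, Appendix A]"))
# {'variables': 3, 'index': 7, 'equalities': 2, 'inequalities': 1}
```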
The only instances when bounds are considered as inequality constraints are where the source literature dictates their use for specific reasons.

The original source for each problem is referenced.

4.2 Mathematical problems

It is not easy to find test problems which can reveal all the strengths and weaknesses of a single NLP method. Recognizing this difficulty, a wide range of solved test problems was chosen from different journals and books. The more interesting mathematical test problems solved using LSP are documented in Appendix A. Results given in the literature and those obtained with LSP are documented along with each problem. The distribution of these problems according to the number of variables is summarized in Table 4.1.

Number of   Unconstrained   Constrained
variables     problems       problems     Total
     1            5              1           6
     2            8             15          23
     3            5             12          17
     4            3              9          12
     5            -              4           4
     6            -              6           6
     7            -              3           3
     8            -              3           3
     9            -              2           2
    15            -              1           1
    20            1              -           1

Table 4.1: Summary of test problems reported in Appendix A

About two thirds of the test problems documented in Appendix A involve up to 4 variables. Their specific characteristics are intended to challenge NLP methods apart from the issue of dimensionality. Furthermore, these low dimensional problems are relatively easy to visualize. The space defined by the set of constraints and the surface of the objective function can often be interpreted and clarified in simple 2 and 3 dimensional graphs. Such graphs help visualization of the search process and its response to the difficulties presented by the problem. The experience gained with the low dimensional problems can often be exploited in higher dimensional problems.

Comparison of LSP results with the solutions given in the literature showed that, in about 95% of the cases, the LSP results confirmed those cited in the literature. In about 5% of the cases, LSP gave better results than have been previously published. Two types of improvements were observed.
One was the identification of multiple solutions with LSP when the source literature gives only some of these solutions. LSP generally found the extra solution point(s) in a single run. For example, only two solution points were reported in the literature for [Problem 2-11-0-0, Appendix A], but LSP found two additional global optimal points. A second type of improvement was observed when LSP converged at a different solution point from the one given in the literature and with an improved objective function value. Such improvements were rare since test problems often favour the NLP methods that they were developed to demonstrate. Examples of test problems with improved solutions are: [Problem 4-8-0-3]; [Problem 4-10-1-2]; and [Problem 7-2-0-3, Appendix A].

There were several instances where, when the solution cited in the literature was tested, the constraints had been significantly violated, although this was not acknowledged. In other instances, the claimed optimal value of the objective function did not agree with the calculated value at the solution point cited. For example, the constraints of [Problem 9-2-6-0, Appendix A] were significantly violated at the optimal solution point in the published solution. Similarly, the optimal objective function value cited in the literature for [Problem 1-4-0-0, Appendix A] cannot be achieved at the given solution points. LSP found multiple global optimal points which produce a superior objective function value for the same problem. When corrections are made to such incorrect solutions in this thesis they are not reported as improvements, but as errors in the source literature.

Some special problems were encountered by LSP when solving some of the test problems and are reported. Either the results were inferior to those mentioned in the literature or the number of function evaluations was considerably higher. An interesting example
is [Problem 2-12-0-4, Appendix A], discussed in detail in Chapter 5, section 5.5, where the cause of the difficulty is associated with the geometry of the feasible region.

4.2.1 The Road Runner function

This new test problem was designed to test global optimization methods in general as well as to specifically challenge LSP and explore its weaknesses. It produces an objective function surface which has an easily missed fissure containing the global optimum and a topology which can cause fragmentation and dispersion of the level set. The scale of its prominent features and the location of the global optimum point are controlled by two parameters. The first parameter a, in Equation 4.1, adjusts the size of the fissure on the objective function surface. It controls how wide and how deep the fissure can be. Here, fissure width is defined as the distance between the fissure edges on each variable axis. Depth of the fissure is defined as the difference between the maximum and minimum objective function values. The second parameter b controls the location of the global minimum. Figure 4.2 shows the influence of parameters a and b on the objective function, particularly at the edges of the fissure region.

This test problem has been named "The Road Runner function", after a popular film cartoon which is set in terrain which resembles the extreme topography of the objective function surface produced. The problem is a challenge for all local optimization methods unless a favourable starting point is chosen. It is also a challenge for LSP since it is particularly sensitive to sparse sampling of the level set. The additive structure of this function allows it to be easily extended to any number of dimensions while retaining the same critical features.

The general n dimensional formulation of the Road Runner function is given in Equation 4.1.
[Figure 4.2: The Road Runner function in one dimension, showing the influence of a. The inset table lists, for a = 5, 10 and 20, the x coordinates of the fissure edges, f(x) at the edges, and the fissure width and depth.]

f(x) = Σ_{i=1}^{n} {(x_i − b)² + a|x_i − b|}^{1/4}    (4.1)

where a and b are parameters which directly adjust the scale of the critical topographic features and n is the required dimensionality for x. This function will always have its single global minimum at x_i = b, i = 1, ..., n, where f(x) = zero.

The function becomes a unimodal problem if the search domain is restricted to the small region around the global optimum. The search domain has to be bigger than this minimal region to observe all of the features of the function. Thus, the bounds on each variable should be wide enough to enclose regions beyond the fissure width. Multiple local optimal points develop with this bigger search domain and their number increases exponentially with dimensionality. The total number of local optimal points is 3, 9 and 27 for one, two and three dimensional functions respectively. In general, there are 3^n local optima for an n dimensional function. Only one of the local points is the global optimum for each case. Two and three dimensional plots of the two dimensional version of this function are given in Figure 4.3 using the values a = 10 and b = 0.5. In this case the function is

f(x) = {(x1 − .5)² + 10|x1 − .5|}^{1/4} + {(x2 − .5)² + 10|x2 − .5|}^{1/4}

Figure 4.3(a) shows that the function has two sets of valleys. Each set consists of four valleys having similar depths. The depth of each valley increases as it goes farther away from the global optimum. The 4 valleys which are parallel to the axes all have equal depths.
The other 4 valleys, which are diagonal to the axes, also have similar depths, but different from the first four. The local optimal points appear at the intersections of the valleys and at the boundaries formed by the bounds of each variable. The global point is at x = (0.5, 0.5), at the centre of a very narrow fissure, where f(x) achieves its global minimum of zero.

Figure 4.3(b) clearly reveals the influence of starting point location on the solutions obtained with gradient optimization methods. Gradient based methods always converged at a local optimum unless they started at a point very close to the global optimum. Figure 4.3(b) shows 36 arbitrarily chosen starting points, and the corresponding optimal values obtained using gradient optimization methods. The results shown in Figure 4.3(b) were found using popular NLP methods such as FMIN, GRG2 and GAMS, which uses MINOS. Searches starting at points with circular marks converge at the true global optimum. Those searches which started at points other than those identified by circles converged at local optimal points. The optimal objective function values mentioned in the figure are for the purely unconstrained cases, i.e. where there were no bounds on the variables.

As was intended, this problem challenges LSP, particularly when the initial variable bounds are wide and the value of N_keep is low. With a large cuboid, the ratio of the fissure volume to the cuboid volume becomes very small, so points generated in the early stage of the search are unlikely to fall within the fissure. Because of the nature of the objective function surface, points far away from the fissure produce better objective function values than those that are close to the fissure but not actually within it. As the level set value c approaches 1.0, there is little size reduction in the cuboid with successive iterations, but the level set value c continues to improve at each iteration.
This situation is the least advantageous for LSP because the ratio between the acceptance region and cuboid volumes gets very small and the search becomes inefficient.

If an initial cuboid with one of its corners very close to the global optimum is used, the chances of generating sample points within the fissure are much lower than when the global solution is more centrally located in the cuboid. The reason is that, in regions close to the global optimum, points produce very high objective function values and are quickly rejected from the level set. Therefore, unless points with low objective function values are sampled on either side of the global minimum, the region close to the global point can be excluded from the current cuboid at an early stage of the search. The situation gets worse with low N_keep. Consequently the search terminates at a point other than the global optimum. The reliability of LSP can always be improved by raising N_keep. With a higher number of points, the chances of generating points in the fissure are increased at the early stage of the search.

[Figure 4.3: The Road Runner function in two dimensions: (a) surface plot; (b) contour plot with 36 starting points, where points starting at × give 2 as an optimum, points starting at Δ give 1 as an optimum, and points starting at • give 0 as an optimum.]

The Road Runner function was solved using LSP for various numbers of variables. The result for the 2 dimensional function with N_keep = 20, a convergence criterion of VM ≤ 1 × 10⁻⁶, a = 10 and b = 0.5, and the initial bounds −4 ≤ x1, x2 ≤ 4, was

f(x*) = 4.0E−4 at x* = (0.5000, 0.5000).

The number of function evaluations expended to meet this convergence criterion was 4,230.

For a two dimensional Road Runner function there are 8 valleys stretching radially away from the global optimum.
The valleys are equally spread in the search domain and the minimum point in each valley occurs at the boundaries formed by the variable bounds. Such a distribution of the local minimal points is believed to introduce the greatest inefficiency in set based search schemes and in particular in LSP. Due to the orientation and spatial distribution of the valleys, the domain modifications discussed in Chapter 3, section 3.6, do not improve the LSP efficiency, as there is no search domain which can enclose all confirmed points more efficiently than the cuboid. The function with rotated axes has features similar to the original one, so it does not bring any computational gain whatsoever.

4.3 Engineering test problems

For the most part, real world engineering test problems for optimization methods deal with the search for superior designs. Such problems often involve both continuous and discrete variables. In addition, the functions involved may be discontinuous and non differentiable. Most of the established nonlinear optimization methods are designed to handle only continuous and differentiable functions and are local rather than global optimizers. These restrictions may account for some of the resistance by civil engineering practitioners to the application of these existing optimization methods for solving practical problems. There are very few well documented civil engineering test problems in the literature, and this scarcity represented one of the major difficulties in testing LSP's capabilities in that field of specialization. In addition, those examples which do exist in civil engineering were not developed to isolate and test specific qualities of optimization methods, other than perhaps their high dimensionality capabilities.

Floudas and Pardalos [Floudas & Pardalos, 1990] documented a collection of engineering test problems for constrained global optimization algorithms.
A few of the more interesting problems have been selected from this collection. Significantly, some of the results given in the literature are not global solutions. For instance, solutions to the 4-variable problem [Problem 4-10-1-2, Appendix A] and the six variable problem [Problem 6-5-3-12, Appendix A] are far from the optimums which are cited on page 29 of the book entitled A collection of test problems for constrained global optimization algorithms [Floudas & Pardalos, 1990]. In some cases the reported optimum value of the objective function is different from that obtained by using the values in the solution vector. The 4-variable problem already mentioned [Problem 4-10-1-2, Appendix A] is an example of such inconsistency. In other cases the optimal points given for some test problems violate the constraints. Examples of these are the 7-variable problem on page 153 and the 27-variable problem on page 67 of the same book.

Because of such inconsistencies, many test problems which at first sight seem to be well documented were not of any value for LSP testing purposes. In spite of having to reject many, about 20 credible published engineering problems were found and used for test purposes. These problems varied from 3 to 38 variables, and included both discrete and continuous variables. They cover applications of optimization to pipe network design, structural truss design, treatment basin design, heat exchanger design, air pollution control design, irrigation system design and hydrological model parameter determination. Details of 11 of the most challenging problems, including their solutions, are given in the next subsections.

In most cases the LSP results were comparable to the solutions given in the literature, but again some improvements were observed. For instance, a 12.91% improvement on the published objective function value was found with the pipe network design [Problem 32-1-40-34, section 4.3.2].
Another improvement was obtained with a better fit for the hydrologic model parameters which, from a hydrologist's point of view, produced a far more reasonable unit hydrograph for the UH ordinate determination problem [Problem 19-1-21-0, section 4.3.5].

4.3.1 Basin design [Problem 3-16-1-3]

Source: F. Douglas Shields Jr. and Edward L. Thackston, "Designing Treatment Basin Dimensions to Reduce Cost", ASCE, Journal of Environmental Engineering, Vol 117, No 3, May/June 1991.

Method of optimization used in the source literature: choosing the best curve out of a series of length/width versus cost plots drawn separately for different numbers of spurs.

Problem formulation:

A rectangular shallow treatment basin is to be designed with internal spur dikes, as shown in Figure 4.4. The designer is expected to choose the basin length L, the basin width W at the water surface, the number of dikes N and the ratio of spur dike length to basin length r. The choices have to satisfy the governing constraints at a minimum cost. The geometric and hydraulic relationships are given as follows.

Figure 4.4: Treatment basin (plan and Section A-A).

Geometric relationships

A = LW - NrLW_s
V = LWD - NrLV'
L_f = rL(N + 1)
W_f = W/(N + 1)

or

L_f/W_f = (L/W)r(N + 1)^2

where
A is the water surface area, calculated as the total basin area at the water surface elevation minus the area occupied by the spur dikes
V is the basin volume, which is calculated as the total basin volume, assuming that the volume of side slopes below water is negligible, minus the volume of the spur dikes
L is the length of basin
W is the width of basin
W_s is the width of spur dike at the water surface
N is the number of spur dikes
r is the ratio of spur dike length to basin length
D is the depth of water
V' is the unit volume of spur dike below the water line
L_f is the flow length
W_f is the flow width

Hydraulic relationships

HE = t/T

where
HE = hydraulic efficiency
t = mean hydraulic retention time
T = volumetric residence time
Q = mean flow rate

For a given Q, the value of T can be adjusted by changing the basin geometry. Experimentally the hydraulic efficiency is found to be related to the flow length, L_f, to flow width, W_f, ratio, expressed as

HE = 0.84[1 - exp(-0.59 L_f/W_f)]

Therefore

V = tQ/HE

or

LWD - NrLV' = tQ / (0.84{1 - exp[-0.59(L/W)r(N + 1)^2]})

For a small volume of the spur dikes, NrLV', the basin volume is approximately LWD, so

LW = tQ / (0.84D{1 - exp[-0.59(L/W)r(N + 1)^2]})

Normally t, Q, and D are fixed by design constraints, so

LW = a / {1 - exp[-0.59(L/W)r(N + 1)^2]}

where a = tQ/0.84D.

The basin cost includes the costs of land, dikes, inlet and outlet structures and access roads. If f is the ratio of the unit cost per foot of spur dike to the unit cost of the perimeter dike, and if all other costs are negligible relative to the cost of the dikes and land, then the basin cost may be approximated by

cost = [(2L + 2W + 4w) + fNrL]c_d + [(L + 2w)(W + 2w)]c_l

where w is the base width of the perimeter dike, c_d is the cost per unit length of perimeter dike and c_l is the cost per unit area of land.

If Q and t are known and the dike design has determined D, r and w, then the only remaining geometric variables to be chosen are L, W and N. Two cases are considered:

Case a) No constraint on land.
Case b) Land is a constraint; the width is fixed at 500 ft.

The optimization problem has a nonlinear objective function involving three decision variables.
One of the decision variables, N, takes integer values only. The problem has one nonlinear equality constraint. The formulation is

Minimize cost = [(2L + 2W + 4w) + fNrL]c_d + [(L + 2w)(W + 2w)]c_l

Subject to:
LW = a / {1 - exp[-0.59(L/W)r(N + 1)^2]}
W, L, N >= 0

The predetermined constant values used and the physical conditions of the basin are given as follows.
- The basin is rectangular, and spur dikes are longitudinal, with r = 0.85.
- Perimeter dikes have crown widths of 10 ft, heights of 7 ft and side slopes of 1:3; their unit volume is 217 cu ft/ft, with a base width, w, of 52 ft.
- Spur dikes have crown widths of 2 ft, heights of 7 ft, and side slopes of 1:2. Width at the water line, W_s, is 10 ft, and the unit volume below the water line is 100 cu ft/ft.
- Basin depth is 5 ft.
- Average flow rate, Q, is 27 cu ft/sec.
- The required residence time is 45 hours.
- The unit cost of the perimeter dikes, c_d, is $20/ft.
- The unit cost of spur dikes is $10/ft, so f = 0.5.
- The unit cost of land is $1000/acre, or $0.23/sq ft.

The optimal solution cited in the literature and LSP's output are given in Table 4.2.

Table 4.2: Optimal results for the basin design problem

                      Case a                  Case b
               Literature     LSP      Literature     LSP
N                   1           1           0           1
L (ft)           1245        1266        2310        2083.3
W (ft)            890         869         500         500
Cost (1000 $)     131         131         150         155.5
Error (ft^2)   368.58        0.67      314.57        0.26

Error is the violation of the constraint.

Observations:
- It took 34 minutes to find the optimal solution for Case a of Table 4.2, and only 4 seconds for Case b, with LSP on an 80386-33MHz microcomputer.
- LSP found that the problem has multiple global optima for Case a. A number of near optimal points were also found. The size of the final cuboid on the length variable L side ranged between 1,267 and 1,303 ft. On the width side it ranged between 853 and 869 ft, whereas the number of spur dikes was always one.
The value of the cost within the final acceptance set varied only by about 0.15%.
- When the results cited in the literature are used, some constraints are violated significantly.

4.3.2 Pipe network design [Problem 32-1-40-34]

Source:
1. E. Alperovits and U. Shamir, "Design of optimal water distribution systems", Water Resources Research, Vol 13, No 6, pp 885-900, 1977.
2. O. Fujiwara, B. Jenchaimahakoon, N. C. Edirisinghe, "A modified linear programming gradient method for optimal design of looped water distribution networks", Water Resources Research, Vol 23, No 6, pp 977-982, June 1987.
3. G. V. Loganathan and J. J. Greene, "Global approaches for the nonconvex optimization of pipe networks", Water Management in the 90's, 1993.

Methods of optimization used in the source literature: LPG, linear programming gradient, in [Alperovits & Shamir, 1977] and [Fujiwara et al., 1987]. The solution is obtained via a hierarchical decomposition of the optimization problem. The primary variables are the flows in the network. For each flow distribution the other decision variables are optimized by linear programming. Post optimality analysis of the linear program provides the information necessary to compute the gradient of the total cost with respect to changes in the flow distribution. The gradient is used to change the flows so that a (local) optimum is approached. Two global search schemes, MULTISTART and ANNEALING, were used in [Loganathan & Greene, 1993].
The problem is decomposed into a two stage problem whereone search strategy selects link flows and a second strategy, linear programming, seeksthe optimal pipe design.Problem formulation:Consider the network shown in Figure 4.5.\u00E2\u0080\u00A2 the network has 7 nodes, the water demand and ground elevation at each node areknown.\u00E2\u0080\u00A2 the head at each node has to be at least 30 rn above the ground elevation of thenode.\u00E2\u0080\u00A2 the network has 8 links with known lengths.\u00E2\u0080\u00A2 the data on allowable gradient, available pipe dimensions, maximum diameter ofpipe to be used, Hazen Williams coefficients and unit prices for each pipe size areknown.Each link is assumed to comprise m pipes and the overall pipe cost is calculated asChapter 4. TEST PROBLEMSFigure 4.5: Two-looped water supply system.COSt = qjmxijmijmThe objective of the problem is to minimize cost under the following constraints.Xijm =Xijm 0LHi,m = JijmX:jm103(100)link0 node0demandJ = a(Q/C)\u00E2\u0080\u00992D487Chapter 4. TEST PROBLEMS 104Hmin J\u00E2\u0080\u0099ijmnijm H,,mwhere the first summation is over all links (i,j) connecting node s and node ii, and thesecond summation is over all segments rn in each link.The nomenclatnre is as follows:Xjjm is the length of a pipe segment of the m\u00E2\u0080\u0099 diameter in the linkconnecting nodes i and j;cjjm is the unit cost of the m\u00E2\u0080\u0099 diameter in the link connecting nodes i and j;L1 is the length of the link connecting nodes i and j;AHijm is the head loss of the mth diameter in the link connecting nodes i and j;ijm is the hydraulic gradient of the mt diameter;Q is the discharge;C is the Hazen Williams constant;D is the pipe diameter;113 is the head at node s;Hmin and Hmax are the lower and upper head constraints at each node ii; anda is a coefficient = 1.526 * 1010 when Q is in m3/s and B in cm.Input data for the pipe system are given in Table 4.3 and Table 4.4. 
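As a rough illustration of the hydraulic and cost relations above, the gradient J = a(Q/C)^1.852 D^-4.87, the per-link head loss, and the per-link pipe cost can be sketched as follows. This is a minimal sketch, not the thesis code; the function names, the default Hazen-Williams constant, and the unit conventions (Q in m^3/s, D in cm, as quoted for the coefficient a) are illustrative assumptions.

```python
ALPHA = 1.526e10  # coefficient a from the text, for Q in m^3/s and D in cm


def hydraulic_gradient(q, diameter_cm, c_hw=130.0):
    """Hazen-Williams gradient J = a * (Q/C)^1.852 * D^-4.87 (C assumed, illustrative)."""
    return ALPHA * (q / c_hw) ** 1.852 * diameter_cm ** -4.87


def link_head_loss(segments, q, c_hw=130.0):
    """Head loss over one link built from (length, diameter_cm) segments,
    summing x_ijm * J_ijm over the segments as in the constraint set."""
    return sum(x * hydraulic_gradient(q, d, c_hw) for x, d in segments)


def link_cost(segments, unit_cost):
    """Pipe cost of one link; unit_cost maps diameter -> cost per unit length."""
    return sum(x * unit_cost[d] for x, d in segments)
```

With two segments per link (m = 2), as assumed in the formulation, a candidate design is just a list of (length, diameter) pairs per link, and the objective is the sum of `link_cost` over all links subject to the nodal head constraints.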
The flow ineach pipe for a given set of pipe diameters and this input data was solved using theconventional Hardy Cross relaxation method.The least cost solution for looped pipe network problems demands that some links willhave two sections with different diameters [Orth, 1986]. Therefore, each link is assumedto consist of two different sized pipes i.e. in = 2. The sizes of the pipes in a link are to beconsecutive sizes available in the market. The problem has 16 pipe diameter variables,Chapter 4. TEST PROBLEMS 105Node Demand Elevation Head Link Length1 -1120 210 210 1 10002 100 150 2 10003 100 160 3 10004 120 155\u00E2\u0080\u0094 4 10005 270 150 5 10006 330 165 6 10007 200 160\u00E2\u0080\u0094 7 10008 1000data for the pipeDiameter Unit cost Diameter Unit cost(in) ( $/ft) (in) ( 8/ft)1 2 12 502 5 14 603 8 16 904 11 18 1306 16 20 1708 23 22 30010 32 24 550Table 4.4: Available pipe diameters and nnit priceswhich take discrete values, and another 16 continnous pipe length variables. Therefore,the problem has a total of 32 decision variables.Observations:Optimal solutions cited in the first source literature and those found with LSP aregiven in Table 4.5 and Table 4.6 respectively. It is reported in [Alperovits & Shamir,1977] that the problem was solved after 19 iterations, which took 4.05 seconds CPU timeon IBM 370/168 machine. LSP found an optimal solution different from the one cited inthe literature. The solution found with LSP showed about 12.91% improvement on theobjective fnnction value over the solution cited in [Alperovits and Sharnir, 1977]. In thesecond source literature, [Fujiwara et al., 1987], a minimum cost of $ 415, 271 is reported.Table 4.3: Input networkChapter 4. TEST PROBLEMS 106Unfortunately no details of the solution are given in the literature, hence no comparisoncould be made with the LSP results. In the third source literature, Loganathan andGreene found a minimum cost of $ 405,301, which is an improvement over the solutionfound with LSP. 
A close examination of this solution revealed that the minimum headrequirements at two nodes were violated.Bigger pipe Smaller pipeLink Total length Length Diameter Length Diameter(m) (m) (in) (m) (in)1 1000 255.97 20 744.00 182 1000 996.37 8 3.61 63 1000 999.98 18 0 04 1000 319.38 8 680.62 65 1000 1000.00 16 0 06 1000 784.94 12 215.06 107 1000 999.99 6 0 08 1000 990.91 6 9.06 4Total cost = $ 479,525Table 4.5: Optimal design for the pipe network given in ASCE, 1977Bigger pipe Smaller pipeLink Total length Length Diameter Length Diameter(m) (m) (in) (m) (in)1 1000 0 22 1000 202 1000 0 12 1000 103 1000 632.13 16 367.87 144 1000 232.30 2 767.70 15 1000 294.89 16 705.11 146 1000 0 12 1000 107 1000 860.36 10 139.64 88 1000 114.14 2 885.86 1Total cost = $ 417,607Table 4.6: Optimal design for the pipe network found with LSPChapter 4. TEST PROBLEMS 1074.3.3 10-member truss design [Problem 10-2-0-40]Source:1. Venkayya, \7 B., \u00E2\u0080\u009CDesign of optimum structures\u00E2\u0080\u009D in Computers and Structures,Vol 1, pp 265-309, 1971.2. David E. Goldberg and Manohar P. Samtani, \u00E2\u0080\u009CEngineering optimization via geneticalgorithm\u00E2\u0080\u009D, in Electronic computation, pp 471-482, Kenneth M Will ed 1986.Methods of optimization used in the source literature: Venkayya used a method basedon an energy criteria and a gradient search procedure for design of structures subjectedto static loading. A Genetic Algorithm was used in [Goldberg & Samtani, 1986].Problem formulation:The geometry and loading system of an indeterminate structure for a 10 membertruss is given, as shown in Figure 4.6. The cross sectional area of all members to give theminimum weight of the structure is required. 
The optimization problem is formulated as follows.

Minimize Weight = p * sum_j (L_j * A_j)

where
p is the specific weight of the material used
A_j is the cross sectional area of member j in square inches
L_j is the length of member j in inches

The problem is constrained by the allowable stresses and the bounds on member sections:

sigma_min <= sigma_j <= sigma_max
A_min <= A_j <= A_max

where
sigma_j is the member stress
sigma_min and sigma_max are the minimum and maximum stresses
A_min and A_max are the minimum and maximum areas allowable, which are given as 0.1 in^2 and 10 in^2 respectively.

Figure 4.6: 10-member truss (two 360 in bays; both applied loads are 100K lb).

The material to be used is aluminum, which has a maximum stress of 2,000 psi (compression and tension), a modulus of elasticity of 10^7 psi and a specific weight of 0.1 lb/in^3. Both loads shown in Figure 4.6 are 100K lb each. A stiffness analysis computer program developed by Fleming [Fleming, 1986] was used with LSP to solve the structural analysis.

Observations:

This problem has been solved by many authors using different methods. Goldberg and Samtani [Goldberg & Samtani, 1986] addressed the problem to demonstrate the use of the Genetic Algorithm for solving engineering problems. They compare their result with Venkayya's [Venkayya, 1971] result. Assuming Venkayya's confirmed feasible solution as the 'true' solution, their three best-of-run results show that the Genetic Algorithm results deviated from the 'true' solution by 0.82%, 1.32% and 2%. Unfortunately the best run cited in [Goldberg & Samtani, 1986] violates six of the ten stress constraints, which is acknowledged as 'minor excursions beyond the stress constraint'. Figure 5 of the original source paper shows the violations.
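The weight objective and side constraints for the truss can be sketched as below. This is an illustrative sketch only: the names are assumptions, and the member stresses would in practice come from a separate stiffness analysis (Fleming's program in the thesis), not from this snippet.

```python
RHO = 0.1  # specific weight of aluminum, lb/in^3 (from the problem statement)


def truss_weight(areas, lengths, rho=RHO):
    """Objective W = rho * sum_j(A_j * L_j); areas in in^2, lengths in inches."""
    return rho * sum(a * l for a, l in zip(areas, lengths))


def is_feasible(areas, stresses, a_min=0.1, a_max=10.0, s_max=2000.0):
    """Bound constraints on member areas plus the +/- allowable stress limits."""
    return (all(a_min <= a <= a_max for a in areas)
            and all(abs(s) <= s_max for s in stresses))
```

In an LSP run, `truss_weight` is the function whose level sets contract, and `is_feasible` is the confirmation test applied to each sampled design after the stiffness analysis returns the member stresses.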
It citedthat 6,400 points were explored to reach the solution in all the three runs.A number of LSP runs were made to compare results with the \u00E2\u0080\u0098true\u00E2\u0080\u0099 solution results.The three best LSP feasible results showed 0.6%, 0.9% and 1.00% deviation from theresults given by Venkayya. LSP found a set of well separated near optimal points, whichsuggests that the problem has multiple global optima The LSP results given in Table 4 7were found, on the average, after 73,800 function evaluations per each run which took 5minutes and 15 seconds on a 80386-33Mhz microcomputer. The detailed results cited byVenkayya and those found with LSP are given in Table 4.7. The three different solutionsfound with LSP and Venkayya\u00E2\u0080\u0099s optimum solution are shown in Figure 4.7. Figure 4.7(a)shows the optimal decision variables, whereas Figure 4.7(b) shows the stress in eachmember of the optimal solution.LSP\u00E2\u0080\u0099s failure to identify the optimal solution in this instance can be attributed to thefact that it lies on the boundary of the original search domain. Further improvement inthe LSP solution might be achieved through local refinement using a gradient methodbut was not attempted.For a general comparison, a summary of minimum objective function values for thetruss problem found with the two methods mentioned above, and LSP, are given in Table4.8.Chapter 4. TEST PROBLEMS 110et.. 52\u00C2\u00A7SU)8MembersV.nkayya Run 1 Run 2 ES] Run 3(a) Optimal dimensions2 3 4 5 5 7 8 9 10MembersV.nkayya Run I Run 2 ES] Run 3(b) Optimal member stressesFigure 4.7: Detailed results for Venkayya\u00E2\u0080\u0099s 10-member truss problem.Chapter 4. 
\u00E2\u0080\u0098TEST PROBLEMS It ICross sectional area (in2)Member Venkayya LSPrun I run 2 run 31 7.938 7.826 7.753 7.7782 0.100 0.236 0.321 0.2893 8.062 8.179 8.251 8.2234 3.938 3.827 3.754 3.7795 0.100 0.100 0.100 0.1016 0.100 0.223 0.359 0.2957 5.745 5.906 6.028 5.9928 5.569 5.411 5.308 5.3439 5.569 5.411 5.309 5.34310 0.100 0.330 0.447 0.421Truss weight (ib) 1,593.2 1,602.6 1,609.6 1,607.2Table 4.7: Detailed results for the 10-member truss designVenkayya Goldberg LSP 71,593.2 run 1 1,606 1,602.6\u00E2\u0080\u0098True\u00E2\u0080\u0099 run 2 1,614 1,609.6solution run 3 1,625 1,607.2Table 4.8: Comparison of truss weights for the 10-member truss designChapter 4. TEST PROBLEMS 1124.3.4 38-member truss design [Problem 38-1-0-77jSource: Andrew B. Templeman, \u00E2\u0080\u009CDiscrete optimum structural design\u00E2\u0080\u009D, Computer andStructures, Vol 30, No 3 pp 511-518, 1988.Method of optimization used in the source literature: Heuristic methods for finding discrete optimum solution.Problem formulation:Consider the 38-member truss shown in Figure 4.8. With the given loading system,the truss is to be designed in such a way that the tip displacement does not exceed 10millimetres. Members should be chosen from the five different cross section bars available,that is 5.0, 10.0, 20.0 40.0 and 75.0 * iO mm2. All bars are of the same material withmass density, p = 7.85 * 10-6 Kg/mm3 and Young\u00E2\u0080\u0099s Modulus, E = 21OKN/mm.Figure 4.8: 38-member truss.The total mass of the truss is used to represent the cost so the optimization problemis formulated asMinimize Mass = p L A1lu---fIN1. i.m i. \u00E2\u0080\u00A2 ,. 1. II II U IIf-ar1o@1 hChapter 4. TEST PROBLEMS 113Subject to 6 10mmwhere L1 and A1 are the length and cross section of the ith member, and 6 is the displacement at the tip of the truss. 
L1 are known quantities and the A1 are discrete variables.Observations:The result given in the literature is reported to he the global discrete optimum design.Even though the difference is small, LSP achieved a slight improvement in the objectivefunction. The most interesting aspect is that LSP identified multiple global optima. Over40 different global optimal solutions were identified, of which only ten are given in Table4.9 along with the single solution given in the source literature.Fleming\u00E2\u0080\u0099s [Fleming, 1986) computer programme for plane trusses was used with LSPto solve the structural analysis problem using the stiffness method. It took about 75hours to meet the modified variance convergence criterion using an 80386-33Mhz microcomputer, but the total number of function evaluations was less than 500,000. Thereason for such a long computation time is the effort expended on the determination ofthe deflection for the many designs evaluated in the LSP search.As was pointed out by the external examiner Dr. A. Templeman, there exists a uniquealgebraic equation for the tip deflection of this truss. While use of the equation wouldhave significantly reduced the computational time needed for constraint evaluations thenumber of function evaluations would remain unchanged.Chapter 4. 
TEST PROBLEMS 114Member areas (1000mm2)\u00E2\u0080\u0094Member Literature LSP solutions1 10 10 10 10 10 10 10 10 10 10 102 510555105555 Sf3 10 10 10 10 10 10 10 10 10 10 104 10 5 10 10 10 10 5 10 10 5 10 f5 10 10 10 10 10 10 10 10 10 10 106 10 10 5 10 5 10 10 10 10 10 10 f7 10 10 10 10 10 10 10 10 10 10 108 10 10 10 10 10 10 5 5 10 10 10 f9 10 10 10 10 10 10 10 10 10 10 1010 10 5 10 10 10 5 10 10 10 10 10 f11 10 10 10 10 10 10 10 10 10 10 1012 10 10 10 10 10 10 10 10 10 10 5 f13 10 10 10 10 10 10 10 10 20 10 10 *14 10 10 10 10 10 10 10 10 10 10 1015 10 10 10 10 10 10 10 10 10 10 1016 10 10 10 10 10 10 10 10 10 10 1017 10 10 20 20 10 10 10 20 10 10 10 *18 10 10 10 10 10 10 10 10 10 10 1019 10 20 10 10 20 20 20 10 10 20 20 *20 5 55105551055Sf21 20 10 10 10 10 10 10 10 10 20 10 *22 20 20 20 20 20 20 20 20 20 20 2023 40 40 40 40 40 40 40 40 40 40 4024 40 40 40 40 40 40 40 40 40 40 4025 40 40 40 40 40 40 40 40 40 40 4026 75 75 75 75 75 75 75 75 75 75 7527 75 75 75 75 75 75 75 75 75 75 7528 73 75 75 75 75 75 75 75 75 75 7329 55555101010105 5f30 20 20 20 10 20 10 20 10 10 10 20 *31 20 20 20 20 20 20 20 20 20 20 2032 40 40 40 40 40 40 40 40 40 40 4033 40 40 40 40 40 40 40 40 40 40 4034 40 40 40 40 40 40 40 40 40 40 4035 75 75 75 75 75 75 75 75 75 75 7536 75 75 75 75 75 75 75 75 75 75 7537 75 75 75 75 75 75 75 75 75 75 7538 75 75 75 75 75 75 75 75 75 75 75Mass (Kg) 8489.16 8482.42f takes eitner 5 or 10 * takes either 10 or 20Table 4.9: Alternate optimal designs for the 38-member trussChapter 4. TEST PROBLEMS 1154.3.5 Unit Hydrograph ordinate determination. [Problem 19-1-21-OjSource: Larry W. 
Mays and Cheng-Kang Taur, "Unit Hydrographs via Nonlinear Programming", Water Resources Research, Vol 18, No 4, pp 744-752, 1982.

Method of optimization used in the source literature: large scale Generalized Reduced Gradient technique, where the output from a linear programming version of the problem is used as the starting point for GRG2.

Problem formulation:

The optimal unit hydrograph from a set of known rainfall and runoff data is required. The unit hydrograph should be able to produce either a minimum sum of deviations or a minimum sum of squares of deviations between observed and derived runoff hydrographs. Unlike the traditional approach, the determination of rainfall excess does not need to be defined in advance. Therefore the rainfall excess values are considered as part of the set of decision variables. The relationship between rainfall excess and total rainfall is shown in Figure 4.9.

At any time much of the rainfall is lost as ground water flow and evapotranspiration, and only the rest contributes to the surface runoff. The rainfall excess, which contributes to the surface runoff, is expressed as

P_{i,n} = I_{i,n} - H_{i,n};  i = 1, 2, ..., I;  n = 1, 2, ..., N_i

where
I_{i,n} is the total rainfall intensity of event i at time n
H_{i,n} is the rainfall loss of event i at time n
P_{i,n} is the rainfall excess of event i at time n
I is the total number of observed hydrographs.

Figure 4.9: Rainfall components.

N_i is the total number of ordinates of the observed hydrograph from the ith event.

The objective of this problem is to minimize the sum of absolute deviations between the observed runoff and the runoff derived from the unit hydrograph. The optimization problem is formulated, in its general form, as

Minimize sum_{i=1..I} sum_{n=1..N_i} |F_{i,n} - Q_{i,n}|^r

Subject to:

Q_{i,n} = P_{i,n}U_1 + P_{i,n-1}U_2 + ... + P_{i,n-M+1}U_M
for i = 1, 2, ..., I and n = 1, 2, ..., N_i

sum_{n=1..N_i} P_{i,n} = D_i;  i = 1, 2, ..., I

K sum_{m=1..M} U_m = 1
U_m >= 0;  m = 1, 2, ..., M

where
M is the total number of unit hydrograph ordinates
U_m is the value of the mth unit hydrograph ordinate
F_{i,n} is the observed direct runoff from the ith hydrograph at the nth ordinate
Q_{i,n} is the derived direct runoff from the ith hydrograph at the nth ordinate
D_i is the direct runoff volume for the ith rainfall event
L_i is the number of periods of rainfall excess for the ith event
r is a constant which takes a value of 1 or 2 depending upon the definition of the objective function
K is a constant expressed as

K = 12(3600)Delta_t / ((5280)^2 A)

where Delta_t is the interval in hours between hydrograph ordinates, and A is the drainage area of the watershed in miles^2.

LSP solved the Little Walnut Creek watershed (Austin, Texas) hydrograph using the data given in the literature. Table 4.10 shows details of the data recorded on 07/19/79. The watershed has a drainage area of 5.57 miles^2. The problem has 9 excess rainfall variables and 10 unit hydrograph ordinates, which makes the total number of decision variables 19.

In the source literature this problem was solved using different approaches. The best result was claimed to be the minimization of the sum of absolute deviations using GRG2. It is reported that the optimal solution was reached after 7.8 seconds on a CDC Cyber 170/750 computer system. The results cited for the 'best' result were used to make a comparison with the LSP results and are given in Table 4.11. The optimal unit hydrographs suggested in the literature and the one found with LSP are given in Figure 4.10.
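The derived-runoff constraint above is a discrete convolution of the rainfall excess with the unit hydrograph ordinates, and the objective is a sum of deviations between observed and derived runoff. A minimal sketch (function names are illustrative, not from the source):

```python
def derived_runoff(excess, uh):
    """Discrete convolution Q_n = sum_m P_{n-m+1} * U_m of the rainfall
    excess sequence with the unit hydrograph ordinates."""
    q = [0.0] * (len(excess) + len(uh) - 1)
    for i, p in enumerate(excess):
        for j, u in enumerate(uh):
            q[i + j] += p * u
    return q


def sum_abs_dev(observed, derived):
    """Objective for r = 1: sum of absolute deviations between the
    observed and derived runoff ordinates."""
    return sum(abs(o - d) for o, d in zip(observed, derived))
```

For the Little Walnut Creek case, `excess` would hold the 9 rainfall excess decision variables and `uh` the 10 unit hydrograph ordinates, with `sum_abs_dev` evaluated against the recorded runoff of Table 4.10.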
TEST PROBLEMS 118Total ObservedTime rainfall runoff(hr) (in) (cfs)1845 0.02 0.41915 0.36 3.01945 0.19 12.52015 0.16 62.2045 0.49 466.002115 0.87 1750.002145 0.13 985.002215 0.04 647.002245 0.01 330.002315 216.002345 97.000015 52.000045 43.000115 33.800145 24.600213 19.600245 18.800315 17.90Table 4.10: Rainfall-Runoff data for Little Walnut CreekObservations:The results with LSP show a considerable improvement in the objective functionvalue over that is cited in the literature. The total sum of deviation between observedand calculated runoff cited in the literature is 54.7 whereas the LSP results show the sumof deviations to be 50.57. This is about an 8% improvement on the optimum value. Butan equally significant improvement from a hydrological modelling standpoint is that theunit hydrograph derived from the LSP results is superior to the hydrograph result citedin the literature. The recession side of the hydrograph derived from LSP results showsthe smooth monotone decay expected in a unit hydrograph. Figure 4.10 contrasts theunit hydrographs derived from the two results.Chapter 4. 
TEST PROBLEMS 119Literature LSPUnit UnitRainfall Hydrograph Calculated Rainfall Hydrograph CalculatedLoss Ordinates Runoff Loss Ordinates Runoff(in) (cfs) (cfs) (in) (cfs) (cfs)0.020 4445.00 0.40 0.0199 3638.42 0.370.359 841.90 3.00 0.3592 1506.41 3.000.187 881.70 12.50 0.1869 964.16 12.470.147 271.30 62.00 0.1445 399.77 62.030.388 347.30 466.00 0.3693 335.04 466.010.498 70.00 1750.00 0.4436 124.20 1749.960.000 50.50 985.00 0.0698 58.79 985.100.000 68.60 647.00 0.0148 64.24 647.300.000 36.10 330.00 0.0042 52.10 330.0586.30 216.00 45.79 216.0697.00 97.0552.00 51.9943.00 43.0033.80 33.8140.00 24.6212.35 4.443.80 1.450.90 0.27Table 4.11: Optimal hydrograph for the Little Walnut Creek4.3.6 Air pollution control design [Problem 4-11-0-40]Source: Wang, Bi-Chong and Luus, Ruin, \u00E2\u0080\u009CReliability of optimization procedures forobtaining global optimum\u00E2\u0080\u009D, AIChE Journal Vol 24, No 4, pp 619-626, 1978.Method of optimization used in the source literature: Direct search method based onrandom sampling and search domain contraction.Problem formulation:The maximum ground level concentration of a pollutant resulting from the emissionof multiple sources is required. Holland\u00E2\u0080\u0099s plume rise equation and Gifford\u00E2\u0080\u0099s dispersionChapter 4. TEST PROBLEMS 120= InUFigure 4.10: Optimal unit hydrographs for Little Walnut Creek.equation are used for estimating the ground level sulphur dioxide concentration [Turner,1973]. Under adiabatic conditions, the ground level concentration of sulphur dioxide isgiven by=c, exp[_.5()2 \u00E2\u0080\u0094 .5()2ji=1 0\u00E2\u0080\u00A2Y \u00C2\u00B0Zi \u00E2\u0080\u00A2yj \u00C2\u00B0Ziwhere= .sin9(x \u00E2\u0080\u0094 a1) + cosO(y \u00E2\u0080\u0094 b1)X1 = cos6(x \u00E2\u0080\u0094 a) \u00E2\u0080\u0094 .sin6Q,, \u00E2\u0080\u0094 b1)1H = [1.5 + 2.68T\u00C3\u00A7Tad1}=H31+t .Time (hours)- - - -Litrure LSPfor 1, 2, ..., 10Chapter 4. 
TEST PROBLEMS 1210.9591 X 100.1136X?92 5 10 < X, <2 * iOo\u00E2\u0080\u00A2gi =O.1385X901 2 * iO < X, iO0.2O30X\u00C2\u00B0\u00C2\u00B0 iO < X < iO0.07925 X 104.828 *1O(lnX)88766 10 < X < 200= 3.108 *106(lnX)05295 200 = 2a0 + +a24; j = 1,2Z2 10xii 0Minimize z1Observations:In the source literature, the optimal power outputs for generator 1 and 2 are 30 MWand 20 MW respectively, and 4.68 1 ton/h of fuel oil is purchased. BFG generates a powerof 36.325 MW and the fuel oil is used to generate 13.675 MW. The optimum objectiveChapter 4. TEST PROBLEMS 139function value found with LSP is 4.683 ton/h, which is very similar to the solution citediii the literature. One of the important results found with LSP is that the global optimumis accompanied with multiple near optimal solutions. Details of solutions cited in theliterature and found with LSP are given in Tables 4.22 and 4.23 respectively. The totalnumber of function evaluations expended to meet LSP\u00E2\u0080\u0099s termination criterion was 33,556,which took 50 seconds on a 80386-33Mhz computer.Generator Fuel oil BFG Power1 10.114 19.886 302 3.561 16.439 20Total 13.675 36.325 50Table 4.22: Optimal power generation cited in the literatureGenerator Fuel oil BFG Power1 10.479 19.504 29.9832 3.254 16.768 20.022Total 13.733 36.272 50.005Table 4.23: Optimal power generation found with LSP4.3.11 Chemical equilibrium [Problem 10-1-3-0]Source: Moran, Manfred and Grossmann, Ignacio E., \u00E2\u0080\u009CChemical Engineering Optimization Models with CAMS\u00E2\u0080\u009D, CACHE Process design case studies, Vol 6, 1991.iViethod of optimization used in the source literature: GAMSProblem formulation:From the thermodynamics of chemical reaction equilibrium, the equilibrium state ofa closed system at constant temperature, T, and pressure, F, is the state at which itstotal Gibbs free energy is a minimum. This criterion is used to obtain the equilibriumChapter 4. 
TEST PROBLEMS 140composition of a given mixture by minimizing its free energy with respect to its composition. An ideal gas mixture of ten chemical species is maintained at T = 298 K andP = 750 Hg. The species are made up of three atomic elements (e.g. H, 0 and C). If wedenote the three elements as A, B and C, then the species formulas in terms of A, B andC are A, A2,A2C, B, B2,AB, BC, C, C2 and AC respectively for .s = 1, 2, ..., 10. Being anideal gas, the Gibbs free energy per mole of species .s is given by C3 = C03 + RT ln(Fy3),where y3 is the mole fraction of species s in the mixture. The C05 are given in terms ofin3 = C03/RT in Table 4.24. The mixture contains 2 mole A, I mole B and 1 mole C.5 inS 5 in31 -10.021 6 -18.9182 -21.096 7 -28.0323 -37.986 8 -14.6404 - 9.846 9 -30.5945 -28.653 10 -26.111Table 4.24: Free energy constants (in3)Define x3 as the moles of species .s and F as the total moles of all species in themixture, that isF=x1+x2...+x0Since the mixture is ideal, C is the sum of the free energies of the species in the mixture,where= x1C + x2C + x3C + ... +x10C.Substituting in3 in(Fy3) for C3, dividing C by RT, and expressing the mole fractionsYs = x3/F then,C = = YZx3[w + ln(Px3/F)]Let a3 denote the moles of element i in one mole of species .s. The values of a13 areeasily obtained from the species formulas. For instance a11 = 1, since species A(s = 1)Chapter 4. TEST PROBLEMS 141has only one atomic element A(i = 1), a21 = 0, a13 = 2, etc.. The total number of molesof element i in the mixture, X0, is given by,= a1x +a2x + a3x + ... +a1ox0; i = 1,2,3.The optimization problem is to minimize the total Gibbs free energy, G, with respectto the mixture composition. But since the mixture T is maintained constant, minimizingG is equivalent to minimizing G. 
Then the optimization formulation isMinimize GThe solutions for this 10 variable problem cited in the literature and found with LSPare given in Table 4.25.Observations:The LSP results confirm the results cited in the literature. LSP found the solution in3 minutes and 20 seconds on a 80386-33Mhz microcomputer, which took 83,588 functionevaluations to meet the convergence criterion VM j 1 * iO.xss Literature LSP1 0.007 0.00612 0.068 0.07093 0.907 0.90384 0.0004 0.00055 0.491 0.49026 0.0005 0.00087 0.018 0.01838 0.003 0.00349 0.015 0.015410 0.042 0.0436G -43.495 -43.4942Table 4.25: Optimal solution to the Chemical equilibrium problemChapter 4. TEST PROBLEMS 1424.4 Model fittingRegression analysis is used to fix the parameters of a model from a set of experimentaldata. The experimental data will have a set of observed points with known values forthe independent variables and the corresponding values of the dependent variable. Therelationship between the dependent and independent variables can be expressed in avariety of ways. For example, as a simple linear model= /3o + /31x;or as a multiple linear modelor as a polynomial model+ + + ... +or as a general nonlinear modelU = o + thx1 +52x1+ 3x +In all the above forms the regression analysis determines the values of the parameters, i.e. the fi vector, in the model. The methods used to estimate these parametersare known as simple, multiple linear and nonlinear regression analysis depending uponthe formulation of the model. The equations used to define a certain model normallyreflect the physical processes involved and models dealing with the physical world arepredominantly nonlinear in their formulation. On the other hand estimating parametersfor linear models is easier than for nonlinear models and therefore the equations of nonlinear models are often linearized so that linear regression methods can be used. Suchtransformations may not be possible with highly nonlinear models.Chapter 4. 
Three best fit criteria used in the estimation of parameters of a linear model are: minimization of the sum of the squares of deviations between observed and model generated values; minimization of the sum of absolute deviations; and minimization of the maximum absolute deviation [Narula, 1982].

Suppose a set of data is collected and the parameters for the linear model y = bx are to be estimated. To fit a model from m experimental points using the least squares, or the absolute deviations, as the model criterion, the objective function can be expressed as

Minimize SUM_{j=1..m} |y_j - b x_j|^p    (4.2)

with the summation running over the m data values, j = 1, ..., m, where

y_j represents the experimental value of the dependent variable at point j.
x_j is the coordinate of the independent variable vector at point j.
b is the vector of model parameters.

If p takes a value of 2, the objective would be to minimize the sum of squares of the deviations. When p takes a value of 1, then the objective would be to minimize the sum of absolute deviations.

LSP can be used to optimize the function given in Equation 4.2 where the constant p takes the value of 1, 2 or any real value. LSP finds a solution to such problems without the need for linearization of any of the functions. It also provides designers with more freedom when deciding on the severity of the criterion for model fitting. In addition, LSP can easily estimate parameters for both linear and nonlinear regression models and, in fact, provides new perspectives on the parameter values found.

The type of sensitivity information on the model parameters, provided by LSP, is unique and has considerable potential value to an engineer seeking a representative model. The sensitivity information is provided by LSP in the form of the acceptance sets as the optimal solution is approached. This information is provided without any extra computational overhead and is also easily understood.
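To make Equation 4.2 concrete, the following sketch fits the single parameter b of the model y = bx by a crude uniform random search over an interval. This is illustrative only; LSP itself maintains and shrinks a level set rather than sampling a fixed interval, and all names here are invented.

```python
import random

def lp_objective(beta, xs, ys, p):
    """SUM_j |y_j - beta * x_j|^p  -- the fitting criterion of Equation 4.2."""
    return sum(abs(y - beta * x) ** p for x, y in zip(xs, ys))

def fit_beta(xs, ys, p, lo=-10.0, hi=10.0, n=20000, seed=1):
    """Crude direct random search for the best single parameter beta."""
    rng = random.Random(seed)
    return min((rng.uniform(lo, hi) for _ in range(n)),
               key=lambda b: lp_objective(b, xs, ys, p))

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]          # data scattered about y = 2x
print(fit_beta(xs, ys, 2))          # least squares (p = 2)
print(fit_beta(xs, ys, 1))          # least absolute deviations (p = 1)
```

Note that the two criteria give slightly different parameter estimates for the same data, which is exactly the freedom in "severity of the criterion" mentioned above.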
Sensitivity analysis with LSP in connection with model fitting is discussed further in Chapter 8.

Examples of regression models were taken from the literature and solved using LSP. In almost all the cases LSP confirmed the results in the literature. Test problems [Problem 3-17-0-0 and Problem 4-11-0-0, Appendix A] are representative of some of the nonlinear regression models solved with LSP. A 4% improvement over the solution cited in the literature was found with LSP's solution to [Problem 3-17-0-0, Appendix A]. While it took 19,842 function evaluations to meet the convergence criterion, this still took only 3 minutes with an 80386-33 MHz personal computer and this time is of little practical significance. Only 2,128 function evaluations were expended to meet a similar convergence criterion for the four variable problem [Problem 4-11-0-0, Appendix A].

Chapter 5

EXPERIENCE WITH LSP

Considerable experience was gained with LSP when the many test problems, both mathematical and engineering in nature, were solved during the course of this research. This experience showed that the output obtainable from LSP as the search progresses can give important information concerning the effectiveness of the search, possible adjustments to the search strategy when difficulties are experienced, as well as insights into the topology of the problem being solved. It has also helped to give a clearer indication of the influence of certain LSP parameters, such as the number of points in a level set, on the number of function evaluations required for solution and on the reliability and efficiency of the overall search.

While the basic LSP algorithm with the recommended default parameter values is an effective global optimizer, it is natural to seek further improvement in performance.
This chapter describes a variety of enhancements and adjustments which have been beneficial in solving some of the test problems and would be implemented under the direct control of the user.

5.1 Intermediate output, its presentation, and interpretation

One of the important contributions of this research in implementing LSP is the information which is displayed graphically on the computer monitor after every iteration. This feature was not anticipated but, as its potential value became apparent, it was explored and exploited in this research. Three principal types of output displays were investigated: scatter diagrams of points in an acceptance set on two variable planes; plots of level set value-c against iterations; and the cumulative number of function evaluations against iterations. These can all be generated rapidly with little computational overhead and were provided without difficulty by the LSP software developed. These plots and their significance are described in detail in the next three sections. All of the diagrams used in this chapter are either derived from these displays or represent actual screen dumps of the displays.

5.1.1 Plots of x_i, x_j pairs

A set of two dimensional x_i ~ x_j plots of the confirmed points in the current acceptance set are displayed on the computer monitor at each iteration. When the axes of these plots span the initial variable bounds, the cluster formed by these points indicates the size and location of the current acceptance set in the decision domain. The approximate acceptance set boundaries can be visually inferred from the cluster of points displayed.

The existence of multiple local optima is suggested in these x_i ~ x_j plots when the displayed points form distinct clusters with open spaces in between. It is quite common to observe that, after a few more iterations and an accompanying improvement of the level set value-c, the displayed points form a smaller single cluster.
The existence of distinct clusters at some stage of the search and their disappearance at a later stage is indicative of local optima and LSP's ability to reject them. If there is some special reason to explore a particular local optimum, LSP can be rerun within a new initial cuboid enveloping the cluster of points of interest. Clusters may however exist in n dimensional space but not necessarily be evident in many, or even all, of the x_i ~ x_j plots due to overlapping and the limited perspective of each of the two dimensional views. Thus, while visual detection of clusters is sufficient to confirm their existence, it is not necessarily the case that they will be detectable in this way. In section 5.1.2 other graphical evidence of the existence of multiple clusters is described.

Two plots derived from a single LSP run, but at different iterations, are shown in Figure 5.1 for a two variable problem and indicate the existence of multiple optima. The points form two distinct clusters, which indicates that there are at least two local optima. At this stage there is nothing to suggest whether the two clusters will lead to a single global optimum or to two or more equal global optima. After a few additional iterations the points around the upper left corner of Figure 5.1(a) completely disappear and all points in the acceptance set cluster around a single point, as shown in Figure 5.1(b). The disappearance of those points which had formed a distinct cluster at an earlier iteration implies the existence of a local minimum in that vicinity. As the LSP search progresses the displayed points coalesce about the solution and ultimately occupy only a single pixel in the graphical display.

Similar scatter diagrams plotted with axes which span the sides of the current cuboid, that is where the full length of the axes correspond with the current cuboid side, can also be displayed at any stage of the search.
Such a plot gives a closer look at the pattern of points in the current acceptance set. These patterns may reveal useful information concerning, for example, the distribution in the decision space of near optimal solutions and the mutual relationships of each pair of the variable set. The significance of such indications will be discussed under the subject of sensitivity in Chapter 8.

Figure 5.1: Local optima suggested by the disappearance of a point cluster.

5.1.2 Plots of cumulative function evaluations-N_f against iteration number-I

Many different schemes for counting the number of function evaluations incurred in nonlinear optimization appear in the literature. The count adopted in this research reflects both objective function and constraint function evaluations. When a point is generated in LSP it is first checked for feasibility against the set of constraints, including the bounds on each variable. If the point is feasible then it is retained, or if infeasible, it is rejected. In either case this is counted as one function evaluation. At every feasible point the objective function value is calculated and this is also counted as a function evaluation. Consequently every confirmed point in an acceptance set will count as two function evaluations.

A variety of shapes of the N_f ~ I curve can be observed. From experience with the test problems these curves have been categorized into three types: linear, exponential-like and sigmoidal curves. The implication of each type of curve is explained next.

Linear curve

The number of function evaluations at the first iteration gives an indication of the ratio of the feasible region volume and the initial cuboid volume.
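The evaluation-counting convention described above can be sketched as follows. This is a minimal illustration; the function and constraint names are invented, not part of the LSP code.

```python
import random

def count_evals_one_point(constraints, objective, bounds, rng):
    """Generate one trial point in the cuboid and return (point_or_None, evals).
    The feasibility screening costs one function evaluation; if the point is
    feasible, evaluating the objective costs a second one."""
    x = [rng.uniform(lo, hi) for lo, hi in bounds]
    evals = 1                                   # constraint / bounds check
    if not all(g(x) <= 0 for g in constraints):
        return None, evals                      # infeasible: rejected after 1
    f = objective(x)
    evals += 1                                  # objective evaluation
    return (x, f), evals

rng = random.Random(0)
bounds = [(-1.0, 1.0), (-1.0, 1.0)]
constraints = [lambda x: x[0] + x[1] - 1.0]     # x1 + x2 <= 1
objective = lambda x: x[0] ** 2 + x[1] ** 2
point, n = count_evals_one_point(constraints, objective, bounds, rng)
# a confirmed (feasible) point costs 2 evaluations, a rejected one costs 1
```

Under this convention the slope of the N_f ~ I curve directly reflects how many trial points must be screened to confirm each acceptance-set point.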
The number of function evaluations at the first iteration for an unconstrained but bounded problem should be equal to N_keep, that is, equal to the number of points which are to be confirmed in the acceptance set.

For the ideal case of an unconstrained, but bounded, problem with a convex objective, the N_f ~ I plot has been confirmed through the test problems to be almost a straight line. The slope of this line, the number of function evaluations per iteration, is only slightly larger than N_keep/2, as shown in Figure 5.2(a). The reason is simply that, in this rather ideal case, only about half of the points in a level set are lost with the lowering of the level set value-c to M(f, c) at each iteration. These lost points must be replaced by new ones at every iteration and each new point needs one function evaluation.

Exponential-like curve

When a problem has constraints in addition to the variable bounds then the number of function evaluations at the first iteration will be greater than N_keep as a result of the diminished feasible space within the initial cuboid. For constrained problems, using the uniform random trial point generation scheme described in Chapter 2, Section 2.3.2, the number of function evaluations increases as the ratio of the feasible region volume to the cuboid volume gets smaller. A progressive plot of the cumulative number of function evaluations against iteration number, N_f ~ I, indicates the effectiveness of the search at each iteration and any change in this efficiency as the search progresses.

For constrained problems the N_f ~ I plot usually deviates from a straight line and, for a typical "well behaved" LSP problem, progressively steepens, reflecting that generating new acceptable points gets more difficult with the number of iterations.
Figure 5.2(b) is a plot for a typical constrained problem.

The inefficiency of acceptable point generation is indicated by the degree of deviation of the N_f ~ I plot from the straight line of slope N_keep, each point requiring one function evaluation for feasibility and one function evaluation for eligibility in the level set. This can occur when the problem has widely spaced multiple optimal points which produce a disjoint acceptance set, or when the geometry of a connected acceptance set produces a large cuboid volume with small acceptance set volume, for example when the acceptance set is distributed around a cuboid diagonal.

Figure 5.2: N_f ~ I plots. (a) Linear N_f ~ I curve; (b) exponential-like N_f ~ I curve; (c) sigmoidal N_f ~ I curve.

Sigmoidal curve

For some problems the N_f ~ I plot may be linear for the initial iterations, then adopt an exponentially increasing shape for a few more iterations, and finally decline and continue as a straight line, as shown in Figure 5.2(c). Such a plot shows that generating points in the acceptance set was efficient at the earlier stages of the search, then lower efficiency was experienced for a while, but that the efficiency was restored in the later stages when approaching the final convergence of c on c*.

A common explanation for such a curve is that, at the early stage of the search, the level set requirement is easily met and feasibility dominates acceptance set point generation. At this stage all points in the acceptance set constitute a single cluster, indicating that the feasible region is not disjoint. Therefore the N_f ~ I plot is close to a straight line with slope approximately equal to N_keep.
At the intermediate stages the acceptance set becomes fragmented, the points forming distinct separate clusters each surrounding a local optimum. Since acceptance set point generation is inefficient with multiple optima, their existence is reflected in the N_f ~ I curve by a steepened slope. But in the later stages the local optima are discarded and the search is performed on a single connected acceptance set. Such a curve shape is most pronounced when the objective function values at widely dispersed local optima are close to the global optimum objective function value. If the objective function value differences at all local and the global optimum are large, the influence of the local optima on the N_f ~ I curve is small.

The sigmoidal curve can be demonstrated with an LSP solution of the Rastrigin function [Problem 2-16-0-0, Appendix A], a two variable problem with 50 local optima but only one global optimum. This problem's formulation is

Minimize f(x) = x_1^2 + x_2^2 - cos(18 x_1) - cos(18 x_2)

Subject to:

-1 <= x_1 <= 1
-1 <= x_2 <= 1.

Figure 5.3 shows the screen display at convergence obtained from the LSP solution.

Figure 5.3: Rastrigin function - local optima implied by the N_f ~ I plot. (Actual screen dump).

A second possible reason for observing the characteristic sigmoidal N_f ~ I curve arises when a fissure exists in the objective function surface in the feasible region at the global optimum. For a problem with more than one local optimum, points in the acceptance set can form clusters around the individual local optimal points but not necessarily include a point in the fissure. The ratio of the acceptance set and cuboid volumes may be small under these conditions and the search inefficient. Provided the cuboid at this stage includes the fissure region, then there is still a possibility of generating points at or near the global optimum.
Once a single point is generated in the fissure, the search tends to converge rapidly on the global optimum as more and more points are generated within the fissure, so that the efficiency of the search is restored. The Road Runner function test problem [Problem 2-22-0-0, Appendix A] screen display is shown in Figure 5.4, and is an example where the N_f ~ I plot exhibits the sigmoidal shape due to the presence of a fissure.

Figure 5.4: Screen display for the Road Runner function, two dimensional case. (Actual screen dump).

5.1.3 Plots of level set value-c against the iteration number-I

The progressive plot of level set value against the number of iterations, c ~ I, gives feedback on the strength of the convergence on c*. In this research a pre-specified value of the modified variance was adopted as a convergence criterion. This value is compared with the modified variance after each iteration. Ideally the modified variance should reach zero at the global optimum, but a small value is used as a tolerance for practical implementation purposes.

Because the modified variance is not dimensionless and is also sensitive to the absolute value of the objective function, a strong convergence criterion value for one case may be weak for another case. It can therefore be difficult to specify a convergence criterion in advance of obtaining a solution of the problem. The convergence question can be judged from the evolving c ~ I curve and based on the difference of level set value between consecutive iterations, as was discussed in Chapter 2, section 2.3.5.

Ideally c ~ I would be a smooth and monotone decreasing curve, and strongly asymptotic to the global minimum.
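A simplified discrete sketch of this convergence bookkeeping is given below, assuming the level set value is lowered to the mean M(f, c) of the retained objective values; the function and variable names are invented for illustration.

```python
def level_set_step(f_values, c):
    """One conceptual LSP bookkeeping step (simplified sketch).

    Keep only points with f(x) <= c, lower c to the mean M(f, c) of the
    retained values, and report the modified variance used as the
    convergence measure (zero when all retained values coincide)."""
    kept = [f for f in f_values if f <= c]
    c_new = sum(kept) / len(kept)                          # M(f, c)
    v_mod = sum((f - c_new) ** 2 for f in kept) / len(kept)
    return c_new, v_mod

# toy objective values sampled from an acceptance set
fs = [3.2, 1.1, 0.7, 2.5, 0.9, 4.0]
c, v = level_set_step(fs, c=3.0)
# c drops toward the optimum; v shrinks as the set tightens, and v = 0
# exactly when all retained values are equal
```

The point of the sketch is only that the convergence measure carries the units of f squared, which is why a single threshold value cannot suit all problems.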
The advancement of c on c* resembles a binary search in some respects and therefore often shares similar convergence characteristics. As the curve becomes nearly horizontal it indicates that the difference between the level set values at consecutive iterations is very small, and that further iterations would probably bring even smaller improvement to the objective function per iteration, see Figure 5.5(a). But if the search stops before the horizontal part of the curve is well developed, it may suggest that the convergence criterion adopted is too large and has led to a premature termination, as in Figure 5.5(b).

In less ideal cases the level set value versus iterations plot might show some discontinuity, as in Figure 5.5(c). The causes for the discontinuity can be either a discontinuity in the objective function surface, or that the global optimum lies in a small fissure and a first point has been generated in that fissure just prior to the discontinuity. When a low point is generated in the fissure it focuses the search around the new point and a major correction occurs in the level set boundary estimates. As more points are then generated in the fissure the level set value drops rapidly compared to previous iterations, hence an abrupt change in shape is introduced into the c ~ I curve.

Figure 5.5: Level set value-c versus iteration-I. (a) Ideal convergence; (b) premature termination; (c) discontinuous convergence.

An example of the above is given in Figure 5.6, where the Road Runner function is used to demonstrate a discontinuity in the c ~ I plot. In this instance the number of points maintained in the level set, N_keep, was reduced to half of its recommended value.
The curve exhibits the initial tendency to converge on a non optimal point, as discussed above, and then the abrupt slope change leads to the final convergence on the global optimum.

5.1.4 Further generalization of the N_f ~ I and c ~ I plot interpretations

The N_f ~ I and c ~ I plots are considered to be an important by-product of the LSP search and their interpretation is unique to LSP. Experience with these plots when solving problems has revealed some distinct and frequently observed patterns. These patterns have been linked to the known features of the many test problems solved and have led to the following generalizations. While these are explained with the aid of a one dimensional problem, their key features can easily be extended to the multidimensional equivalent cases. It should be stressed that, regardless of the dimensionality of the problem being solved, the c ~ I and N_f ~ I plots will remain two dimensional. Interestingly, the characteristic plots shown in the following pages were clearly observed in many multidimensional problems. This suggests that the types of features reflected in the two dimensional examples are often also dominant in more complex problems.

Figure 5.6: Low N_keep search for the Road Runner function. Demonstrates discontinuity in the c ~ I plot. (Actual screen dump).

The top two shaded diagrams in Figure 5.7 show plots of the objective function against the variable x for two single dimensional problems.
The user would not have knowledge of these curves, but their general nature is revealed by the LSP screen displays of the c ~ I and N_f ~ I plots shown in the lower sections of Figure 5.7.

The plots of the level set value-c versus iterations are both simple monotone decreasing. This suggests, for example, that no fissure like feature in the objective function surfaces has been detected during the LSP search.

In the unimodal objective function case of Figure 5.7, the N_f ~ I plot shows a linear relationship between N_f and I. The slope of the curve is close to N_keep, which is indicative of good search efficiency. This curve, together with the smooth monotone decreasing c ~ I curve, leads to the conclusion that the optimization problem being solved has only one global optimum and an objective function surface which is essentially unimodal.

In the multimodal case of Figure 5.7 the N_f ~ I plot increases exponentially. This decline in efficiency is indicative of a progressive decrease in the volume ratio of the level set to the current cuboid, which often arises when the problem being solved has multiple global, or near global, optima.

Figure 5.8 shows three sets of more complex c ~ I and N_f ~ I plots reflecting more complex objective functions. The first set, Figure 5.8(a), shows a smooth monotone decreasing c ~ I curve and an almost constant slope of the N_f ~ I curve except for a slightly higher slope at the initial stage of the search. These two curves are characteristic of problems with a set of local optima, and a single global optimum lying in a relatively broad depression in the objective function surface. Moreover the objective function value at the local points is significantly inferior to the value at the global optimum.
If the objective values of the local optimal points are extremely inferior to the value at the global optimum, then the influence of the local optimal points on the N_f ~ I curve is so weak that the higher slope around the initial stages may not be distinct. The objective function of Figure 5.8(a) is representative of this situation.

Figure 5.8(b) shows a smooth monotone decreasing c ~ I curve but the N_f ~ I plot has a characteristic sigmoidal shape. The likely indication from these two curves is that the problem has multiple local optima, but only one is a global optimum, and the difference between the values of the objective function at the local and global optima is quite small. This kind of objective function is depicted in the top figure of 5.8(b).

Figure 5.8(c) shows a pronounced discontinuity in the c ~ I curve and the characteristic three part sigmoidal N_f ~ I curve. Such curves are generally obtained with problems which have a fissure in the objective function surface. The fissure is contained within the cuboids throughout the search but no point is sampled inside the fissure at the earlier stages of the search. The type of objective function manifesting these plots is depicted at the top of Figure 5.8(c) for a single variable case.

5.2 Adjusting LSP parameters

With LSP there are a number of parameters which control key aspects of the search. These can remain fixed at recommended default values, be initialized at other values, or be adjusted as the search progresses in the light of the outputs discussed in section 5.1. The default parameter values recommended in sections 5.2.1 to 5.2.5 have proven to be effective in many cases, but when search difficulties are indicated by the c ~ I and N_f ~ I curves, adjustments to these values may be warranted.
Each of the parameters impacts directly on quantities in the domain of the engineer's real world problem as opposed to being mathematically abstract.

The important LSP parameters are: the number of confirmed points in the acceptance set N_keep; the value of the criterion for convergence V_M; the value of the clustering criterion for partitioning the problem into subregions; the criterion for initiating skewness adjustment and the adjustment parameters; the change of the penalty parameters per iteration; and the tightening of any relaxed constraints per iteration. These parameter values and their adjustment, which were investigated and exploited throughout the test problems, are discussed in the next sections.

5.2.1 Number of confirmed points in the acceptance set-N_keep

The required number of confirmed points in an acceptance set depends primarily on the number of variables involved in the problem formulation. Generally, increasing N_keep improves reliability but reduces the search efficiency. The ideal relationship between N_keep and the number of variables in the problem is not necessarily linear, though a linear relationship appears to work well for lower dimension problems.

Figure 5.7: Interpretation of simple c ~ I and N_f ~ I curves, (a) unimodal and (b) multimodal objective functions.

Figure 5.8: Interpretation of more complex N_f ~ I and c ~ I plots.

From the experience gained during this research, N_keep = (10 x number of variables) was found to be adequate for most problems up to 6 variables. With larger dimension problems substantially less than the 10 x number of variables was often satisfactory.
For example, the maximum value of N_keep utilized was 160 for Templeman's problem, discussed in Chapter 4, section 4.3.4, involving 38 variables, and yet the solution found with LSP improved (very slightly) on the published results for the same problem as well as finding multiple optima in a single run.

5.2.2 Termination criterion-V_M

Since the convergence criterion used in LSP is based on the variance of the objective function values in the acceptance set, it is expressed in the same units as the objective function and is prone to scaling problems. Therefore a specific value of the convergence criterion performs best for a specific range of objective function values. A similar scaling problem is common with gradient optimization convergence criteria. One gradient method, GRG2, recommends that the objective function and constraint functions be scaled to have absolute values greater than 10^-1 and less than 100 for successful operation of its convergence criterion [Lasdon, 1982]. Convergence criterion values ranging between 1 x 10^-10 and 1 x 10^-8 were adopted with LSP. The higher values were used for problems with high absolute values of the objective function at their global optimum.

The modified variance of the objective function values in the level set is reported at every iteration. Its value can also be assessed in the light of the information on the progress of the search provided by the c ~ I and N_f ~ I curves, as discussed in Chapter 2, section 2.3.4 and earlier in this chapter.

5.2.3 Cluster criterion

Subdividing the cuboid into a set of smaller cuboids when multiple optima are indicated by the formation of multiple point clusters can be an important strategy in achieving a computationally efficient and successful solution with LSP. A simple clustering analysis routine is adequate to identify the clusters and hence their boundaries.
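The clustering criterion used for this decision can be sketched as follows, assuming two candidate clusters have already been identified (e.g. by cutting a dendrogram, as described below). This is a minimal illustration with invented names, not the routine used in the research.

```python
import math
from itertools import combinations

def cluster_criterion(cluster_a, cluster_b):
    """Ratio used to decide on subdividing the search region:
    (minimum distance between the two clusters) /
    (maximum distance between two points within a cluster)."""
    between = min(math.dist(p, q) for p in cluster_a for q in cluster_b)
    within = max(max((math.dist(p, q) for p, q in combinations(c, 2)),
                     default=0.0)
                 for c in (cluster_a, cluster_b))
    return between / within

a = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0)]    # points around one local optimum
b = [(5.0, 5.0), (5.1, 4.9), (4.9, 5.1)]    # points around another
ratio = cluster_criterion(a, b)
# well above the useful threshold range of 2 to 8 -> subdivide the cuboid
```

When the clusters nearly touch, the ratio falls below the threshold and the acceptance set is treated as a single region.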
Any overestimation, and even overlap, of cluster boundary estimates does not threaten the success of the search. Once the bounds of each cluster, and hence their cuboids, are established, the search proceeds independently within each of these individual new cuboids.

The standard cluster analysis routine adopted in this research involves the construction of a dendrogram. At some distance from the branch ends, the dendrogram is cut by a line perpendicular to the branches so that the initial dendrogram tree is divided into a set of smaller trees. Points corresponding to each of the smaller trees are considered to form individual sets and are identified as clusters if the necessary clustering criterion is met.

The clustering criterion was specified in Chapter 3, section 3.1.3 as the ratio of the minimum distance between two clusters and the maximum distance between two points within a cluster. When this ratio is above a certain predefined value, the region occupied by the acceptance set is subdivided into smaller subregions. Based on the test problems, experience showed that the useful range of the clustering criterion is between 2 and 8.

Even though clustering analysis methods can be used for any n dimensional problem, their application in this research was confined to problems involving up to 4 variables. This was largely because of the difficulties which occur in cluster investigation in larger dimensional problems. Furthermore, there are dangers in over generalizing the experience obtained with a limited number of test problems to high dimensional problems that might be encountered in engineering.

5.2.4 Cuboid stretching parameters

At every iteration the cuboid defined by the confirmed points in the acceptance set is stretched on each side by 1/n of its total length, where n is the number of points from the previous iteration remaining in the acceptance set after points which do not fulfil the current f(x) <= c condition have been discarded.
Details of the stretching parameters are given in Chapter 3, section 3.4.

The stretching of cuboids obviously increases the cuboid volume, but the efficiency of the search is not noticeably reduced by this extra volume when the stretch is small. Experience with the test problems showed that no significant difference in the total number of function evaluations exists for cases with and without cuboids being stretched. But the increased reliability benefits from cuboid stretching were significant, especially when optimal points were close to the cuboid boundaries.

5.2.5 Skewness adjustment parameters

The LSP parameters used for skewness adjustment have already been discussed in Chapter 3, section 3.5. The choice of value for each parameter is made so as to balance the increased speed of convergence against the increased risk of missing the global optimum. The parameters help reduce the risk of missing optimal points located at the boundaries, and in particular the corners, of the cuboid.

From the experience gained in this research, the best values for the three skewness parameters range from 0.3 to 0.4 for d0, 0.10 to 0.25 for d1 and 0.05 to 0.15 for d2. When skewness adjustments were used, the default values d0 = 0.4, d1 = 0.1 and d2 = 0.05 were adopted. For a one variable problem with a unit length of cuboid, these figures give a maximum elongation (stretch) of 5% when the skewness measure d equals 1 and a minimum elongation of 2% when d = d0. For cases where d < d0 there is no adjustment. With the default
As a test of response to a highly skewed feasible region, the initial bounds for the Rosenbrock function, where the solution is at x1 = 1 and x2 = 1, were set at 0 ≤ x1, x2 ≤ 20. Two sets of runs were conducted, one without, and the other with, skewness adjustment. In the first case, out of the 10 LSP runs without skewness adjustment, only 3 converged on the true optimum point. But in the second case, where skewness adjustment was used, all 10 runs converged on the global optimum point. These examples, as well as a number of the other test problems, demonstrated that the skewness adjustment significantly improves the reliability of convergence on the true optimum point with only a small addition in computational effort.

5.2.6 Penalty and constraint relaxation and tightening parameters

Details of penalty parameter modification are given in Chapter 3, section 3.2.1. Experimental work showed good results for a penalty coefficient reduction factor within a range of 0.7 to 0.9 at every iteration. A minimum value of 100 was found to be satisfactory for problems whose absolute value of the objective function evaluated at the optimal point is less than 100. Reductions outside this range encountered difficulties which commonly involved convergence on a non-optimal point. This is also discussed in Chapter 3, section 3.2.1.

The relaxation parameter value was usually determined by trial and error while running LSP. In some instances, with heavily constrained problems, very little progress towards establishing the initial set of Nkeep points occurred even after the expenditure of considerable computational effort. The constraints were then relaxed before completion of the first iteration. But, if possible, it is suggested that at least one iteration be completed so that an indication of Nf, the number of function evaluations needed to establish Nkeep confirmed points, is obtained.
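The penalty machinery of this section can be sketched as an exterior-penalty form with a geometrically reduced coefficient. This is an illustrative reading of the rules above, not the exact LSP formulation (which is in Chapter 3, section 3.2.1), and the names are assumptions:

```python
def penalized_objective(f, constraints, x, mu):
    """Exterior-penalty form: f(x) plus mu times the total violation
    of the g(x) <= 0 constraints (illustrative sketch only)."""
    violation = sum(max(0.0, g(x)) for g in constraints)
    return f(x) + mu * violation

def next_penalty(mu, reduction=0.8, floor=100.0):
    """Reduce the coefficient by a factor in [0.7, 0.9] at every
    iteration, but never below the minimum value of 100 noted above."""
    return max(mu * reduction, floor)
```

A reduction factor of 0.8 sits in the middle of the range that experimental work found to give good results.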
If Nf > 20 × Nkeep, one might consider relaxing the constraints. Generally the value of the relaxation parameter is related to the complexity of the constraint functions. The merits of relaxing constraints are discussed in Chapter 3, section 3.3.

The six variable problem [Problem 6-1-0-2, Appendix A] demonstrates the advantage of constraint relaxation. Without relaxing the constraints it took over 100,000 function evaluations to generate a single acceptable point. But with constraint relaxation, it was possible to find the optimal solution with 272,225 function evaluations.

From experience with the test problems, values between 0.5 and 0.85 were found to be reasonable for tightening relaxed constraints. A lower value can be used for problems where the search efficiency indicated by Nf after the first iteration is high, indicating that the feasible region volume is close to the current cuboid volume.

5.3 Use of a rhombohedron shaped search domain

A rhombohedron can be used as a more compact search domain in place of a cuboid. The efficiency of LSP then increases in proportion to the volume ratio of the minimal cuboid to the minimal rhombohedron enclosing the same acceptance set. This is discussed in Chapter 3, section 3.6. This efficiency improvement increases with increasing linearity in the relationship between pairs of variables. The greatest efficiency gain occurs when the correlation coefficient approaches +1 or −1.

As the search becomes confined to a very small region of the decision domain, relationships which are highly nonlinear will often approximate locally to a linear relationship so that the rhombohedron can be exploited frequently.
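Since the trigger for adopting a rhombohedron is a strong linear correlation between some pair of variables, the detection step can be sketched as a scan over the confirmed points for the most correlated pair. The function names are illustrative, and the 0.9 threshold is the value used in the experiments reported in this section:

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5

def most_correlated_pair(points, threshold=0.9):
    """Return the variable pair with the highest |correlation| over the
    confirmed points, or None if no pair reaches the threshold."""
    n_vars = len(points[0])
    best_pair, best_r = None, 0.0
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            r = pearson([p[i] for p in points], [p[j] for p in points])
            if abs(r) > abs(best_r):
                best_pair, best_r = (i, j), r
    if best_pair is not None and abs(best_r) >= threshold:
        return best_pair, best_r
    return None, best_r
```

When a pair is returned, the cuboid face in that variable plane is the candidate for replacement by a rhomboid.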
However, it is important to note that LSP does not impose an actual linear relationship on the variable pair, so that no approximation or distortion is being introduced into the problem being solved.

The rhombohedron was used for a number of test problems which had shown distinct evidence of linearity between some of the variables in the xi vs. xj plots at some stage in the search. The three variable flywheel design problem [Problem 3-15-0-2, Appendix A] and the four variable power generation problem [Problem 4-11-0-5, Chapter 4] were used to investigate the reduction in the number of function evaluations when the rhombohedron was used. Each of these test problems was solved in two ways. In the first case, a cuboid was used throughout the search, while in the second case shifts were made between cuboid and rhombohedron as the need arose. The shift to the rhombohedron was made only when the absolute value of the correlation coefficient between any two variables was greater than or equal to 0.9. Both test problems were run three times for each case and the average number of function evaluations required for convergence is given in Table 5.1. The same termination criterion and Nkeep values were adopted in all of the cases.

            Number of function evaluations
Problem     Cuboid      Rhombohedron
3-15-0-2    7,432,678   507,558
4-11-0-5    1,431,317   84,653

Table 5.1: Function evaluations using cuboid and rhombohedron

In multidimensional problems more than one pair of variables can exhibit high linear correlation. In such cases the rhombohedron defined might have more than one non-rectangular face. Even though it is technically possible to make further modifications to the cuboid in response to multiple high correlation coefficients, only a single variable pair was accommodated at any one time in this research. When more than one pair of variables had correlation coefficients greater than 0.9, the cuboid was replaced with a rhomboid only in the plane of the variable pair with the highest correlation. At the present stage of implementing the use of the rhombohedron, both conceptualization and visualization difficulties preclude the use of a rhomboid for more than one variable pair.

The rhombohedron also provides benefits in other situations. When a problem has multiple global optima, the number of function evaluations can be very large before the convergence criterion is met, as discussed in section 5.1.4. If there are only two global optimal points, or there are more than two optimal points lying approximately on a straight line in the decision space, adopting a rhombohedron can dramatically improve the search efficiency. Therefore in some cases the use of the rhombohedron can be a substitute for cluster analysis. For example, a rhombohedron search domain was used to solve the two variable, two global optima problem discussed in Chapter 3, section 3.1.3, using Nkeep = 20 and VM = 0.001. The convergence criterion was met and both optimal points identified after 234,866 function evaluations. Comparison of this result with the 8,018,142 function evaluations cited in Table 3.1 in Chapter 3 indicates that the efficiency gain is substantial.

5.4 Relationship between initial cuboid volume and its location, and Nf

When optimization problems are bounded but bound estimates are not known, the designer has to guess the possible upper and lower limits for each variable in order to define the initial cuboid. A generous guess is preferred since precision on the bounds is not required and it is important that the global optimum not be excluded. The size of the initial cuboid was found not to have a significant influence on the total number of function evaluations required to solve the problem.

A two variable unconstrained problem [Problem 2-24-0-0, Appendix A] demonstrates the influence the initial cuboid can have on the total number of function evaluations.
This problem was run with two different sized initial cuboids. In the first case the bounds of the variables used were 0 ≤ x1 ≤ 4 and 0 ≤ x2 ≤ 4, which gives a volume of 16 units. In the second case, where the bounds used were 0 ≤ x1 ≤ 20 and 0 ≤ x2 ≤ 20, the volume of the initial cuboid was increased to 400 units, i.e. 25 times the volume of the first case. Ten separate runs for each of these cases were conducted and the average number of function evaluations was found to be 2,369 for the first case and 2,725 for the second, larger initial volume, case. These numbers suggest that there is only a weak relationship between initial cuboid volume and the total number of function evaluations required for solution.

Inappropriately skewed initial estimates of bounds relative to the global optimal point(s) can result in a high number of function evaluations for solution. The skewness adjustment generally modifies the location of the cuboids so that clusters of points in the vicinity of a cuboid boundary are not on the boundaries of subsequent cuboids. Even if the skewness adjustment is high at the earlier stages of the search, it can take many iterations to shift the cuboid until the global optimum is included in the current cuboid. Such extra iterations introduce computational inefficiency. Of course the inefficiency can be reduced with a different set of skewness parameter values, but adjusting these parameters for each particular problem is not a feasible option as there will rarely be any a priori information on which to base the adjustment.

5.5 Some specific difficulties encountered while implementing LSP

Computational inefficiency and convergence difficulties were encountered by LSP in just four of the 200 published mathematical test problems attempted. LSP can fail to find the true solution to a problem when a cuboid at any stage of the search erroneously excludes a portion of the level set containing the global optimum and this is not subsequently recovered through cuboid stretching or skewness adjustment. Evidence of this difficulty was provided by a test problem taken from Subrahmanyam [Problem 2-12-0-4, Appendix A]. This two variable, four constraint problem is shown in Figure 5.9. Its key features are a narrow crescent shaped feasible region formed by two circular arcs and an objective function surface almost parallel to one of the variable axes. The formulation is

Minimize f(x) = (x1 − 10)³ + (x2 − 20)³

Subject to:
−x1 + 13 ≤ 0
−(x1 − 5)² − (x2 − 5)² + 100 ≤ 0
(x1 − 6)² + (x2 − 5)² − 82.81 ≤ 0
x2 ≥ 0

Subrahmanyam [Subrahmanyam, 1989] solved this problem using the “Extended simplex method applied to constrained nonlinear optimization”. He states that convergence was achieved at 833 function evaluations, though it is not clear whether this number includes the evaluation of constraints or is only the number of objective function evaluations.

Attempts to solve the above problem with LSP, using the default parameter values, always converged on non-optimal points. Since the global optimum lies at an acute corner of the feasible region, the cuboids at successive iterations excluded the global point for the reason discussed in Chapter 3, section 3.5. Using skewness values within the range recommended in this thesis, the skewness adjustment could not shift the cuboid far enough to include the optimum point before the convergence criterion was met. The skewness parameter values were easily tailored to solve this particular problem: values of δ0 = 0.25, δ1 = 0.22 and δ2 = 0.15 produced the correct global optimum. Such values would not be appropriate for general LSP use because of the severe reduction in efficiency that would result with most types of problems.
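The formulation can be coded directly. The sketch below (names are illustrative) evaluates the objective and checks membership of the crescent shaped feasible region:

```python
def objective(x1, x2):
    """f(x) = (x1 - 10)^3 + (x2 - 20)^3"""
    return (x1 - 10.0) ** 3 + (x2 - 20.0) ** 3

def is_feasible(x1, x2):
    """Check the four constraints of the crescent shaped region:
    outside the circle of radius 10 about (5, 5), inside the circle of
    radius sqrt(82.81) about (6, 5), with x1 >= 13 and x2 >= 0."""
    return (
        -x1 + 13.0 <= 0.0
        and -(x1 - 5.0) ** 2 - (x2 - 5.0) ** 2 + 100.0 <= 0.0
        and (x1 - 6.0) ** 2 + (x2 - 5.0) ** 2 - 82.81 <= 0.0
        and x2 >= 0.0
    )
```

Because the region is the thin overlap of two nearly coincident circular boundaries, uniformly sampled points almost never land inside it, which is what makes the problem such a demanding test of cuboid placement.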
Figure 5.9: The crescent shaped feasible region of Subrahmanyam’s problem [Problem 2-12-0-4, Appendix A]. (Axes: x1 from 13.0 to 16.0, x2 from 0.0 to 14.0.)

Subrahmanyam’s problem suggests an even simpler test problem. As the only critical ingredient is the crescent shaped feasible region, the objective function can be replaced by a simple linear function with a mild gradient. It appears that, with its elegant simplicity, this modified version of Subrahmanyam’s problem might play a similar role for level set methods in general as that played by Rosenbrock’s valley problem for gradient search methods. A sample objective function for a modified Subrahmanyam’s problem can be stated as

f(x) = 245x1 + 4292x2.

Another particular difficulty arises with LSP when the shape of an active constraint surface is similar to the shape defined by the objective function surface in the vicinity of the global optimum value. An elementary example of this occurs in linear programming when the objective function is parallel to an active constraint. There, all points on the line joining the two neighbouring optimal vertices constitute a set of global optima. When solving nonlinear problems having this characteristic, the cuboid volume reduction between consecutive iterations can, from a practical point of view, cease. But the volume of the level set inevitably decreases at every iteration, even though the magnitude of this decrease may be small. Thus the ratio of the acceptance set volume to the cuboid volume gets progressively smaller with each iteration. As this ratio declines, the chances of generating acceptable points decrease, and the search becomes increasingly inefficient.

Another situation where a progressive reduction of the acceptance set volume occurs while the volume of the cuboid remains almost constant is shown in Figure 5.10 for an unconstrained problem. The figure shows that the cuboid volume, which corresponds to area in this two dimensional example, remains nearly the same at iterations k and k + 1, while the volume of the acceptance set is substantially reduced. In this case the efficiency of generating acceptable points at the k + 1 iteration declines in proportion to the acceptance set volume. Such a situation can, however, often be detected visually in the xi vs. xj displays and remedied by partitioning.

Figure 5.10: Acceptance set volume change after an iteration while the cuboid remains constant. ((a) acceptance set at iteration k; (b) acceptance set at iteration k + 1.)

Efficiency problems also arise with LSP when a problem has multiple global optima and the global optimal points are approximately distributed along an elongated path. An example is provided by the problem shown in Chapter 3, Figure 3.5 when solved as a maximization problem, so that the problem becomes

Maximize f(x) = 100 − (x1 + x2 − 10)²

Subject to:
0 ≤ x1 ≤ 10
0 ≤ x2 ≤ 10

The solution is f(x*) = 100, and all points on the line connecting (0, 10) and (10, 0) on the x1-x2 plane are members of the solution set. The final cuboid enveloping these points is the same as the original cuboid.

Under normal circumstances the on-screen plot of the level set value versus the number of iterations, c vs. I, shows a monotone decreasing curve, but in some cases discontinuities in the curve are observed. This indicates that there is an abrupt change in the rate of improvement of the level set value.
This phenomenon may arise when there is a fissure in the objective function surface within the feasible region which gives lower values of the objective function, and no point was sampled in the fissure in any iteration prior to the discontinuity. There could of course be more than one fissure in the objective surface, and consequently any discontinuity in the c vs. I curve should signal caution about the validity of any presumed global solution.

5.6 Observations and recommendations

Prior information about a nonlinear optimization problem, such as the nature of the objective function surface or the shape of the feasible region, can be of help in the search for the global optimum. Realistically though, such a priori information is seldom available, even with engineering problems of modest complexity.

The information provided by LSP as the search progresses, as discussed earlier in this chapter, is simple in nature but gives important clues to the likely nature of the difficulties being encountered. As a result, possible remedies to overcome the difficulty, like relaxation of constraints or adjustments of other LSP parameters, can be suggested as the search progresses and before unnecessary computational effort is expended or the search abandoned.

In most of the test problems, the volume of the cuboid reduces and the density of points (i.e. points per unit volume in the decision space) in the acceptance set increases from one iteration to the next. In order to maintain a more or less constant density of points, the number of confirmed points in the acceptance set can be reduced at every iteration without impairing the convergence properties of LSP. A recommended refinement is that, unless the computational effort for a single function evaluation is very large, a higher initial Nkeep value be used than that suggested in section 5.2.1.
Initially, Nkeep can be raised by about 30% of the recommended value and then be gradually reduced, by one or two, at every iteration, the reduction stopping when Nkeep reaches about two thirds of the recommended (10 × number of variables) value. When no other difficulties are experienced, the total number of function evaluations per iteration is proportional to the number of points in the acceptance set so that, while this strategy improves reliability, the overall computational effort expended is unchanged.

The other heuristic adjustments discussed are modification of the penalty parameter and tightening of constraints following relaxation. These systematic adjustments are intended to increase the efficiency of generating acceptable points within a cuboid. There were instances where problems with feasible regions which were too small, relative to the initial cuboid, to successfully generate any acceptable points could only be solved using these approaches. Examples are [Problem 3-7-0-2, Appendix A] and the ten member truss problem in Chapter 4, section 4.3.3.

In LSP the number of function evaluations expended at each iteration is displayed in the Nf vs. I graph and used to interpret certain phenomena. In general the Nf vs. I plot serves as an indication of search efficiency as the search progresses. A high number of function evaluations at the initial iteration indicates difficulty in generating points in the feasible region. Continuing the search under those circumstances, without modifying the LSP parameters, may result in an overall inefficient search or even an unsuccessful search. Therefore it is recommended that, in response to a high Nf value in the first or early iterations, the search be halted and the technique of constraint relaxation and tightening activated in the LSP program.
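The Nkeep schedule described above can be sketched as follows (a literal reading of the recommendation, with illustrative names):

```python
def nkeep_schedule(n_vars, iteration, step=2):
    """Nkeep schedule: start about 30% above the usual 10 * n_vars,
    shrink by one or two each iteration, and hold at two thirds of
    10 * n_vars thereafter (sketch of the recommended refinement)."""
    base = 10 * n_vars                 # recommended value, section 5.2.1
    start = round(1.3 * base)          # raised initial value
    floor = round(2 * base / 3)        # two thirds of the recommended value
    return max(start - step * iteration, floor)
```

For a two variable problem this starts at 26 confirmed points and settles at 13, roughly holding the point density constant as the cuboid shrinks.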
The fact that the need for such modification is usually indicated quite clearly by the output from the first iteration is of significant practical value, as it limits wasteful expenditure of computational effort.

When the plot of the level set value versus iterations, c vs. I, shows a discontinuity at some stage of the search, the discontinuity can usually be attributed to the detection of a narrow and deep fissure in the objective surface in the region containing the global optimum point. No point has been sampled in the fissure prior to the c vs. I curve discontinuity, but the acceptance set (and so also the cuboid) still includes the global optimum. The subsequent generation of a single point in the fissure, with a substantially lower objective function value, triggers the discontinuity in the rate of reduction in c over the current and following iterations. When such a phenomenon is observed, Nkeep might be raised, say by about 50%, and LSP rerun so that the chance of generating points in any similar fissures, if they exist, is increased. The increase in Nkeep will help to increase the chance of converging on the global optimum, but at the expense of extra computational effort.

In some cases, and especially with low dimension problems, the xi vs. xj scatter diagrams of points in the acceptance set clearly indicate the existence of more than one cluster of points. In that case LSP can be halted and separate LSP searches initiated within cuboids bounding the individual subregions indicated by the clusters. The bounds of the subregions can be estimated visually from the scatter diagrams. Some additional graphical identification techniques are discussed in Chapter 7, section 7.5.2.

Finally it should be emphasized that, for all of the parameter adjustments discussed, much of the triggering information concerns just c, Nf and I.
The magnitude of c is related directly to the problem being solved, while Nf and I are easily understood quantities related to the search. The influence of these quantities on the search strategy is therefore not likely to be difficult for an engineering practitioner, as opposed to the numerical analysis specialist, to understand. The combined influence of a complicated set of constraints and a complicated objective function surface on the LSP search, even when a high number of dimensions is involved, can be interpreted from the set of two dimensional screen plots discussed in sections 5.1.1 to 5.1.3 far more easily than one might expect. In addition, the two dimensional xi vs. xj scatter diagrams have the potential to provide an engineering practitioner with useful new knowledge concerning the topology of his design problem. The convergence criterion can be judged and adjusted in the context of the specific problem being solved and in light of the progress indicated in these plots. The plots also provide important clues concerning the modification of certain LSP parameters which can enhance computational efficiency and reliability while the search is in progress.

Chapter 6

EVALUATION OF NLP PERFORMANCE

During the course of this research into the implementation of a level set optimization scheme it became evident that it was evolving into an optimization tool with many features which might be considered attractive. At the same time it was recognized that, with the large number of solution points needing to be generated at each iteration, LSP might be considered computationally extravagant when compared with most gradient methods as well as certain direct search methods.
Some kind of objective evaluation of LSP’s overall performance and characteristics seemed desirable, and the literature was searched for a suitable evaluation procedure.

There are a growing number of nonlinear optimization methods and many have been developed within very specialized fields of application. Diverse testing procedures and evaluation criteria have been presented, but the problem remains of determining which test problems and evaluation criteria are relevant to an individual user's needs. While some users are concerned primarily with speed and accuracy of convergence, others are more interested in reliably obtaining the true global solution.

No established universal criteria set for evaluating nonlinear optimization methods was found. As well, most of the test problems which appear in the literature, and have been used for evaluating performance, were developed to demonstrate the special strengths of one specific NLP method.

The performance criteria which have been offered in the literature are geared almost entirely to evaluating local optimization methods, the reason being that local optimization methods are by far the most prevalent. Since very few NLP methods have been developed for global optimization, evaluation criteria for global optimization methods are rare. Unfortunately, the criteria sets which have been proposed for local NLP methods do not consider some of the important features of global optimization techniques and the pitfalls of global optimization problems.

In this chapter some of the existing criteria for assessing global optimization methods are reviewed and a revised set proposed which, in particular, attempts to address the current needs of engineering practitioners.
An assessment of LSP’s performance under the recommended performance criteria set is also demonstrated in the last section of this chapter.

6.1 Existing evaluation criteria

Most of the existing performance indicators or performance criteria focus on the following factors: efficiency, expressed as the CPU time required to obtain a solution; the number of function evaluations for solution; and the numerical accuracy of the final results. The ability of a particular method to solve a wide variety of optimization problems is seldom considered and quantitatively measured, since in the past efficiency has been the dominant consideration [Reklaitis et al., 1983]. Sandgren set as a criterion the ability to solve a large number of problems in a reasonable amount of time [Sandgren, 1980]. He ranked the algorithms on the basis of the number of problems solved within specified CPU time limits. The limits were based on a fraction of the average CPU time for all algorithms to solve each of a set of problems.

Most of the criteria sets to date have been tailored to a specific algorithm. Regardless of the type of test problems used, the following performance indicators have consistently appeared in the literature and are therefore considered to be the most general.

• Efficiency (CPU time). A measure of the central processor time for termination of a successful search for a specific problem with a specified degree of precision. This indicator is highly dependent on the type of computer used and the coding of the algorithm.

• Robustness. The ability to solve a large variety of problems in a reasonable time with a specified precision.

• Number of function evaluations. A count of how many times the objective function and/or constraint functions are called during execution.
There is not even consistent agreement as to how the number of function evaluations required to find a solution should be counted, so the numbers quoted are often not comparable. This is an alternative measure of efficiency which is independent of the type of computer used in the test.

• Number of iterations. The number of iterations to reach convergence for iterative algorithms. This indicator is not influenced by the computer used, but it is not necessarily comparable between different algorithms.

• Basic operation count. How many basic computer numerical operations (or flops) are performed. This will be influenced by the code and may even depend on the type of computer used.

• Numerical accuracy. The deviation of the attainable solutions from the “true” or analytical solution. With dimensionally large and complex test problems the true solution may not be known.

• User friendliness. Simplicity, convenience of input format, and ease of understanding of the output both during the search and at termination. A crucial component for practitioners, usually overlooked, is the time required by an occasional user to familiarize himself with the theoretical basis of the methodology and (re)acquire implementation skills.

• Reliability. The ratio of the number of successful solutions found to the total number of problems attempted. Wrong solutions involve convergence on a non-optimal point, violation of constraints, failure to meet the convergence criterion due to excessive computation time, or search termination because of computational fatal errors or numerical overflow.

• Problem dimensional capability.
Expressed as the maximum number of variables and equality and inequality constraints which can be handled without substantially reducing the performance indicators, while the computational effort required remains within acceptable or feasible bounds.

• Complexity. The ability to handle ill conditioned, indefinite and degenerate problems.

• Sensitivity to starting point. The effect of the choice of starting point on the success of the search.

Most attempts to combine the above criteria have been purely qualitative. Schittkowski produced one of the rare attempts to give a quantitative interpretation to the relative weight of each criterion [Schittkowski, 1980]. He adopted nine criteria and assigned the three sets of weighting factors given in Table 6.1 below. The choice of the criteria set depended on the relative importance of the various criteria to three distinct types of users. The actual nature of these types of users was not described, but the emphasis on ease of use for type II users suggests that this is most applicable to the non-specialist.

                                                Alternate Weight
Performance criteria                            I     II    III
Efficiency (speed)                             .32   .18   .14
Reliability                                    .23   .18   .36
Global convergence                             .08   .08   .20
Ability to solve degenerate problems           .05   .03   .03
Ability to solve ill conditioned problems      .05   .06   .03
Ability to solve indefinite problems           .03   .03   .03
Sensitivity to slight problem variation        .03   .03   .06
Sensitivity to position of the starting point  .07   .06   .06
Ease of use                                    .14   .35   .09

Table 6.1: Schittkowski’s NLP performance criteria

6.2 Limitations of Schittkowski’s performance criteria

A principal shortcoming of Schittkowski’s performance criteria is that they do not address some of the currently important issues which bear upon the appeal and practicability of nonlinear optimization to the practising engineer, while considering others which are of little relevance.
Limitations also arise from the fact that the criteria sets were designed to evaluate local optimization methods with continuous functions only. Therefore these sets of criteria and weights need to be extended, and the weighting of each criterion needs to be adjusted, to reflect contemporary needs and economic realities. Schittkowski’s criteria weights were proposed when mainframe computers were the only computing resource and the cost of computation, which was directly proportional to CPU time, was the dominant expense. The development of microcomputers has substantially lowered the cost of computation. Furthermore, the marginal cost of computing on microcomputers can approach zero on machines which primarily serve other purposes and are severely under-utilized over a 24 hour day. Thus the weighting of efficiency should be considerably lower today.

6.3 Recommended set of evaluation criteria

The purpose of a criteria set should be to serve as an aid to identifying the best optimization method for a particular class of user. Therefore, as many of the relevant practical considerations as possible should be addressed by the criteria set. The factors considered important in this thesis for NLP performance evaluation are as follows.

Reliability:
One of the most important qualities of an optimization method is its ability to reach a successful solution. A solution is successful when the final solution is achieved without numerical failure and without any violation of the constraints. Reliability is defined as the ratio of the number of successful solutions obtained to the total number of problems attempted. High reliability makes a method applicable, with confidence, to the widest possible range of problems; hence a high weight is given to reliability in the set of criteria.

Global convergence:
Many civil engineering optimization problems are nonlinear in their formulation, which can lead to multiple local optima.
Since the difference between local and global optimal solutions can be considerable, there is always the need to identify the global optimum. Therefore, a considerable weight is assigned to the capacity to guarantee finding the true global solution.

Multiple global solutions:
A practitioner would definitely be interested in identifying all possible global optimal points, or at least being given some indication of their possible existence. These points may give designers and decision makers the flexibility to choose from significantly differing alternate optimal decisions. Consequently, the ability to identify multiple global optima is desirable and should have high weight amongst the performance criteria.

Near optimal solutions:
In many cases nonlinear problems have an undulating objective function surface which results in multiple local optima. All of the local optima may not necessarily be of great interest to a designer, but those local points which produce objective function values close to the global optimum can, from a practical standpoint, be of equal interest to the global optimum. Therefore, a considerable weight should be assigned to near optimal solution identification.

Discrete variables:
Many civil engineering problems involve discrete variables as well as mixtures of discrete and continuous variables, so the capacity to handle integer variables should be given some weight.

Ease of use:
Ease of use and interpretation of the output are major factors in choosing and using an optimization method. The general NLP user cannot be expected to know the meaning of all of the terminology used in numerical analysis. Terms like Kuhn-Tucker conditions and Hessian matrix are often alien to him, and there is a natural reluctance to adopt a method if the important features of the output data are not immediately understood. A friendly input-output format in a language the designer can understand is appealing.
In practice, ease of use is related to expenditures of an engineer's time and hence to cost of implementation. This includes the time an engineer needs to learn and familiarize himself with the theory and implementation of the method and in the interpretation of the results. In this thesis, the highest weight in the performance criteria set is dedicated to user friendliness.

Speed of convergence:

As discussed in section 6.1, speed of convergence should no longer be considered a major factor. It used to be important when the cost of computation was dominated by CPU time on a mainframe computer. The fact that some NLP methods can be run on personal computers has changed the relationship between cost and computation time. Microcomputers, which are often under-utilized, can be left to run for the whole day, and the computational cost is then little more than the electrical power used. Moreover, the speed of computing has continued to increase. This has further diminished the importance of speed or efficiency as an indicator of NLP method performance. However, the number of function evaluations, which does not depend on the type of machine used, can serve as a relative indication of speed of convergence.

After considering the above factors, a new proposal for an NLP performance evaluation criteria set, along similar lines to those originally proposed by Schittkowski, is given in Table 6.2. This table is an extended version of Table 6.1. The criteria set and weights recommended are based on personal experience gained in using a variety of NLP techniques in this research.
Each figure in the last column indicates the author's view of the relative importance of the various criteria for the present day needs of a practitioner engineer doing only occasional optimization.

                                                Mainframe, 1980       PC, 1993
  Performance criteria                          I      II     III     Weight
  Efficiency (speed)                            .32    .18    .14     .03
  Reliability                                   .23    .18    .36     .21
  Global convergence                            .08    .08    .20     .15
  Ability to solve degenerate problems          .05    .03    .03     )
  Ability to solve ill conditioned problems     .05    .06    .03     ) of negligible
  Ability to solve indefinite problems          .03    .03    .03     ) importance
  Sensitivity to slight problem variation       .03    .03    .06     )
  Sensitivity to position of the starting point .07    .06    .06     .03
  Ease of use                                   .14    .35    .09     .25
  Integer handling                              —      —      —       .12
  Near optimal solution identification          —      —      —       .08
  Multiple optima identification                —      —      —       .13

            Table 6.2: Recommended NLP performance criteria

6.4 Evaluation example: LSP versus GRG2

The performance of LSP under the proposed criteria set was not investigated in any systematic or formal way over the more than 200 test problems investigated in this research. Only a limited example of performance evaluation is given here, comparing the performance of LSP against a leading gradient method, GRG2, using the proposed criteria set. A single test problem [Problem 4-12-1-5, Appendix A] is used in this example. This test problem was used by Schittkowski to evaluate the performance of a variety of NLP methods including GRG2 [Schittkowski, 1980]. In Schittkowski's work, the performance under each of his criteria was evaluated on the basis of 10 different runs. The same problem was run ten times using LSP. The results showed that all of the LSP runs converged at a single global optimum. The performances and scores of the two methods derived from the test problem are given in Table 6.3.
The performance for each criterion is assigned values between 0.0 and 1.0, and scores are calculated as (weight × performance). The weights adopted here are those cited in Table 6.2, section 6.3. The total score, which can take a value between 0.0 and 1.0, is a comparative indicator of the overall performances of the two methods.

In Table 6.3, a performance value of 0.0 is assigned for efficiency to LSP and a value of 1.0 to GRG2. This is because the number of function evaluations to reach search termination with LSP is quite high compared to that reported for GRG2. On the other hand, it is assumed that the time an engineer needs to acquire a working knowledge of the gradient method and interpret its output is about four times the time required with LSP. Therefore, the performance values for ease of use are 1.0 and 0.25 for LSP and GRG2 respectively. The performance values for reliability and global convergence for GRG2 are from [Schittkowski, 1980].

                                                     Performance         Score
  Performance criteria                   Weight     GRG2     LSP     GRG2     LSP
  Efficiency (speed)                     0.03       1.00     0.00    0.03     0.00
  Reliability                            0.21       0.867    1.00    0.182    0.21
  Global convergence                     0.15       0.654    1.00    0.098    0.15
  Sensitivity to position of
    the starting point                   0.03       0.00     1.00    0.00     0.03
  Ease of use                            0.25       0.25     1.00    0.065    0.25
  Integer handling                       0.12       —        1.00    —        0.12
  Near optimal solution identification   0.08       —        1.00    —        0.08
  Multiple optima identification         0.13       —        1.00    —        0.13
  Total score                                                        0.435    0.97

       Table 6.3: Comparison of GRG2 and LSP using proposed criteria set

In general, with the proposed criteria set and weighting scheme, LSP showed superior performance over GRG2.
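The scoring rule above (score = weight × performance, summed over all criteria) can be sketched in a few lines. This is an illustrative recomputation, not part of the thesis software; the criteria marked "—" for GRG2 in Table 6.3 are treated here as zero contributions.

```python
# Hedged sketch: recomputing total scores as sum(weight * performance),
# using the weights of Table 6.2 and the performance values of Table 6.3.
weights = {
    "efficiency": 0.03, "reliability": 0.21, "global convergence": 0.15,
    "starting point sensitivity": 0.03, "ease of use": 0.25,
    "integer handling": 0.12, "near optimal identification": 0.08,
    "multiple optima identification": 0.13,
}
performance = {
    # Criteria absent here ("—" in Table 6.3) contribute zero for GRG2.
    "GRG2": {"efficiency": 1.00, "reliability": 0.867,
             "global convergence": 0.654, "starting point sensitivity": 0.00,
             "ease of use": 0.25},
    # LSP scored 1.0 on every criterion except efficiency.
    "LSP": {c: (0.00 if c == "efficiency" else 1.00) for c in weights},
}

def total_score(method):
    """Sum the weight-by-performance products over all criteria."""
    perf = performance[method]
    return sum(w * perf.get(criterion, 0.0) for criterion, w in weights.items())
```

With these inputs the LSP total is 0.97, the sum of all weights except the 0.03 assigned to efficiency.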
The main factors for LSP's high performance rating are as follows.

• It is particularly easy to understand the output from LSP, as it is expressed almost entirely in terms of the problem variables and their values (see Chapter 7 on LSP computer implementation).

• Almost all attempted problems were solved to a reasonable accuracy with LSP, but GRG2 failed to find solutions in about 13.3% of the cases.

• Multiple optima are clearly identified by LSP, while GRG2 identifies only a single solution at each run.

• A set of near optimal points always accompanies the global optimum solution with LSP, but only one local optimum is provided by GRG2 at the end of each search.

Because efficiency still dominates many people's view of the value of an NLP method, it is worth noting that, even though LSP was given a zero score in this category in the above evaluation, the number of function evaluations with LSP was, for most of the cases, of the same order of magnitude as for those solved using GRG2. Only a short time was required to solve these small sized problems. For example, with LSP, the time taken to solve the two variable Rosenbrock function is about 2.86 seconds [Problem 2-7-0-0, Appendix A], a 3 variable fuel allocation problem [Problem 3-14-0-1, Appendix A] took about 17.53 seconds, and about 90 minutes was taken to solve a 15 variable problem with 29 constraints [Problem 15-1-0-29, Appendix A]. These times were on an 80386-33MHz microcomputer and were averaged over five different runs.

Because of the difficulties in finding representative test problems and universal performance measures, a single quantitative assessment of the performance of an NLP method does not appear to be possible.
In addition, ease of use, which is an important factor in determining if an NLP package is ever to be adopted by non-specialist users, is necessarily a subjective issue.

This chapter has provided a reassessment of Schittkowski's set of qualitative performance criteria to reflect the present day computing environment and the needs of an engineer practitioner. Although no systematic and complete evaluation of a number of NLP packages was attempted in this research, it is evident that LSP has characteristics which make it worthy of serious consideration as a tool for nonlinear optimization.

Chapter 7

LSP COMPUTER IMPLEMENTATION

LSP was written in Quick Basic. This language was adopted for convenience in the early stages of the research with the assumption that a more powerful language would be adopted when the need arose. It transpired that LSP does not place any special programming or computational demands so that, for developmental purposes, there was no incentive to change from Quick Basic. While it eventually became apparent that the graphical interface could be a unique and attractive feature of LSP, it also presented minimal demands on the graphics capabilities of Quick Basic. The only justification for reprogramming would be to maximize computational speed and refine the appearance of the graphics displays.

LSP executes the optimization process iteratively, giving results at the end of every iteration. This intermediate output is available as numerical data and also in the form of graphical displays. When the convergence criterion is satisfied, the program terminates and gives the points in the acceptance set and their corresponding objective function values. All of the intermediate and the final results can also be saved in a file.

Normally LSP does not require any adjustment to its parameters; it performs the whole search automatically and outputs the intermediate and the final results.
Modification of penalty terms, tightening of relaxed constraints and adjustments to the number of points in the acceptance set are usually done without user interference. The predominant user interaction is in the form of requests for specific information during the search, such as displaying particular types of plots or sending information to a printer or to a file, or when the search experiences difficulties.

7.1 Subroutines

The LSP package subroutines are categorized into three main groups. The first group consists of the problem definition routines, which contain the objective and constraint function code and the LSP search parameter values, and also process user instructions for any LSP parameter changes. The second group consists of the routines which perform the optimization. The third group consists of the output routines which generate the graphics displays and numerical solutions at various stages of the search.

LSP lends itself to programming as a set of short subroutines. The purpose of each subroutine is discussed in more detail below. A schematic representation of the general functions and operations of each set of subroutines is also given in Table 7.1.

Problem definition subroutines:

The user is required to furnish information concerning the problem being solved. The subroutines where this information is coded or entered are:

CONSTANTS: All LSP parameters and constants describing the problem size are stored in this subroutine. These include the number of points in an acceptance set, the value of the termination criterion, the value of the clustering criterion, values of the skewness adjustment parameters, and the number of variables in the problem formulation.

INITBOUNDS: The minimum and maximum possible values for each variable are coded into this subroutine, and on execution these values are passed to the main program to define the initial cuboid.
  PROBLEM DEFINITION
  • formulate objective function
  • formulate set of constraints
  • specify variable bounds

  OPTIMIZATION
  Initialization
  • generate feasible points
  • evaluate moments
  • specify level set value
  Algorithm
  • check for termination
  • revise level set value
  • discard bad points
  • define current cuboid
  • generate new feasible points

  OUTPUT
  • Display:
    - points in the current cuboid
    - current level set value
    - current cuboid boundaries
    - computational effort expended
    - scatter plots (in both initial and current cuboid)
    - efficiency and search progress curves
  • send information to a printer and/or a file

       Table 7.1: LSP's problem definition and operations

CONSTRAINTS: All constraints are coded in this subroutine. It receives the x vector for a point and returns a binary value indicating feasible or infeasible. Constraint evaluation at a point is terminated when the first violated constraint is detected, so that generally, on average, only half the constraints will need to be checked before rejection occurs.

OBJECTIVE: The objective function is coded in this subroutine. It receives the x vector and returns the value of the objective function.

Optimization subroutines:

This set of subroutines performs the actual optimization task.

INITKEEP: In this subroutine trial points are generated, points are checked for feasibility by calling the CONSTRAINTS subroutine, and the objective function is evaluated by calling the OBJECTIVE subroutine. The number of confirmed points generated is controlled by the value set for Nkeep.

UPDATEMSD: The mean and variances of the objective function values at the points in an acceptance set are evaluated.
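The CONSTRAINTS/OBJECTIVE pair can be sketched as follows. The thesis implementation is in Quick Basic, so this Python version is purely illustrative; the two g(x) ≤ 0 constraints and the objective are made-up examples, chosen only to show the early-exit behaviour described above.

```python
def constraints(x):
    """Return True iff x is feasible. Evaluation stops at the first
    violated constraint, mirroring the early rejection described above.
    The two g(x) <= 0 constraints here are hypothetical examples."""
    gs = (
        lambda x: x[0] + x[1] - 2.0,   # g1(x) <= 0
        lambda x: x[0] ** 2 - x[1],    # g2(x) <= 0
    )
    for g in gs:
        if g(x) > 0.0:                 # first violation found: reject now
            return False
    return True

def objective(x):
    """Objective function value at point x (hypothetical example)."""
    return (x[0] - 1.0) ** 2 + (x[1] - 1.0) ** 2
```

Because the loop returns on the first violation, a badly infeasible point costs only one constraint evaluation, which is the source of the "on average, only half the constraints" observation above.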
A new level set value-c is assigned for the next iteration.

SWAPSKEW: Points with objective function value greater than the new level set value-c are discarded. A new cuboid, which contains those points fulfilling the level set condition, is defined. Modifications of the new cuboid, in the form of stretching and skewness adjustments, are made when the necessary criteria are met.

FILLKEEP: More points are generated in the new cuboid to replace those points discarded at the new level set value-c. The generated points are checked for feasibility and against the new level set value-c before they are accepted.

Output subroutines:

Some of the subroutines give the intermediate and final results as output. The output is given in the form of numerical values and/or in the form of graphical displays. The user has the option to supply the format required for the output.

REVIEWKEEP: This subroutine prints intermediate and final numerical results on the screen. The current level set value-c, the best point x vector found so far and its objective function value, the total number of function evaluations and the modified variance of the points in the acceptance set are displayed after every iteration.

KEEPOUT: This subroutine prints the decision variable values and objective function values of the confirmed points in the acceptance set on the screen at any time requested. This subroutine is activated at any time, while the search is in progress, by pressing a function key.

DATA: This subroutine sends the decision variable values and the objective function values of points in the current acceptance set, the bounds of the current cuboid, the number of function evaluations expended so far and the current level set value-c to a file whenever requested. Data transfer is activated by pressing a function key.

PLOTS: Generates the various types of screen plots discussed in this thesis.
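The UPDATEMSD/SWAPSKEW cycle can be sketched as below. This is a minimal illustration under the assumption, consistent with the moment-based scheme used throughout this thesis, that the new level set value-c is taken as the mean objective value of the current acceptance set; the cuboid redefinition and the stretching and skewness adjustments are omitted.

```python
from statistics import mean, pvariance

def update_and_swap(points, fvals):
    """One UPDATEMSD/SWAPSKEW-style step (minimal sketch; cuboid stretching
    and skewness adjustments omitted). The new level set value-c is taken as
    the mean objective value of the acceptance set, points above the new c
    are discarded, and the variance of the objective values serves as the
    convergence measure (it approaches zero near the optimum)."""
    c = mean(fvals)
    kept = [(x, f) for x, f in zip(points, fvals) if f <= c]
    variance = pvariance(fvals)
    return c, kept, variance
```

Each call lowers c (or leaves it unchanged at convergence), so the surviving points always satisfy the tightened level set condition before FILLKEEP replenishes the set.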
Some of them are prompted by pressing a specific function key, and others are displayed automatically after each iteration. The Nf ~ I (function evaluations versus iteration) and c ~ I (level set value versus iteration) plots are always displayed after every iteration.

SCRGRAB: This subroutine sends graphical information into a specified file by saving the currently displayed screen. It can be invoked either by pressing a function key or automatically after every iteration.

CLUSTER: The clustering analysis subroutines calculate the dendrogram using the confirmed points and display the distance coefficients on the screen. The user is asked to specify the number of clusters extracted on the basis of the dendrogram. Once the clusters are identified, the confirmed points are reclassified into the different clusters and the cuboid bounds for each cluster calculated.

The LSP program framework can readily incorporate auxiliary subroutines which solve major tasks associated with a specific type of problem. For example, the cluster analysis routine, the Hardy Cross relaxation routine for solving pipe flow problems, and frame and structural truss analysis routines have all been incorporated into the LSP program in the form of subroutines when solving test problems.

7.2 Confirmed point generation strategy

A point has to be both feasible and produce an objective function value less than or equal to the level set value-c to enter the acceptance set. Guaranteeing the feasibility of a point before evaluating the objective function, or guaranteeing the level set condition before checking feasibility, leads to two different strategies of confirmed point generation. These different strategies can affect the efficiency of the search depending upon the nature of the optimization problem being solved. Even though it is not necessary to check point feasibility for unconstrained problems, it is relevant, in the following discussion, to consider both constrained and unconstrained problems.
Constrained problems can yield effectively unconstrained interior solutions as the search progresses. At some stage of the search the current cuboid might remain entirely within the feasible region; consequently, from the LSP perspective, the problem has become unconstrained. Confirmed point generation in such a situation is discussed below under the title 'Interior solutions with multiple optima'. Two distinct situations and their recommended point confirmation strategies are discussed here.

• Interior solutions with multiple optima:

How often trial points generated in the current cuboid violate the level set condition is inversely proportional to the volume ratio of the acceptance set to the current cuboid. In the common situation of global solutions lying at interior stationary points, the rejection of trial points due to violation of the level set condition increases as the search approaches the optimum value. The reason is that the volume of the cuboid remains close to constant while the acceptance set volume reduces to small regions surrounding the optimal points. This situation lowers the efficiency of the search.

The efficiency is therefore often related to the closeness of the level set value-c to the optimum value, but is also affected by the dispersion of the optimal points in the decision space. For example, for a two variable problem with three global optimal points, the efficiency is low if these three points are on a straight line and the line is parallel to one of the variable axes. The efficiency decreases further when the line joining the three points sits on a diagonal of the cuboid.

As an experiment, a problem was chosen with the acceptance set volume at the kth iteration occupying 10% of the current cuboid volume. This meant that, on average, 10 trial points were being generated to get a single confirmed point.
The number of function evaluations necessary to generate 10 confirmed points was then investigated.

The two strategies were examined. In the first case, a feasible point was generated and then the objective function calculated. In the second case, the steps were reversed: a point which fulfils the level set condition was first generated and then its feasibility checked. The same experiment was done again after a few more iterations, where the volume of the acceptance set had been reduced to only 1% of the current cuboid. The number of function evaluations for the two strategies at these two iterations are given in Table 7.2.

  a. When acceptance set is 10% of the cuboid volume

  Strategy 1                 Function        Strategy 2                 Function
                             evaluations                                evaluations
  Check feasibility          10 × 10         Evaluate obj. function     10 × 10
  Evaluate obj. function     10 × 10         Check feasibility          10
  Sum                        200             Sum                        110

  b. When acceptance set is 1% of the cuboid volume

  Strategy 1                 Function        Strategy 2                 Function
                             evaluations                                evaluations
  Check feasibility          10 × 100        Evaluate obj. function     10 × 100
  Evaluate obj. function     10 × 100        Check feasibility          10
  Sum                        2000            Sum                        1010

   Table 7.2: Comparison of Nf for two different point confirmation strategies

The above examples support the general experience with the test problems that, if feasibility is checked before the level set condition, the number of function evaluations required for search termination is about twice the number required for the reverse strategy. This advantage may not necessarily be enjoyed for all problems solved with LSP, but it appears to be the best default strategy.

• Constrained problems - general case:

Inefficiency arises when the acceptance set volume to the cuboid volume ratio is low. The acceptance set for a constrained problem is governed by both the set of constraints and the level set condition.
Which of the two factors, that is either the feasibility or the level set condition, is the most likely to cause rejection of a trial point depends upon the nature of the objective function and the complexity of the feasible region in a specific problem.

In general, the minimum total number of function evaluations is always achieved by checking the dominant rejecting factor first. In practice this means checking feasibility first in the case of highly (tightly) constrained problems, and evaluating the objective function first for problems whose objective function surface is very irregular. Unfortunately, identifying the dominant factor at any stage of an LSP search is not always an easy task.

Consider an example where, at some stage of the search, the feasible region and the level set occupy 30% and 80% of the current cuboid volume respectively. Assume that the acceptance set, the intersection of the two regions, is 20% of the current cuboid. Table 7.3 shows the number of function evaluations needed, in this example, to generate 10 confirmed points using the two different strategies. The strategy which generates feasible points before evaluating the objective function required fewer function evaluations, since the feasible region volume is smaller than the level set volume.

  Strategy 1                 Function        Strategy 2                 Function
                             evaluations                                evaluations
  Check feasibility          5 × 10          Evaluate obj. function     5 × 10
  Evaluate obj. function     1.5 × 10        Check feasibility          4 × 10
  Sum                        65              Sum                        90

   Table 7.3: Comparison of Nf for different strategies for a constrained problem

Checking the feasibility of a single point might consume significant CPU time if the constraints are many and complicated. For example, a 4 variable problem with 3 inequality constraint functions [Problem 4-1-0-3, Appendix A] was used to assess the typical time spent checking feasibility and evaluating the objective function.
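The counting behind Tables 7.2 and 7.3 generalizes readily. The following sketch (not from the thesis; the function name and signature are illustrative) computes the expected number of function evaluations for either ordering from the volume fractions involved, under the thesis convention that each constraint-set check and each objective evaluation counts as one function evaluation.

```python
def expected_evaluations(m, p_feasible, p_level, p_accept, feasibility_first):
    """Expected function evaluations to confirm m points. p_feasible,
    p_level and p_accept are the volume fractions of the feasible region,
    the level set, and their intersection (the acceptance set) within the
    current cuboid. Whichever check runs first is applied to every trial
    point; the second check is applied only to survivors of the first."""
    trials = m / p_accept                      # average trial points needed
    if feasibility_first:
        # every trial hits CONSTRAINTS; feasible survivors hit OBJECTIVE
        return trials + trials * p_feasible
    # every trial hits OBJECTIVE; level-set survivors hit CONSTRAINTS
    return trials + trials * p_level
```

With m = 10, p_feasible = 0.3, p_level = 0.8 and p_accept = 0.2 this reproduces the 65 and 90 evaluations of Table 7.3, and with p_feasible = 1.0 (interior case) and p_level = p_accept = 0.1 it reproduces the 200 and 110 of Table 7.2(a): the cheaper ordering is always the one whose check rejects more trial points.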
The average time taken to calculate the constraint functions at a feasible point was about 2.64 times the time required to evaluate the objective function. However, no matter how long the computation takes, it is simply counted as a single function evaluation in this research (a rationale for this was discussed in Chapter 5, section 5.1.2). Therefore the relationship between CPU time and number of function evaluations is not always linear.

The same test problem was used to compare the total number of function evaluations needed to meet the convergence criterion with both strategies, i.e. checking feasibility before evaluating the objective function, or the reverse. The problem was run 10 times for each case. The results showed that the strategy which evaluated the objective function prior to checking feasibility required 30.7% more function evaluations than the strategy which generated feasible points before evaluating the objective function.

In the discussion above, the superiority of one strategy over the other is based purely on the total number of function evaluations. From the experience gained with the test problems, it was generally found better to check feasibility before evaluating the objective function. This was basically due to the computational effort required to evaluate a set of constraint functions for most of the test problems. It is possible to switch from one strategy to the other without difficulty, so that a choice can be made after appraising the relative computational complexity of the various functions involved in a specific problem.

7.3 Intermediate results presentation

After each iteration, the new and improved level set value-c, the best point observed so far, and the bounds of a new cuboid are all displayed on the screen. The results at any iteration will always show some improvement over the previous iteration.
Even if the program is interrupted in the middle of the search, the effort expended is not wasted, as the intermediate solution can be used to provide initial cuboid data for a subsequent renewed attempt to find the solution.

There may be justification to modify the values of some LSP parameters on the basis of the display indicating search performance trends. If it is necessary to make any changes, the search can be paused, the necessary changes made and the search then resumed.

All points in the acceptance set, the objective function value at each confirmed point, the level set value-c and the number of function evaluations at every iteration are stored in the computer memory. At any stage of the search the results provided at every iteration, from the beginning of the search up to the current stage, can be displayed and reviewed without stopping the search.

A variety of information is given by the plots at every iteration. The set of scatter diagrams of points on a two variable plane is displayed by pressing a function key. The scatter diagrams are plotted with a choice of axis lengths: either the initial variable bounds or, alternatively, the current cuboid lengths. The c ~ I and Nf ~ I plots are automatically displayed after every iteration.

If the intermediate results, including the plots, give an indication that there are multiple optima and efficiency is low, it is better to stop the search and divide the current cuboid into subregions so that the search can be carried out in each subregion separately. Such user intervention increases the overall efficiency. LSP could be programmed to automatically make the necessary changes without the need for sophisticated detection routines.
It must be stressed that the need for such adjustments can be recognized by even a moderately experienced user.

7.4 Final output

The final output from LSP gives the global optimal point(s) and their corresponding objective function values, and a set of alternate near-optimal points whose objective function values do not exceed the final level set value-c. This information is available both numerically and graphically. Certain performance indicators, like the total number of function evaluations and the number of iterations, are given as supplementary information.

7.5 Alternative presentation of scatter plots

The most informative graphical representation of intermediate and final results is in the form of the scatter plots of confirmed points in the xi ~ xj planes. When dealing with problems involving more than two variables, a set of scatter plots is required to represent the points in every plane of the variables. Various ways of presenting these plots have been assessed during the course of this research. Two methods of scatter plot presentation on a computer screen are discussed in the next subsections under static and dynamic plots. One of the methods utilizes the static plot, where no logical connection between points is displayed. The alternative dynamic method allows removal of some of the points from the display, a change to the scale of a plot, or labelling of points having something in common (for example, a point identified in one plot showing up in the same colour in all other displayed plots).

7.5.1 Static plots

For a problem involving n variables there will be n(n−1)/2 scatter plots, each showing one xi ~ xj plane. These plots are ideally juxtaposed in a single screen frame. A maximum of 14 plots has been displayed at any one time on a standard 14″ VGA display.
For higher dimensional problems requiring more than 14 plots, the user is either expected to choose a subset of the planes of interest or use a higher resolution display. A page display feature could easily be implemented if necessary.

In some cases there may be a practical, problem related, need for identifying a specific confirmed point, or a subset of confirmed points, in one of the scatter plots and tracing them in the other plots. One example arises when one is interested in tracking solution points which lie within a preferred area of the search domain. A second need, which might arise when the level set value-c has fallen to an acceptably low value, is to track points which were once confirmed in the acceptance set and were subsequently discarded in the search process when they did not fulfil the current level set condition.

Tracking a single point with some particular numerical property which can be detected within the computer program was implemented. An example of this was tracking of the current best point in the acceptance set. A special display symbol (+) was assigned to the current best point, and this can be seen in Figure 8.3, Chapter 8, for a six variable problem where it is discussed in connection with sensitivity analysis. Experience with this feature and the many test problems showed that, at any stage of the search, the current best point rarely lies close to the centre of the cluster in any of the xi ~ xj plots.

The greatest insights into the problem being solved can be provided by the scatter plots of confirmed points at any stage of the LSP search. For example, when dealing with problems involving more than two variables, the user might have a preference for solution points within a certain range for one or two of the key variables.
His interest will then be to identify the corresponding points in the other planes.

The implementation of point identification schemes becomes more complicated when there is an interest in identifying groups of points at the same time. Some multiple point detection schemes were implemented to investigate the kinds of information which could be inferred from visually identifying points which had been discarded at previous iterations. Points discarded in each of the last 5 iterations, identified by iteration, and those surviving in the current acceptance set were displayed on the same plot. Points discarded at each iteration were thus considered as members of a single group.

Several techniques were tried to visually distinguish between points in each group. The first attempt used different colours for different point groups, and the second used different symbols for points in different groups. Neither attempt gave meaningful results after the first few iterations. In the latter stages of an LSP search, points from different groups overlap so that their differences become indistinct.

A slightly modified approach, where only the points discarded at one iteration are displayed at any time, was also implemented. The set of plots for a single group was displayed in a single frame. Each frame remained on the screen for only a short time (about a second) and the display cycled through all the frames. This animated approach did not appear to have any particular value.

To overcome the difficulties mentioned above, and to exploit the benefits of scatter plots, it is better to use dynamic graphs instead of static graphs. In dynamic graphics, a viewer can see changes which have occurred over time. This improved approach to presenting scatter plots is explained in the next section.
7.5.2 Dynamic plots - Alternagraphics

Alternagraphics [Becker et al., 1987] is a technique which allows the viewer to interactively select some points from a set of scatter points and then view the selected points displayed in any number of corresponding two dimensional displays. Another feature is that the displays can be modified rapidly as the user explores the data set, so that primitive animation effects are achieved. The purpose of alternagraphics is to convey information which is not easily perceived in static displays. It is, in effect, just another rather specialized form of computer graphical user interface. Because LSP generates a large set of solution points at every iteration as well as in the final solution, alternagraphics is far more exploitable with LSP than with any of the more conventional nonlinear optimization techniques. Alternagraphics can be implemented in three different ways, provided that the points plotted on the screen may be categorized into various groups depending on their importance to the viewer or other numerically identifiable characteristics.

The first approach displays points belonging to successive acceptance sets for only a short time, automatically cycling through the most recent sequence of iterations. The second approach displays all data points on the screen simultaneously in a number of two dimensional plots, but highlights the points belonging to a particular set. Points in a series of sets are then highlighted one after the other. The third approach provides the user with the capability of turning any set ON or OFF rapidly. When a set is turned ON, either its points are highlighted and other points remain unchanged, or only those points turned ON are shown and all the other points are erased from the screen.
The ON or OFF procedure can be initiated from a displayed menu, by pressing a function key or by clicking a button on the mouse to activate the changes.

The third approach mentioned above, where the user has the freedom to interactively choose point sets, appears to offer the greatest benefits in engineering practice. For example, in solving engineering problems with LSP, the engineer might be interested in design solutions which can be achieved within a specific value range of one of the design variables. Using the scatter plot of points in any of the two dimensional displays including the variable of interest, the points which lie in that specific range can be selected and turned ON. The display of the corresponding (i.e. same) points in all the other planes will then be highlighted automatically. This helps the engineer to visualize the distribution of the selected solution points in the other design variable domains.

Some of the main features of dynamic graphics discussed extensively in [Becker et al., 1987] are reviewed here to further elaborate on their use and implementation with LSP.

Deletion:

When a single point distorts the scale of a scatter plot, the outlier can be deleted from the plot by simply clicking the mouse cursor at that particular point. Once the outlier is removed, the plot is instantaneously re-scaled and the points appear on the screen with the new axes ranges. Points which had been crammed in a small region are now well dispersed, improving the resolution of the plot. Figure 7.1 demonstrates the improvement of plot resolution by removing the outlier at the upper right corner of Figure 7.1(a).

Linking:

Suppose there are scatter plots of the N points in an acceptance set of an n variable problem, one plot for each xi - xj variable pair. Linking visually connects the corresponding points in the different scatter plots.
Certain points are chosen in one of the scatter plots by clicking the mouse on each point, and the linking procedure then highlights them on all the other plots. Consider, for example, in a 4 variable problem, the two scatter plots in the planes of x1 - x2 and x3 - x4. Suppose we want to identify the same points in these two scatter plots. Some points are first chosen in the x1 - x2 plane; the same points are then automatically identified in the x3 - x4 plane by highlighting or a similar labelling technique. Joining points therefore means visually linking the point (x1, x2)^k on the first plot to (x3, x4)^k on the second plot, where k indicates the point number. Linking of points is illustrated in Figure 7.2, where the highlighted points in Figure 7.2(a) and (b) represent the same four points in two different planes. Intermediate results of an LSP run for a 4 variable problem [Problem 4-1-0-3, Appendix A] were used to draw the points in Figure 7.2. In this instance the highlighted points correspond to the current 4 best points, which had been selected automatically by the LSP software.

Linking is also a feature of static graphics display and was described in section 7.5.1.

Figure 7.1: Deletion of a point in dynamic graphics. (a) Points crammed in one area; (b) improved resolution.

Figure 7.2: Linking of points for a 4 variable problem [Problem 4-1-0-3].

Brushing:

This is a dynamic method in which the user moves a small adjustable rectangle around the screen, with a mouse, in order to identify points within the rectangle. The rectangle is called a brush and each two dimensional plot is called a panel.
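As a minimal sketch of this idea (function name and data invented, not taken from the thesis software), brush selection amounts to collecting the indices of points whose coordinates on the active panel fall inside the rectangle; the shared indices are then what the other panels highlight.

```python
# Illustrative sketch of brushing: the brush is an axis-aligned rectangle on
# the active panel; points falling inside it are identified by index so the
# same points can be highlighted on every other panel.

def brush_select(points, x_var, y_var, x_range, y_range):
    """Return indices of points whose (x_var, y_var) coordinates lie
    inside the brush rectangle given by x_range and y_range."""
    xmin, xmax = x_range
    ymin, ymax = y_range
    return [i for i, p in enumerate(points)
            if xmin <= p[x_var] <= xmax and ymin <= p[y_var] <= ymax]

# Four 3-variable sample points; brush placed on the (x0, x1) panel.
pts = [(0.2, 0.9, 1.0), (0.5, 0.4, 2.0), (0.6, 0.5, 0.5), (1.5, 1.5, 1.5)]
inside = brush_select(pts, 0, 1, (0.4, 0.7), (0.3, 0.6))
print(inside)   # -> [1, 2]
```

A long, narrow `x_range` or `y_range` gives the slender brush mentioned below for examining one variable's effect on the others.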
This technique is used for high dimensional problems which need more than one panel to show points in all planes of the feasible region.

There are two ways of displaying points which lie inside the brush. As the brush moves around the active panel, points within the brush are highlighted in all of the other panels. The second method is performed with all data points plotted on the active panel, but only the points in the brush are shown in the other panels. Figure 7.3 is adapted from Becker to demonstrate the second approach of brushing, where only points in the brush are displayed in all the panels other than the active panel [Becker et al., 1987]. In this figure, ozone, radiation, temperature and wind speed designate 4 different variables.

Figure 7.3: Points in the brush highlighted in all the panels [Becker et al., 1987].

The brushing technique can also be used to delete points by pressing one of the mouse's buttons. Points inside the brush on the active panel are deleted, temporarily, along with the corresponding points on the other panels.

The shape and size of the brush can be changed to meet the requirements of the user. When there is a need to examine the effect of only one variable on the others, the brush can be adjusted to produce a long and narrow rectangle. Such a slender brush can also be used to give information on the nonlinear dependence of one variable on the others.

Alternagraphics enhances understanding of the nature of the problem being solved. Implementation of this information in engineering has yet to be fully investigated but, in conjunction with LSP, it appears to have considerable potential for supporting practical engineering applications of nonlinear optimization.

Chapter 8

SENSITIVITY ANALYSIS

Sensitivity analysis is dependent upon numerical information which can be derived directly from the optimal solution and obtained with minimal computational effort.
For the established optimization methods this means that the sensitivity output is primarily associated with changes in the optimal solution due to variation in the numerical value of coefficients in the problem formulation, both in the objective function and in the right hand side constant coefficients. This information can be important when the coefficients in a problem formulation are poorly defined or can in fact be adjusted.

In LSP, useful sensitivity information which is obtained without any additional analysis is different from that discussed above. It utilizes the acceptance sets with level set values close to the global optimal value to indicate how near optimal solution points are distributed in the search domain. Here the emphasis is not on the coefficients but on the values of the decision variables and the value of the objective function. The sensitivity information therefore indicates how much one can deviate from the optimal point(s) without the objective function departing from its global optimal value by more than some prescribed amount. Revealing the distribution of multiple global optima in the decision domain might also be considered as a part of this kind of sensitivity analysis.

LSP sensitivity information is undoubtedly of practical value as it focuses on the influence of those decision, design, or activity variables over which the engineer exercises actual control and whose values he can therefore adjust. Furthermore, in some important applications of nonlinear optimization, conventional sensitivity analysis results which focus on variation of coefficient values are of no practical value. For example, in model fitting the decision variables are the parameters of the model and the coefficients are the experimental data.

There are a number of options for a model fitting criterion.
Consider the case of a simple linear model where the optimization problem is to minimize the sum of the squares of deviations between observed and model generated estimates of a dependent variable z. There are two independent variables x and y, and m sets of observations of x, y and z. Each observation is identified by its subscript i. The problem is expressed as follows

Minimize  Σ (ẑi - zi)²   for i = 1, ..., m

where the model generated estimate ẑ is expressed as

ẑ = a * x + b * y

and the constraints are

a * x1 + b * y1 = ẑ1
a * x2 + b * y2 = ẑ2
a * x3 + b * y3 = ẑ3
...
a * xm + b * ym = ẑm

The two model parameters, a and b, must be established on the basis of the m observations of the independent and dependent variables.

In this problem x, y and z represent the already realized experimental data but also serve as coefficients in the constraint set, whereas a and b are the decision variables, i.e. the unknown model parameters whose values are being established. Because the coefficients in the optimization formulation are observed data sets, a sensitivity analysis which is concerned with changing the coefficients has little meaning. In contrast, an indication of the sensitivity of the objective function value to the values of a and b in the region of their optimal values could provide some useful insight as to the robustness of the estimates of a and b. Sensitivity information of this type is readily provided by LSP.

The remainder of this chapter attempts to describe the generally interpretive nature of LSP sensitivity analysis, which is derived almost entirely from the graphical outputs. The interpretation of the graphical output is influenced by the finite nature of the point sampling and by topological characteristics of the various functions involved in the problem being solved.
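The model fitting formulation above can be made concrete with a small numerical sketch. The data and function name below are invented for illustration; the parameters a and b are recovered by solving the normal equations of the least squares problem rather than by LSP itself.

```python
# Hedged numerical sketch of the model-fitting example: estimate the
# parameters a and b of z = a*x + b*y by minimizing the sum of squared
# deviations, via the normal equations of the least squares problem.

def fit_ab(xs, ys, zs):
    # Normal equations: a*Sxx + b*Sxy = Sxz and a*Sxy + b*Syy = Syz,
    # solved here by Cramer's rule.
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxz = sum(x * z for x, z in zip(xs, zs))
    syz = sum(y * z for y, z in zip(ys, zs))
    det = sxx * syy - sxy * sxy
    a = (sxz * syy - syz * sxy) / det
    b = (syz * sxx - sxz * sxy) / det
    return a, b

# Data generated from z = 2x + 3y, so the fit should recover a = 2, b = 3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 0.0, 2.0, 1.0]
zs = [2 * x + 3 * y for x, y in zip(xs, ys)]
a, b = fit_ab(xs, ys, zs)
print(round(a, 6), round(b, 6))   # -> 2.0 3.0
```

In the LSP view of the same problem, a and b would be the two decision variables and the sums of squares surface over the (a, b) plane is what the level sets describe.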
The sensitivity conclusions discussed below may therefore not necessarily be the most appropriate for all cases.

8.1 Sensitivity with LSP

Once the LSP convergence criterion is met, all points in the final acceptance set are global optima or provide near optimal solutions. In cases where these points form a distinct single cluster and the difference between any two points is approaching the numerical precision of the computer, the final acceptance set can be considered to represent a single global optimum point. The size of the final acceptance set is, however, normally determined by the convergence criterion VM, so that in many cases the confirmed points in the final cuboid may be scattered over a larger region. Then the difference between any two confirmed points is not simply governed by numerical precision but reflects a considerable difference between solutions yielding optimal or very near optimal solutions. Large distances between points suggest that the problem has either multiple optima or a very flat objective function surface in the region of the acceptance set.

In spite of the interpretations suggested above, and in the remainder of this chapter, one should exercise caution when associating the cuboid size and the distribution of points in the acceptance set with the number of distinct global optima.
Even though the final cuboid is assumed to contain all globally optimal points, all confirmed points in the final acceptance set may not produce the global optimum value and, with the discrete sampling involved, some globally optimal points may not have been sampled.

The xi - xj scatter plots can be viewed with axes lengths determined either by the variable bounds established prior to running LSP (the initial cuboid) or by the cuboids at the later or final stages of the search.

8.1.1 Sensitivity interpretation of the confirmed points at convergence plotted in the initial cuboid

The plots of the confirmed points at convergence in the initial cuboid provide another sensitivity perspective on the LSP solution. Usually the scale of these plots, and the display resolution, will obscure the details. They do, however, reveal certain characteristics of the optimum solution from the perspective of the engineer and his problem. Because of resolution limitations at this scale, points which are close together will coalesce to a single point in the display, so that the visible points can be assumed to be global optimal points. Problems which have multiple optima often produce distinct scatter plots which can be meaningfully interpreted in various ways. Figures 8.1(a) to (e) are used to demonstrate the interpretation of some typical final scatter plots, within the initial cuboid, for some two variable problems.

Figure 8.1(a): Suggests that there is only a single optimum in the region of interest and the optimum point lies around the centre of the search domain.

Figure 8.1(b): Suggests that there is a single optimum point but two of the variable bounds are probably acting as active constraints. From a practical point of view this
suggests that, if it is possible, a relaxation of the bounds might produce a better result.

Figure 8.1: Plots of confirmed points after the convergence criterion is met.

Figure 8.1(c): The plot clearly suggests the existence of two distinct global optimal points.

Figure 8.1(d): The plot suggests the possible existence of multiple optima, but one of the variables (x1 in this case) has the same, i.e. constant, value at all of the optimal points.

Figure 8.1(e): Indicates that in a specific region of the search domain, the confirmed points can be approximated by a diagonal line, which implies a linear relationship between the two variables in the region of near optimality.

An actual screen dump for a six variable problem [Problem 6-2-0-6, Appendix A] is given in Figure 8.2 to demonstrate the kind of variability that can occur in the appearance of the 14 individual xi - xj plots. The plot was generated with a relatively high value of the convergence criterion to make the plots more distinctive. It shows, for example, that fixing the values of variables x1 and x2 might be justifiable to simplify the optimization problem, as this may not change the optimal solution significantly.

Figure 8.2: Plots of confirmed points at convergence within the initial cuboid for a 6 variable problem (actual screen dump).

8.1.2 Response of the acceptance set to changes in the level set value-c

Once LSP is run and an optimal solution found, there may be value in investigating the sensitivity of the decision variables with respect to the value of the objective function; more specifically, determining to what extent an objective function value which is inferior to the global optimum value permits a wider range of decision variable values. This is done by storing the acceptance sets at all iterations.
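The underlying operation can be sketched in a few lines (data and names invented for illustration): once sample points and their objective values have been stored, the acceptance set for any level set value c is simply the subset with f(x) ≤ c, and raising c slightly above the optimum c* can admit many additional near optimal points.

```python
# Sketch of extracting acceptance sets from stored LSP samples for
# different level set values c.  The sample data are invented.

def acceptance_set(samples, c):
    """samples: list of (point, objective_value) pairs."""
    return [x for x, f in samples if f <= c]

samples = [((0.0, 0.0), 1.00), ((0.1, 0.0), 1.005),
           ((0.9, 0.8), 1.008), ((2.0, 2.0), 1.50)]
c_star = 1.00
print(len(acceptance_set(samples, c_star)))          # -> 1
print(len(acceptance_set(samples, 1.01 * c_star)))   # -> 3
```

Here a 1% increase in c triples the number of admitted points, including one far from the optimum, which is exactly the kind of effect the plots below illustrate.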
Figure 8.3 shows the kind of effect which might be observed when c is raised above c* when the acceptance sets are plotted in the initial cuboid.

Figure 8.3: Plots of confirmed points within the initial cuboid for different c values. (a) c = c*; (b) c > c* (connected level set); (c) c > c* (partitioned level set).

Actual screen dumps of plots for the test problem [Problem 4-6-1-2, Appendix A] are given in Figure 8.4 and represent outputs at two level set values. Figure 8.4(b) shows points in the acceptance set for a level set value of 29.68, which is the global minimum. Figure 8.4(a) shows points in the acceptance set for the same problem but with the level set value-c raised by 1% to 29.96, which coincides with c at iteration 8. As a result many new points are introduced into the acceptance set. Most of the new points are located far from the global optimum point, as shown in Figure 8.4(a). The small increase in the objective function value has permitted a wide range of possible solution points, some of which may offer benefits not measured by the objective function and may therefore be preferred as practical solutions to the problem.

8.2 Other approaches to obtaining sensitivity information

Information similar to that obtained for coefficient values in conventional sensitivity analysis can also be obtained with LSP, though at some additional computational expense. When the sensitivity can be expressed in the form of a gradient at the optimal solution (e.g. Lagrange multipliers), then only a slight perturbation of a coefficient value is necessary to estimate the gradient by finite difference. A small change would almost guarantee that the acceptance set already obtained just a few iterations before convergence would provide an appropriate starting point for solving the revised problem. Thus only a small additional computational effort would be involved for solving for each perturbed coefficient.
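The finite difference idea can be illustrated with a toy problem (the one variable objective and the crude grid "solver" below are invented; they stand in for re-running LSP from a warm start after perturbing one coefficient):

```python
# Hedged sketch of coefficient sensitivity by finite difference: re-solve
# the problem with a slightly perturbed coefficient k and difference the
# two optimal objective values.

def solve(k):
    # Minimize f(x) = (x - 1)**2 + k*x over a coarse grid, standing in for
    # an optimizer warm-started from the unperturbed solution.
    xs = [i * 0.001 for i in range(0, 2001)]
    return min((x - 1.0) ** 2 + k * x for x in xs)

def value_sensitivity(k, dk=1e-3):
    return (solve(k + dk) - solve(k)) / dk

# Analytically the optimal value is k - k**2/4, so d(value)/dk = 1 - k/2;
# at k = 0.5 the finite-difference estimate should be close to 0.75.
print(round(value_sensitivity(0.5), 2))   # -> 0.75
```

With LSP the second solve would start from a stored near-convergence acceptance set, which is why the extra cost per perturbed coefficient is small.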
Figure 8.4: Points in the level set, for c = c* and c = 1.01 * c* [Problem 4-6-1-2] (actual screen dump).

Chapter 9

CONCLUSION

9.1 Introduction

A global search scheme for nonlinear optimization problems based exclusively on level sets was first presented in "Integral global optimization" by [Chew & Zheng, 1988]. It was this work that provided the starting point for the research in this thesis. The following summary of its shortcomings is not intended to question the value of Chew & Zheng's contribution in any way, but to provide a clearer perspective on the contribution of this thesis. As was suggested by the title, they placed considerable emphasis on the theoretical properties of integral expressions for the higher moments of the objective function values associated with solution points in level sets. Much of this theory was found to have little or no bearing on the implementation of a level set based optimization scheme. Although the capability of identifying multiple optima was mentioned, the authors focused on a method for solving problems with a single global optimum only. Thus the potential of the methodology to solve multiple global optima problems, as well as to identify near optimal solutions, was overlooked. A systematic performance assessment over a variety of challenging test problems was not included. The importance of the interpretation of intermediate and final output from a level set search, and the potential for a graphical interface, were not recognized.

Level sets have been used to augment some global optimization schemes. But the view of some of the authorities in that field has been that the use of level sets as the principal search tool is impractical.
The level set method was labelled as being only a "theoretical" iterative scheme in [Horst & Hoang, 1990], and as being "not designed for solving practical problems" in [Torn & Zilinskas, 1988]. A level set based search falls into the class of iterative direct search methods of optimization. Some authors clearly feel that direct search methods are applicable only to low dimensional problems. Edgar & Himmelblau [Edgar & Himmelblau, 1988] suggest that "direct methods ... are not as efficient and robust as many of the modern indirect methods, but for simple two variable problems, they are satisfactory". Also surprising in the light of experience in this thesis is the view that optimization methods based on random sampling are suitable only for low dimensional problems or as devices for generating starting points for more sophisticated methods [Reklaitis et al., 1983].

Computational efficiency has dominated the development of nonlinear optimization methods over the past 30 years. The negative views which have been expressed about direct search methods in general, and level set methods specifically, arise from computational efficiency concerns. As was discussed in Chapter 6 on performance assessment, ease of use is more likely to be the dominant criterion in the future, with computational efficiency only of secondary importance.

The research reported in this thesis set out to investigate the use of level sets in practical nonlinear optimization. As the investigation proceeded, many new, previously unexploited aspects of level set optimization were revealed and incorporated into the evolving implementation scheme. This implementation was eventually given the name Level Set Programming (LSP) as it appears to have the necessary characteristics, including a theoretically sound global convergence criterion, to qualify as a mathematical programming tool.
It uses estimates of only the first two moments of the objective function values at the (discrete) points in the level sets to redefine the search domain at each iteration and to measure convergence on the optimum solution. LSP utilizes approximations of the true level set (or acceptance set in constrained problems) boundaries in the form of cuboids for efficiently generating points in the level sets. The combined effect of the progressively improving level set and the contraction of the search domain makes LSP an efficient and reliable global search procedure.

LSP can handle, without modification, both equality and inequality constraints, and also has provisions to incorporate existing techniques of optimization, such as penalty functions. It also makes use of established cluster analysis techniques to improve the search efficiency for problems involving widely separated multiple global optima.

Unlike many other optimization methods, the intermediate results generated during an LSP search provide a global view of the problem being solved. This is in contrast with the single path view provided by, say, gradient methods. Because a much larger number of point solutions are established at each iteration than is the case with other practical direct search methods, LSP's view of the problem is far more complete than with the established direct search methods.

9.2 Evaluating LSP performance using test problems

A large set of published mathematical and engineering test problems, already solved by a variety of NLP methods, was solved with LSP to evaluate its performance. The test set included unconstrained and constrained problems, continuous and discontinuous functions, and continuous and discrete variables. The dimensions of the problems varied from 1 to 38 variables.
The results showed that solutions found with LSP were generally in agreement with the published results, with some improvement in about 5% of the cases. The only aspect of performance in which other methods cited in the literature show some superiority over LSP is the shorter computational time expended to solve the test problems. Even though similar machines were not used in all cases, the computational time taken to meet the termination criterion reported in the literature was generally lower than with LSP. Still, the time spent with LSP was found to be well within a tolerable time frame from a practical engineering design optimization point of view. The additional computational time can easily be justified when it is weighed against the reliability of LSP as a global optimizer and the extra information it provides during the search.

The nonlinear optimization test problems in the literature have many shortcomings, and there are virtually no test problems capable of testing global optimization schemes in any systematic way. To more adequately test the strengths and limitations of LSP, a new mathematical test problem named 'The Road Runner Function' was developed. It has a single global optimum and several local optima for any dimension n. The function is parametrically adjustable and can be easily extended to any number of dimensions while retaining its key features. It is a challenge to all existing NLP and direct search methods unless an ideal starting point is chosen.

One problem discovered in the literature, that by Subrahmanyam [Subrahmanyam, 1989], was found to have many desirable attributes for a test problem. The formulation and the geometry of its feasible region and objective function surface are simple. It presents convergence difficulties for direct search methods even though the dimensionality of the problem is low.
LSP was only able to solve this problem after significant readjustments to its search parameters.

9.3 LSP performance improvements

As a result of the experience gained with the test problems, both the reliability and efficiency of LSP were substantially improved over the course of this research. This entailed the refinement of some of the techniques suggested in [Chew & Zheng, 1988], the development of some new techniques, the introduction of parameters to control these techniques, and the development of diagnostics to assess progress and guide the modification of these parameters.

One heuristic method investigated involved subdividing the cuboid into a set of smaller subcuboids and then performing searches within each subcuboid. This technique is applied when the intermediate results displayed in the graphical output, or cluster analysis, indicate the possible existence of a partitioned level set. The division of the search domain avoids inefficient searches in between the connected regions of the level set, so that the search then concentrates on smaller subregions, each of which is fully connected. The sum of the function evaluations needed to meet the termination criterion for each subregion was found to be considerably lower than the number of function evaluations for the single region with a partitioned level set.

When the feasible region formed by the set of constraints occupies only a very small part of the initial cuboid, feasible point generation can be difficult and can result in an inefficient or suspended LSP search. An indication that this difficulty has arisen is provided by the number of function evaluations required at the first LSP iteration. This early feedback permits corrective action to be taken before a large amount of computational effort is wasted. Constraint relaxation was found to improve the efficiency of sample point generation by temporarily increasing the volume of the feasible region.
To discourage the existence of infeasible points in the solution set, penalty terms were also added to the objective function. The progressive tightening of the relaxed portion of the constraints, in conjunction with the penalty terms on the objective function, assures the best continuity between successive acceptance sets and ensures that solutions are found within the feasible region of the original problem.

Due to the random generation of sample points, a global optimum point may be missed if it lies at a boundary of the feasible region. To increase the capability of generating boundary points, the technique of skewness adjustment was implemented. This technique modifies the location and boundaries of the current cuboid to ensure that no potentially important regions are excluded from the current cuboid. In addition to this skewness adjustment, reduced bias estimates of the cuboid bounds are calculated at every iteration. Both of these adjustments further increase the volume of the cuboid and maximize the probability that the true acceptance set is included within the modified cuboid.

Two other alternatives to the cuboid approach, replacing the cuboid with a rhombohedron and exploiting a linear relationship between a variable pair, were also investigated. These were found to provide significant efficiency improvements under specific but commonly occurring conditions which are detectable from LSP's output.

As with all NLP schemes, equality constraints present significant difficulties. In an ideal case, equality constraints can be used to reduce the number of variables involved; otherwise they must be accommodated using a penalty technique. The approach used with LSP is similar to the classical penalty approach, but the penalty parameter value is reduced at every iteration.
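One way the relaxation-plus-penalty idea might look in code is sketched below. The thesis does not give the exact form, so the relaxed band, penalty weight and schedule are all invented assumptions: an equality constraint h(x) = 0 is relaxed to |h(x)| ≤ eps, any remaining violation is penalized in the objective, and eps is tightened as the search proceeds.

```python
# Illustrative sketch (relaxation band, weight and schedule invented):
# penalize violation of an equality constraint h(x) = 0 that has been
# relaxed to |h(x)| <= eps.

def penalized(f, h, eps, weight=1000.0):
    def fp(x):
        violation = max(0.0, abs(h(x)) - eps)  # zero inside the relaxed band
        return f(x) + weight * violation
    return fp

f = lambda x: (x - 2.0) ** 2
h = lambda x: x - 1.0            # feasible only at x = 1

# After several iterations of halving, eps has shrunk from 0.5 to 0.03125.
fp = penalized(f, h, eps=0.03125)
print(fp(1.0) == f(1.0))    # inside the relaxed band, no penalty
print(fp(3.0) > 1000.0)     # well outside the band, heavily penalized
```

Shrinking eps toward zero each iteration forces later acceptance sets toward the true equality constraint while keeping early sampling efficient.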
This progressive modification of penalty terms throughout the search process was found to significantly improve the chances of reaching the global optimum in the presence of equality constraints.

9.4 Graphical output

The intermediate graphical displays provided by LSP, and their interpretation, open up a new dimension in engineering optimization. These plots provide entirely new ways to judge the progress of the search as well as insights into the topology of the particular problem being solved.

The first set of plots, the xi - xj scatter plots, could of course be obtained with any NLP method, but they are far more meaningful with a technique such as LSP which provides a large number of confirmed solution points at each stage of the search. These plots facilitate visualizing the acceptance set boundary shapes and will often reveal the presence of distinctly separate global optima or near optima. The plots can also give indications of dependencies between the variables which can be used to speed up the search. The second type of plot is the progressive plot of level set value-c versus iterations-I, which can be used as an indication of the strength of convergence. Most importantly, the c ~ I curve can also suggest the possible existence of fissure like features on the objective function surface. A third type of plot, the plot of number of function evaluations-Nf versus iteration-I, reveals the efficiency of the search as it progresses. Its interpretation can give an indication of the existence of multiple global optima, or of the existence of multiple local optima with only one global optimum.

One of the most interesting aspects of the plots discussed is that much of the triggering information for LSP parameter adjustment or problem modification concerns just c, Nf and I.
It is also significant from a user's point of view that the magnitude of c is related directly to the real world problem being solved, while Nf and I are easily understood quantities related to the search. The influence of these quantities on the search strategy is therefore easy for practitioners to understand.

9.5 Civil engineering design problems and LSP

The mathematical equations used in civil engineering design are predominantly nonlinear. In many instances the mathematical formulation of the associated design problems does not meet the ideals, from a nonlinear optimization point of view, of providing a convex objective function and a convex feasible region. It is also not uncommon for the decision variables to refer to discrete choices of pipe diameter, structural member size, etc., so that discrete decision variables must be accommodated. Most civil engineering design optimization problems therefore require the use of nonlinear programming techniques which can accommodate mixed variables for their solution. Furthermore, in engineering design, there are instances where the design which produces the optimum objective function value may not be the best practical option. Alternative solutions which produce slightly inferior values to the mathematical global optimum solution may be preferred because of factors which could not be captured in the formulation. Therefore, a feature of an optimization method which might appeal to civil engineers is one which can identify near optimum solutions. Simplicity of use of an optimization method also translates to economy of use in a commercial engineering office, and this alone may be the major factor determining the practical use of optimization.

Currently, there are several NLP packages being offered commercially, but most well developed packages use gradient methods and require the differentiability of the objective and constraint functions. These do not fully satisfy the needs of civil engineers for several reasons.
Gradient methods basically find local optima and cannot guarantee finding a global optimum except under very ideal conditions. Unfortunately, they do not give any indication of the local or global nature of the solution found. When problems have multiple optima, either local or global, a single gradient search can only identify a single solution. The solution found depends entirely on the starting point of the search. No systematic way of selecting starting points to ensure finding the global optimum is available, and the responsibility for the overall global search strategy and its success is left entirely to the user.

Most NLP packages which have the ability to handle practical sizes of problems have, until very recently, been available only on mainframe computers. There are only a few nonlinear optimization packages which can address practical engineering problems using personal computers, and these use, almost without exception, the same gradient methods which have dominated mainframe optimization for the past 30 years.

The feedback provided by the gradient search methods is in the form of matrices of first and second partial derivatives, matrix condition, Lagrange multipliers, etc. This kind of information rarely has any direct relevance to an engineer's view of his engineering design problem. Additionally, this information can be particularly difficult to interpret when search failures occur or when the global nature of the solution is in question. If a gradient NLP method fails to confirm even a local optimum, the user is left with no suggestion for his next move.

In summary, in spite of the recognized need for optimization in civil engineering, existing NLP packages do not appear to satisfy the needs of practitioners.
It is entirely possible that it is the available methods, and not just their computer implementation, that have insufficient appeal to convince engineers that their time should be invested in that direction.

LSP is a global search method, and the chance of converging on the global solution is much higher than with any of the gradient based methods such as GRG and MINOS, even when a number of different starting points are used. One of the important aspects of LSP is its capability to identify multiple optima and near optimal solutions. If multiple local optima exist, LSP either identifies all of them or gives indications of their existence in the process of identifying the global optimum.

LSP overcomes many of the limitations of the existing gradient methods but does not radically reduce the computational effort involved for those problems which could be solved using a gradient method. Perhaps the most significant features of LSP from a practical engineering standpoint are that: the method is conceptually simple, and can therefore be easily understood by engineers who are generally not experts in numerical analysis techniques; virtually all of the computations and numerical results generated during the search are meaningful in the context of the engineering problem; the program can be run on personal computers; and an elementary, and therefore fast, graphical interface can display all of the useful information at any stage of the search.
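The conceptual simplicity claimed here can be conveyed in a few lines. The fragment below is a minimal sketch of a level-set iteration in the spirit described in this chapter: sample the current box, confirm the points whose objective values fall within the current level set, and use the mean and variance of those values to advance the level and to test convergence. The Monte Carlo sampling scheme and the names n_keep and vm are assumptions that follow the thesis notation only loosely; this is an illustration, not the thesis code.

```python
import random

def level_set_search(f, lo, hi, n_sample=200, n_keep=30, tol=1e-9,
                     max_iter=100, seed=1):
    """Monte Carlo level-set sketch: keep the best n_keep sample points,
    shrink the box to their bounding box, and stop when the variance of
    their objective values (second moment about the level value) is small."""
    rng = random.Random(seed)
    lo, hi = list(lo), list(hi)
    n = len(lo)
    for _ in range(max_iter):
        pts = [[rng.uniform(lo[i], hi[i]) for i in range(n)]
               for _ in range(n_sample)]
        pts.sort(key=f)
        kept = pts[:n_keep]                # points inside the current level set
        vals = [f(p) for p in kept]
        c = sum(vals) / n_keep             # level value: mean objective value
        vm = sum((v - c) ** 2 for v in vals) / n_keep  # convergence measure
        if vm < tol:
            break
        # shrink the search box to the bounding box of the kept points
        lo = [min(p[i] for p in kept) for i in range(n)]
        hi = [max(p[i] for p in kept) for i in range(n)]
    return kept[0], f(kept[0])

# e.g. the two-variable sphere function, whose global minimum is 0 at (0, 0)
x_best, f_best = level_set_search(lambda x: x[0]**2 + x[1]**2,
                                  [-10.0, -10.0], [10.0, 10.0])
```

Because the whole kept set is retained at every iteration, the same points that drive the search are also available for the graphical displays of the shrinking level set boundaries.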
Flexibility, simplicity and meaningful intermediate results during the search lower the engineer's time requirements for familiarization and therefore make LSP economical to implement even when its use is only occasional.

The success of global optimization is highly dependent on the complexity of the objective function surface and the boundaries of the feasible region, in other words their general topology. A better appreciation of the topological nature of the global optimization problem can only contribute to a greater understanding, and possibly refinement, of global optimization methods. Yet topological interpretations have not been widely used in optimization, the theoretical basis for level set optimization presented in [Chew & Zheng, 1988] being a rare example. Topology is an established and mature subtopic of mathematics with a considerable body of theory; research in this topic may therefore provide new perspectives on level sets and provide the basis for further refinements in LSP type methods.

Subdividing the search domain into smaller regions was found to increase the search efficiency considerably for multiple optima problems. It was also observed that determining the number of clusters and allocating points to different clusters becomes more difficult as the number of variables in a problem increases. Consequently, high dimensionality cluster analysis methods should be further explored with LSP's particular requirements of low precision cluster analysis, as discussed in Chapter 3, section 3.1.3, in mind.

Graphical interpretation of both intermediate and final optimization results opens up a new perspective in the field of optimization. The fact that LSP produces and stores a large set of intermediate results in the form of solution points makes the graphical interface far more attractive than with almost all other optimization schemes. This
is especially true with high dimensional problems, where the information content of the dynamic graphs is high. Therefore, alternagraphics and similar display techniques should be further researched to fully exploit the inherent ability of LSP to support graphical outputs.

Bibliography

The following bibliography is arranged in conventional alphabetical order. The numerical references are used only in the Appendix.

[1] Alperovits, E. and Shamir, U., "Design of optimal water distribution systems", Water Resources Research, pp 885-900, 1977.
[2] Aluffi-Pentini, F., Parisi, V. and Zirilli, F., "Global optimization and stochastic differential equations", JOTA, Vol 47, No 1, pp 1-16, Sept 1985.
[3] Armstrong, M. A., "Basic topology", New York, Springer-Verlag, 1983.
[4] Archetti, F. and Schoen, F., "A survey on the global optimization problem: General theory and computational approaches", Annals of Operations Research, No 1, pp 87-110, 1984.
[5] Ballard, D. H., Jelinek, C. O. and Schinzinger, R., "An algorithm for the solution of constrained generalized polynomial programming problems", Computer Journal, Vol 17, pp 261-266, 1974.
[6] Bazaraa, Mokhtar S., "Nonlinear programming: theory and algorithms", John Wiley & Sons, 1979.
[7] Becker, Richard A., Cleveland, William S. and Wilks, Allan R., "Dynamic graphics for data analysis", Statistical Science, Vol 2, No 4, Nov 1987.
[8] Ben Saad, Sihem and Jacobsen, Stephen E., "A level set algorithm for a class of reverse convex programs", Annals of Operations Research, No 25, pp 19-42, 1990.
[9] Betro, B. and L.
De Biase, "A recursive spline technique for uniform approximation of sample data", Technical Report, University of Pisa, 1987.
[10] Betro, B. and Schoen, F., "Sequential stopping rules for the multistart algorithm in global optimization", Mathematical Programming 38, pp 271-280, 1987.
[11] Boender, C. G. E. and Kan, A. H. G. Rinnooy, "Bayesian stopping rules for multistart global optimization methods", Mathematical Programming 37, pp 59-80, 1987.
[12] Box, M. J., "A new method of constrained optimization and a comparison with other methods", Computer Journal, Vol 8, pp 42-52, 1965.
[13] Chew, Soo Hong and Zheng, Quan, "Integral Global Optimization: Theory, implementation and applications", Lecture Notes in Economics and Mathematical Systems, Vol 298, Springer-Verlag, Berlin Heidelberg, 1988.
[14] Cole, F., Gochet, W. and Smeers, Y., "A comparison between a primal and a dual cutting plane algorithm for posynomial geometric programming problems", JOTA, Vol 47, No 2, pp 159-180, Oct 1985.
[15] Corana, A., Marchesi, M., Martini, C. and Ridella, S., "Minimizing multimodal functions of continuous variables with the simulated annealing algorithm", ACM Transactions on Mathematical Software, Vol 13, No 3, pp 262-280, 1987.
[16] Crowder, H. P., Dembo, R. S. and Mulvey, J. M., "Reporting computational experiments in mathematical programming", Mathematical Programming, No 15, pp 316-329, 1978.
[17] Davis, John C., "Statistics and data analysis in geology", John Wiley & Sons, 1973.
[18] Devore, Jay L., "Probability and statistics for engineering and the sciences", Brooks/Cole Publishing Company, 1982.
[19] Dixon, L. C. W. and Szego, G.
P., "Towards global optimisation 2", Elsevier North-Holland, 1976.
[20] Eason, E. D. and Fenton, R. G., "A comparison of numerical optimization methods for engineering design", Journal of Engineering for Industry, Trans. of ASME, pp 196-200, February 1974.
[21] Edgar, T. F. and Himmelblau, D. M., "Optimization of chemical processes", McGraw-Hill Inc., 1988.
[22] Everitt, Brian, "Cluster analysis", Heinemann Educational Books Ltd, 1980.
[23] Falk, James E. and Hoffman, Karla L., "Concave minimization via collapsing polytopes", Operations Research, Vol 34, No 6, Nov-Dec, pp 919-929, 1986.
[24] Fleming, John F., "Structural engineering analysis on personal computers", McGraw-Hill Book Company, 1986.
[25] Floudas, C. A. and Ciric, A. R., "Strategies for overcoming uncertainties in heat exchanger network synthesis", Computers and Chemical Engineering, 13(10), pp 1133-1152, 1989.
[26] Floudas, C. A. and Pardalos, P. M., "A collection of test problems for constrained global optimization algorithms", Lecture Notes in Computer Science, No 455, Springer-Verlag, 1990.
[27] Fujiwara, O., Jenchaimahakoon, B. and Edirisinghe, N. C., "A modified linear programming gradient method for optimal design of looped water distribution networks", Water Resources Research, Vol 23, No 6, pp 977-982, June 1987.
[28] Ghani, S. N., "An improved 'complex' method of function minimization", Computer Aided Design, January 1972.
[29] Gill, Philip E., Murray, Walter, Saunders, Michael A.
and Wright, Margaret H., "Constrained nonlinear programming", in Handbooks in Operations Research and Management Science, Volume 1, Optimization, North-Holland, 1989.
[30] Goldberg, David E., "Genetic algorithms in search, optimization, and machine learning", Addison-Wesley Publishing Company, Inc., 1989.
[31] Goldberg, David E. and Samtani, Manohar P., "Engineering optimization via genetic algorithm", in Electronic Computation, pp 471-482, Kenneth M. Will ed., 1986.
[32] Gottfried, Byron S. and Weisman, Joel, "Introduction to optimization theory", Prentice-Hall, 1973.
[33] Hartigan, John A., "Clustering algorithms", Wiley, 1975.
[34] Himmelblau, David M., "Applied nonlinear programming", McGraw-Hill Inc., 1972.
[35] Horst, Reiner and Tuy, Hoang, "Global optimization: deterministic approaches", Springer-Verlag, 1990.
[36] Huang, H. Y. and Aggarwal, A. K., "A class of quadratically convergent algorithms for constrained function minimization", JOTA, Vol 16, Nos 5/6, pp 447-485, 1975.
[37] Jenson, David L., "The role of cluster analysis in computer assisted mass appraisal", Lincoln Institute Monograph No 77, July 1977.
[38] Kan, A. H. G. Rinnooy and Timmer, G. T., "Global optimization", in Handbooks in Operations Research and Management Science, Volume 1, Optimization, North-Holland, 1989.
[39] Kassicieh, Suleiman K., "International intra-company transfer pricing", Operations Research, Vol 29, No 4, pp 817-828, 1981.
[40] Lasdon, Leon S.
and Waren, Allan D., "GRG2 user's guide", May 1982.
[41] Leon, A., "A classified bibliography on optimization", in Recent Advances in Optimization Techniques, pp 599-649, 1966.
[42] Lipschutz, Seymour, "General topology", Schaum's Outline Series in Mathematics, McGraw-Hill Book Company, 1965.
[43] Loganathan, G. V. and Greene, J. J., "Global approaches for the nonconvex optimization of pipe networks", Water Management in the 90's, 1993.
[44] Lucidi, S. and Piccioni, M., "Random tunnelling by means of acceptance-rejection sampling for global optimization", JOTA, Vol 62, No 2, pp 255-278, 1989.
[45] Luenberger, David G., "Linear and nonlinear programming", Addison-Wesley, 1984.
[46] Luus, Rein and Jaakola, T. H. I., "Optimization by direct search and systematic reduction of the size of search region", AIChE Journal, Vol 19, No 4, pp 760-766, 1973.
[47] Mays, Larry W. and Taur, Cheng-Kang, "Unit hydrographs via nonlinear programming", Water Resources Research, Vol 18, No 4, pp 744-752, 1982.
[48] Mezzich, Juan E.
and Solomon, Herbert, "Taxonomy and behavioral science: comparative performance of grouping methods", Academic Press, 1980.
[49] Morari, Manfred and Grossmann, Ignacio E., "Chemical engineering optimization models with GAMS", CACHE Process Design Case Studies, Vol 6, 1991.
[50] Narula, Subhash C., "Optimization techniques in linear regression: A review", in Optimization in Statistics, TIMS Studies in the Management Sciences, Volume 19, pp 11-29, North-Holland Publishing Company, 1982.
[51] Orth, Hermann, "Model-based design of water distribution and sewage systems", John Wiley & Sons, 1986.
[52] Pike, Ralph W., "Optimization for engineering systems", Van Nostrand Reinhold, 1986.
[53] Ratschek, H. and Rokne, J., "New computer methods for global optimization", Ellis Horwood Limited, 1988.
[54] Reklaitis, G. V., Ravindran, A. and Ragsdell, K. M., "Engineering optimization: methods and applications", New York, Wiley, 1983.
[55] Sandgren, E. and Ragsdell, K. M., "The utility of nonlinear programming algorithms: A comparative study, parts 1 and 2", ASME Journal of Mechanical Design, 102(3), pp 540-551, July 1980.
[56] Sarma, M.
S., "On the convergence of the Baba and Dorea random optimization methods", JOTA, Vol 66, No 2, Aug 1990.
[57] Schittkowski, Klaus, "Nonlinear programming codes: Information, tests, performance", Lecture Notes in Economics and Mathematical Systems, Vol 183, Springer-Verlag, New York, 1980.
[58] Schittkowski, Klaus, "Test examples for nonlinear programming codes", Lecture Notes in Economics and Mathematical Systems, Vol 187, Springer-Verlag, New York, 1981.
[59] Schittkowski, Klaus, "More test examples for nonlinear programming codes", Lecture Notes in Economics and Mathematical Systems, Vol 282, Springer-Verlag, New York, 1987.
[60] Shields, F. Douglas Jr. and Thackston, Edward L., "Designing treatment basin dimensions to reduce cost", ASCE Journal of Environmental Engineering, Vol 117, No 3, May/June 1991.
[61] Subrahmanyam, M. B., "An extension of the simplex method to constrained optimization", JOTA, Vol 62, No 2, pp 311-319, August 1989.
[62] Templeman, Andrew B., "Discrete optimum structural design", Computers and Structures, Vol 30, No 3, pp 511-518, 1988.
[63] Torn, Aimo and Zilinskas, Antanas, "Global optimization", Lecture Notes in Computer Science, No 350, Springer-Verlag, 1988.
[64] Turner, D. B., "Workbook of atmospheric dispersion estimates", Environmental Protection Agency, Office of Air Programs, Research Triangle Park, N.C., 1970.
[65] Venkayya, V. B., "Design of optimum structures", Computers & Structures, Vol 1, pp 265-309, 1971.
[66] Visweswaran, V. and Floudas, C. A., "A global optimization algorithm (GOP) for certain classes of nonconvex NLPs - II.
Application of theory and test problems", Computers and Chemical Engineering, Vol 14, No 12, pp 1419-1434, 1990.
[67] Wang, Bi-Chong and Luus, Rein, "Reliability of optimization procedures for obtaining global optimum", AIChE Journal, Vol 24, No 4, pp 619-626, 1978.
[68] Wasserman, William and Kutner, Michael H., "Applied linear regression models", Richard D. Irwin, Inc., 1983.
[69] Zanakis, S. H. and Evans, J. R., "Heuristic optimization: why, when, and how to use it", Interfaces 11, pp 84-91, 1981.
[70] Zupan, Jure, "Clustering of large data sets", Chemometrics Research Studies Series, Research Studies Press, 1982.

Appendix A

MATHEMATICAL TEST PROBLEMS

Problem 1-1-0-0. Goldstein function. Source: [2]
f(x) = x^6 - 15x^4 + 27x^2 + 250
Subjected to NO constraint.
Initial bounds: -10 ...

x1 x2^2 x3 >= 27
x1 x2^2 x3^2 >= 397.5
x2 x3 x6^4 / x4^3 >= 1.93
a1 = ((745 x4/(x2 x3))^2 + 16.91 * 10^6)^0.5
a2 = ((745 x5/(x2 x3))^2 + 157.5 * 10^6)^0.5
b1 = 0.1 x6^3
b2 = 0.1 x7^3
a1/b1 <= 1100
a2/b2 <= 850
x2 x3 <= 40
5 <= x1/x2 <= 12
1.5 x6 + 1.9 <= x4
1.1 x7 + 1.9 <= x5
3 <= x1 <= 4.6
0.7 <= x2 <= 0.8
17 <= x3 <= 20
6.6 <= x4 <= 8.3
7.3 <= x5 <= 8.3
2.9 <= x6 <= 3.9
5 <= x7 <= 5.5

Solution in literature:
f(x*) = 2994.47
x* = (3.5, 0.7, 17, 7.3, 7.71, 3.35, 5.287)

Solution using LSP:
f(x*) = 2994.5
x* = (3.50, 0.70, 17.00, 7.30, 7.716, 3.351, 5.287)
LSP initial bounds: As given in the set of constraints.
Convergence criterion: VM < 1E-3
LSP number of function evaluations: Nf = 82826, Nkeep = 70

Problem 7-2-2-3. Source: [59]
f(x) = -5x1 - 5x2 - 4x3 - x1x3 - 6x4 - 5x5/(1 + x5) - 8x6/(1 + x6) - 10(1 - 2exp(-x7) + exp(-2x7))
Subject to:
10 - x >= 0
5 - x1 >= 0
x3x6 - 40 >= 0
2x4 + 0.8x6x7 - 5 = 0
x2^2 + x3^2 + x5^2 + x6^2 - 5 = 0

Solution in literature:
f(x*) = -37.4130
x* = (1.47, 1.98, 0.35, 1.2, 0.57, 0.78, 1.41)
Computational effort, Nf, cited in literature (NLPQL): Nf = 437

Solution using LSP:
f(x*) = -37.352
x* = (1.271, 1.972, 0.536, 1.220, 0.498, 0.758, 1.455)
LSP initial bounds:
0 <= x2, x3, x5, x6, x7 <= 2.3
0 <= x1, x4 <= 5
Convergence criterion: VM <= 1E-3
LSP number of function evaluations: Nf = 6973724, Nkeep = 50

Problem 7-3-0-4. Source: [61]
f(x) = (x1 - 10)^2 + 5(x2 - 12)^2 + x3^4 + 3(x4 - 11)^2 + 10x5^6 + 7x6^2 + x7^4 - 4x6x7 - 10x6 - 8x7
Subject to:
2x1^2 + 3x2^4 + x3 + 4x4^2 + 5x5 - 127 <= 0
7x1 + 3x2 + 10x3^2 + x4 - x5 - 282 <= 0
23x1 + x2^2 + 6x6^2 - 8x7 - 196 <= 0
4x1^2 + x2^2 - 3x1x2 + 2x3^2 + 5x6 - 11x7 <= 0

Solution in literature:
f(x*) = 680.63005743
x* = (2.330517, 1.9513717, -0.47759196, 4.365728, -0.6245119, 1.0381598, 1.5942702)
Computational effort, Nf, cited in literature: Nf = 4368

Solution using LSP:
f(x*) = 680.6488
x* = (2.33976, 1.94295, -0.44145, 4.38498, -0.63151, 1.04464, 1.60214)
LSP initial bounds:
-5 ...

... <= 2000 + 80x5 + 80x6
x37 + x43 <= 7000 + 65x7 + 65x8
x15 + x37 <= 10000 + 89.286(x5 + x7)
x26 + x48 <= 3000 + 166.66(x7 + x8)
4x5 + 4x6 <= 1000
5x7 + 5x8 <= 700
x5 + x6 <= 233
x7 + x8 <= 150
x5 + x7 <= 200
x6 + x8 <= 100
48 <= x1 <= 120
48 <= x2 <= 200
42 <= x3 <= 120
42 <= x4 <= 200

Solution in literature:
f(x*) = -16592.55
x* = (120, 53.51, 120, 108.87, 122.89, 110.1, 77.1, 62.89)
The first two constraints are slightly violated.

Solution using LSP:
f(x*) = -16563.059
x* = (120, 57.41, 119.98, 110.28, 116.06, 116.95, 83.95, 56.06)
LSP initial bounds:
48 <= x1 <= 120
48 <= x2 <= 200
42 <= x3 <= 120
42 <= x4 <= 200
0 <= x5 <= 233
0 <= x6 <= 233
0 <= x7 <= 140
0 <= x8 <= ...

1 - 4x3/x5 - 2/(x3^0.71 x5) - 0.0588x7/x3^1.3 >= 0
1 - 4x4/x6 -
2/(x4^0.71 x6) - 0.0588x8/x4^1.3 >= 0
1 <= f(x) <= 4.2
0.1 <= xi <= 10 for i = 1, 8

Solution in literature:
f(x*) = 3.9511634396
x* = (6.465114, 2.232709, 0.6673975, 0.5957564, 5.932676, 5.527235, 1.013322, 0.4006682)
Computational effort, Nf, cited in literature (GRGA, FMIN): Nf = 5929, 9544

Solution using LSP:
f(x*) = 3.960746
x* = (6.092, 2.550, 0.706, 0.611, 6.016, 5.546, 1.105, 0.416)
LSP initial bounds:
0.1 <= xi <= 10 for i = 1, 8
Convergence criterion: VM <= 1E-6
LSP computational effort:
Nf = 192453, Nkeep = 80

Problem 8-3-0-7. Source: [14]
f(x) = 19.4x^1.7 + 16.8x6 + 91.5x3 + 19.4x^1.47 + 86x^1.38 + 152x^2.7 + 16.8x^0 + 27.4x^6.3
Subjected to:
0.537x1x2^3 + 4.47x5x6 + 0.386x7x8 <= 1
1.65x^1.15 ... 1.65 ... 1.15 ... 1.60 ... 1.20 ... <= 1
xi >= 0 for i = 1, 8

Solution in literature:
f(x*) = 550.642
x* = (0.937, 0.901, 0.676, 0.671, 0.331, 0.467, 1.096, 0.554)

Solution using LSP:
f(x*) = 551.1217
x* = (0.940, 0.954, 0.656, 0.693, 0.326, 0.487, 1.090, 0.581)
LSP initial bounds:
0 <= xi <= 3 for i = 1, 8
LSP convergence criterion: VM <= 1E-3
LSP computational effort:
Nf = 5485, Nkeep = 80

Problem 9-1-0-13.
Source: [58]
f(x) = -0.5 (x1x4 - x2x3 + x3x9 - x5x9 + x5x8 - x6x7)
Subject to:
1 - x3^2 - x4^2 >= 0
1 - x9^2 >= 0
1 - x5^2 - x6^2 >= 0
1 - (x1 - x5)^2 - (x2 - x6)^2 >= 0
1 - (x1 - x7)^2 - (x2 - x8)^2 >= 0
1 - (x3 - x5)^2 - (x4 - x6)^2 >= 0
1 - (x3 - x7)^2 - (x4 - x8)^2 >= 0
1 - x7^2 - (x8 - x9)^2 >= 0
x3x9 >= 0
x5x8 - x6x7 >= 0
1 - x1^2 - (x2 - x9)^2 >= 0
x1x4 - x2x3 >= 0
-x5x9 >= 0

Solution in literature:
f(x*) = -0.8660254038
x* = (0.884129, 0.4672425, 0.03742076, 0.9992996, 0.884129, 0.467242, 0.03742076, 0.99929996, 0)
Computational effort, Nf, cited in literature (GRGA, FMIN): Nf = 16857, 16724
GRGA did not find the global solution.

Solution using LSP:
f(x*) = -0.8657007
x* = (0.96099, 0.29952, 0.23727, 0.917137, 0.95973, 0.28040, 0.24091, 0.99339, 0.02330)
LSP initial bounds:
0 <= xi <= 1 for i = 1 to 9
Convergence criterion: VM <= 1E-7
LSP computational effort:
Nf = 60500, Nkeep = 80

Problem 9-2-6-0. Source: [59]
f(x) = x4^2 + x5^2 + x6^2 + x7^2 + x8^2 + x9^2
Subject to:
x1 + x2 exp(-5x3) + x4 - 127 = 0
x1 + x2 exp(-3x3) + x5 - 151 = 0
x1 + x2 exp(-x3) + x6 - 379 = 0
x1 + x2 exp(x3) + x7 - 421 = 0
x1 + x2 exp(3x3) + x8 - 460 = 0
x1 + x2 exp(5x3) + x9 - 426 = 0

Solution in literature:
f(x*) = 13390.1
x* = (523.31, -156.95, -0.2, 29.61, -86.62, 47.33, 26.47, 22.92, -39.47)
Computational effort, Nf, cited in literature (NLPQL): Nf = 2443
The constraints are slightly violated, possibly due to round off.

Solution using LSP:
f(x*) = 13390.102
x* = (523.45996094, -157.13957214, -0.19949156, 29.60516357, -86.56939697, 47.37318420, 26.26046944, 22.91170883, -39.50439453)
LSP initial bounds:
400 <= x1 <= 600
-200 <= x2 <= -100
-10 <= x3 <= 0
0 <= x4 <= 60000
-10000 <= x5 <= 10000
0 <= x6 <= 800
-10 <= x7 <= 400
-10 <= x8 <= 400
-600 <= x9 <= 10
Convergence criterion: VM <= 1E-3
LSP computational effort (number of function evaluations): Nf = 16601, Nkeep = 90

Problem 15-1-0-29. Source: [58]
f(x) = sum over k = 0 to 4 of (2.3x(3k+1) + 0.0001x(3k+1)^2 + 1.7x(3k+2) + 0.0001x(3k+2)^2 + 2.2x(3k+3) + 0.00015x(3k+3)^2)
Subject to:
0 <= x(3j+1) - x(3j-2) + 7 <= 13
0 <= x(3j+2) - x(3j-1) + 7 <= 14
0 <= x(3j+3) - x(3j) + 7 <= 13
for j = 1, 4
x1 + x2 + x3 - 60 >= 0
x4 + x5 + x6 - 50 >= 0
x7 + x8 + x9 - 70 >= 0
x10 + x11 + x12 - 85 >= 0
x13 + x14 + x15 - 100 >= 0
8 <= x1 <= 21
43 <= x2 <= 57
3 <= x3 <= 16
0 <= x(3k+1) <= 90
0 <= x(3k+2) <= 120
0 <= x(3k+3) <= 60
for k = 1, 4

Solution in literature:
f(x*) = 664.8204500
x* = (8, 49, 3, 1, 56, 0, 1, 63, 6, 3, 70, 12, 5, 77, 18)
Computational effort, Nf, cited in literature (GRGA): Nf = 3857

Solution using LSP:
f(x*) = 668.921
x* = (8.113, 48.861, 3.031, 2.081, 48.934, 0.070, 8.067, 55.933, 6.002, 13.793, 62.804, 8.405, 16.043, 69.665, 14.305)
LSP initial bounds: As given in the set of constraints.
Convergence criterion: VM <= 1E-3
LSP computational effort:
Nf = 3805238, Nkeep = 120

Problem 20-1-1-0.
Source: [59]. Solution improved.
f(x) = sum over i = 1 to 20 of i (xi^2 + xi^4)
Subject to:
sum over i = 1 to 20 of xi^2 = 1

Solution in literature:
f(x*) = 1.91667
x* = (0.91287, 0.408286, -0.00017, -0.0000054, 0.0000, -0.0000089, 0.0000082, -0.000014, 0.000022, -0.0, 0.0000135, -0.000004, -0.000011, -0.000013, 0.0, 0.00002, 0.00000546, -0.000009, -0.00001, 0.0)
Computational effort, Nf, cited in literature (NLPQL): Nf = 198

Solution using LSP:
f(x*) = 1.91836
x* = (-0.9112, -0.4116, -0.0101, -0.0087, 0.0077, 0.0021, -0.0050, -0.0016, 0.0051, 0.0008, -0.0013, 0.0047, -0.0034, 0.0010, 0.0013, 0.0004, -0.0041, -0.0024, -0.0036, 0.0005)
LSP initial bounds:
-1 <= xi <= 1 for i = 1, 20