A LEVEL SET GLOBAL OPTIMIZATION METHOD FOR NONLINEAR ENGINEERING PROBLEMS

By Hassen Ali Yassien

B.Sc. in Civil Engineering, Addis Ababa University, Ethiopia, 1979
Diploma in Hydrology, Free University of Brussels, Belgium, 1984
Masters in Hydrology, Free University of Brussels, Belgium, 1985

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES, DEPARTMENT OF CIVIL ENGINEERING

We accept this thesis as conforming to the required standard

December 1993
© Hassen Ali Yassien, 1993

THE UNIVERSITY OF BRITISH COLUMBIA

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Civil Engineering
The University of British Columbia
Vancouver, Canada

Abstract

The mathematical equations used in civil engineering design procedures are predominantly nonlinear. Most civil engineering design optimization problems therefore require the use of nonlinear programming (NLP) techniques for their solution. Those NLP packages with the ability to handle problems of practical size, which have been available on mainframe computers for many years, are only now becoming available on microcomputers. Moreover, the existing NLP techniques, which are dominated by gradient methods, do not guarantee global solutions.
As a consequence, suitable optimization methods for civil engineering design are not being enjoyed by practitioners. In this thesis, the level set optimization method, whose theory was initially presented in "Integral Global Optimization" [Chew & Zheng, 1988], was further developed to address, in particular, practical engineering problems. It was found that Level Set Programming (LSP) offers a viable alternative to existing nonlinear optimization methods. While LSP does not radically alter the computational effort involved, it has some unique characteristics which appear to be significant from the engineering user's point of view.

LSP, which is classified as a direct search method of optimization, utilizes the set theory concept of a level set. It uses estimates of moments of the objective function values at the confirmed points within a level set to control the search advance and as a measure of convergence on the global optimum.

The reliability and efficiency of LSP were verified by comparing its results with published results for both mathematical and engineering test problems. In addition to the published test problems, a new parametrically adjustable mathematical test problem was designed to test global optimization methods in general and to explore the strengths and weaknesses of LSP in particular. Experience with these test problems showed that LSP gave similar results to those cited in the literature, as well as improved results or more complete sets of global solutions.

The large number of solutions developed at each iteration of LSP permits meaningful graphical displays of the progressive reduction in the level set boundaries as the global solution is approached.
Other displays were also found to provide insights into the solution process and a basis for diagnosing search difficulties.

Table of Contents

Abstract ii
List of Tables ix
List of Figures xi
Acknowledgement xiv

1 INTRODUCTION 1
1.1 Nonlinear Programming (NLP) 1
1.1.1 Direct Search Methods 2
1.1.2 Gradient based methods 3
1.2 Limitations of existing NLP methods 5
1.3 Global optimization 7
1.4 Optimization needs for the professional 9
1.5 Earlier work on level set optimization 11
1.6 Level Set Programming (LSP) 12
1.7 Thesis outline 14

2 LEVEL SET PROGRAMMING 16
2.1 Basic theory of Level Set Programming (LSP) 16
2.2 Overview of the LSP algorithm - one dimensional case 23
2.3 LSP Implementation 26
2.3.1 Cuboid approximation of level set boundary 27
2.3.2 Alternative methods of sample point generation 30
2.3.3 Revision of level set value-c 32
2.3.4 Detailed description of the LSP Algorithm - general case 36
2.3.5 Alternate termination criteria 38
2.3.6 Handling Constraints in LSP 42
2.4 Summary 45

3 LSP PERFORMANCE IMPROVEMENT 48
3.1 Partition sets and cluster analysis 48
3.1.1 Cluster analysis methods 53
3.1.2 General implementation of cluster analysis methods 56
3.1.3 A clustering routine for LSP 61
3.2 Penalty functions 65
3.2.1 Penalty parameters and LSP 68
3.3 Inequality and equality constraint relaxation 71
3.3.1 Tightening of relaxed constraints 73
3.4 Cuboid stretching 74
3.5 Skewness adjustment 75
3.6 Search domain modification 79

4 TEST PROBLEMS 83
4.1 Coding of mathematical test problems 86
4.2 Mathematical problems 87
4.2.1 The Road Runner function 89
4.3 Engineering test problems 94
4.3.1 Basin design [Problem 3-16-1-3] 96
4.3.2 Pipe network design [Problem 32-1-40-34] 101
4.3.3 10-member truss design [Problem 10-2-0-40] 107
4.3.4 38-member truss design [Problem 38-1-0-77] 112
4.3.5 Unit Hydrograph ordinate determination [Problem 19-1-21-0] 115
4.3.6 Air pollution control design [Problem 4-11-0-40] 119
4.3.7 Irrigation system design [Problem 6-7-1-4] 123
4.3.8 Alkylation process [Problem 10-1-7-0] 128
4.3.9 Heat exchanger network configuration [Problem 16-1-15-0] 132
4.3.10 Power generation using fuel oil [Problem 4-11-0-5] 136
4.3.11 Chemical equilibrium [Problem 10-1-3-0] 139
4.4 Model fitting 142

5 EXPERIENCE WITH LSP 145
5.1 Intermediate output, its presentation, and interpretation 145
5.1.1 Plots of xi, xj pairs 146
5.1.2 Plots of cumulative function evaluations-Nf against iteration number-I 147
5.1.3 Plots of level set value-c against the iteration number-I 153
5.1.4 Further generalization of the Nf ~ I and c ~ I plot interpretations 155
5.2 Adjusting LSP parameters 158
5.2.1 Number of confirmed points in the acceptance set-Nkeep 158
5.2.2 Termination criterion-VM 161
5.2.3 Cluster criterion 162
5.2.4 Cuboid stretching parameters 163
5.2.5 Skewness adjustment parameters 163
5.2.6 Penalty and constraint relaxation and tightening parameters 164
5.3 Use of a rhombohedron shaped search domain 165
5.4 Relationship between initial cuboid volume and its location, and Nf 167
5.5 Some specific difficulties encountered while implementing LSP 168
5.6 Observations and recommendations 173

6 EVALUATION OF NLP PERFORMANCE 177
6.1 Existing evaluation criteria 178
6.2 Limitations of Schittkowski's performance criteria 181
6.3 Recommended set of evaluation criteria 182
6.4 Evaluation example: LSP versus GRG2 184

7 LSP COMPUTER IMPLEMENTATION 188
7.1 Subroutines 189
7.2 Confirmed point generation strategy 193
7.3 Intermediate results presentation 198
7.4 Final output 199
7.5 Alternative presentation of scatter plots 199
7.5.1 Static plots 200
7.5.2 Dynamic plots - Alternagraphics 202

8 SENSITIVITY ANALYSIS 208
8.1 Sensitivity with LSP 210
8.1.1 Sensitivity interpretation of the confirmed points at convergence plotted in the initial cuboid 211
8.1.2 Response of acceptance set to changes in the level set value-c 213
8.2 Other approaches to obtaining sensitivity information 214

9 CONCLUSION 217
9.1 Introduction 217
9.2 Evaluating LSP performance using test problems 219
9.3 LSP performance improvements 220
9.4 Graphical output 222
9.5 Civil engineering design problems
and LSP 223

Bibliography 228

Appendices 233

A MATHEMATICAL TEST PROBLEMS 233
B DIRECT SEARCH METHODS 277

List of Tables

3.1 Reduced number of function evaluations due to cluster analysis methods 63
4.1 Summary of test problems reported in Appendix A 87
4.2 Optimal results for the basin design problem 101
4.3 Input data for the pipe network 105
4.4 Available pipe diameters and unit prices 105
4.5 Optimal design for the pipe network given in ASCE, 1977 106
4.6 Optimal design for the pipe network found with LSP 106
4.7 Detailed results for the 10-member truss design 111
4.8 Comparison of truss weights for the 10-member truss design 111
4.9 Alternate optimal designs for the 38-member truss 114
4.10 Rainfall-Runoff data for Little Walnut Creek 118
4.11 Optimal hydrograph for the Little Walnut Creek 119
4.12 Stack and Emission data 121
4.13 Optimal design cited in the literature and that found with LSP 123
4.14 Constants for the irrigation system design 126
4.15 Optimal solution for the irrigation system design 127
4.16 Bounds for the variables involved in the alkylation process 131
4.17 Optimal solution for the Alkylation process 132
4.18 Stream data 134
4.19 Match data 134
4.20 Optimal solution to the heat exchanger configuration 136
4.21 Constants in fuel consumption equations 138
4.22 Optimal power generation cited in the literature 139
4.23 Optimal power generation found with LSP 139
4.24 Free energy constants (w8) 140
4.25 Optimal solution to the Chemical equilibrium problem 141
5.1 Function evaluations using cuboid and rhombohedron 166
6.1 Schittkowski's NLP performance criteria 181
6.2 Recommended NLP performance criteria 185
6.3 Comparison of GRG2 and LSP using proposed criteria set 186
7.1 LSP's problem definition and operations 190
7.2 Comparison of Nf for two different point confirmation strategies 195
7.3 Comparison of Nf for different strategies for a constrained problem 197
A.1 Data for m = 4 and n = 3 250
A.2 Experimental data used to fit the nonlinear regression model 254
A.3 Optimal solution for the regression model 254
A.4 Data for m = 5 and n = 4 259
A.5 Experimental data used to fit the nonlinear regression model 260

List of Figures

2.1 Sequential improvement of level set value-c 24
2.2 Cuboid derived from limited number of confirmed points 30
2.3 Cuboid defined by the smallest objective function values 33
2.4 Nf for cuboids defined by different level set value criteria 35
3.1 Connected and partition sets 51
3.2 Level set "volume" in one dimensional problem with and without partitioning 53
3.3 Single and multiple linkage clustering methods 58
3.4 Stepwise dendrogram construction 60
3.5 Cluster analysis approach for a two variable problem 64
3.6 Modified penalty function 70
3.7 Relaxation of constraints 73
3.8 Shift of cuboid with skewness adjustment 77
3.9 Contrast between cuboid and rhombohedron in two variable space 80
3.10 Point generation in a rhombohedron search domain 82
4.1 The LSP algorithm as implemented in this thesis 85
4.2 The Road Runner function in one dimension to show the influence of a 90
4.3 The Road Runner function in two dimensions 93
4.4 Treatment basin 97
4.5 Two-looped water supply system 103
4.6 10-member truss 108
4.7 Detailed results for Venkayya's 10-member truss problem 110
4.8 38-member truss 112
4.9 Rainfall components 116
4.10 Optimal unit hydrographs for Little Walnut Creek 120
4.11 Alkylation process 130
4.12 Two-boiler Turbine-Generator combination 137
5.1 Local optima suggested by the disappearance of a point cluster 148
5.2 Nf ~ I plots 150
5.3 Rastrigin function - Implied local optima by Nf ~ I plot (Actual screen dump) 152
5.4 Screen display for the Road Runner function, two dimensional case (Actual screen dump) 153
5.5 Level set value-c versus iteration-I 155
5.6 Low Nkeep search for the Road Runner function, demonstrating discontinuity in the c ~ I plot (Actual screen dump) 156
5.7 Interpretation of simple c ~ I and Nf ~ I curves, unimodal and multimodal objective functions 159
5.8 Interpretation of more complex Nf ~ I and c ~ I plots 160
5.9 The crescent shaped feasible region of Subrahmanyam's problem [Problem 2-12-0-4, Appendix A] 170
5.10 Acceptance set volume change after an iteration while cuboid remains constant 172
7.1 Deletion of a point in dynamic graphics 204
7.2 Linking of points for a 4 variable problem [Problem 4-1-0-3] 205
7.3 Points in the brush highlighted in all the panels [Becker, 1987] 206
8.1 Plots of confirmed points after convergence criterion is met 212
8.2 Plots of confirmed points at convergence within the initial cuboid for a 6 variable problem (Actual screen dump) 213
8.3 Plots of confirmed points within the initial cuboid for different c values 214
8.4 Points in the level set, for c = c* and c = 1.01 * c* [Problem 4-6-1-2] (Actual screen dump) 216

Acknowledgement

I wish to express my most sincere gratitude to Dr. William F. Caselton, my teacher and supervisor, who had a profound and positive impact on my academic and professional attitudes. I greatly appreciate his advice, encouragement, guidance and support throughout my graduate studies. This thesis would not exist without his patient efforts and valuable suggestions. I am deeply indebted for his interest in this work, for his invaluable guidance, for the many hours spent on valuable discussions, for his meticulous proof reading, and for his diligence during the research.

My special thanks go to my supervisory committee members, Dr. Alan D. Russell, Dr. S. O. Denis Russell and Dr. Donald W. Thompson, for their valuable suggestions during the course of the research. In spite of their busy schedules, they always found time to review the progress of my work and communicate back with me. They have contributed numerous clarifications of the research.
Their efforts in reviewing this thesis are greatly appreciated.

I would also like to acknowledge my family members, who have always given me moral support and warm affection across the miles. My thanks go to my wife Munira, for the patience and affection she has shown me over the past few years and the understanding she has shown during the travails of my research. Her presence in my life gave added meaning to this whole endeavour.

The financial support of CIDA to cover the expenses of my studies and the role of Associated Engineering International Ltd. in facilitating my stay in Vancouver is also appreciated.

Above all, I praise the Almighty Allah for all my achievements.

Chapter 1

INTRODUCTION

1.1 Nonlinear Programming (NLP)

The equations used by engineers to mathematically describe engineering problems and to describe physical behaviour are predominantly nonlinear. When an explicit optimization objective is involved, a mathematical formulation of the optimization problem will conform to the conventional nonlinear programming problem. The mathematical formulation of such an optimization problem can be written as

Minimize or Maximize f(x)

Subject to:
gi(x) ≤ 0 for i = 1, ..., m
hj(x) = 0 for j = 1, ..., l

where x is an n-tuple vector of independent variables and f(x), g1(x), g2(x), ..., gm(x), h1(x), h2(x), ..., hl(x) are nonlinear functions defined on the Euclidean n space. The function f(x) is known as the objective function while the other two equation types, the gi(x) ≤ 0 and the hj(x) = 0, are the set of constraints. The space or region defined by these constraints is known as the feasible space or the feasible region.

The nonlinearity of any of these equations precludes the use of the well established and reliable Linear Programming methods. Approximation of the nonlinear equations by linear or quadratic expressions might facilitate a solution being found but can substantially alter the problem being solved.
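The general formulation above can be written down directly as code. The following is a minimal illustrative sketch; the particular objective and constraint functions are invented for illustration and are not a design problem from this thesis.

```python
import math

# A small instance of the general NLP form:
#   minimize f(x)  subject to  gi(x) <= 0 (i = 1..m)  and  hj(x) = 0 (j = 1..l)
# The functions below are illustrative placeholders only.

def f(x):
    # Nonlinear objective in two variables
    return (x[0] - 1.0) ** 2 + x[0] * x[1] + math.sin(x[1])

def g(x):
    # Inequality constraints, each required to satisfy gi(x) <= 0
    return [x[0] ** 2 + x[1] ** 2 - 4.0]

def h(x):
    # Equality constraints, each required to satisfy hj(x) = 0
    return [x[0] + x[1] - 1.0]

def is_feasible(x, tol=1e-8):
    # A point belongs to the feasible region when every constraint holds.
    return all(gi <= tol for gi in g(x)) and all(abs(hj) <= tol for hj in h(x))
```

Any candidate solution can then be screened for feasibility before its objective value is compared with others, which is how constrained direct search methods typically proceed.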
Finding a satisfactory solution generally necessitates the use of true nonlinear approaches to preserve the characteristics of the problem. Currently the methods used in nonlinear programming are categorized as direct search or gradient search [Reklaitis et al., 1983]. These two classes of method are discussed in the next sections.

1.1.1 Direct Search Methods

Direct search methods range from simple simultaneous point sampling techniques to more elaborate procedures that involve a coupling of sequential sampling with heuristic hill climbing methods [Reklaitis et al., 1983]. None of the direct search methods require the use of derivatives. Some of the most frequently cited direct search methods in the literature are: Blind search; Grid search; Univariate search; Conjugate direction methods; Powell's conjugate direction methods; the Simplex method and the Complex method.

Many of these methods have elements in common with each other, and the level set method described in this thesis is no exception. Descriptions of each of the above listed methods are included in Appendix B of this thesis for reference purposes.

The blind search, which is also known as simultaneous search, is quite inefficient in its use of function evaluations [Leon, 1966; Reklaitis et al., 1983]. In this strategy, Nf random points are generated and the objective function evaluated at each point. The Nf objective function values are compared and the lowest value is selected as the optimum value.

The size of Nf depends upon the desired probability of successfully finding the global optimum. For a bounded but otherwise unconstrained problem involving n variables and a length of d units on each side of the feasible region, the volume of the feasible region would be d^n. Take a small fraction of the feasible volume, α = (δ/d)^n, where δ is a small length on each side of the bounded feasible region. The size of α is so small that variation of the objective function value within that region can be tolerated.
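The sample size implied by such a target fraction α follows from the standard relation F = 1 − (1 − α)^Nf for the probability F that at least one of Nf uniform random points lands in the target region. A minimal numerical sketch (the function name is illustrative, and δ is taken as 2.5% of d purely for illustration):

```python
import math

def blind_search_samples(F, delta_over_d, n):
    """Random samples Nf needed so that, with probability F, at least one
    sample lands in a target region occupying the fraction
    alpha = (delta/d)**n of an n-variable bounded search domain,
    from F = 1 - (1 - alpha)**Nf."""
    alpha = delta_over_d ** n
    # math.log1p keeps precision when alpha is far below machine epsilon.
    return math.log(1.0 - F) / math.log1p(-alpha)

# Ten variables with delta = 2.5% of d at 0.90 confidence: roughly
# 2.4e16 function evaluations, which is computationally prohibitive.
```

The exponential growth of this count with the number of variables n is the quantitative reason the blind search is dismissed for problems of even moderate size.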
Then for a purely random search, the probability of a single test point falling outside of α would be 1 − α. If the probability of having at least one point in α is F, then, with Nf points,

F = 1 − (1 − α)^Nf

Nf = log(1 − F) / log(1 − α)

and for small α,

Nf ≈ −2.3 log(1 − F) / α

For F = 0.90, which corresponds to a 0.90 confidence of obtaining the global optimum, the computational effort reflected by Nf would be 2.3(d/δ)^n [Reklaitis et al., 1983]. With a value of δ equal to 2.5% of d, a ten variable problem would need about 2.4 × 10^16 function evaluations for 0.90 confidence. This number is prohibitively large from a computational standpoint. The optimal solution can be obtained with probability 1.0 only as the number of sample points approaches infinity. Because of the high number of function evaluations involved, a blind search is not recommended for solving problems of even moderate size.

1.1.2 Gradient based methods

Gradient based methods make use of derivatives to determine the search direction for optimization. One of the common gradient based methods, the steepest descent method, uses only the first derivatives of the objective function for unconstrained problems [Edgar & Himmelblau, 1988]. The gradient vector at a point gives the direction of the greatest decrease in f(x). For a minimization problem the search direction is specified by the negative of the gradient, i.e.

sk = −∇f(xk)

where sk is the search direction at point k and ∇f(xk) is the derivative of f(x) at xk. At any stage of minimization the transition from one point to the next is given by

xk+1 = xk + Δxk = xk + λ sk = xk − λ∇f(xk)

where Δxk is the vector of increments from xk to xk+1, and λ is a scalar quantity that determines the step length in direction sk; it can be either a predefined fixed value or its value can be optimized at each iteration to improve the speed of the convergence on the optimum point.
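The steepest descent transition xk+1 = xk − λ∇f(xk) can be sketched as follows. This is a minimal illustration with a fixed step length λ and a finite-difference gradient, not an implementation from the thesis:

```python
def num_grad(f, x, eps=1e-6):
    # Central-difference approximation of the gradient of f at x.
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((f(xp) - f(xm)) / (2.0 * eps))
    return g

def steepest_descent(f, x0, lam=0.1, iters=200):
    # x_{k+1} = x_k + lam * s_k, with search direction s_k = -grad f(x_k)
    x = list(x0)
    for _ in range(iters):
        g = num_grad(f, x)
        x = [xi - lam * gi for xi, gi in zip(x, g)]
    return x

# On a quadratic bowl the iterates converge to the unique minimum (1, -2).
bowl = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
xstar = steepest_descent(bowl, [0.0, 0.0])
```

A fixed λ is the simplest choice; optimizing λ at each iteration (a line search) typically reduces the number of iterations, which is the trade-off noted above.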
In some cases it is possible to simplify a nonlinear function by making use of the quadratic approximation at any point xk,

f(x) ≈ f(xk) + ∇Tf(xk) Δxk + 0.5 ΔxkT H(xk) Δxk

where H(xk) is the Hessian matrix of f(x). Some methods then use this second-order approximation of the objective function and the information obtained from the second partial derivatives of f(x) with respect to the independent variables.

The quadratic approximation of f(x) at xk is differentiated with respect to each independent variable and the resulting expression equated to zero to define a stationary point:

∇f(x) = ∇f(xk) + H(xk) Δxk = 0 (1.1)

The next search point, at point k + 1, is obtained from this expression as

xk+1 − xk = Δxk = −[H(xk)]⁻¹ ∇f(xk), or

xk+1 = xk − [H(xk)]⁻¹ ∇f(xk)

where [H(xk)]⁻¹ is the inverse of the Hessian matrix H(xk). Both the step length and direction of search are determined from the same expression, Equation 1.1. Only one step would be necessary to reach the optimum if f(x) were a true quadratic equation. But, since in general this is not the case, there would be multiple steps.

1.2 Limitations of existing NLP methods

For simple two variable problems, elementary direct search methods are often quite satisfactory, but they are neither efficient nor reliable for the higher dimensionality problems which typically arise in engineering [Edgar & Himmelblau, 1988].

Gradient procedures that use second order information are superior to those that use first order information only. But the danger is that the second order information may be only approximate, as it is based not on second derivatives but on approximations to second derivatives, with the consequence that it is no longer the original problem that is being solved [Edgar & Himmelblau, 1988].

All gradient procedures start their search from a point and follow a single path until the convergence criterion is met. The point at which the convergence criterion is met is regarded as the optimum point.
If a different starting point is used to perform the search again, it is possible that it will lead to a different optimum point. This reveals that solutions found with gradient based optimization methods are often local optima. For a solution point x* to be a local optimum (a local minimum in this case) the objective function value at that point should be the least of all the objective function values in the neighbourhood. That is,

f(x*) ≤ f(x)

for all x with ||x − x*|| ≤ ε, where ε is a small value.

Unfortunately, except in certain mathematically simple cases, there is no mechanism which can confirm the global nature of the optimum point obtained with gradient methods. The global minimum is defined as the smallest of all the local minima in the region of interest. In some particular cases, the global minimum can appear at multiple points.

In order to apply gradient methods, the functions have to be continuous and differentiable so that their derivatives can be evaluated. But in engineering design we are often confronted with problems which involve non-differentiable functions, discontinuous functions and/or discrete variables. In such cases the use of gradient based methods is limited.

Nonlinear optimization problems may have more than one global optimum solution. Furthermore there may be other solutions which, while not globally optimal, produce values of the objective function close to the global optimum value and are of considerable practical interest. Unfortunately almost all existing NLP algorithms are incapable of identifying the full set of optimal points and provide little or no indication of near optimal solutions.

Information obtained as a byproduct of gradient searches, primarily in the form of Lagrange multipliers, Hessian and Jacobian matrices, does provide information, but only in the immediate vicinity of the optimal solution.
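The dependence of a gradient search on its starting point is easy to demonstrate on a multimodal function. The following hypothetical one-dimensional illustration (the function and names are ours, not from the thesis) runs the same fixed-step descent from two starts:

```python
def f(x):
    # A two-well function: local minima near x = -1.03 and x = 0.97;
    # the left well is the deeper (global) one.
    return x ** 4 - 2.0 * x ** 2 + 0.2 * x

def descend(x, lam=0.01, iters=2000, eps=1e-6):
    # Fixed-step gradient descent from a single starting point,
    # using a central-difference derivative.
    for _ in range(iters):
        grad = (f(x + eps) - f(x - eps)) / (2.0 * eps)
        x -= lam * grad
    return x

# Two different starts settle in two different local minima, so a
# single gradient search cannot confirm the global optimum.
left = descend(-2.0)
right = descend(2.0)
```

Restarting from many points raises the chance of finding the global well, but, as noted above, no finite set of restarts can confirm global optimality.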
But this information can also be difficult to interpret, especially in the context of the original engineering problem and when the global nature of the solution is in question. It provides no clues concerning multiple global optima or near optimal solutions. In the event of an NLP method failing to confirm even a local optimum, the user is left with no suggestion for his next move.

Much of the information we get from the intermediate stages of gradient based NLP searches has no pertinence to the engineering problem being solved. It simply indicates the outcome of some mathematical manipulations which are incidental to the real world problem and therefore have no real relevance to an engineering practitioner. Most of the existing NLP methods will perform well on certain specific types of problems, but these are rarely representative of the more general nonlinear optimization problems faced by engineers. Thus, in general, existing gradient based or direct search NLP methods have limitations in addressing real world problems.

Other optimization techniques, which represent departures from the two classes described above, have also emerged in the last decade. Genetic Optimization, for example, which "mutates" binary strings, has already appeared in the engineering literature [Goldberg, 1989], while Random Tunnelling and Simulated Annealing have been proposed in the last few years. These methods have some limitations in their application; for instance, Random Tunnelling converges rapidly for problems involving only one variable but good results are not expected for higher dimensional problems [Kan and Timmer, 1989], and Simulated Annealing is reported to be inefficient [Torn & Zilinskas, 1988], [Corana et al., 1987] and [Kan & Timmer, 1989].

1.3 Global optimization

When dealing with nonlinear optimization problems, it is possible that the problem possesses many local optima.
Because of the highly nonlinear nature of the equations often involved, this phenomenon is quite common with engineering system designs [Luus & Jaakola, 1973]. For general engineering optimization purposes a method is required that can identify the overall optimal solution among the many alternative local solutions. Such a method is known as a global optimizer. Its aim is to identify the smallest of all the minima in the region of interest, and it may not necessarily evaluate all of the local minima.

A sufficient gradient related criterion to positively confirm that a global optimum value has been achieved at a point does not exist. The only way to confirm that a point is a global solution is to evaluate the objective function at that point and compare it with the objective function values at all other points.

In any optimization method the search for the optimum solution is performed by evaluating the objective function at some trial points within the search domain. A number of points, obviously more than one, have to be used to reach the final solution. The distribution of these trial points within the search domain is influenced by two distinct optimization goals: the global reliability goal and the local refinement goal [Torn & Zilinskas, 1988].

The global reliability goal is based on the assumption that the global optimum could be located anywhere in the search domain, and therefore assumes that no part of the search domain can be neglected if the global optimum is sought. Without any refinement in strategy it leads to search procedures where the trial points are distributed uniformly over the entire search domain.
On the other hand, the local refinement goal is based on the assumption that the probability of detecting a point with an improved solution is higher in the neighbourhood of a point with a relatively low objective function value than in the neighbourhood of a point with a relatively high objective function value. Again, without refinement, this goal has a natural tendency to generate sequences of points with decreasing function values that converge to local minima.

Most global optimization methods focus on exploring the whole of the search domain, but use local search strategies to numerically refine solutions. The various global optimization methods differ in how the search strategy shifts from one goal to the other.

The performance evaluation of local and global minimization methods differs in principle. Local methods are evaluated on the basis of how often and how efficiently they converge on a confirmable optimal point of any kind. But a global method is evaluated on two distinct qualities: the first addresses the same issues as for local optimization methods, while the second, called reliability, measures the capacity to identify the true global optimum [Wang & Luus, 1978]. Clearly, because of these dual requirements and the complex nonlinear problems addressed, global techniques must be tested more comprehensively than gradient methods to prove their effectiveness.

Rather than strictly following mathematically formulated procedures, global optimization methods are more inclined to adopt heuristic approaches than are gradient methods. For example, if the global optimum is difficult to find, it may be attractive to deal with an approximation of the objective function which yields a computational advantage over the original objective function. A second example of a heuristic approach is subdividing the search domain into smaller regions at some stage of the search and then performing the search in each of the subregions.
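One common way of combining the two goals described above is uniform sampling over the whole domain (global reliability) followed by shrinking random steps around the best sample (local refinement). The sketch below is a generic illustration of that idea under our own assumptions; it is not the LSP algorithm, and all names are illustrative:

```python
import math
import random

def two_phase_minimize(f, lo, hi, n_global=500, n_local=300, seed=1):
    rng = random.Random(seed)
    # Phase 1 -- global reliability goal: trial points distributed
    # uniformly over the whole search interval [lo, hi].
    best = min((rng.uniform(lo, hi) for _ in range(n_global)), key=f)
    # Phase 2 -- local refinement goal: random steps of shrinking
    # size around the best point found so far.
    step = (hi - lo) / 10.0
    for _ in range(n_local):
        cand = best + rng.uniform(-step, step)
        if lo <= cand <= hi and f(cand) < f(best):
            best = cand
        step *= 0.98
    return best

# One-dimensional Rastrigin function: many local minima near the
# integers, with the global minimum at x = 0.
rastrigin = lambda x: x * x - 10.0 * math.cos(2.0 * math.pi * x) + 10.0
```

How, and how quickly, a method hands control from the first phase to the second is precisely the design decision on which the various global optimization methods differ.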
Justifications for introducing heuristics into optimization methods are discussed in [Zanakis & Evans, 1981], where the following is stated:

"The need for good heuristics in both academia and business will continue increasing fast. When confronted with real world problems, a researcher in academia experiences at least once the painful disappointment of seeing his product, a theoretically sound and mathematically 'respectable' procedure, not used by its ultimate user. This has encouraged researchers to develop new improved heuristics and rigorously evaluate their performance, thus spreading further their usage in practice, where heuristics have been advocated for a long time."

1.4 Optimization needs for the professional

In addition to the limitations of the existing methods in dealing with nonlinear optimization problems, mentioned in Section 1.2, many of them have until recently been confined to implementation on mainframe computers. The fact that access to mainframe computers has been severely limited for most civil engineers has deterred practitioners from using the existing NLP optimization methods. But the substantial improvement in access to computers in recent years, through the introduction of personal computers
But there are other problems still facing the practitioner which willnot be solved simply by improved access.None of the nonlinear optimization methods which have been widely implementedto date offer any absolute assurance of finding the global optimum except under veryideal conditions. Generally they rely on the user to initiate the search from a number ofdifferent starting points so that the chances of stumbling onto the global optimum areenhanced. No systematic way of selecting starting points to ensure finding the globaloptimum is available and the responsibility for the overall global search strategy and itssuccess is left entirely to the user.Perhaps, most seriously of all, a practising civil engineer is not generally trained orskilled in numerical analysis. He therefore faces a considerable investment in time andmoney to confidently embark on nonlinear optimization to solve a real world problem.If he makes this investment then he still faces the shortcomings of existing methodsdescribed in the two previous paragraphs. Although not an uncommon topic in thecivil engineering research literature, there has been evidence of only rare and limiteduse of optimization methods in civil engineering practice. Clearly existing nonlinearoptimization methods have had insufficient appeal to convince civil engineers that theirtime should be invested in that direction. This prompted the research, reported in thisthesis, into a methodology based on the exploration of level sets, that has received veryChapter 1. INTRODUCTION 11little attention. Amongst many of the other positive attribntes explained snbsequentlyin this thesis, it has the potential of being a far more appealing nonlinear optimizationmethodology to the person who is not a specialist in nnmerical analysis.1.5 Earlier work on level set optimizationThe theory discussed in [Chew & Zheng, 1988] under “Integral global optimization” opensa new alternative to global optimization based on level set theory. 
But some other authors have expressed opinions that the level set method is not designed for solving practical problems [Torn & Zilinskas, 1988]. This is because, from the practitioner’s point of view, there are certain issues, important in engineering problems, which Chew and Zheng did not address. As described below, these issues concern the superiority of the method, computational effort, multiple optima and the significance of intermediate outputs.

Even though the authors hint that there had been improvements made in their work, the improvements are neither enumerated nor documented. This remains a hindrance in identifying their contribution to optimization methodology and assessing the superiority of the level set method over other existing methods.

A general statement was given stating that less computational effort is expended with the level set method compared to the pure random search method. Apart from this statement there was no systematic comparison between the level set method and the predominant (gradient) methods. The lack of comparative performance tests with respect to efficiency, reliability or global convergence provided no incentive for others to adopt level set optimization.

The authors admitted that only problems which have a single global optimum had been fully addressed, and made only a minor comment on the multiple global optima case. There was no discussion of recognizing near global optimum solutions. But the identification of multiple optima and near optimal solutions are some of the important benefits to practitioners which are identified and investigated in this thesis.

The significance of the level set method output values was also not recognized in [Chew & Zheng, 1988], and consequently the benefits of a graphical interface were overlooked.

1.6 Level Set Programming (LSP)

The formal mathematical presentation of the methodology upon which LSP is based has been presented under the title “Integral Global Optimization” [Chew & Zheng, 1988]. They claim equal or better performance when compared with some of the best “conventional” gradient search NLP methods. Their experience in solving a series of NLP problems supported this finding.

Level Set Programming (LSP) is classified as a direct search method of optimization. Like any other direct search optimization method, LSP avoids gradient evaluations and relies solely on evaluations of the objective and constraint functions at solution points. In comparison with most direct search methods it carries a much larger number of solutions at any one time, and these solutions are dispersed over larger regions of the design space.

LSP utilizes the simple set theory concept of a level set, which will be defined in Chapter 2. It uses estimates of statistical moments of a level set to assess the level set properties, to guide the search algorithm, as well as to measure convergence on a solution optimum. LSP adopts a global strategy where, at least in theory, the whole of the search domain is explored in the search for the global minimum.

The most significant features of LSP from a conventional engineering standpoint are: LSP is a global search method; the search does not follow a single path and is not influenced by a single point; it covers the whole search domain; its reliability in identifying the global solution is high; the method is conceptually simple; all the computations and numerical
results generated during the search are meaningful; and an elementary, and therefore fast, graphical interface can display much of the useful information at any stage of the search. Based on our experience to date, its computational burden is of the same order as gradient based search methods, while its assurance of a successful global search is substantially better.

The name Level Set Programming is adopted to emphasize the central level set feature. The word programming, as appended here, appears to be justified in that the implementation of the methodology has an algorithmic structure with a rigorous theoretical criterion for convergence on the global optimum.

Although the theory of level set optimization already existed prior to this research, its full implementation and testing had never been documented. In this research, a comprehensive level set optimization algorithm named Level Set Programming (or LSP) has been developed.

Refinements to the level set approach, which helped increase the chances of identifying the global solution, increased computational efficiency and enabled the method to handle a wider range of problems, were investigated and implemented. These refinements included the use of cluster analysis, penalty functions and constraint relaxation. It was also demonstrated that conventional engineering routines, which are necessary for evaluating the objective function or constraints (such as structural analysis methods and the Hardy Cross pipe network analysis method), could be readily embedded into the optimization method. A set of control parameters was also introduced into the algorithm and their optimal values investigated. Furthermore, it was found that these parameters could be adjusted during the course of the search to increase the reliability and efficiency of the search.
Graphical displays of intermediate and final search results were also developed during the research, and these have opened up a new perspective on the process of optimization. The human-machine interaction during optimization was also significantly improved with these graphical displays, which guide the user towards search parameter adjustments for increased efficiency of search as well as facilitate a better understanding of the optimization process as it unfolds.

The strength of LSP has been demonstrated by solving a wide range of mathematical problems (about 200) and many difficult and realistic engineering problems (about 20) which have been presented in the literature as being challenging for optimization methods. In addition, a parametrically adjustable mathematical test problem was developed during the course of the research to further strengthen confidence in LSP’s global optimization capabilities and to challenge global optimization methods in general.

1.7 Thesis outline

The theory behind level set programming and its implementation is explained in Chapter 2. Refinements in the LSP scheme and the introduction of some heuristic methods are then explained in Chapter 3.

Chapter 4 explains the general strategy and performance of LSP in solving mathematical and engineering problems. The engineering test problems’ formulation and their solutions are presented in the same chapter. The mathematical problems’ formulation and solutions are documented in Appendix A.

The experience gained in developing LSP and in solving test problems is explained in Chapter 5. This chapter explains the interpretation of intermediate results, discusses the use and adjustments of LSP parameters and the relationship between the volume of the search domain and computational overhead, and finally identifies the principal difficulties which
might be encountered while implementing LSP.

The established approaches for evaluating NLP methods and some new proposed criteria are given in Chapter 6. The most important factors in the field of optimization from an engineering practitioner’s point of view are also enumerated in the same chapter and contrasted with the established approaches to evaluating NLP methods.

Chapter 7 explains the operational use of LSP. The subroutines in the computer program developed and its input/output procedures are discussed.

Chapter 8 discusses sensitivity analysis with LSP. A contrast between the definitions of sensitivity analysis in the classical methods and in the LSP context is presented. Interpretations of sensitivity information from LSP plots are also included. Finally, Chapter 9 gives the general conclusions concerning LSP’s effectiveness and suitability as an engineering tool.

Chapter 2

LEVEL SET PROGRAMMING

This chapter outlines the basic theory of LSP and then considers extensions and variations of the algorithm which were investigated in the course of this research.

2.1 Basic theory of Level Set Programming (LSP)

This section summarizes the essential theory for the implementation of LSP as described in this thesis. More extensive theory is provided in [Chew & Zheng, 1988] and includes details such as the properties of higher moments (third, fourth, etc.) of the level set, which do not appear to have any bearing on engineering implementation and are therefore omitted here.

The general formulation of an optimization problem is

    Minimize f(x)
    subject to
        g_i(x) ≤ 0    for i = 1, 2, ..., m
        h_j(x) = 0    for j = 1, 2, ..., l
        x ∈ X                                                  (2.1)

where

    f(x) is a single valued objective function
    g_i(x) are inequality constraints
    h_j(x) are equality constraints
    X is a subset of Eⁿ, and
    x is a vector of n components x_1, ..., x_n,

and all functions are defined on Euclidean n-space, Eⁿ.

The optimization problem given in Equation 2.1 is a nonlinear programming problem when any or all of the constraints or the objective function are nonlinear. In the applications of interest here, x corresponds to a set of design or decision variables.

For an unconstrained but bounded minimization problem the goal is to identify x*, which is called the global optimum point, or a set of global optimal points, and c*, which denotes the global optimum value f(x*), where the relation f(x) ≥ f(x*) = c* would be true for all feasible points x. Thus the minimization problem is to find the minimum of f(x) over S_b, where S_b is the region defined by the upper and lower bounds of each variable x_i, so that

    c* = min{f(x) | x ∈ S_b}                                   (2.2)

A level set, which is at the heart of the optimization methodology in this thesis, is defined as the set of points which provide an objective function value below or equal to some specified value. If this specified value is represented by c, then the associated level set H_c can be defined by

    H_c = {x | f(x) ≤ c}.                                      (2.3)

Bounds on each component of x are implicit, but constraints are not considered in this definition.

Certain first and second moments of the objective function values of the points contained in the level set play important roles in LSP and are defined below. The following
expressions are for continuous integrable functions on S_b only, adapted from [Chew & Zheng, 1988].

Mean value over the level set H_c

    M(f, c) = (1/μ(H_c)) ∫_{H_c} f(x) dμ                       (2.4)

and for c ≥ c*,

    M(f, c) ≤ c

while for constants c_2 ≥ c_1 ≥ c*,

    M(f, c_2) ≥ M(f, c_1)

where Chew & Zheng defined μ as a measure of the level set, and c_1 and c_2 represent any two constants greater than or equal to the global optimum value.

Variance over the level set H_c

Similarly, Chew and Zheng specified the variance and a modified variance over the level set H_c as

    V(f, c) = (1/μ(H_c)) ∫_{H_c} [f(x) − M(f, c)]² dμ

    VM(f, c) = (1/μ(H_c)) ∫_{H_c} [f(x) − c]² dμ               (2.5)

The modified variance is slightly easier to compute and has the same general properties as the variance.

For the discrete sampling case, Equations 2.4 and 2.5 can be written in the form of summations to provide estimates of M(f, c) and VM(f, c):

    M(f, c; S) = (1/Σ_{x_j∈H_c} a_j) Σ_{x_j∈H_c} a_j f(x_j)

    V(f, c; S) = (1/Σ_{x_j∈H_c} a_j) Σ_{x_j∈H_c} a_j [f(x_j) − M(f, c)]²

    VM(f, c; S) = (1/Σ_{x_j∈H_c} a_j) Σ_{x_j∈H_c} a_j [f(x_j) − c]²       (2.6)

In practice, there would be only a limited number of points in a level set; therefore, Equations 2.4 and 2.5 are rewritten to reflect this as follows.

    M(f, c) = (1/N_keep) Σ_{j=1}^{N_keep} f(x_j)               (2.7)

    V(f, c) = (1/N_keep) Σ_{j=1}^{N_keep} [f(x_j) − M(f, c)]²

    VM(f, c) = (1/N_keep) Σ_{j=1}^{N_keep} [f(x_j) − c]²       (2.8)

where N_keep, the number of points at which f(x) is calculated for a discrete sampling scheme, is the number of points in the level set, and plays the role of the measure μ(H_c) in Equation 2.4. The equivalent moments for the integer variable or discontinuous function cases are computed by evaluating the objective function at discrete points.

Convergence on the global optimum

Take a large real number c_0 such that the level set defined by this number over the function f(x) is non-empty, that is

    H_{c_0} = {x | f(x) ≤ c_0} ≠ ∅.

Define a decreasing series c_k by

    c_{k+1} = M(f, c_k)

so that

    c_0 > c_1 > ... > c_k > c_{k+1} > ... > c*

where c* is the global minimum value of f(x). Now define a decreasing sequence of level sets H_{c_k} as

    H_{c_k} = {x | f(x) ≤ c_k}

then

    H_{c_0} ⊇ H_{c_1} ⊇ ... ⊇ H_{c_k} ⊇ H_{c_{k+1}} ⊇ ...
⊇ H_{c*}                                                       (2.9)

where H_{c*} is the set of global minima of f(x).

Both c_k and H_{c_k} are decreasing and are bounded below. The lower bounds can be defined by

    c* = lim_{k→∞} c_k                                          (2.10)

    H_{c*} = lim_{k→∞} H_{c_k} = ∩_k H_{c_k}                    (2.11)

Equations 2.9, 2.10 and 2.11 are important properties of level set values and level sets and are critical to LSP’s optimization capabilities. At a global optimal point x*, where c* = f(x*) is the corresponding global minimum value, the following are also important properties. For c ≥ c*:

    M(f, c) ≤ c
    M(f, c*) = c*
    V(f, c) ≥ 0
    VM(f, c) ≥ 0
    V(f, c*) = 0
    VM(f, c*) = 0

The fact that M(f, c), V(f, c) and VM(f, c) are always single valued quantities, regardless of the dimensionality of x, is exploited in the LSP algorithm. The first moment is used to redefine a new level set value-c and advance the LSP search. The properties of V(f, c) and VM(f, c) provide a theoretical global convergence criterion. V(f, c) and VM(f, c) are always non-negative and approach zero from the positive direction as the value of c approaches c*. Either V(f, c) or VM(f, c) can be used for search termination, but in this thesis VM(f, c) was used throughout the test problems.

A closed region in the decision domain is defined by the set of constraints, which includes the variable bounds, and is specified here as the feasible region S. In a constrained minimization problem the goal is to identify x* and c*, where the relation f(x) ≥ f(x*) = c* would be true for all feasible points x. Thus the minimization problem is to find the infimum of f(x) over S, so that

    c* = inf_{x∈S} f(x).                                        (2.12)

Assuming that there is a real number c such that the intersection of the level set H_c and S is non-empty, that is H_c ∩ S ≠ ∅, where ∅ signifies the null set, then the optimal solution for a constrained problem can be restated as

    c* = min_{x ∈ H_c ∩ S} f(x).                                (2.13)

The level set associated with c* is defined by

    H_{c*} = {x | f(x) = c*; x ∈ S}                             (2.14)

which constitutes the set of global optima.
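The discrete moment estimates of Equations 2.7 and 2.8, and the decreasing series c_{k+1} = M(f, c_k) of Equations 2.9 to 2.11, can be sketched in a few lines of code. This is only an illustrative sketch, not the thesis program; the function and variable names are invented for the example.

```python
def level_set_moments(f_vals, c):
    """Discrete estimates of M(f,c), V(f,c) and VM(f,c) (Equations 2.7-2.8)
    from the objective function values of the points confirmed in H_c."""
    in_set = [v for v in f_vals if v <= c]            # values of points in H_c
    n = len(in_set)                                   # plays the role of N_keep
    m = sum(in_set) / n                               # M(f, c)
    var = sum((v - m) ** 2 for v in in_set) / n       # V(f, c)
    mod_var = sum((v - c) ** 2 for v in in_set) / n   # VM(f, c)
    return m, var, mod_var

# The series c_{k+1} = M(f, c_k) decreases towards c* (Equation 2.10):
f_vals = [0.0, 1.0, 2.0, 3.0, 4.0]
c = max(f_vals)
for _ in range(3):
    c, _, _ = level_set_moments(f_vals, c)
# c has fallen from 4.0 through 2.0 and 1.0 to 0.5
```

Note how the modified variance approaches zero as c approaches the smallest sampled value, which is the property used later as the convergence indicator.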
Restated in terms of level sets, the global optimization problem is to find c* and H_{c*}. In any point sampling scheme only a limited number of points can be realized.

The existence of linear or nonlinear constraints can be accommodated by defining a set of points which fulfil both the condition f(x) ≤ c and the feasibility of x. Feasible points which fulfil the f(x) ≤ c condition are called acceptable points in this thesis. The acceptance set, which consists only of acceptable points, is defined as

    H̃_c = H_c ∩ S = {x | f(x) ≤ c; x ∈ S}.                     (2.15)

Equation 2.11 can be extended to accommodate constrained problems. The lower bound of the acceptance set at convergence on the global optimum c* is

    H_{c*} ∩ S = {x | f(x) ≤ c*; x ∈ S}

    H_{c*} ∩ S = lim_{k→∞} (H_{c_k} ∩ S)                        (2.16)

2.2 Overview of the LSP algorithm - one dimensional case

Figure 2.1 conveys the essential idea behind LSP. It shows a nonlinear function f(x) which is to be minimized within specified bounds x^L and x^U of the single decision variable x. The problem is otherwise unconstrained.

If x is a continuous variable then there will be an infinite number of elements in the level set within the lower and upper bounds. It will simplify the exposition of the remaining theory, and more accurately reflect its implementation, if we think of level sets only in terms of samples at a set of discrete values of x or, equivalently, at a set of points in the decision space. This is not to suggest that x is necessarily a discrete variable, although discrete variables are not excluded in this formulation.

To determine an initial level set, a set of feasible points is first established and the objective function evaluated at each point.
The initial level set value-c is then set equal to the highest objective function value, that is

    c = Max{f(x_j)}

Those points which yield f(x) values either equal to or less than c, and do not violate the bounds, then constitute the level set H_c for level c, that is H_c = {x | f(x) ≤ c}. Figure 2.1(a) shows the randomly generated points within the initial bounds and the size of the level set H_c for the corresponding level set value-c.

[Figure 2.1: Sequential improvement of level set value-c. Panels (a)-(d) show f(x) between the bounds x^L and x^U and the optimum point, with symbols for: initial points confirmed in the level set H_c; initial points which were rejected from the level set H_c; points retained in the level set H_{c'}; and new points generated in the level set H_{c'}.]

Estimates of the mean and variance of the f(x) values corresponding to the x values in the level set H_c are established using Equations 2.4 and 2.5 as

    M(f, c) = (1/N_keep) Σ_{x_j∈H_c} f(x_j)

    V(f, c) = (1/N_keep) Σ_{x_j∈H_c} [f(x_j) − M(f, c)]²

    VM(f, c) = (1/N_keep) Σ_{x_j∈H_c} [f(x_j) − c]²

The search or algorithmic aspect of LSP is driven by using M(f, c) to provide a revised and lower value for c, say c', at a successive iteration by simply setting c' = M(f, c). The level set value c' is then used to establish a new “improved” level set H_{c'}. Once the new level set value c' is identified, new points are randomly generated in the improved level set H_{c'} to bring the total number of points in the level set to N_keep; see Figure 2.1(b). It is important to note that the level set H_{c'} will always be a subset of the level set H_c (see Equation 2.9), although these subsets may not necessarily be connected (see definition in Chapter 3, section 3.1). Once H_{c'} is established then the value of M(f, c') can be obtained.
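The one-dimensional procedure of Figure 2.1 can be written out as a short program. The sketch below is a minimal illustration only, not the thesis implementation: it uses simple rejection sampling inside the interval spanned by the surviving points, and the names and parameter values (n_keep, the tolerance) are invented for the example.

```python
import random

def lsp_1d(f, x_lo, x_hi, n_keep=50, tol=1e-9, seed=1):
    """Minimal 1-D LSP sketch: sample, set c' = M(f, c), keep points with
    f(x) <= c', and replenish the sample inside the surviving interval."""
    rng = random.Random(seed)
    pts = [x_lo + rng.random() * (x_hi - x_lo) for _ in range(n_keep)]
    c = max(f(x) for x in pts)                 # initial level set value-c
    while True:
        c = sum(f(x) for x in pts) / len(pts)  # c' = M(f, c)
        pts = [x for x in pts if f(x) <= c]    # surviving points define H_c'
        lo, hi = min(pts), max(pts)            # interval (1-D cuboid) bounds
        while len(pts) < n_keep:               # regenerate discarded points
            x = lo + rng.random() * (hi - lo)
            if f(x) <= c:
                pts.append(x)
        vm = sum((f(x) - c) ** 2 for x in pts) / len(pts)  # VM(f, c)
        if vm < tol:                           # VM -> 0 signals convergence
            return c, pts

# Example: a simple convex function with its minimum at x = 1.
c_star, level_set = lsp_1d(lambda x: (x - 1.0) ** 2, -3.0, 3.0)
```

Run on this example, c_star converges towards the global minimum value 0 and the surviving points cluster around x = 1, mirroring the shrinking level sets of Figure 2.1(a)-(c).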
The level set value can again be revised to c'' = M(f, c') and a new level set H_{c''} established. This procedure of estimating M(f, c) for the current level set, revising the level set value-c, and establishing the new corresponding level set, is then repeated with an obvious progressive reduction in the level set value-c.

Finally, the search terminates at the global minimum point x* with objective function value f(x*) = c*, as shown in Figure 2.1(c). The value of V(f, c) acts as an indicator of convergence on the global optimum value c*. Most significantly, the convergence of V(f, c) on zero at the global optimum is not affected by the existence of local optima or by the existence of a number of separate global optimal solutions. The optimal level set H_{c*} can include any number of points some distance apart but, as V(f, c) measures dispersion only in f(x), the value of V(f, c*) would still be zero. In the special case where the problem has a single global optimum, H_{c*} reduces to a single point.

2.3 LSP Implementation

In an application of LSP the predominant computational load is in evaluating the objective function and testing the constraints for feasibility at the many trial solution points which are prospective members of the sequence of level sets. This imposes a practical limit on the number of points which can be confirmed to lie within any one level set. At the same time it is essential that this number be large enough to ensure that the sampling is dense enough to reduce, to an acceptably low value, the risk of overlooking a global solution. More precisely, we would like the sample to be dense enough to avoid overlooking important features in the level set boundaries which are being inferred from the level set sample. The algorithmic scheme in LSP exploits the fact that a substantial number of the N_keep sample x points present in H_c will also be present in the level set H_{c'} at the next iteration.
In practice, using the mean value criterion for the revised level set value as indicated in section 2.2, the number of these “surviving” points has been found to be close to N_keep/2. The set of surviving points is used to provide a cuboidal approximation of the new level set boundary (see “Cuboid approximation” below). This approximate boundary is then used to support the efficient regeneration of acceptable points in the new level set. To infer the new level set boundary approximation with acceptable accuracy, and to minimize the risk of omitting any global solution, it is necessary for the number of points in the level set sample to be restored to N_keep prior to each revision in the level set value-c.

In the next subsections the important features of LSP implementation are discussed in detail. In the first subsection, approximation of the search domain at every iteration is addressed. Then, alternative methods of sample point generation are explored to identify a more reliable and efficient search. Reliability and efficiency of the LSP search are discussed in section 2.3.3. Initialization of the LSP search and a detailed description of the algorithm are presented in subsection 2.3.4. While V(f, c) is the theoretical criterion for convergence, other search termination criteria are reviewed in section 2.3.5. Finally, subsection 2.3.6 explains how the implementation of LSP is extended from the unconstrained case to the general constrained problem. This subsection describes a variety of techniques for handling constraints in the LSP search.

2.3.1 Cuboid approximation of level set boundary

A cuboid, which is also known as a hypercube or hyper-rectangle, is defined here as a rectangular parallelepiped in the decision domain. The sides of a cuboid are always parallel to the variable axes.
It is defined by

    D = {x | b_i^L ≤ x_i ≤ b_i^U,  i = 1, 2, ..., n}

where b_i^L and b_i^U are the lower and upper cuboid bounds respectively for each variable i.

The points in a current level set can be used to infer the boundaries of a minimal cuboid which envelops the level set. This is computationally far more economical than attempting to generate more accurate but nonlinear estimates of a minimal envelope of the level set. It should, at the same time, be recognized that more precise boundary estimates using, say, ellipsoids might provide greater computational efficiency when generating new level set points. The simplicity of the cuboid approach is appealing, however, and the use of nonlinear envelopes has not been explored in this work.

A general n dimensional optimization problem will be assumed, so that the variable vector will be x = (x_1, x_2, ..., x_n). To initiate the search, N_keep points are first generated within the feasible region. The highest objective function value defines the initial level set value-c and a corresponding level set H_c.

The first general iterative cycle begins by reassigning the level set value to c' = M(f, c) (where the prime symbol denotes the next iteration). Points whose objective function value is greater than c' are discarded. Only n_r points (where in practice n_r ≈ N_keep/2) would then remain in the new level set H_{c'}. The minimum and maximum values for the i-th variable (i.e. the i-th dimension) in the new level set, estimated from the n_r surviving points, would be b_i^{'L} = Min{x_i^1, x_i^2, ..., x_i^{n_r}} and b_i^{'U} = Max{x_i^1, x_i^2, ..., x_i^{n_r}} respectively. Thus the vector of lower bounds for all variables would be

    b'^L = {Min_{H_{c'}}[x_1], Min_{H_{c'}}[x_2], ..., Min_{H_{c'}}[x_n]}

and the upper bound vector would be

    b'^U = {Max_{H_{c'}}[x_1], Max_{H_{c'}}[x_2], ..., Max_{H_{c'}}[x_n]}

Then [b'^L, b'^U] defines the cuboid for the level set H_{c'} and its volume is designated as Vol. Volume is defined here as the product of the sides of a cuboid.
For an n dimensional cuboid, the volume is expressed as

    Vol = ∏_{i=1}^{n} d_i

where d_i is the length of the side of the cuboid measured on the i-th variable axis.

This cuboid, in effect, is an approximation of the boundary of the true level set H_c. New points are randomly generated within this cuboid and those which satisfy the f(x) ≤ c condition enter into the level set H_{c'}. The number of new points is as many as those discarded in the previous iteration.

Those feasible points which have been sampled are defined as confirmed points. This term is introduced here to distinguish between points which exist in theory and those points which have been verified numerically to meet the relevant set conditions.

There is a danger that the cuboid estimate based on a limited number of confirmed points in a level set will be an underestimate, i.e. exclude a portion of the true level set, with the consequent risk of missing a global optimum. Figure 2.2 displays a cuboid derived from a limited number of confirmed points. The figure clearly shows that part of the level set is excluded from the cuboid. Corrections to the cuboid size to minimize the risk are discussed in Chapter 3, section 3.5.

It is clear in the example shown in Figure 2.1(b) that the level set H_{c'} does not form a continuous set but consists of two disconnected subsets, technically a “partition set”, which will be discussed in Chapter 3, section 3.1. The concept of a level set remains, but the more complex boundaries make generation of new points in the level set far less efficient when the cuboid approximation is adopted. Such inefficiencies will often arise with multiple optima problems, when clusters of confirmed level set points eventually form around each optimum. One method of restoring the efficiency of the search is to handle each distinctive cluster as a separate sub-problem.
In that case the search is run independently within a set of smaller cuboids, each enveloping one cluster, and will lead to the optimum solution for each sub-problem. The global solution to the original problem is then given by the best of the sub-problem solutions. Identifying clusters and partitioning them is a problem which can be resolved by statistical approaches, and these are also discussed in Chapter 3, section 3.1.

[Figure 2.2: Cuboid derived from limited number of confirmed points. The plot in the (x1, x2) plane shows the observed points and the level set boundary, part of which falls outside the cuboid.]

2.3.2 Alternative methods of sample point generation

The point generation scheme based on cuboids was described under LSP implementation in section 2.3.1 above. Other methods were considered in this research and are reviewed here.

In all direct search methods experimental feasible points are generated, the objective function is evaluated at each point, and then a ranking procedure based on the objective function values is used to guide the next stages of the search or detect the overall optimum solution. A good, computationally efficient, point generation method provides the maximum number of acceptable points for a given computational effort, and thereby increases the chance of finding the global optimum. The greatest opportunity for increasing the computational efficiency of direct search methods lies in an efficient sampling technique. In addition to increasing the efficiency of the search, a good sample point strategy also increases the reliability of finding the true optimum solution. Here, reliability is defined as the number of runs producing the true optimum solution divided by the total number of runs.

Grid sampling

This type of point generation occurs entirely at discretized, or systematically spaced, grid points. This method has two major shortcomings:

i) The location of sample points is fixed once the grid size and the starting point of the grid system are chosen.
Variation in either the grid size or its starting point can lead to different solutions.

ii) If the grid point generation is adapted to an iterative search with progressive domain reduction, there is a grid scaling problem after each iteration.

With grid scaling changes, and the retention of some points from earlier grid samples, the distribution of points can become distinctly non-random after just a few iterations. Any appeal of systematic grid sampling is therefore lost with progressive domain reduction techniques such as LSP.

Sampling around existing best points

In this strategy a set of sample points is generated randomly and the best points identified by comparison. The bad points are discarded and replaced by new points generated around the surviving points. The principal difficulty with this method is in fixing the scale of the perturbation around the confirmed points. Too big a perturbation leads to an unnecessarily inefficient search. Too small a perturbation leads to greater efficiency in generating new points, but the risk of converging on local optima is significantly increased. With a small perturbation there is a natural tendency for the new points to form clusters around the best points sampled early in the search, but at the expense of exploring the rest of the feasible region. This leads to local convergence, especially with low N_keep.

Experiments were performed with all the above methods in conjunction with LSP, and they were rejected in favour of a point generation strategy based on cuboids. The cuboid strategy was found to produce the greatest search reliability and reasonable efficiency.

2.3.3 Revision of level set value-c

At every iteration some of the points in the previous acceptance set are discarded and the rest remain in the current level set. The screening criterion for keeping points in the acceptance set is simply that their objective function value is less than or equal to the current level set value-c.
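The screening criterion just stated, combined with the cuboid redefinition of section 2.3.1, can be sketched for the n-dimensional case as follows. This is an illustrative sketch only; the function name and data layout are invented for the example, not taken from the thesis program.

```python
def screen_and_bound(points, f, c):
    """Keep only points with f(x) <= c (the screening criterion), then
    redefine the cuboid [b^L, b^U] from the surviving points and return
    its volume Vol = product of the side lengths d_i (section 2.3.1)."""
    kept = [x for x in points if f(x) <= c]
    n = len(kept[0])                                  # problem dimension
    b_lo = [min(x[i] for x in kept) for i in range(n)]
    b_hi = [max(x[i] for x in kept) for i in range(n)]
    vol = 1.0
    for lo_i, hi_i in zip(b_lo, b_hi):
        vol *= hi_i - lo_i
    return kept, (b_lo, b_hi), vol
```

The number of discarded points, len(points) - len(kept), is then the number of new points that must be regenerated inside the returned cuboid to restore the sample to N_keep.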
The current level set value-c can be set to any value between the minimum and the maximum objective function values at the confirmed points in the previous acceptance set. Only the mean value of the previous level set, which was proposed in [Chew & Zheng, 1988] though without justification, has been considered so far in this chapter, but other possibilities were explored in this research. First, two extreme possibilities will be discussed.

If the maximum objective function value in the previous acceptance set is used as the new level set value, then the cuboid volume and the level set value-c would remain nearly constant throughout the search, irrespective of the type of problem. In that case the search becomes almost a blind search, which is computationally inefficient.

If a small value is used as the level set value-c, so that just a few points with the smallest objective function values survive to the next acceptance set, the search loses its global nature and focuses on a small region in the search domain. As the volume of the cuboid reduces to a small size at the early stages of the search, the search can terminate at a non-optimal arbitrary point, as shown in Figure 2.3(a) for a single variable problem. Another example, where the generation of new points would be almost impossible, is shown in Figure 2.3(b). For the single variable problem, the two points which give close to the minimum objective function value are observed at the first iteration, say at points A and B as shown in Figure 2.3(b). When the minimum objective function value is taken as the level set value-c, then the new cuboid would be the length AB. Unfortunately there is no point which fulfils the level set condition in the cuboid except the extreme points themselves.
In this case the search would be extremely inefficient or fail completely.

[Figure 2.3: Cuboid defined by the smallest objective function values. (a) Cuboid excluding the optimum point; (b) cuboid excluding the level set, bounded by the extreme points A and B.]

In general, the revision in the level set value-c has to fulfil two conditions. Firstly, all of the important regions in the search domain should be retained in the cuboid following the process of discarding points which do not fulfil the level set condition. Secondly, the improvement in the level set value-c at each iteration must be substantial to avoid an impracticably large number of iterations and hence computationally unacceptable searches.

A set of test problems was used to investigate the relationship between the number of function evaluations and the level set value-c reduction at each iteration and, if possible, to identify a better alternative to setting c' = M(f, c). Two different approaches were used to fix the revised level set values and define the current cuboid.

In the first approach, N_keep feasible points are first generated in the initial cuboid. Then the highest and lowest objective function values are identified and set to Max_f and Min_f respectively. The level set value-c is then defined as

    c = Min_f + κ(Max_f − Min_f)                                (2.17)

where κ is a constant and 0 ≤ κ ≤ 1.

Once the new level set value-c is established, points whose objective function values are less than c (say n_r points) are retained and the rest discarded. The current cuboid is defined using the retained n_r points. New feasible points are generated to replace the discarded points, each point fulfilling the f(x) ≤ c condition. This same procedure is then repeated at all subsequent iterations.

The results presented in Figure 2.4(a) and (b) were obtained with the Rosenbrock valley test problem [Problem 2-7-0-0, Appendix A]. The figures show plots of κ versus the number of function evaluations. The values were averaged over 10 different runs.
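The revision rule of Equation 2.17 is simple to state in code. The sketch below is illustrative only; the function name is invented for the example and is not from the thesis program.

```python
def revised_level_set_value(f_vals, kappa):
    """Equation 2.17: c = Min_f + kappa * (Max_f - Min_f), 0 <= kappa <= 1.
    kappa = 1 reproduces the near-blind search (every point survives);
    kappa near 0 keeps only the best points and risks losing the global
    nature of the search."""
    f_min, f_max = min(f_vals), max(f_vals)
    return f_min + kappa * (f_max - f_min)
```

For example, with sampled objective values {1, 3, 5}, kappa = 0.5 gives c = 3, kappa = 1 gives c = 5 (all points retained), and kappa = 0 gives c = 1 (only the best point retained).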
Different runs of the same problem with different values of k were tried to identify the k value which produces the least number of function evaluations. The same number of points, N_keep, in a level set and the same convergence criterion were used for all runs. A typical plot of the number of function evaluations versus k for the first approach is given in Figure 2.4(a). As shown in the figure, the highest number of function evaluations to meet the convergence criterion occurs when k = 1, and this number gets lower as the value of k approaches 0. But as k approaches zero, the reliability of converging on the global optimum also declines, i.e. the global nature of the search is lost and it becomes a local search. In addition, indications of the existence of multiple optima are lost.

The second approach always generates new points within a cuboid prescribed by confirmed points in a level set defined by M(f, c) but allows the value of c to advance below this level. This approach ensures that approximately n = N_keep/2 points are used to estimate the cuboid for point generation purposes. It avoids the difficulty of estimating cuboid bounds when n is very small. To generate new points a new level set value-c is defined as in equation 2.17. The new level set value-c could be as high as roughly M(f, c) when k = 0.5 or as low as Min_f when k = 0. A plot of the number of function evaluations against k is given in Figure 2.4(b).

Figure 2.4: Number of function evaluations for cuboids defined by different level set value criteria. (a) Cuboid and acceptability of points both defined by c'; (b) Cuboid defined by M(f, c), acceptability of points defined by c'.

In the case of this second approach, the search maintains its global nature, and can detect multiple optima. The numerical experiments showed that decreasing the level set value-c below M(f, c) slightly improves the reliability.
On the other hand, the number of function evaluations increases with lower values of c, and the search can be as inefficient as the blind search in the worst case, i.e. when k = 0.

In general, it is recommended that M(f, c) be used as a criterion to advance the search, that is to set the level set value-c equal to M(f, c). Then redefine the current cuboid using the approximately N_keep/2 surviving points which fulfil the f(x) <= c' = M(f, c) condition, and regenerate new points within this new cuboid. Although the choice of M(f, c) for the revised level set value is essentially heuristic, these limited numerical experiments, as well as experience with the many test problems, appear to support this choice.

2.3.4 Detailed description of the LSP Algorithm - general case

Implementation of the LSP algorithm for the constrained case, using a cuboid approximation for the level set, is given below.

Let e be a small number which will be compared with the variance of the objective function values to confirm convergence.

Let D_0 = [b_i^L, b_i^U] for i = 1, 2, ..., n be the initial cuboid, where b_i^L and b_i^U are the initial lower and upper bounds respectively for the i-th variable.

Let q = (q_1, q_2, ..., q_n) be an independent n-tuple of random numbers, each uniformly distributed on [0, 1].

LSP operates by following the steps described below.

Initialization

1. Generate N_keep feasible random points in D_0 as follows:
For a trial point j, generate x_i^j = b_i^L + q_i(b_i^U - b_i^L) for i = 1, 2, ..., n.
This will establish a trial point j, where x^j = (x_1^j, x_2^j, ..., x_n^j). Note that the generated point will be uniformly randomly distributed within D_0.
Test for feasibility, reject if infeasible. Continue until N_keep feasible points are obtained.

2. Evaluate the objective function f(x^j) for each generated point. Assign the highest objective function value, c^1, as the initial level set value-c, that is

    c^1 = Max_j {f(x^j)}

Algorithm

3.
Redefine a new cuboid D^1 using the remaining (approximately N_keep/2) points as

    D^1 = {x | b_i^L' <= x_i <= b_i^U', i = 1, 2, ..., n}

where b_i^L' and b_i^U' are the lowest and highest values of the i-th variable among those points in the current level set.

4. Generate new points in the new cuboid.
For a new trial point j, generate x_i^j = b_i^L' + q_i(b_i^U' - b_i^L') for i = 1, 2, ..., n.
Test for feasibility, reject if infeasible.
Evaluate f(x^j); reject if f(x^j) > level set value-c.
Generate as many confirmed points as those discarded in step 7 of the previous iteration. That is, restore the number of points in the acceptance set to a total of N_keep.

5. Evaluate the mean of the objective function values, and assign this mean value to the new level set value-c for the next iteration. That is

    c' = M(f, c)

6. Calculate V(f, c), the variance of the objective function values:

    V(f, c) = (1/N_keep) SUM_j [f(x^j) - M(f, c)]^2

If the variance is less than the pre-specified convergence tolerance, e, terminate the search, otherwise continue to the next step.

7. Discard points whose objective function values are greater than c'.

8. Set c = c'.

9. Go back to step 3 and redefine a new cuboid.

Let the value of c at termination be represented by c_t. V(f, c_t) <= e is satisfied so that c_t is approximately M(f, c_t). The solution is presented in the form of a final acceptance set H_{c_t}, and will contain points in the immediate vicinity of the global optimum or optima.

2.3.5 Alternate termination criteria

Because of its unique search mechanism, LSP's requirements for a termination criterion differ from other NLP search methods. In addition to the termination criterion described in section 2.3.4, three other alternatives were considered and are described below.

i) Size of cuboid
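Before turning to the criteria themselves, the initialization and steps 3-9 of section 2.3.4 can be sketched for the bounds-only case (a minimal sketch, not the thesis's implementation; the function and parameter names, the quadratic test function, and the choice of N_keep = 40 are illustrative assumptions):

```python
import random

def lsp_minimize(f, bounds, n_keep=40, eps=1e-8, max_iter=200, seed=0):
    rng = random.Random(seed)
    n = len(bounds)

    def sample(box):
        return tuple(lo + rng.random() * (hi - lo) for lo, hi in box)

    # Initialization: N_keep uniform random points in the initial cuboid D_0,
    # with the initial level set value-c set to the highest objective value.
    points = [sample(bounds) for _ in range(n_keep)]
    c = max(f(x) for x in points)

    for _ in range(max_iter):
        # Step 3: cuboid bounds from the surviving points (per-variable min/max)
        box = [(min(x[i] for x in points), max(x[i] for x in points))
               for i in range(n)]
        # Step 4: refill the acceptance set to N_keep points with f(x) <= c
        while len(points) < n_keep:
            x = sample(box)
            if f(x) <= c:
                points.append(x)
        vals = [f(x) for x in points]
        m = sum(vals) / n_keep                          # step 5: c' = M(f, c)
        v = sum((u - m) ** 2 for u in vals) / n_keep    # step 6: V(f, c)
        if v < eps:
            return m, points                            # converged
        points = [x for x, u in zip(points, vals) if u <= m]   # step 7
        c = m                                           # step 8: c = c'
    return c, points

# A simple smooth test function with one global minimum at (1, -2)
c_star, final_set = lsp_minimize(lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2,
                                 [(-5.0, 5.0), (-5.0, 5.0)])
```

For a constrained problem the rejection test of step 4 would also check feasibility before accepting a sampled point.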
In the special case where the problem is unimodal, the cuboid at c = c* is theoretically a single point and cuboid volume can be exploited to provide a convergence criterion.

If Vol = PROD_{i=1}^n d_i defines the volume of the cuboid at any stage of the LSP search in an n dimensional problem, where d_i is the length of the side of the cuboid measured on the i-th variable axis, then this volume tends to zero as convergence is approached. The search could be terminated when the volume is less than a pre-specified small value. Since the length of each side of the cuboid gets closer to zero as the search progresses, the fact that Vol -> 0 as d_i -> 0 for i = 1, ..., n is obvious. But the danger is that the convergence criterion could be met even if only one or two sides approach zero length and a premature or pseudo-convergence could result. To avoid this a criterion requiring that the sum of the cuboid side lengths, SUM_{i=1}^n d_i, be close to zero could be used in addition to the volume criterion. As a further alternative the size of each individual cuboid side length in the final cuboid could be compared to a pre-specified termination criterion. This last mentioned criterion is similar to that used in many local optimization methods [Archetti & Schoen, 1984; Betro & Schoen, 1987; Boender & Kan, 1987], which stipulate that if x^k is the current estimate of the minimum point and x^{k-1} is the last predicted minimum point then convergence is considered to have been achieved if

    |x_i^k - x_i^{k-1}| < e_0 for i = 1, 2, ..., n

where e_0 is a pre-specified small value.

ii) Difference between consecutive level set values

The difference between the objective function values at consecutive iterations is used as a convergence criterion by some direct search methods. For instance, the
following simultaneous termination criteria are used for Newton's direct search method [Edgar and Himmelblau, 1988]:

    |f(x^{k+1}) - f(x^k)| / |f(x^k)| < e_1,  or, as f(x) -> 0,  |f(x^{k+1}) - f(x^k)| < e_2

    |x_i^{k+1} - x_i^k| / |x_i^k| < e_3,  or, as x -> 0,  |x_i^{k+1} - x_i^k| < e_4

    ||s^k|| < e_5  or  ||grad f(x^k)|| < e_6

where k is the iteration number, s is the search direction and grad is the gradient operator.

Similarly with LSP, differences between level set values at consecutive iterations can be used as a termination criterion. In the later iterations of a successful LSP search, the improvement of the level set value-c with each iteration decreases. This is an indication that the points in the level set are distributed around stationary points in the objective function surface. A maximum acceptable value, e, for the difference between two consecutive level set values can serve as a termination criterion. That is, c^k - c^{k+1} <= e dictates termination.

iii) Coefficient of variation of the objective function values

The variance of the objective function values in the acceptance set gives a precise theoretical convergence condition, and has to be equal to zero at the global optima. But in actual implementation, the variance of the objective function values is compared against a predefined small value at each stage of the search. Since values of the objective function with high absolute value produce a higher variance, and objective function values close to zero produce lower variance, a scaling difficulty arises with the implementation of variance as a convergence criterion.

To overcome the scaling difficulty, the dimensionless coefficient of variation, defined here as modified variance divided by the level set value-c, can be used as a convergence criterion. This criterion seems to be quite satisfactory for a wide range of problems, but a difficulty arises when the level set value equals zero and the search stops because of numerical overflow.
The level set value can be equal to zero for two reasons: when the global optimum is just below or equal to zero; and when the level set value starts with a positive number and improves to a negative optimal value, taking a value equal to zero in the course of the search process.

Generally then, the choice of search termination criterion is tied to the magnitude of the optimum objective function. Except for the criterion discussed under 'size of cuboid', any of the above mentioned criteria, as well as the one discussed in section 2.3.4, can be used for LSP search termination. The 'size of cuboid' criterion undermines the search for multiple optima solutions.

2.3.6 Handling Constraints in LSP

LSP can accommodate constraints by making use of three simple and straightforward approaches. These are: rejection, reduction and penalty methods. All three methods can be applied in the same LSP solution of a given problem, depending on the nature of the constraints involved. An explanation of each method is given below.

Rejection method

For a constrained problem with a feasible region S, defined by the set of the constraints, the mean value and the variances over the level set were redefined by [Chew & Zheng, 1988] as

    M(f, c; S) = (1/mu(H_{c_k} ∩ S)) INT_{H_{c_k} ∩ S} f(x) dmu

    V(f, c; S) = (1/mu(H_{c_k} ∩ S)) INT_{H_{c_k} ∩ S} [f(x) - M(f, c; S)]^2 dmu

    VM(f, c; S) = (1/mu(H_{c_k} ∩ S)) INT_{H_{c_k} ∩ S} [f(x) - c]^2 dmu     (2.18)

where k indicates the iteration.

The above expressions are called the rejection mean value, rejection variance and rejection modified variance of f(x) over H_{c_k} ∩ S respectively, as discussed in section 2.1. The following important characteristics are retained. Given that x* in S is the global minimum solution and c* = f(x*) is the corresponding global minimum value, then for c > c*:

    M(f, c; S) >= c*
    M(f, c*; S) = c*
    V(f, c*; S) = 0
    VM(f, c*; S) = 0

For a discrete sampling (or integer problem), equation 2.18 can be rewritten as

    M(f, c; S) = (1 / SUM_{x^j in H_c ∩ S} a_j) SUM_{x^j in H_c ∩ S} a_j f(x^j)

    V(f, c; S) = (1 / SUM_{x^j in H_c ∩ S} a_j) SUM_{x^j in H_c ∩ S} a_j [f(x^j) - M(f, c; S)]^2

    VM(f, c; S) = (1 / SUM_{x^j in H_c ∩ S} a_j) SUM_{x^j in H_c ∩ S} a_j [f(x^j) - c]^2     (2.19)

Reduction method

Whenever there are equality constraints, the number of variables in the problem can theoretically be reduced by the number of equality constraints. In effect reduction can be performed by solving a system of simultaneous nonlinear equations. Most of the time handling equality constraints in this fashion is difficult, if not impossible, since in practice nonlinear constraint functions are seldom simple enough to be able to equate the function with a single variable.

Penalty method

When the set of constraints defines a very small or narrow feasible region, or one governed by equality constraints, generating feasible points can be difficult. In this case the objective function can be modified by adding some terms which penalize (i.e. increase) the value of the objective function when any sample point outside the feasible region is evaluated. The existence of points that violate the constraints is permitted within the acceptance set but is discouraged later in the search. This approach is common to all penalty function methods.

Let S be a closed subset of X. Consider the constrained minimization problem

    c* = min [f(x) + a p(x)]

where a is a positive value.

Definition: a function p on X is a penalty function for the constrained set S if

    p(x) >= 0 for all x in X
    p(x) = 0 if and only if x in S
    p(x) > 0 for all x not in S

Suppose c_k, the level set value at the k-th iteration with c_k >= c*, is decreasing and tends to c* as k -> infinity, and a_k is a positive increasing sequence which tends to infinity as k -> infinity.
Let

    H_k = {x | f(x) + a_k p(x) <= c_k}     (2.20)

then

    lim_{k->inf} H_k = INTERSECT_k H_k = H_{c*} ∩ S

and

    lim_{k->inf} mu(H_k) = mu(H_{c*} ∩ S)

As defined earlier, the mean value of f(x) over its level set H_c within a closed feasible region defined by the constraint set S is given by

    M(f, c; S) = (1/mu(H_c ∩ S)) INT_{H_c ∩ S} f(x) dmu

Because c* is a lower bound of c_k,

    lim_{k->inf} (1/mu(H_k)) INT_{H_k} f(x) dmu = M(f, c*; S)

The limit does not depend on the choices of sequences {c_k} and {a_k}. Suppose c >= c*; then the limit

    M(f, c; p) = lim_{k->inf} (1/mu(H_k)) INT_{H_k} f(x) dmu

is called the penalty mean value of f(x) over the level set H_k. Suppose c* = f(x*) is the global minimum value of f(x) over S; then

    c* = M(f, c*; S) = M(f, c*; p)

The penalty mean value, variance and modified variance of f(x) have the same convergence properties as the constrained mean value, variance and modified variance; therefore they share the same properties [Chew & Zheng, 1988].

2.4 Summary

The general theoretical description of the use of level sets for nonlinear optimization was laid down by Chew and Zheng [Chew & Zheng, 1988]. In this chapter, the theory has been extended, the potential importance elaborated and some practical application issues addressed.

One of the main ideas explored is the relationship between the reliability of LSP solutions and the revision of the level set value at each iteration. The choice of revised level set value was found to affect the reliability of the solutions and the overall performance of LSP, but the use of M(f, c) was confirmed as being appropriate for general applications.

Different methods of sample point generation were also investigated. The exploration of alternative methods has helped to give more insight into a key numerical and computational aspect of this implementation of level set theory. Even though many methods were examined, only one of the methods, the cuboid approach, is recommended for general application.

Various search termination criteria were also examined.
Any of these criteria can be used in practice without affecting the theoretical global convergence characteristics, though the best choice is dependent upon the nature of the problem being solved.

In practical implementations of LSP, the modified variance, VM, is adopted as a termination criterion instead of the variance evaluated around the mean. The reason is that the modified variance is larger and therefore provides a stronger convergence condition and is, as well, easier to evaluate. Moreover the modified variance is more sensitive to the entry of points into the acceptance set which have the lowest values of the objective function. This becomes a significant benefit in the final convergence stages of the search. In addition to using variance as a convergence criterion, a plot of level set value-c against each iteration is displayed to give a visual indication of the typically asymptotic convergence of level set value-c on c*.

How closely the final value of the convergence criterion VM(f, c) approaches zero in practice is a question of numerical precision and the practical needs of the problem. It is governed primarily by the number of sample points maintained in the level set and the accuracy of the computational device used in evaluating the various nonlinear functions involved in the problem. In general the most important aspect of LSP is fixing the number of points in an acceptance set. Too few points increases the chances of missing very localized but significant fissures in the objective function surface; too many points places an unnecessarily high computational burden. For a fixed number, N_keep, of points in the acceptance set the density of points per unit length on the variable axes increases with the number of iterations.

The search process is not infallible: a very localized fissure in the f(x) surface which contains the global optimum can be overlooked, but the VM(f, c) value still converges on zero over one or more local optima.
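The claim that the modified variance gives a stronger condition can be checked directly: expanding the square gives VM(f, c) = V(f, c) + (M(f, c) - c)^2, so VM is never smaller than V. A minimal numerical sketch (function name illustrative):

```python
def variance_and_modified(values, c):
    # V(f, c) is the variance about the mean M(f, c);
    # VM(f, c) is the modified variance about the level set value-c.
    n = len(values)
    m = sum(values) / n
    v = sum((u - m) ** 2 for u in values) / n
    vm = sum((u - c) ** 2 for u in values) / n
    return m, v, vm

m, v, vm = variance_and_modified([3.0, 4.0, 5.0], c=5.0)
```

Both quantities vanish together only when every objective value in the acceptance set equals c, which is the theoretical convergence condition.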
Such problems are more likely to emerge when the number of sampling points in a level set is too low. Problems which are peculiarly difficult for LSP are thus possible and some of these are discussed in Chapter 5, section 5.5.

Chapter 3

LSP PERFORMANCE IMPROVEMENT

Optimization based purely on the exploration of level sets has been described by some authors as a theoretical approach with little practical significance [Torn & Zilinskas, 1988], since they felt that the method lacks strong convergence characteristics. There have been few attempts to implement pure level set methods in any general way. Those implementations that have been made are poorly documented and there is little existing literature on enhancing the performance of LSP type methods. A variety of approaches to improve the efficiency of the LSP search have therefore been investigated in this research. The more successful techniques, dealing with partitioning of the search domain, relaxation of constraints and redefining of the search domain, are discussed in this chapter. In addition to improving the computational efficiency of LSP, these techniques also tend to produce more accurate convergence on optimal solutions, and enhance the reliability of detection of boundary solutions. Further discussion of actual experience when implementing these techniques and setting the values of the various LSP parameters is provided in chapter 5.

3.1 Partition sets and cluster analysis

The volume occupied by the level set is always less than or equal to any containing cuboid volume. The ratio of level set volume to cuboid volume has a direct influence on the efficiency of generating new points in the level set; when this ratio is low considerable computational effort will be expended on unsuccessful point generation. Very low ratios
can result from problems involving widely separated, multiple global optima, or from a long, narrow level set with its principal axis at an oblique angle (e.g. 45°) to the variable axes.

For a typical unimodal problem, the ratio of level set to cuboid volume remains both high and nearly constant throughout the search. But if the optimization problem has multiple global optima, then the cuboid would include a large amount of non level set space. This results in a very low level set to cuboid volume ratio. Moreover the level set could consist of discontinuous sets within the cuboid.

Some new terms are introduced here to facilitate discussion of the computational implications when the level set becomes discontinuous.

Two sets B and C, where B != C, are disjoint if B ∩ C = ∅, that is if B and C do not have any element in common. A class A of sets is called a disjoint class of sets if each pair of distinct sets in A is disjoint [Lipschutz, 1965].

A class A of non-empty subsets of A is called a partition of A if and only if each a in A belongs to some member of A and the members of A are mutually disjoint [Lipschutz, 1965].

A space A is connected if, whenever it is decomposed as the union B ∪ C of two non-empty subsets, then B̄ ∩ C != ∅ or C̄ ∩ B != ∅, where the upper bar designates the complement set and ∅ designates the null set [Armstrong, 1983].

If a level set defines a region where all points in the level set belong to a single set, then the level set is called connected, as in Figure 3.1(a). Henceforth in this thesis a connected subset of the partition will be referred to simply as a "subset".

If a level set defines a set of small subregions and there is no common point between any two subregions, then the level set is called a partition level set.

When the subsets of the partition are close together they will, in practice, be treated as if they were a connected set.
Figure 3.1(b) shows a partition level set in two dimensions.

Sub-cuboid will be defined here as the cuboid enclosing an individual subset of a partition level set.

Minimal cuboid enclosure will be defined in this thesis as the set of sub-cuboids which enclose the level set in a specific cuboid. The minimal cuboid enclosure would be equal to the cuboid itself if the level set is connected, otherwise it would be a collection of more than one sub-cuboid.

As the iterative process of LSP unfolds, the size of the region occupied by the acceptance set gets smaller. However, this region may contain a single connected level set in the special unimodal case, or it can enclose a partition level set forming separate discontinuous regions. Points generated in any of these subregions would be accepted if they meet the level set condition f(x) <= c and do not violate any of the constraints. The efficiency of generating an acceptable sample point in the cuboid would be directly proportional to the ratio of the acceptance region volume to the cuboid volume.

Figure 3.1: Connected and partition sets. (a) Connected set; (b) Partition set.

For a multiple optima problem, the chances of generating an acceptable point become smaller as the search approaches the global optimum value of the objective function. This is because the volume of the acceptance region becomes extremely small near convergence whereas, at this stage of the search, the cuboid typically maintains more or less a constant volume. This is particularly true when the global optimum points are at stationary points in the objective function surface. Although an inefficient search may eventually converge, the time required to obtain a solution may be prohibitive. To overcome the inefficiency due to the presence of multiple optima it is possible to subdivide the cuboid into a set of smaller cuboids without sacrificing the integrity of the search.
This is provided, of course, that no subregion containing a global optimum is omitted from the smaller cuboid set. If the existence of distinct clusters of points is suspected during a search, statistical cluster analysis can be used as an aid to identifying the clusters, and hence the sub-cuboid boundaries. Details on cluster identification and classification will be discussed in section 3.1.1. The current level set value-c would be applicable to all of the sub-cuboids at the point of dividing the cuboid into smaller sub-cuboids.

After the clusters have been identified their individual sub-cuboid dimensions are defined. The combined sum of the volumes of all sub-cuboids would be smaller than the volume of the cuboid before subdividing into smaller subregions. Thus the total volume of search is reduced relative to the volume of the level set.

Once sub-cuboids are defined the optimization problem is then treated as a separate problem in each subregion and each is solved independently. The process of performing cluster analysis and subdividing the cuboid into smaller sub-cuboids can be repeated at subsequent stages of the LSP search but will result in solving an ever increasing number of sub-problems. Such an exhaustive division of the search domain leads to an inefficient search and is generally unjustified.

An example to show the merits of using smaller cuboids at some stage of the search for the unconstrained but bounded one dimensional case is given in Figure 3.2. Here the cuboid is one dimensional and its "volume" is represented by the length of its single side. At the k-th iteration, the chances of generating a feasible point fulfilling f(x) <= c_k is roughly 0.5; that is, the level set designated as H_{c_k} in the figure occupies about half the cuboid length.
At the next iteration, the level set value-c is lowered to c_{k+1} and the corresponding level set volume becomes so small that the efficiency of generating feasible points fulfilling f(x) <= c_{k+1} drops to below 0.25. After this stage of the search it would be advantageous to work with two distinct cuboids where the chances of generating points fulfilling the level set condition would approach 1.0, depending on how precisely the cuboid boundaries are estimated.

Recognizing such benefits from partitioning, it is almost essential that a cluster analysis subroutine be incorporated into LSP. From the implementation point of view, the precision needs for the identification of clusters and cluster bounds are not critical, except, of course, that the cluster boundaries and hence cuboid boundaries must not inadvertently exclude a global optimum.

Figure 3.2: Level set "volume" in a one dimensional problem with and without partitioning. H_{c_k} = {x | f(x) <= c_k}; H_{c_{k+1}} = {x | f(x) <= c_{k+1}}.

3.1.1 Cluster analysis methods

A cluster can be defined as a set of points that possess some common characteristic which is not shared by members of another group, or as a set of points that bear some relationship to each other but not to those outside that particular set. In Euclidean space, where the relationship between points is a function of their distance from each other, clusters are defined as points which are located close to each other. In some cases a similarity criterion, which expresses the common characteristics shared between points, is used instead of distance to classify points into clusters. When partitioning level sets, Euclidean distance is the appropriate measure of the relationship between points in the decision space.
Cluster analysis is a technique which allocates points to a cluster in such a way as to maximize the distance between clusters and minimize the distance between points within a cluster. Cluster analysis procedures can be divided into two major categories, which are briefly explained here.

• Hierarchical methods: These produce a tree-like taxonomic system where, at one end of the tree every individual point is a cluster and at the other end all points are included in a common cluster. At intermediate levels, clusters are formed by aggregating lower clusters. The hierarchical cluster configuration is usually represented graphically as a dendrogram as shown in Figure 3.4. This graphical structure is a common tool used to express the results of a clustering analysis. The hierarchical method can be divided into agglomerative and divisive methods.

i) Agglomerative methods: These are methods where similar data points are sequentially aggregated to form a single cluster. The methods construct the hierarchical tree from individual points at the branch tips to a single root. Specific agglomerative methods differ according to their definition of distance between a point and a cluster, or between two clusters [Mezzich & Solomon, 1980].

ii) Divisive clustering methods: These methods subdivide the aggregate set of data points into smaller subgroups by partitioning, such that the variance within each group is minimized and the variance among the groups, their mutual separation, is maximized without breaking up natural groupings. One of the techniques is Automatic Interaction Detection, AID, which employs a series of discrete variable splits, each dictated by the maximum reduction in the empirical unexplained variance [Jenson, 1977].

• Non-hierarchical methods: In contrast to hierarchical methods, these produce configurations that do not present rankings in which lower order clusters must become members of larger, more inclusive clusters.
These are methods which essentially produce the final partition in a single step. The major approaches used in non-hierarchical methods are as follows [Mezzich & Solomon, 1980].

i) Total enumeration of partitions methods: In this method an attempt is made to enumerate all clustering possibilities, and then select the best cluster arrangement. A quantitative clustering index is used to choose the best alternative. Since this method involves exhaustive searches, it appears to be computationally unattractive even for modest data sets.

ii) Nearest centroid sorting methods: The basic feature in this method is the selection of seed points to be used as cluster nuclei around which the set of points can be grouped. When the number of clusters is fixed, the locations of the centres of the clusters are updated after each full cycle of allocation of points. Given the number of clusters, the method allocates points so that the within cluster sums of squares are minimized [Hartigan, 1977]. For a variable number of clusters, points are allocated on the basis of the nearest centroid sorting process. The number of clusters changes during the allocation process by partitioning those clusters which have a large within cluster sum of squares to form smaller clusters, and joining any two clusters with a small between cluster sum of squares to form a larger cluster. This is controlled by using certain parameters for "coarsening" and "refinement" set by the user [Mezzich & Solomon, 1980].

iii) Reallocation methods using variance-covariance criteria: The basic procedure in this category is to reallocate points among a set of clusters in such a way as to optimize some overall discriminant function or variance-covariance criterion.

iv) Density search methods: These methods use the allocation of points in a metric space, looking for regions of high point density separated by regions of low density for identification of clusters.
From the experience in this research, the agglomerative hierarchical clustering method was found to be easy to implement with the least complications in theory and application, and to provide acceptable precision for LSP purposes. This method is discussed in detail in the next section.

3.1.2 General implementation of cluster analysis methods

Suppose there are m points, in n dimensional Euclidean space, which we wish to arrange into a hierarchical classification. The data set for the m points forms an m x n matrix. The unit of measurement for the m x n raw data matrix is standardized, which means that all variables are transformed to a single unit, before computing distance measurements. This ensures that each variable is weighted equally; otherwise the Euclidean distance d_ij will be influenced most strongly by the variable which has the greatest magnitude [Davis, 1973]. The Euclidean distance between two points is computed using equation 3.1:

    d_ij = [ SUM_{k=1}^n (X_ik - X_jk)^2 ]^{1/2}     (3.1)

where X_ik denotes the k-th variable measured at point i and X_jk is the k-th variable measured at point j. In all, n variables are measured at each point and d_ij is the distance between point i and point j. The distances between all possible pairs of points result in an m x m symmetrical matrix. The points are arranged into a hierarchy based upon the magnitude of d_ij. Initially all pairs of points with mutually short distances are grouped together to form the beginning of clusters. Then those points which are not in any of the clusters formed at the initial stage, together with the small clusters already formed, are regrouped to form larger clusters. Small clusters are treated in the same way as points, and are lumped together on the basis of their mutual distances. The process of regrouping is repeated until all of the points have been placed into a complete classification scheme.
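Equation 3.1 together with the standardization step can be sketched as follows (scaling each variable to unit standard deviation is one common reading of "transformed to a single unit"; this choice, and the function name, are assumptions):

```python
import math

def distance_matrix(points):
    # Standardize each variable, then compute the m x m matrix of
    # Euclidean distances d_ij of eq. 3.1.
    m, n = len(points), len(points[0])
    means = [sum(p[k] for p in points) / m for k in range(n)]
    sds = [math.sqrt(sum((p[k] - means[k]) ** 2 for p in points) / m) or 1.0
           for k in range(n)]   # guard: a constant variable gets scale 1
    z = [[(p[k] - means[k]) / sds[k] for k in range(n)] for p in points]
    return [[math.sqrt(sum((z[i][k] - z[j][k]) ** 2 for k in range(n)))
             for j in range(m)] for i in range(m)]

d = distance_matrix([(0.0, 0.0), (2.0, 0.0), (1.0, 0.0)])
```

The resulting matrix is symmetric with a zero diagonal, which is what the pair-group clustering step below operates on.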
The first step in clustering by the pair-group method is to find the mutually closest distance in the matrix; the corresponding data pairs form the centres of subsequent clusters. After the centres have been identified, all the other points are connected to the centres one by one to form a dendrogram.

The most common and simple clustering method is the single linkage method, which employs a predefined arbitrary critical distance as a criterion for clustering. Clustering will only occur between pairs of points and small clusters if the distance between them is less than the critical distance. If all pairwise distances are greater than the critical distance no clustering can occur. After the initial clusters have been formed the distances between the remaining points and the closest point within a cluster are computed one by one. Those points which have distances less than the critical distance from a cluster are entered into the cluster. The procedure continues until either no more points can enter any cluster, or all points agglomerate to a single cluster.

A similar method, referred to as the multiple linkage or complete linkage method, uses the distance between an unclustered point and the farthest point in a cluster as a measure for clustering. The distance between an unclustered point and the most distant point within a cluster is checked against the critical distance for the viability of being a member in that cluster. Figure 3.3 shows two different results of cluster analysis resulting from a single data set using single and multiple linkage cluster analysis methods.

Another type of cluster analysis method, the weighted average cluster analysis, uses the equally weighted average distance between the cluster and a newly connected point. This method gives equal importance to the set of points which have already formed a cluster and the new point which is to enter the cluster, hence the greater influence of the latter.
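The single linkage rule described earlier can be sketched as a repeated merge loop (a sketch only; the thesis gives no code, and the critical distance value below is arbitrary):

```python
import math

def single_linkage_clusters(points, critical_distance):
    # Merge two clusters whenever the distance between their closest
    # pair of members is below the critical distance (single linkage).
    def dist(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    clusters = [[p] for p in points]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if min(dist(a, b) for a in clusters[i]
                       for b in clusters[j]) < critical_distance:
                    clusters[i] += clusters.pop(j)
                    merged = True
                    break
            if merged:
                break
    return clusters

# Two well-separated groups, as in Figure 3.1(b)
groups = single_linkage_clusters(
    [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)], critical_distance=1.0)
```

Replacing `min` with `max` in the merge test would give the complete (multiple) linkage variant, which is the only difference between the two methods discussed above.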
In this method, once the first grouping is completed, the distance matrix must be recalculated, treating each set of grouped points as a single point. Then the next step is to regroup the small clusters already formed, on the basis of shortest mutual distance, to form a larger cluster.

[Figure 3.3: Single and multiple linkage clustering methods; (a) single linkage, (b) multiple linkage.]

A simple and effective way to calculate the distance between the grouped points is to use the arithmetic mean of the distances between every pair of points, each point taken from a different cluster. For example, suppose two clusters, the first containing points A and B and the second containing points C and D, are to be regrouped to form a single big cluster. The distance between the two small clusters is calculated as (d_AC + d_AD + d_BC + d_BD)/4, where each d with a pair of subscript letters designates the distance between the two points. In general the distance between two clusters, D_IJ, is expressed as

D_IJ = (1 / (n_I n_J)) Σ_{i∈I, j∈J} d_ij    (3.2)

where
D_IJ is the distance between cluster I and cluster J,
n_I and n_J are the number of points in clusters I and J respectively, and
d_ij is the distance between points i and j.

Because points in a cluster are treated as a single point, averaging the distances in the cluster introduces distortion into the dendrogram. Points located far out from the centre of the cluster have the greatest influence on the structure. The distortion is introduced when two clusters with unequal numbers of points join to form a bigger cluster. For example, if two points A and B are in a cluster, and a third point C is to enter the cluster, then the distance between the cluster and the third point is calculated as the average of the distance between points A and C and the distance between points B and C. Thus point C is involved twice while the other points are each used only once.
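The between-cluster distance of equation 3.2 reduces to a one-line computation over a distance matrix. A minimal sketch (an illustration of the formula, not the thesis code):

```python
def cluster_distance(dist, I, J):
    """Equation 3.2: average of all pairwise distances between the
    members of cluster I and the members of cluster J,
    D_IJ = (1 / (n_I * n_J)) * sum of d_ij over i in I, j in J."""
    return sum(dist[i][j] for i in I for j in J) / (len(I) * len(J))
```

For the A&B versus C&D example above, `cluster_distance(dist, [A, B], [C, D])` reproduces the quantity (d_AC + d_AD + d_BC + d_BD)/4.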
The distortion is increasingly apparent as successive levels of clusters are averaged together. The severity of this distortion can be evaluated by examining the matrix of cophenetic values, which describes the apparent distances within the dendrogram. The cophenetic values and the original distance matrix can be correlated to reveal the overall degree of distortion.

Unweighted average clustering methods are similar to the weighted average clustering methods except that they assume an equal number of points in each cluster. In these methods late entries into a large cluster have almost no influence on the distance calculations of the clusters.

Figure 3.4 demonstrates how to construct a dendrogram using 9 data points in a two dimensional space. Figure 3.4(a) shows the original data set in two dimensions. The matrix of distances is calculated from the ordinates of each point. The pairs of points with mutually closest distances are grouped and connected together, as shown in Figure 3.4(b). Figures 3.4(c), 3.4(d) and 3.4(e) demonstrate, step by step, how the distance matrices are calculated for already clustered points, how new points are entered into the structure, and how small clusters are sequentially agglomerated to form the final dendrogram. The original data are:

Point   x1     x2
A       2.00   2.00
B       6.00   4.00
C       7.00   2.00
D       5.00   3.00
E       2.00   3.00
F       3.00   4.00
G       3.00   3.00
H       7.00   3.00
I       5.00   4.00
[Figure 3.4: Stepwise dendrogram construction, showing the data points, the successive distance matrices recomputed as points agglomerate, and the final dendrogram.]

There are many ways of allocating points into taxonomic classifications. The choice of method depends mostly on how the analyst views the needs of his problem and wishes to trade off computational effort against the desire for accuracy of cluster identification.

3.1.3 A clustering routine for LSP

A cluster analysis subroutine which performs a weighted average cluster analysis was incorporated into LSP. This method was chosen for its simplicity and the fact that it needs only a few user selected parameters. Even though the method might not be the most accurate, it is good enough for LSP purposes since the search does not need a precise clustering method for its solution. With LSP the potential for overall reduction in computational load and gain in reliability from sophisticated clustering methods is not obvious.

A user selected clustering criterion, which is expressed as the ratio of the minimum distance between two clusters to the distance between the mutually farthest points within a cluster, is used as an input for the analysis. The subroutine checks if the clustering criterion is met, and performs sub-grouping of the data set into a number of clusters when the criterion is met. The bounds for each cluster, or subgroup, are identified and then used to define the initial cuboids for the respective clusters.
This subroutine is adapted from [Zupan, 1982].

The computational time the search spends in the clustering routine depends on the strength of the clustering criterion. In this thesis, a clustering criterion is referred to as a strong criterion when the ratio of the distance between two clusters to the distance between the farthest points within a single cluster is high. A strong criterion implies the expectation of very distinct clusters.

If a very strong clustering criterion is used, it might take a longer computational time to meet the criterion, but once the clusters are identified and the problem is subdivided into the true set of distinct sub-problems, the subsequent LSP search becomes faster. The increase in efficiency is demonstrated in Table 3.1 through an example in the latter part of this section. Since aggregating points into clusters and identifying their bounds does not need high precision in LSP, the clustering criterion does not in fact need to be strong. But if a weak criterion is used for problems whose optimal points are at extreme corners of the initial cuboid, the imperfect cluster identification might result in missing the global optima. In practice, the probability of generating a sample point at an extreme boundary corner is almost nil for finite sampling schemes. A heuristic method has been developed to avoid this situation and is discussed in section 3.5.

The following two variable, unconstrained but bounded, problem with two global optima demonstrates the danger of premature clustering and the computational benefit of introducing cluster analysis into LSP.

Minimize f(x) = 100 − (x1 + x2 − 10)²
Subject to:
0 ≤ x1 ≤ 10
0 ≤ x2 ≤ 10

The true global optimum is f(x*) = 0.0 and the optimal points are located at (0, 0) and (10, 10), which are diagonally opposite corners of the initial cuboid.

Figures 3.5(a) and 3.5(b) show this optimization problem in three and two dimensions respectively.
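The two-optimum demonstration problem above is easy to verify numerically; a small sketch:

```python
def f(x1, x2):
    """Demonstration objective with two global minima, f = 0, at
    diagonally opposite corners of the bounding cuboid [0,10] x [0,10]."""
    return 100.0 - (x1 + x2 - 10.0) ** 2

# The two global optima sit at the corners (0,0) and (10,10):
assert f(0.0, 0.0) == 0.0 and f(10.0, 10.0) == 0.0
# Along the diagonal x1 + x2 = 10 the objective attains its maximum of 100:
assert f(5.0, 5.0) == 100.0
```

Because the two minimizers lie on opposite sides of the ridge x1 + x2 = 10, the acceptance set naturally splits into two clusters, which is precisely the situation the clustering routine is intended to exploit.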
Figures 3.5(c) and 3.5(d) show plots of the points in the level set in the x1-x2 plane at consecutive iterations. In both cases the two small rectangles around the clusters of points inside the initial cuboid show the dimensions of the new sub-cuboids if the problem is to be partitioned into two sub-problems. Figure 3.5(c) demonstrates how premature clustering can exclude the true global optimal points. However if clustering is performed at the next iteration, the global optimal points would be included in the new sub-cuboids, as shown in Figure 3.5(d). Premature clustering can be avoided by adopting a stronger clustering criterion.

Table 3.1 shows the number of function evaluations required to meet the LSP convergence criterion for the above example, with and without using the cluster analysis method. These results are from a single LSP run for each case. The number of points maintained in the acceptance set for each sub-problem was the same as for the acceptance set without the cluster analysis.

            Number of function evaluations
Iteration   without cluster analysis   with cluster analysis
5           3,024                      3,243
6           4,236                      4,940
13          1,420,092                  —
14          3,686,648                  —
15          8,018,142                  —
37          —                          6,550

Table 3.1: Reduced number of function evaluations due to cluster analysis methods

An Nkeep value of 30 was used for this example. This is higher than the recommended value of Nkeep = 20 but appropriate in known multiple global optima cases. Without cluster analysis, this example had not met the convergence criterion VM ≤ 0.0001 at the 15th iteration, when the number of function evaluations was over 8.0 × 10^6. In the second case, where the cluster analysis method was used, the problem was partitioned into two different sub-cuboids immediately after the sixth iteration. With Nkeep = 30 in each of the two sub-problems, it took 37 iterations to meet the convergence criterion.
The number of iterations cited is the sum of the iterations expended before dividing the problem into smaller regions plus the iterations expended to meet the convergence criterion for both sub-problems. Even though the number of LSP iterations is high when the problem is handled in two separate subregions, the total number of function evaluations is only 6,550.

[Figure 3.5: Cluster analysis approach for a two variable problem; (a) and (b) show the problem in three and two dimensions, (c) premature clustering, (d) proper clustering.]

As was experienced in a number of other examples, the high number of LSP iterations does not necessarily imply inefficiency of the overall search; what matters is the cumulative number of function evaluations. In this instance partitioning of the problem permitted efficient point generation and lowered the number of function evaluations by a factor of more than 1,000 (from over 8.0 × 10^6 without clustering to 6,550 with it).

3.2 Penalty functions

Generally, feasible sample points are randomly generated with LSP for constrained optimization problems. In some cases the feasible region defined by the set of inequality constraints might be so small compared with the initial variable bounds that acceptable point generation requires an impractically large number of trial points. Similarly other problems might have equality constraints involving nonlinear equations which do not lend themselves to reducing the number of variables by elimination. Such difficulties can be overcome by introducing penalties into the objective function.

One conventional penalty approach converts the constrained problem to an equivalent unconstrained problem so that those methods developed for unconstrained problems can then be applied [Bazaraa, 1979].
Suppose there is an optimization problem with only equality constraints, given as follows:

Minimize f(x)
Subject to: h_i(x) = 0 for i = 1, 2, ..., l
x ∈ R^n

This constrained problem can be transformed to an equivalent unconstrained form as

Minimize f(x) + β Σ_{i=1}^{l} (h_i(x))²
Subject to: x ∈ R^n

where β is the penalty parameter, a non-negative large number.

A similar modification is made to problems with inequality constraints. If the original problem is

Minimize f(x)
Subject to: g_j(x) ≤ 0 for j = 1, 2, ..., m
x ∈ R^n

then the transformed problem would be expressed as

Minimize f(x) + β Σ_{j=1}^{m} maximum{0, g_j(x)}
Subject to: x ∈ R^n.

For general optimization problems involving both equality and inequality constraints, given as

Minimize f(x)
Subject to: h_i(x) = 0 for i = 1, 2, ..., l
g_j(x) ≤ 0 for j = 1, 2, ..., m
x ∈ R^n

a penalty function α(x) is defined as

α(x) = Σ_{j=1}^{m} φ[g_j(x)] + Σ_{i=1}^{l} ψ[h_i(x)]    (3.3)

where φ and ψ are functions satisfying

φ(y) = 0 if y ≤ 0, φ(y) > 0 otherwise
ψ(y) = 0 if y = 0, ψ(y) > 0 otherwise

and y is a dummy variable representing g_j(x) and h_i(x) respectively.

Usually, φ and ψ are used in their power form as

φ(y) = [max{0, y}]^r,  ψ(y) = |y|^r    (3.4)

where r is a positive integer [Bazaraa, 1979].

The modified objective function, which is also called the auxiliary function, is then written

f(x) + β α(x)    (3.5)

and the transformed problem would be finally expressed as

Minimize f(x) + β α(x)
Subject to: x ∈ R^n

In the above expressions the penalty parameter β could be a fixed value or a sequence tending to infinity such that for each iteration k, β_k ≥ 0 and β_{k+1} > β_k. With the quadratic penalty function, where α(x) = Σ_{j=1}^{m} [max{0, g_j(x)}]² + Σ_{i=1}^{l} [h_i(x)]², the penalty parameter β in equation 3.5 must become infinite in order to achieve complete convergence on x* [Gill et al., 1989].

In the case of an exact penalty function, where α(x) = Σ_{j=1}^{m} max{0, g_j(x)} + Σ_{i=1}^{l} |h_i(x)|, the penalty parameter β in equation 3.5 takes a constant value and β ≥ 0.
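Equations 3.3 to 3.5 can be restated as a short sketch. This is a plain illustration of the formulas under the g_j(x) ≤ 0 convention used above, not the LSP implementation:

```python
def quadratic_penalty(x, eqs, ineqs):
    """alpha(x) with r = 2: squared equality violations plus squared
    positive parts of the inequality constraints g_j(x) <= 0."""
    return (sum(max(0.0, g(x)) ** 2 for g in ineqs)
            + sum(h(x) ** 2 for h in eqs))

def exact_penalty(x, eqs, ineqs):
    """alpha(x) with r = 1: absolute constraint violations."""
    return (sum(max(0.0, g(x)) for g in ineqs)
            + sum(abs(h(x)) for h in eqs))

def auxiliary(f, x, beta, eqs, ineqs, penalty=exact_penalty):
    """Equation 3.5: the auxiliary function f(x) + beta * alpha(x)."""
    return f(x) + beta * penalty(x, eqs, ineqs)
```

For example, minimizing f(x) = x0 subject to x0 ≥ 3 would be encoded with the single inequality g(x) = 3 − x0 ≤ 0, and any point with x0 < 3 then pays a penalty proportional to its violation.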
There is then a specific value of β such that x*, the unconstrained problem minimizer, is also the solution to the original constrained problem [Gill et al., 1989].

The meaning of "large value" in connection with specifying β is not very clear for a particular problem, and selecting an inappropriate value of β can lead to computational difficulty. If β is too small, the penalty function may be unbounded below and as a consequence produce a solution point far from the feasible region. If β is too large then it induces an ill conditioned Hessian matrix, which can imply slow convergence for many NLP algorithms. The condition of a matrix is defined as the ratio of its largest and smallest eigenvalues, and a matrix is ill conditioned if this ratio is large.

With large values of β more emphasis is placed on feasibility, and most procedures for unconstrained optimization will move quickly towards a feasible point. But it is possible for search termination to occur on a feasible point even though the point is far from the optimum. This is especially true in the presence of nonlinear equality constraints [Bazaraa, 1979].

There is no clear indication that progressively increasing β has any advantage over a fixed large β in saving computational effort. But using several values of β, together with their corresponding solution points, to predict the solution point with the next β value is a classical search procedure. Interpolation is sometimes used to predict the next β from the past values [Gill et al., 1989].

3.2.1 Penalty parameters and LSP

Generally LSP responded best to the use of an exact penalty function as opposed to the quadratic penalty function.
The gentle slopes induced in the objective function surface by exact penalty functions encourage LSP to form larger volume cuboids at every iteration when compared with the surface produced by the quadratic penalty function.

In practice, LSP starts finding the solution for the auxiliary function with a large β, and reduces the value of β by 10%-30% at every iteration. In this research a sequential reduction of the penalty parameter has been found to provide the greatest enhancement to the efficiency of the LSP search. This approach is exactly the opposite of the traditional NLP penalty function strategy, where the value of β increases as the optimum is approached.

The advantage of sequentially reducing the penalty parameter is that the decrease brings about the acceptance of more infeasible points into the current cuboid. With the introduction of more infeasible points into the cuboid, the cuboid volume is larger than in the constant penalty parameter case. This in effect retards the rate of reduction of cuboid volume over consecutive iterations. Although slow cuboid reduction requires more iterations to reach convergence, its compensating advantage is that the newly generated points are retained for a greater number of iterations than with the constant penalty parameter strategy. Furthermore, because points farther out of the true feasible region are tolerated with a lower penalty parameter value, fewer function evaluations per acceptable point are required. The process of reducing β continues until α(x), and hence the penalty term βα(x), goes to zero. This must occur before convergence so that all of the unmodified constraints are satisfied and the original problem is finally solved.

Figure 3.6 demonstrates the effect on the level set boundaries of modifying the penalty parameter for a one variable problem with two constraints. The objective function is a straight line.
The problem can be formulated as follows:

Minimize f(x) = kx
Subject to:
x ≥ 3
x ≤ 3.1

where k is a constant.

A penalty is applied to just one of the constraints, i.e. to x ≥ 3 only. The lines designated as P1(x), P2(x), etc. indicate plots of the auxiliary functions, that is the objective function plus the penalty term, at the first, second, etc. iterations. The c1, c2, etc. indicate the level set c-values established in the LSP search at the corresponding iterations.

[Figure 3.6: Modified penalty function.]

The plot of the intersection points of the level set values and the corresponding auxiliary function values represents the objective function of an equivalent unconstrained problem.

The whole idea of reducing the penalty parameter is to generate new points more efficiently and at the same time smoothly contract the level set boundary so that the search eventually excludes any infeasible space. The full benefits of penalty modification may not be achieved unless a near optimum rate of reduction is used, otherwise the following problems may arise:

• If the reduction of β between consecutive iterations is large in the initial iterations, then the influence of the penalty term on the objective function will be small in the later iterations as β approaches its lower bound value. Then many infeasible points generated at the later iterations would be present in the acceptance set. In that case the search is not directed towards the feasible region and may converge on an infeasible point.
A large reduction of β allows more infeasible points to remain in the level set for a longer period. If the problem is divided into subregions because of suspected multiple optima, it is possible that a subregion may contain only infeasible points. Then the search in that subregion would obviously lead to an infeasible solution point.

• A very small reduction of β does not allow many infeasible points into the acceptance set, and hence introduces an inefficient search.

In order to overcome the problems mentioned, two measures are taken:

i) The reduction of β between consecutive iterations is kept small (between 0.1 and 0.25 of β).

ii) A lower bound for the parameter β is predetermined. The upper bound can be as high as required, though extremely high values would lead to an inefficient search.

Typical values of β and its reduction are given in Chapter 5, section 5.2.6.

3.3 Inequality and equality constraint relaxation

When the feasible region defined by the set of the constraints is very small or very narrow, or defined by equality constraints, the generation of acceptable points can be inefficient and even difficult. To overcome such difficulties and speed up the search, the constraint functions can be initially relaxed and the relaxation eliminated in the later iterations. The relaxation method may also be applied to the bounds of the original cuboid if it is felt that the bounds are too tight for efficient point generation. Penalties are associated with the relaxation of constraints, so that infeasible points are discouraged from staying in the acceptance set. As a result of the penalties introduced into the problem, the final solution is likely to be found in the feasible region.

The idea of constraint relaxation in this research was first investigated using slack variables as introduced in the reduced gradient method [Lasdon & Waren, 1982].
But the introduction of these new variables increased the dimensionality of the problem without any compensating gain in performance. Instead, the constraints are allowed to relax by simply adding (or subtracting) a constant term to the right hand side of the constraints. Equality constraints are replaced by two constraints with the constant added or subtracted from their right hand sides to form an interval. The constraint relaxation raises the volume of the feasible region, which in turn raises the ratio of the feasible region volume to the cuboid volume; as a result the efficiency of generating acceptable points is increased. Figures 3.7(a) and (b) demonstrate how constraints are relaxed for equality and inequality constraints respectively for a two variable problem.

The number of function evaluations at the first iteration, N(1), gives a general indication of how tight the constraints are. For a highly constrained problem, N(1) would be very large when compared with the value of 2 × Nkeep expected for an ideal unconstrained problem.

Setting the magnitude of the relaxation is a trial and error process. The constraints are relaxed by an initial value and N(1) is compared with 2 × Nkeep. If the response to this initial relaxation does not seem to be promising, the search should be interrupted during or after the first iteration and the constraint relaxation adjusted until the user feels efficient point generation has been achieved. While this process is unquestionably heuristic, initial trial relaxation values which appeared to be "sensible" in the context of the problem being solved were often successful or were easily corrected from the feedback provided by N(1).

3.3.1 Tightening of relaxed constraints

The initially relaxed constraints are progressively tightened so that the level set will ultimately be confined to the feasible region. Ideally the relaxed constraints will be tightened back to their original form before the convergence criterion is met.
This is achieved by reducing the relaxation by a fixed fraction at each iteration. How far constraints should be tightened at each iteration is a compromise between the smallness of the feasible region and how fast the search is expected to achieve convergence. If the relaxation reduction per iteration is high, the search might contract into a very small region and the original difficulty will again be faced. On the other hand a very small reduction at each iteration can lead to a high number of iterations. Reducing the relaxation to a typical value of 0.75 of its value at the previous iteration would lead to a reduction to 0.01 (i.e. 0.75^16) of the initial relaxation after the 16th iteration. The ideal value of the tightening parameter is related to the complexity of the constraint functions and the size of the feasible region, but was found to be robust in the region of 0.75 for virtually all of the test problems where relaxation was appropriate.

[Figure 3.7: Relaxation of constraints; (a) relaxed equality constraint, (b) relaxed inequality constraint.]

3.4 Cuboid stretching

At any stage of the search, the acceptance set is expected to be a subset of the cuboid. The minimum and maximum coordinate values of the confirmed points are used to locate the cuboid sides. Due to limited sampling, the true bounds of the level set may actually extend beyond the observed points. Therefore, the confirmed points always tend to produce a biased under estimate of the bounds of the true level set. This bias is corrected by extending the cuboid faces outwards by some prescribed amount. This is referred to as cuboid stretching. The ideal size of this correction is not known in practice, but overcorrection only incurs a small penalty in the efficiency of point generation, while undercorrection can have the serious consequence of omitting a global solution.
Overcorrection is therefore preferred, and a good heuristic estimate can be made.

The following expressions to stretch the bounds of the cuboid were proposed in [Chew & Zheng, 1988]. For a one dimensional case, let ξ be a random variable on the interval [a, b] with distribution function F, where F is assumed to be a uniform distribution. Assume that t points are sampled, that is ξ1, ξ2, ..., ξt are confirmed points in the acceptance set and t is the number of confirmed points. We wish to construct unbiased estimators of a and b from the sampled points. Suppose a = 0 and b = 1, and let

B_l = min{ξ1, ξ2, ..., ξt}
B_u = max{ξ1, ξ2, ..., ξt}

Then the distribution functions φ_l and φ_u of B_l and B_u are respectively

φ_l(y) = 0 for y ≤ 0;  φ_l(y) = 1 − [1 − F(y)]^t for 0 < y < 1;  φ_l(y) = 1 for y ≥ 1

φ_u(y) = 0 for y ≤ 0;  φ_u(y) = [F(y)]^t for 0 < y < 1;  φ_u(y) = 1 for y ≥ 1

The expected mean values for each bound are given as

E[B_l] = ∫₀¹ (1 − F(y))^t dy, which for uniformly distributed F simplifies to 1/(t+1), and

E[B_u] = 1 − ∫₀¹ (F(y))^t dy, which simplifies to t/(t+1).

The corrected new bounds are estimated as

B_l' = B_l − (B_u − B_l)/(t − 1)
B_u' = B_u + (B_u − B_l)/(t − 1)

where B_l' and B_u' are the unbiased estimators of the end points.

The consequence of this bias correction is that each cuboid side is stretched at each end by 1/(n − 1) times the length of the observed side at every iteration, where n is the number of confirmed points surviving in the acceptance set.

3.5 Skewness adjustment

Skewness adjustment is a combination of a shift of cuboid location and cuboid stretching at the end of each iteration, and was also proposed in [Chew & Zheng, 1988]. It was confirmed in this research to be one of the most important heuristic devices for speeding up an LSP search and increasing the reliability of convergence on optimal points.

Since the adopted point generation scheme for new points in a cuboid utilizes a uniform random distribution along the axis of each variable, the possibility of missing an important point is high when the optimal point lies at a boundary, and even greater if it lies at a corner of the feasible region formed by multiple boundaries. The danger
of missing an important boundary point can be lowered either by substantially raising Nkeep, which would in turn increase the total number of function evaluations, or by shifting the cuboid to ensure that it straddles the most promising region. When the acceptance set at any iteration is skewed, that is when the point with the lowest objective value is close to the boundary of the cuboid, an adjustment can be made to both the size and location of the cuboid so that the minimum point is shifted towards the centre of the cuboid. A slight shift of the cuboid about the current minimum point helps more points to be generated around the current minimum point.

With a skewness adjustment the shift or translation of the cuboid is governed by the location of the single best confirmed point. This introduces a bias whereby more points are generated close to the current best point. But, since the magnitudes of the shift are controlled by certain user selected parameters, there is a risk that the detection of multiple optima can be undermined by imperfect skewness parameters.

The magnitude of the shift is somewhat arbitrary. A shift which is too large may lead to missing an optimal point, while a shift which is too small neither speeds up the search nor directs the search towards the optimum significantly. Experience in this research suggests that there is a range of shift which results in a very low chance of missing the global optimum, and this range is in the order of 4% to 10% of each current cuboid side length.

Figure 3.8 demonstrates the principle of skewness adjustment in a two dimensional problem. There, the global optimum is located at x*, while g1 and g2 are two constraint functions, and the objective function and its gradient are represented by the f(x) isoquants and the arrows. The rectangular boxes represent the cuboids at consecutive iterations, the largest at the earliest iteration and the smallest at the last iteration.
In the iterative process of LSP, points generated close to the right corner of the cuboid would be discarded early since they do not fulfil the level set condition. The remaining points would tend to cluster towards the global optimum point. But because of the limited sample point generation and the deficiency of sampling on the boundary itself, points around the lower left corner of each cuboid may not be sampled. An unadjusted cuboid at any iteration except the first excludes the lower left corner of the previous cuboid. Therefore the location of the current cuboid gets further away from the global optimum with each iteration. Consequently, convergence without skewness adjustment would occur at a non optimal point x''.

When the skewness adjustment is implemented, as shown in Figure 3.8(b), the cuboid is shifted towards the global optimum at every iteration and eventually convergence occurs at the true global optimum.

[Figure 3.8: Shift of cuboid with skewness adjustment; (a) no skewness adjustment, (b) skewness adjusted.]

Skewness adjustment can be used for problems of any dimensionality, and the adjustment is applied independently in each dimension. The procedure in each dimension is as follows.

Define the skewness δ at a single iteration as

δ = ((x' − B_l) − (B_u − x')) / (B_u − B_l)

Defining X_r = B_u − B_l, then

δ = (2x' − (B_l + B_u)) / X_r    (3.6)

where B_l and B_u are the lowest and highest observed points measured on this variable's axis, x' is the coordinate of the minimum confirmed solution point in the current cuboid and X_r is the length of the cuboid side in this dimension. Note that δ can be +ve or −ve, in the interval −1 ≤ δ ≤ 1.

Adopt three skew parameters: δ0, which acts as a skewness activator threshold; and δ1 and δ2, which limit the size of the skewness adjustment.
Their specific values will be discussed in Chapter 5, section 5.2.5, but they must be bounded as follows:

0 ≤ δ0 ≤ 1
0 ≤ δ2 ≤ δ1 ≤ 1

Adjustments to the cuboid bounds are made in relation to the size of the skew and the skew parameters, and applied only if δ > δ0 or δ < −δ0 [Chew & Zheng, 1988]:

If δ ≥ δ0:    B_u' = B_u + δ δ1 X_r,   B_l' = B_l + δ δ2 X_r
If δ ≤ −δ0:   B_l' = B_l + δ δ1 X_r,   B_u' = B_u + δ δ2 X_r

where B_l' and B_u' are the new lower and upper bounds of the cuboid.

Such an adjustment results in both stretching and shifting of the cuboid. The stretching of the cuboid, Δ_s = |δ|(δ1 − δ2)X_r, lies between δ0(δ1 − δ2)X_r and (δ1 − δ2)X_r, whereas the shift of the centroid of the cuboid, Δ_c = |δ|(δ1 + δ2)X_r/2, ranges from δ0(δ1 + δ2)X_r/2 up to (δ1 + δ2)X_r/2, depending upon the degree of the skew; that is

δ0(δ1 − δ2)X_r ≤ Δ_s ≤ (δ1 − δ2)X_r
δ0(δ1 + δ2)X_r/2 ≤ Δ_c ≤ (δ1 + δ2)X_r/2    (3.7)

Experience in this research revealed that skewness adjustment is not a good approach for discrete variable problems, especially when the sides of the cuboid become small relative to the discretization interval. For a discrete variable problem, the shift and stretch of the cuboid introduced by skewness adjustment is not just a small fraction of the cuboid dimensions. If the cuboid side length for an integer variable were, say, 3 units, then the minimum possible shift along that axis, one discretization unit, would be a shift of 33% of the total length. Such a significant alteration in the location of the cuboid can lead to the exclusion of important points.

3.6 Search domain modification

The efficiency of generating acceptable points in a cuboid declines as the volume ratio between the region occupied by the acceptance set and the cuboid gets smaller. One of the typical cases is when the x_i vs x_j scatter diagram of points in the acceptance set forms a cluster around a diagonal of the current cuboid. In such cases generating points within the cuboid boundary is inefficient. Two different techniques were considered in this thesis to improve on this situation.
The first technique was the rotation of the axes of the cuboid so that one of the sides of the new cuboid would be almost parallel to the line which passes through the centre of the cluster. This technique requires substantial extra computational effort since it involves transforming all points to the new rotated axes, constructing the cuboid and generating new points, and then transforming each new point back to the initial axes to check feasibility and compute the objective function.

The second technique, which is discussed in detail in this section, uses a non-right angle parallelepiped, as shown in Figure 3.9, instead of a cuboid as the search domain. The non-right angle parallelepiped is referred to here as a rhombohedron. A rhombohedron is defined as a parallelepiped whose faces are parallelograms in which the angles are oblique and adjacent sides are unequal. The use of a rhombohedron increases the search efficiency by the ratio of the volume of the (minimal) cuboid to the volume of the (minimal) rhombohedron.

[Figure 3.9: Contrast between cuboid and rhombohedron in two variable space, showing the confirmed points enclosed by both a cuboid and a rhombohedron.]

The shift from the cuboid to the rhombohedron search domain can be instigated when the displayed scatter diagrams of the current acceptance set indicate a strong correlation between a pair of variables. Alternatively a statistical procedure to calculate a correlation matrix of all variables can be initiated at any stage of the search. If the absolute value of any of the correlation coefficients is greater than a pre-specified value, a linear regression analysis can be performed between the highly correlated variables. The choice of the dependent and independent variables is arbitrary.

Point generation in a rhombohedron is demonstrated in Figure 3.10. Initially a point is randomly generated on the independent variable axis within the current cuboid, for example point a in Figure 3.10.
The corresponding value of the dependent variable is then calculated on the regression line, i.e. at point b in the figure. The generated point to be considered in the level set would be randomly chosen around the regression line, plus or minus a pre-specified band width. That is to say, the candidate point would be generated randomly on the line joining points c and d. Finally, the dependent variable value at point b is randomly perturbed up or down to produce a generated point. The limits of this perturbation were defined in this work by ±3·se, where se is the standard error of the estimate returned by the regression analysis. This ensured that the random generation of new points occurred within a rhombohedron large enough to virtually eliminate the risk of excluding any of the acceptance set.

Figure 3.10: Point generation in a rhombohedron search domain.

Chapter 4

TEST PROBLEMS

The merits of an optimization method lie in its ability to solve a wide range of problems within the required bounds of accuracy and reliability. A newly developed method cannot be evaluated solely on its theoretical details but has to be tested numerically. Ideally the same standardized testing procedure could be applied to all optimization methods, but such a procedure has not yet evolved. The practical alternative is to apply the optimization method to a wide variety of suitably challenging test problems. Another equally important alternative is to use test problems which provide insights into the strengths and weaknesses of a particular method.

A major component of this research effort was expended on exploring the viability of LSP in solving both mathematical and practical engineering problems. This involved solving some practical engineering optimization problems where the dimensions and the nonlinearities of the problem challenge existing optimization methods.
In addition, a new test problem, designed specifically to explore perceived weaknesses of LSP, was developed.

About 200 nonlinear optimization problems with published solutions were collected from the mathematical literature. A selection of 78 of the more challenging of these 200 mathematical test problems solved with LSP are listed in Appendix A. Solutions cited in the literature and the LSP solution(s) are given for each problem. The problems documented in the Appendix were chosen for two major reasons: first, for the challenge they have presented to many of the existing NLP methods; second, because they were found to provide useful insights into LSP performance.

A second group of test problems consists solely of engineering problems. The problems in this group are published practical problems known to be difficult for many NLP methods and of interest to engineers. Some of these publications present specific NLP approaches developed to solve just one of these problems.

A third group of test problems involves parameter fitting for a specified model. The values of the model parameters are established by optimizing various nonlinear best-fit criteria. This is an area of application where LSP provides a unique and potentially attractive perspective. Some of these models are used to demonstrate the capacity of LSP to refine estimates of parameters previously established with nonlinear regression methods.

In every one of the problems, the problem formulation used with LSP was tested for numerical agreement with the published solutions. Except for very few cases, all the solutions in the source literature were verified. Any deviation from the original solution is reported and the discrepancy discussed.

Whenever possible, the widest possible variety of problems is addressed to explore the robustness of LSP. However, the main difficulty experienced was in finding suitably challenging mathematical and engineering test problems.
Most of the published test problems uncovered in the literature were felt to be inadequate for testing all qualities of LSP. In response, one new, parametrically adjustable, mathematical test problem was developed during this research. Significantly, many of the existing NLP methods fail to solve this new test problem unless a favourable starting point is selected. Details of this new test problem are provided in section 4.2.1 of this chapter.

The algorithm used in addressing all of the test problems in this research is presented in Figure 4.1 as a flow chart. Its main steps are: relax constraints; generate replacement points (optional); evaluate the objective at each point; evaluate moments; set the level set value c = Max{f(X)}; discard bad points; check for linear correlation (optional); check for distinct clusters (optional); define a new cuboid; stretch the cuboid bounds; and make the skewness adjustment.

Figure 4.1: The LSP algorithm as implemented in this thesis.

The algorithm performs all tasks automatically except those indicated as optional in the figure. The optional tasks can be made functional at any stage of the search by the user as the need arises, or may be triggered automatically when default conditions occur. These optional tasks can be computationally intensive and are often not crucial to finding a solution.

4.1 Coding of mathematical test problems

The mathematical problems documented in Appendix A are coded in such a way that the number of variables and constraints involved can be easily recognized from the code. Each problem coding takes the form

[Problem A-B-C-D, Location in thesis]

where

A designates the number of variables in the problem;
B designates the index number within that particular dimensionality group;
C designates the number of equality constraints;
D designates the number of inequality constraints.

For instance, [Problem 3-7-2-1, Appendix A] would mean the seventh problem within the set of 3-variable problems, and the problem has two equality and one inequality constraint.
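The coding scheme is mechanical enough to capture in a few lines. This helper is purely illustrative (it is not part of the thesis) and simply splits a code such as 3-7-2-1 into its four fields:

```python
def parse_problem_code(code):
    # Split a problem code of the form "A-B-C-D" into its four fields:
    # A = number of variables, B = index within the dimensionality group,
    # C = number of equality constraints, D = number of inequality constraints.
    n_vars, index, n_eq, n_ineq = (int(part) for part in code.split("-"))
    return {"variables": n_vars, "index": index,
            "equalities": n_eq, "inequalities": n_ineq}
```

For example, `parse_problem_code("3-7-2-1")` recovers a 3-variable problem with two equality and one inequality constraint.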
It is documented in Appendix A of this thesis.

Most practical optimization problems assume upper and lower bounds on each variable, but these bounds are not counted here as constraints. The only instances where bounds are considered as inequality constraints are where the source literature dictates their use for specific reasons. The original source for each problem is referenced.

4.2 Mathematical problems

It is not easy to find test problems which can reveal all the strengths and weaknesses of a single NLP method. Recognizing this difficulty, a wide range of solved test problems were chosen from different journals and books. The more interesting mathematical test problems solved using LSP are documented in Appendix A. Results given in the literature and those obtained with LSP are documented along with each problem. The distribution of these problems according to the number of variables is summarized in Table 4.1.

Number of    Unconstrained   Constrained
variables      problems       problems     Total
    1              5              1           6
    2              8             15          23
    3              5             12          17
    4              3              9          12
    5              -              4           4
    6              -              6           6
    7              -              3           3
    8              -              3           3
    9              -              2           2
   15              -              1           1
   20              1              -           1

Table 4.1: Summary of test problems reported in Appendix A

About two thirds of the test problems documented in Appendix A involve up to 4 variables. Their specific characteristics are intended to challenge NLP methods apart from the issue of dimensionality. Furthermore, these low dimensional problems are relatively easy to visualize. The space defined by the set of constraints and the surface of the objective function can often be interpreted and clarified in simple 2 and 3 dimensional graphs. Such graphs help visualization of the search process and its response to the difficulties presented by the problem. The experience gained with the low dimensional problems can often be exploited in higher dimensional problems.

Comparison of LSP results with the solutions given in the literature showed that, in about 95% of the cases, the LSP results confirmed those cited in the literature.
In about 5% of the cases, LSP gave better results than had previously been published. Two types of improvement were observed. One was the identification of multiple solutions with LSP when the source literature gives only some of these solutions. LSP generally found the extra solution point(s) in a single run. For example, only two solution points were reported in the literature for [Problem 2-11-0-0, Appendix A], but LSP found two additional global optimal points. A second type of improvement was observed when LSP converged at a different solution point from the one given in the literature, with an improved objective function value. Such improvements were rare, since test problems often favour the NLP methods that they were developed to demonstrate. Examples of test problems with improved solutions are [Problem 4-8-0-3], [Problem 4-10-1-2] and [Problem 7-2-0-3, Appendix A].

There were several instances where, when the solution cited in the literature was tested, the constraints had been significantly violated, although this was not acknowledged. In other instances, the claimed optimal value of the objective function did not agree with the calculated value at the solution point cited. For example, the constraints of [Problem 9-2-6-0, Appendix A] were significantly violated at the optimal solution point in the published solution. Similarly, the optimal objective function value cited in the literature for [Problem 1-4-0-0, Appendix A] cannot be achieved at the given solution points; LSP found multiple global optimal points which produce a superior objective function value for the same problem. When corrections are made to such incorrect solutions in this thesis they are not reported as improvements, but as errors in the source literature.

Some special problems were encountered by LSP when solving some of the test problems and are reported: either the results were inferior to those mentioned in the literature, or the number of function evaluations was considerably higher.
An interesting example is [Problem 2-12-0-4, Appendix A], discussed in detail in Chapter 5, section 5.5, where the cause of the difficulty is associated with the geometry of the feasible region.

4.2.1 The Road Runner function

This new test problem was designed to test global optimization methods in general as well as to specifically challenge LSP and explore its weaknesses. It produces an objective function surface which has an easily missed fissure containing the global optimum, and a topology which can cause fragmentation and dispersion of the level set. The scale of its prominent features and the location of the global optimum point are controlled by two parameters. The first parameter, a, in Equation 4.1 adjusts the size of the fissure on the objective function surface: it controls how wide and how deep the fissure can be. Here, fissure width is defined as the distance between the fissure edges on each variable axis, and fissure depth as the difference between the maximum and minimum objective function values. The second parameter, b, controls the location of the global minimum. Figure 4.2 shows the influence of parameters a and b on the objective function, particularly at the edges of the fissure region.

This test problem has been named "The Road Runner function", after a popular film cartoon which is set in terrain resembling the extreme topography of the objective function surface produced. The problem is a challenge for all local optimization methods unless a favourable starting point is chosen. It is also a challenge for LSP, since it is particularly sensitive to sparse sampling of the level set. The additive structure of this function allows it to be easily extended to any number of dimensions while retaining the same critical features.

The general n-dimensional formulation of the Road Runner function is given in Equation 4.1.
 a  | x @ fissure edges  | f(x) at edges      | Fissure width | Fissure depth
 5  | -0.43114, 1.27782  | 4.22497, 1.76965   | 1.70894       | 4.22497
 10 | -0.32818, 1.00210  | 7.24521, 2.31046   | 1.41029       | 7.24521
 20 | -0.26295, 0.95756  | 13.24932, 3.21143  | 1.22051       | 13.24932

Figure 4.2: The Road Runner function in one dimension to show the influence of a.

f(x) = Σ_{j=1}^{n} {(x_j − b)² + a·|x_j − b|}^(1/4)        (4.1)

where a and b are parameters which directly adjust the scale of the critical topographic features and n is the required dimensionality of x. This function will always have its single global minimum at x_i = b, i = 1, ..., n, where f(x) = 0.

The function becomes a unimodal problem if the search domain is restricted to the small region around the global optimum. The search domain has to be bigger than this minimal region to observe all of the features of the function. Thus, the bounds on each variable should be wide enough to enclose regions beyond the fissure width. Multiple local optimal points develop with this bigger search domain, and their number increases exponentially with dimensionality. The total number of local optimal points is 3, 9 and 27 for the one, two and three dimensional functions respectively; in general, there are 3^n local optima for an n-dimensional function. Only one of the local points is the global optimum in each case. Two and three dimensional plots of the two dimensional version of this function are given in Figure 4.3, using the values a = 10 and b = 0.5. In this case the function is

f(x) = {(x1 − 0.5)² + 10·|x1 − 0.5|}^(1/4) + {(x2 − 0.5)² + 10·|x2 − 0.5|}^(1/4).

Figure 4.3(a) shows that the function has two sets of valleys. Each set consists of four valleys having similar depths. The depth of each valley increases as it goes farther away from the global optimum. The 4 valleys which are parallel to the axes all have equal depths. The other 4 valleys, which are diagonal to the axes, also have similar depths, but different from the first four.
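A separable evaluator for the Road Runner function is straightforward. This is a sketch only: the outer quarter-power exponent is an assumption here, because the printed exponent is not legible in the source; everything else follows the stated formulation (global minimum of zero at every coordinate equal to b).

```python
def road_runner(x, a=10.0, b=0.5):
    # Separable Road Runner-style objective: each coordinate contributes a
    # term built from (x_j - b)^2 + a*|x_j - b|.  The 1/4 power is an
    # assumption where the scanned equation is illegible; any positive
    # exponent preserves f(b, ..., b) = 0 and f > 0 elsewhere.
    return sum(((xj - b) ** 2 + a * abs(xj - b)) ** 0.25 for xj in x)
```

By construction the function extends to any number of dimensions simply by lengthening the input vector, which is the additive structure noted in the text.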
The local optimal points appear at the intersections of the valleys and at the boundaries formed by the bounds of each variable. The global point is at x = (0.5, 0.5), at the centre of a very narrow fissure, where f(x) achieves its global minimum of zero.

Figure 4.3(b) clearly reveals the influence of starting point location on the solutions obtained with gradient optimization methods. Gradient based methods always converged at a local optimum unless they started at a point very close to the global optimum. Figure 4.3(b) shows 36 arbitrarily chosen starting points and the corresponding optimal values obtained using gradient optimization methods. The results shown in Figure 4.3(b) were found using popular NLP methods such as FMIN, GRG2 and GAMS, which uses MINOS. Searches starting at points with circular marks converged at the true global optimum; searches started at points other than those identified by circles converged at local optimal points. The optimal objective function values mentioned in the figure are for the purely unconstrained cases, i.e. where there were no bounds on the variables.

As was intended, this problem challenges LSP, particularly when the initial variable bounds are wide and the value of Nkeep is low. With a large cuboid, the ratio of the fissure volume to the cuboid volume becomes very small, so points generated in the early stage of the search are unlikely to fall within the fissure. Because of the nature of the objective function surface, points far away from the fissure produce better objective function values than those that are close to the fissure but not actually within it. As the level set value c approaches 1.0, there is little size reduction in the cuboid with successive iterations, but the level set value c continues to improve at each iteration.
This situation is the least advantageous for LSP, because the ratio between the acceptance region and cuboid volumes gets very small and the search becomes inefficient.

If an initial cuboid with one of its corners very close to the global optimum is used, the chances of generating sample points within the fissure are much lower than when the global solution is more centrally located in the cuboid. The reason is that, in regions close to the global optimum, points produce very high objective function values and are quickly rejected from the level set. Therefore, unless points with low objective function values are sampled on either side of the global minimum, the region close to the global point can be excluded from the current cuboid at an early stage of the search. The situation gets worse with low Nkeep. Consequently, the search terminates at a point other than the global optimum. The reliability of LSP can always be improved by raising Nkeep: with a higher number of points, the chance of generating points in the fissure at the early stage of the search is increased.

Figure 4.3: The Road Runner function in two dimensions: (a) the objective function surface; (b) starting points for gradient searches, where triangular marks identify starts that give 1 as an optimum, circular marks identify starts that give 0 (the global optimum), and the remaining marks identify starts that give 2.

The Road Runner function was solved using LSP for various numbers of variables. The result for the 2-dimensional function, with Nkeep = 20, a convergence criterion on VM of 1×10⁻⁶, a = 10, b = 0.5 and initial bounds −4 ≤ x1, x2 ≤ 4, was

f(x*) = 4.0E−4 at x* = (0.5000, 0.5000).

The number of function evaluations expended to meet this convergence criterion was 4,230.

For a two dimensional Road Runner function there are 8 valleys stretching radially away from the global optimum.
The valleys are equally spread in the search domain, and the minimum point in each valley occurs at the boundaries formed by the variable bounds. Such a distribution of the local minimal points is believed to introduce the greatest inefficiency into set based search schemes, and into LSP in particular. Due to the orientation and spatial distribution of the valleys, the domain modifications discussed in Chapter 3, section 3.6, do not improve the LSP efficiency, as there is no search domain which can enclose all confirmed points more efficiently than the cuboid. The function with rotated axes has features similar to the original one, so axis rotation brings no computational gain whatsoever.

4.3 Engineering test problems

For the most part, real world engineering test problems for optimization methods deal with the search for superior designs. Such problems often involve both continuous and discrete variables. In addition, the functions involved may be discontinuous and non-differentiable. Most of the established nonlinear optimization methods are designed to handle only continuous and differentiable functions, and are local rather than global optimizers. These restrictions may account for some of the resistance by civil engineering practitioners to the application of these existing optimization methods for solving practical problems. There are very few well documented civil engineering test problems in the literature, and this scarcity represented one of the major difficulties in testing LSP's capabilities in that field of specialization. In addition, those examples which do exist in civil engineering were not developed to isolate and test specific qualities of optimization methods, other than perhaps their high dimensionality capabilities.

Floudas and Pardalos [Floudas & Pardalos, 1990] documented a collection of engineering test problems for constrained global optimization algorithms. A few of the more interesting problems have been selected from this collection.
Significantly, some of the results given in the literature are not global solutions. For instance, solutions to the 4-variable problem [Problem 4-10-1-2, Appendix A] and the six variable problem [Problem 6-5-3-12, Appendix A] are far from the optimums cited on page 29 of the book entitled A collection of test problems for constrained global optimization algorithms [Floudas & Pardalos, 1990]. In some cases the reported optimum value of the objective function is different from that obtained by using the values in the solution vector; the 4-variable problem already mentioned [Problem 4-10-1-2, Appendix A] is an example of such inconsistency. In other cases the optimal points given for some test problems violate the constraints. Examples of these are the 7-variable problem on page 153 and the 27-variable problem on page 67 of the same book.

Because of such inconsistencies, many test problems which at first sight seem to be well documented were not of any value for LSP testing purposes. In spite of having to reject many, about 20 credible published engineering problems were found and used for test purposes. These problems varied from 3 to 38 variables, and included both discrete and continuous variables. They cover applications of optimization to pipe network design, structural truss design, treatment basin design, heat exchanger design, air pollution control design, irrigation system design and hydrological model parameter determination. Details of 11 of the most challenging problems, including their solutions, are given in the next subsections.

In most cases the LSP results were comparable to the solutions given in the literature, but again some improvements were observed. For instance, a 12.91% improvement on the published objective function value was found with the pipe network design [Problem 32-1-40-34, section 4.3.2].
Another improvement was obtained with a better fit for the hydrologic model parameters which, from a hydrologist's point of view, produced a far more reasonable unit hydrograph for the UH ordinate determination problem [Problem 19-1-21-0, section 4.3.5].

4.3.1 Basin design [Problem 3-16-1-3]

Source: F. Douglas Shields Jr. and Edward L. Thackston, "Designing Treatment Basin Dimensions to Reduce Cost", ASCE, Journal of Environmental Engineering, Vol 117, No 3, May/June 1991.

Method of optimization used in the source literature: choosing the best curve out of a series of length/width versus cost plots drawn separately for different numbers of spurs.

Problem formulation:

A rectangular shallow treatment basin is to be designed with internal spur dikes, as shown in Figure 4.4. The designer is expected to choose the basin length L, the basin width W at the water surface, the number of dikes N and the ratio of spur dike length to basin length r. The choices have to satisfy the governing constraints at a minimum cost. The geometric and hydrologic relationships are given as follows.

Figure 4.4: Treatment basin (plan and Section A-A).

Geometric relationships:

A = LW − NrLW_s
V = LWD − NrLV′
Lf = rL(N + 1)
Wf = W/(N + 1)

so that Lf/Wf = (L/W)r(N + 1)², where

A is the water surface area, calculated as the total basin area at the water surface elevation minus the area occupied by the spur dikes;
V is the basin volume, calculated as the total basin volume (assuming that the volume of side slopes below water is negligible) minus the volume of the spur dikes;
L is the length of the basin;
W is the width of the basin;
W_s is the width of a spur dike at the water surface;
N is the number of spur dikes;
r is the ratio of spur dike length to basin length;
D is the depth of water;
V′ is the unit volume of spur dike below the water line;
Lf is the flow length;
Wf is the flow width.

Hydraulic relationships:

HE = t/T

where HE is the hydraulic efficiency, t is the mean hydraulic retention time, T is the volumetric residence time (T = V/Q) and Q is the mean flow rate. For a given Q, the value of T can be adjusted by changing the basin geometry. Experimentally, the hydraulic efficiency is found to be related to the flow length to flow width ratio, Lf/Wf, expressed as

HE = 0.84[1 − exp(−0.59 Lf/Wf)]

Therefore

V = tQ/HE

or

LWD − NrLV′ = tQ / {0.84[1 − exp(−0.59(L/W)r(N + 1)²)]}

For a small spur dike volume, NrLV′, the basin volume is approximately LWD; then

LW = tQ / (0.84 D {1 − exp[−0.59(L/W)r(N + 1)²]})

Normally t, Q and D are fixed by design constraints, so

LW = a / {1 − exp[−0.59(L/W)r(N + 1)²]}

where a = tQ/0.84D.

The basin cost includes the costs of land, dikes, inlet and outlet structures and access roads. If f is the ratio of the unit cost per foot of spur dike to the unit cost of the perimeter dike, and if all other costs are negligible relative to the cost of the dikes and land, then the basin cost may be approximated by

cost = [(2L + 2W + 4w) + fNrL]c_d + [(L + 2w)(W + 2w)]c_l

where w is the base width of the perimeter dike, c_d is the cost per unit length of perimeter dike and c_l is the cost per unit area of land. If Q and t are known, and the dike design has determined D, r and w, then the only remaining geometric variables to be chosen are L, W and N.

Two cases are considered:
Case a) No constraint on land.
Case b) Land is a constraint; the width is fixed at 500 ft.

The optimization problem has a nonlinear objective function involving three decision variables. One of the decision variables, N, takes integer values only. The problem has one nonlinear equality constraint.
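The cost and efficiency relations above can be evaluated directly. The sketch below is illustrative only; the default constants follow the values quoted for this design problem (r = 0.85, w = 52 ft, c_d = $20/ft, f = 0.5, and a land cost of $1000/acre ≈ $0.023/sq ft), and the function names are assumptions.

```python
import math

def hydraulic_efficiency(L, W, N, r=0.85):
    # Empirical efficiency from the flow length/width ratio
    # Lf/Wf = (L/W) * r * (N + 1)^2; bounded above by 0.84.
    return 0.84 * (1.0 - math.exp(-0.59 * (L / W) * r * (N + 1) ** 2))

def basin_cost(L, W, N, r=0.85, w=52.0, f=0.5, cd=20.0, cl=0.023):
    # Dike cost plus land cost, per the approximate cost model in the text.
    return ((2 * L + 2 * W + 4 * w) + f * N * r * L) * cd \
        + (L + 2 * w) * (W + 2 * w) * cl

def volume_constraint_residual(L, W, N, r=0.85, D=5.0, Q=27.0, t_hr=45.0):
    # Residual of LW = a / {1 - exp[-0.59 (L/W) r (N+1)^2]},
    # with a = t*Q / (0.84*D) and t converted from hours to seconds.
    a = t_hr * 3600.0 * Q / (0.84 * D)
    return L * W - a / (1.0 - math.exp(-0.59 * (L / W) * r * (N + 1) ** 2))
```

Evaluating `basin_cost(1245, 890, 1)` reproduces a cost of roughly $131,000, matching the Case a literature solution in Table 4.2, and the corresponding constraint residual is a few hundred square feet, the same order as the tabulated error.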
The formulation is:

Minimize cost = [(2L + 2W + 4w) + fNrL]c_d + [(L + 2w)(W + 2w)]c_l

Subject to:

LW = a / {1 − exp[−0.59(L/W)r(N + 1)²]}
W, L, N ≥ 0

The predetermined constant values used and the physical conditions of the basin are as follows.
- The basin is rectangular, and spur dikes are longitudinal, with r = 0.85.
- Perimeter dikes have crown widths of 10 ft, heights of 7 ft and side slopes of 1:3; their unit volume is 217 cu ft/ft, with a base width, w, of 52 ft.
- Spur dikes have crown widths of 2 ft, heights of 7 ft, and side slopes of 1:2. The width at the water line, W_s, is 10 ft, and the unit volume below the water line is 100 cu ft/ft.
- Basin depth is 5 ft.
- Average flow rate, Q, is 27 cu ft/sec.
- The required residence time is 45 hours.
- The unit cost of the perimeter dikes, c_d, is $20/ft.
- The unit cost of spur dikes is $10/ft, so f = 0.5.
- The unit cost of land is $1000/acre, or about $0.023/sq ft.

The optimal solution cited in the literature and LSP's output are given in Table 4.2.

                 Case a                 Case b
            Literature    LSP      Literature    LSP
N                1          1           0          1
L (ft)         1245       1266        2310       2083.3
W (ft)          890        869         500        500
Cost (1000 $)   131        131         150        155.5
Error (ft²)   368.58      0.67       314.57       0.26

(Error is the violation of the constraint.)
Table 4.2: Optimal results for the basin design problem

Observations:
• It took 34 minutes to find the optimal solution for Case a of Table 4.2, and only 4 seconds for Case b, with LSP on a 80386-33MHz microcomputer.
• LSP found that the problem has multiple global optima for Case a. A number of near optimal points were also found. The size of the final cuboid on the length variable L ranged between 1,267 and 1,303 ft; on the width side it ranged between 853 and 869 ft, whereas the number of spur dikes was always one. The value of the cost within the final acceptance set varied only by about 0.15%.
• When the results cited in the literature are used, some constraints are violated significantly.

4.3.2 Pipe network design [Problem 32-1-40-34]

Source:
1. E. Alperovits and U.
Shamir, "Design of optimal water distribution systems", Water Resources Research, Vol 13, No 6, pp 885-900, 1977.
2. O. Fujiwara, B. Jenchaimahakoon, N. C. Edirisinghe, "A modified linear programming gradient method for optimal design of looped water distribution networks", Water Resources Research, Vol 23, No 6, pp 977-982, June 1987.
3. G. V. Loganathan and J. J. Greene, "Global approaches for the nonconvex optimization of pipe networks", Water Management in the 90's, 1993.

Methods of optimization used in the source literature: LPG, linear programming gradient, in [Alperovits & Shamir, 1977] and [Fujiwara et al., 1987]. The solution is obtained via a hierarchical decomposition of the optimization problem. The primary variables are the flows in the network. For each flow distribution the other decision variables are optimized by linear programming. Post optimality analysis of the linear program provides the information necessary to compute the gradient of the total cost with respect to changes in the flow distribution. The gradient is used to change the flows so that a (local) optimum is approached. Two global search schemes, MULTISTART and ANNEALING, were used in [Loganathan & Greene, 1993]. The problem is decomposed into a two stage problem where one search strategy selects link flows and a second strategy, linear programming, seeks the optimal pipe design.

Problem formulation:

Consider the network shown in Figure 4.5.
• The network has 7 nodes; the water demand and ground elevation at each node are known.
• The head at each node has to be at least 30 m above the ground elevation of the node.
• The network has 8 links with known lengths.
• The data on allowable gradient, available pipe dimensions, maximum diameter of pipe to be used, Hazen-Williams coefficients and unit prices for each pipe size are known.

Each link is assumed to comprise m pipes, and the overall pipe cost is calculated as
Figure 4.5: Two-looped water supply system.

cost = Σ_{ijm} c_ijm x_ijm

The objective of the problem is to minimize this cost under the following constraints:

Σ_m x_ijm = L_ij
x_ijm ≥ 0
ΔH_ijm = J_ijm x_ijm
H_min ≤ H_s − Σ_{(i,j)} Σ_m J_ijm x_ijm ≤ H_max

where, in the head constraint, the first summation is over all links (i, j) connecting node s and node n, and the second summation is over all segments m in each link. The hydraulic gradient is given by

J = α(Q/C)^1.852 D^−4.87

The nomenclature is as follows:

x_ijm is the length of a pipe segment of the mth diameter in the link connecting nodes i and j;
c_ijm is the unit cost of the mth diameter in the link connecting nodes i and j;
L_ij is the length of the link connecting nodes i and j;
ΔH_ijm is the head loss of the mth diameter in the link connecting nodes i and j;
J_ijm is the hydraulic gradient of the mth diameter;
Q is the discharge;
C is the Hazen-Williams constant;
D is the pipe diameter;
H_s is the head at node s;
H_min and H_max are the lower and upper head constraints at each node n; and
α is a coefficient = 1.526 × 10^10 when Q is in m³/s and D is in cm.

Input data for the pipe system are given in Table 4.3 and Table 4.4. The flow in each pipe, for a given set of pipe diameters and this input data, was solved using the conventional Hardy Cross relaxation method.

The least cost solution for looped pipe network problems demands that some links have two sections with different diameters [Orth, 1986]. Therefore, each link is assumed to consist of two different sized pipes, i.e. m = 2. The sizes of the pipes in a link are to be consecutive sizes available in the market. The problem has 16 pipe diameter variables,
TEST PROBLEMS 105Node Demand Elevation Head Link Length1 -1120 210 210 1 10002 100 150 2 10003 100 160 3 10004 120 155— 4 10005 270 150 5 10006 330 165 6 10007 200 160— 7 10008 1000data for the pipeDiameter Unit cost Diameter Unit cost(in) ( $/ft) (in) ( 8/ft)1 2 12 502 5 14 603 8 16 904 11 18 1306 16 20 1708 23 22 30010 32 24 550Table 4.4: Available pipe diameters and nnit priceswhich take discrete values, and another 16 continnous pipe length variables. Therefore,the problem has a total of 32 decision variables.Observations:Optimal solutions cited in the first source literature and those found with LSP aregiven in Table 4.5 and Table 4.6 respectively. It is reported in [Alperovits & Shamir,1977] that the problem was solved after 19 iterations, which took 4.05 seconds CPU timeon IBM 370/168 machine. LSP found an optimal solution different from the one cited inthe literature. The solution found with LSP showed about 12.91% improvement on theobjective fnnction value over the solution cited in [Alperovits and Sharnir, 1977]. In thesecond source literature, [Fujiwara et al., 1987], a minimum cost of $ 415, 271 is reported.Table 4.3: Input networkChapter 4. TEST PROBLEMS 106Unfortunately no details of the solution are given in the literature, hence no comparisoncould be made with the LSP results. In the third source literature, Loganathan andGreene found a minimum cost of $ 405,301, which is an improvement over the solutionfound with LSP. 
A close examination of this solution revealed that the minimum head requirements at two nodes were violated.

                    Bigger pipe              Smaller pipe
Link   Total length   Length   Diameter   Length   Diameter
          (m)          (m)       (in)       (m)       (in)
 1       1000         255.97      20       744.00      18
 2       1000         996.37       8         3.61       6
 3       1000         999.98      18         0          0
 4       1000         319.38       8       680.62       6
 5       1000        1000.00      16         0          0
 6       1000         784.94      12       215.06      10
 7       1000         999.99       6         0          0
 8       1000         990.91       6         9.06       4
Total cost = $479,525

Table 4.5: Optimal design for the pipe network given in ASCE, 1977

                    Bigger pipe              Smaller pipe
Link   Total length   Length   Diameter   Length   Diameter
          (m)          (m)       (in)       (m)       (in)
 1       1000           0         22       1000        20
 2       1000           0         12       1000        10
 3       1000         632.13      16       367.87      14
 4       1000         232.30       2       767.70       1
 5       1000         294.89      16       705.11      14
 6       1000           0         12       1000        10
 7       1000         860.36      10       139.64       8
 8       1000         114.14       2       885.86       1
Total cost = $417,607

Table 4.6: Optimal design for the pipe network found with LSP

4.3.3 10-member truss design [Problem 10-2-0-40]

Source:
1. Venkayya, V. B., "Design of optimum structures", Computers and Structures, Vol 1, pp 265-309, 1971.
2. David E. Goldberg and Manohar P. Samtani, "Engineering optimization via genetic algorithm", in Electronic Computation, pp 471-482, Kenneth M. Will ed., 1986.

Methods of optimization used in the source literature: Venkayya used a method based on an energy criterion and a gradient search procedure for the design of structures subjected to static loading. A genetic algorithm was used in [Goldberg & Samtani, 1986].

Problem formulation:

The geometry and loading system of an indeterminate structure for a 10-member truss are given, as shown in Figure 4.6. The cross sectional area of each member giving the minimum weight of the structure is required.
The optimization problem is formulated as follows.

Minimize  Weight W = ρ Σ_j A_j L_j

where
ρ is the specific weight of the material used
W is the total weight of the members
A_j is the cross sectional area of member j in square inches
L_j is the length of member j in inches.

The problem is constrained by the allowable stresses and the bounds on member sections:

σ_min ≤ σ_j ≤ σ_max
A_min ≤ A_j ≤ A_max

where
σ_j is the stress in member j
σ_min and σ_max are the minimum and maximum allowable stresses
A_min and A_max are the minimum and maximum allowable areas, which are given as 0.1 in² and 10 in² respectively.

The material to be used is aluminum, which has an allowable stress of 25,000 psi (compression and tension), a modulus of elasticity of 10⁷ psi and a specific weight of 0.1 lb/in³. Both loads shown in Figure 4.6 are 100K lb each. A stiffness analysis computer program developed by Fleming [Fleming, 1986] was used with LSP to solve the structural analysis.

Figure 4.6: 10-member truss.

Observations:
This problem has been solved by many authors using different methods. Goldberg and Samtani [Goldberg & Samtani, 1986] addressed the problem to demonstrate the use of the Genetic Algorithm for solving engineering problems. They compare their result with Venkayya's [Venkayya, 1971] result. Assuming Venkayya's confirmed feasible solution to be the 'true' solution, their three best-of-run results show that the Genetic Algorithm results deviated from the 'true' solution by 0.82%, 1.32% and 2%. Unfortunately the best run cited in [Goldberg & Samtani, 1986] violates six of the ten stress constraints, which is acknowledged as 'minor excursions beyond the stress constraint'. Figure 5 of the original source paper shows the violations.
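As a check on the weight objective defined above, it can be evaluated directly for a given set of member areas. The sketch below assumes the standard 10-member cantilever truss geometry (members 1-6 of length 360 in, diagonal members 7-10 of length 360√2 in, an assumption about the member numbering); with Venkayya's areas it reproduces the cited optimum weight of 1,593.2 lb.

```python
import math

# Member lengths for the classic 10-bar cantilever truss (assumed numbering:
# members 1-6 are the 360 in horizontals/verticals, 7-10 the diagonals).
LENGTHS = [360.0] * 6 + [360.0 * math.sqrt(2)] * 4
RHO = 0.1  # specific weight of aluminum, lb/in^3

def truss_weight(areas):
    """Weight objective W = rho * sum(A_j * L_j), in lb."""
    return RHO * sum(a * l for a, l in zip(areas, LENGTHS))

# Venkayya's optimal areas (in^2), as listed later in Table 4.7
venkayya = [7.938, 0.100, 8.062, 3.938, 0.100,
            0.100, 5.745, 5.569, 5.569, 0.100]
print(round(truss_weight(venkayya), 1))  # -> 1593.2
```

The stress constraints would be checked separately by a stiffness analysis, as done with Fleming's program in the thesis.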
It is cited that 6,400 points were explored to reach the solution in each of the three runs.

A number of LSP runs were made to compare results with the 'true' solution. The three best LSP feasible results showed 0.6%, 0.9% and 1.00% deviation from the results given by Venkayya. LSP found a set of well separated near optimal points, which suggests that the problem has multiple global optima. The LSP results given in Table 4.7 were found, on average, after 73,800 function evaluations per run, which took 5 minutes and 15 seconds on an 80386-33MHz microcomputer. The detailed results cited by Venkayya and those found with LSP are given in Table 4.7. The three different solutions found with LSP and Venkayya's optimum solution are shown in Figure 4.7. Figure 4.7(a) shows the optimal decision variables, whereas Figure 4.7(b) shows the stress in each member of the optimal solution.

LSP's failure to identify the optimal solution in this instance can be attributed to the fact that it lies on the boundary of the original search domain. Further improvement in the LSP solution might be achieved through local refinement using a gradient method, but this was not attempted.

For a general comparison, a summary of the minimum objective function values for the truss problem found with the two methods mentioned above, and with LSP, is given in Table 4.8.

Figure 4.7: Detailed results for Venkayya's 10-member truss problem: (a) optimal dimensions; (b) optimal member stresses.
                 Cross sectional area (in²)
Member   Venkayya     LSP
                      run 1    run 2    run 3
1         7.938       7.826    7.753    7.778
2         0.100       0.236    0.321    0.289
3         8.062       8.179    8.251    8.223
4         3.938       3.827    3.754    3.779
5         0.100       0.100    0.100    0.101
6         0.100       0.223    0.359    0.295
7         5.745       5.906    6.028    5.992
8         5.569       5.411    5.308    5.343
9         5.569       5.411    5.309    5.343
10        0.100       0.330    0.447    0.421
Truss weight (lb)  1,593.2  1,602.6  1,609.6  1,607.2

Table 4.7: Detailed results for the 10-member truss design

                 Venkayya   Goldberg   LSP
'True' solution   1,593.2
run 1                        1,606    1,602.6
run 2                        1,614    1,609.6
run 3                        1,625    1,607.2

Table 4.8: Comparison of truss weights for the 10-member truss design

4.3.4 38-member truss design [Problem 38-1-0-77]

Source: Andrew B. Templeman, "Discrete optimum structural design", Computers and Structures, Vol 30, No 3, pp 511-518, 1988.

Method of optimization used in the source literature: Heuristic methods for finding a discrete optimum solution.

Problem formulation:
Consider the 38-member truss shown in Figure 4.8. With the given loading system, the truss is to be designed in such a way that the tip displacement does not exceed 10 millimetres. Members are to be chosen from the five different cross section bars available, that is 5.0, 10.0, 20.0, 40.0 and 75.0 × 10³ mm². All bars are of the same material with mass density ρ = 7.85 × 10⁻⁶ kg/mm³ and Young's modulus E = 210 kN/mm².

Figure 4.8: 38-member truss.

The total mass of the truss is used to represent the cost, so the optimization problem is formulated as

Minimize  Mass = ρ Σ_i L_i A_i

Subject to  δ ≤ 10 mm

where L_i and A_i are the length and cross section of the ith member, and δ is the displacement at the tip of the truss. The L_i are known quantities and the A_i are discrete variables.

Observations:
The result given in the literature is reported to be the global discrete optimum design. Even though the difference is small, LSP achieved a slight improvement in the objective function.
The most interesting aspect is that LSP identified multiple global optima. Over 40 different global optimal solutions were identified, of which only ten are given in Table 4.9, along with the single solution given in the source literature.

Fleming's [Fleming, 1986] computer programme for plane trusses was used with LSP to solve the structural analysis problem using the stiffness method. It took about 75 hours to meet the modified variance convergence criterion using an 80386-33MHz microcomputer, but the total number of function evaluations was less than 500,000. The reason for such a long computation time is the effort expended on the determination of the deflection for the many designs evaluated in the LSP search.

As was pointed out by the external examiner, Dr. A. Templeman, there exists a unique algebraic equation for the tip deflection of this truss. While use of this equation would have significantly reduced the computational time needed for constraint evaluations, the number of function evaluations would remain unchanged.
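Because the A_i may only take the five catalogue values, each sampled design must be mapped onto the discrete set before the stiffness analysis is run. A minimal sketch of that mapping and of the mass objective follows; the nearest-value rounding rule is an assumption about how sampled continuous values are discretized, and the lengths shown are illustrative, not the Figure 4.8 geometry.

```python
# Available bar sections, in units of 10^3 mm^2 (from the problem statement).
SECTIONS = [5.0, 10.0, 20.0, 40.0, 75.0]

def snap_to_section(a):
    """Map a sampled continuous area to the nearest available discrete section."""
    return min(SECTIONS, key=lambda s: abs(s - a))

def truss_mass(lengths_mm, areas_mm2, rho=7.85e-6):
    """Mass = rho * sum(L_i * A_i), with rho in kg/mm^3, lengths in mm, areas in mm^2."""
    return rho * sum(l * a for l, a in zip(lengths_mm, areas_mm2))

print(snap_to_section(12.0))  # -> 10.0
print(snap_to_section(16.0))  # -> 20.0
```

The tip-deflection constraint δ ≤ 10 mm would then be evaluated for the snapped design, exactly as Fleming's stiffness program was used in the thesis.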
Member areas (10³ mm²)
Member  Literature   LSP solutions
1         10     10 10 10 10 10 10 10 10 10 10
2          5     10  5  5  5 10  5  5  5  5  5  †
3         10     10 10 10 10 10 10 10 10 10 10
4         10      5 10 10 10 10  5 10 10  5 10  †
5         10     10 10 10 10 10 10 10 10 10 10
6         10     10  5 10  5 10 10 10 10 10 10  †
7         10     10 10 10 10 10 10 10 10 10 10
8         10     10 10 10 10 10  5  5 10 10 10  †
9         10     10 10 10 10 10 10 10 10 10 10
10        10      5 10 10 10  5 10 10 10 10 10  †
11        10     10 10 10 10 10 10 10 10 10 10
12        10     10 10 10 10 10 10 10 10 10  5  †
13        10     10 10 10 10 10 10 10 20 10 10  *
14        10     10 10 10 10 10 10 10 10 10 10
15        10     10 10 10 10 10 10 10 10 10 10
16        10     10 10 10 10 10 10 10 10 10 10
17        10     10 20 20 10 10 10 20 10 10 10  *
18        10     10 10 10 10 10 10 10 10 10 10
19        10     20 10 10 20 20 20 10 10 20 20  *
20         5      5  5 10  5  5  5 10  5  5  5  †
21        20     10 10 10 10 10 10 10 10 20 10  *
22        20     20 20 20 20 20 20 20 20 20 20
23        40     40 40 40 40 40 40 40 40 40 40
24        40     40 40 40 40 40 40 40 40 40 40
25        40     40 40 40 40 40 40 40 40 40 40
26        75     75 75 75 75 75 75 75 75 75 75
27        75     75 75 75 75 75 75 75 75 75 75
28        75     75 75 75 75 75 75 75 75 75 75
29         5      5  5  5  5 10 10 10 10  5  5  †
30        20     20 20 10 20 10 20 10 10 10 20  *
31        20     20 20 20 20 20 20 20 20 20 20
32        40     40 40 40 40 40 40 40 40 40 40
33        40     40 40 40 40 40 40 40 40 40 40
34        40     40 40 40 40 40 40 40 40 40 40
35        75     75 75 75 75 75 75 75 75 75 75
36        75     75 75 75 75 75 75 75 75 75 75
37        75     75 75 75 75 75 75 75 75 75 75
38        75     75 75 75 75 75 75 75 75 75 75
Mass (kg)  8489.16   8482.42

† takes either 5 or 10    * takes either 10 or 20

Table 4.9: Alternate optimal designs for the 38-member truss

4.3.5 Unit Hydrograph ordinate determination [Problem 19-1-21-0]

Source: Larry W.
Mays and Cheng-Kang Taur, "Unit Hydrographs via Nonlinear Programming", Water Resources Research, Vol 18, No 4, pp 744-752, 1982.

Method of optimization used in the source literature: Large scale Generalized Reduced Gradient technique, where the output from a linear programming version of the problem is used as the starting point for GRG2.

Problem formulation:
The optimal unit hydrograph from a set of known rainfall and runoff data is required. The unit hydrograph should be able to produce either a minimum sum of deviations or a minimum sum of squares of deviations between observed and derived runoff hydrographs. Unlike the traditional approach, determination of rainfall excess does not need to be defined in advance. Therefore rainfall excess values are considered as part of the set of decision variables. The relationship between rainfall excess and total rainfall is shown in Figure 4.9.

At any time much of the rainfall is lost as groundwater flow and evapotranspiration, and only the rest contributes to the surface runoff. The rainfall excess, which contributes to the surface runoff, is expressed as

P_{i,n} = I_{i,n} − H_{i,n},   i = 1, 2, ..., I

where
I_{i,n} is the total rainfall intensity of event i at time n
H_{i,n} is the rainfall loss of event i at time n
P_{i,n} is the rainfall excess of event i at time n
I is the total number of observed hydrographs
N_i is the total number of ordinates of the observed hydrograph from the ith event.

Figure 4.9: Rainfall components.

The objective of this problem is to minimize the sum of absolute deviations between the observed runoff and the runoff derived from the unit hydrograph. The optimization problem is formulated, in its general form, as

Minimize  Σ_{i=1..I} Σ_{n=1..N_i} |F_{i,n} − Q_{i,n}|^r

Subject to:

Q_{i,n} = P_{i,n} U_1 + P_{i,n−1} U_2 + ... + P_{i,n−m+1} U_m,   for i = 1, 2, ..., I and n = 1, 2, ..., N_i

K (U_1 + U_2 + ... + U_M) = 1
U_m ≥ 0,   m = 1, 2, ..., M

where
M is the total number of unit hydrograph ordinates
U_m is the value of the mth unit hydrograph ordinate
F_{i,n} is the observed direct runoff from the ith hydrograph at the nth ordinate
Q_{i,n} is the derived direct runoff from the ith hydrograph at the nth ordinate
D_i is the direct runoff volume for the ith rainfall event
L_i is the number of periods of rainfall excess for the ith event
r is a constant which takes a value of 1 or 2 depending upon the definition of the objective function
K is a constant expressed as

K = 12(3600)Δt / (5280² A)

where Δt is the interval in hours between hydrograph ordinates, and A is the drainage area of the watershed in miles².

LSP solved the Little Walnut Creek (Austin, Texas) watershed hydrograph using the data given in the literature. Table 4.10 shows details of the data recorded on 07/19/79. The watershed has a drainage area of 5.57 miles². The problem has 9 excess rainfall variables and 10 unit hydrograph ordinates, which makes the total number of decision variables 19.

In the source literature this problem was solved using different approaches. The best result was claimed to be the minimization of the sum of absolute deviations using GRG2. It is reported that the optimal solution was reached after 7.8 seconds on a CDC Cyber 170/750B computer system. The results cited for the 'best' result were used to make a comparison with the LSP results and are given in Table 4.11. The optimal unit hydrographs suggested in the literature and the one found with LSP are given in Figure 4.10.
        Total     Observed
Time   rainfall    runoff
(hr)     (in)       (cfs)
1845     0.02        0.40
1915     0.36        3.00
1945     0.19       12.50
2015     0.16       62.00
2045     0.49      466.00
2115     0.87     1750.00
2145     0.13      985.00
2215     0.04      647.00
2245     0.01      330.00
2315               216.00
2345                97.00
0015                52.00
0045                43.00
0115                33.80
0145                24.60
0215                19.60
0245                18.80
0315                17.90

Table 4.10: Rainfall-Runoff data for Little Walnut Creek

Observations:
The results with LSP show a considerable improvement in the objective function value over that cited in the literature. The total sum of deviations between observed and calculated runoff cited in the literature is 54.7, whereas the LSP results show the sum of deviations to be 50.57. This is about an 8% improvement on the optimum value. But an equally significant improvement from a hydrological modelling standpoint is that the unit hydrograph derived from the LSP results is superior to the hydrograph cited in the literature. The recession side of the hydrograph derived from the LSP results shows the smooth monotone decay expected in a unit hydrograph. Figure 4.10 contrasts the unit hydrographs derived from the two results.
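The convolution constraint of this formulation (Q_{i,n} = P_{i,n}U_1 + P_{i,n−1}U_2 + ...) and the sum-of-deviations objective can be sketched as follows; the short arrays are illustrative, not the Little Walnut Creek data.

```python
def derived_runoff(excess, uh):
    """Discrete convolution Q_n = sum_m P_{n-m+1} * U_m of rainfall excess
    with the unit hydrograph ordinates."""
    n_out = len(excess) + len(uh) - 1
    q = [0.0] * n_out
    for n in range(n_out):
        for m, u in enumerate(uh):
            k = n - m  # 0-based index of P_{n-m+1}
            if 0 <= k < len(excess):
                q[n] += excess[k] * u
    return q

def sum_of_deviations(observed, derived, r=1):
    """Objective: sum |F_n - Q_n|^r, with r = 1 or 2."""
    return sum(abs(f - q) ** r for f, q in zip(observed, derived))

print(derived_runoff([1.0, 2.0], [1.0, 3.0, 2.0]))  # -> [1.0, 5.0, 8.0, 4.0]
```

In the LSP runs both the excess values and the ordinates U_m are decision variables, with the volume constraint K ΣU_m = 1 enforced separately.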
        Literature                          LSP
          Unit                                Unit
Rainfall  Hydrograph  Calculated   Rainfall   Hydrograph  Calculated
Loss      Ordinates   Runoff       Loss       Ordinates   Runoff
(in)      (cfs)       (cfs)        (in)       (cfs)       (cfs)
0.020     4445.00        0.40      0.0199     3638.42        0.37
0.359      841.90        3.00      0.3592     1506.41        3.00
0.187      881.70       12.50      0.1869      964.16       12.47
0.147      271.30       62.00      0.1445      399.77       62.03
0.388      347.30      466.00      0.3693      335.04      466.01
0.498       70.00     1750.00      0.4436      124.20     1749.96
0.000       50.50      985.00      0.0698       58.79      985.10
0.000       68.60      647.00      0.0148       64.24      647.30
0.000       36.10      330.00      0.0042       52.10      330.05
            86.30      216.00                   45.79      216.06
                        97.00                               97.05
                        52.00                               51.99
                        43.00                               43.00
                        33.80                               33.81
                        40.00                               24.62
                        12.35                                4.44
                         3.80                                1.45
                         0.90                                0.27

Table 4.11: Optimal hydrograph for the Little Walnut Creek

Figure 4.10: Optimal unit hydrographs for Little Walnut Creek.

4.3.6 Air pollution control design [Problem 4-11-0-40]

Source: Wang, Bi-Chong and Luus, Rein, "Reliability of optimization procedures for obtaining global optimum", AIChE Journal, Vol 24, No 4, pp 619-626, 1978.

Method of optimization used in the source literature: Direct search method based on random sampling and search domain contraction.

Problem formulation:
The maximum ground level concentration of a pollutant resulting from the emission of multiple sources is required. Holland's plume rise equation and Gifford's dispersion equation are used for estimating the ground level sulphur dioxide concentration [Turner, 1973]. Under adiabatic conditions, the ground level concentration of sulphur dioxide is given by

C = Σ_{i=1..10} [Q_i / (π u σ_yi σ_zi)] exp[−0.5 (Y_i/σ_yi)² − 0.5 (H_i/σ_zi)²]

where

Y_i = sinθ (x − a_i) + cosθ (y − b_i)
X_i = cosθ (x − a_i) − sinθ (y − b_i)
ΔH_i = (V_si d_i / u)[1.5 + 2.68 × 10⁻³ p d_i (T_si − T)/T_si]
H_i = H_si + ΔH_i,   for i = 1, 2, ..., 10
The dispersion coefficients σ_yi and σ_zi are piecewise functions of the downwind distance X_i:

σ_yi:
0.9591 X 100.1136X?92, 5 10 < X, < 2 * iO
O.1385X901, 2 * iO < X, iO
0.2O30X°°, iO < X < iO
0.07925 X 10

σ_zi:
4.828 * 1O(lnX)88766, 10 < X < 200
3.108 * 106(lnX)05295, 200 < X < iO
1.808 * 107(lnX)198, io < X 5 * iO
1.892 * 10—(lmX)4', 5 * 103 X < iO

and the values of all constants are given in Table 4.12.

  a       b      H_s    d    T_s
-3000  -2500   183.0   8.0   413   2882.6   19.245
-2600   -300   183.0   8.0   413   2882.6   19.245
-1100  -1700   160.0   7.6   413   2391.3   17.690
 1000  -2500   160.0   7.6   413   2391.3   17.690
 1000   2200   152.4   6.3   413   2173.9   23.404
 2700   1000   152.4   6.3   413   2173.9   23.404
 3000  -1600   121.9   4.3   413   1173.9   27.128
-2000   2500   121.9   4.3   413   1173.9   27.128
    0      0    91.4   5.0   413   1304.3   22.293
 1500  -1600    91.4   5.0   413   1304.3   22.293

Table 4.12: Stack and Emission data

where
a_i = x coordinate of stack i (m)
b_i = y coordinate of stack i (m)
C = average ground level sulphur dioxide concentration based on a 30 minute sampling time (g/m³)
d_i = exit diameter of stack i (m)
H_i = effective height of stack i (m)
H_si = height of stack i (m)
Q_i = emission rate of stack i
T = air temperature, taken as 283°K
T_si = exit gas temperature at top of stack i (°K)
u = wind velocity (m/s)
V_si = exit gas velocity of stack i (m/s)
X_i = downwind distance from stack i (m)
Y_i = crosswind distance from the plume centre line of stack i (m)
ΔH_i = plume rise from stack i (m)
θ = wind direction measured clockwise from x axis (radian)

The optimization problem is to find the coordinates on the ground, x and y, the wind velocity u, and the wind direction θ for which C is maximized. From physical considerations, the following constraints are imposed:

−20000 ≤ x ≤ 20000
−20000 ≤ y ≤ 20000
0 ≤ u ≤ 12.5
0 ≤ θ ≤ 2π

It is reported in the literature that this 4-variable problem has numerous local optima. It is also mentioned that one of the local optimal values, C = −7.5316, is close to the global optimum value.
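A single-stack version of the model can be sketched as follows. Holland's plume-rise formula and the Gaussian ground-level relation are standard [Turner, 1973]; the σ_y, σ_z values passed in below are illustrative stand-ins, not the fitted dispersion expressions of the source, and the stack data are taken from one row of Table 4.12.

```python
import math

def holland_plume_rise(vs, d, u, ts, t=283.0, p=750.0):
    """Holland plume rise dH = (vs*d/u)*(1.5 + 2.68e-3*p*d*(ts - t)/ts).
    vs: exit velocity (m/s), d: stack diameter (m), u: wind speed (m/s),
    ts: stack gas temperature (K), t: air temperature (K), p: pressure
    (the value 750 is an assumption matching the source's tabulated data)."""
    return (vs * d / u) * (1.5 + 2.68e-3 * p * d * (ts - t) / ts)

def ground_concentration(q, u, h, y, sigma_y, sigma_z):
    """Gaussian ground-level concentration for one stack:
    C = q/(pi*u*sy*sz) * exp(-0.5*(y/sy)^2 - 0.5*(h/sz)^2)."""
    return (q / (math.pi * u * sigma_y * sigma_z)
            * math.exp(-0.5 * (y / sigma_y) ** 2 - 0.5 * (h / sigma_z) ** 2))

# One stack from Table 4.12 at an arbitrary wind speed; sigma values assumed.
h = 91.4 + holland_plume_rise(vs=22.293, d=5.0, u=5.64, ts=413.0)
c0 = ground_concentration(q=1304.3, u=5.64, h=h, y=0.0, sigma_y=400.0, sigma_z=200.0)
c1 = ground_concentration(q=1304.3, u=5.64, h=h, y=500.0, sigma_y=400.0, sigma_z=200.0)
# c0 > c1: concentration decays with crosswind offset y
```

The full objective sums this contribution over all ten stacks, with σ_yi and σ_zi evaluated from the fitted piecewise functions at each downwind distance X_i.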
The closeness of these objective function values makes the search more challenging for both gradient and LSP methodologies.

Observations:
The global solution cited in the source literature and that found with LSP are given in Table 4.13.

Variable   Literature      LSP
x           -8039.6      -8038.06
y            9369.2       9369.70
u            5.6371       5.64100
θ            3.996        3.9966
C          -7.7114e-5   -7.7092e-5

Table 4.13: Optimal design cited in the literature and that found with LSP

The result found with LSP is similar to the result cited in the source literature. The convergence criterion was set by trial at 1 × 10⁻¹⁹ to achieve comparable accuracy to the cited solution. LSP reached convergence after 28,220 function evaluations, which took 3 minutes and 35 seconds on an 80386-33MHz microcomputer.

4.3.7 Irrigation system design [Problem 6-7-1-4]

Source:
1. Holzapfel, Edward A., Mariño, Miguel A. and Chávez-Morales, Jesús, "Surface Irrigation Optimization Model", ASCE Journal of Irrigation and Drainage Engineering, Vol 112, No 1, pp 1-19, Feb 1986.
2. Holzapfel, Edward A. and Mariño, Miguel A., "Surface Irrigation Nonlinear Optimization Models", ASCE Journal of Irrigation and Drainage Engineering, Vol 113, No 3, pp 379-392, August 1987.

Methods of optimization used in the source literature: The first paper [Holzapfel et al., 1986] solved this problem using linear approximations of the nonlinear objective function and constraint equations, and then used linear programming packages. The second paper [Holzapfel et al., 1987] used MINOS 5.0 to solve the same problem.

Problem formulation:
An irrigation system is to be designed so that the maximum profit from the crop is obtained by identifying the most efficient furrow irrigation system. The example adopts the field data and profit-yield relationship for the Chillán project in Chile. A surface irrigation system is designed on the basis of soil, crop, topography, size and shape of the irrigable area, and the availability of farm equipment.
For this particular case the six design variables involved in the problem formulation are: 1) the flow discharge; 2) the length of run in the direction of the flow; 3) the time of irrigation cutoff; 4) the number of furrows per set; 5) the number of runs per set; and 6) the number of sets per day.

The objective of the design is to maximize profit (PRO), which is defined as total revenue minus total cost, expressed as

PRO = yP − (W + L_c) − OC

where
PRO = profit in dollars per hectare
y = yield in metric tons per hectare
P = price of crop in dollars per ton
W = water cost in dollars per hectare
L_c = seasonal labour cost in dollars per hectare
OC = other costs (cost of crop, fertilizer, pesticides, harvesting, etc.). Since OC does not change with alternative irrigation systems, it can be dropped from the analysis.

It was reported in the source literature that, in all the studies made earlier, a good correlation was found between profit and requirement distribution efficiency (RDE). The relationship of PRO and RDE is given as

PRO = C₁ RDE + B₁

where C₁ and B₁ are regression coefficients with known values.

The mathematical relationship between RDE and the design variables considers the water distribution in the soil, so that in terms of the design variables the profit becomes

PRO = C₁ M Q^a L^(−b) T_co^c + B₁

where
Q = inflow discharge in millilitres per second
L = length of run in meters
T_co = time of irrigation cutoff in minutes
N_f = number of furrows per set
N_r
= number of runs per set
N_s = number of irrigation sets per day
Q_T = total discharge available in millilitres per second
Q_maxf = maximum discharge in a furrow in millilitres per second
L_T = total length in meters
L_min = minimum run length in meters
A_min = minimum area irrigated per day in square meters
F_s = furrow spacing in meters
S = maximum allowable number of irrigation sets
T_max = maximum time per day to irrigate the minimum area
EFF_min = minimum required distribution efficiency allowable in %
K_q, M, a, b, c, u, w, C₁ and B₁ are constants.

Since C₁, M and B₁ are constant values, taking them out of the objective function equation does not change the optimal solution. The reduced form of the objective function is then

Maximize  Q^a L^(−b) T_co^c

The design variables are restricted within specified limits depending upon the given field conditions, and must fulfil certain physical conditions. The model formulation is

Maximize  Q^a L^(−b) T_co^c

Subject to:
Q ≤ Q_maxf
Q N_f ≤ Q_T
L N_r = L_T
L N_f N_s ≥ A_min / F_s
T_co N_s ≤ T_max
Q^a L^(−b) T_co^c ≤ 100/M
Q^a L^(−b) T_co^c ≥ EFF_min / M
L ≥ L_min
N_r, N_f ≥ 1
N_s ≤ S

All necessary data for the design are given in Table 4.14.

Q_T      32,000 ml/s     K_q    25
Q_maxf    1,300 ml/s     M      0.129
L_T         800 m        a      0.706
L_min        50 m        b      0.645
A_min    20,000 sq. m    c      0.809
F_s           1 m        u      1.02
T_max     1,200 min      w      0.26
S             6          C₁     9.98
EFF_min      20%         B₁     443

Table 4.14: Constants for the irrigation system design

The problem involves six variables with one equality and six inequality constraints. Lower and upper bounds on each variable are also specified. In solving the problem with LSP, three of the variables, N_f, N_r and N_s, were assumed to take integer values even though this was not clearly stated in the literature.

The problem was solved under two different conditions [Holzapfel et al., 1987].
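The constraint set can be turned into a direct feasibility check. The sketch below encodes the constraints as read here from the formulation (a reconstruction, since the printed constraint list is partly corrupted) with the Table 4.14 data; it reproduces the Q·N_f and T_co·N_s violations reported in the Observations for the cited solutions.

```python
# Data from Table 4.14
QT, QMAXF = 32000.0, 1300.0   # total / per-furrow discharge limits, ml/s
LT, LMIN = 800.0, 50.0        # total length and minimum run length, m
AMIN, FS = 20000.0, 1.0       # minimum daily area (sq. m) and furrow spacing (m)
TMAX, S = 1200.0, 6           # daily time limit (min) and maximum sets

def constraint_slacks(Q, L, Tco, Nf, Nr, Ns):
    """Slack of each constraint (>= 0 when satisfied; the L*Nr entry is an
    equality, so its slack should be exactly zero)."""
    return {
        "Q <= Qmaxf":          QMAXF - Q,
        "Q*Nf <= QT":          QT - Q * Nf,
        "L*Nr = LT":           LT - L * Nr,
        "L*Nf*Ns >= Amin/Fs":  L * Nf * Ns - AMIN / FS,
        "Tco*Ns <= Tmax":      TMAX - Tco * Ns,
        "L >= Lmin":           L - LMIN,
        "Ns <= S":             S - Ns,
    }

lsp = constraint_slacks(Q=1280, L=200, Tco=300, Nf=25, Nr=4, Ns=4)
lit86 = constraint_slacks(Q=1260, L=200, Tco=306, Nf=26, Nr=4, Ns=4)
# lsp: all slacks >= 0;  lit86: the Q*Nf and Tco*Ns slacks are negative
```

Note that the LSP solution sits exactly on several of these constraints (Q·N_f = 32,000, L·N_r = 800, T_co·N_s = 1,200), consistent with the small loss of profit relative to the infeasible cited solutions.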
The first case allowed all variables to take any value within the variable domain; in the second case the length of run was restricted to a fixed value. Their solution to the first case was questionable, so only the second case, with L = 200, was solved using LSP.

Observations:
Results cited in the source literature and those found with LSP are given in Table 4.15. It took 5,785 function evaluations to meet the convergence criterion of V_M ≤ 0.0001 with LSP.

               Literature
Variables     1986      1987      LSP
Q (ml/s)     1,260     1,255     1,280
L (m)          200       200       200
T_co (min)     306     305.9       300
N_f             26        25        25
N_r              4         4         4
N_s              4         4         4
Profit ($)     525       524       523

Table 4.15: Optimal solution for the irrigation system design

The optimal objective function value found with LSP is slightly inferior, by about 0.19%, to the values given in the literature. But the LSP result is feasible while the two cited results are infeasible. That is:

i) Q × N_f = 32,760 > 32,000 (about a 2.38% violation) [Holzapfel et al., 1986].
ii) T_co × N_s = 1,224 > 1,200 (about a 1.97% violation) [Holzapfel et al., 1986 and 1987].

If the same constraint violations are allowed with LSP, the objective function value can be improved by about 2.53% over the solutions cited in the literature.

Amongst all engineering fields there has perhaps been the longest sustained interest in practical applications of nonlinear optimization in chemical engineering. The following four test problems are taken from the chemical engineering literature. These particular problems have been included here for the challenge they present to NLP methods and because in recent years they have been used as test problems by some authors. In one case GAMS, which uses MINOS, was used to find the optimal solution. This provided a good opportunity to test LSP against a current optimization package. The statements of the problem formulation are taken directly from the sources cited.

4.3.8 Alkylation process [Problem 10-1-7-0]

Source:
1. Rein Luus and T. H. I.
Jaakola, "Optimization by direct search and systematic reduction of the size of search region", AIChE Journal, Vol 19, No 4, pp 760-766, 1973.
2. T. F. Edgar and D. M. Himmelblau, "Optimization of chemical processes", McGraw-Hill, 1988.
3. Morari, Manfred and Grossmann, Ignacio E., "Chemical engineering optimization models with GAMS", CACHE Process design case studies, Vol 6, 1991.

Methods of optimization used in the source literature: Luus and Jaakola used a direct search optimization method with sequential search domain reduction. Edgar and Himmelblau used sequential quadratic programming, while GAMS was used in the third source cited.

Problem formulation:
The optimization problem is to determine the optimal operating conditions for the simplified alkylation process shown in Figure 4.11. The variables involved in the problem formulation, along with their upper and lower bounds, are given in Table 4.16. The objective function was defined in terms of alkylate product, or output value minus feed and recycle costs; operating costs were not reflected in the function. The profit per day to be maximized is:

f(x) = C₁x₄x₇ − C₂x₁ − C₃x₂ − C₄x₃ − C₅x₅

where
C₁ = alkylate product value ($0.063 per octane-barrel)
C₂ = olefin feed cost ($5.04 per barrel)
C₃ = isobutane recycle cost ($0.035 per barrel)
C₄ = acid addition cost ($10.00 per thousand pounds)
C₅ = isobutane make-up cost ($3.36 per barrel)

It was stated in the literature that a regression analysis was first carried out to form the process model. The alkylate yield, x₄, was a function of the olefin feed, x₁, and the external isobutane-to-olefin ratio, x₈. The relationship determined by nonlinear regression, holding the reactor temperatures between 80 and 90°F and the reactor acid strength by weight percent at 85 to 93, was

x₄ = x₁(1.12 + 0.13167x₈ − 0.00667x₈²)

Figure 4.11: Alkylation process.

The isobutane make-up, x₅, was determined by a volumetric reactor balance.
The alkylate yield, x₄, equals the olefin feed, x₁, plus the isobutane make-up, x₅, less shrinkage. The volumetric shrinkage can be expressed as 0.22 volume per volume of alkylate yield, so that

x₅ = 1.22x₄ − x₁

The acid strength by weight percent, x₆, could be derived from an equation that expressed the acid addition rate, x₃, as a function of the alkylate yield, x₄, the acid dilution factor, x₉, and the acid strength by weight percent, x₆ (the addition acid was assumed to have an acid strength of 98%):

x₆ = 98000x₃ / (x₄x₉ + 1000x₃)

The external isobutane-to-olefin ratio, x₈, was equal to the sum of the isobutane recycle, x₂, and the isobutane make-up, x₅, divided by the olefin feed, x₁:

x₈ = (x₂ + x₅) / x₁

                                                  Lower   Upper
Variable  Description                             bound   bound
x₁        Olefin feed (barrels per day)               0    2000
x₂        Isobutane recycle (barrels per day)         0   16000
x₃        Acid addition rate (1000 lb per day)        0     120
x₄        Alkylate yield (barrels per day)            0    5000
x₅        Isobutane make-up (barrels per day)         0    2000
x₆        Acid strength (weight percent)             85      93
x₇        Motor octane number                        90      95
x₈        External isobutane-to-olefin ratio          3      12
x₉        Acid dilution factor                      1.2       4
x₁₀       F-4 performance number                    145     162

Table 4.16: Bounds for the variables involved in the alkylation process

The motor octane number, x₇, was a function of the external isobutane-to-olefin ratio, x₈, and the acid strength by weight percent, x₆ (for the same reactor temperatures and acid strengths as for the alkylate yield, x₄):

x₇ = 86.35 + 1.098x₈ − 0.038x₈² + 0.325(x₆ − 89)

The acid dilution factor, x₉, could be expressed as a linear function of the F-4 performance number, x₁₀:

x₉ = 35.82 − 0.222x₁₀

The last dependent variable is the F-4 performance number, x₁₀, which was expressed as a linear function of the motor octane number, x₇:

x₁₀ = −133 + 3x₇

Observations:
Edgar and Himmelblau solved a modified version of this problem in which some of the constraints were relaxed.
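The regression model above is self-consistent at the Luus-Jaakola optimum: fixing the independent variables x₁, x₂, x₃ and iterating the dependent-variable equations to a fixed point recovers the cited profit to within rounding of the published decision variables. A sketch:

```python
C = (0.063, 5.04, 0.035, 10.0, 3.36)  # cost coefficients C1..C5

def evaluate(x1, x2, x3, iters=50):
    """Propagate the regression equations to a fixed point and return the
    daily profit together with the dependent variables."""
    x8, x6 = 10.0, 90.0  # starting guesses for the coupled variables
    for _ in range(iters):
        x4 = x1 * (1.12 + 0.13167 * x8 - 0.00667 * x8 ** 2)
        x5 = 1.22 * x4 - x1
        x8 = (x2 + x5) / x1
        x7 = 86.35 + 1.098 * x8 - 0.038 * x8 ** 2 + 0.325 * (x6 - 89.0)
        x10 = -133.0 + 3.0 * x7
        x9 = 35.82 - 0.222 * x10
        x6 = 98000.0 * x3 / (x4 * x9 + 1000.0 * x3)
    profit = C[0] * x4 * x7 - C[1] * x1 - C[2] * x2 - C[3] * x3 - C[4] * x5
    return profit, (x4, x5, x6, x7, x8, x9, x10)

# Luus-Jaakola values of the independent variables
profit, deps = evaluate(1728.4, 16000.0, 98.4)
# profit comes out within a few dollars of the cited 1162
```

The iteration converges quickly because the coupled relations are strongly contracting near the optimum; rounding in the three cited inputs accounts for the small residual difference from the published objective value.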
The optimal solution for the modified case is not reported here, but it was observed that two of the constraints were slightly violated. The constraint violation is so small that it might have arisen from round-off error.

The optimal solution cited in [Luus & Jaakola, 1973] was reported to be close to earlier work by other authors. In CACHE, the GAMS package, which makes use of MINOS 5.3, was used to solve the same problem. The results with this approach were inferior to the earlier published work. The result found with LSP is similar to the one reported by Luus and Jaakola, and is an improvement over the GAMS results. It took 4,789 LSP function evaluations to obtain the optimal result. The optimal solutions found with the three methods are given in Table 4.17.

             Optimal solution
               Literature
Variable   CACHE      Luus et al.    LSP
x₁        1734.410     1728.40     1728.40
x₂       16000.000    16000.00    16000.00
x₃          98.405       98.40       98.40
x₄        3060.992     3056.00     3056.00
x₅        2000.000     2000.00     2000.00
x₆          90.592       90.60       90.60
x₇          94.170       94.20       94.20
x₈          10.378       10.41       10.41
x₉           2.629        2.61        2.61
x₁₀        149.509      149.60      149.60
f(x*)      1,154.4    1,162.009   1,162.017

Table 4.17: Optimal solution for the alkylation process

4.3.9 Heat exchanger network configuration [Problem 16-1-15-0]

Source:
1. C. A. Floudas and A. R. Ciric, "Strategies for overcoming uncertainties in heat exchanger network synthesis", Computers and Chemical Engineering, 13(10), pp 1133-1152, 1989.
2. C. A. Floudas and P. M. Pardalos, "A collection of test problems for constrained global optimization algorithms", Lecture Notes in Computer Science, No 455, Springer-Verlag, 1990.
3.
Moran, Manfred and Grossmann, Ignacio E., “Chemical Engineering OptimizationModels with GAMS”, CACHE Process design case studies, Vol 6, 1991.Methods of optimization used in the source literature: GAMS was used in CACHE.Floudas and Ciric used a method where the variable set and the constraint set are decomposed into two sets leading to two subproblems so that each subproblem containsonly linear constraints. This decomposition of the original problem induced a specialstructure in the resulting subproblems.Problem formulation:The problem involves determination of the optimal heat exchanger network configuration for a system of two hot streams and one cold stream. A minimum utility consumptionanalysis has shown that no stream of cooling water is required. The inlet temperatures,outlet temperatures, and heat capacity flow rates for the hot streams and cold streamsare given in Table 4.18. The heat duties and overall heat transfer coefficients for thematches that will take place in this network are given in Table 4.19. The objective ofthe problem is to obtain the optimal configuration of the cold stream so as to minimizethe heat exchanger investment cost.The problem formulation for this 16 variable problem is given as follows:Minimize Cost= 1300[oo5[2(TTj°5°°+1(AT+T)j] .6I io 600 1.6m ovChapter 4. TEST PROBLEMS 134Stream T in (K) T out (K) FC(ç)Hi 500 250 4H2 350 200 4Ci 150 310 10Tmin iO°KTable 4.18: Stream dataMatch Q(kW) U() A(m2)Hi Ci 1000 0.05 207.357H2 Ci 600 0.05 137.230Cost of heat exchangers =i300Ab]Table 4.19: Match dataSubject to:• Mass balances at the splitting and mixing pointsff+f= 10fI4BJi J12 — Ji —cI çB çE_J2 + J21— J2O B E_0Ji +J21J1 —O rB E0J2 +J12J2 —• Energy balances at mixing points and over the exchangers150’ tOiB jIcE_0J1 + 2 J12 1J1 —incI OB 414LE_iJUJ2 + G1 J21— 2J2fjE(t— tf) = 1000E(tO t— 0J2 2— 2 —• Temperature approach calculations for the heat exchangers/T11 =500—t?Chapter 4. 
TEST PROBLEMS 135zT12=250—tfZT21 =350—tz.T22 =200—ti• Minimum temperature approach constraintsT11,T2/2 10• Bounds of the flow rates and temperatures< ci I B B O jO <‘-‘—Jl,J2,JlJl, 2 —2.941 fE 103.158 fE 10150 tf 240250 t 490150 t 190210 t 340wheref1 = flow rate flowing from the splitter at the beginning of the network to the mixer.fE= flow rate flowing through the exchangerfE= flow rate flowing from the splitter following the match to the mixer precedingthe exchangerf° = flow rate flowing from the splitter following the exchanger to the mixer atthe end of the network= inlet temperature= outlet temperatureChapter 4. TEST PROBLEMS 136= temperature approach.Observations:The decomposition procedure adopted by [Floudas & Ciric, 1989] was not necessaryin the LSP solution of the problem.The solution cited in both sources and the one found with LSP showed similar results.The optimal solution is given in Table 4.20. It took 182,905 function evaluations to getthe optimal solution with LSP.Variable Optimum value4’IJ1f21 10flo 10;OJ2f1E 10fE 10tf 210t 150t 310t 210zT11 190zT12 40T21 140zT22 50BJ 12çBJ21Objective 56825Table 4.20: Optimal solution to the heat exchanger configuration4.3.10 Power generation using fuel oil [Problem 4-11-0-5]Source: Moran, Manfred and Grossmann, Ignacio E., “Chemical Engineering Optimization Models with GAMS”, CACHE Process design case studies, Vol 6, 1991.Chapter 4. TEST PROBLEMS 137Method of optimization used in the source literature: GAMSProblem formulation:The two-boiler turbine-generator combination shown in Figure 4.12 is used to producea power output of 50 MW. It can use any combination of fuel oil and blast furnace gas(BFG). However only 10.0 fuel units per hour of BFG is available. Since this supply ofBFG may not be sufficient for the required power generation, fuel oil must be purchasedand used. It is desired that we use the minimum total amount of fuel oil in the twogenerators. 
For this purpose, fuel requirements for the two generators are expressed as quadratic functions of the MW produced, obtained by nonlinear curve fitting. Thus if x (in MW) is the power produced and f (in ton/h for fuel oil and in fuel unit/h for BFG) is the amount of fuel used, then f = a0 + a1 x + a2 x^2. The constants ai (i = 0, 1, 2) depend on the generator and the type of fuel used.

Figure 4.12: Two-boiler turbine-generator combination.

Assume that when a combination of fuel oil and BFG is used, the total power generated is given by the sum of the powers generated by each fuel. The ranges of operation of the two generators are [18, 30] MW and [14, 25] MW respectively. It is required to formulate an optimization problem which minimizes the amount of fuel oil purchased, and to determine the amounts used by each generator.

For operating the generators, define fij as the amount of fuel type j used by generator i, and xij the corresponding MW generated. From the curve-fitting data

fij = aij0 + aij1 xij + aij2 xij^2;   i = 1, 2;  j = 1, 2

where the aijk (k = 0, 1, 2) are the constants for generator i and fuel type j (1 = fuel oil, 2 = BFG). The values of the constants are given in Table 4.21.

Generator   Fuel type   a0       a1        a2
1           Fuel oil    1.4609   0.15186   0.001450
1           BFG         1.5742   0.16310   0.001358
2           Fuel oil    0.8008   0.20310   0.000916
2           BFG         0.7266   0.22560   0.000778

Table 4.21: Constants in fuel consumption equations

The power Pi generated by generator i is given by Pi = xi1 + xi2; i = 1, 2, where P1 + P2 ≥ 50. Define zj as the total amount of fuel j purchased; then

zj = Σi (aij0 + aij1 xij + aij2 xij^2);  j = 1, 2
z2 ≤ 10
xij ≥ 0

Minimize z1

Observations:

In the source literature, the optimal power outputs for generators 1 and 2 are 30 MW and 20 MW respectively, and 4.681 ton/h of fuel oil is purchased. BFG generates a power of 36.325 MW and the fuel oil is used to generate 13.675 MW. The optimum objective function value found with LSP is 4.683 ton/h, which is very similar to the solution cited in the literature.
One of the important results found with LSP is that the global optimum is accompanied by multiple near optimal solutions. Details of the solutions cited in the literature and found with LSP are given in Tables 4.22 and 4.23 respectively. The total number of function evaluations expended to meet LSP’s termination criterion was 33,556, which took 50 seconds on an 80386-33MHz computer.

Generator   Fuel oil   BFG      Power
1           10.114     19.886   30
2           3.561      16.439   20
Total       13.675     36.325   50

Table 4.22: Optimal power generation cited in the literature

Generator   Fuel oil   BFG      Power
1           10.479     19.504   29.983
2           3.254      16.768   20.022
Total       13.733     36.272   50.005

Table 4.23: Optimal power generation found with LSP

4.3.11 Chemical equilibrium [Problem 10-1-3-0]

Source: Moran, Manfred and Grossmann, Ignacio E., “Chemical Engineering Optimization Models with GAMS”, CACHE Process design case studies, Vol 6, 1991.

Method of optimization used in the source literature: GAMS

Problem formulation:

From the thermodynamics of chemical reaction equilibrium, the equilibrium state of a closed system at constant temperature, T, and pressure, P, is the state at which its total Gibbs free energy is a minimum. This criterion is used to obtain the equilibrium composition of a given mixture by minimizing its free energy with respect to its composition. An ideal gas mixture of ten chemical species is maintained at T = 298 K and P = 750 mm Hg. The species are made up of three atomic elements (e.g. H, O and C). If we denote the three elements as A, B and C, then the species formulas in terms of A, B and C are A, A2, A2C, B, B2, AB, BC, C, C2 and AC respectively for s = 1, 2, ..., 10. Being an ideal gas, the Gibbs free energy per mole of species s is given by Gs = G0s + RT ln(P ys), where ys is the mole fraction of species s in the mixture. The G0s are given in terms of ws = G0s/RT in Table 4.24.
The mixture contains 2 moles of A, 1 mole of B and 1 mole of C.

s   ws         s    ws
1   -10.021    6    -18.918
2   -21.096    7    -28.032
3   -37.986    8    -14.640
4    -9.846    9    -30.594
5   -28.653    10   -26.111

Table 4.24: Free energy constants (ws)

Define xs as the moles of species s and F as the total moles of all species in the mixture, that is

F = x1 + x2 + ... + x10

Since the mixture is ideal, G is the sum of the free energies of the species in the mixture, that is

G = x1 G1 + x2 G2 + x3 G3 + ... + x10 G10.

Substituting ws + ln(P ys) for Gs/RT, dividing G by RT, and expressing the mole fractions as ys = xs/F, then

G* = G/RT = Σs xs [ws + ln(P xs / F)]

Let ais denote the moles of element i in one mole of species s. The values of ais are easily obtained from the species formulas. For instance a11 = 1, since species A (s = 1) has only one atom of element A (i = 1); a21 = 0, a13 = 2, etc. The total number of moles of element i in the mixture, Xi, is given by

Xi = ai1 x1 + ai2 x2 + ai3 x3 + ... + ai10 x10;  i = 1, 2, 3.

The optimization problem is to minimize the total Gibbs free energy, G, with respect to the mixture composition. But since the mixture temperature T is maintained constant, minimizing G* is equivalent to minimizing G. Then the optimization formulation is

Minimize G*

The solutions for this 10 variable problem cited in the literature and found with LSP are given in Table 4.25.

Observations:

The LSP results confirm the results cited in the literature. LSP found the solution in 3 minutes and 20 seconds on an 80386-33MHz microcomputer, taking 83,588 function evaluations to meet the convergence criterion on VM.

s    Literature   LSP
1    0.007        0.0061
2    0.068        0.0709
3    0.907        0.9038
4    0.0004       0.0005
5    0.491        0.4902
6    0.0005       0.0008
7    0.018        0.0183
8    0.003        0.0034
9    0.015        0.0154
10   0.042        0.0436
G*   -43.495      -43.4942

Table 4.25: Optimal solution to the chemical equilibrium problem

4.4 Model fitting

Regression analysis is used to fit the parameters of a model from a set of experimental data.
The experimental data consist of a set of observed points with known values for the independent variables and the corresponding values of the dependent variable. The relationship between the dependent and independent variables can be expressed in a variety of ways. For example, as a simple linear model

y = β0 + β1 x;

or as a multiple linear model

y = β0 + β1 x1 + β2 x2 + ... + βn xn;

or as a polynomial model

y = β0 + β1 x + β2 x^2 + ... + βn x^n;

or as a general nonlinear model, in which the dependent variable is a nonlinear function of the parameters β.

In all the above forms the regression analysis determines the values of the parameters, i.e. the β vector, in the model. The methods used to estimate these parameters are known as simple, multiple linear and nonlinear regression analysis, depending upon the formulation of the model. The equations used to define a certain model normally reflect the physical processes involved, and models dealing with the physical world are predominantly nonlinear in their formulation. On the other hand, estimating parameters for linear models is easier than for nonlinear models, and therefore the equations of nonlinear models are often linearized so that linear regression methods can be used. Such transformations may not be possible with highly nonlinear models.

Three best fit criteria used in the estimation of parameters of a linear model are: minimization of the sum of the squares of the deviations between observed and model generated values; minimization of the sum of absolute deviations; and minimization of the maximum absolute deviation [Narula, 1982].

Suppose a set of data is collected and the parameters for the linear model y = βx are to be estimated.
To fit a model from m experimental points using the least squares, or the absolute deviations, criterion, the objective function can be expressed as

Minimize Σ(j=1..m) | yj − β xj |^p    (4.2)

with as many terms as there are data values, j = 1, ..., m, where

yj represents the experimental value of the dependent variable at point j.
xj is the value of the independent variable vector at point j.
β is the vector of model parameters.

If p takes a value of 2, the objective is to minimize the sum of squares of the deviations. When p takes a value of 1, the objective is to minimize the sum of absolute deviations.

LSP can be used to optimize the function given in Equation 4.2 where the constant p takes the value of 1, 2 or any real value. LSP finds a solution to such problems without the need for linearization of any of the functions. It also provides designers with more freedom when deciding on the severity of the criterion for model fitting. In addition, LSP can easily estimate parameters for both linear and nonlinear regression models and, in fact, provides new perspectives on the parameter values found.

The type of sensitivity information on the model parameters provided by LSP is unique and has considerable potential value to an engineer seeking a representative model. The sensitivity information is provided by LSP in the form of the acceptance sets as the optimal solution is approached. This information is provided without any extra computational overhead and is also easily understood. Sensitivity analysis with LSP in connection with model fitting is discussed further in Chapter 8.

Examples of regression models were taken from the literature and solved using LSP. In almost all the cases LSP confirmed the results in the literature. Test problems [Problem 3-17-0-0 and Problem 4-11-0-0, Appendix A] are representative of some of the nonlinear regression models solved with LSP.
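As an illustration of Equation 4.2 (this is not the LSP code itself), the sketch below fits the one-parameter model y = βx under both the p = 2 and p = 1 criteria, using a crude grid search as a stand-in for the LSP search; the data values and function names are invented for the example.

```python
def lp_objective(beta, xs, ys, p):
    """Equation 4.2: sum over the m data points of |y_j - beta*x_j|^p."""
    return sum(abs(y - beta * x) ** p for x, y in zip(xs, ys))

def fit_beta(xs, ys, p, lo=-10.0, hi=10.0, steps=40001):
    """Crude one-dimensional grid search -- a stand-in for the LSP search."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda b: lp_objective(b, xs, ys, p))

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]          # invented data, roughly y = 2x with noise
beta_ls = fit_beta(xs, ys, p=2)    # least squares criterion
beta_lad = fit_beta(xs, ys, p=1)   # least absolute deviations criterion
```

For p = 2 the grid minimum agrees with the closed-form least squares value Σxy / Σx² = 1.99, while p = 1 pulls the estimate toward the weighted median of the ratios yj/xj; changing p alters the severity of the fitting criterion exactly as described above.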
A 4% improvement over the solution cited in the literature was found with LSP’s solution to [Problem 3-17-0-0, Appendix A]. While it took 19,842 function evaluations to meet the convergence criterion, this still took only 3 minutes on an 80386-33MHz personal computer, and this time is of little practical significance. Only 2,128 function evaluations were expended to meet a similar convergence criterion for the four variable problem [Problem 4-11-0-0, Appendix A].

Chapter 5

EXPERIENCE WITH LSP

Considerable experience was gained with LSP when the many test problems, both mathematical and engineering in nature, were solved during the course of this research. This experience showed that the output obtainable from LSP as the search progresses can give important information concerning the effectiveness of the search, possible adjustments to the search strategy when difficulties are experienced, as well as insights into the topology of the problem being solved. It has also helped to give a clearer indication of the influence of certain LSP parameters, such as the number of points in a level set, on the number of function evaluations required for solution, and on the reliability and efficiency of the overall search.

While the basic LSP algorithm with the recommended default parameter values is an effective global optimizer, it is natural to seek further improvement in performance. This chapter describes a variety of enhancements and adjustments which have been beneficial in solving some of the test problems and which would be implemented under the direct control of the user.

5.1 Intermediate output, its presentation, and interpretation

One of the important contributions of this research in implementing LSP is the information which is displayed graphically on the computer monitor after every iteration. This feature was not anticipated but, as its potential value became apparent, it was explored and exploited in this research.
Three principal types of output displays were investigated: scatter diagrams of points in an acceptance set on two variable planes; plots of the level set value-c against iterations; and the cumulative number of function evaluations against iterations. These can all be generated rapidly with little computational overhead and were provided without difficulty by the LSP software developed. These plots and their significance are described in detail in the next three sections. All of the diagrams used in this chapter are either derived from these displays or represent actual screen dumps of the displays.

5.1.1 Plots of xi, xj pairs

A set of two dimensional xi ~ xj plots of the confirmed points in the current acceptance set is displayed on the computer monitor at each iteration. When the axes of these plots span the initial variable bounds, the cluster formed by these points indicates the size and location of the current acceptance set in the decision domain. The approximate acceptance set boundaries can be visually inferred from the cluster of points displayed.

The existence of multiple local optima is suggested in these xi ~ xj plots when the displayed points form distinct clusters with open spaces in between. It is quite common to observe that, after a few more iterations and an accompanying improvement of the level set value-c, the displayed points form a smaller single cluster. The existence of distinct clusters at some stage of the search and their disappearance at a later stage is indicative of local optima and of LSP’s ability to reject them. If there is some special reason to explore a particular local optimum, LSP can be rerun within a new initial cuboid enveloping the cluster of points of interest. Clusters may however exist in n dimensional space but not necessarily be evident in many or even all of the xi ~ xj plots, due to overlapping and the limited perspective of each of the two dimensional views.
Thus while visual detection of clusters is sufficient to confirm their existence, it is not necessarily the case that they will be detectable in this way. In section 5.1.2 other graphical evidence of the existence of multiple clusters is described.

Two plots derived from a single LSP run, but at different iterations, are shown in Figure 5.1 for a two variable problem and indicate the existence of multiple optima. The points form two distinct clusters, which indicates that there are at least two local optima. At this stage there is nothing to suggest whether the two clusters will lead to a single global optimum or to two or more equal global optima. After a few additional iterations the points around the upper left corner of Figure 5.1(a) completely disappear and all points in the acceptance set cluster around a single point, as shown in Figure 5.1(b). The disappearance of those points which had formed a distinct cluster at an earlier iteration implies the existence of a local minimum in that vicinity. As the LSP search progresses the displayed points coalesce about the solution and ultimately occupy only a single pixel in the graphical display.

Figure 5.1: Local optima suggested by the disappearance of a point cluster.

Similar scatter diagrams plotted with axes which span the sides of the current cuboid, that is where the full length of the axes corresponds with the current cuboid side, can also be displayed at any stage of the search. Such a plot gives a closer look at the pattern of points in the current acceptance set. These patterns may reveal useful information concerning, for example, the distribution in the decision space of near optimal solutions and the mutual relationships of each pair of the variable set. The significance of such indications will be discussed under the subject of sensitivity in Chapter 8.

5.1.2 Plots of cumulative function evaluations-Nf against iteration number-I

Many different schemes for counting the number of function evaluations incurred in nonlinear optimization appear in the literature.
The count adopted in this research reflects both objective function and constraint function evaluations. When a point is generated in LSP it is first checked for feasibility against the set of constraints, including the bounds on each variable. If the point is feasible then it is retained; if infeasible, it is rejected. In either case this is counted as one function evaluation. At every feasible point the objective function value is calculated and this is also counted as a function evaluation. Consequently every confirmed point in an acceptance set counts as two function evaluations.

A variety of shapes of the Nf ~ I curve can be observed. From experience with the test problems these curves have been categorized into three types: linear, exponential-like and sigmoidal curves. The implication of each type of curve is explained next.

Linear curve

The number of function evaluations at the first iteration gives an indication of the ratio of the feasible region volume to the initial cuboid volume. The number of function evaluations at the first iteration for an unconstrained but bounded problem should be equal to Nkeep, that is, equal to the number of points which are to be confirmed in the acceptance set.

For the ideal case of an unconstrained, but bounded, problem with a convex objective, the Nf ~ I plot has been confirmed through the test problems to be almost a straight line. The slope of this line, the number of function evaluations per iteration, is only slightly larger than Nkeep/2, as shown in Figure 5.2(a). The reason is simply that, in this rather ideal case, only about half of the points in a level set are lost with the lowering of the level set value-c to M(f, c) at each iteration.
These lost points must be replaced by new ones at every iteration and each new point needs one function evaluation.

Exponential-like curve

When a problem has constraints in addition to the variable bounds, then the number of function evaluations at the first iteration will be greater than Nkeep as a result of the diminished feasible space within the initial cuboid. For constrained problems, using the uniform random trial point generation scheme described in Chapter 2, Section 2.3.2, the number of function evaluations increases as the ratio of the feasible region volume to the cuboid volume gets smaller. A progressive plot of the cumulative number of function evaluations against iteration number, Nf ~ I, indicates the effectiveness of the search at each iteration and any change in this efficiency as the search progresses.

For constrained problems the Nf ~ I plot usually deviates from a straight line and, for a typical “well behaved” LSP problem, progressively steepens, reflecting that generating new acceptable points gets more difficult with the number of iterations. Figure 5.2(b) is a plot for a typical constrained problem.

The inefficiency of acceptable point generation is indicated by the degree of deviation of the Nf ~ I plot from the straight line of slope Nkeep, each point requiring one function evaluation for feasibility and one function evaluation for eligibility in the level set. This can occur when the problem has widely spaced multiple optimal points which produce a disjoint acceptance set, or when the geometry of a connected acceptance set produces a large cuboid volume with a small acceptance set volume, for example when the acceptance set is distributed around a cuboid diagonal.

Figure 5.2: Nf ~ I plots. (a) Linear Nf ~ I curve; (b) Exponential-like Nf ~ I curve; (c) Sigmoidal Nf ~ I curve.
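The bookkeeping described above can be sketched in a few lines of code. The routine below is a deliberately simplified stand-in for LSP, not the thesis implementation: it uses a mean-based cut in place of M(f, c), plain rejection sampling in a cuboid refitted to the surviving points, and invented names. For an unconstrained convex objective its cumulative count of function evaluations grows almost linearly, at a little over Nkeep/2 per iteration, as in Figure 5.2(a).

```python
import random

def toy_level_set_search(f, bounds, n_keep=20, iters=30, seed=1):
    """Toy level-set iteration: keep n_keep confirmed points with f(x) <= c,
    refit the cuboid to the survivors, and count every f evaluation."""
    rng = random.Random(seed)
    pts, n_evals = [], 0
    # fill the initial acceptance set from the initial cuboid
    # (unconstrained: one function evaluation per generated point)
    while len(pts) < n_keep:
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        pts.append((f(x), x))
        n_evals += 1
    history = []
    for _ in range(iters):
        c = sum(v for v, _ in pts) / len(pts)    # level set value-c (mean cut)
        pts = [p for p in pts if p[0] <= c]      # discard points above c
        cuboid = [(min(x[i] for _, x in pts), max(x[i] for _, x in pts))
                  for i in range(len(bounds))]
        while len(pts) < n_keep:                 # replace the lost points
            x = [rng.uniform(lo, hi) for lo, hi in cuboid]
            v = f(x)
            n_evals += 1
            if v <= c:                           # confirm only level set points
                pts.append((v, x))
        history.append(n_evals)
    return history, min(v for v, _ in pts)

history, best = toy_level_set_search(lambda x: x[0] ** 2 + x[1] ** 2,
                                     [(-1.0, 1.0), (-1.0, 1.0)])
```

On this convex objective the per-iteration increments of history stay close to a constant a little above n_keep/2, giving the near-linear Nf ~ I curve, and best converges rapidly toward the global minimum at the origin.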
Sigmoidal curve

For some problems the Nf ~ I plot may be linear for the initial iterations, then adopt an exponentially increasing shape for a few more iterations, and finally decline and continue as a straight line, as shown in Figure 5.2(c). Such a plot shows that generating points in the acceptance set was efficient at the earlier stages of the search, then lower efficiency was experienced for a while, but that the efficiency was restored in the later stages when approaching the final convergence of c on c*.

A common explanation for such a curve is that, at the early stage of the search, the level set requirement is easily met and feasibility dominates acceptance set point generation. At this stage all points in the acceptance set constitute a single cluster, indicating that the feasible region is not disjoint. Therefore the Nf ~ I plot is close to a straight line with slope approximately equal to Nkeep. At the intermediate stages the acceptance set becomes fragmented, the points forming distinct separate clusters each surrounding a local optimum. Since acceptance set point generation is inefficient with multiple optima, their existence is reflected in the Nf ~ I curve by a steepened slope. But in the later stages the local optima are discarded and the search is performed on a single connected acceptance set. Such a curve shape is most pronounced when the objective function values at widely dispersed local optima are close to the global optimum objective function value. If the differences between the objective function values at the local optima and at the global optimum are large, the influence of the local optima on the Nf ~ I curve is small.

The sigmoidal curve can be demonstrated with an LSP solution of the Rastrigin function, [Problem 2-16-0-0, Appendix A], a two variable problem with 50 local optima but only one global optimum. This problem’s formulation is

Minimize f(x) = x1^2 + x2^2 − cos 18x1 − cos 18x2

Subject to:
−1 ≤ x1 ≤ 1
−1 ≤ x2 ≤ 1.

Figure 5.3 shows the screen display at convergence obtained from the LSP solution.

Figure 5.3: Rastrigin function - local optima implied by the Nf ~ I plot. (Actual screen dump).

A second possible reason for observing the characteristic sigmoidal Nf ~ I curve arises when a fissure exists in the objective function surface in the feasible region at the global optimum. For a problem with more than one local optimum, points in the acceptance set can form clusters around the individual local optimal points but not necessarily include a point in the fissure. The ratio of the acceptance set and cuboid volumes may be small under these conditions and the search inefficient. Provided the cuboid at this stage includes the fissure region, then there is still a possibility of generating points at or near the global optimum. Once a single point is generated in the fissure, the search tends to converge rapidly on the global optimum as more and more points are generated within the fissure, so that the efficiency of the search is restored. The Road Runner function test problem [Problem 2-22-0-0, Appendix A] screen display is shown in Figure 5.4, and is an example where the Nf ~ I plot exhibits the sigmoidal shape due to the presence of a fissure.

Figure 5.4: Screen display for the Road Runner function, two dimensional case. (Actual screen dump).

5.1.3 Plots of level set value-c against the iteration number-I

The progressive plot of the level set value against the number of iterations, c ~ I, gives feedback on the strength of the convergence on c*. In this research a pre-specified value of the modified variance was adopted as a convergence criterion. This value is compared with the modified variance after each iteration.
Ideally the modified variance should reach zero at the global optimum, but a small value is used as a tolerance for practical implementation purposes.

Because the modified variance is not dimensionless and is also sensitive to the absolute value of the objective function, a strong convergence criterion value for one case may be weak for another case. It can therefore be difficult to specify a convergence criterion in advance of obtaining a solution of the problem. The convergence question can be judged from the evolving c ~ I curve and based on the difference of the level set value between consecutive iterations, as was discussed in Chapter 2, section 2.3.5.

Ideally c ~ I would be a smooth and monotone decreasing curve, strongly asymptotic to the global minimum. The advancement of c on c* resembles a binary search in some respects and therefore often shares similar convergence characteristics. As the curve becomes nearly horizontal it indicates that the difference between the level set values at consecutive iterations is very small, and that further iterations would probably bring even smaller improvement to the objective function per iteration; see Figure 5.5(a). But if the search stops before the horizontal part of the curve is well developed, it may suggest that the convergence criterion adopted is too large and has led to a premature termination, as in Figure 5.5(b).

In less ideal cases the level set value versus iterations plot might show some discontinuity, as in Figure 5.5(c). The causes of the discontinuity can be either a discontinuity in the objective function surface, or that the global optimum lies in a small fissure and a first point has been generated in that fissure just prior to the discontinuity. When a low point is generated in the fissure it focuses the search around the new point and a major correction occurs in the level set boundary estimates. As more points are then
As more points are thengenerated in the fissure the level set value drops rapidly compared to previous iterations,hence an abrupt change in shape is introduced into the c I curve. An example of theChapter .5. EXPERIENCE WITH LSP155zIterations Iterations Iterations(a) Ideal convergence (b) Premature termination (c) Discontinuous convergenceFigure 5.5: Level set value-c versus iteration-I.above is given in Figure 5.6, where the Road Runner Function is used to demonstrate adiscontinuity in the c I-s I plot. In this instance the number of points maintainedin the level set, was reduced to half of its recommended value. The curve exhibits theinitial tendency to converge on a non optimal point, as discussed above, and then theabrupt slope change leads to the final convergence on the global optimum.5.1.4 Further generalization of the N1 - I and c ‘-‘ I plot interpretationsThe N —‘ I and c - I plots are considered to be an important by-product of theLSP search and their interpretation is unique to LSP. Experience with these plots whensolving problems has revealed some distinct and frequently observed patterns. Thesepatterns have been linked to the known features of the many test problem solved andhas lead to the following generalizations. While these are explained with the aid of aone dimensional problem, their key features can easily be extended to the multidimensional equivalent cases. It should be stressed that, regardless of the dimensionality ofthe problem being solved, the c I and N1 I plots will remain two dimensional.Interestingly, the characteristic plots shown in the following pages were clearly observedin many multidimensional problems. This suggests that the types of features reflected in.....•..••I•• •Chapter 5. EXPERIENCE WITH LSP 156LEVEL SET PROGRAM hGI______LSP-jqqII e .LiE!.. :• .-.. ..L-.--. I:zi--ra. a • •-a•— a a• p-L ii:: —Iii!r. aa’___________________ _ _ _ _ _ _______________________• a aFigure 5.6: Low Nke search for the Road Runner function. 
Demonstrates a discontinuity in the c ~ I plot. (Actual screen dump).

the two dimensional examples are often also dominant in more complex problems.

The top two shaded diagrams in Figure 5.7 show plots of the objective function against the variable x for two single dimensional problems. The user would not have knowledge of these curves, but their general nature is revealed by the LSP screen displays of the c ~ I and Nf ~ I plots shown in the lower sections of Figure 5.7.

The plots of the level set value-c versus iterations are both simply monotone decreasing. This suggests, for example, that no fissure-like feature in the objective function surfaces has been detected during the LSP search.

In the unimodal objective function case of Figure 5.7, the Nf ~ I plot shows a linear relationship between Nf and I. The slope of the curve is close to Nkeep, which is indicative of good search efficiency. This curve, together with the smooth monotone decreasing c ~ I curve, leads to the conclusion that the optimization problem being solved has only one global optimum and an objective function surface which is essentially unimodal.

In the multimodal case of Figure 5.7 the Nf ~ I plot increases exponentially. This decline in efficiency is indicative of a progressive decrease in the volume ratio of the level set to the current cuboid, which often arises when the problem being solved has multiple global, or near global, optima.

Figure 5.8 shows three sets of more complex c ~ I and Nf ~ I plots reflecting more complex objective functions. The first set, Figure 5.8(a), shows a smooth monotone decreasing c ~ I curve and an almost constant slope of the Nf ~ I curve, except for a slightly higher slope at the initial stage of the search. These two curves are characteristic of problems with a set of local optima, and a single global optimum lying in a relatively broad depression in the objective function surface.
Moreover, the objective function values at the local points are significantly inferior to the value at the global optimum. If the objective values of the local optimal points are extremely inferior to the value at the global optimum, then the influence of the local optimal points on the Nf ~ I curve is so weak that the higher slope around the initial stages may not be distinct. The objective function of Figure 5.8(a) is representative of this situation.

Figure 5.8(b) shows a smooth monotone decreasing c ~ I curve, but the Nf ~ I plot has a characteristic sigmoidal shape. The likely indication from these two curves is that the problem has multiple local optima, but only one is a global optimum, and the difference between the values of the objective function at the local and global optima is quite small. This kind of objective function is depicted in the top figure of 5.8(b).

Figure 5.8(c) shows a pronounced discontinuity in the c ~ I curve and the characteristic three part sigmoidal Nf ~ I curve. Such curves are generally obtained with problems which have a fissure in the objective function surface. The fissure is contained within the cuboids throughout the search but no point is sampled inside the fissure at the earlier stages of the search. The type of objective function manifesting these plots is depicted at the top of Figure 5.8(c) for a single variable case.

5.2 Adjusting LSP parameters

With LSP there are a number of parameters which control key aspects of the search. These can remain fixed at recommended default values, be initialized at other values, or be adjusted as the search progresses in the light of the outputs discussed in section 5.1. The default parameter values recommended in sections 5.2.1 to 5.2.5 have proven to be effective in many cases, but when search difficulties are indicated by the c ~ I and Nf ~ I curves, adjustments to these values may be warranted.
Each of the parameters impacts directly on quantities in the domain of the engineer’s real world problem, as opposed to being mathematically abstract.

Figure 5.7: Interpretation of simple c ~ I and Nf ~ I curves, unimodal and multimodal objective functions.

Figure 5.8: Interpretation of more complex Nf ~ I and c ~ I plots.

The important LSP parameters are: the number of confirmed points in the acceptance set, Nkeep; the value of the criterion for convergence, VM; the value of the clustering criterion for partitioning the problem into subregions; the criterion for initiating skewness and the adjustment parameters; the change of the penalty parameters per iteration; and the tightening of any relaxed constraints per iteration. These parameter values and their adjustment, which were investigated and exploited throughout the test problems, are discussed in the next sections.

5.2.1 Number of confirmed points in the acceptance set-Nkeep

The required number of confirmed points in an acceptance set depends primarily on the number of variables involved in the problem formulation. Generally, increasing Nkeep improves reliability but reduces the search efficiency. The ideal relationship between Nkeep and the number of variables in the problem is not necessarily linear, though a linear relationship appears to work well for lower dimension problems. From the experience gained during this research, Nkeep = (10 × number of variables) was found to be adequate for most problems of up to 6 variables. With larger dimension problems, substantially less than 10 × the number of variables was often satisfactory.
For example, the maximum value of Nkeep utilized was 160 for Templeman's problem, discussed in Chapter 4, section 4.3.4, involving 38 variables, and yet the solution found with LSP improved (very slightly) on the published results for the same problem as well as finding multiple optima in a single run.

5.2.2 Termination criterion - VM

Since the convergence criterion used in LSP is based on the variance of the objective function values in the acceptance set, it is expressed in the same units as the objective function and is prone to scaling problems. Therefore a specific value of the convergence criterion performs best for a specific range of objective function values. A similar scaling problem is common with gradient optimization convergence criteria. One gradient method, GRG2, recommends that the objective function and constraint functions be scaled to have absolute values greater than 10^-1 and less than 100 for successful operation of its convergence criterion [Lasdon, 1982]. Convergence criterion values ranging between 1 × 10^-3 and 1 × 10^-8 were adopted with LSP. The higher values were used for problems with high absolute values of the objective function at their global optimum.

The modified variance of the objective function values in the level set is reported at every iteration. Its value can also be assessed in the light of the information on the progress of the search provided by the c ~ I and Nf ~ I curves as discussed in Chapter 2, section 2.3.4 and earlier in this chapter.

5.2.3 Cluster criterion

Subdividing the cuboid into a set of smaller cuboids when multiple optima are indicated by the formation of multiple point clusters can be an important strategy in achieving a computationally efficient and successful solution with LSP. A simple clustering analysis routine is adequate to identify the clusters and hence their boundaries. Any overestimation and even overlap of cluster boundary estimates does not threaten the success of the search.
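Such a clustering check can be sketched in a few lines. The following is an illustration only, not the thesis code: it computes the ratio of the minimum distance between two clusters to the maximum distance between two points within a cluster (the criterion defined in Chapter 3, section 3.1.3), given cluster labels produced by any simple routine such as cutting a dendrogram.

```python
# Sketch of the clustering criterion: min between-cluster distance
# divided by max within-cluster distance.  Labels are assumed to come
# from any simple clustering routine (e.g. cutting a dendrogram).
import itertools
import math

def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def clustering_ratio(points, labels):
    """Min between-cluster distance / max within-cluster distance."""
    between, within = [], []
    for (p, lp), (q, lq) in itertools.combinations(zip(points, labels), 2):
        (between if lp != lq else within).append(distance(p, q))
    return min(between) / max(within)

def should_partition(points, labels, criterion=2.0):
    # Thesis experience: useful criterion values lie between 2 and 8.
    return clustering_ratio(points, labels) >= criterion
```

For two tight clusters far apart the ratio is large and the region is subdivided; for an ambiguous split of a single blob the ratio falls below the criterion and the search continues in one cuboid.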
Once the bounds of each cluster and hence their cuboids are established, the search proceeds independently within each of these individual new cuboids.

The standard cluster analysis routine adopted in this research involves the construction of a dendrogram. At some distance from the branch ends, the dendrogram is cut by a line perpendicular to the branches so that the initial dendrogram tree is divided into a set of smaller trees. Points corresponding to each of the smaller trees are considered to form individual sets and are identified as clusters if the necessary clustering criterion is met.

The clustering criterion was specified in Chapter 3, section 3.1.3 as the ratio of the minimum distance between two clusters and the maximum distance between two points within a cluster. When this ratio is above a certain predefined value, the region occupied by the acceptance set is subdivided into smaller subregions. Based on the test problems, experience showed that the useful range of the clustering criterion is between 2 and 8.

Even though clustering analysis methods can be used for any n dimensional problem, their application in this research was confined to problems involving up to 4 variables. This was largely because of the difficulties which occur in cluster investigation in larger dimensional problems. Furthermore, there are dangers in over generalizing the experience obtained with a limited number of test problems to high dimensional problems that might be encountered in engineering.

5.2.4 Cuboid stretching parameters

At every iteration the cuboid defined by the confirmed points in the acceptance set is stretched on each side by a fraction of its total length, the fraction depending on n, the number of points from the previous iteration remaining in the acceptance set after points which do not fulfil the current f(x) <= c condition have been discarded.
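Leaving aside the exact choice of fraction (which the thesis ties to n, as set out in Chapter 3, section 3.4), the stretching step itself amounts to padding each bound of the cuboid. A minimal sketch, not the thesis code, with the fraction passed in explicitly:

```python
# Sketch of cuboid stretching: extend each side of the axis-aligned
# box [lower, upper] by `fraction` of that side's length.  The thesis
# derives the fraction from n, the number of surviving points; here it
# is simply an argument.
def stretch_cuboid(lower, upper, fraction):
    """Return new (lower, upper) bounds padded on both sides."""
    new_lower, new_upper = [], []
    for lo, hi in zip(lower, upper):
        pad = fraction * (hi - lo)
        new_lower.append(lo - pad)
        new_upper.append(hi + pad)
    return new_lower, new_upper
```

For example, a 5% stretch of the box [0, 1] × [0, 2] yields [-0.05, 1.05] × [-0.1, 2.1].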
Details of the stretching parameters are given in Chapter 3, section 3.4.

The stretching of cuboids obviously increases the cuboid volume, but the efficiency of the search is not noticeably reduced by this extra volume when the stretch is small. Experience with the test problems showed that no significant difference in the total number of function evaluations exists for cases with and without cuboids being stretched. But the increased reliability benefits from cuboid stretching were significant, especially when optimal points were close to the cuboid boundaries.

5.2.5 Skewness adjustment parameters

The LSP parameters used for skewness adjustment have already been discussed in Chapter 3, section 3.5. The choice of value for each parameter is made so as to balance the increased speed of convergence against the increased risk of missing the global optimum. The parameters help reduce the risk of missing optimal points located at the boundaries, and in particular the corners, of the cuboid.

From the experience gained in this research, the best values for the three skew parameters range from 0.3 to 0.4 for delta0, 0.10 to 0.25 for delta1 and 0.05 to 0.15 for delta2. When skewness adjustments were used, the default values delta0 = 0.4, delta1 = 0.1 and delta2 = 0.05 were adopted. For a one variable problem with a unit length of cuboid, these figures give a maximum elongation (stretch) of 5% when delta = 1 and a minimum elongation of 2% when delta = delta0. For cases where delta < delta0 there is no adjustment. With the default parameter values adopted, the shift of the cuboid would be a maximum of 10% when delta = 1 and a minimum of 4% when delta = delta0. Therefore for a cuboid of unit length the stretch introduced due to skewness adjustment will be in the range [0.02, 0.05] and the consequent shift will be in the interval [0.04, 0.10].

The above mentioned default parameter values produced satisfactory results throughout the test problems.
As a test of response to a highly skewed feasible region, the initial bounds for the Rosenbrock function, where the solution is at x1 = 1 and x2 = 1, were set at 0 <= x1, x2 <= 20. Two sets of runs were conducted, one without, and the other with, skewness adjustment. In the first case, out of the 10 different LSP runs without skewness adjustment, only 3 runs converged on the true optimum point. But, in the second case, where skewness adjustment was used, all 10 different runs converged on the global optimum point. These examples, as well as a number of the other test problems, demonstrated that the skewness adjustment significantly improves the reliability of convergence on the true optimum point with only a small addition in computational effort.

5.2.6 Penalty and constraint relaxation and tightening parameters

Details of penalty parameter modification are given in Chapter 3, section 3.2.1. Experimental work showed good results for the penalty coefficient reduction within a range of 0.7 to 0.9 at every iteration. A minimum value of 100 was found to be satisfactory for problems whose absolute value of objective function evaluated at the optimal point is less than 100. Reductions outside this range encountered difficulties which commonly involved convergence on a non optimal point. This is also discussed in Chapter 3, section 3.2.1.

The relaxation parameter value was usually determined by trial and error while running LSP. In some instances, with heavily constrained problems, very little progress towards establishing the initial set of Nkeep points occurred even after the expenditure of considerable computational effort. The constraints were then relaxed before completion of the first iteration. But, if possible, it is suggested that at least one iteration is completed so that an indication of N, the number of function evaluations needed to establish Nkeep confirmed points, is obtained. If N >= 20 × Nkeep one might think of relaxing the constraints.
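The per-iteration schedules described above can be sketched as follows. This is an assumed form, not the thesis code: the penalty coefficient is cut by a factor in the 0.7 to 0.9 range each iteration but not below a floor (100 worked for problems whose optimal |f| is below 100), and the amount by which a relaxed constraint is loosened is shrunk by a factor in the 0.5 to 0.85 range.

```python
# Sketch of the penalty-reduction and relaxation-tightening schedules.
def reduce_penalty(penalty, factor=0.8, floor=100.0):
    """Multiply the penalty coefficient by `factor`, but never go below `floor`."""
    return max(penalty * factor, floor)

def tighten_relaxation(relaxation, factor=0.7):
    """Shrink the amount by which a constraint is currently relaxed."""
    return relaxation * factor
```

Applied repeatedly, the penalty coefficient decays geometrically until it hits the floor, while the constraint relaxation is driven toward zero, restoring the original feasible region.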
Generally the value of the relaxation parameter is related to the complexity of the constraint functions. The merits of relaxing constraints are discussed in Chapter 3, section 3.3.

The six variable problem [Problem 6-1-0-2, Appendix A] demonstrates the advantage of constraint relaxation. Without relaxing the constraints it took over 100,000 function evaluations to generate a single acceptable point. But with constraint relaxation, it was possible to find the optimal solution with 272,225 function evaluations.

From experience with the test problems, values between 0.5 and 0.85 were found to be reasonable for tightening relaxed constraints. A lower value can be used for problems where the search efficiency indicated by Nf after the first iteration is high, indicating that the feasible region volume is close to the current cuboid volume.

5.3 Use of a rhombohedron shaped search domain

A rhombohedron can be used as a more compact search domain in place of a cuboid. The efficiency of LSP then increases in proportion to the volume ratio of the minimal cuboid to the minimal rhombohedron enclosing the same acceptance set. This is discussed in Chapter 3, section 3.6. This efficiency improvement increases with increasing linearity in the relationship between pairs of variables. The greatest efficiency gain occurs when the correlation coefficient approaches +1 or -1.

As the search becomes confined to a very small region of the decision domain, relationships which are highly nonlinear will often approximate locally to a linear relationship so that the rhombohedron can be exploited frequently. However, it is important to note that LSP does not impose an actual linear relationship on the variable pair so that no approximation or distortion is being introduced into the problem being solved.

The rhombohedron was used for a number of test problems which had shown distinct evidence of linearity between some of the variables in the xi ~ xj plots at some stage in the search.
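The trigger for switching from the cuboid to the rhombohedron is a simple pairwise correlation test on the points of the acceptance set. A minimal sketch (not the thesis code) of finding the most correlated variable pair and applying the switch threshold:

```python
# Sketch: find the variable pair with the largest |Pearson r| among
# the acceptance-set points, and test it against the switch threshold.
import itertools
import math

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def most_correlated_pair(points):
    """Return ((i, j), r) for the variable pair with the largest |r|."""
    n_vars = len(points[0])
    cols = [[p[k] for p in points] for k in range(n_vars)]
    best, best_r = None, 0.0
    for i, j in itertools.combinations(range(n_vars), 2):
        r = pearson(cols[i], cols[j])
        if abs(r) > abs(best_r):
            best, best_r = (i, j), r
    return best, best_r

def use_rhombohedron(points, threshold=0.9):
    # The thesis switched domains when |r| >= 0.9 for some pair.
    _, r = most_correlated_pair(points)
    return abs(r) >= threshold
```

Only the single pair with the highest |r| is acted on, matching the one-pair-at-a-time restriction described below.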
The three variable flywheel design problem [Problem 3-15-0-2, Appendix A] and the four variable power generation problem [Problem 4-11-0-5, Chapter 4] were used to investigate the reduction in the number of function evaluations when the rhombohedron was used. Each of these test problems was solved in two ways. In the first case, a cuboid was used throughout the search while in the second case shifts were made between cuboid and rhombohedron as the need arose. The shift to the rhombohedron was made only when the absolute value of the correlation coefficient between any two variables was greater than or equal to 0.9. Both test problems were run three times for each case and the average number of function evaluations required for convergence is given in Table 5.1. The same termination criterion and Nkeep values were adopted in all of the cases.

              Number of function evaluations
Problem         Cuboid       Rhombohedron
3-15-0-2       7,432,678       507,558
4-11-0-5       1,431,317        84,653

Table 5.1: Function evaluations using cuboid and rhombohedron

In multidimensional problems more than one pair of variables can exhibit high linear correlation. In such cases the rhombohedron defined might have more than one non-rectangular face. Even though it is technically possible to make further modifications to the cuboid in response to multiple high correlation coefficients, only a single variable pair was accommodated at any one time in this research. When more than one pair of variables had correlation coefficients greater than 0.9 the cuboid was replaced with a rhomboid only in the plane of the variable pair with the highest correlation. At this present stage of implementing the use of the rhombohedron both conceptualization and visualization difficulties preclude the use of a rhomboid for more than one variable pair.

The rhombohedron also provides benefits in other situations.
When a problem has multiple global optima, the number of function evaluations can be very large before the convergence criterion is met, as discussed in section 5.1.4. If there are only two global optimal points, or there are more than two optimal points lying approximately on a straight line in the decision space, adopting a rhombohedron can dramatically improve the search efficiency. Therefore in some cases, the use of the rhombohedron can be a substitute for cluster analysis. For example, a rhombohedron search domain was used to solve the two variable, two global optima problem discussed in Chapter 3, section 3.1.3, using Nkeep = 20 and VM = 0.001. The convergence criterion was met and both optimal points identified after 234,866 function evaluations. Comparison of the result with the 8,018,142 function evaluations cited in Table 3.1 in Chapter 3 indicates that the efficiency gain is substantial.

5.4 Relationship between initial cuboid volume and its location, and Nf

When optimization problems are bounded but bound estimates are not known, the designer has to guess the possible upper and lower limits for each variable in order to define the initial cuboid. A generous guess is preferred since precision on the bounds is not required and it is important that the global optimum not be excluded. The size of the initial cuboid was found not to have a significant influence on the total number of function evaluations required to solve the problem.

A two variable unconstrained problem [Problem 2-24-0-0, Appendix A] demonstrates the influence the initial cuboid can have on the total number of function evaluations. This problem was run with two different sized initial cuboids. In the first case the bounds of the variables used were 0 <= x1 <= 4 and 0 <= x2 <= 4, which gives a volume of 16 units. In the second case, where the bounds used were 0 <= x1 <= 20 and 0 <= x2 <= 20, the volume of the initial cuboid was increased to 400 units, i.e.
25 times the volume in the first case. Ten separate runs for each of these cases were conducted and the average number of function evaluations was found to be 2,369 for the first case and 2,725 for the second, larger initial volume, case. These numbers suggest that there is only a weak relationship between initial cuboid volume and the total number of function evaluations required for solution.

Inappropriately skewed initial estimates of bounds relative to the global optimal point(s) can result in a high number of function evaluations for solution. The skewness adjustment generally modifies the location of the cuboids so that clusters of points in the vicinity of a cuboid boundary are not on the boundaries of subsequent cuboids. Even if the skew adjustment is high at the earlier stage of the search, it can take many iterations to shift the cuboid until the global optimum is included in the current cuboid. Such extra iterations introduce computational inefficiency. Of course the inefficiency can be reduced with a different set of skewness parameter values, but adjusting these parameters for each particular problem is not a feasible option as there will rarely be any a priori information on which to base the adjustment.

5.5 Some specific difficulties encountered while implementing LSP

Computational inefficiency and convergence difficulties were encountered by LSP in just four of the 200 published mathematical test problems attempted. LSP can fail to find the true solution to a problem when a cuboid at any stage of the search erroneously excludes a portion of the level set containing the global optimum and this is not subsequently recovered through cuboid stretching or skewness adjustment. Evidence of this difficulty was provided by a test problem taken from Subrahmanyam [Problem 2-12-0-4, Appendix A]. This two variable, four constraint problem is shown in Figure 5.9.
Its key features are a narrow crescent shaped feasible region formed by two parabolas and an objective function surface almost parallel to one of the variable axes. The formulation is

Minimize f(x) = (x1 - 10)^3 + (x2 - 20)^3

Subject to:

-x1 + 13 <= 0
-(x1 - 5)^2 - (x2 - 5)^2 + 100 <= 0
(x1 - 6)^2 + (x2 - 5)^2 - 82.81 <= 0
x2 >= 0

Subrahmanyam [Subrahmanyam, 1989] solved this problem using the "Extended simplex method applied to constrained nonlinear optimization". He states that convergence was achieved at 833 function evaluations, though it is not clear if this number includes the evaluation of constraints or is only the number of objective function evaluations.

Attempts to solve the above problem with LSP, using the default parameter values, always converged on non optimal points. Since the global optimum lies at an acute corner of the feasible region, the cuboids at successive iterations excluded the global point for the reason discussed in Chapter 3, section 3.5. Using skewness values within the range recommended in this thesis, the skewness adjustment could not shift the cuboid far enough to include the optimum point before the convergence criterion was met. The skewness parameter values were easily tailored to solve this particular problem; values of delta0 = 0.25, delta1 = 0.22 and delta2 = 0.15 produced the correct global optimum. Such values would not be appropriate for general LSP use because of the severe reduction in efficiency that would result with most types of problems.

Figure 5.9: The crescent shaped feasible region of Subrahmanyam's problem [Problem 2-12-0-4, Appendix A].

Subrahmanyam's problem suggests an even simpler test problem. As the only critical ingredient is the crescent shaped feasible region, the objective function can be replaced by a simple linear function with a mild gradient.
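The formulation above translates directly into code. The following sketch (not the thesis implementation) states the objective and a feasibility test for the crescent; probing it with sample points makes the narrowness of the feasible region immediately apparent:

```python
# Sketch of Subrahmanyam's test problem [Problem 2-12-0-4]:
# cubic objective over a narrow crescent-shaped feasible region.
def f(x1, x2):
    return (x1 - 10.0) ** 3 + (x2 - 20.0) ** 3

def feasible(x1, x2):
    """All four constraints, each written as g(x) <= 0 or a bound."""
    return (-x1 + 13.0 <= 0.0
            and -(x1 - 5.0) ** 2 - (x2 - 5.0) ** 2 + 100.0 <= 0.0
            and (x1 - 6.0) ** 2 + (x2 - 5.0) ** 2 - 82.81 <= 0.0
            and x2 >= 0.0)
```

For instance, the point (14.5, 1.8) lies inside the lower lobe of the crescent, while any point with x1 < 13 is excluded outright by the first constraint.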
It appears that, with its elegant simplicity, this modified version of Subrahmanyam's problem might play a similar role for level set methods in general as that played by Rosenbrock's valley problem for gradient search methods. A sample objective function for a modified Subrahmanyam's problem can be stated as

f(x) = 245x1 + 4292x2.

Another particular difficulty arises with LSP when the shape of an active constraint surface is similar to the shape defined by the objective function surface in the vicinity of the global optimum value. An elementary example of this occurs in linear programming when the objective function is parallel to an active constraint. There, all points on the line joining the two neighbouring optimal vertices constitute a set of global optima. When solving nonlinear problems having this characteristic, the cuboid volume reduction between consecutive iterations can, from a practical point of view, cease. But the volume of the level set inevitably decreases at every iteration, even though the magnitude of this decrease may be small. Thus the ratio of the acceptance set volume to the cuboid volume gets progressively smaller with each iteration. As this ratio declines, the chances of generating acceptable points decrease, and the search becomes increasingly inefficient.

Another situation where a progressive reduction of the acceptance set volume occurs while the volume of the cuboid remains almost constant is shown in Figure 5.10 for an unconstrained problem. The figure shows that the cuboid volume, which corresponds to area in this two dimensional example, remains nearly the same at iterations k and k + 1, while the volume of the acceptance set is substantially reduced. In this case the efficiency of generating acceptable points at the k + 1 iteration declines proportionally with the acceptance set volume.
Such a situation can, however, often be detected visually in the xi ~ xj displays and remedied by partitioning.

Figure 5.10: Acceptance set volume change after an iteration while the cuboid remains constant: (a) acceptance set at iteration k; (b) acceptance set at iteration k+1.

Efficiency problems also arise with LSP when a problem has multiple global optima, and the global optimal points are approximately distributed along an elongated path. An example is provided by the problem shown in Chapter 3, Figure 3.5 when solved as a maximization problem, so that the problem becomes

Maximize f(x) = 100 - (x1 + x2 - 10)^2

Subject to:

0 <= x1 <= 10
0 <= x2 <= 10

The solution is f(x*) = 100, and all points on the line connecting (0,10) and (10,0) on the x1-x2 plane are members of the solution set. The final cuboid enveloping these points is the same as the original cuboid.

Under normal circumstances the on-screen plot of the level set value versus the number of iterations, c ~ I, shows a monotone decreasing curve, but in some cases discontinuities in the curve are observed. This indicates that there is an abrupt change in the rate of improvement of the level set value. This phenomenon may arise from the fact that there is a fissure in the objective function surface in the feasible region which gives lower values of the objective function and no point was sampled in the fissure in any iteration prior to the discontinuity. There could of course be more than one fissure in the objective surface and consequently any discontinuity in the c ~ I curve should signal caution about the validity of any presumed global solution.

5.6 Observations and recommendations

Prior information about a nonlinear optimization problem, such as the nature of the objective function surface or the shape of the feasible region, can be of help in the search for the global optimum.
Realistically though, such a priori information is seldom available, even with engineering problems of modest complexity.

The information provided by LSP as the search progresses, as discussed earlier in this chapter, is simple in nature but gives important clues to the likely nature of the difficulties being encountered. As a result, possible remedies to overcome the difficulty, like relaxation of constraints or adjustments of other LSP parameters, can be suggested as the search progresses and before unnecessary computational effort is expended or the search abandoned.

In most of the test problems, the volume of the cuboid reduces and the density of points (i.e. points per unit volume in the decision space) in the acceptance set increases from one iteration to the next. In order to maintain a more or less constant density of points, the number of confirmed points in the acceptance set can be reduced at every iteration without impairing the convergence properties of LSP. A recommended refinement is that, unless the computational effort for a single function evaluation is very large, a higher initial Nkeep value be used than that suggested in section 5.2.1. Initially, Nkeep can be raised by about 30% of the recommended value and then be gradually reduced, by one or two, at every iteration, the reduction to stop when Nkeep reaches about two thirds of the recommended (10 × number of variables) value. When no other difficulties are experienced, the total number of function evaluations per iteration is proportional to the number of points in the acceptance set so that, while this strategy improves reliability, the overall computational effort expended is unchanged.

The other heuristic adjustments discussed are modification of the penalty parameter and tightening of constraints following relaxation. These systematic adjustments are intended to increase the efficiency of generating acceptable points within a cuboid.
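The Nkeep schedule recommended above can be written down in a few lines. This is an assumed sketch of one way to realize it, not the thesis code: start about 30% above the usual 10-per-variable rule, drop by one or two per iteration, and stop at about two thirds of the usual value.

```python
# Sketch of the recommended Nkeep schedule.
def nkeep_schedule(n_vars, iteration, step=2):
    """Nkeep to use at a given iteration (0-based)."""
    base = 10 * n_vars              # the usual 10-per-variable rule
    start = round(1.3 * base)       # begin about 30% higher
    floor = round(2.0 * base / 3.0) # never drop below two thirds of base
    return max(start - step * iteration, floor)
```

For a two variable problem this starts at 26 confirmed points and settles at 13 after a few iterations, keeping the point density roughly constant as the cuboid shrinks.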
There were instances where problems with feasible regions which were too small relative to the initial cuboid to successfully generate any acceptable points could only be solved using these approaches. Examples are [Problem 3-7-0-2, Appendix A] and the ten member truss problem in Chapter 4, section 4.3.3.

In LSP the number of function evaluations expended at each iteration is displayed in the Nf ~ I graph and used to interpret certain phenomena. In general the Nf ~ I plot serves as an indication of search efficiency as the search progresses. A high number of function evaluations at the initial iteration indicates difficulty in generating points in the feasible region. Continuing the search under those circumstances, without modifying the LSP parameters, may result in an overall inefficient search or even an unsuccessful search. Therefore it is recommended that, in response to a high Nf value in the first or early iterations, the search be halted and the technique of constraint relaxation and tightening actuated in the LSP program. The fact that the need for such modification is usually indicated quite clearly by the output from the first iteration is of significant practical value as it limits wasteful expenditure of computational effort.

When the plot of the level set value versus iterations, c ~ I, shows a discontinuity at some stage of the search, the reasons for the discontinuity can usually be attributed to the detection of a narrow and deep fissure in the objective surface in the region containing the global optimum point. No point has been sampled in the fissure prior to the c ~ I curve discontinuity but the acceptance set (so also the cuboid) still includes the global optimum. The subsequent generation of a single point in the fissure, with a substantially lower objective function value, triggers the discontinuity in the rate of reduction in c over the current and following iterations.
When such a phenomenon is observed, Nkeep might be raised, say by about 50%, and LSP rerun so that the chance of generating points in any similar fissures, if they exist, is increased. The increase in Nkeep will help to increase the chance of converging on the global optimum, but at the expense of extra computational effort.

In some cases, and especially with low dimension problems, the xi ~ xj scatter diagrams of points in the acceptance set clearly indicate the existence of more than one cluster of points. In that case LSP can be halted and separate LSP searches initiated within cuboids bounding the individual subregions indicated by the clusters. The bounds of the subregions can be estimated visually from the scatter diagrams. Some additional graphical identification techniques are discussed in Chapter 7, section 7.5.2.

Finally it should be emphasized that, for all of the parameter adjustments discussed, much of the triggering information concerns just c, Nf and I. The magnitude of c is related directly to the problem being solved while Nf and I are easily understood quantities related to the search. The influence of these quantities on the search strategy is therefore not likely to be difficult for an engineer practitioner, as opposed to the numerical analysis specialist, to understand. The combined influence of a complicated set of constraints and a complicated objective function surface on the LSP search, even when a high number of dimensions is involved, can be interpreted from the set of two dimensional screen plots discussed in sections 5.1.1 to 5.1.3 far more easily than one might expect. In addition, the two dimensional xi ~ xj scatter diagrams have the potential to provide an engineering practitioner with useful new knowledge concerning the topology of his design problem. The convergence criterion can be judged and adjusted in the context of the specific problem being solved and in light of the progress indicated
in these plots. The plots also provide important clues concerning the modification of certain LSP parameters which can enhance computational efficiency and reliability while the search is in progress.

Chapter 6

EVALUATION OF NLP PERFORMANCE

During the course of this research into the implementation of a level set optimization scheme it became evident that it was evolving into an optimization tool with many features which might be considered attractive. At the same time it was recognized that, with the large number of solution points needed to be generated at each iteration, LSP might be considered to be computationally extravagant when compared with most gradient methods as well as certain direct search methods. Some kind of objective evaluation of LSP's overall performance and characteristics seemed desirable and the literature was searched for a suitable evaluation procedure.

There are a growing number of nonlinear optimization methods and many have been developed within very specialized fields of application. Diverse testing procedures and evaluation criteria have been presented but the problem remains of determining which test problems and evaluation criteria are relevant to an individual user's needs. While some users are concerned primarily with speed and accuracy of convergence, others are more interested in reliably obtaining the true global solution.

No established universal criteria set for evaluating nonlinear optimization methods was found. As well, most of the test problems which appear in the literature, and have been used for evaluating performance, were developed to demonstrate the special strengths of one specific NLP method.

The performance criteria which have been offered in the literature are geared almost entirely to evaluating local optimization methods, the reason being that local optimization methods are by far the most prevalent.
Since very few NLP methods have been developed for global optimization, evaluation criteria for global optimization methods are rare. Unfortunately, the criteria sets which have been proposed for local NLP methods do not consider some of the important features of global optimization techniques and the pitfalls of global optimization problems.

In this chapter some of the existing criteria for assessing global optimization methods are reviewed and a revised set proposed which, in particular, attempts to address the current needs of engineering practitioners. An assessment of LSP's performance under the recommended performance criteria set is also demonstrated in the last section of this chapter.

6.1 Existing evaluation criteria

Most of the existing performance indicators or performance criteria focus on the following factors: efficiency, expressed as the CPU time required to obtain a solution; the number of function evaluations for solution; and the numerical accuracy of the final results. The ability of a particular method to solve a wide variety of optimization problems is seldom considered and quantitatively measured since in the past efficiency has been the dominant consideration [Reklaitis et al., 1983]. Sandgren set as a criterion the ability to solve a large number of problems in a reasonable amount of time [Sandgren, 1980]. He ranked the algorithms on the basis of the number of problems solved within specified CPU time limits. The limits were based on a fraction of the average CPU time for all algorithms to solve each of a set of problems.

Most of the criteria sets to date have been tailored to a specific algorithm. Regardless of the type of test problems used, the following performance indicators have consistently appeared in the literature and are therefore considered to be the most general.

• Efficiency (CPU time).
A measure of the central processor time for termination of a successful search for a specific problem with a specified degree of precision. This indicator is highly dependent on the type of computer used and the coding of the algorithm.

• Robustness. The ability to solve a large variety of problems in a reasonable time with a specified precision.

• Number of function evaluations. A count of how many times the objective function and/or constraint functions are called during execution. There is not even consistent agreement as to how the number of function evaluations required to find a solution should be counted, so the numbers quoted are often not comparable. This is an alternative measure of efficiency, which is independent of the type of computer used in the test.

• Number of iterations. The number of iterations to reach convergence for iterative algorithms. This indicator is not influenced by the computer used, but it is not necessarily comparable between different algorithms.

• Basic operation count. How many basic computer numerical operations (or flops) are performed. Will be influenced by the code and may even depend on the type of computer used.

• Numerical accuracy. The deviation of the attainable solutions from the "true" or analytical solution. With dimensionally large and complex test problems the true solution may not be known.

• User friendliness. Simplicity, convenience of input format, ease of understanding of the output both during the search and at termination. A crucial component for practitioners, usually overlooked, is the time required by an occasional user to familiarize himself with the theoretical basis of the methodology and (re)acquire implementation skills.

• Reliability. The ratio of the number of successful solutions found to the total number of problems attempted.
Wrong solutions involve either convergence on a non-optimal point, violation of constraints, failure to meet the convergence criterion due to excessive computation time, or search termination because of fatal computational errors or numerical overflow.

• Problem dimensional capability. Expressed as the maximum number of variables and equality and inequality constraints which can be handled without substantially reducing the performance indicators, while the computational effort required remains within acceptable or feasible bounds.

• Complexity. The ability to handle ill conditioned, indefinite and degenerate problems.

• Sensitivity to starting point. The effect of the choice of starting point on the success of the search.

Most attempts to combine the above criteria have been purely qualitative. Schittkowski produced one of the rare attempts to give a quantitative interpretation to the relative weight of each criterion [Schittkowski, 1980]. He adopted nine criteria and assigned the three sets of weighting factors given in Table 6.1 below. The choice of the criteria set depended on the relative importance of the various criteria to three distinct types of users. The actual nature of these types of users was not described, but the
emphasis on ease of use for type II users suggests that this is most applicable to the non-specialist.

                                                       Alternate weight
   Performance criteria                                  I     II    III
   Efficiency (speed)                                   .32   .18   .14
   Reliability                                          .23   .18   .36
   Global convergence                                   .08   .08   .20
   Ability to solve degenerate problems                 .05   .03   .03
   Ability to solve ill conditioned problems            .05   .06   .03
   Ability to solve indefinite problems                 .03   .03   .03
   Sensitivity to slight problem variation              .03   .03   .06
   Sensitivity to position of the starting point        .07   .06   .06
   Ease of use                                          .14   .35   .09

Table 6.1: Schittkowski's NLP performance criteria

6.2 Limitations of Schittkowski's performance criteria

A principal shortcoming of Schittkowski's performance criteria is that they do not address some of the currently important issues which affect the appeal and practicability of nonlinear optimization for the practising engineer, while considering others which are of little relevance. Limitations also arise from the fact that the criteria sets were designed to evaluate local optimization methods with continuous functions only. Therefore these sets of criteria and weights need to be extended, and the weighting of each criterion needs to be adjusted to reflect contemporary needs and economic realities. Schittkowski's criteria weights were proposed when mainframe computers were the only computing resource and the cost of computation, which was directly proportional to CPU time, was the dominant expense. The development of microcomputers has substantially lowered the cost of computation. Furthermore, the marginal cost of computing on microcomputers can approach zero on machines which primarily serve other purposes and are severely under-utilized over a 24 hour day.
Thus the weighting of efficiency should be considerably lower today.

6.3 Recommended set of evaluation criteria

The purpose of a criteria set should be to serve as an aid to identifying the best optimization method for a particular class of user. Therefore, as many of the relevant practical considerations as possible should be addressed by the criteria set. The factors considered important in this thesis for NLP performance evaluation are as follows.

Reliability:

One of the most important qualities of an optimization method is its ability to reach a successful solution. A solution is successful when the final solution is achieved without numerical failure and without any violation of the constraints. Reliability is defined as the ratio of the number of successful solutions obtained to the total number of problems attempted. High reliability makes a method applicable, with confidence, to the widest possible range of problems; hence a high weight is given to reliability in the criteria set.

Global convergence:

Many civil engineering optimization problems are nonlinear in their formulation, which can lead to multiple local optima. Since the difference between local and global optimal solutions can be considerable, there is always the need to identify the global optimum. Therefore, a considerable weight is assigned to the capacity to guarantee finding the true global solution.

Multiple global solutions:

A practitioner would definitely be interested in identifying all possible global optimal points, or at least being given some indication of their possible existence. These points may give designers and decision makers the flexibility to choose from significantly differing alternate optimal decisions.
Consequently, the ability to identify multiple global optima is desirable and should have a high weight amongst the performance criteria.

Near optimal solutions:

In many cases nonlinear problems have an undulating objective function surface which results in multiple local optima. All of the local optima may not necessarily be of great interest to a designer, but those local points which produce objective function values close to the global optimum can, from a practical standpoint, be of equal interest to the global optimum. Therefore, a considerable weight should be assigned to near optimal solution identification.

Discrete variables:

Many civil engineering problems involve discrete variables as well as mixtures of discrete and continuous variables, so the capacity to handle integer variables should be given some weight.

Ease of use:

Ease of use and interpretation of the output are major factors in choosing and using an optimization method. The general NLP user cannot be expected to know the meaning of all of the terminology used in numerical analysis. Terms like Kuhn-Tucker conditions and Hessian matrix are often alien to him, and there is a natural reluctance to adopt a method if the important features of the output data are not immediately understood. A friendly input-output format in a language the designer can understand is appealing. In practice, ease of use is related to expenditures of an engineer's time and hence to the cost of implementation. This includes the time an engineer needs to learn and familiarize himself with the theory and implementation of the method and to interpret the results. In this thesis, the highest weight in the performance criteria set is dedicated to user friendliness.

Speed of convergence:

As discussed in section 6.1, speed of convergence should no longer be considered a major factor. It used to be important when the cost of computation was dominated by CPU time on a mainframe computer.
The fact that some NLP methods can be run on personal computers has changed the relationship between cost and computation time. Microcomputers, which are often under-utilized, can be left to run for the whole day at a computational cost not far from that of the electrical power used. Moreover, the speed of computing has continued to increase. This has further diminished the importance of speed or efficiency as an indicator of NLP method performance. However, the number of function evaluations, which does not depend on the type of machine used, can serve as a relative indication of speed of convergence.

After considering the above factors, a new proposal for an NLP performance evaluation criteria set, along similar lines to those originally proposed by Schittkowski, is given in Table 6.2. This table is an extended version of Table 6.1. The criteria set and weights recommended are based on personal experience gained in using a variety of NLP techniques in this research. Each figure in the last column indicates the author's view of the relative importance of the various criteria for the present day needs of a practitioner engineer doing only occasional optimization.

                                                       Mainframe 1980      PC 1993
                                                      Alternate weight
   Performance criteria                                  I     II    III
   Efficiency (speed)                                   .32   .18   .14      .03
   Reliability                                          .23   .18   .36      .21
   Global convergence                                   .08   .08   .20      .15
   Ability to solve degenerate problems                 .05   .03   .03       \
   Ability to solve ill conditioned problems            .05   .06   .03        | of negligible
   Ability to solve indefinite problems                 .03   .03   .03        | importance
   Sensitivity to slight problem variation              .03   .03   .06       /
   Sensitivity to position of the starting point        .07   .06   .06      .03
   Ease of use                                          .14   .35   .09      .25
   Integer handling                                      --    --    --      .12
   Near optimal solution identification                  --    --    --      .08
   Multiple optima identification                        --    --    --      .13

Table 6.2: Recommended NLP performance criteria

6.4 Evaluation example: LSP versus GRG2

The performance of LSP under the proposed criteria set was not investigated in any systematic or formal way over the more than 200 test problems investigated in this research. Only a limited example of performance evaluation is given here, comparing the performance of LSP against a leading gradient method, GRG2, using the proposed criteria set. A single test problem [Problem 4-12-1-5, Appendix A] is used in this example. This test problem was used by Schittkowski to evaluate the performance of a variety of NLP methods including GRG2 [Schittkowski, 1980]. In Schittkowski's work, the performance under each of his criteria was evaluated on the basis of 10 different runs. The same problem was run ten times using LSP. The results showed that all of the LSP runs converged at a single global optimum. The performances and scores of the two methods derived from the test problem are given in Table 6.3. The performance for each criterion is assigned values between 0.0 and 1.0, and scores are calculated as (weight x performance). The weights adopted here are those cited in Table 6.2, section 6.3. The total score, which can take a value between 0.0 and 1.0, is a comparative indicator of the overall performances of the two methods.

In Table 6.3, a performance value of 0.0 is assigned for efficiency to LSP and a value of 1.0 to GRG2. This is because the number of function evaluations to reach search termination with LSP is quite high compared to that reported for GRG2.
On the other hand, it is assumed that the time an engineer needs to acquire a working knowledge of the gradient method and interpret its output is about four times the time required with LSP. Therefore, the performance values for ease of use are 1.0 and 0.25 for LSP and GRG2 respectively. The performance values for reliability and global convergence for GRG2 are from [Schittkowski, 1980].

                                                              Performance        Score
   Performance criteria                           Weight     GRG2     LSP     GRG2     LSP
   Efficiency (speed)                              0.03      1.00     0.00    0.03     0.00
   Reliability                                     0.21      0.867    1.00    0.182    0.21
   Global convergence                              0.15      0.654    1.00    0.098    0.15
   Sensitivity to position of the starting point   0.03      0.00     1.00    0.00     0.03
   Ease of use                                     0.25      0.25     1.00    0.065    0.25
   Integer handling                                0.12       --      1.00     --      0.12
   Near optimal solution identification            0.08       --      1.00     --      0.08
   Multiple optima identification                  0.13       --      1.00     --      0.13
   Total score                                                                0.435    0.97

Table 6.3: Comparison of GRG2 and LSP using the proposed criteria set

In general, with the proposed criteria set and weighting scheme, LSP showed superior performance over GRG2. The main factors for LSP's high performance rating are as follows.

• It is particularly easy to understand the output from LSP as it is expressed almost entirely in terms of the problem variables and their values (see Chapter 7 on LSP computer implementation).

• Almost all attempted problems were solved to a reasonable accuracy with LSP, but GRG2 failed to find solutions in about 13.3% of the cases.

• Multiple optima are clearly identified by LSP, while GRG2 identifies only a single solution at each run.
• A set of near optimal points always accompanies the global optimum solution with LSP, but only one local optimum is provided by GRG2 at the end of each search.

Efficiency still dominates many people's view of the value of an NLP method, and even though LSP was given a zero score in this category in the above evaluation, the number of function evaluations with LSP was, for most of the cases, of the same order of magnitude as with GRG2. Only a short time was required to solve these small sized problems. For example, with LSP, the time taken to solve the two variable Rosenbrock function was about 2.86 seconds [Problem 2-7-0-0, Appendix A], a 3 variable fuel allocation problem [Problem 3-14-0-1, Appendix A] took about 17.53 seconds, and about 90 minutes was taken to solve a 15 variable problem with 29 constraints [Problem 15-1-0-29, Appendix A]. These times were obtained on an 80386-33 MHz microcomputer and were averaged over five different runs.

Because of the difficulties in finding representative test problems and universal performance measures, a single quantitative assessment of the performance of an NLP method does not appear to be possible. In addition, ease of use, which is an important factor in determining whether an NLP package is ever to be adopted by non-specialist users, is necessarily a subjective issue.

This chapter has provided a reassessment of Schittkowski's set of qualitative performance criteria to reflect the present day computing environment and the needs of an engineer practitioner. Although no systematic and complete evaluation of a number of NLP packages was attempted in this research, it is evident that LSP has characteristics which make it worthy of serious consideration as a tool for nonlinear optimization.

Chapter 7

LSP COMPUTER IMPLEMENTATION

LSP was written in Quick Basic.
This language was adopted for convenience in the early stages of the research with the assumption that a more powerful language would be adopted when the need arose. It transpired that LSP does not place any special programming or computational demands, so that, for developmental purposes, there was no incentive to change from Quick Basic. While it eventually became apparent that the graphical interface could be a unique and attractive feature of LSP, it also placed minimal demands on the graphics capabilities of Quick Basic. The only justification for reprogramming would simply be to maximize computational speed and refine the appearance of the graphics displays.

LSP executes the optimization process iteratively, giving results at the end of every iteration. This intermediate output is available as numerical data and also in the form of graphical displays. When the convergence criterion is satisfied, the program terminates and gives the points in the acceptance set and their corresponding objective function values. All of the intermediate and final results can also be saved in a file.

Normally LSP does not require any adjustment to its parameters; it performs the whole search automatically and outputs the intermediate and final results. Modification of penalty terms, tightening of relaxed constraints and adjustments to the number of points in the acceptance set are usually done without user interference. The predominant user interaction is in the form of requests for specific information during the search, such as displaying particular types of plots or sending information to a printer or to a file, or when the search experiences difficulties.

7.1 Subroutines

The LSP package subroutines are categorized into three main groups.
The first group consists of the problem definition routines, which contain the objective and constraint function code and the LSP search parameter values, and which also process user instructions for any LSP parameter changes. The second group consists of the routines which perform the optimization. The third group consists of the output routines, which generate the graphics displays and numerical solutions at various stages of the search.

LSP lends itself to programming as a set of short subroutines. The purposes of each subroutine are discussed in more detail below. A schematic representation of the general functions and operations of each set of subroutines is given in Table 7.1.

Problem definition subroutines:

The user is required to furnish information concerning the problem being solved. The subroutines where this information is coded or entered are:

CONSTANTS: All LSP parameters and constants describing the problem size are stored in this subroutine. These include the number of points in an acceptance set, the value of the termination criterion, the value of the clustering criterion, values of the skewness adjustment parameters, and the number of variables in the problem formulation.

INITBOUNDS: The minimum and maximum possible values for each variable are coded into this subroutine, and on execution these values are passed to the main program to define the initial cuboid.
PROBLEM DEFINITION
  • formulate objective function
  • formulate set of constraints
  • specify variable bounds

OPTIMIZATION
  Initialization
    • generate feasible points
    • evaluate moments
    • specify level set value
  Algorithm
    • check for termination
    • revise level set value
    • discard bad points
    • define current cuboid
    • generate new feasible points

OUTPUT
  • Display:
    - points in the current cuboid
    - current level set value
    - current cuboid boundaries
    - computational effort expended
    - scatter plots (in both initial and current cuboid)
    - efficiency and search progress curves
  • send information to a printer and/or a file

Table 7.1: LSP's problem definition and operations

CONSTRAINTS: All constraints are coded in this subroutine. It receives the x vector for a point and returns a binary value indicating feasible or infeasible. Constraint evaluation at a point is terminated when the first violated constraint is detected, so that generally, on average, only half the constraints will need to be checked before rejection occurs.

OBJECTIVE: The objective function is coded in this subroutine. It receives the x vector and returns the value of the objective function.

Optimization subroutines:

This set of subroutines performs the actual optimization task.

INITKEEP: In this subroutine trial points are generated, points are checked for feasibility by calling the CONSTRAINTS subroutine, and the objective function is evaluated by calling the OBJECTIVE subroutine. The number of confirmed points generated is controlled by the value set for Nkeep.

UPDATEMSD: The mean and variance of the objective function values at the points in an acceptance set are evaluated. A new level set value c is assigned for the next iteration.

SWAPSKEW: Points with objective function values greater than the new level set value c are discarded. A new cuboid, which contains those points fulfilling the level set condition, is defined.
Modifications of the new cuboid, in the form of stretching and skewness adjustments, are made when the necessary criteria are met.

FILLKEEP: More points are generated in the new cuboid to replace those points discarded at the new level set value c. The generated points are checked for feasibility and against the new level set value c before they are accepted.

Output subroutines:

Some of the subroutines give the intermediate and final results as output. The output is given in the form of numerical values and/or graphical displays. The user has the option to supply the format required for the output.

REVIEWKEEP: This subroutine prints intermediate and final numerical results on the screen. The current level set value c, the best point x vector found so far and its objective function value, the total number of function evaluations and the modified variance of the points in the acceptance set are displayed after every iteration.

KEEPOUT: This subroutine prints the decision variable values and objective function values of the confirmed points in the acceptance set on the screen at any time requested. It is activated at any time, while the search is in progress, by pressing a function key.

DATA: This subroutine sends the decision variable values and the objective function values of points in the current acceptance set, the bounds of the current cuboid, the number of function evaluations expended so far and the current level set value c to a file whenever requested. Data transfer is activated by pressing a function key.

PLOTS: Generates the various types of screen plots discussed in this thesis. Some of them are prompted by pressing a specific function key, and others are displayed automatically after each iteration. The Nf versus iteration and c versus iteration plots are always displayed after every iteration.

SCRGRAB: This subroutine sends graphical information to a specified file by saving the currently displayed screen.
It can be invoked either by pressing a function key or automatically after every iteration.

CLUSTER: The clustering analysis subroutines calculate the dendrogram using the confirmed points and display the distance coefficients on the screen. The user is asked to specify the number of clusters extracted on the basis of the dendrogram. Once the clusters are identified, the confirmed points are reclassified into the different clusters and the cuboid bounds for each cluster are calculated.

The LSP program framework can readily incorporate auxiliary subroutines which solve major tasks associated with a specific type of problem. For example, the cluster analysis routine, the Hardy Cross relaxation routine for solving pipe flow problems, and frame and structural truss analysis routines have all been incorporated into the LSP program in the form of subroutines when solving test problems.

7.2 Confirmed point generation strategy

A point has to be both feasible and produce an objective function value less than or equal to the level set value c to enter the acceptance set. Guaranteeing the feasibility of a point before evaluating the objective function, or guaranteeing the level set condition before checking feasibility, leads to two different strategies of confirmed point generation. These different strategies can affect the efficiency of the search depending upon the nature of the optimization problem being solved. Even though it is not necessary to check point feasibility for unconstrained problems, it is relevant, in the following discussion, to consider both constrained and unconstrained problems. Constrained problems can yield effectively unconstrained interior solutions as the search progresses. At some stage of the search the current cuboid might remain entirely within the feasible region; consequently, from the LSP perspective, the problem has become unconstrained.
Confirmed point generation in such a situation is discussed below under the title 'Interior solutions with multiple optima'. Two distinct situations and their recommended point confirmation strategies are discussed here.

• Interior solutions with multiple optima:

How often trial points generated in the current cuboid violate the level set condition is inversely proportional to the volume ratio of the acceptance set to the current cuboid. In the common situation of global solutions lying at interior stationary points, the rejection of trial points due to violation of the level set condition increases as the search approaches the optimum value. The reason is that the volume of the cuboid remains close to constant while the acceptance set volume reduces to small regions surrounding the optimal points. This situation lowers the efficiency of the search.

The efficiency is therefore often related to the closeness of the level set value c to the optimum value, but it is also affected by the dispersion of the optimal points in the decision space. For example, for a two variable problem with three global optimal points, the efficiency is low if these three points are on a straight line and the line is parallel to one of the variable axes. The efficiency decreases further when the line joining the three points sits on a diagonal of the cuboid.

As an experiment, a problem was chosen with the acceptance set volume at the k-th iteration occupying 10% of the current cuboid volume. This meant that, on average, 10 trial points were being generated to get a single confirmed point. The number of function evaluations necessary to generate 10 confirmed points was then investigated.

The two strategies were examined. In the first case, a feasible point was generated and then the objective function calculated. In the second case, the steps were reversed: a point which fulfils the level set condition was first generated and then its
feasibility checked. The same experiment was repeated after a few more iterations, when the volume of the acceptance set had been reduced to only 1% of the current cuboid. The numbers of function evaluations for the two strategies at these two iterations are given in Table 7.2.

a. When the acceptance set is 10% of the cuboid volume

   Strategy: check feasibility first       Strategy: evaluate objective first
   Check feasibility       10 x 10 = 100   Evaluate obj. function  10 x 10 = 100
   Evaluate obj. function  10 x 10 = 100   Check feasibility                  10
   Sum                               200   Sum                               110

b. When the acceptance set is 1% of the cuboid volume

   Strategy: check feasibility first        Strategy: evaluate objective first
   Check feasibility       10 x 100 = 1000  Evaluate obj. function  10 x 100 = 1000
   Evaluate obj. function  10 x 100 = 1000  Check feasibility                    10
   Sum                                2000  Sum                                1010

Table 7.2: Comparison of Nf for two different point confirmation strategies

The above examples support the general experience with the test problems that, if feasibility is checked before the level set condition, the number of function evaluations required for search termination is about twice the number required for the reverse strategy. This advantage may not necessarily be enjoyed for all problems solved with LSP, but the reverse strategy appears to be the best default in this situation.

• Constrained problems - general case:

Inefficiency arises when the ratio of the acceptance set volume to the cuboid volume is low. The acceptance set for a constrained problem is governed by both the set of constraints and the level set condition. Which of the two factors, feasibility or the level set condition, is the more likely to cause rejection of a trial point depends upon the nature of the objective function and the complexity of the feasible region in a specific problem.

In general, the minimum total number of function evaluations is always achieved by checking the dominant rejecting factor first.
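The trial-point accounting behind Table 7.2, and behind Table 7.3 below, can be sketched in a few lines. This is an illustration only (LSP itself was written in Quick Basic): it assumes trial points are sampled uniformly in the current cuboid, so the volume fractions act as acceptance probabilities, and the function name `expected_evaluations` is hypothetical, not part of the LSP code.

```python
def expected_evaluations(n_confirmed, p_feasible, p_levelset, p_accept):
    """Expected function-evaluation counts (each feasibility check and each
    objective evaluation counted as one evaluation) needed to confirm
    `n_confirmed` points under the two generation strategies.

    p_feasible, p_levelset and p_accept are the volume fractions of the
    feasible region, the level set, and their intersection (the acceptance
    set) within the current cuboid.
    """
    n_trials = n_confirmed / p_accept  # trials needed to confirm n_confirmed points
    # Strategy 1: check feasibility first; the objective is evaluated
    # only at the feasible trial points.
    feasibility_first = n_trials + n_trials * p_feasible
    # Strategy 2: evaluate the objective first; feasibility is checked
    # only where the level set condition holds.
    objective_first = n_trials + n_trials * p_levelset
    return feasibility_first, objective_first

# Interior case of Table 7.2a: all trial points feasible, acceptance set 10%.
print(expected_evaluations(10, p_feasible=1.0, p_levelset=0.1, p_accept=0.1))
# -> (200.0, 110.0)

# Constrained case of Table 7.3: feasible 30%, level set 80%, acceptance 20%.
print(expected_evaluations(10, p_feasible=0.3, p_levelset=0.8, p_accept=0.2))
# -> (65.0, 90.0)
```

Whichever fraction is smaller identifies the dominant rejecting factor, and checking it first minimizes the expected total, in agreement with the tabulated counts.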
In practice this means checking feasibility first in the case of highly (tightly) constrained problems, and evaluating the objective function first for problems whose objective function surface is very irregular. Unfortunately, identifying the dominant factor at any stage of an LSP search is not always an easy task.

Consider an example where, at some stage of the search, the feasible region and the level set occupy 30% and 80% of the current cuboid volume respectively. Assume that the acceptance set, the intersection of the two regions, is 20% of the current cuboid. Table 7.3 shows the number of function evaluations needed, in this example, to generate 10 confirmed points using the two different strategies. The strategy which generates feasible points before evaluating the objective function required fewer function evaluations, since the feasible region volume is smaller than the level set volume.

   Strategy: check feasibility first       Strategy: evaluate objective first
   Check feasibility        5 x 10 = 50    Evaluate obj. function   5 x 10 = 50
   Evaluate obj. function 1.5 x 10 = 15    Check feasibility        4 x 10 = 40
   Sum                                65   Sum                                90

Table 7.3: Comparison of Nf for the two strategies on a constrained problem

Checking the feasibility of a single point might consume significant CPU time if the constraints are many and complicated. For example, a 4 variable problem with 3 inequality constraint functions [Problem 4-1-0-3, Appendix A] was used to assess the typical time spent checking feasibility and evaluating the objective function. The average time taken to calculate the constraint functions at a feasible point was about 2.64 times the time required to evaluate the objective function. However, no matter how long the computation takes, it is simply counted as a single function evaluation in this research (a rationale for this was discussed in Chapter 5, section 5.1.2).
Therefore the relationship between CPU time and number of function evaluations is not always linear.

The same test problem was used to compare the total number of function evaluations needed to meet the convergence criterion with both strategies, i.e. checking feasibility before evaluating the objective function, or the reverse. The problem was run 10 times for each case. The results showed that the strategy which evaluated the objective function prior to checking feasibility required 30.7% more function evaluations than the strategy which generated feasible points before evaluating the objective function.

In the discussion above, the superiority of one strategy over the other is based purely on the total number of function evaluations. From the experience gained with the test problems, it was generally found better to check feasibility before evaluating the objective function. This was basically due to the computational effort required to evaluate a set of constraint functions for most of the test problems. It is possible to switch from one strategy to the other without difficulty, so that a choice can be made after appraising the relative computational complexity of the various functions involved in a specific problem.

7.3 Intermediate results presentation

After each iteration, the new and improved level set value c, the best point observed so far, and the bounds of the new cuboid are all displayed on the screen. The results at any iteration will always show some improvement over the previous iteration. Even if the program is interrupted in the middle of the search, the effort expended is not wasted, as the intermediate solution can be used to provide initial cuboid data for a subsequent renewed attempt to find the solution.

There may be justification for modifying the values of some LSP parameters on the basis of the display indicating search performance trends.
If it is necessary to make any changes, the search can be paused, the necessary changes made and the search then resumed.

All points in the acceptance set, the objective function value at each confirmed point, the level set value c and the number of function evaluations at every iteration are stored in the computer memory. At any stage of the search, the results provided at every iteration, from the beginning of the search up to the current stage, can be displayed and reviewed without stopping the search.

A variety of information is given by the plots at every iteration. The set of scatter diagrams of points on a two variable plane is displayed by pressing a function key. The scatter diagrams are plotted with a choice of axis lengths: either the initial variable bounds or, alternatively, the current cuboid lengths. The c versus iteration and Nf versus iteration plots are automatically displayed after every iteration.

If the intermediate results, including the plots, give an indication that there are multiple optima and efficiency is low, it is better to stop the search and divide the current cuboid into subregions so that the search can be carried out in each subregion separately. Such user intervention increases the overall efficiency. LSP could be programmed to make the necessary changes automatically, without the need for sophisticated detection routines. It must be stressed that the need for such adjustments can be recognized by even a moderately experienced user.

7.4 Final output

The final output from LSP gives the global optimal point(s) and their corresponding objective function values, together with a set of alternate near-optimal points whose objective function values do not exceed the final level set value c. This information is available both numerically and graphically.
Certain performance indicators, like the total number of function evaluations and the number of iterations, are given as supplementary information.

7.5 Alternative presentation of scatter plots

The most informative graphical representation of intermediate and final results is in the form of the scatter plots of confirmed points in the xi-xj planes. When dealing with problems involving more than two variables, a set of scatter plots is required to represent the points in every plane of the variables. Various ways of presenting these plots have been assessed during the course of this research. Two methods of scatter plot presentation on a computer screen are discussed in the next subsections under static and dynamic plots. One of the methods utilizes the static plot, where no logical connection between points is displayed. The alternative dynamic method allows removal of some of the points from the display, a change to the scale of a plot, or the labelling of points having something in common (for example, a point identified in one plot showing up in the same colour in all other displayed plots).

7.5.1 Static plots

For a problem involving n variables there will be n(n-1)/2 scatter plots, each showing one xi-xj plane. These plots are ideally juxtaposed in a single screen frame. A maximum of 14 plots have been displayed at any one time on a standard 14" VGA display. For higher dimensional problems requiring more than 14 plots, the user is either expected to choose a subset of the planes of interest or to use a higher resolution display. A page display feature could easily be implemented if necessary.

In some cases there may be a practical, problem related need for identifying a specific confirmed point, or a subset of confirmed points, in one of the scatter plots and tracing them in the other plots. One example arises when one is interested in tracking solution points which lie within a preferred area of the search domain.
A second need, which might arise when the level set value c has fallen to an acceptably low value, is to track points which were once confirmed in the acceptance set and were subsequently discarded in the search process when they did not fulfil the current level set condition.

Tracking a single point with some particular numerical property which can be detected within the computer program was implemented. An example of this was the tracking of the current best point in the acceptance set. A special display symbol (+) was assigned to the current best point, and this can be seen in Figure 8.3, Chapter 8, for a six-variable problem, where it is discussed in connection with sensitivity analysis. Experience with this feature and the many test problems showed that, at any stage of the search, the current best point rarely lies close to the centre of the cluster in any of the x_i - x_j plots.

The greatest insights into the problem being solved can be provided by the scatter plots of confirmed points at any stage of the LSP search. For example, when dealing with problems involving more than two variables, the user might have a preference for solution points within a certain range for one or two of the key variables. His interest will then be to identify the corresponding points in the other planes.

The implementation of point identification schemes becomes more complicated when there is an interest in identifying groups of points at the same time. Some multiple point detection schemes were implemented to investigate the kinds of information which could be inferred from visually identifying points which had been discarded at previous iterations. Points discarded in each of the last 5 iterations, identified by iteration, and those surviving in the current acceptance set were displayed on the same plot.
Points discarded at each iteration were thus considered as members of a single group. Several techniques were tried to distinguish visually between points in each group. The first attempt used different colours for different point groups, and the second used different symbols for points in different groups. Neither attempt gave meaningful results after the first few iterations: in the latter stages of an LSP search, points from different groups overlap so that their differences become indistinct.

A slightly modified approach, in which only the points discarded at one iteration are displayed at any time, was also implemented. The set of plots for a single group was displayed in a single frame. Each frame remained on the screen for only a short time (about a second) and the display cycled through all the frames. This animated approach did not appear to have any particular value.

To overcome the difficulties mentioned above, and to exploit the benefits of scatter plots, it is better to use dynamic graphs instead of static graphs. In dynamic graphics, a viewer can see changes which have occurred over time. This improved approach to presenting scatter plots is explained in the next section.

7.5.2 Dynamic plots - Alternagraphics

Alternagraphics [Becker et al., 1987] is a technique which allows the viewer to interactively select some points from a set of scatter points and then view the selected points displayed in any number of corresponding two-dimensional displays. Another feature is that the displays can be modified rapidly as the user explores the data set, so that primitive animation effects are achieved. The purpose of alternagraphics is to convey information which is not easily perceived in static displays. It is, in effect, just another rather specialized form of computer graphical user interface.
Because LSP generates a large set of solution points at every iteration, as well as in the final solution, alternagraphics is far more exploitable with LSP than with any of the more conventional nonlinear optimization techniques. Alternagraphics can be implemented in three different ways, provided that the points plotted on the screen can be categorized into various groups depending on their importance to the viewer or on other numerically identifiable characteristics.

The first approach displays points belonging to successive acceptance sets for only a short time, automatically cycling through the most recent sequence of iterations. The second approach displays all data points on the screen simultaneously in a number of two-dimensional plots, but highlights the points belonging to a particular set; points in a series of sets are then highlighted one after the other. The third approach provides the user with the capability of turning any set ON or OFF rapidly. When a set is turned ON, either its points are highlighted and other points remain unchanged, or only those points turned ON are shown and all the other points are erased from the screen. The ON or OFF procedure can be initiated from a displayed menu, by pressing a function key, or by clicking a button on the mouse.

The third approach mentioned above, where the user has the freedom to choose point sets interactively, appears to offer the greatest benefits in engineering practice. For example, in solving engineering problems with LSP, the engineer might be interested in design solutions which can be achieved within a specific value range of one of the design variables. Using the scatter plot of points in any of the two-dimensional displays including the variable of interest, the points which lie in that specific range can be selected and turned ON. The display of the corresponding (i.e. same) points in all the other planes will then be highlighted automatically.
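The linked selection just described, turning ON the points that fall in a value range of one variable and highlighting the same points in every other panel, reduces to bookkeeping with row indices. The sketch below is a hypothetical illustration (not the thesis software): confirmed points are rows of an n-variable table, and a selection made in one plane is a list of indices that any other panel can highlight.

```python
# Sketch of linked "turn ON" selection across scatter panels (illustrative
# helper, not part of LSP): a selection is a list of row indices shared by
# every two-dimensional panel.

def select_on(points, var, lo, hi):
    """Return indices of confirmed points whose variable `var` lies in [lo, hi]."""
    return [k for k, p in enumerate(points) if lo <= p[var] <= hi]

# Four confirmed points of a hypothetical 3-variable problem.
points = [(0.2, 1.5, 3.0), (0.9, 1.1, 2.0), (0.25, 0.4, 2.5), (0.8, 0.9, 1.0)]

on = select_on(points, var=0, lo=0.0, hi=0.5)   # chosen in the x1-x2 panel
# The same indices are highlighted in every other panel, e.g. the x2-x3 panel:
highlighted_x2_x3 = [(points[k][1], points[k][2]) for k in on]
print(on)                  # [0, 2]
print(highlighted_x2_x3)   # [(1.5, 3.0), (0.4, 2.5)]
```

Because the selection is stored as indices rather than coordinates, turning a set OFF or re-colouring it in all panels is a constant-time display operation.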
This helps the engineer to visualize the distribution of the selected solution points in the other design variable domains.

Some of the main features of dynamic graphics discussed extensively in [Becker et al., 1987] are reviewed here to further elaborate on their use and implementation with LSP.

Deletion: When a single point distorts the scale of a scatter plot, the outlier can be deleted from the plot by simply clicking the mouse cursor at that particular point. Once the outlier is removed, the plot is instantaneously re-scaled and the points appear on the screen with the new axis ranges. Points which had been crammed into a small region are now well dispersed, improving the resolution of the plot. Figure 7.1 demonstrates the improvement of plot resolution by removing the outlier at the upper right corner of Figure 7.1(a).

Figure 7.1: Deletion of a point in dynamic graphics. (a) Points crammed in one area; (b) Improved resolution.

Linking: Suppose there are n(n-1)/2 scatter plots representing the N points in an acceptance set of an n-variable problem, each plot in an x_i - x_j plane. Linking visually connects the corresponding points in the different scatter plots. Certain points are chosen in one of the scatter plots by clicking the mouse on each point, and the linking procedure then highlights them on all the other plots. Consider, for example, in a 4-variable problem, the two scatter plots in the planes x1 - x2 and x3 - x4, and suppose we want to identify the same points in these two scatter plots. Some points are first chosen in the x1 - x2 plane; the same points are then automatically identified in the x3 - x4 plane by highlighting or a similar labelling technique. Joining points therefore means visually linking the point (x1, x2)_k on the first plot to (x3, x4)_k on the second plot, where k indicates the point number.
Linking of points is illustrated in Figure 7.2, where the highlighted points in Figure 7.2(a) and (b) represent the same four points in two different planes. Intermediate results of an LSP run for a 4-variable problem [Problem 4-1-0-3, Appendix A] were used to draw the points in Figure 7.2. In this instance the highlighted points correspond to the current 4 best points, which had been selected automatically by the LSP software. Linking is also a feature of static graphics displays and was described in Section 7.5.1.

Figure 7.2: Linking of points for a 4 variable problem [Problem 4-1-0-3].

Brushing: This is a dynamic method in which the user moves a small adjustable rectangle around the screen, with a mouse, in order to identify points within the rectangle. The rectangle is called a brush and each two-dimensional plot is called a panel. This technique is used for high dimensional problems which need more than one panel to show points in all planes of the feasible region.

There are two ways of displaying points which lie inside the brush. In the first, as the brush moves around the active panel, points within the brush are highlighted in all of the other panels. The second method is performed with all data points plotted on the active panel, but only points in the brush are shown in the other panels. Figure 7.3 is adapted from Becker to demonstrate the second approach of brushing, where only points in the brush are displayed in all the panels other than the active panel [Becker et al., 1987]. In this figure, ozone, radiation, temperature and wind speed designate 4 different variables.

Figure 7.3: Points in the brush highlighted in all the panels [Becker et al., 1987].

The brushing technique can also be used to delete points by pressing one of the mouse's buttons. Points inside the brush on the active panel are deleted, temporarily, along with the corresponding points on the other panels.

The shape and size of the brush can be changed to meet the requirements of the user. When there is a need to examine the effect of only one variable on the others, the brush can be adjusted to produce a long and narrow rectangle. Such a slender brush can also be used to give information on the nonlinear dependence of one variable on the others.

Alternagraphics enhances understanding of the nature of the problem being solved. Implementation of this information in engineering has yet to be fully investigated but, in conjunction with LSP, it appears to have considerable potential for supporting practical engineering applications of nonlinear optimization.

Chapter 8

SENSITIVITY ANALYSIS

Sensitivity analysis is dependent upon numerical information which can be derived directly from the optimal solution and obtained with minimal computational effort. For the established optimization methods this means that the sensitivity output is primarily associated with changes in the optimal solution due to variation in the numerical value of coefficients in the problem formulation, both in the objective function and in the right-hand side constant coefficients. This information can be important when the coefficients in a problem formulation are poorly defined or can in fact be adjusted.

In LSP, useful sensitivity information which is obtained without any additional analysis is different from that discussed above. It utilizes the acceptance sets with level set values close to the global optimal value to indicate how near-optimal solution points are distributed in the search domain.
Here the emphasis is not on the coefficients but on the values of the decision variables and the value of the objective function. The sensitivity information therefore indicates how much one can deviate from the optimal point(s) without the objective function exceeding its global optimal value by more than some prescribed amount. Revealing the distribution of multiple global optima in the decision domain might also be considered as a part of this kind of sensitivity analysis.

LSP sensitivity information is undoubtedly of practical value as it focuses on the influence of those decision, design, or activity variables over which the engineer exercises actual control and whose values he can therefore adjust. Furthermore, in some important applications of nonlinear optimization, conventional sensitivity analysis results which focus on variation of coefficient values are of no practical value. For example, in model fitting the decision variables are the parameters of the model and the coefficients are the experimental data.

There are a number of options for a model fitting criterion. Consider the case of a simple linear model where the optimization problem is to minimize the sum of the squares of deviations between observed and model-generated estimates of a dependent variable z. There are two independent variables x and y and m sets of observations of x, y and z. Each observation is identified by its subscript i. The problem is expressed as follows:

Minimize  Sum over i = 1 to m of (ẑ_i - z_i)²

where the model-generated data set ẑ is expressed as

ẑ_i = a * x_i + b * y_i

and the constraints are

a * x_1 + b * y_1 = ẑ_1
a * x_2 + b * y_2 = ẑ_2
a * x_3 + b * y_3 = ẑ_3
...
a * x_m + b * y_m = ẑ_m.

The two model parameters, a and b, must be established on the basis of the m observations of the independent and dependent variables.

In this problem x, y and z represent the already realized experimental data but also serve as coefficients in the constraint set, whereas a and b are the decision variables, i.e. the unknown model parameters whose values are being established. Because the coefficients in the optimization formulation are observed data sets, a sensitivity analysis which is concerned with changing the coefficients has little meaning. In contrast, an indication of the sensitivity of the objective function value to the values of a and b in the region of their optimal values could provide some useful insight into the robustness of the estimates of a and b. Sensitivity information of this type is readily provided by LSP.

The remainder of this chapter attempts to describe the generally interpretive nature of LSP sensitivity analysis, which is derived almost entirely from the graphical outputs. The interpretation of the graphical output is influenced by the finite nature of the point sampling and by topological characteristics of the various functions involved in the problem being solved. The sensitivity conclusions discussed below may therefore not necessarily be the most appropriate for all cases.

8.1 Sensitivity with LSP

Once the LSP convergence criterion is met, all points in the final acceptance set are global optima or provide near-optimal solutions. In cases where these points form a distinct single cluster and the difference between any two points approaches the numerical precision of the computer, the final acceptance set can be considered to represent a single global optimum point. The size of the final acceptance set is, however, normally determined by the convergence criterion VM, so that in many cases the confirmed points in the final cuboid may be scattered over a larger region. Then the difference between any two confirmed points is not simply governed by numerical precision but reflects a considerable difference between solutions yielding optimal or very near optimal solutions.
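The distinction drawn above, a single tight cluster versus confirmed points scattered across the final cuboid, can also be checked numerically rather than visually, for instance by comparing the largest pairwise distance among the confirmed points with a tolerance. The helper and data below are invented for illustration and are not part of the LSP software:

```python
# Diameter of the final acceptance set: a tight cluster suggests a single
# global optimum; a large diameter suggests multiple optima or a flat
# objective surface (illustrative sketch, invented data).
from math import dist  # Python 3.8+

def acceptance_diameter(points):
    """Largest pairwise Euclidean distance among confirmed points."""
    return max(dist(p, q) for i, p in enumerate(points) for q in points[i + 1:])

tight = [(1.000, 2.000), (1.001, 2.000), (1.000, 2.001)]
spread = [(1.0, 2.0), (1.0, 2.0), (4.0, -1.0)]   # two widely separated groups

print(acceptance_diameter(tight) < 0.01)    # True: single optimum indicated
print(acceptance_diameter(spread) > 1.0)    # True: multiple optima suspected
```

A diameter approaching machine precision corresponds to the single-optimum case described above, while a diameter comparable to the cuboid size signals the scattered case discussed next.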
Large distances between points suggest that the problem has either multiple optima or a very flat objective function surface in the region of the acceptance set.

In spite of the interpretations suggested above, and in the remainder of this chapter, one should exercise caution when associating the cuboid size and the distribution of points in the acceptance set with the number of distinct global optima. Even though the final cuboid is assumed to contain all globally optimal points, not all confirmed points in the final acceptance set may produce the global optimum value and, with the discrete sampling involved, some globally optimal points may not have been sampled.

The x_i - x_j scatter plots can be viewed with axis lengths determined either by the variable bounds established prior to running LSP (the initial cuboid) or by the cuboids at the later or final stages of the search.

8.1.1 Sensitivity interpretation of the confirmed points at convergence plotted in the initial cuboid

The plots of the confirmed points at convergence in the initial cuboid provide another sensitivity perspective on the LSP solution. Usually the scale of these plots, and the display resolution, will obscure the details. They do, however, reveal certain characteristics of the optimum solution from the perspective of the engineer and his problem. Because of resolution limitations at this scale, points which are close together will coalesce into a single point in the display, so that the visible points can be assumed to be global optimal points. Problems which have multiple optima often produce distinct scatter plots which can be meaningfully interpreted in various ways.
Figures 8.1(a) to (e) are used to demonstrate the interpretation of some typical final scatter plots, within the initial cuboid, for some two-variable problems.

Figure 8.1: Plots of confirmed points after the convergence criterion is met.

Figure 8.1(a): Suggests that there is only a single optimum in the region of interest and the optimum point lies around the centre of the search domain.

Figure 8.1(b): Suggests that there is a single optimum point but two of the variable bounds are probably acting as active constraints. From a practical point of view this suggests that, if it is possible, a relaxation of the bounds might produce a better result.

Figure 8.1(c): The plot clearly suggests the existence of two distinct global optimal points.

Figure 8.1(d): The plot suggests the possible existence of multiple optima, but one of the variables (x1 in this case) has the same, i.e. constant, value at all of the optimal points.

Figure 8.1(e): Indicates that in a specific region of the search domain the confirmed points can be approximated by a diagonal line, which implies a linear relationship between the two variables in the region of near optimality.

An actual screen dump for a six variable problem [Problem 6-2-0-6, Appendix A] is given in Figure 8.2 to demonstrate the kind of variability that can occur in the appearance of the 14 individual x_i - x_j plots. The plot was generated with a relatively high value of the convergence criterion to make the plots more distinctive. It shows, for example, that fixing the values of variables x1 and x2 might be justifiable to simplify the optimization problem, as this may not change the optimal solution significantly.
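The near-linear pattern of case (e) can be detected numerically as well as by eye, for example with the sample correlation coefficient of a variable pair over the confirmed points. The sketch below uses invented points and is not part of the LSP code:

```python
# Correlation of a variable pair across confirmed points: a value near +1 or -1
# signals the near-linear dependence of the Figure 8.1(e) case (invented data).
from math import sqrt

def correlation(us, vs):
    """Sample correlation coefficient of paired coordinate lists."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    cov = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    su = sqrt(sum((u - mu) ** 2 for u in us))
    sv = sqrt(sum((v - mv) ** 2 for v in vs))
    return cov / (su * sv)

# Confirmed points lying close to a diagonal line in the x1 - x2 plane.
x1 = [0.1, 0.4, 0.7, 1.0, 1.3]
x2 = [0.06, 0.21, 0.34, 0.52, 0.64]
r = correlation(x1, x2)
print(r > 0.99)   # True: a near-linear relationship between the pair
```

Such a test could, in principle, trigger the exploitation of a linear variable relationship automatically; in the thesis this recognition is left to visual inspection of the plots.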
Figure 8.2: Plots of confirmed points at convergence within the initial cuboid for a 6 variable problem (actual screen dump).

8.1.2 Response of the acceptance set to changes in the level set value c

Once LSP has been run and an optimal solution found, there may be value in investigating the sensitivity of the decision variables with respect to the value of the objective function; more specifically, in determining to what extent an objective function value which is inferior to the global optimum value permits a wider range of decision variable values. This is done by storing the acceptance sets at all iterations. Figure 8.3 shows the kind of effect which might be observed when c is raised above c* and the acceptance sets are plotted in the initial cuboid.

Figure 8.3: Plots of confirmed points within the initial cuboid for different c values. (a) c = c*; (b) c > c* (connected level set); (c) c > c* (partitioned level set).

Actual screen dumps of plots for the test problem [Problem 4-6-1-2, Appendix A] are given in Figure 8.4 and represent outputs at two level set values. Figure 8.4(b) shows points in the acceptance set for a level set value of 29.68, which is the global minimum. Figure 8.4(a) shows points in the acceptance set for the same problem but with the level set value c raised by 1% to 29.96, which coincides with c at iteration 8. As a result many new points are introduced into the acceptance set. Most of the new points are located far from the global optimum point, as shown in Figure 8.4(a).
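The effect just described can be imitated numerically by comparing the acceptance sets {x : f(x) <= c*} and {x : f(x) <= 1.01 c*} over a fixed sample of points. The objective function and sample grid below are invented for illustration and are unrelated to Problem 4-6-1-2:

```python
# Acceptance sets for two level set values on an invented flat-bottomed
# objective: raising c by 1% admits many additional, more scattered points.

def f(x1, x2):
    # invented two-variable objective with minimum value 1.0 at the origin
    return 1.0 + (x1 ** 2 + x2 ** 2) ** 2

grid = [(i / 10.0, j / 10.0) for i in range(-10, 11) for j in range(-10, 11)]
c_star = 1.0                    # global minimum value of f
accept_opt = [p for p in grid if f(*p) <= c_star]          # c = c*
accept_rel = [p for p in grid if f(*p) <= 1.01 * c_star]   # c = 1.01 * c*

print(len(accept_opt), len(accept_rel))
```

On this grid only the origin attains c*, while the relaxed level set sweeps in dozens of surrounding points, mirroring the jump in acceptance set size between Figures 8.4(b) and 8.4(a).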
The small increase in the objective function value has permitted a wide range of possible solution points, some of which may offer benefits not measured by the objective function and may therefore be preferred as practical solutions to the problem.

8.2 Other approaches to obtaining sensitivity information

Information similar to that obtained for coefficient values in conventional sensitivity analysis can also be obtained with LSP, though at some additional computational expense. When the sensitivity can be expressed in the form of a gradient at the optimal solution (e.g. Lagrange multipliers), then only a slight perturbation of a coefficient value is necessary to estimate the gradient by finite differences. A small change would almost guarantee that the acceptance set already obtained just a few iterations before convergence would provide an appropriate starting point for solving the revised problem. Thus only a small additional computational effort would be involved in solving for each perturbed coefficient.

Figure 8.4: Points in the level set, for c = c* and c = 1.01 * c* [Problem 4-6-1-2] (actual screen dump).

Chapter 9

CONCLUSION

9.1 Introduction

A global search scheme for nonlinear optimization problems based exclusively on level sets was first presented in "Integral global optimization" [Chew & Zheng, 1988]. It was this work that provided the starting point for the research in this thesis. The following summary of its shortcomings is not intended to question the value of Chew & Zheng's contribution in any way, but to provide a clearer perspective on the contribution of this thesis. As was suggested by the title, they placed considerable emphasis on the theoretical properties of integral expressions for the higher moments of the objective function values associated with solution points in level sets.
Much of this theory was found to have little or no bearing on the implementation of a level set based optimization scheme. Although the capability of identifying multiple optima was mentioned, the authors focused on a method for solving problems with a single global optimum only. Thus the potential of the methodology to solve multiple global optima problems, as well as to identify near-optimal solutions, was overlooked. A systematic performance assessment over a variety of challenging test problems was not included. The importance of the interpretation of intermediate and final output from a level set search, and the potential for a graphical interface, were not recognized.

Level sets have been used to augment some global optimization schemes, but the view of some of the authorities in that field has been that the use of level sets as the principal search tool is impractical. The level set method was labelled as being only a "theoretical" iterative scheme in [Horst & Hoang, 1990], and as being "not designed for solving practical problems" in [Törn & Žilinskas, 1988]. A level set based search falls into the class of iterative direct search methods of optimization. Some authors clearly feel that direct search methods are applicable only to low dimensional problems. Edgar & Himmelblau [Edgar & Himmelblau, 1988] suggest that "direct methods ... are not as efficient and robust as many of the modern indirect methods, but for simple two variable problems, they are satisfactory". Also surprising, in the light of experience in this thesis, is the view that optimization methods based on random sampling are suitable only for low dimensional problems or as devices for generating starting points for more sophisticated methods [Reklaitis et al., 1983].

Computational efficiency has dominated the development of nonlinear optimization methods over the past 30 years.
The negative views which have been expressed about direct search methods in general, and level set methods specifically, arise from computational efficiency concerns. As was discussed in Chapter 6 on performance assessment, ease of use is more likely to be the dominant criterion in the future, with computational efficiency only of secondary importance.

The research reported in this thesis set out to investigate the use of level sets in practical nonlinear optimization. As the investigation proceeded, many new, previously unexploited aspects of level set optimization were revealed and incorporated into the evolving implementation scheme. This implementation was eventually given the name Level Set Programming (LSP), as it appears to have the necessary characteristics, including a theoretically sound global convergence criterion, to qualify as a mathematical programming tool. It uses estimates of only the first two moments of the objective function values at the (discrete) points in the level sets to redefine the search domain at each iteration and to measure convergence on the optimum solution. LSP utilizes approximations of the true level set (or acceptance set in constrained problems) boundaries in the form of cuboids for efficiently generating points in the level sets. The combined effect of the progressively improving level set and the contraction of the search domain makes LSP an efficient and reliable global search procedure.

LSP can handle, without modification, both equality and inequality constraints, and also has provisions to incorporate existing techniques of optimization, such as penalty functions. It also makes use of established clustering analysis techniques to improve the search efficiency for problems involving widely separated multiple global optima.

Unlike many other optimization methods, the intermediate results generated during an LSP search provide a global view of the problem being solved.
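A minimal sketch of the iteration loop just summarized is given below: sample uniformly in a cuboid, confirm points with f(x) <= c, reset c to the mean of the confirmed values, shrink the cuboid to the bounding box of the confirmed points, and stop when the variance of the confirmed values falls below a tolerance. This is a deliberate simplification of LSP (no skewness adjustment, clustering, or constraint handling), and all tuning values are invented:

```python
# Simplified level-set iteration in the spirit of LSP (illustrative only).
import random

def level_set_search(f, bounds, n_keep=50, vm=1e-8, seed=1):
    random.seed(seed)
    lo = [b[0] for b in bounds]
    hi = [b[1] for b in bounds]
    c = float("inf")
    for _ in range(200):  # iteration cap
        # confirm n_keep points satisfying f(x) <= c inside the current cuboid
        pts = []
        while len(pts) < n_keep:
            x = [random.uniform(l, h) for l, h in zip(lo, hi)]
            if f(x) <= c:
                pts.append(x)
        vals = [f(x) for x in pts]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        c = mean                      # new, lower level set value
        # contract the cuboid to the bounding box of the confirmed points
        lo = [min(p[d] for p in pts) for d in range(len(bounds))]
        hi = [max(p[d] for p in pts) for d in range(len(bounds))]
        if var < vm:                  # convergence: objective nearly constant
            break
    best = min(pts, key=f)
    return best, f(best)

x, fx = level_set_search(lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2,
                         bounds=[(-5, 5), (-5, 5)])
print(x, fx)   # a point near [1.0, -2.0] with a value near 0
```

Even this stripped-down loop exhibits the two mechanisms emphasized above: the level set value c falls monotonically, and the cuboid contracts around the surviving points.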
This global view is in contrast with the single-path view provided by, say, gradient methods. Because a much larger number of point solutions is established at each iteration than is the case with other practical direct search methods, LSP's view of the problem is far more complete than with the established direct search methods.

9.2 Evaluating LSP performance using test problems

A large set of published mathematical and engineering test problems, already solved by a variety of NLP methods, was solved with LSP to evaluate its performance. The test set included unconstrained and constrained problems, continuous and discontinuous functions, and continuous and discrete variables. The dimensions of the problems varied from 1 to 38 variables. The results showed that solutions found with LSP were generally in agreement with the published results, with some improvement in about 5% of the cases.

The only aspect of performance, relative to other methods cited in the literature, which shows
Ithas a single global optimum and several local optima for any dimension n. The functionis parametrically adjustable and can be easily extended to any number of dimensionswhile retaining its key features. It is a challenge to all existing NLP and direct searchmethods unless an ideal starting point is chosen.One problem discovered in the literature, that by Subrahmanyam [Subrahmanyam,1989], was found to have many desirable attributes for a test problem. The formulation and the geometry of its feasible region and objective function surface are simple.It presents convergence difficulties for direct search methods even though the dimensionality of the problem is low. LSP was only able to solve this problem by significantreadjustments to its search parameters.9.3 LSP performance improvementsAs a result of the experience gained with the test problems both the reliability and efficiency of LSP were substantially improved over the course of this research. This entailedChapter 9. CONCLUSION 221the refinement of some of the techniques suggested in [Chew & Zheng, 1988], development of some new techniques, the introduction of parameters to control these techniques,and the development of diagnostics to assess progress and guide the modification of theseparameters.One heuristic method investigated involved subdividing the cuboid into a set ofsmaller subcuboids and then performing searches within each subcuboid. This techniqueis applied when the intermediate results displayed in the graphical output or cluster analysis indicate the possible existence of a partitioned level set. 
The division of the search domain avoids inefficient searches in between the connected regions of the level set, so that the search then concentrates on smaller subregions, each of which is fully connected. The sum of the function evaluations needed to meet the termination criterion for each subregion was found to be considerably lower than the number of function evaluations for the single region with a partitioned level set.

When the feasible region formed by the set of constraints occupies only a very small part of the initial cuboid, feasible point generation can be difficult and can result in an inefficient or suspended LSP search. An indication that this difficulty has arisen is provided by the number of function evaluations required at the first LSP iteration. This early feedback permits corrective action to be taken before a large amount of computational effort is wasted. Constraint relaxation was found to improve the efficiency of sample point generation by temporarily increasing the volume of the feasible region. To discourage the presence of infeasible points in the solution set, penalty terms were also added to the objective function. The progressive tightening of the relaxed portion of the constraints, in conjunction with the penalty terms on the objective function, ensures the best continuity between successive acceptance sets and that solutions are found within the feasible region of the original problem.

Due to the random generation of sample points, a global optimum point may be missed if it lies at a boundary of the feasible region. To increase the capability of generating boundary points, the technique of skewness adjustment was implemented. This technique modifies the location and boundaries of the current cuboid to ensure that no potentially important regions are excluded from the current cuboid.
In addition to this skewnessadjustment, reduced bias estimates of cuboid bounds are calculated at every iteration.Both of these adjustments further increase the volume of the cuhoid and maximize theprobability that the true acceptance set is included within the modified cuboid.Two other alternatives to the cuboid approach, replacing the cuboid with a rhombohedron and exploiting a linear relationships between a variable pair, were also investigated.These were found to provide significant efficiency improvements under specific but commonly occurring conditions which are detectable from LSP’s output.As with all NLP schemes, equality constraints present significant difficulties. In anideal case, equality constraints can be used to reduce the number of variables involved,otherwise they must be accommodated using a penalty technique. The approach usedwith LSP is similar to the classical penalty approach, but the penalty parameter valueis reduced at every iteration. This progressive modification of penalty terms throughoutthe search process was found to significantly improve the chances of reaching the globaloptimum in the presence of equality constraints.9.4 Graphical outputThe intermediate graphical displays provided by LSP and their interpretation open upa new dimension in engineering optimization. These plots provide entirely new ways tojudge the progress of the search as well as insights into the topology of the particularproblem being solved.The first set of plots, the x, x scatter plots, could of course, be obtained withChapter 9. CONCLUSION 223any NLP method, but are far more meaningful with a technique such as LSP whichprovides a large number of confirmed solution points at each stage of the search. Theseplots facilitate visualizing the acceptance set boundary shapes and will often reveal thepresence of distinctly separate global optima or near optima. 
The plots can also give indications of dependencies between the variables, which can be used to speed up the search. The second type of plot is the progressive plot of level set value c versus iteration number I, which can be used as an indication of the strength of convergence. Most importantly, the c versus I curve can also suggest the possible existence of fissure-like features on the objective function surface. A third type of plot, the plot of the number of function evaluations Nf versus iteration I, reveals the efficiency of the search as it progresses. Its interpretation can give an indication of the existence of multiple global optima, or of multiple local optima with only one global optimum.

One of the most interesting aspects of the plots discussed is that much of the triggering information for LSP parameter adjustment or problem modification concerns just c, Nf and I. It is also significant from a user’s point of view that the magnitude of c is related directly to the real-world problem being solved, while Nf and I are easily understood quantities related to the search. The influence of these quantities on the search strategy is therefore easy for practitioners to understand.

9.5 Civil engineering design problems and LSP

The mathematical equations used in civil engineering design are predominantly nonlinear. In many instances the mathematical formulation of the associated design problems does not meet the ideals, from a nonlinear optimization point of view, of providing a convex objective function and a convex feasible region. It is also not uncommon for the decision variables to refer to discrete choices of pipe diameter, structural member size, etc., so that discrete decision variables must be accommodated. Most civil engineering design optimization problems therefore require the use of nonlinear programming techniques which can accommodate mixed variables for their solution.
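Because a direct search such as LSP only ever evaluates the objective at sampled points, discrete decision variables can be handled by drawing samples directly from the finite candidate sets. The sketch below illustrates this with an invented pipe-diameter list and a purely illustrative cost function — none of the numbers come from a real design procedure:

```python
import random

# Hypothetical discrete pipe diameters (mm) -- an invented candidate set.
DIAMETERS = [100, 150, 200, 250, 300, 400]

def cost(d, q=0.05):
    """Toy pipe cost: a material term that grows with diameter and a
    head-loss penalty that shrinks with it (purely illustrative, not a
    real hydraulic design formula)."""
    material = 0.8 * d
    head_loss_penalty = 3.0e12 * q ** 2 / d ** 4
    return material + head_loss_penalty

random.seed(0)
# Sampling the discrete set directly: no rounding, gradients, or
# branch-and-bound machinery is needed.
samples = [random.choice(DIAMETERS) for _ in range(50)]
best = min(samples, key=cost)
```

A continuous variable and a discrete one can be sampled side by side in the same point-generation step, which is one reason a sampling-based method accommodates mixed variables so naturally.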
Furthermore, in engineering design there are instances where the design which produces the optimum objective function value may not be the best practical option. Alternative solutions which produce slightly inferior values to the mathematical global optimum solution may be preferred because of factors which could not be captured in the formulation. Therefore, a feature of an optimization method which might appeal to civil engineers is the ability to identify near optimum solutions. Simplicity of use of an optimization method also translates to economy of use in a commercial engineering office, and this alone may be the major factor determining the practical use of optimization.

Currently, there are several NLP packages offered commercially, but most well developed packages use gradient methods and require the differentiability of the objective and constraint functions. These do not fully satisfy the needs of civil engineers for several reasons. Gradient methods basically find local optima and cannot guarantee finding a global optimum except under very ideal conditions. Unfortunately, they do not give any indication of the local or global nature of the solution found. When problems have multiple optima, either local or global, a single gradient search can only identify a single solution. The solution found depends entirely on the starting point of the search. No systematic way of selecting starting points to ensure finding the global optimum is available, and the responsibility for the overall global search strategy and its success is left entirely to the user.

Most NLP packages which have the ability to handle practical sizes of problems have, until very recently, been available only on mainframe computers. There are only a few nonlinear optimization packages which can address practical engineering problems using personal computers, and these use, almost without exception, the same gradient methods which have dominated mainframe optimization for the past 30 years.

The feedback provided by the gradient search methods is in the form of matrices of first and second partial derivatives, matrix condition, Lagrange multipliers, etc. This kind of information rarely has any direct relevance to an engineer’s view of his engineering design problem. Additionally, this information can be particularly difficult to interpret when search failures occur or when the global nature of the solution is in question. If a gradient NLP method fails to confirm even a local optimum, the user is left with no suggestion for his next move.

In summary, in spite of the recognized need for optimization in civil engineering, existing NLP packages do not appear to satisfy the needs of practitioners. It is entirely possible that it is the available methods, and not just their computer implementation, that have insufficient appeal to convince engineers that their time should be invested in that direction.

LSP is a global search method, and the chance of converging on the global solution is much higher than with any of the gradient based methods such as GRG and MINOS, even when a number of different starting points are used. One of the important aspects of LSP is its capability to identify multiple optima and near optimal solutions. If multiple local optima exist, LSP either identifies all of them or gives indications of their existence in the process of identifying the global optimum.

LSP overcomes many of the limitations of the existing gradient methods but does not radically reduce the computational effort involved for those problems which could be solved using a gradient method.
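The cuboid-based search that produces this behaviour can be illustrated with a minimal sketch (an illustrative reimplementation, not the thesis code), applied here to the one-dimensional Goldstein polynomial of Appendix A, whose global minimum is 7 at x = ±3:

```python
import random

def lsp_sketch(f, lo, hi, nkeep=10, nsample=200, iters=40, seed=0):
    """Minimal level-set-style direct search: sample the current cuboid,
    keep the nkeep best points, shrink the cuboid to the bounding box of
    the kept points, and repeat.  Skewness adjustment, clustering and the
    moment-based termination test of LSP are deliberately omitted."""
    rng = random.Random(seed)
    best = None
    for _ in range(iters):
        pts = [[rng.uniform(a, b) for a, b in zip(lo, hi)]
               for _ in range(nsample)]
        pts.sort(key=f)
        keep = pts[:nkeep]
        if best is None or f(keep[0]) < f(best):
            best = keep[0]
        # New cuboid: bounding box of the kept (confirmed) points.
        lo = [min(p[j] for p in keep) for j in range(len(lo))]
        hi = [max(p[j] for p in keep) for j in range(len(hi))]
    return best, f(best)

# One-dimensional Goldstein polynomial from Appendix A (minimum 7 at x = ±3):
goldstein = lambda x: x[0]**6 - 15*x[0]**4 + 27*x[0]**2 + 250
x_best, f_best = lsp_sketch(goldstein, [-10.0], [10.0])
```

Note that with two symmetric global minima the cuboid never collapses onto a single point — the kept points straddle both basins — which is exactly the behaviour that motivates the domain subdivision and clustering discussed elsewhere in this chapter.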
Perhaps the most significant features of LSP from a practical engineering standpoint are that: the method is conceptually simple, and can therefore be easily understood by engineers who are generally not experts in numerical analysis techniques; virtually all of the computations and numerical results generated during the search are meaningful in the context of the engineering problem; the program can be run on personal computers; and an elementary, and therefore fast, graphical interface can display all of the useful information at any stage of the search. Flexibility, simplicity and meaningful intermediate results during the search lower the engineer’s time requirements for familiarization and therefore make LSP economical to implement even when its use is only occasional.

The success of global optimization is highly dependent on the complexity of the objective function surface and the boundaries of the feasible region, in other words on their general topology. A better appreciation of the topological nature of the global optimization problem can only lead to a greater understanding, and possibly refinement, of global optimization methods. Yet topological interpretations have not been widely used in optimization, the theoretical basis for level set optimization presented in [Chew & Zheng, 1988] being a rare example. Topology is an established and mature subtopic of mathematics with a considerable body of theory; research in this topic may therefore provide new perspectives on level sets and provide the basis for further refinements in LSP type methods.

Subdividing the search domain into smaller regions was found to increase the search efficiency considerably for multiple optima problems. It was also observed that determining the number of clusters and allocating points to different clusters becomes increasingly difficult as the number of variables in a problem grows.
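The kind of low-precision grouping that suffices for LSP can be sketched with a simple distance-threshold rule (illustrative only; the function and its radius parameter are not from the thesis):

```python
def threshold_clusters(points, radius):
    """Group points whose Euclidean distance to a cluster's seed point is
    below `radius`.  This deliberately crude, low-precision clustering is
    enough to separate subregions around distinct optima; it makes no
    attempt at the statistically optimal partitions of formal cluster
    analysis."""
    clusters = []
    for p in points:
        for c in clusters:
            seed = c[0]
            if sum((a - b) ** 2 for a, b in zip(p, seed)) <= radius ** 2:
                c.append(p)
                break
        else:
            clusters.append([p])
    return clusters

# Kept points crowding two separate optima of a two-variable problem:
pts = [(3.0, 2.0), (3.1, 1.9), (-2.8, 3.1), (-2.9, 3.0)]
groups = threshold_clusters(pts, radius=1.0)
# groups -> two clusters, one near each optimum
```

The single radius parameter is also where the dimensionality difficulty noted above shows up: as the number of variables grows, distances between kept points concentrate, and no single threshold separates the groups cleanly.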
Consequently, high dimensionality cluster analysis methods should be further explored with LSP’s particular requirement of low precision cluster analysis, as discussed in Chapter 3, section 3.1.3, in mind.

Graphical interpretation of both intermediate and final optimization results opens up a new perspective in the field of optimization. The fact that LSP produces and stores a large set of intermediate results in the form of solution points makes the graphical interface far more attractive than with almost all other optimization schemes. This is especially true with high dimensional problems, where the information content of the dynamic graphs is high. Therefore, alternagraphics and similar display techniques should be further researched to fully exploit the inherent ability of LSP to support graphical outputs.

Bibliography

The following bibliography is arranged in conventional alphabetical order. The numerical references provided are used only in the Appendix.

[1] Alperovits, E. and Shamir, U., “Design of optimal water distribution systems”, Water Resources Research, pp 885-900, 1977.
[2] Aluffi-Pentini, F., Parisi, V. and Zirilli, F., “Global optimization and stochastic differential equations”, JOTA, Vol 47, No 1, pp 1-16, Sept 1985.
[3] Armstrong, M. A., “Basic topology”, New York, Springer-Verlag, 1983.
[4] Archetti, F. and Schoen, F., “A survey on the global optimization problems: General theory and computational approaches”, Annals of Operations Research, No 1, pp 87-110, 1984.
[5] Ballard, D. H., Jelinek, C. O., and Schinzinger, R., “An algorithm for the solution of constrained generalized polynomial programming problems”, Computer Journal, Vol 17, pp 261-266, 1974.
[6] Bazaraa, Mokhtar S., “Nonlinear programming, theory and algorithms”, John Wiley & Sons, 1979.
[7] Becker, Richard A., Cleveland, William S. and Wilks, Allan R., “Dynamic graphics for data analysis”, Statistical Science, Vol 2, No 4, Nov 1987.
[8] Ben Saad, Sihem and Jacobsen, Stephen E.,
“A level set algorithm for a class of reverse convex programs”, Annals of Operations Research, No 25, pp 19-42, 1990.
[9] Betro, B. and De Biase, L., “A recursive spline technique for uniform approximation of sample data”, Technical Report, University of Pisa, 1987.
[10] Betro, B. and Schoen, F., “Sequential stopping rules for the multistart algorithm in global optimization”, Mathematical Programming 38, pp 271-280, 1987.
[11] Boender, C. G. E. and Kan, A. H. G. Rinnooy, “Bayesian stopping rules for multistart global optimization methods”, Mathematical Programming 37, pp 59-80, 1987.
[12] Box, M. J., “A new method of constrained optimization and a comparison with other methods”, Computer Journal, Vol 8, pp 42-52, 1965.
[13] Chew, Soo Hong and Zheng, Quan, “Integral Global Optimization: Theory, implementation and applications”, Lecture Notes in Economics and Mathematical Systems, Vol 298, Springer-Verlag, Berlin Heidelberg, 1988.
[14] Cole, F., Gochet, W. and Smeers, Y., “A comparison between a primal and dual cutting plane algorithm for posynomial geometric programming problems”, JOTA, Vol 47, No 2, pp 159-180, Oct 1985.
[15] Corana, A., Marchesi, M., Martini, C., and Ridella, S., “Minimizing multimodal functions of continuous variables with the simulated annealing algorithm”, ACM Transactions on Mathematical Software, Vol 13, No 3, pp 262-280, 1987.
[16] Crowder, H. P., Dembo, R. S. and Mulvey, J. M., “Reporting computational experiments in mathematical programming”, Mathematical Programming, No 15, pp 316-329, 1978.
[17] Davis, John C., “Statistics and data analysis in geology”, John Wiley & Sons, 1973.
[18] Devore, Jay L., “Probability and statistics for engineering and the sciences”, Brooks/Cole Publishing Company, 1982.
[19] Dixon, L. C. W. and Szego, G. P., “Towards global optimisation 2”, Elsevier North Holland, 1976.
[20] Eason, E. D. and Fenton, R.
G., “A comparison of numerical optimization methods for engineering design”, Journal of Engineering for Industry, Trans. of ASME, pp 196-200, February 1974.
[21] Edgar, T. F. and Himmelblau, D. M., “Optimization of chemical processes”, McGraw-Hill Inc., 1988.
[22] Everitt, Brian, “Cluster analysis”, Heinemann Educational Books Ltd, 1980.
[23] Falk, James E. and Hoffman, Karla L., “Concave minimization via collapsing polytopes”, Operations Research, Vol 34, No 6, pp 919-929, Nov-Dec 1986.
[24] Fleming, John F., “Structural engineering analysis on personal computers”, McGraw-Hill Book Company, 1986.
[25] Floudas, C. A. and Ciric, A. R., “Strategies for overcoming uncertainties in heat exchanger network synthesis”, Computers and Chemical Engineering, 13(10), pp 1133-1152, 1989.
[26] Floudas, C. A. and Pardalos, P. M., “A collection of test problems for constrained global optimization algorithms”, Lecture Notes in Computer Science, No 455, Springer-Verlag, 1990.
[27] Fujiwara, O., Jenchaimahakoon, B., and Edirisinghe, N. C., “A modified linear programming gradient method for optimal design of looped water distribution networks”, Water Resources Research, Vol 23, No 6, pp 977-982, June 1987.
[28] Ghani, S. N., “An improved ‘complex’ method of function minimization”, Computer Aided Design, January 1972.
[29] Gill, Philip E., Murray, Walter, Saunders, Michael A. and Wright, Margaret H., “Constrained nonlinear programming”, in Handbooks in Operations Research and Management Science, Volume 1, Optimization, North Holland, 1989.
[30] Goldberg, David E., “Genetic algorithms in search, optimization, and machine learning”, Addison-Wesley Publishing Company, Inc., 1989.
[31] Goldberg, David E. and Samtani, Manohar P., “Engineering optimization via genetic algorithm”, in Electronic Computation, pp 471-482, Kenneth M. Will ed., 1986.
[32] Gottfried, Byron S.
and Weisman, Joel, “Introduction to optimization theory”, Prentice Hall, 1973.
[33] Hartigan, John A., “Clustering algorithms”, Wiley, 1975.
[34] Himmelblau, David M., “Applied nonlinear programming”, McGraw-Hill Inc., 1972.
[35] Horst, Reiner and Tuy, Hoang, “Global optimization: deterministic approaches”, Springer-Verlag, 1990.
[36] Huang, H. Y. and Aggarwal, A. K., “A class of quadratically convergent algorithms for constrained function minimization”, JOTA, Vol 16, Nos 5/6, pp 447-485, 1975.
[37] Jenson, David L., “The role of cluster analysis in computer assisted mass appraisal”, Lincoln Institute Monograph No 77, July 1977.
[38] Kan, A. H. G. Rinnooy and Timmer, G. T., “Global optimization”, in Handbooks in Operations Research and Management Science, Volume 1, Optimization, North Holland, 1989.
[39] Kassicich, Suleiman K., “International intra-company transfer pricing”, Operations Research, Vol 29, No 4, pp 817-828, 1981.
[40] Lasdon, Leon S. and Waren, Allan D., “GRG2 user’s guide”, May 1982.
[41] Leon, A., “A classified bibliography on optimization”, in Recent Advances in Optimization Techniques, pp 599-649, 1966.
[42] Lipschutz, Seymour, “General topology”, Schaum’s Outline Series in Mathematics, McGraw-Hill Book Company, 1965.
[43] Loganathan, G. V. and Greene, J. J., “Global approaches for the nonconvex optimization of pipe networks”, Water Management in the 90’s, 1993.
[44] Lucidi, S. and Piccioni, M., “Random tunnelling by means of acceptance-rejection sampling for global optimization”, JOTA, Vol 62, No 2, pp 255-278, 1989.
[45] Luenberger, David G., “Linear and nonlinear programming”, Addison Wesley, 1984.
[46] Luus, Rein and Jaakola, T. H. I., “Optimization by direct search and systematic reduction of the size of search region”, AIChE Journal, Vol 19, No 4, pp 760-766, 1973.
[47] Mays, Larry W. and Taur, Cheng-Kang, “Unit hydrographs via nonlinear programming”, Water Resources Research, Vol 18, No 4, pp 744-752, 1982.
[48] Mezzich, Juan E.
and Solomon, Herbert, “Taxonomy and behavioral science, comparative performance of grouping methods”, Academic Press, 1980.
[49] Morari, Manfred and Grossmann, Ignacio E., “Chemical engineering optimization models with GAMS”, CACHE Process Design Case Studies, Vol 6, 1991.
[50] Narula, Subhash C., “Optimization techniques in linear regression: A review”, in Optimization in Statistics, TIMS Studies in the Management Sciences, Volume 19, pp 11-29, North-Holland Publishing Company, 1982.
[51] Orth, Hermann, “Model-based design of water distribution and sewage systems”, John Wiley & Sons, 1986.
[52] Pike, Ralph W., “Optimization for engineering systems”, Van Nostrand Reinhold, 1986.
[53] Ratschek, H. and Rokne, J., “New computer methods for global optimization”, Ellis Horwood Limited, 1988.
[54] Reklaitis, G. V., Ravindran, A. and Ragsdell, K. M., “Engineering optimization, methods and application”, New York, Wiley, 1983.
[55] Sandgren, E. and Ragsdell, K. M., “The utility of nonlinear programming algorithms: A comparative study, parts 1 and 2”, ASME J. Mech. Des. 102(3), pp 540-551, July 1980.
[56] Sarma, M. S., “On the convergence of Baba and Dorea random optimization methods”, JOTA, Vol 66, No 2, Aug 1990.
[57] Schittkowski, Klaus, “Nonlinear programming codes: Information, tests, performance”, Lecture Notes in Economics and Mathematical Systems, Vol 183, Springer-Verlag, New York, 1980.
[58] Schittkowski, Klaus, “Test examples for nonlinear programming codes”, Lecture Notes in Economics and Mathematical Systems, Vol 187, Springer-Verlag, New York, 1981.
[59] Schittkowski, Klaus, “More test examples for nonlinear programming codes”, Lecture Notes in Economics and Mathematical Systems, Vol 282, Springer-Verlag, New York, 1987.
[60] Shields, F. Douglas Jr., and Thackston, Edward L., “Designing treatment basin dimensions to reduce cost”, ASCE Journal of Environmental Engineering, Vol 117, No 3, May/June 1991.
[61] Subrahmanyam, M.
B., “An extension of the simplex method to constrained optimization”, JOTA, Vol 62, No 2, pp 311-319, August 1989.
[62] Templeman, Andrew B., “Discrete optimum structural design”, Computers and Structures, Vol 30, No 3, pp 511-518, 1988.
[63] Torn, Aimo and Zilinskas, Antanas, “Global optimization”, Lecture Notes in Computer Science, No 350, Springer-Verlag, 1988.
[64] Turner, D. B., “Workbook of atmospheric dispersion estimates”, Environmental Protection Agency, Office of Air Programs, Research Triangle Park, N.C., 1970.
[65] Venkayya, V. B., “Design of optimum structures”, in Computers & Structures, Vol 1, pp 265-309, 1971.
[66] Visweswaran, V. and Floudas, C. A., “A global optimization algorithm (GOP) for certain classes of nonconvex NLPs - II. Application of theory and test problems”, Computers and Chemical Engineering, Vol 14, No 12, pp 1419-1434, 1990.
[67] Wang, Bi-Chong and Luus, Rein, “Reliability of optimization procedures for obtaining global optimum”, AIChE Journal, Vol 24, No 4, pp 619-626, 1978.
[68] Wasserman, William and Kutner, Michael H., “Applied linear regression models”, Richard D. Irwin, Inc., 1983.
[69] Zanakis, S. H. and Evans, J. R., “Heuristic optimization: why, when, and how to use it”, Interfaces 11, pp 84-91, 1981.
[70] Zupan, Jure, “Clustering of large data sets”, Chemometrics Research Studies Series, Research Studies Press, 1982.

Appendix A

MATHEMATICAL TEST PROBLEMS

Problem 1-1-0-0. Goldstein function Source: [2]
f(x) = x^6 - 15x^4 + 27x^2 + 250
Subjected to NO constraint.
Initial bounds:
-10 ≤ x ≤ 10
Solution in literature:
f(x*) = 7 at x* = ±3
and f(x) = 250 at x = 0
Solution using LSP:
f(x*) = 7.000 at x* = ±3
LSP Convergence criterion: VM ≤ 1E-3
LSP computational effort: Nf = 592, Nkeep = 10

Problem 1-2-0-0.
Source: [63]
f(x) = sin x + sin(10x/3) + ln x - 0.84x
Subjected to NO constraint.
Initial bounds:
2.7 ≤ x ≤ 7.5
Solution in literature:
f(x*) = -1.6013075 at x* = 5.1997784
There is an error; f(x*) should be -4.6013075 for the given x*.
Computational effort using six different algorithms: (33, 29, 45, 462, 25, 120)
Solution using LSP:
f(x*) = -4.6013074 at x* = 5.1995869
LSP Convergence criterion: VM ≤ 1E-3
LSP computational effort: Nf = 110, Nkeep = 10

Problem 1-3-0-0. Source: [63]
f(x) = sin x + sin(2x/3)
Subjected to NO constraint.
Initial bounds:
3.1 ≤ x ≤ 20.4
Solution in literature:
f(x*) = -1.9059611 at x* = 17.0391986
Computational effort using six different algorithms: (37, 38, 442, 448, 45, 158)
Solution using LSP:
f(x*) = -1.9059612 at x* = 17.0390205
LSP Convergence criterion: VM ≤ 1E-3
LSP computational effort: Nf = 104, Nkeep = 10

Problem 1-4-0-0. Source: [63] Solution Improved
f(x) = -Σ(i=1 to 5) sin((i + 1)x + i)
Subjected to NO constraint.
Initial bounds:
-10 ≤ x ≤ 10
Solution in literature:
f(x*) = -12.0312494 at x* = -6.7745760, -0.4913908 or 5.7917947
There is an error; f(x*) should be -3.2649353 for the given points.
Computational effort using six different algorithms: (125, 165, 150, 3817, 161, 816)
Solution using LSP:
f(x*) = -3.3728776 at x* = -6.7187829, -0.4363553 or 5.8455772
LSP Convergence criterion: VM ≤ 1E-3
LSP computational effort: Nf = 2920, Nkeep = 10

Problem 1-5-0-0. Source: [63]
f(x) = (x + sin x)e^(-x^2)
Subjected to NO constraint.
Initial bounds:
-10 ≤ x ≤ 10
Solution in literature:
f(x*) = -0.6795797 at x* = -0.8242384
The optimum point and its value are written interchanged.
Computational effort using six different algorithms: (35, 34, 98, 376, 229, 83)
Solution using LSP:
f(x*) = -0.824239319999 at x* = -0.6793407
LSP Convergence criterion: VM ≤ 1E-3
LSP computational effort: Nf = 92, Nkeep = 10

Problem 1-6-0-0. Source: [26]
f(x) = x^6 - (52/25)x^5 + (39/80)x^4 + (71/10)x^3 - (79/20)x^2 - x + 1/10
Subject to:
-2 ≤ x ≤ 11
Initial bounds: As given in the constraint
Solution in literature:
f(x*) = -29763.233 at x* = 10
but f(x) = 803570.1 at x = 10
Solution using LSP:
f(x*) = -7.487312 at x* = -1.19108
LSP Convergence criterion: VM ≤ 1E-3
LSP computational effort: Nf = 132, Nkeep = 10

Problem 2-1-0-2. Source: [36]
f(x) = sin(x1 + x2) + (x1 - x2)^2 - 1.5x1 + 2.5x2 + 1
Subject to:
-1.5 ≤ x1 ≤ 4
-3 ≤ x2 ≤ 3
Solution in literature:
f(x*) = -1.913223 at x* = (-0.54772, -1.5472)
and f(x) = 1.2283700 at x = (2.5944, 1.5944)
Solution using LSP:
f(x*) = -1.9131677 at x* = (-0.54154, -1.54837)
LSP initial bounds:
-1.5 ≤ x1 ≤ 4
-3 ≤ x2 ≤ 3
Convergence criterion: VM ≤ 1E-5
LSP computational effort: Nf = 374, Nkeep = 20

Problem 2-2-1-1. Source: [58]
f(x) = (x1 - 2)^2 + (x2 - 1)^2
Subject to:
-0.25x1^2 - x2^2 + 1 ≥ 0
x1 - 2x2 + 1 = 0
Solution in literature:
f(x*) = 1.39346 at x* = (0.82288, 0.9114)
Computational effort cited in literature (GRGA, FMIN): 108, 838
Solution using LSP:
f(x*) = 1.39352 at x* = (0.82285, 0.91143)
LSP initial bounds:
-2 ≤ x1 ≤ 2
-1 ≤ x2 ≤ 1
Convergence criterion: VM ≤ 1E-5
LSP computational effort: Nf = 303, Nkeep = 20

Problem 2-3-1-0. Source: [58]
f(x) = (1 - x1)^2
Subject to:
10(x2 - x1^2) = 0
Solution in literature:
f(x*) = 0 at x* = (1, 1)
Computational effort cited in literature (GRGA, FMIN): 33, 546
Solution using LSP:
f(x*) = 4E-7 at x* = (1.0004, 1.0008)
LSP initial bounds:
-40 ≤ x1 ≤ 40
0 ≤ x2 ≤ 1600
Convergence criterion: VM ≤ 1E-5
LSP computational effort: Nf = 188, Nkeep = 20

Problem 2-4-1-0. Source: [58]
f(x) = ln(1 + x1^2) - x2
Subject to:
(1 + x1^2)^2 + x2^2 - 4 = 0
Solution in literature:
f(x*) = -1.73205 at x* = (0, 1.73205)
Computational effort cited in literature (GRGA, FMIN): 204, 260
Solution using LSP:
f(x*) = -1.73204 at x* = (0.00194, 1.73205)
LSP initial bounds:
-1 ≤ x1 ≤ 1
-2 ≤ x2 ≤ 2
Convergence criterion: VM ≤ 1E-5
LSP computational effort: Nf = 156, Nkeep = 20

Problem 2-5-0-1.
Source: [58]
f(x) = 100(x2 - x1^2)^2 + (1 - x1)^2
Subject to:
1.5 ≤ x2
Solution in literature:
f(x*) = 0.0504261879 at x* = (1.224, 1.5)
Computational effort cited in literature (GRGA, FMIN): 115, 444
Solution using LSP:
f(x*) = 0.05044 at x* = (1.22449, 1.50000)
LSP initial bounds:
0 ≤ x1 ≤ 5
1.5 ≤ x2 ≤ 5
Convergence criterion: VM ≤ 1E-5
LSP computational effort: Nf = 2088, Nkeep = 20

Problem 2-6-0-3. Source: [58]
f(x) = 100(x2 - x1^2)^2 + (1 - x1)^2
Subject to:
x1x2 - 1 ≥ 0
x1 + x2^2 ≥ 0
x1 ≤ 0.5
Solution in literature:
f(x*) = 306.5 at x* = (0.5, 2)
Computational effort cited in literature (GRGA, FMIN): 508, 2464
Solution using LSP:
f(x*) = 306.71 at x* = (0.50000, 2.00059)
LSP initial bounds:
0 ≤ x1 ≤ 0.5
2 ≤ x2 ≤ 5
Convergence criterion: VM ≤ 1E-3
LSP computational effort: Nf = 1002, Nkeep = 20

Problem 2-7-0-0. Rosenbrock function Source: [20], [58]
f(x) = 100(x2 - x1^2)^2 + (1 - x1)^2
Subject to NO constraint.
Solution in literature:
f(x*) = 0 at x* = (1, 1)
Computational effort cited in literature (GRGA, FMIN): 596, 638
Solution using LSP:
f(x*) = 1E-7 at x* = (1.0001, 1.0002)
LSP initial bounds:
-5 ≤ x1 ≤ 5
-5 ≤ x2 ≤ 5
Convergence criterion: VM ≤ 1E-12
LSP computational effort: Nf = 3206, Nkeep = 20

Problem 2-8-0-5. Source: [23]
f(x) = -(x1 - 2)^2 - (x2 - 2)^2
Subject to:
x1 + x2 ≥ 1
x1 - 2x2 ≤ 1
2x1 - x2 ≤ 5
3x1 + 5x2 ≤ 27
-6x1 + 10x2 ≤ 30
Solution in literature:
f(x*) = -5 at x* = (0, 1), (4, 3), (0, 3) or (1, 0)
Solution using LSP:
f(x*) = -4.999 at x* = (0, 1)
f(x*) = -4.973 at x* = (0, 2.987)
and f(x*) = -4.990 at x* = (0.995, 0.005)
A single run could not easily detect all global points unless Nkeep was raised above 30.
LSP initial bounds:
0 ≤ x1 ≤ 9
0 ≤ x2 ≤ 6
Convergence criterion: VM ≤ 1E-3
LSP computational effort: Nf = 24253

Problem 2-9-0-0. Source: [34]
f(x) = (x1^2 + 12x2 - 1)^2 + (49x1^2 + 49x2^2 + 84x1 + 2324x2 - 681)^2
Subject to NO constraint.
Solution in literature:
f(x) = 5.9225 at x = (0.28581, 0.27936)
and f(x) = 0.0000 at x = (-21.0267, -36.7600)
Solution using LSP:
f(x*) = 2E-7 at x* = (-21.026653, -36.760090)
LSP initial bounds:
-25 ≤ x1 ≤ 5
-40 ≤ x2 ≤ 5
Convergence criterion: VM ≤ 1E-12
LSP computational effort: Nf = 110032, Nkeep = 50
The global optimum was missed with Nkeep values below 50.

Problem 2-10-0-0. Source: [34]
f(x) = -[exp(-(x1^2 + x2^2)) * (2x1^2 + 3x2^2)]
Subject to NO constraint.
Solution in literature:
f(x*) = -1.1036 at x* = (0, 1) or x* = (0, -1)
Solution using LSP:
f(x*) = -1.1036 at x = (-0.0018, -0.9999) or x = (0.0007, 0.9976)
LSP initial bounds:
-3 ≤ x1, x2 ≤ 3
Convergence criterion: VM ≤ 1E-8
LSP computational effort: Nf = 5848, Nkeep = 20

Problem 2-11-0-0. Source: [34], [54]
f(x) = (x1^2 + x2 - 11)^2 + (x1 + x2^2 - 7)^2
Subject to NO constraint.
Solution in literature:
f(x*) = 0 at x* = (3.58443, -1.84813) or x = (3, 2)
Only two points are mentioned in [54].
Solution using LSP:
f(x) = 0.00000 at x = (-3.779, -3.283)
f(x) = 0.00009 at x = (-2.806, 3.133)
f(x) = 0.00088 at x = (2.999, 1.998)
f(x) = 0.00058 at x = (3.586, -1.847)
LSP initial bounds:
-5 ≤ x1 ≤ 5
-5 ≤ x2 ≤ 5
Convergence criterion: VM ≤ 1E-3
LSP computational effort: Nf = 2853, Nkeep = 30

Problem 2-12-0-4. Source: [61]
f(x) = (x1 - 10)^3 + (x2 - 20)^3
Subject to:
-x1 + 13 ≤ 0
-(x1 - 5)^2 - (x2 - 5)^2 + 100 ≤ 0
(x1 - 6)^2 + (x2 - 5)^2 - 82.81 ≤ 0
Solution in literature:
f(x*) = -6961.8106875 at x* = (14.0950013, 0.8429636)
Computational effort cited in the literature = 833
Solution using LSP:
f(x*) = -6961.795410 at x* = (14.0950079, 0.8429772)
LSP initial bounds:
13 ≤ x1 ≤ 16
0 ≤ x2 ≤ 15
Convergence criterion: VM ≤ 1E-2
LSP computational effort: Nf = 8469, Nkeep = 20
This problem was initially found to be a challenge to LSP and is discussed in Chapter 5, section 5.5.

Problem 2-13-0-0.
Gear train of minimum inertia (Fenton and Eason’s function) Source: [54], [20]
f(x) = [12 + x1^2 + (1 + x2^2)/x1^2 + (x1^2x2^2 + 100)/(x1x2)^4]/10
Subject to:
1 ≤ x1, x2 ≤ 3
Solution in literature:
f(x*) = 1.74 at x* = (1.7435, 2.0297)
Solution using LSP:
f(x*) = 1.7442 at x* = (1.7333, 2.0449)
LSP initial bounds: As given in the set of constraints.
Convergence criterion: VM ≤ 1E-7
LSP computational effort: Nf = 200, Nkeep = 20

Problem 2-14-0-0. Branin equation Source: [63]
f(x) = (x2 - (5.1/(4π^2))x1^2 + (5/π)x1 - 6)^2 + 10(1 - 1/(8π)) cos x1 + 10
Subject to:
-5 ≤ x1 ≤ 10
0 ≤ x2 ≤ 15
Solution in literature:
f(x*) = 0.398 at x* = (-3.142, 12.275), (3.142, 2.275), or (9.425, 2.425)
Computational effort cited in literature: 134-11910
Solution using LSP:
f(x*) = 0.39901, 0.39791 and 0.39852 at x* = (3.135, 2.310), (9.427, 2.475), and (-3.152, 12.290)
LSP initial bounds: As given in the set of constraints.
Convergence criterion: VM ≤ 1E-5
LSP computational effort: Nf = 11895, Nkeep = 30

Problem 2-15-0-0. Goldstein-2 equation Source: [63]
f(x) = [1 + (x1 + x2 + 1)^2 (19 - 14x1 + 3x1^2 - 14x2 + 6x1x2 + 3x2^2)] * [30 + (2x1 - 3x2)^2 (18 - 32x1 + 12x1^2 + 48x2 - 36x1x2 + 27x2^2)]
Subject to:
-2 ≤ x1, x2 ≤ 2
Solution in literature:
f(x*) = 3 at x* = (0, -1)
Computational effort cited in literature: 139 up to 4042
Solution using LSP:
f(x*) = 3.0000 at x* = (-0.0002, -1.0003)
LSP initial bounds: As given in the set of constraints.
Convergence criterion: VM ≤ 1E-5
LSP computational effort: Nf = 522, Nkeep = 20

Problem 2-16-0-0. Rastrigin function Source: [63]
f(x) = x1^2 + x2^2 - cos 18x1 - cos 18x2
Subject to:
-1 ≤ x1, x2 ≤ 1
Solution in literature:
There are about 50 local minima arranged in a lattice configuration, of which the following is the global minimum.
f(x*) = -2 at x* = (0, 0)
Solution using LSP:
f(x*) = -1.9999 at x* = (-0.0007, -0.0001)
LSP initial bounds: As given in the set of constraints.
Convergence criterion: VM ≤ 1E-5
LSP computational effort: Nf = 2510, Nkeep = 20

Problem 2-17-0-0.
Source: [44], [63]
f(x) = 4x1^2 - 2.1x1^4 + (1/3)x1^6 + x1x2 - 4x2^2 + 4x2^4
Subject to:
-5 ≤ x1, x2 ≤ 5
Solution in literature:
f(x*) = -1.0316285 at x* = (-0.08983, 0.7126) or (0.08983, -0.7126)
Computational effort cited in literature: 40.8 [44]
Solution using LSP:
f(x*) = -1.031570, -1.031559 at x* = (-0.090677, 0.710088) and (0.093965, -0.713538)
LSP initial bounds: As given in the set of constraints.
Convergence criterion: VM ≤ 1E-5
LSP computational effort: Nf = 8620, Nkeep = 20

Problem 2-18-0-2. Source: [53]
f(x) = (x1 - 2)^2 + (x2 - 1)^2
Subject to:
x1^2 - x2 ≤ 0
x1 + x2 - 2 ≤ 0
Solution in literature:
f(x*) = 1 at x* = (1, 1)
Computational effort cited in literature: (252-390)
Solution using LSP:
f(x*) = 1.0033 at x* = (0.9984, 0.9969)
LSP initial bounds:
-1.414 ≤ x1 ≤ 1.414
0 ≤ x2 ≤ 2
Convergence criterion: VM ≤ 1E-5
LSP computational effort: Nf = 560, Nkeep = 40
The global optimum was not reliably found with Nkeep substantially less than 40.

Problem 2-19-0-1. Source: [20]
f(x) = (.44xx2 + 10xT1 + 0.592x143)/10
Subject to:
1 - 8.62x1^-1x2^3 ≤ 0
0 ≤ x1, x2 ≤ 5
Solution in literature:
f(x*) = 1.6206 at x* = (1.2867, 0.53047)
Solution using LSP:
f(x*) = 1.6189 at x* = (1.2862, 0.5308)
LSP initial bounds: As given in the set of constraints.
Convergence criterion: VM ≤ 1E-8
LSP computational effort: Nf = 3580, Nkeep = 20

Problem 2-20-0-0. Penalized Shubert function Source: [2], [44] Solution Improved
f(x) = {Σ(i=1 to 5) i cos[(i + 1)x1 + i]}{Σ(i=1 to 5) i cos[(i + 1)x2 + i]} + 0.5[(x1 + 1.42513)^2 + (x2 + 0.80032)^2]
Subject to:
-10 ≤ x1, x2 ≤ 10
Solution in literature:
The problem has 760 local minima of which the following is the single global minimum.
f(x*) = -186.73091 at x* = (-1.42513, -0.80032)
Computational effort = 7711 [44], 8755, 76894 [2]
Solution using LSP:
f(x*) = -186.7075 at x* = (-1.4229, -0.7980) or (-0.7984, -1.4217)
LSP initial bounds: As given in the set of constraints.
Convergence criterion: VM ≤ 1E-6
LSP computational effort: Nf = 5516, Nkeep = 20

Problem 2-21-0-3.
Source: [12], [28]
f(x) = -[9 - (x1 - 3)^2]x2^3/(27√3)
Subject to:
0 ≤ x1
0 ≤ x2 ≤ x1/√3
x1 + √3x2 ≤ 6
The problem has a different type of constraint in which the current limit for one of the variables (x2) depends on the current value of another variable (x1).
Solution in literature:
f(x*) = -1 at x* = (3, √3)
Computational effort: Nf = 159 [12] and 642 [28]
Solution using LSP:
f(x*) = -0.99999 at x* = (3.000, 1.732)
LSP initial bounds:
0 ≤ x1 ≤ 6
0 ≤ x2 ≤ 4
Convergence criterion: VM ≤ 1E-6
LSP computational effort: Nf = 481, Nkeep = 20

Problem 2-22-0-0. Source: None
f(x) = 100 - (x1 + x2 - 10)^2
Subject to:
0 ≤ x1, x2 ≤ 10
True solution:
f(x*) = 0 at x* = (0, 0) or (10, 10)
Solution using LSP:
f(x*) = 0.00013 at x* = (0.0000, 0.0000) or (10.0000, 10.0000)
LSP initial bounds: As given in the set of constraints.
Convergence criterion: VM ≤ 1E-7
LSP computational effort: Nf = 6354, Nkeep = 40

Problem 3-1-1-0. Source: [58]
f(x) = (x1 + x2)^2 + (x2 + x3)^2
Subject to:
x1 + 2x2 + 3x3 - 1 = 0
Solution in literature:
f(x*) = 0 at x* = (0.5, -0.5, 0.5)
Computational effort, Nf, cited in literature (GRGA, FMIN): 177, 46
Solution using LSP:
f(x*) = 7E-5 at x* = (0.49493, -0.48871, 0.49417)
LSP initial bounds:
-5 ≤ x1, x2, x3 ≤ 5
Convergence criterion: VM ≤ 1E-3
LSP computational effort: Nf = 578, Nkeep = 30

Problem 3-2-1-0. Source: [58]
f(x) = 0.01(x1 - 1)^2 + (x2 - x1^2)^2
Subject to:
x1 + x3^2 + 1 = 0
Solution in literature:
f(x*) = 0.04 at x* = (-1, 1, 0)
Computational effort, Nf, cited in literature (GRGA, FMIN): 962, 346
Solution using LSP:
f(x*) = 0.040084 at x* = (-1.0012, 1.0084, 0.0342)
LSP initial bounds:
-10 ≤ x1 ≤ -1
-5 ≤ x2, x3 ≤ 5
Convergence criterion: VM ≤ 1E-5
LSP computational effort: Nf = 696, Nkeep = 30

Problem 3-3-0-2. Source: [20], [59]
f(x) = -x1x2x3
Subject to:
x1 + 2x2 + 2x3 ≥ 0
72 - x1 - 2x2 - 2x3 ≥ 0
0 ≤ x1 ≤ 20
0 ≤ x2 ≤ 11
0 ≤ x3 ≤ 42
Solution in literature:
f(x*) = -3300 at x* = (20, 11, 15)
Computational effort, Nf, cited in literature (NLPQL): 25 [59]
Solution using LSP:
f(x*) = -3299.946 at x* = (20.0000, 10.9993, 15.0007)
LSP initial bounds: As given in the set of constraints.
Convergence criterion: VM ≤ 1E-5
LSP computational effort: Nf = 8886, Nkeep = 30

Problem 3-4-2-0. Source: [52]
f(x) = -(3x1^2 + 2x2^2 - x3)
Subject to:
x1^2 + x2^2 = 25
9x1 - x2^2 + x3 = 27
Solution in literature:
f(x*) = -66 at x* = (4, 3, 0)
Solution using LSP:
f(x*) = -66 at x* = (4, 3, 0)
LSP initial bounds:
0 ≤ xi ≤ 5 for i = 1, 3
Convergence criterion: VM ≤ 1E-3
LSP computational effort: Nf = 877, Nkeep = 30

Problem 3-5-2-0. Source: [59]
f(x) = log(x3) - x2
Subject to:
x2^2 + x3^2 - 4 = 0
x3 - x1^2 - 1 = 0
Solution in literature:
f(x*) = -1.73205 at x* = (0, 1.732, 1)
Computational effort, Nf, cited in literature (NLPQL): 54
Solution using LSP:
f(x*) = -1.732 at x* = (-0.0225, 1.73176, 1.00050)
LSP initial bounds:
-1 ≤ x1 ≤ 1
-2 ≤ x2 ≤ 2
1 ≤ x3 ≤ 2
Convergence criterion: VM ≤ 1E-3
LSP computational effort: Nf = 88, Nkeep = 20

Problem 3-6-0-1. Source: [59]
f(x) = 0.2/(x1x2x3) + 4/x1 + 3/x3
Subject to:
10 - 2x1x3 - x1x2 ≥ 0
Solution in literature:
f(x*) = 3.36168 at x* = (2.380, 0.3162, 1.943)
Computational effort, Nf, cited in literature (NLPQL): 41
Solution using LSP:
f(x*) = 3.3628 at x* = (2.3699, 0.3162, 1.9505)
LSP initial bounds:
0 ≤ x1, x2, x3 ≤ 10
Convergence criterion: VM ≤ 1E-3
LSP computational effort: Nf = 2804, Nkeep = 20

Problem 3-7-0-2. Source: [58]
f(x) = 0.2x3 - 0.8x1
Subject to:
x2 - exp(x1) ≥ 0
x3 - exp(x2) ≥ 0
0 ≤ x1, x2 ≤ 100
0 ≤ x3 ≤ 10
Solution in literature:
f(x*) = 0.5181632741 at x* = (0.1841264879, 1.202167873, 3.327322322)
Computational effort, Nf, cited in literature (GRGA, FMIN): 208, 1917
Solution using LSP:
f(x*) = 0.51858 at x* = (0.1777, 1.1945, 3.3038)
LSP initial bounds: As given in the set of constraints.
Convergence criterion: VM ≤ 1E-7
LSP computational effort: Nf = 65973, Nkeep = 40

Problem 3-8-1-1. Source: [58]
f(x) = (x1 + 3x2 + x3)^2 + 4(x1 - x2)^2
Subject to:
6x2 + 4x3 - x1^3 - 3 ≥ 0
1 - x1 - x2 - x3 = 0
0 ≤ xi for i = 1, 3
Solution in literature:
f(x*) = 1 at x* = (0, 0, 1)
Computational effort, Nf, cited in literature (GRGA, FMIN): 138, 3202
Solution using LSP:
f(x*) = 1.0006 at x* = (0.0111, 0.0000, 0.9889)
LSP initial bounds:
0 ≤ x1, x2, x3 ≤ 1
Convergence criterion: VM ≤ 1E-6
LSP computational effort: Nf = 661, Nkeep = 30

Problem 3-9-0-0. Source: [58]
f(x) = Σ(i=1 to 99) fi(x)^2, where fi(x) = -0.01i + exp(-(ui - x2)^x3/x1) and ui = 25 + (-50 ln(0.01i))^(2/3)
Subject to:
0.1 ≤ x1 ≤ 100
0 ≤ x2 ≤ 25.6
0 ≤ x3 ≤ 5
Solution in literature:
f(x*) = 0 at x* = (50, 25, 1.5)
Computational effort, Nf, cited in literature (GRGA, FMIN): 506, 873
Solution using LSP:
f(x*) = 0.26E-5 at x* = (52.993, 24.924, 1.519)
The problem is very insensitive to x1.
LSP initial bounds: As given in the set of constraints.
Convergence criterion: VM ≤ 1E-6
LSP computational effort: Nf = 6000, Nkeep = 40

Problem 3-10-0-0. Hartman family Source: [19] Solution Improved
f(x) = -Σ(i=1 to 4) ci exp[-Σ(j=1 to 3) aij(xj - pij)^2]
Subject to:
0 ≤ xj ≤ 1 for j = 1, n
Values of ci, aij and pij are given in Table A.1.

Table A.1: Data for m = 4 and n = 3
i    ai1   ai2   ai3   ci    pi1      pi2      pi3
1    3.0   10    30    1.0   0.3689   0.1170   0.2673
2    0.1   10    35    1.2   0.4699   0.4387   0.7470
3    3.0   10    30    3.0   0.1091   0.8732   0.5547
4    0.1   10    35    3.2   0.03815  0.5743   0.8828

Solution in literature:
f(x*) = -3.8626 at x* = (0.1310, 0.5556, 0.8526)
Solution using LSP:
f(x) = -3.08796 at x* = (0.113, 0.868, 0.561)
f(x) = -3.68868 at x* = (0.700, 0.555, 0.857)
f(x) = -3.86131 at x* = (0.095, 0.555, 0.849)
f(x) = -3.86023 at x* = (0.108, 0.552, 0.853)
LSP initial bounds:
0 ≤ x1, x2, x3 ≤ 1
Convergence criterion: VM ≤ 0.01
LSP computational effort: Nf = 1496, Nkeep = 35

Problem 3-11-0-2. Source: [14]
f(x) = 314.16x1x2 + 408.41x1^2 + 100x3
Subject to:
1592.36x1^-2x2^-1 + 0.125x1^-2x2^-1x3 ≤ 1
xx2 + 4xx ≤ 1
xi ≥ 0 for i = 1, 3
Solution in literature:
f(x*) = 91246.018 at x* = (8.648, 21.339, 27.468)
Solution using LSP:
f(x*) = 91260.79 at x* = (8.6241, 21.4560, 27.5314)
LSP initial bounds:
0 ≤ x1, x2, x3 ≤ 40
Convergence criterion: VM ≤ 1E-2
LSP computational effort: Nf = 304113, Nkeep = 30

Problem 3-12-0-2.
Source: [14]f(x) = 4004 + 4.183xL’2+ lOxr5x24Subject to:x1 + x2 10.1975xT5xE+ 0.1975xr5xr’ 1xj 0 for i = 1,3Solution in literature:f(x*) = 280.623= (0.238,0.413,1.704)Solution using LSP:f(x*) = 280.674x = (0.240, 0.410, 1.696)LSP initial bounds:0 x1 50 2 5; 11 5; x3 5; 5Convergence criterion:VM1E-3LSP computational effort:Nf = 2556 Nk8 = 30Problem 3-13-2-0. Source: [46]fl’ \ 2 2 2J ,x) — —x1 — — x3Subject to:x1 + 2x + 3x = 1x+4+4=4The constraints form ellipses.Solution in literature:f(x) = —9.995 and other local optima values —9.051, —4.430, —4.049Four optimal values are given though the corresponding points are not specified.x = (not given)Solution using LSP:f(x*) = —9.9946x = (—0.16401, —2.43407, 2.201072)LSP initial bounds:—2 5; x1 5; 2—2.828 5; 5; 2.828—4 5; .x3 <4Convergence criterion:1w 5; 1E — 4LSP computational effort:Appendix A. MATHEMATICAL TEST PROBLEMS 252N1 = 387287 Niveep = 30Problem 3-14-0-1. Fuel allocation problem. Source: [46], [54]= 1.4609-1- O.15186x + 0.00145xW2 = 1.5742 + 1.631x + 0.001358x= 0.8008 + 0.2031(50 — x1) + 0.000916(50 —= 0.7266 + 0.2256(50 — x1) + 0.000778(50 —f(x) = x2W1 + z3}’Subject to:(1 — x2)W + (1 — x3)Y2 10Solution in literature (53):f(x*) = 3.05= (30, 0.00, 0.58)Solution using LSP:Number of functional evaluation cited in Reklaitis = 8600, AIChE = 2989f(x) = 3.0557= (29.6843,0.0018,0.5736)Convergence criterion:VM1E-5LSP computational effort:Nf = 62275 Niveep = 30Problem 3-15-0-2. Flywheel design Source: [20]f(x) = —0.0201xx24/107Subject to:675 — xx2 00.419— (xix3)/107 00 x1 5 360 x2 50 x3 5 125Solution in literature:f(x*)= —5.684802= (not given)Solution using LSP:f(xj = —5.6830x*= (18.802, 1.909, 108.871)LSP initial bounds: As given in the set of constraintsConvergence criterion:VM<1E-5LSP computational effort:N1 = 114824 Niceep = 30The slow convergence is discussed in Chapter 5, section 5.3.Problem 3-16-0-2. Source: [67]Appendix A. 
MATHEMATICAL TEST PROBLEMS 253Subject to:4(x1 — .5)2 + 2(x — .2)2 + 4 + .1xx2+ .2xx3 162x + 4 — 24 2—2.3 x1,x23 2.7Solution in literature:The problem has four local optima, of which the global value isf(x*) = —11.67664= (0.988, 2.674, —1.884)Solution using LSP:f(x*) = —11.676620= (0.9853,2.6752,—1.8839)LSP initial bounds: As given in the set of constraintsConvergence criterion:Yw<1E—8LSP computational effort:Nf = 1032281 Nkeep = 30Problem 3-17-0-0. Nonlinear regression model fitting. Source: [18]Problem formulation:The following formulation is directly taken from the source literature.An article in Lubrication Engineering (“Accelerated Testing of Solid Film Lubricants”, 1972, pp 365-372) reported on an investigation of wear life for solid film lubricant.Three sets of journal bearing tests were run on a Mil-L-8937 type film at each combination of three loads(3000, 6000 and 10000 psi) and three speeds (20, 60 and 100 rpm) andthe wear life (hours) was recorded for each run. It is also known that a three parameternonlinear model can fit the experimental data. The model is written asw = c * * lbwherew is the wear life (hours)s is the speed (rpm)1 is the load (psi)a, b and c are the model parameters.It is required to determine the parameters for the multiple nonlinear regression modelusing the experimental data given in Table A.2. The model is expected to produce theminimum sum of squares between the recorded and simulated wear life.Method used in the literature:The data was linearized by transforming to logarithmic values so that a multiplelinear model could be used. The linear model was then solved using the least squaresmethod.LSP solution:The nonlinear model was solved using LSP without any linearization. The solutionscited in the literature and those found with LSP are given in Table A.3. The solutionsfound with LSP showed about 4% improvement over the solution cited in the literature.Appendix A. 
MATHEMATICAL TEST PROBLEMS 254—:- i w______ I w20 3000 300.20 60 6000 65.9020 3000 310.80 60 10000 10.7020 3000 333.00 60 10000 34.1020 6000 99.60 60 10000 39.1020 6000 136.20 100 3000 26.5020 6000 142.40 100 3000 22.3020 10000 20.20 100 3000 34.8020 10000 28.20 100 6000 32.8020 10000 102.70 100 6000 25.6060 3000 67.30 100 6000 32.7060 3000 77.90 100 10000 2.3060 3000 93.90 100 10000 4.4060 6000 43.00 100 10000 5.8060 6000 44.50Table A.2: Experimental data used to fit the nonlinear regression modelIt took 198 function evaluations to meet the convergence criterion of VM 1E — 5,and this was achieved in 3 minutes on an 80386 machine.Parameter Literature LSPa -1.20603 -1.2319b -1.39876 -1.3371c 8.3*108 5.6*108Z(w—J5)2 10748.95 10312.1Improvement 4.06%Table A.3: Optimal solution for the regression modelAppendix A. MATHEMATICAL TEST PROBLEMS 255Problem 4-1-0-3. Source: [59]f(x) = 4 + 4 + 24 + 4 — 5x1 — 5x2 — 21x3 + 7x4Subject to:—4 — 4— 4 — 4 — x1 + x2 + x3 + x4 + 8 0—4 — 24— 4 — 24 + x1 + x4 + 9 0—24 — 4— 4 — 2x1 + x2 + x4 + 5 0Solution in literature:f(x*) = 44x = (0,1,2,—i)Computational effort, Nf, cited in literature (NLPQL): 95Solution using LSP:f(x*) = —44.097x = (0.01337, 0.84628, 1.99776, —1.11141)LSP initial bounds:—5x5 fori=1,4Convergence criterion:VM1E-3LSP computational effort:= 16701 Nk€62 = 40Problem 4-2-0-6. Source: [58]f (x) = — — — x13 + x14 + x23 — x24Subject to:8 — — 2x 012 — 4x1 — x2 012 — 3x1 — 4x2 08 — 2x3 — x4 08 — — 2x4 05— 3:3 — x4 00x fori=1,4Solution in literature:f(x*) = _15x = (0,3,0,4)Computational effort, Nf, cited in literature (GRGA, FMIN): 203, 8547Solution using LSP:f(x*)= —14.988xK = (0.0004, 2.9995, 0.004, 3.9976)There is a local optima at x = (3,0,4, 0) with f(x) = 13LSP initial bounds:0 30 3:3,3:4 4Convergence criterion:VM 1E-4LSP computational effort:Nf = 5000 keep = 40Appendix A. MATHEMATICAL TEST PROBLEMS 256Problem 4-3-2-0. 
Source: [52]f(x)=x-b4xSubject to:x1 + 2x — = 1—x1 + x2 + x4 = 0Solution in literature:f(xj = 0.5x = (0.5, 0.25, 0, 0.25)Solution using LSP:f(x*) = 0.5001x = (0.5023, 0.2489,0.0001, 0.2535)LSP initial bounds:0x1 fori=l,4Convergence criterion:VM1E-6LSP computational effort:Nj = 4696 Nkeep 40Problem 4-4-1-1. Source: [58]f (x) = xix4(x1+ x2 + x3) + x3Subject to:x1234— 25 0x + x + 4 + 4 — 40 = 01x5 fori=1,4Solution in literature:f(x*) = 17.0140173Xx = (1,4.7429994, 3.8211503, 1.3794082)Computational effort, Nf, cited in literature (GRGA, FMIN): 411, 6175Solution using LSP:f(x*)= 17.015= (1.0002,4.7565,3.8034, 1.3817)LSP initial bounds:1x5 fori=1,4Convergence criterion:VM1E-5LSP computational effort:Nj = 15186 Nkeep = 40Problem 4-5-0-0. Wood’s function Source: [20], [34], [54]f(X) = 100(x2 — 4) + (1 — x1)2 + 90(x4 — 4) + (1 —+10.1((x — 1)2 + (x4 — 1)2) + 19.8(x2 — 1)(x4 — 1)Subject to:—10x10 fori=1,4Solution in literature:f(X*)=0Appendix A. MATHEMATICAL TEST PROBLEMS 257x = (1,1,1,1)Solution using LSP:f(x*) = 0.00062x = (1.011,1.023,0.988,0.978)LSP initial bounds:—10x10 fori=1,4Convergence criterion:VM1E-3LSP computational effort:Nf = 50824 Nkeep = 40Problem 4-6-1-2. Source: [58]f(x) = 24.55x1 + 26.75x + 39x + 40.50xSubject to:x1 + x2 + x3 + x4 —1 = 02.3x1 + 5x2 + 11.1x3 + 1.3x4 — 5 012x + 11.9x2 + 41.8x3 + 52.1x4 — 21—1.645(0.284 + 0.194 + 20.54 + 0.624)1/2 00x fori=1,4Solution in literature:f(x*) = 29.894378x = (0.355216,—.12E — 11,0.3127019,0.05177655)Constraints are violated.Computational effort, Nf, cited in literature (GRGA, FMIN): 394, 4620Solution using LSP:f(x*) = 29.8947x = (0.6351, 0.0004,0.3126, 0.0518)LSP initial bounds:0x1 fori=1,4Convergence criterion:VM1E-3LSP computational effort:Nf = 13924 Nkeep = 40Problem 4-7-2-0. Source: [59]f(x) = [1 — exp(—10xexp(—x2))]Subject to:x1 + x2 —1 = 02;3 + x4 —1 = 00x1 fori=1,4Solution in literature:f(x*) = 0.974747= (1,0,1,0)Appendix A. 
MATHEMATICAL TEST PROBLEMS 258Computational effort, Nf, cited in literature (NLPQL): 12Solution using LSP:f(x*) 0.9749x = (1.0000, 0.0000, 0.9995, 0.0005)LSP initial bounds:0x1 fori=1,4Convergence criterion:VM1E-3LSP computational effort:Nf = 800 Niceep = 40Problem 4-8-0-3. Source: [14] Solution Improvedf(x) =x14+ 2x1 -1- x14 + x1Subject to:x’x2 + xj’ 1x1’ 1X13 1x0 fori=1,4Solution in literature:f(x*)= 11.946= (1.267,0.267,0.789,1.267)f(x*) = 9.919 when the given x value is used.Solution using LSP:f(x*)= 9.8089= (1 .3216, 0.3216, 0.7566, 1.3219)LSP initial bounds:0x2 fori=1,4Convergence criterion:VM1E-5LSP computational effort:Nf = 178145 Nkeep = 40Problem 4-9-0-0. Shekel’s family. Source: [19]m nf(x) = — [((x — ajj)2) + ej]’i1 j=1Values of and c3 are given in Table A.4.Subject to:0 <x3 < 10 for j = 1,nSolution in literature:f(x*)= —10.1532= (4,4,4,4)Solution using LSP:Appendix A. MATHEMATICAL TEST PROBLEMS 259i q1 4 4 4 4 0.12 1 1 1 1 0.23 8 8 8 8 0.24 6 6 6 6 0.45 3 7 3 7 0.4Table A.4: Data for m = 5 and n = 4f(xj = —10.1528= (4.002,4.000,3.999,4.001)LSP initial bounds:0x10 fori=1,4Convergence criterion:VM1E-3LSP computational effort:N1 = 14008 Nkeep = 80Problem 4-10-1-2. Source: [26] Solution Improvedf(x) = 4 + 24 + 2x3 — 2x —Subject to:— 3x1— 3 = 0x1 + 2x3 4x2 + x4 4x1 3X4 2xj0 fori=1,4Solution in literature:f(x*)—2.07x = (4/3,4, 0, 0)Solution using LSP:f(x*) = —3.1325= (0.0000, 3.0000, 0.0001, 0.9995)LSP initial bounds:0 x1 30 x2 40 x3 20 x4 2Convergence criterion:VM1E-7LSP computational effort:Nf = 5035 Nkeep = 40Problem 4-11-0-0. Source: [68]Appendix A. MATHEMATICAL TEST PROBLEMS 260Problem formulation:The following formulation is directly taken from the source literature.An electronic products manufacturer undertook the production of a new product intwo locations (location A: coded x = 1, location B: coded x = 0). Location B has moremodern facilities and hence was expected to be more efficient than location A, even afterthe initial learning period. 
An industrial engineer calculated the expected unit production cost for a modern facility after learning has occurred. Weekly unit production costs for each location were then expressed as a fraction of this expected cost. The reciprocal of this fraction is a measure of relative efficiency, and this measure was utilized as the efficiency measure in this study.
The model used was
Y_i = γ0 + γ1 x_i + γ3 exp(γ2 w_i) + ε_i
Here γ0 is the upper asymptote for location B as w gets large, and γ0 + γ1 is the upper asymptote for location A. The parameters γ2 and γ3 reflect the speed of learning, which was expected to be the same in the two locations.
The data on location (x), week (w) and relative efficiency (Y) are presented in Table A.5. Note that the relative efficiency in location B toward the end of the 90 week period even exceeded 1.0, i.e. the actual unit costs then were lower than the industrial engineer's expected unit cost.
Location Week Rel. eff. | Location Week Rel. eff.
1  1  0.483 | 0  1  0.517
1  2  0.539 | 0  2  0.598
1  3  0.618 | 0  3  0.635
1  5  0.707 | 0  5  0.75
1  7  0.762 | 0  7  0.811
1 10  0.815 | 0 10  0.848
1 15  0.881 | 0 15  0.943
1 20  0.919 | 0 20  0.971
1 30  0.964 | 0 30  1.012
1 40  0.959 | 0 40  1.015
1 50  0.968 | 0 50  1.007
1 60  0.971 | 0 60  1.022
1 70  0.96  | 0 70  1.028
1 80  0.967 | 0 80  1.017
1 90  0.975 | 0 90  1.023
Table A.5: Experimental data used to fit the nonlinear regression model
Method used in the literature:
Source: [57]f(x) = .so(x) + xTHx + qTx + aH is (n x n) matrix, q R, a R and T designates the transpose of a matrix.Subject to:si x) + 40.99 = 02 x) + 3.98 093X +4.610s4x +32.060.s5x —2.7106 x + 2.27 01x4 fori=1,2,3,4whereso(x) = .32x8x’6— .88x6+ .81x°748+ .66x5 — .83x34027s1(x) = .96— 1.12x3x+ 3.13x’x + 3.3x — 4.84xj13= 5.39 + 1.98xxx’ — 4.14x2 — 1.88xx4s3(x) = .99 — 4.64x3x’— .57x4s4(x) = —3.23— 1.89x — 1.18x — 2.01xx’— 1.99xs5(x) = 3.45 — 2.98xx— .96xx — 2.90x’x + 3.39xs6(x) = —.51 — 2.89xx3— 6.24x + 2.15xx4171.49 200.07 156.82 —3.12200.07 78.48 86.63 —12.85H=156.82 86.63 51.75 —20.46—3.12 —12.85 —20.46 —5.07q = (—908.19, —546.53, —407.35, 64.17)Ta = 1499.97Solution in the literature:x = (1.35, 2.17, 2.80, 3.71)Appendix A. MATHEMATICAL TEST PROBLEMS 262Objective function calculated at optimal point = -0.2298Solution using LSP:f(x*) = —0.2716x = (1.341152, 2.190871, 2.787508, 3.715618)LSP initial bounds: As given in the set of constraints.Convergence criterion:V 1E—4LSP computational effort:Nf = 602, 075 Nkeep 40Appendix A. MATHEMATICAL TEST PROBLEM’S 263Problem 5-1-3-0. Source: [36], [58]f (x) = (xi — x2) + (x2 — x3)2 + (x3 — x4) + (x4 —Subject to:x1 + 4 + 4 —3 = 0— 4 + x4 —1 = 0x15 — 1 = 0Solution in literature:f(x*) = 0x = (1,1, 1, 1, 1)Computational effort, Nf, cited in literature (ORGA, FMIN): = 623, 472 [58]Solution using LSP:f(x) = 8.8 * 10—6= (1.0006, 1.0008,0.9978, 0.9949,0.9994)LSP initial bounds:0x13 fori=zl,5convergence criterion:a1E—6LSP computational effort:Nf = 1008 Nkeep = 50Problem 5-2-0-2. Source: [14]f(x) = 10xE25x’5-1- 20xjx345+ 15xx’xESubject to:.01x1x2+ .02x2x45 1.04x3+ .02x3+ .02x4x3+ .06xE’x2x.5xt 1x0 fori=1,5Solution in literature:f(x*) = 37.216= (11.425,8.747,0.787,2.401,1.179)Solution using LSP:f(x*) = 37.226The problem has many global optimal points, of which one is= (4.6472, 16.3588, 9.3017, 1.7066, 12.5463)LSP initial bounds:0x20 fori=1,5Convergence criterion:Vw<1E—3LSP computational effort:Nf = 40887 = 50Problem 5-3-0-6. 
Source: [34], [28]f(x) = 5.35785474 + 0.8356891x1x5+ 37.293239x1— 40792.141Subject to:0 85.334407 + 0.0056858x2x5+ 0.0006262x1x4— 0.0022053x35 9290 80.51249 + 0.0071317x + 0.0029955x2+ 0.00218134 110Appendix A. MATHEMATICAL TEST PROBLEMS 26420 < 9.300961 + 0.0047026x3x5+ 0.0012547x13+ 0.0019085x34 2578 x1 10233 x2 4527 x3,x4, x5 45Solution in literature:f(x*) = —30665.5= (78, 33, 29.995, 45, 36.776)Solution using LSP:f(x*)= —30633.80xc= (78.00016,33.04861, 30.1344,44.99984, ,36.57489)LSP initial bounds: As given in the set of constraintsConvergence criterion:VM1E-3LSP computational effort:Nj = 5876 Niceep = 50Problem 5-4-3-0. Source: [58]f(x) = fri — x2) + (x2 + x3 — 2)2 + (x4—1)2 + (x5—1)2Subject to:x1 + 3x2 = 0x3 + x4 — 2x5 0—:r5 = 0—10x<10 fori=1,5Computational effort, N, cited in literature (GRGA, FMIN): = 82, 2172Solution in literature:f(x) = 176/43= (—33,11,27,—5,11)/43Solution using LSP:f(xj = 4.0973x = (—0.7911,0.2637,0.5934,—0.0660,0.2637)LSP initial bounds:—10x10 fori=1,5Convergence criterion:VM<1E-3LSP computational effort:N = 619 Niceep = 50Appendix A. MATHEMATICAL TEST PROBLEMS 265Problem 6-1-0-2. Transformer Design Source: [5], [56], [58]f(x) = 0.0204xx4(xi+ x2 + x3) + 0.0187x2x3(xi+ 1.57x2 + x4)+0.0607xixx(x + x2 + x3) + 0.0437xx(xi + 1.57x2 + x4)Subject to:0.001x123456— 2.07 01 — 0.00062xx(x+ x2 + x3) — 0.00058x2x3(x1+ 1.57a2 + x4) 00x fori=1,65.27 xi 5.814.29 4.7110.14 < x3 10.5611.89 x4 12.310.73 x5 1.27Bounds are taken from [56].Solution in literature:f(x*)= 135.075961x = (5.332666,4.651744, 10.43299, 12.08230, 0.7526074, 0.87865084)Computational effort, Nf, cited in literature:(GRGA, FMIN): =. 2577, 3419 [58] and 4360355 [56]Solution using LSP:The problem has different near optimal points, of which one isf(x*)= 135.08276= (5.28766,4.68762, 10.40192, 12.15321, 0.75406, 0.87688)LSP initial bounds: As given in the set of constraints.Convergence criterion:VM 1E4LSP computational effort:N1 272225 Nkeep = 60Problem 6-2-0-6. 
Hesse’s Function Source: [66]f(x) = —25(xi — 2)2— (x2 — 2)2— (x3 — 1)2— (x — 4)2—(x5 — 1)2— (x6 — 4)2Subject to:x1 + x2 2x1 + x2 <6—x1 + x2 2— 3x2 < 2(x — 3)2 x4 4(x5 — 3)2 + x6 4x1,x2 01 <X3 50 <X4 61 <x5 50 x6 10Solution in literature:f(x) = —310The problem has 18 local optima one of which is global.Appendix A. MATHEMATICAL TEST PROBLEMS 266= (5,1,5,0,5,10)Solution using LSP:f(x*) = —309.997x = (5.00000, 1.00000, 4.99999,0.00030, 4.99996, 9.99997)LSP initial bounds: As given in the set of constraints.Convergence criterion:VM1E-5LSP computational effort:N1 = 14064 Nkeep = 60Problem 6-3-6-2. Source: [58]f(x) = x1 + 2x + 4x5 + exp(xix4)Subject to:x1 + 2x + 5x — 6 = 0x1 + x2 + x3 —3 = 0x4 + x5 + x6 —2 = 0x1 + z4 —1 = 0x2 + x5 —2 = 0x3 + x6 — 2 = 00x1 fori=1,61Solution in literature:f(x*) = 19/3xK = (0,4/3, 5/3, 1, 2/3, 1/3)Computational effort, N1, cited in literature (FMIN): = 4767Solution using LSP:f(x*)= 6.335xx= (.001, 1.334, 1.665, 0.999, 0.666, 0.335)LSP initial bounds:0 i,X4 10 X2,X3,Xs6 2Convergence criterion:VM1E-3LSP computational effort:N1 = 4966 Nkeep = 60Problem 6-4-4-0. Source: [34], [58]f(x) = fi(xi) + f2(x2))I 30x1 0<x1<30031x 300 x1 <400Appendix A. MATHEMATICAL TEST PROBLEMS 26728x 0<x2<100f2(x) 29x 100 x2 <20030x2 200 < x2 < 1000Subject to:= 300—1,8co.s(1.48577— x6) + O.9O798r (1 47588)= 13L078c08(1.48577 + x6) + OO798X (1 47588)=—,sin(1.48577+ x6) +017sin(1.47588)200—13x478sin(1. 
8577— x6)-I- O.9O798Xi(1 47588) = 00 4000 < X2 1000340 < x3 <420340 x4 420—1000 X5 10000 x6 0.5236Solution in literature:Results differ in the precision of the x vector as a discontinuityin the constraint derivatives forces jump changes in f(x) and x.High precision: f(x*) = 8927.5888 [34], [58]x = (107.81, 196.32, 373.83,420.00, 21.31, 0.153)Moderate precision: f(x*) = 8853.44 or 8953.40 [34]x = (201.78, 100.00, 383.07, 420.00, —10.907, 0.07314)Computational effort, Nf, cited in literature (, GRGA, FMIN): = 1428, 4630 [58]Solution using LSP:f(x*) = 8889.881x = (202.996, 100.000, 383.071,419.999, —10.914,0.073)LSP initia’ bounds:As given in the set of constraintsConvergence criterion:VMlE-4LSP computational effort:Nf = 3924 Nkeep 60Problem 6-5-3-12. Source: [26] Solution Improvedf(x) = x6 + x + x + 2x4 + 5x — 4x3 — x6 Subject to:— 3x1—3x4 = 0—2x— 2x5 = 04x—= 0x1 + 2x4 <4+ x5 4Appendix A. MATHEMATICAL TEST PROBLEMS 268X3 + x6 6342x0 fori=1,6Solution in literature:f(x*)= —11.96x = (0.67,2,4,0,0,0)Solution using LSP:f(x*)= —13.402x = (0.1667, 2.0000,4.0000, 0.5000, 0.0000, 2.0000)LSP initial bounds:0 30 <x2 40 x3 40 <X4 20 <X5 20 x6 6Convergence criterion:VM1E—7LSP computational effort: Nf = 25927 Nkeep = 60Problem 6-6-0-4. Source: [8]f(x) = 10.5x— 3.95x2 + 3x + 5x4--1.5x5— 1.5x6—1.5x——— 2x—— 2.5xSubject to:XH-x2+34xs+x6<500x1 + 3x2 + 6x3 + 2x4 503x5 + 4x6 50x3 + 2x4 -I- 3x5 + x6 < 3500x99 fori=1,2,...,6Solution in literature:f(x*)= —70262.05x = (99, 99, 53, 99, 0, 99)Solution using MIJ\TOS 5.], as reported in the literaturef(x*) = —69181.04x’ = (99, 99, 99, 76, 0, 99)Solution using LSP:f(x*)= —70261.953= (98.99994, 98.99965, 52.99992, 98.99999, 0.00002, 99.00000)LSP initial bounds: As given in the set of constraints.Convergence criterion:VM 1E—2LSP computational effort:Nf = 46986 Nkeep 60Appendix A. MATHEMATICAL TEST PROBLEMS 269Problem 7-1-0-24. 
Source: [13]f(x) = .7854xix(3.3333x-f- 14.9334x3— 43.0934) — 1.5080xi(x + x)+7.4770(x + 4) + 0.7854(x + x54)Subject to:x1zx3 > 27> 397.5x23/x 1.93a1 = ((745 *x4/x2) + 16.91 * 106).5a2 = ((745 *5/x3+ 157.5 * 106).5= 0.1 *b2 = 0.1 * 4a1/b < 1100a2/b < 850x23 40x1/x2 5121.5x6 + 1.9 < X41.1x7 + 1.9 < x53 4.6 x2 .817 x3 <206.6 r4 8.37.3 < x5 8.32.9 < x6 < 3.95 X7 5.5Solution in literature:f(x*)= 2994.47x’ = (3.5, 0.7, 17, 7.3, 7.71, 3.35,5.287)Solution using LSP:f(x*)= 2994.5x’ = (3.50, 0.70, 17.00,7.30, 7.716, 3.351, 5.287)LSP initial bounds: As given in the set of constraints.convergence criterion:V,1<1E-3LSP Number of functional evaluation:Nf = 82826 = 70Problem 7-2-2-3. Source: [59]f(x) = —5x1—5x2— 4x3—x13— 6x4— (5x)/(1 + x5)—(8x6)/ 1 + x6) — 10(1 — 2exp(—xr) + exp(—2x7))Subject to:10—x 05— x1 0x3xg—402x4+0.8x6xr—5=0Appendix A. MATHEMATICAL TEST PROBLEMS 2704 + 4 + 4 + 4 —5 = 0Solution in literature:f(x*)= —37.4130= (1.47,1.98,0.35,1.2,0.57,0.78,1.41)Computational effort, N1, cited in literature (NLPQL): = 437Solution using LSP:f(x*)= —37.352x = (1.271, 1.972, 0.536, 1.220, 0.498, 0.758, 1.455)LSP initial bounds:0 x2,x3,x5,x6,x7 <2.30 5; x1,x4 5; 5Convergence criterion:VM1E-3LSP Number of functional evaluation:N1 =6973724 IN’keep = 50Problem 7-3-0-4. Source: [61]f(x) =(xi—10)+5(x2—12)-I-(4+3(x—110*$+74+4—4x0x8Subject to:24-I- 34 + x3 + 44 -1- 5x — 127 <07x1 + 3x2 + 104 + x4 — — 282 5; 023x1 + 4 + 64 — 8x7 — 196 5; 044 + X2 — 3x1x2 + 245x6 — 11X 5; 0Solution in literature:f(x*)= 680.63005743= (2.330517, 1.9513717, —0.47759196,4.365728, —0.6245119, 1.0381598, 1.5942702)Computational effort, N1, cited in literature: = 4368Solution using LSP:f(x*)= 680.6488= (2.33976,1.94295, —0.44145,4.38498, —0.63151,1.04464,1.60214)LSP initial bounds:—5<x1< fori=1,7Convergence criterion:VM1E-3LSP Number of functional evaluation:N1 = 68027 = 70Appendix A. MATHEMATICAL TEST PROBLEMS 271Problem 8-1-0-10. 
Source: [39]f(x) = —1 * [0.034x1x5+ 0.1225x3x7— 0.280125xx6—0.191625x48+ 50x7 + 145.6875x6+ 145.6875x8+50x — 47.52x5 — 47.52x6— 44.3625x — 44.3625x]Subjected to:x15 + x26 > 2000 + 80x5 + 80x6X37 + X43 7000 + 65x7 + 65x8X15 + x37 < 10000 + 89.286(x + x7)X26 + X48 3000 + 166.66(x7+ x8)4x5 + 4x6 10005x7+5x8<700x5 H- x6 <233X7 + x8 150H- .X7 200X6 + X8 10048 5 12048 20042 x3 12042 x4 200Solution in literature:f(x*) = —16592.55x” = (120,53.51,120,108.87,122.89,110.1,77.1,62.89)The first two constraints are slightly violatedSolution using LSP:f(x*) = —16563.059x = (120,57.41, 119.98, 110.28, 116.06, 116.95, 83.95, 56.06)LSP Initial bounds:48 12048 $ 20042 12042 x4 2000 2330 x 2330 1400 <x8 140convergence criterion:V1E—3Computational effort:N1 = 1426144 Nkeep = 80Problem 8-2-0-6. Source: [58]f(x) = .4xf7x67 + .4$7x67+ 10 ——Subject to:1— .0588xx7—.1x 01— 0.0588x68—.1x—.1x2 > 01 — 4x3x1— 2x71x’ — 0.0588x137— 1 01 — 4xx1— 2x711— 0.0588x’3x8— 1 0Appendix A. MATHEMATICAL TEST PROBLEMS 2721f(x)4.20.1x10 fori=1,8Solution in literature:f(x*) 3.9511634396= (6.465114,2.232709, 0.6673975, 0.59575645.932676, 5.527235, 1.013322, 0.4006682)Computational effort, iVf, cited in literature (GRGA, FMIN): = 5929, 9544Solution using LSP:f(x*) = 3.960746x = (6.092, 2.550, 0.706, 0.611, 6.016,5.546, 1.105, 0.416)LSP Initial bounds:0.1x10 fori=1,8convergence criterion:VM 1E-6LSP Computational effort:Nf = 192453 Nkeep 80Problem 8-3-0-7. Source: [14]f(x) = 19.4xT7 + 16.8x6 + 91.5x3+ 19.4xZ’47+86x138 + 152x27 + 16.8x°+ 27.4x63Subjected to:0.537x1x23+ 4.47xx56+ 0.386x7x8 1.65xT’ 1.15xTx’ ’ 1.65x’ 1.15xx’ 1.60x’ 1.20x’x’ 1x0 fori=1,8Solution in literature:f(x*) = 550.642x’ = (0.937, 0.901,0.676,0.671,0.331,0.467, 1.096, 0.554)Solution using LSP:f(x*) = 551.1217= (0.940, 0.954, 0.656, 0.693, 0.326, 0.487, 1.090, 0.581)LSP Initial bounds:0x3 fori=1,8LSP convergence criterion:VM1E—3LSP Computational effort:JVj = 5485 Nkeep = 80Appendix A. MATHEMATICAL TEST PROBLEMS 273Problem 9-1-0-13. 
Source: [58]f (x) = —.5 * (x1x4 — x23 + x39 — x59 + x5 — x6y)Subject to:1—— x 01—— x 01— (xi— — (x2 — x6)2 01— (x1— x7)2 — (x2 — x8)2 01 — (x3— x5)2 — (x4 — xg)2 01 — (x3— — (x4— xs)2 01 2 1—x39 0X58— X67 01— 4 01—— (x2— x9)2 0Xi4 — X23 0x59 0Solution in literature:f(x*) = —0.8660254038x = (.884129, .4672425, .03742076, .9992996, .884129, .467242, .03742076, .99929996, 0)Computational effort, A/f, cited in literature (ORGA, FMIN):= 16857, 16724GRGA did not find the global solutionSolution using LSP:f(x*) = —0.8657007x = (.96099, .29952, .23727, .917137, .95973, .28040, .24091, .99339, .02330)LSP initial bounds:0 x 1 fori= ito 9convergence criterion:Yw1E—7LSP computational effort:Nf = 60500 iVkeep 80Problem 9-2-6-0. Source: [59]f(x)=ExSubject to:x1 + x2 *6(—5r3) + x4 — 127 = 0r1 + x2 * e(3Ta) + x5 — 151 = 0x1 + x2 * + x6 — 379 = 0x1 + x2 * e(xs) + x7 — 421 = 0x1 + x2 * e(3T3) + x8 — 460 = 0xi + X2 * e() + x9 — 426 = 0Solution in literature:f(x*) = 13390.1Appendix A. MATHEMATICAL TEST PROBLEMS 274x’ = (523.31, —156.95, —0.2, 29.61, —86.62,47.33, 26.47, 22.92, —39.47)Computational effort, iif, cited in literature (NLPQL):= 2443The constraints are slightly violated, possibly due to round off.Solution using LSP:f(x*) = 13390.102x’ = (523.45996094, —157.13957214, —0.19949156, 29.60516357, —86.56939697,47.37318420, 26.26046944,22.91170883, —39.50439453)LSP initial bounds:400 5; x < 600—200 5; 5; —100—10 5; X3 5; 00 5; 5; 600000—10000 ioooo0 5; 5; 800—10 5; x7 5; 400—10 5; 5; 400—600 5; x9 5; 10Convergence criterion:VM1E-3LSP computational effort (number of functional evaluation):Nf 16601 Nkeep 90Appendix A. MATHEMATICAL TEST PROBLEMS 275Problem 15-1-0-29. 
Source: [58]f(x)= E(2.3x3k+I + 0.00014+ + l.7X3k+2 + 0.00014+2 +2.X3k+3 + 0.OOO1543)Subject to:O— x33_2 + 7 130— x3_1 + 7 140 x33— x3 + 7 13forj 1,4xi+x2+x—600x4 + x5 + x6 —50 07+xs+x9—700x10 + x11 + x12 — 85 0x13-I- x14 + xiS — 100 08< x1 2143 573 x3 160 X3k+1 900 X3k+2 1200 X3k+3 60for k = 1,4Solution in literature:f(x*) = 664.8204500= (8,49,3,1,56,0,1,63,6,3,70,12,5,77,18)Computational effort, Nf, cited in literature (GRGA): = 3857Solution using LSP:f(xj = 668.921x = (8.113,48.861,3.031, 2.081,48,934,0.070,8.067, 55.933,6.002, 13.793, 62.804, 8.405, 16.043, 69.665, 14.305)LSP initial bounds: As given in the set of constraints.convergence criterion:VM1E-3LSP computational effort:Nf = 3805238 Niveep = 120Appendix A. MATHEMATICAL TEST PROBLEMS 276Problem 20-1-1-0. Source: [59] Solution improvedf(x) = i(x + x)Subject to:1x=lSolution in literature:f(x*) = 1.91667x = (0.91287, 0.408286, —0.00017, —0.0000054, 0.0000, —0.0000089,)0.0000082, —0.000014, 0.000022, —0.0, 0.0000135, —0.000004,—0.000011, —0.000013, 0.0, 0.00002, 0.00000546, —0.000009, —0.00001,0.0)Computational effort, N, cited in literature (NLPQL): = 198Solution using LSP:f(x*) = 1.91836x = (—0.9112, —0.4116, —0.0101, —0.0087,0.0077, 0.0021, —0.0050,—0.0016,0.0051,0.0008, —0.0013,0.0047, —0.0034,0.0010,0.0013. 0.0004, —0.0041, —0.0024, —0.0036, 0.0005)LSP initial bounds:—1<x1 fori=1,20Convergence criterion:VM1E-6LSP computational effort:Nf 767986 Nkeep 120The LSP solution is inferior by 0.088% and different from the one given in theliterature, but multiple local optima were observed using LSP. All ordinateswith negative and positive sign give optimal solution. There are 220 possiblesolution points if the initial bounds are —1 < < 1 for i = 1,20.Appendix BDIRECT SEARCH METHODSTo give a wider perspective of direct search methods some of the most common methods which appear in the literature are described below. 
The descriptions are presented verbatim from the primary sources cited.
Blind search [Leon, 1966]: The most elementary type of sampling procedure is one in which trial points are generated randomly within the feasible region. This search method simply selects a starting feasible point x0, evaluates f(x) at x0, and then randomly selects another feasible point x1 and evaluates f(x) at x1. In effect, both the search direction and the step length are chosen simultaneously. The current point has to provide a better value of the objective function to be retained; otherwise it is discarded. Such a process continues until a specified number of points have been tested or a specified computational effort expended.
A slightly more efficient approach is to divide the sampling range into a set of subareas known as blocks. Initially the search is carried out in each block separately. The block providing the best values is used to initiate the next phase of the search, which is executed within progressively smaller blocks.
Grid search: A series of points are evaluated about a reference point selected according to some type of design. Then move to that point which improves the objective function the most, and repeat. If each dimension of the search domain is divided into b sections, there will be b^n grid points for an n dimensional problem. This method therefore requires function evaluations at b^n - 1 points in addition to the reference point. For n = 10, we must examine 3^10 - 1 = 59,048 values of f(x) if a three-level factorial design is to be used, which is a prohibitive number of function evaluations [Edgar & Himmelblau, 1988] for an equivalent Nkeep = 3 and a single iteration.
Univariate search [Edgar & Himmelblau, 1988]: Select n fixed search directions (usually the coordinate axes) for an objective function of n variables. Then f(x) is minimized in each direction sequentially using a one dimensional search.
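The one-variable-at-a-time scheme just described can be sketched as follows; the golden-section line search, the bounds, and the sweep count are illustrative choices, not part of any cited implementation:

```python
import math

def golden_section(g, lo, hi, tol=1e-8):
    """Golden-section minimization of a one-dimensional function g on [lo, hi]."""
    invphi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    while b - a > tol:
        if g(c) < g(d):
            b, d = d, c
            c = b - invphi * (b - a)
        else:
            a, c = c, d
            d = a + invphi * (b - a)
    return (a + b) / 2

def univariate_search(f, x, bounds, sweeps=5):
    """Minimize f by a one-dimensional search along each coordinate axis in turn."""
    x = list(x)
    for _ in range(sweeps):
        for i, (lo, hi) in enumerate(bounds):
            def g(t, i=i):
                y = x[:]          # vary coordinate i, hold the others fixed
                y[i] = t
                return f(y)
            x[i] = golden_section(g, lo, hi)
    return x

# Separable quadratic: the valleys line up with the axes, so the
# coordinate searches converge directly to the minimum at (1, -2).
f = lambda x: (x[0] - 1) ** 2 + 3 * (x[1] + 2) ** 2
xs = univariate_search(f, [0.0, 0.0], [(-5.0, 5.0), (-5.0, 5.0)])
```

Each pass minimizes f along one coordinate while all other coordinates are held at their current values.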
While this method is effective for a quadratic function of the form
f(x) = Σ c_j x_j^2
because the search directions line up with the principal axes, it does not perform satisfactorily for the more general quadratic objective function of the form
f(x) = Σ_i Σ_j d_ij x_i x_j
where c_j and d_ij are coefficients.
Conjugate direction methods [Edgar & Himmelblau, 1988]: Conjugate direction methods are presented as methods for optimizing strictly convex quadratic functions. Experience has shown that conjugate directions are more effective as search directions than arbitrarily chosen search directions or even orthogonal search directions. Two directions S1 and S2 are said to be conjugate (or conjugated) with respect to each other if
(S1)^T Q (S2) = 0
where the matrix Q is the Hessian matrix of the objective function, H. In general, a set of n linearly independent search directions S0, S1, ..., S(n-1) are said to be conjugate with respect to a positive definite square matrix Q if
(Si)^T Q (Sj) = 0 for all i ≠ j, 0 ≤ i, j ≤ n - 1
The conjugate directions exist only for a quadratic approximation of the function at a single stage k. Once the objective function is modeled by a new approximation at stage (k + 1), the directions of stage k are unlikely to be conjugate to any of the directions selected at stage (k + 1).
Powell's conjugate direction methods [Reklaitis et al., 1983]: This algorithm uses the history of the iterations to build up directions for acceleration and at the same time avoids degenerating into a sequence of coordinate searches.
It is based upon the model of a quadratic function.
The motivation for the algorithm stems from the observation that if a quadratic function in n variables can be transformed so that it is just the sum of perfect squares, then the optimum can be found after exactly n single variable searches, one with each of the transformed variables.
Simplex method [Edgar & Himmelblau, 1988]: It uses a regular geometric figure (a simplex) to select points at the vertices of the simplex at which to evaluate f(x). The simplex search method is based upon the observation that the first-order experimental design requiring the smallest number of points is the regular simplex. In n dimensions, a regular simplex is a polyhedron composed of n + 1 equidistant points, which form its vertices.
The method begins by setting up a simplex in the space of the independent variables as
x_ij = b_i^L + e_ij (b_i^U - b_i^L)
where e_ij is a random deviate uniformly distributed over the interval [0, 1], and b_i^L and b_i^U are the lower and upper bounds respectively for each variable i.
The objective function is evaluated at each vertex. The vertex with the highest function value (for a minimization problem) is located and labelled as the "worst" vertex. Before throwing away the "worst" vertex, it is reflected through the centroid to generate a new trial point, which is used to complete the next simplex. As long as the objective function decreases smoothly, the iterations move along crabwise until either the minimum is straddled or the iterations begin to cycle between two or more simplexes.
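The reflect-the-worst-vertex step can be sketched as follows (reflection only; the full method also handles cycling and size reduction, and all names here are illustrative):

```python
def reflect_worst(simplex, f):
    """Replace the worst vertex by its reflection through the centroid
    of the remaining vertices."""
    vals = [f(v) for v in simplex]
    w = vals.index(max(vals))                 # worst vertex (minimization)
    rest = [v for i, v in enumerate(simplex) if i != w]
    centroid = [sum(c) / len(rest) for c in zip(*rest)]
    reflected = [2 * g - xw for g, xw in zip(centroid, simplex[w])]
    out = list(simplex)
    out[w] = reflected
    return out

f = lambda x: x[0] ** 2 + x[1] ** 2
simplex = [[2.0, 2.0], [0.0, 1.0], [1.0, 0.0]]
simplex = reflect_worst(simplex, f)   # (2, 2) reflects through (0.5, 0.5) to (-1, -1)
```

The reflected point lies on the line through the rejected vertex and the centroid, at the same distance on the opposite side.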
Then the simplex sides are reduced and the search continues until a prescribed side length is reached, or until the standard deviation of the function values at the vertices becomes smaller than a prescribed value.

Complex method [Box, 1965]: The constrained simplex (Complex) method searches for the minimum value of a function f(x) subject to m constraints. The simplex direct search is based on the generation and maintenance of a pattern of search points, and on projections of undesirable points through the centroid of the remaining points as the means of finding new trial points. In the presence of inequality constraints, if the new point is infeasible it is a simple matter to retract it toward the centroid until it becomes feasible.

It is assumed that an initial point x_0 which satisfies all constraints is available, and that the feasible region is convex.

For an n-variable problem, k \ge n + 1 points are used, of which one is the given point. The further (k - 1) points required to set up the initial configuration are obtained one at a time by the use of random numbers and the ranges of the independent variables.

The function is evaluated at each vertex, and the vertex with the highest function value (for a minimization problem) is replaced by a point \alpha \ge 1 times as far from the centroid of the remaining points as the reflection of the worst point in the centroid, the new point being collinear with the rejected point and the centroid of the retained vertices. If the trial point is again the worst, it is moved halfway towards the centroid of the remaining points to give a new trial point. Typical values of k and \alpha are 2n and 1.3 respectively.

If a trial vertex does not satisfy a constraint on some independent variable x_i, that variable is reset to a value 0.000001 inside the appropriate limit.

Different runs are required to identify the global solution.
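The reflection and retraction rules of the Complex method can be sketched as a single step; the objective function and the starting points here are made-up illustrative values, while \alpha = 1.3 follows the typical value quoted above.

```python
def centroid(points):
    """Centroid of a list of equal-length points."""
    n = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(n)]

def reflect_worst(points, f, alpha=1.3):
    """One Complex-method step: reflect the worst vertex through the centroid
    of the remaining points by a factor alpha; if the trial point is still
    the worst, move it halfway back toward the centroid."""
    worst = max(range(len(points)), key=lambda i: f(points[i]))
    rest = [p for i, p in enumerate(points) if i != worst]
    c = centroid(rest)
    trial = [ci + alpha * (ci - wi) for ci, wi in zip(c, points[worst])]
    if f(trial) > max(f(p) for p in rest):          # still the worst: retract
        trial = [(ti + ci) / 2.0 for ti, ci in zip(trial, c)]
    return worst, trial

# Illustrative use on f(x, y) = x^2 + y^2
f = lambda p: p[0] ** 2 + p[1] ** 2
pts = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
idx, new_pt = reflect_worst(pts, f)
assert idx == 3 and f(new_pt) < f(pts[3])
```

A full implementation would repeat this step, check the m constraints on each trial point, and reset any out-of-bounds variable to 0.000001 inside its limit, as described above.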
This method, too, is therefore a local search.

The Hooke-Jeeves pattern search [Reklaitis et al., 1983]: This method is a combination of "exploratory" moves of the one-variable-at-a-time kind with "pattern" or acceleration moves regulated by some heuristic rules [Reklaitis et al., 1983]. The exploratory moves examine the local behaviour of the function and seek to locate the direction of any sloping valleys that might be present. The pattern moves utilize the information generated in the exploration to step rapidly along the valleys.

A pattern move consists of a single step from the present base point along the line from the previous to the current base point. Thus a new pattern point, x_p^{(k+1)}, is calculated as

x_p^{(k+1)} = x^{(k)} + (x^{(k)} - x^{(k-1)})

where
x^{(k)} is the current base point
x^{(k-1)} is the previous base point
x_p^{(k+1)} is the pattern move point
x^{(k+1)} is the next (new) base point

Recognizing that this move may not result in an improvement, the point x_p^{(k+1)} is accepted only temporarily. It becomes the temporary base point for a new exploratory move. If the result of this exploratory move is a better point than the previous base point x^{(k)}, then this point is accepted as the new base point x^{(k+1)}. On the other hand, if the exploratory move does not produce improvement, then the pattern move is discarded and the search returns to x^{(k)}, where an exploratory search is undertaken to find a new pattern.

Because of its reliance on coordinate steps, the algorithm can, however, terminate prematurely, and in the presence of severe non-linearities will degenerate to a sequence of exploratory moves without benefit of pattern acceleration.

Simulated annealing [Corana et al., 1987]: Simulated annealing means simulating the annealing process by a Monte Carlo method (random changes in the state of the system), where the global minimum of the objective function represents the low-energy configuration [Torn & Zilinskas, 1988].
The method discriminates between the "gross behaviour" of the objective function and its finer "wrinkles": the gross behaviour directs the search to the region where the global optimum should be present, and a finer local search is then performed within the neighbourhood of the presumed global optimum. Even so, this method does not guarantee finding the global optimum. For this reason it is referred to as a suitably perturbed local search procedure [Kan et al., 1989].

The method proceeds iteratively, starting from a given point and generating new candidates around the current point by applying random moves along each coordinate direction, one after the other. A candidate point is accepted if it produces an objective function value less than that of the previous point; if the value is greater, it is accepted with probability P. This probability is given by

P = exp(-\Delta f / T)

and is a function of the ratio of the increase in objective function, \Delta f = f(x^{k+1}) - f(x^k), to the temperature T. The temperature T starts at a large value and is reduced progressively until no further useful objective function improvement is expected. An important characteristic of this random procedure is that the next accepted point may have a higher objective value than the previous one [Torn & Zilinskas, 1988].

Experimental results on test problems showed that a large number of function evaluations is required to obtain solutions with the simulated annealing method [Aluffi-Pentini et al., 1985]. The method has also been found very inefficient when compared with the simplex method: on average, simulated annealing takes 500-1000 times more function evaluations than the simplex method.
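The acceptance rule P = exp(-\Delta f / T) can be sketched as a small Metropolis-style test; the function values, temperatures, and random seed below are made-up illustrative values.

```python
import math
import random

def accept(f_old, f_new, temperature, rng):
    """Metropolis acceptance: always take improvements; accept an increase
    delta_f with probability exp(-delta_f / T)."""
    df = f_new - f_old
    if df <= 0.0:
        return True
    return rng.random() < math.exp(-df / temperature)

rng = random.Random(0)
# An improvement is always accepted:
assert accept(5.0, 3.0, temperature=1.0, rng=rng)
# A large increase at a low temperature is essentially never accepted:
assert not accept(3.0, 30.0, temperature=0.01, rng=rng)
```

At high T almost any uphill move is accepted (exploring the gross behaviour); as T is lowered, uphill moves become rare and the search settles into a local refinement, which is the cooling behaviour described above.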
Title | A level set global optimization method for nonlinear engineering problems
Creator | Yassien, Hassen Ali
Date Issued | 1994
DOI | 10.14288/1.0050419 |
URI | http://hdl.handle.net/2429/6965 |
Degree | Doctor of Philosophy - PhD
Program | Civil Engineering
Affiliation | Applied Science, Faculty of; Civil Engineering, Department of
Degree Grantor | University of British Columbia
Graduation Date | 1994-05