Nonlinearly constrained optimization via sequential regularized linear programming

by

Mitch Crowe

B.Sc., The University of British Columbia, 2007

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in The Faculty of Graduate Studies (Computer Science)

The University of British Columbia (Vancouver)

August 2010

© Mitch Crowe 2010

Abstract

This thesis proposes a new active-set method for large-scale nonlinearly constrained optimization. The method solves a sequence of linear programs to generate search directions. The typical approach to globalization damps the search directions with a trust-region constraint; our proposed approach instead uses a 2-norm regularization term in the objective. Numerical evidence is presented which demonstrates scaling inefficiencies in current sequential linear programming algorithms that use a trust-region constraint. Specifically, we show that the trust-region constraints in the trust-region subproblems significantly reduce the warm-start efficiency of these subproblem solves, and also unnecessarily induce infeasible subproblems. We also show that the use of a regularized linear programming (RLP) step largely eliminates these inefficiencies and, additionally, that the dual problem to RLP is a bound-constrained least-squares problem, which may allow for very efficient subproblem solves using gradient-projection-type algorithms. Two new algorithms were implemented and are presented in this thesis, based on solving sequences of RLPs and trust-region constrained LPs. These algorithms are used to demonstrate the effectiveness of each type of subproblem, from which we extrapolate the effectiveness of an RLP-based algorithm augmented with Newton-like steps.
All of the source code needed to reproduce the figures and tables presented in this thesis is available online at http://www.cs.ubc.ca/labs/scl/thesis/10crowe/

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgments
1 Preliminaries
  1.1 Introduction
  1.2 Notation and definitions
2 Current approaches for solving nonlinear programs
  2.1 Interior-point methods
  2.2 Active-set methods
  2.3 Sequential quadratic programming
3 Sequential linear-quadratic programming
  3.1 Local algorithm
  3.2 Globalization
  3.3 Infeasible subproblems
4 Sequential regularized linear programming
  4.1 Algorithm outline
  4.2 Infeasible subproblems
  4.3 Step acceptance
  4.4 Regularization parameter updates
  4.5 Lagrange multiplier estimates
  4.6 Algorithm statement
5 Experiments
  5.1 A benchmark trust-region algorithm
  5.2 Testing environment
  5.3 Implementation
  5.4 Results
6 Discussion
  6.1 Expensive infeasible subproblems
  6.2 Slow active-set updates
  6.3 Inefficient warm-starts
7 Conclusions
Bibliography
Appendices
A Derivation of the active sets for TrLP subproblems
B SRLP Results
C STrLP Results

List of Tables

4.1 First-order KKT conditions for NLP and RLP
5.1 Relevant SQOPT parameters used for solving RLP
5.2 Relevant SNOPT parameters used for solving MINFR
B.1 Definition of symbols used in Table B.2
B.2 SRLP Results
C.1 Definition of symbols used in Table C.2
C.2 STrLP Results

List of Figures

3.1 Infeasibility induced by a trust-region
6.1 Infeasible subproblems for SRLP and STrLP
6.2 Warm-start quality of STrLP on HIMMELBI
6.3 Warm-start quality of SRLP on HIMMELBI
6.4 Warm-start quality cumulative distribution function
6.5 Minor iterations per iteration on HIMMELBI
6.6 Minor iterations per iteration cumulative distribution function

Acknowledgments

I owe a huge debt of gratitude to my supervisor Michael Friedlander. This work would not have been possible without his enthusiastic support and encouragement. Michael allowed me the freedom to explore and to truly enjoy this research, while at the same time managing to keep me on track and focused. I was very fortunate to have him as my supervisor. A special thanks goes to my labmate Ewout van den Berg for his tireless help and for being a great friend. Many thanks go also to Chen Greif, both for his helpful suggestions with regards to this work, and for his very kind support throughout my degree.

Mitch Crowe
Vancouver, August 2010

Chapter 1

Preliminaries

1.1 Introduction

Nonlinear programs (NLP), which can be written generically as

\[ \text{NLP} \qquad \min_x \ f(x) \quad \text{subject to} \quad c(x) = 0, \ x \ge 0, \tag{1.1} \]

have a broad range of applications in engineering, science, and economics. In this thesis, we assume that $f : \mathbb{R}^n \to \mathbb{R}$ and $c : \mathbb{R}^n \to \mathbb{R}^m$ are twice-continuously differentiable functions. A huge body of work on solving NLP has been produced in the last 50 years, and a vast and diverse array of algorithms has been developed to solve problems of this form. Even so, one can easily choose an $f$ or a $c$ so that none of the current methods will suffice. In truth, NLP describes such a broad class of optimization problems that the best we can hope to do is to expand the small domain of those we can solve.
As computing power grows, real-world applications are beginning to demand the solution of very large-scale NLPs, possibly including millions of variables and constraints. Unfortunately, this is a domain in which current algorithms have proven insufficient. The focus of this thesis is to explore an algorithm which scales gracefully on such problems.

The sequential quadratic programming (SQP) method is a staple for solving NLP [1]. Here, an algorithm successively makes quadratic models of NLP and solves an inequality-constrained quadratic program (IQP) to generate a new iterate. The trouble is that solving a quadratic program can become very expensive as problem dimensions grow and, as such, SQP methods can be inefficient at solving large-scale NLPs.

More recently, a variant of SQP was proposed in order to help it scale to larger problems [7, 3, 5]. This is known as sequential linear-quadratic programming (SL-QP). The idea is rather simple: instead of solving an expensive IQP at each iteration, we solve a cheaper linear program (LP), followed by an equality-constrained quadratic program (EQP). Conclusive evidence has not yet demonstrated the effectiveness of currently implemented SL-QP algorithms; it is our belief, however, that this is due to several sources of inefficiency in these algorithms which become bottlenecks as problem dimensions grow. This thesis discusses these inefficiencies, and proposes a significant modification to the basic SL-QP method which is meant to alleviate them. Specifically, we propose a different means of controlling the step length generated by the LP subproblem at each iteration: current methods use a trust-region constraint, where we use regularization.

In the next two chapters, we discuss the current landscape of algorithms for solving NLP. Chapter 4 then presents a new algorithm, which we call sequential regularized linear programming (SRLP). This is not an SL-QP algorithm, because it does not solve an EQP subproblem at each iteration. However, since the convergence guarantees of current SL-QP algorithms rely solely on the LP step (the EQP step is added simply to improve convergence rates [3, 5]), this algorithm provides a suitable means of testing the effectiveness of regularized LP subproblems versus those with a trust-region constraint added. Chapter 5 describes the implementation details of the SRLP algorithm, together with our testing environment. Finally, Chapter 6 discusses the advantages of solving RLP versus TrLP subproblems. We demonstrate several sources of inefficiency in current SL-QP algorithms, and show how an RLP-based algorithm would fare better.

1.2 Notation and definitions

This section presents a brief review of some of the optimization background required for a proper reading of this thesis. Along the way, we will encounter some important definitions and notation that are used throughout this text. This review is not meant to be comprehensive, and the curious reader is directed to the excellent book by Nocedal and Wright [13] for a more detailed background to this material.

1.2.1 Optimization basics

The field of optimization can be roughly summarized as the pursuit of a vector $x \in \mathbb{R}^n$ which minimizes a function $f : \mathbb{R}^n \to \mathbb{R}$, where $x$ is limited to lie within a constraint set $\Omega$. We can write this as the generic optimization problem

\[ \text{P} \qquad \min_x \ f(x) \quad \text{subject to} \quad x \in \Omega. \]

Here, $f(x)$ is the objective, and $\Omega$ the feasible set.
If $\Omega = \emptyset$ (i.e., if there is no $x$ which satisfies the condition $x \in \Omega$), we say that P is infeasible, or inconsistent. Otherwise it is feasible, or consistent.

1.2.2 Nonlinear programming

Problem NLP (1.1) is a special case of P. Here $c : \mathbb{R}^n \to \mathbb{R}^m$ is referred to as the constraint function, the condition $c(x) = 0$ as the constraints, and the condition $x \ge 0$ as the variable bounds. The feasible set for NLP, then, is the set of points which satisfy the constraints and the variable bounds, i.e.,

\[ \Omega = \{ x : c(x) = 0 \ \text{and} \ x \ge 0 \}. \]

Formulations with more generic constraints (such as inequality constraints, $c(x) \ge 0$) can easily be rewritten in this form, and so we consider it without loss of generality.

The Lagrangian is an important concept for nonlinear programming.

Definition 1.1. The Lagrangian of NLP is

\[ \mathcal{L}(x, y, z) = f(x) - y^T c(x) - z^T x, \]

where $y \in \mathbb{R}^m$ and $z \in \mathbb{R}^n$. Its gradient with respect to $x$ is given by

\[ g_{\mathcal{L}}(x, y, z) = \nabla_x \mathcal{L}(x, y, z) = \nabla f(x) - \nabla c(x)^T y - z, \]

and its Hessian with respect to $x$ is given by

\[ H(x, y, z) = \nabla^2_{xx} \mathcal{L}(x, y, z) = \nabla^2 f(x) - \sum_i y_i \nabla^2 c_i(x). \]

The first-order KKT (Karush-Kuhn-Tucker) conditions must hold at an optimal point of NLP.

Definition 1.2. The point $x$ satisfies the first-order KKT conditions (and is called a first-order KKT point) of NLP if and only if there exist vectors $y \in \mathbb{R}^m$ and $z \in \mathbb{R}^n$ such that

\[ g_{\mathcal{L}}(x, y, z) = 0, \qquad c(x) = 0, \qquad x \ge 0, \qquad z \ge 0, \qquad x \circ z = 0, \]

where $(x \circ z)_i := x_i z_i$.

Under certain conditions on the constraints (the linear independence constraint qualification, for instance) it can be shown that any local solution, $x^*$, to NLP must be a first-order KKT point; see [13, Theorem 12.1]. The converse is not necessarily guaranteed, however. This requires the satisfaction of stronger conditions collectively known as the second-order sufficient conditions, which are outside the scope of our current discussion, or convexity (see [13, Theorem 12.6]).

This thesis focuses on active-set methods, which require the following definition:

Definition 1.3. The active set, $\mathcal{A}(x)$, for NLP at a given point $x$, is the set of variables which are at their bounds, i.e.,

\[ \mathcal{A}(x) := \{ i : x_i = 0 \}. \]

The active set at the solution to NLP (which we refer to as the optimal active set) is of special importance to active-set methods. We use the shorthand notation $\mathcal{A}^* := \mathcal{A}(x^*)$.
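Definition 1.2 can be checked numerically. The following Python sketch (illustrative only; the problem data are hypothetical, and the thesis implementation itself is MATLAB-based, see Chapter 5) evaluates the KKT residuals for a toy instance of NLP:

    import numpy as np

    def kkt_residuals(x, y, z, grad_f, jac_c, c):
        """First-order KKT residuals for NLP: min f(x) s.t. c(x) = 0, x >= 0."""
        g_L = grad_f(x) - jac_c(x).T @ y - z        # gradient of the Lagrangian
        return {
            "stationarity":    np.linalg.norm(g_L, np.inf),
            "primal_feas":     np.linalg.norm(c(x), np.inf),
            "bound_feas":      max(0.0, -x.min()),  # violation of x >= 0
            "dual_feas":       max(0.0, -z.min()),  # violation of z >= 0
            "complementarity": np.linalg.norm(x * z, np.inf),
        }

    # Toy instance: min x1 + x2 s.t. x1*x2 - 1 = 0, x >= 0; solution x* = (1, 1).
    f_grad = lambda x: np.array([1.0, 1.0])
    jac    = lambda x: np.array([[x[1], x[0]]])
    con    = lambda x: np.array([x[0] * x[1] - 1.0])

    x_star = np.array([1.0, 1.0])
    y_star = np.array([1.0])       # from 1 - y*x2 = z1 = 0 at x*
    z_star = np.zeros(2)
    print(kkt_residuals(x_star, y_star, z_star, f_grad, jac, con))

All residuals evaluate to zero at this point, confirming that it is a first-order KKT point.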
Chapter 2

Current approaches for solving nonlinear programs

There have been many different methods proposed for solving NLP (1.1). The methods that we consider can broadly be divided into two main classes: interior-point and active-set methods.

2.1 Interior-point methods

The variable bounds $x \ge 0$ can add a significant cost to solving NLP: the number of ways $x$ can be chosen active or inactive grows combinatorially with the number of variables in the problem. Interior-point methods reformulate these inequality constraints, penalizing their violation by adding a barrier function to the objective.

Many different barrier functions could be employed; a common choice, however, is the log-barrier. The most basic log-barrier interior-point method repeatedly solves problems of the form

\[ \text{LBNLP}(\mu) \qquad \min_x \ \Phi_\mu(x) := f(x) - \mu \sum_i \ln(x_i) \quad \text{subject to} \quad c(x) = 0, \]

where $\mu_k > 0$ is a sequence of barrier parameters such that $\mu_k \to 0$. The objective $\Phi_\mu$ has the property that $\Phi_\mu(x) = \infty$ whenever $x \not> 0$, which forces each iterate to lie strictly in the interior of the feasible set (hence the term 'interior point'). As $\mu \to 0$ the barrier weakens, and it can be shown under mild assumptions that the solution to LBNLP($\mu$) approaches the solution of NLP [13, Theorem 17.3]. Algorithm 2.1 shows a basic log-barrier interior-point method.

Algorithm 2.1: Basic log-barrier interior-point method
    input: x, μ > 0
    1  while x is not optimal for NLP do
    2      x ← solve LBNLP(μ)
    3      μ ← reduce barrier parameter
    4  end

These methods were first introduced in the context of nonlinear programming by Fiacco and McCormick in 1964 [6], but were subsequently abandoned in favour of active-set methods such as sequential quadratic programming (see Section 2.3). More recently, however, they have enjoyed a strong resurgence, and have been shown to be quite effective on medium- to large-scale problems (see [18], for example).
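To make Algorithm 2.1 concrete, here is a minimal Python sketch on a toy bound-constrained instance. It has no equality constraints, so each barrier subproblem is unconstrained; the problem data, barrier-reduction factor, and iteration count are all illustrative, not taken from the thesis:

    import numpy as np
    from scipy.optimize import minimize

    # Toy instance: minimize f(x) = (x1 - 2)^2 + (x2 + 1)^2 subject to x >= 0.
    # The solution is x* = (2, 0), where the bound x2 >= 0 is active.
    f = lambda x: (x[0] - 2.0)**2 + (x[1] + 1.0)**2

    def barrier_obj(x, mu):
        if np.any(x <= 0):              # the barrier is +inf outside the interior
            return np.inf
        return f(x) - mu * np.sum(np.log(x))

    x, mu = np.array([1.0, 1.0]), 1.0
    for _ in range(8):                  # Algorithm 2.1: solve, then shrink mu
        x = minimize(barrier_obj, x, args=(mu,), method="Nelder-Mead").x
        mu *= 0.1
    print(x)                            # -> approximately (2, 0)

Note how each subproblem warm-starts the next: the minimizer for one value of μ is the initial point for the next, tracing out the so-called central path toward the solution.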
2.2 Active-set methods

If $\mathcal{A}^*$ (an optimal active set for NLP) were known a priori, one could simply determine an optimal point for NLP by fixing the variables in this set at their bounds, ignoring the other bound constraints, and solving a much cheaper equality-constrained problem. In this sense, much of the cost of solving NLP can be said to be the determination of an optimal active set. Where interior-point methods skirt this issue by removing the bound constraints, active-set methods seek the optimal active set directly by successively updating a sequence of predictions.

Algorithm 2.2: Generic active-set method
    input: A
    1  while A ≠ A* (the optimal active set for NLP) do
    2      A ← generate new active set prediction
    3  end

Although interior-point methods can be very robust and efficient, one clear advantage of active-set methods is in warm-starting.

2.2.1 Warm-starting

When information received from solving one problem is used to speed the convergence of a second, related, problem, we say that we have warm-started the second problem. Active-set methods are readily warm-started. Two problems which are only slightly perturbed from each other are likely to have very similar optimal active sets, and we can use the optimal active set from one as an initial prediction for the optimal active set of the second. In contrast, interior-point methods are very difficult to warm-start effectively. (Strong efforts have been made recently to improve interior-point warm-starting, but success is still limited compared to that enjoyed by active-set methods [11].)

There is a vast array of applications which require the solution of many similar NLPs. Consider, for instance, the branch-and-bound method for mixed-integer nonlinear programming, where a related NLP is solved at every major iteration [2]. In such applications, effective warm-starting can be crucial to the success of an algorithm: potentially, the cost of solving a series of related problems can be much less than that of solving them individually. As such, because they are readily warm-started, and in spite of the significant successes of interior-point methods, we believe the pursuit of active-set methods for large-scale NLPs to be a very important topic.

The algorithms in the remainder of this thesis, all of them active-set methods, make heavy use of warm-starting. Each of them successively solves a series of related subproblems. The optimal active set from the most recently solved subproblem, $\mathcal{A}$, is used as an initial prediction of the optimal active set for the following subproblem. The efficiency of these warm-starts plays a crucial role in the effectiveness of the algorithm in question, as we discuss in Chapter 6.

As a reminder that we are using an active-set prediction for warm-starting our subproblems, algorithms in this thesis write this explicitly. When an algorithm states, for instance, "solve IQP(x)[A]", it is understood that the subproblem IQP(x) is solved using $\mathcal{A}$ as an initial active-set prediction. (An important technical detail must be addressed here. The set of active variables defined by some $\mathcal{A}$, together with a set of equality constraints, $c(x) = 0$, may define an inconsistent system. In this case we say that the active set is degenerate. Practical algorithms choose a non-degenerate subset of $\mathcal{A}$, a 'working set' [3, 19, 10]. Unless otherwise stated, we assume throughout this thesis that the active sets generated by the algorithms in question are non-degenerate.)

Many successful active-set algorithms exist. Perhaps the most well known is the simplex method for linear programming [13, Chapter 13]. For nonlinear programming, however, sequential quadratic programming (SQP) [1] has proven to be one of the most effective active-set methods, as demonstrated by the success of algorithms such as SNOPT [10].

2.3 Sequential quadratic programming

SQP is, essentially, a generalization of Newton's method to the constrained optimization case. At each iteration, SQP methods form a quadratic model of NLP about the current iterate (a quadratic model of the Lagrangian, and a linear model of the constraints) and minimize this model to determine a new iterate. Its subproblems can be written as

\[ \text{IQP}(x) \qquad \min_d \ g(x)^T d + \tfrac{1}{2} d^T H(x) d \quad \text{subject to} \quad c(x) + J(x) d = 0, \ x + d \ge 0. \]

Algorithm 2.3 lists a basic SQP algorithm (note that this is simply a local algorithm; we discuss globalization techniques in Section 3.2).

Algorithm 2.3: Sequential quadratic programming
    input: x, A
    1  while x is not optimal for NLP do
    2      d, A ← solve IQP(x)[A]
    3      x ← x + d
    4  end

SQP allows for any method of solving IQP. Active-set quadratic-programming methods [16], however, are a common choice because of their warm-start efficiency. In this case, SQP itself is an active-set method, iteratively updating an estimate, $\mathcal{A}$, of the optimal active set, $\mathcal{A}^*$, for NLP. Under mild conditions it can be shown that (after incorporating globalization techniques similar to those discussed in Section 3.2) iterates generated by SQP give $\mathcal{A} = \mathcal{A}^*$ within a finite number of steps [17], and that, once this occurs, they achieve quadratic convergence towards a solution, $x^*$ [13, Theorem 18.2].

Chapter 3

Sequential linear-quadratic programming

3.1 Local algorithm

As numbers of constraints and variables increase into the millions, each iteration of SQP can become very expensive [3]. This cost is dominated by solving the IQP subproblem, which may be very slow for large-scale problems considering the combinatorial difficulties implicit in the bound constraints, together with the possibility of a nonconvex quadratic objective to handle. The idea behind sequential linear-quadratic programming (SL-QP) was first introduced by Fletcher and Sainz de la Maza [7] as a means of reducing the cost of major iterations compared with SQP; more recently, algorithms were proposed both by Chin and Fletcher [5] and by Byrd, Gould, Nocedal and Waltz [3].
Instead of solving IQP, SL-QP solves at each iteration a linear model of NLP given by

\[ \text{LP}(x) \qquad \min_d \ g(x)^T d \quad \text{subject to} \quad c(x) + J(x) d = 0, \ x + d \ge 0, \tag{3.1} \]

in order to update its estimate of the optimal active set. Here, SL-QP bears a resemblance to sequential linear programming methods [14], which have been used for many years. However, SL-QP methods generate an additional search direction by solving the equality-constrained quadratic program

\[ \text{EQP}(x, \mathcal{A}) \qquad \min_d \ g(x)^T d + \tfrac{1}{2} d^T H(x) d \quad \text{subject to} \quad c(x) + J(x) d = 0, \ d_i = 0 \ \text{for} \ i \in \mathcal{A}. \]

This additional step has the benefit of allowing for a superlinear convergence rate, as experienced by SQP methods. Importantly, EQP (a quadratic model of NLP in which active-set variables are fixed and other bounds are ignored) can be much cheaper to solve than IQP, as it avoids handling variable bounds. Algorithm 3.1 lists a basic SL-QP algorithm.

Algorithm 3.1: Basic local SL-QP method
    input: x, A
    1  while x is not optimal for NLP do
    2      A ← solve LP(x)[A]
    3      d ← solve EQP(x, A)
    4      x ← x + d
    5  end

SQP methods have a single subproblem (IQP) that gives both active-set prediction and fast convergence. SL-QP methods decouple these objectives into more specialized subproblems. The hope is that the LP step will give a good prediction of the optimal active set, without the significant cost of solving IQP (LP is a simpler and more structured problem than IQP, and linear-programming technology is more advanced than that of quadratic programming) and, as such, will handle large-scale problems more efficiently.

Algorithm 3.1 on its own, however, is not guaranteed to converge. In the next two sections, we discuss features incorporated into Chin et al.'s algorithm and into Byrd et al.'s algorithm which allow for such guarantees.
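The LP subproblem (3.1) is easy to assemble once $g$, $J$, and $c$ have been evaluated at the current iterate. The Python sketch below uses hypothetical problem data and SciPy's linprog; a trust-region bound (introduced formally in the next section) is included only to keep the toy LP bounded:

    import numpy as np
    from scipy.optimize import linprog

    # Linearize NLP at x: min g(x)^T d  s.t.  c(x) + J(x) d = 0,  x + d >= 0.
    # Hypothetical data: f(x) = x1 + 2*x2, c(x) = x1*x2 - 1, at x = (2, 2).
    x = np.array([2.0, 2.0])
    g = np.array([1.0, 2.0])              # gradient of f at x
    J = np.array([[x[1], x[0]]])          # Jacobian of c at x
    c = np.array([x[0] * x[1] - 1.0])     # c(x)

    delta = 1.0                           # trust-region radius (Section 3.2);
                                          # without it this LP can be unbounded
    lb = np.maximum(-x, -delta)           # combine x + d >= 0 with -delta <= d
    ub = np.full(2, delta)

    res = linprog(g, A_eq=J, b_eq=-c, bounds=list(zip(lb, ub)), method="highs")
    print(res.x)                          # the step d; here [-0.5, -1.0]

The optimal active set of this LP (the components of $x + d$ at their bounds) is exactly the prediction handed to the EQP step.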
Answers to how to make this tradeoff have been the subject of a great deal of research, and this is an issue on which the algorithms of Byrd et al. [3] and of Chin et al. [5] differ. Byrd et al. employ a merit function and Chin et al. use a filter. We discuss both ideas now.  Merit functions  3.2.1  For unconstrained problems, sufficient progress towards a solution can simply be equated to sufficient descent in the objective; this single function defines the goal of a step. Merit functions seek to generalize this concept to constrained optimization. They combine the objective and constraint violation into a single function, , known as the merit function. Sufficient progress is then determined by whether or not a step generates sufficient descent on this combined function. One way to measure constraint violation is via the function h(x)  IIc(x)IIi  .  (3.3)  (The algorithms we consider explicitly enforce bound constraints at each it eration, and so we do not consider bound violation here.) A common merit function, and the one employed by Byrd et al., is the £ merit function (x, v) : f(x) + vh(x), 1 where  ji  (3.4)  > 0. This is an exact merit function [13, Section 15.3].  Definition 3 1 A merit function c,b(x, v) is exact if there exists some  > 0  jk such that for any v > 1 every local solution of NLP corresponds to a local minimizer of q5(x, v) [13, Definition 15 1]  It can be shown that, for the Li merit function,  IIyII,  (3.5)  where y are the Lagrange multipliers at a solution to NLP, satisfies Definition 3.1 [13]. We refer to v as the penalty parameter; it controls the amount of penaliza tion given to constraint violations. Choosing an appropriate value, z , for this 1 penalty parameter at each iteration k, is a difficult task. Convergence proofs for an algorithm which uses an exact merit function typically require that the algo rithm eventually generates iterates with vk > j,*• Perhaps the simplest solution is to choose some v 0 >> v and to set ii = 1)0 at every iteration. This is not effective in practice for several reasons, however. First, we do not know y (or, consequently, *) until we’ve actually solved the problem, so it is impossible to know for sure whether o is large enough. Second, there may be multiple solu tions to NLP, and we may thus be required to know * for all of them. Third, it  3.2. Globalization  12  has been observed in practice that if ii is chosen too large convergence can be very slow, and, so, choosing an arbitrarily large v 0 may be very inefficient [3]. Despite this difficulty, heuristics have been developed for choosing v that have been shown to be effective in practice [4, 13]. We discuss one such heuristic in Section 4.3.  3.2.2  Filters  The £ merit function combines the goals of minimizing f(x) and h(x). In contrast, filter methods, introduced by Fletcher and Leyffer [8], view these as two separate and competing goals, i.e., minimize f(x)  and  minimize h(x)  Instead of instituting an arbitary trade-off between these goals, a filter method considers only whether the objective—infeasibility pair at a given point, (fi, h ) 1 (f(xi), h(xi)), is unequivocally worse than another, (f2, h ). If so, we say that 2 the latter dominates the former.  Definition 3 2 A pair (f2, h ) dominates another pair (fi, hi) if and only 2 if f2 ft and h 2 hi [8, Definition 11 In the basic filter method, a step is acceptable if the new objective—infeasibility pair it would give, (f, h), is not dominated by that of any previous step taken by the algorithm. 
This is determined by maintaining a list of previously accepted pairs—a filter. Each time a step is accepted, its pair may be added to the filter (and, for efficiency, any pairs in the filter which it dominates are subsequently removed).  Definition 3 3 A filter is a list of pairs any other [8, Definition 2]  (f, h) such that no pair dominates  A step has sufficient progress if it is acceptable to the filter.  Definition 3.4. A step d generated about the current iterate x is acceptable to a filter if the pair (f(x + d), h(x + d)) is not dominated by any pair in the filter The SL—QP algorithm of Chin et al. [5] uses a filter. Two of the main draws of this globalization method are that fewer steps are rejected by a filter than a merit function (it is, in some sense, as accepting as possible), and that no arbitrary parameter is required. However, while the basic idea quite attractive due to its elegance, many less attractive features must be added in practice to ensure convergence [8]. In particular, the notion of an envelope (which introduces the very tradeoff between f and h we were trying to avoid), and the requirement that certain iterations neglect filter acceptability and instead give descent on the objective, detract from the elegance of the method. Effectiveness of filters in practice is also less established than that of merit functions.  3.3. Infeasible subproblems  13  “  /  I,,  x  / / / /  c(x)+J(x)d=O  Figure 3.1: A small trust region can induce a TrLP subproblem to be infeasible. Here, every point on the line c(x) + J(x)d = 0 is feasible for the corresponding LP subproblem. However, none of these points lie within the trust-region radius, , of x and so the TrLP subproblem is infeasible.  3.3  Infeasible subproblems  Algorithm 3.1 assumes that we can solve TrLP at every iteration. In practice, however, this is not always possible. Assuming that the original NLP is feasi ble, there are two ways that the TrLP subproblem can be infeasible. First, the linearized constraints about a point, x, may themselves be infeasible: if the con straint gradients, Vc(x)T, are degenerate, then the system Vc(x)T d = —c(x) may simply have no solution. Second, if the trust-region is too small, the TrLP subproblem may be infeasible even if the corresponding LP subproblem was not. Consider Figure 3.1. Here, the linearized constraints, c(x) + J(x)d = 0, while themselves consistant, cannot be satisfied within the trust-region radius, , of x, and so the TrLP subproblem is infeasible. If an infeasible subproblem is encountered, a successful algorithm must have a means of determining a new step and proceeding with its iterations. Two common techniques for this are feasibility restoration—which is utilized in Chin et al. ‘s algorithm [5], and penalty reformulation—which Byrd et al. use [3]. We explore these alternatives next.  3.3.1  Feasibility restoration  If x is feasible for NLP, then the subproblem TrLP(x, z) is guarranteed to be feasible for any (we can see immediately that a step d = 0 satisfies the constraints in TrLP when c(x) = 0 and x 0). Feasibility restoration searches for such a feasible point, so that we can be sure that the next TrLP subproblem to solve will be feasible. It was introduced by Fletcher and Leyffer [8] specifically  3.3. Infeasible subproblerus  14  for use with the filter method, however there is nothing in the method that prevents one from using it with a merit-function method. 
A feasibility restoration step attempts to find a feasible point by solving the problem FR  h(x)  minimize  subject to x  (3.6)  0,  where h is defined in• Equation 3.3. If the objective at a solution to FR is nonzero, then we could not find a feasible point, and NLP is declared infeasible. Otherwise, we have found a feasible point for NLP, and may proceed with the regular iterations of our algorithm. Any algorithm which can minimize a nonlinear objective subject to bound constraints will be suitable to generate a feasibility restoration step. However, Fletcher and Leyffer [8] do describe a method designed specifically for this pur pose. Unfortunately, feasibility restoration can be quite expensive. Indeed, FR can be almost as difficult to solve as our original NLP! Another potential weakness of feasibility restoration is that it has essentially given up on the objective and, as such, the step it generates may unnecessarily increase it.  3.3.2  Penalty reformulation  Penalty reformulation is a staple technique for handling infeasible subproblems. Its use in practice is described in [10], and it is also discussed in detail in [13]. Here, the view is that there is no need to satisfy the linearized constraints at every iteration. If an infeasible TrLP subproblem is encountered, a penalty reformulation method instead solves PLP(XJL)  minimize  gc(x)Td+iIIc(x) +  subject to x+d  IIdII  J(x)dIIi  0 Z,  which is formed by replacing the explicit constraints (c(x) + J(x)d = 0) with a term in the objective that penalizes their violation. Importantly, PLP is always feasible as long as x 0, which the algorithms we consider enforce explicitly at each iteration. It is common to reformulate PLP as a linear program, thus taking advantage of the availability of highly-tuned large-scale linear programming solvers. PLP can be re-written as PR(x, ji)  minimize  g(x)Td + ueTq +  subject to  c(x)  d,q,r  + J(x)d x + d, q, r  =  q 0  ueTr  —  r  lIdW where q, r E  are slack vectors, and e E  m  is a vector of ones.  3.3. Infeasible subproblems  15  Two advantages of penalty reformulation over feasibility restoration are that PR is much cheaper to solver than FR, and that it includes objective information instead of throwing it away. The step generated by PR, however, may not give as much of a reduction in infeasibility as FR. Also, penalty reformulation requires the choice of a parameter, i. Furthermore, the effectiveness of a method using penalty reformulation can be very sensitive to the choice of this parameter, and finding an appropriate value may be quite expensive in practice [4]. While PR has the same linear program form as TrLP, it has n + 2m variables compared with n for the latter—it may be significantly more expensive to solve. This, together with a potentially expensive routine for selective i (Byrd et al. [3] suggest a full extra solve of PR!) makes penalty reformulation iterations quite costly for an algorithm.  16  Chapter 4  Sequential regularized linear programming This chapter presents a new active-set method, SRLP (sequential regularized linear programming) for solving large-scale NLPs.  4.1  Algorithm outline  In order to control the size of steps generated at each iteration, current SL—QP algorithms add a trust-region constraint to the LP subproblem, solving TrLP. In contrast, SRLP controls the step size and enforces global convergence by adding a regularization term to the LP subproblem, which penalizes overly large steps. 
Chapter 4

Sequential regularized linear programming

This chapter presents a new active-set method, SRLP (sequential regularized linear programming), for solving large-scale NLPs.

4.1 Algorithm outline

In order to control the size of steps generated at each iteration, current SL-QP algorithms add a trust-region constraint to the LP subproblem, solving TrLP. In contrast, SRLP controls the step size and enforces global convergence by adding a regularization term to the LP subproblem, which penalizes overly large steps. At each iteration a regularized linear program,

\[ \text{RLP}(x, \rho) \qquad \min_d \ g(x)^T d + \tfrac{\rho}{2} \|d\|_2^2 \quad \text{subject to} \quad c(x) + J(x) d = 0, \ x + d \ge 0, \tag{4.1} \]

is solved. This idea was first proposed in a technical report by Friedlander, Gould, Leyffer, and Munson [9], but to our knowledge it has not yet been implemented.

Algorithm 4.1 lists an outline of our proposed SRLP algorithm. Lines 2 and 3 handle the possibility of encountering an infeasible RLP subproblem, as discussed in Section 4.2. Lines 7 through 10 are repeated until an acceptable step, the definition of which is given in Section 4.3, is generated. Within this loop, we update the regularization parameter; a discussion of how this is done is presented in Section 4.4. Finally, line 12 updates estimates of the optimal Lagrange multipliers of NLP, which is discussed in Section 4.5. (While SRLP does not explicitly require these estimates, they are used in practice in order to determine whether optimality conditions have been satisfied, and as part of our globalization technique.)

Notice that SRLP does not take an EQP step, and thus does not fully qualify as an SL-QP method. Convergence guarantees of current SL-QP algorithms, however, rely solely on the LP step, with the EQP step added only to increase the convergence rate [3, 5]. As such, SRLP provides a suitable means of determining the effectiveness of using regularization versus a trust region to control step size in an SL-QP method. A practical implementation, though, will likely include an EQP step, and the Lagrange multiplier estimates discussed in Section 4.5 will be important for computing the Hessian of the Lagrangian in this case.

Algorithm 4.1: Sequential regularized linear programming (outline)
    input: x, ρ > 0
    1   while x is not optimal for NLP do
    2       if RLP(x, ρ) is infeasible then
    3           d ← handle infeasible subproblem           /* Section 4.2 */
    4       else
    5           d ← solve RLP(x, ρ)[A]
    6       end
    7       while d is not acceptable do                   /* Section 4.3 */
    8           ρ ← update regularization parameter        /* Section 4.4 */
    9           d ← find a new step                        /* Section 4.4 */
    10      end
    11      x ← x + d
    12      update Lagrange multiplier estimates           /* Section 4.5 */
    13  end
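Since the RLP objective is a convex quadratic, any convex QP solver applies. Below is a small dense Python sketch using SciPy's SLSQP on hypothetical data; the thesis implementation instead uses SQOPT (Section 5.3.4), and nothing here reflects that code:

    import numpy as np
    from scipy.optimize import minimize

    def solve_rlp(g, J, c, x, rho):
        """Solve RLP(x, rho): min g^T d + (rho/2)||d||_2^2
           s.t. c + J d = 0, x + d >= 0.  Illustrative dense sketch only."""
        n = len(x)
        obj  = lambda d: g @ d + 0.5 * rho * (d @ d)
        jac  = lambda d: g + rho * d
        cons = {"type": "eq", "fun": lambda d: c + J @ d, "jac": lambda d: J}
        res = minimize(obj, np.zeros(n), jac=jac, constraints=[cons],
                       bounds=[(-xi, None) for xi in x], method="SLSQP")
        return res.x

    d = solve_rlp(g=np.array([1.0, 2.0]), J=np.array([[1.0, 1.0]]),
                  c=np.array([-1.0]), x=np.array([0.5, 0.5]), rho=1.0)
    print(d)   # -> [1, 0]: satisfies d1 + d2 = 1 while trading g^T d
               #    against the 2-norm of the step

Increasing rho in this sketch pulls the solution toward the minimum-norm point satisfying the linearized constraints, which is exactly the step-length control mechanism the chapter develops.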
4.2 Infeasible subproblems

The RLP subproblem encountered on a given iteration may be infeasible if the linearized constraints (those in Equation 3.1) are degenerate (see Section 3.3). SRLP generates a feasibility restoration step, which solves FR (3.6), when this occurs. There are many solutions to FR. We take the view that feasibility restoration should interfere with regular iterations as little as possible, and so choose the solution which is as close to our current iterate, in a 2-norm sense, as possible. Our feasibility restoration steps solve

\[ \text{MINFR}(x) \qquad \min_d \ \|d\|_2^2 \quad \text{subject to} \quad c(x + d) = 0, \ x + d \ge 0. \tag{4.2} \]

If MINFR(x) is infeasible, SRLP declares NLP itself to be infeasible, and iterations are terminated. Otherwise, its solution d is taken as a step in lieu of an RLP-generated step.

4.3 Step acceptance

Successively accepting any step derived from the solution to RLP does not guarantee convergence to a first-order KKT point of NLP from an initial point arbitrarily far away. Instead, we must employ globalization techniques. This requirement is common for any algorithm in which local models of a problem are used as subproblems and, as such, a large body of work has been published analysing various strategies. See Section 3.2 for a review of some relevant techniques.

Our globalization strategy defines what an acceptable step is for our algorithm. When a step is deemed acceptable, the SRLP algorithm takes it; otherwise, either the RLP subproblem is modified or the step is modified until it is acceptable. SRLP utilizes the exact $\ell_1$ merit function ($\phi_1$, see Equation 3.4) to enforce globalization. While this technique has the potential downside of requiring a good choice of penalty parameter, we have found the method we describe below to work very well in practice.

4.3.1 Penalty parameter choice

Our primary concern in choosing $\nu_k$ at each iteration is that $\phi_1$ be exact. Thus, following Equation 3.5, we impose the condition

\[ \nu_a > \|y^k\|_\infty, \]

where $y^k$ are the Lagrange multiplier estimates at the given iteration (see Section 4.5). Secondly, we require that $\nu_k$ be chosen so that a solution to RLP is at least a descent direction for the merit function $\phi_1(x^k, \nu_k)$. Steps $d^k$ generated by SRLP are guaranteed to satisfy $J(x^k)\, d^k = -c(x^k)$ (see Equation 4.1), and the directional derivative of $\phi_1$ in such a direction is given by

\[ D(\phi_1, d^k) = \nabla f(x^k)^T d^k - \nu \|c(x^k)\|_1 \]

[13, Lemma 18.2]. Our requirement that $D(\phi_1, d^k) < 0$ then gives us a second condition for the penalty parameter:

\[ \nu_b > \frac{\nabla f(x^k)^T d^k}{\|c(x^k)\|_1} \quad \text{for} \quad c(x^k) \neq 0. \]

Notice that if $x^k$ is feasible (i.e., $\|c(x^k)\|_1 = 0$), then descent on the merit function is not determined by ν. In this case, however, a step d which solves RLP is always a descent direction for the objective unless we are optimal, and so descent on the merit function is guaranteed for any value of ν. Our final requirement amalgamates these two conditions:

\[ \hat{\nu}_k = \begin{cases} \max(\nu_a, \nu_b) & \text{if} \ \|c(x^k)\|_1 \neq 0; \\ \nu_{\min} & \text{otherwise}. \end{cases} \tag{4.3} \]

In practice, constants $c \ge 1$ and $\epsilon > 0$ are chosen so that

\[ \nu_k = c\, \hat{\nu}_k + \epsilon, \tag{4.4} \]

with $\hat{\nu}_k$ given by Equation 4.3, is sufficiently large. (The specific choice of c and ε may have a large practical impact on an algorithm, and adjustment may be required to find appropriate values for a given implementation.)

4.3.2 An acceptable step

The use of an exact merit function defines a single goal for each iteration of SRLP, namely reduction of the merit function. It is well known, however, that simply achieving reduction in the merit function at each iteration may, instead of achieving convergence to a first-order KKT point of NLP, reach a non-optimal accumulation point. Instead we must require that each iteration gives at least a sufficient amount of reduction in the merit function. We define this to mean that the step achieves at least a fraction of the descent predicted by a linear model of $\phi_1$, defined by

\[ \ell(d, x, \nu) := f(x) + \nabla f(x)^T d + \nu \|c(x) + J(x) d\|_1. \]

We are now prepared to define an acceptable step for SRLP.

Definition 4.1. A step d generated by SRLP about a current iterate x is considered acceptable if it satisfies

\[ \phi_1(x, \nu) - \phi_1(x + d, \nu) \ge \beta \left( \phi_1(x, \nu) - \ell(d, x, \nu) \right), \]

where $\beta \in (0, 1)$ is the required fraction of descent, and ν is chosen by Equation 4.4. (A similar condition is used in [5] and in [3].)

In our algorithm we use the notion of step quality. A high-quality step will be accepted, whereas a low-quality one will not.

Definition 4.2. The quality, q, of a step d is the fraction of predicted linear descent in $\phi_1$ achieved by the step:

\[ q := \frac{\phi_1(x, \nu) - \phi_1(x + d, \nu)}{\phi_1(x, \nu) - \ell(d, x, \nu)}. \tag{4.5} \]

With q given by Equation 4.5, and β our chosen required fraction of descent, our step acceptance criterion becomes $q \ge \beta$.
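Equation 4.5 in code: a minimal Python sketch computing the step quality q from function and constraint values at x and x + d. The numbers are hypothetical, chosen so the step satisfies the linearized constraints:

    import numpy as np

    def step_quality(f_x, f_xd, c_x, c_xd, g, J, d, nu):
        """Step quality q (Equation 4.5): achieved reduction of the l1 merit
        function divided by the reduction predicted by its linear model
        l(d, x, nu) = f(x) + g^T d + nu*||c(x) + J d||_1."""
        n1 = lambda v: np.linalg.norm(np.atleast_1d(v), 1)
        phi_x, phi_xd = f_x + nu * n1(c_x), f_xd + nu * n1(c_xd)
        model = f_x + g @ d + nu * n1(c_x + J @ d)
        return (phi_x - phi_xd) / (phi_x - model)

    beta = 0.5                                   # required fraction of descent
    q = step_quality(f_x=10.0, f_xd=11.0,
                     c_x=np.array([1.0]), c_xd=np.array([0.2]),
                     g=np.array([-2.0, 0.0]), J=np.array([[1.0, 0.5]]),
                     d=np.array([-1.0, 0.0]), nu=4.0)
    print(q, q >= beta)                          # -> 1.1 True: accept the step

This example also illustrates the arbitration role of ν: the step increases the objective, yet its large gain in feasibility makes the merit reduction exceed the model's prediction, so the step is accepted.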
4.4 Regularization parameter updates

Consider a given iteration k of SRLP with an iterate $x^k$, and where the current value of the regularization parameter is $\rho_k$. This section discusses how SRLP chooses $\rho_{k+1}$ and concurrently generates an acceptable step. For convenience, we will herein use the notation $d_\rho$ to signify the solution to the subproblem RLP(x, ρ).

The regularization term in RLP, $\tfrac{\rho}{2}\|d\|_2^2$, replaces the trust-region constraint, $\|d\|_\infty \le \Delta$, in TrLP. Like the trust region, its role is to control the length of the step d taken by the subproblem at each iteration. The penalty parameter ρ defines the degree of penalization of the step length in a 2-norm sense: ρ = 0 gives no penalization, and a step which freely solves LP(x); ρ = ∞ gives the smallest possible step that satisfies the linearized constraints of NLP about x, i.e.,

\[ \text{RLP}(x, \infty) \qquad \min_d \ \|d\|_2 \quad \text{subject to} \quad c(x) + J(x) d = 0, \ x + d \ge 0. \tag{4.6} \]

There is an important distinction between a trust-region constraint and regularization, however: a trust region imposes a direct and strict limit on step length, where regularization only imposes a degree of aversion to large steps. Thus, using regularization, SRLP does not have full control over step length, and this has important consequences for the convergence of the algorithm. In particular, notice that it is not true in general that $\|d_\rho\|_2 \to 0$ as $\rho \to \infty$ (consider any instance where $c(x) \neq 0$; in that case $d = 0$ cannot satisfy the constraints in Equation 4.6).

The primary aim of controlling step length, whether by regularization or a trust region, is to ensure that an acceptable step is taken at each iteration. This is required by globalization techniques and thus for convergence of an algorithm. SRLP's primary means of step-length control is through updates to the regularization parameter. When an unaccepted step $d_{\rho_a}$ is generated, we predict that, in the large majority of cases, we can find some $\rho_b > \rho_a$ so that $d_{\rho_b}$ is acceptable. However, we cannot guarantee that such a $\rho_b$ exists, and, if it does not, a secondary means of generating an acceptable step is required.

One possible choice for this secondary step is a linesearch on some $d_\rho$. Our choice of penalty parameter (Section 4.3) ensures that $d_\rho$ (for any value of ρ) is a direction of descent for the merit function $\phi_1$. By the twice-continuous differentiability of f and c, then, we can see that some $\alpha > 0$ exists such that $d := \alpha\, d_\rho$ satisfies our acceptance requirement (Definition 4.1); we can find an acceptable step by performing a linesearch on any step $d_\rho$. While any $d_\rho$ will suffice, $d_\infty$ (a solution to (4.6)) is in some sense the best choice: it defines the direction in which the linearized constraints are nearest, in a 2-norm sense, to being satisfied, and is thus likely to give good progress towards feasibility.

To maximize the efficiency of SRLP, we wish to minimize the number of RLP subproblems solved at each iteration. We expect that, on most iterations, the initial step $d_{\rho_a}$ will be accepted. When it is not, we further expect that a modest increase in the regularization parameter, $\rho_b = \gamma \rho_a$ (with γ > 1), will, in the majority of cases, generate an acceptable step $d_{\rho_b}$. Thus, SRLP attempts each of these steps in succession before any more expensive means of analyzing the subproblem are performed. If the second attempted step is still not acceptable (we expect this to be a rare occurrence in practice), we immediately seek to determine whether there exists any $\rho_c > \rho_b$ for which $d_{\rho_c}$ is acceptable.
This test is performed by solving RLP a third time to obtain $d_\infty$. If $d_\infty$ is unacceptable, we know immediately that increasing ρ further is futile, and instead perform a linesearch on this direction. Otherwise, we know that there is some $\rho_c$ large enough to generate an acceptable step, and we continue to search for it. Algorithm 4.2 lists our regularization parameter update scheme.

Algorithm 4.2: Regularization parameter update and step selection
    input: d, x, ρ > 0, γ > 1
    1   if d is not acceptable then
    2       /* try increasing ρ */
    3       ρ ← γρ
    4       d ← solve RLP(x, ρ)
    5       if d is not acceptable then
    6           /* test the ρ = ∞ step */
    7           d∞ ← solve RLP(x, ∞)
    8           if d∞ is not acceptable then
    9               choose α ∈ (0, 1) such that α d∞ is acceptable
    10              d ← α d∞
    11          else
    12              choose ρc > ρ such that dρc is acceptable
    13              d ← solve RLP(x, ρc)
    14              ρ ← ρc
    15          end
    16      end
    17  else
    18      possibly decrease ρ
    19  end
    20  x ← x + d

4.5 Lagrange multiplier estimates

Estimates of the optimal Lagrange multipliers for NLP are used in order to update the penalty parameter (Section 4.3), to check whether optimality conditions are satisfied (Section 5.3.7), and, when an EQP step is added, to compute an estimate of the Hessian of the Lagrangian. We have three main criteria when choosing a source for these estimates. First, they should be inexpensive to compute. Second, they should provide a good estimate of the optimal Lagrange multipliers, based on the limited local information we have of the problem. Third, and most important, they should converge to the optimal Lagrange multipliers as we approach a solution to NLP.

SRLP uses the optimal Lagrange multipliers for the RLP subproblems (the RLP estimates, see Definition 4.3) as an estimate of the optimal Lagrange multipliers of NLP (the optimal multipliers) at each iteration. These have the immediate advantage that almost any solver of RLP will provide them at no extra cost over solving the subproblem.

Table 4.1: First-order KKT conditions satisfied by the optimal Lagrange multipliers for NLP at an optimal point $x^*$, and by the RLP estimates at a given iterate x.

    NLP                                     RLP
    $g(x^*) - J(x^*)^T y^* = z^*$           $g(x) + \rho d - J(x)^T \hat{y} = \hat{z}$
    $z^* \ge 0$                             $\hat{z} \ge 0$
    $z^* \circ x^* = 0$                     $\hat{z} \circ (x + d) = 0$

Definition 4.3. The RLP Lagrange multiplier estimates (or RLP estimates) are the pair of vectors $\hat{y} \in \mathbb{R}^m$ and $\hat{z} \in \mathbb{R}^n$ which, together with the solution d, satisfy the first-order KKT conditions of the subproblem solved at a given iteration, RLP(x, ρ) (see Table 4.1).

The first-order KKT conditions for RLP and NLP give a system of equations that the RLP multipliers and the optimal multipliers must satisfy. Table 4.1 lists the relevant conditions for convenience. Consider the conditions in Table 4.1. As $x \to x^*$ and, consequently, $d \to 0$, we can see by inspection that $(y^*, z^*)$ satisfy the conditions for RLP estimates. Furthermore, the RLP estimates are unique (by the convexity of RLP). Continuous differentiability of f and c then gives us $(\hat{y}, \hat{z}) \to (y^*, z^*)$ as $x \to x^*$: the RLP Lagrange multiplier estimates approach the optimal Lagrange multipliers for NLP as we approach a solution.
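In practice the subproblem solver returns $\hat{y}$ and $\hat{z}$ directly, but the conditions in Table 4.1 also show how they could be recovered from a solved subproblem. A Python sketch, assuming a dense Jacobian and an illustrative strict-complementarity tolerance:

    import numpy as np

    def rlp_multiplier_estimates(g, J, d, x, rho, tol=1e-8):
        """Recover RLP multiplier estimates (y_hat, z_hat) from the KKT
        conditions of Table 4.1:
            g(x) + rho*d - J^T y = z,   z >= 0,   z o (x + d) = 0.
        On inactive components (x + d > 0) we must have z_i = 0, so y solves
        a least-squares system over those rows.  Illustrative only: most
        LP/QP solvers return these multipliers at no extra cost."""
        r = g + rho * d                       # residual that J^T y must match
        inactive = (x + d) > tol
        y, *_ = np.linalg.lstsq(J[:, inactive].T, r[inactive], rcond=None)
        z = r - J.T @ y
        z[inactive] = 0.0                     # enforce complementarity exactly
        return y, z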
4.6 Algorithm statement

A full listing of SRLP is given in Algorithm 4.3. This pseudo-code describes a process which iterates until an estimate x that is optimal for NLP is found. This stopping condition is left open to the implementation of the algorithm; we discuss the condition used in this thesis in Section 5.3.7. Throughout the rest of this thesis, we will refer to major iterations, which we define to be full steps through the main loop of Algorithm 4.3, and to minor iterations, which we define to be the iterations taken by the RLP subproblem solver. The cost of these minor iterations plays a very important role in the effectiveness of a practical implementation of SRLP, and is highly dependent on the choice of a solver for the RLP subproblems. We discuss this issue in the context of our implementation of SRLP in Section 5.3.4.

Note that the regularization parameter update scheme presented in Section 4.4 is directly incorporated into this flow. Algorithm 4.3 also utilizes a number of subroutine calls, which we now define.

d, y, A ← solve RLP(x, ρ): Solve the subproblem RLP(x, ρ) (Equation 4.1) to obtain a step, d, and Lagrange multiplier estimates, y (see Section 4.5). Compute the active set, A(x + d).

d ← solve MINFR(x): Solve the subproblem MINFR(x) for a step d.

ν ← penaltyParameter(d, y, x): Compute ν from Equation 4.4.

q ← stepQuality(d, ν, x): Compute q from Equation 4.5.

Algorithm 4.3: Sequential regularized linear programming
    input: x, y, A, ρ > 0, q_min > 0, q_high
    1   while x is not optimal for NLP do
    2       if RLP(x, ρ) is feasible then
    3           /* RLP step */
    4           d, y, A ← solve RLP(x, ρ)[A]
    5           ν ← penaltyParameter(d, y, x)
    6       else
    7           /* feasibility restoration step */
    8           d ← solve MINFR(x)
    9       end
    10      if stepQuality(d, ν, x) < q_min then
    11          /* try increasing ρ */
    12          ρ ← 2ρ
    13          d, y ← solve RLP(x, ρ)[A]
    14          if stepQuality(d, ν, x) < q_min then
    15              /* test the ρ = ∞ step */
    16              d∞, y ← solve RLP(x, ∞)[A]
    17              if stepQuality(d∞, ν, x) < q_min then
    18                  /* linesearch on the ρ = ∞ step */
    19                  repeat
    20                      d∞ ← d∞ / 2
    21                  until stepQuality(d∞, ν, x) ≥ q_min
    22                  d ← d∞
    23              else
    24                  /* increase ρ until an acceptable step is found */
    25                  repeat
    26                      ρ ← 2ρ
    27                      d, y ← solve RLP(x, ρ)[A]
    28                  until stepQuality(d, ν, x) ≥ q_min
    29              end
    30          end
    31      else if stepQuality(d, ν, x) ≥ q_high then
    32          ρ ← ρ/2
    33      end
    34      x ← x + d
    35  end

Chapter 5

Experiments

This chapter discusses the details of the experiments performed for this thesis. The implementation of SRLP (Algorithm 4.3), as well as the testing environment used, is described. Determining the effectiveness of SRLP requires a benchmark, and we begin by presenting the benchmarking algorithm used in this thesis: a sequential trust-region linear programming method (STrLP).

5.1 A benchmark trust-region algorithm

Like SRLP, STrLP is an SL-QP algorithm without an EQP step. However, it generates steps at each iteration by solving a TrLP, rather than an RLP, subproblem. Algorithm 5.1 lists STrLP. It is by no means novel; in fact, it closely resembles the SL-QP implementation by Byrd et al. [3]. The details of this algorithm are meant to mimic those of SRLP as closely as possible, so that the two may be fairly compared: globalization is similarly enforced by an $\ell_1$ merit function; the penalty parameter update method and step acceptance criterion are identical to those of SRLP; and it uses the same feasibility restoration routine as SRLP. The significant difference between STrLP and SRLP is the means by which they control the length of the step generated by subproblems. For this, STrLP uses a trust region. When the quality of a step (Equation 4.5) is deemed too low, the trust-region radius, Δ, is reduced; if it is very high, Δ may be increased.

Algorithm 5.1: Sequential trust-region linear programming
    input: x, y, Δ > 0, q_min, q_high
    1   while x is not optimal for NLP do
    2       if TrLP(Δ, x) is feasible then
    3           /* TrLP step */
    4           d, y, A ← solve TrLP(Δ, x)[A]
    5       else
    6           /* feasibility restoration step */
    7           d ← solve MINFR(x)
    8       end
    9       ν ← penaltyParameter(d, y, x)
    10      if stepQuality(d, ν, x) < q_min then
    11          /* don't accept the step */
    12          Δ ← Δ/2
    13      else
    14          /* accept the step */
    15          if stepQuality(d, ν, x) ≥ q_high then
    16              Δ ← 2Δ
    17          end
    18          x ← x + d
    19      end
    20  end
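The radius logic of Algorithm 5.1 reduces to a few lines of code. In this Python sketch the shrink/grow factors and the cap are illustrative assumptions (the listing above leaves the exact factors to the implementation); only q_min and q_high correspond to the thesis parameters of Section 5.3.1:

    def update_radius(q, delta, q_min=0.1, q_high=0.7,
                      shrink=0.5, grow=2.0, delta_max=1e3):
        """Trust-region radius update in the spirit of STrLP: shrink Delta and
        reject the step when the step quality q (Equation 4.5) is low; grow
        Delta when quality is high.  Returns (new_delta, step_accepted)."""
        if q < q_min:
            return shrink * delta, False    # poor model: smaller, safer steps
        if q >= q_high:
            return min(grow * delta, delta_max), True    # bolder steps
        return delta, True

Contrast this with SRLP, where the analogous control acts multiplicatively on ρ and the rejected subproblem must be re-solved with the larger parameter.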
5.2 Testing environment

Determining the success of an optimization algorithm is quite difficult to do objectively. Undoubtedly, an algorithm will fare better on certain types of problems and worse on others. The Constrained and Unconstrained Testing Environment revisited (CUTEr) [12] is a set of test problems which is commonly used as a benchmarking environment for optimization algorithms. We have selected a subset of the CUTEr problems as our test set for SRLP and STrLP. Only generally-constrained problems (nonlinear programs excluding linear and quadratic programs) for which n + m < 1000 were chosen. From the 225 matching problems, we removed a number which were found to be ill suited to our analysis: 1 unconstrained problem (this problem was labeled generally constrained, but we found it to contain no constraints or variable bounds), 3 problems which would not compile on our system, 5 problems we found to be poorly defined in regions visited by our algorithm (either the objective or constraints did not evaluate to a number), 11 problems which SNOPT declared infeasible, and 38 problems which required more than 3000 iterations to solve. We were left with a set of 167 problems which we refer to as our test set. The list of CUTEr problems included in this test set is given in Appendix B.

Why use such small-scale problems for our analysis, when our goal is efficiency on very large-scale problems? First, we are not directly concerned with the efficiency of SRLP, but rather with the use of RLP subproblems in an SL-QP method. Small-scale test problems allow us to explore a broad range of problems, which helps us to properly analyze certain features of RLP subproblems in Chapter 6. Success on these features can then be extrapolated to the efficiency of an SL-QP algorithm on large-scale problems. Second, developing a solver which robustly and efficiently handles large-scale problems in practice is a very arduous task. Many issues, such as effective use of memory and optimized numerical routines, which can be safely forgotten on small-scale problems, require very careful attention as problems become large. At some point, scaling an algorithm becomes an art rather than a science.

5.3 Implementation

This section describes the details of our implementation of the SRLP and STrLP algorithms. The main logic of these algorithms was implemented in the MATLAB programming language. The bulk of the computations, however, is left to highly optimized third-party Fortran routines.

5.3.1 Parameters

SRLP calls for three initial parameters; we have used $\rho_0 = 1$, $q_{min} = 0.2$, and $q_{high} = 0.8$. For STrLP we used $\Delta_0 = 20$, $q_{min} = 0.1$, and $q_{high} = 0.7$. For both algorithms, $x_0$ and $y_0$ were provided by the CUTEr problem. The initial active-set prediction was calculated as $\mathcal{A}_0 = \mathcal{A}(x_0)$.

5.3.2 Problem formulation

Instead of reformulating problems to the somewhat restrictive form of NLP (1.1), SRLP accepts problems with more generic bound constraints:

\[ \min_x \ f(x) \quad \text{subject to} \quad c(x) = 0, \ b_l \le x \le b_u, \tag{5.1} \]

where $b_l \in \mathbb{R}^n$ and $b_u \in \mathbb{R}^n$ are the lower and upper bounds, respectively.
This generalization has little effect on the details of the algorithms, and allows us to avoid splitting problem variables. Moreover, our subproblem solvers can handle these generic bound constraints directly.

5.3.3 CUTEr problems

The problems in our test set were compiled using the SifDec 2 and CUTEr 2 packages [12]. In the compilation settings for SifDec, we used the g95 Fortran compiler, and the "large scale problems" option. Compiled problems were accessed from MATLAB using the MEX interface provided with CUTEr. CUTEr presents problems in the generic format

\[ \min_x \ f(x) \quad \text{subject to} \quad c_l \le c(x) \le c_u, \ b_l \le x \le b_u. \]

We reformulate these CUTEr problems into the form accepted by SRLP (5.1) by subtracting slack variables from the constraints:

\[ \min_{x,s} \ f(x) \quad \text{subject to} \quad c(x) - s = 0, \ c_l \le s \le c_u, \ b_l \le x \le b_u. \]

5.3.4 Solving RLP subproblems

RLP subproblems (4.1) are solved using SQOPT version 7, a large-scale linear and quadratic programming algorithm [16]. SQOPT is not optimized specifically for handling problems of the form of RLP, but instead solves more general convex quadratic programs. However, it has been shown in practice to be a very robust and efficient solver, and is able to take advantage of the diagonal Hessian of RLP. In our implementation, the RLP subproblems solved at each iteration have generic bounds, and can be written as

\[ \min_d \ g(x)^T d + \tfrac{\rho}{2}\, d^T d \quad \text{subject to} \quad J(x) d = -c(x), \ b_l \le x + d \le b_u. \]

SQOPT accepts problems of the form

\[ \min_x \ g^T x + \tfrac{1}{2} x^T H x \quad \text{subject to} \quad c_l \le A x \le c_u, \ b_l \le x \le b_u. \tag{5.2} \]

Reformulation from one form to the other is straightforward. Importantly, though, SQOPT does not accept the Hessian of our objective as a parameter. Instead, we provide a call-back function which evaluates products with this Hessian. Our call-back function for RLP evaluates Hx, where H := ρI.

SQOPT requires a number of parameters. Those relevant to our implementation are listed in Table 5.1.

Table 5.1: Relevant SQOPT parameters used for solving RLP.

    Feasibility      1e-8
    Optimality       1e-8
    Max iterations   5000
    Scale option     0

At the time of this writing, no MATLAB interface to SQOPT 7 was officially supported, and we developed a custom C MEX interface to the Fortran routines.

The exact process through which SQOPT solves a quadratic program is beyond the scope of this thesis (see instead [10, Section 4] for a detailed treatment). However, as the cost of an SRLP iteration is largely dominated by solving the RLP subproblem, it is important that we consider the source of this expense. In Section 4.6 we referred to minor iterations as the iterations taken by our subproblem solver. SQOPT obtains a search direction Δx by solving the system

\[ \begin{pmatrix} H & W^T \\ W & 0 \end{pmatrix} \begin{pmatrix} \Delta x \\ y \end{pmatrix} = -\begin{pmatrix} g_q \\ 0 \end{pmatrix}, \qquad \text{where} \quad g_q := g + H x_k, \tag{5.3} \]

and the rows of W, the current working-set matrix, are a linearly independent subset of the constraint gradients of (5.2). The special form of H in our RLP subproblem (i.e., H = ρI) would in principle allow us to obtain Δx via the least-squares problem

\[ \min_y \ \|W^T y - g_q\|_2, \qquad \Delta x = -\tfrac{1}{\rho} \left( g_q - W^T y^* \right). \]

An RLP algorithm could solve this least-squares problem efficiently by maintaining a factorization of W. To compute a search direction for RLP, the algorithm would update this factorization. A similar factorization update is performed at each iteration by the simplex method for linear programming. The fundamental cost of solving RLP is, thus, comparable to that of solving TrLP.
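A Python sketch of this least-squares computation with dense linear algebra; a practical solver would instead update a factorization of W as the working set changes. The check at the end verifies that the resulting direction stays on the current working set:

    import numpy as np

    def rlp_search_direction(W, g_q, rho):
        """Search direction for the RLP subproblem with H = rho*I: instead of
        factoring the KKT system (5.3), solve
            minimize_y ||W^T y - g_q||_2,
        then set dx = -(1/rho)*(g_q - W^T y).  By the normal equations,
        W dx = 0, so dx stays on the current working set."""
        y, *_ = np.linalg.lstsq(W.T, g_q, rcond=None)
        dx = -(g_q - W.T @ y) / rho
        return dx, y

    # Tiny check that the direction lies in the null space of W:
    W = np.array([[1.0, 1.0, 0.0]])       # one working constraint, 3 variables
    g_q = np.array([1.0, 2.0, 3.0])
    dx, y = rlp_search_direction(W, g_q, rho=2.0)
    print(np.allclose(W @ dx, 0.0))       # True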
5.3.5 Solving TrLP subproblems

TrLP subproblems are handled almost identically to RLPs (see Section 5.3.4), with SQOPT used as the solver. A TrLP with generic bounds can be written as

    minimize    g(x)ᵀd
    subject to  J(x)d = −c(x),
                b_l ≤ x + d ≤ b_u,
                ‖d‖_∞ ≤ Δ,

which we can rewrite to match the form accepted by SQOPT more closely:

    minimize    g(x)ᵀd
    subject to  J(x)d = −c(x),
                b̂_l ≤ d ≤ b̂_u,

with b̂_u := min(b_u − x, Δ) and b̂_l := max(b_l − x, −Δ), where the min and max are taken componentwise. Implementation details are identical to those described in Section 5.3.4.

5.3.6 Solving MINFR

Feasibility restoration steps in SRLP and STrLP solve MINFR (4.2). With generic bound constraints, this problem can be written

    minimize    ‖d‖₂
    subject to  c(x + d) = 0,                (5.4)
                b_l ≤ x + d ≤ b_u.

We use SNOPT version 7 [15, 10], a general-purpose algorithm for constrained optimization, to solve these problems. SNOPT accepts problems of the generic form

    minimize    f(x)
    subject to  c_l ≤ c(x) ≤ c_u,
                b_l ≤ x ≤ b_u,

which easily accommodates problems in the form of Equation 5.4. Relevant SNOPT parameters used in our implementation are listed in Table 5.2.

    Major feasibility    1e−8
    Optimality           1e−8
    Major iterations     8000
    Minor iterations     500
    Scale option         0

    Table 5.2: Relevant SNOPT parameters used for solving MINFR.

The return status from SNOPT is important to our algorithms. A status of 1 or 2 indicates a successful feasibility restoration step; 11, 12, or 13 indicates that NLP may be infeasible; any other status is considered an error. SNOPT Fortran routines were called from within the MATLAB environment using the MEX interface provided with the package.

5.3.7 Optimality tests

SRLP and STrLP are terminated at iteration k, and the primal-dual pair (x^k, y^k) is deemed optimal, if the following conditions hold:

    max( ‖c(x^k)‖, [x^k]⁻ )      ≤ η (1 + ‖x^k‖₂),
    max( [z^k]⁻, ‖z^k ∘ x^k‖ )   ≤ η (1 + ‖y^k‖₂),

where z^k := ∇f(x^k) − J(x^k)ᵀy^k, ∘ denotes the componentwise product, and [a]⁻ := max(0, max_i(−a_i)). These are the optimality tests proposed by Byrd et al. [3], rearranged to accommodate our formulation of a nonlinear program (which is different from that used by Byrd et al.). We have used the value η = 10⁻⁶ in all our experiments.
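For illustration, here is a minimal MATLAB sketch of this termination test. It follows the reconstruction of the conditions given above; the use of the infinity norm for the inner norms, and the function and variable names, are our own assumptions.

    function stop = isOptimal(fgrad, J, c, x, y, eta)
    % Sketch of the termination test of Section 5.3.7, for dense data.
    % fgrad is the gradient of f at x, J the constraint Jacobian, c = c(x),
    % (x, y) the current primal-dual pair, and eta the tolerance.
      z   = fgrad - J'*y;               % reduced gradient (bound multipliers)
      neg = @(a) max(0, max(-a));       % [a]^- = max(0, max_i(-a_i))
      primal = max(norm(c, inf), neg(x))    <= eta*(1 + norm(x, 2));
      dual   = max(neg(z), norm(z.*x, inf)) <= eta*(1 + norm(y, 2));
      stop = primal && dual;
    end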
5.4 Results

Appendices B and C list the results for SRLP and STrLP, respectively, on our test-set problems. For each problem we list the number of major iterations required to solve the problem, the total number of minor iterations (SQOPT iterations) required, the number of feasibility restoration steps which had to be taken (equivalently, the number of infeasible subproblems which were encountered for this problem), and the total time taken to solve this problem, among other statistics described in the appendices.

Chapter 6

Discussion

The SL–QP method (Chapter 3) was developed in order to alleviate some of the potential inefficiencies experienced by the widely successful SQP method (Section 2.3) when problem dimensions become very large. Careful analysis of current SL–QP algorithms, however, reveals that they too contain many potential inefficiencies that may prove to be performance bottlenecks for very large-scale problems. This chapter discusses three such inefficiencies, and demonstrates how using an RLP subproblem (4.1) in place of a TrLP subproblem (3.2) could remove them.

6.1 Expensive infeasible subproblems

When an SL–QP algorithm encounters an infeasible subproblem, it must go to extra lengths to determine a new iterate. Section 3.3 discussed the two most common solutions to this problem: feasibility restoration and penalty reformulation. While there are tradeoffs between these two methods, it is generally agreed that both are significantly more expensive than a standard iteration, especially for very large-scale problems. Clearly, an SL–QP algorithm should avoid infeasible subproblems whenever possible.

RLP and TrLP subproblems share the same linearized constraints (those which also appear in Equation 3.1), and, if the system these define is inconsistent, both are infeasible. This is the only way an RLP subproblem can be infeasible. Section 3.3 explains, however, that the trust-region constraint can render an otherwise feasible TrLP infeasible. An SL–QP algorithm which uses RLP subproblems will, therefore, encounter fewer infeasible subproblems than one which uses TrLP.

The benefit gained by avoiding these additional infeasible subproblems is, of course, completely dependent on how often they are encountered in practice. Appendices B and C list the number of infeasible subproblems handled for each problem in our test set (or, equivalently, the number of feasibility restoration steps taken) by SRLP and STrLP, respectively. These data are summarized in Figure 6.1, which shows that SRLP handled vastly fewer infeasible subproblems than STrLP: 9 compared with 74. Given the high cost of feasibility restoration and penalty reformulation steps as problems become very large, using an RLP subproblem instead of a TrLP subproblem could greatly increase the efficiency of an SL–QP method due to the reduced number of infeasible subproblems it encounters.

Figure 6.1: The number of problems (out of 167) in our test set (see Section 5.2) on which the SRLP and STrLP algorithms encountered a given number of infeasible subproblems. SRLP, which solves RLP subproblems, only needed to handle 9 infeasible subproblems over this entire set; compare this with its trust-region-based counterpart, which handled 74.

6.2 Slow active-set updates

We have seen that SQP requires a (potentially very expensive) quadratic programming iteration for every variable added to, or removed from, the active set. SL–QP alleviates this cost by replacing these iterations with cheaper, linear programming ones. However, the very nature of using an active-set method to solve these subproblems (whether IQP or TrLP) requires that elements in the active set are updated one-by-one. Such slow updates could prove to be a bottleneck for current SL–QP methods as problem dimensions (and, thus, the combinatorics of the active set) grow. In this section we explore the possibility of solving SL–QP subproblems with a gradient-projection algorithm, which is able to activate many variables during each iteration.

Definition 6.1. A gradient-projection algorithm generates a new iterate x^{k+1} from its current iterate x^k using

    x^{k+1} = P_Ω[ x^k − α^k ∇f(x^k) ],

where P_Ω[·] denotes projection onto the feasible set Ω, and α^k ∈ ℝ is a step length.

The cost of each gradient-projection step is often dominated by the projection onto the feasible set and, as such, gradient-projection methods are known to be much more effective when Ω has a form which is particularly cheap to project onto.
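To illustrate why the projection cost matters, the following MATLAB sketch performs one iteration of Definition 6.1 on a set Ω defined by simple bounds, for which the projection is a componentwise truncation with linear cost. The objective, bounds, and step length are invented for illustration.

    % Sketch: one gradient-projection iteration on the box l <= x <= u.
    fgrad = @(x) x - [3; -3];          % gradient of f(x) = 0.5*||x - [3;-3]||^2
    l = [0; 0];  u = [1; 1];           % example bounds
    proj  = @(x) min(max(x, l), u);    % P_Omega: truncation, O(n) cost

    x     = [0.5; 0.5];                % current iterate
    alpha = 1;                         % step length (fixed here for simplicity)
    xnew  = proj(x - alpha*fgrad(x));  % the iteration of Definition 6.1

    % Many bounds can become active in a single step: here both components
    % of xnew land on their bounds at once (xnew = [1; 0]).
    disp(xnew');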
Neither TrLP nor RLP is particularly well suited to gradient projection: projection onto their linear equality constraints is a quadratic program unto itself. Friedlander et al. [9], however, suggest that the dual problem to RLP has a form which is very well suited to gradient projection, as we now explore.

A satisfactory treatment of duality theory is outside the scope of this thesis, and is not required for this discussion. (The interested reader is directed towards [13, Section 16.8].) Instead, it is sufficient to understand that duality theory for convex quadratic programming tells us that there exists a problem related to RLP (called the dual of RLP) which has identical KKT conditions to it, and that it is possible to obtain a solution for RLP (the primal problem) by finding a solution to this related problem.

Theorem 6.2. The dual problem of RLP is given by

    NNQP(x, ρ):   minimize    (1/(2ρ)) ‖J(x)ᵀy + z − g(x)‖₂² + yᵀc − zᵀx
                  subject to  z ≥ 0.

The dual of RLP, NNQP, is a convex quadratic program with non-negativity constraints. Importantly, its feasible set (Ω = {(y, z) : z ≥ 0}) allows for very cheap projections: simple truncations of z, with linear cost. Gradient projection is very effective on such problems. We can, thus, solve RLP subproblems (albeit indirectly) very efficiently using a gradient-projection algorithm. This would allow for very fast active-set updates, and could prove to be a significant advantage of using RLP over TrLP subproblems in an SL–QP algorithm.

6.3 Inefficient warm-starts

Warm-start efficiency (see Section 2.2.1) is crucially important for the effectiveness of an SL–QP algorithm. This efficiency depends on how good a prediction the active set at one iteration, A^{k−1}, is of that at a subsequent iteration, A^k. We measure this as warm-start quality, which we define in terms of the overlap between these two sets.

Definition 6.3. The warm-start quality, w^k, of iteration k for an active-set method is given by

    w^k = ( |A^{k−1} ∩ A^k| / |A^k| ) × 100%,        (6.1)

where A^{k−1} = A(x^{k−1}) and A^k = A(x^k) are the corresponding active-set predictions, and |·| denotes the cardinality of a set.

An SL–QP algorithm should seek to maximize w^k at each iteration. We suspect, however, that the trust-region constraints in the TrLP subproblems may significantly reduce it, thus impeding the efficiency of the algorithm.

Figure 6.2: The warm-start quality, w^k, measured at each iteration for STrLP on the CUTEr problem HIMMELBI. The full active set (A_F) gives a much lower quality than the natural active set (A_N) because the trust-region constraints continue to vacillate as we approach an optimal point.

Why would the trust-region adversely affect warm-start quality? Let us consider the constraints in TrLP in two subsets: the natural constraints (c(x) + J(x)d = 0, x + d ≥ 0), and the trust-region constraints (‖d‖_∞ ≤ Δ). The natural constraints are a linearized model of those in NLP, and thus are related on subsequent steps. Consequently, we expect some coherence in the active set for these constraints from iteration to iteration, translating into a higher warm-start quality. The trust-region constraints, in contrast, are wholly artificial; they do not model any part of NLP. Instead, they hover around our iterate, x, and grow or shrink suddenly whenever Δ is updated. We expect much less coherence from these trust-region constraints, which would translate into a lower warm-start quality.

In order to quantify the effect of trust-region constraints on warm-start quality, we consider a subset of the active set of TrLP which includes only the natural constraints.

Definition 6.4. The natural active set of TrLP is the set of active natural constraints, and can be written

    A_N := { i : d_i = −x_i }.

We also consider the full active set.

Definition 6.5. The full active set of TrLP is the set of all active constraints in TrLP, including the natural and trust-region constraints. It can be written

    A_F := { i : d_i = max(−x_i, −Δ) } ∪ { −i : d_i = Δ }.

(Definitions 6.4 and 6.5 can be derived directly from Definition 1.3, as is shown in Appendix A.)
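A minimal MATLAB sketch of these definitions, and of the warm-start quality of Equation 6.1, follows. The data and the tolerance tol are invented for illustration; an actual implementation would classify activity using the subproblem solver's own working-set information rather than a numerical test.

    % Sketch: natural and full active sets of TrLP (Definitions 6.4 and 6.5),
    % and the warm-start quality of Equation 6.1.
    x = [0; 2; 5];  d = [0; -1; 1];  Delta = 1;  tol = 1e-10;

    AN = find(abs(d + x) <= tol);                         % A_N = {i : d_i = -x_i}
    lowerActive = find(abs(d - max(-x, -Delta)) <= tol);  % lower/trust-region bound
    upperActive = find(abs(d - Delta) <= tol);            % upper trust-region bound
    AF = [lowerActive; -upperActive];                     % negative index = upper bound

    % Warm-start quality between two successive active-set predictions;
    % Aprev stands in for the previous iteration's prediction.
    Aprev = [1; 2];
    w = 100 * numel(intersect(Aprev, AF)) / numel(AF);
    fprintf('A_N = %s, A_F = %s, w = %.1f%%\n', mat2str(AN'), mat2str(AF'), w);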
Figure 6.3: The warm-start quality at each iteration for both the STrLP and SRLP algorithms on the CUTEr problem HIMMELBI. The full active set (A_F) is used in both cases; the absence of trust-region constraints in SRLP's RLP subproblems leads to a much higher warm-start quality.

Let us first consider this effect on a sample problem from our CUTEr test set: HIMMELBI, which has 12 constraints and 100 bounded variables. We run the trust-region-based solver, STrLP, on this problem, and measure the warm-start quality, w^k, at each iteration k using both A_N and A_F. Figure 6.2 shows the results. Notice how w^k for A_F never reaches 100%: even once we have reached the optimal active set for NLP, the trust-region constraints continue to change iteration by iteration. Warm-start quality for A_N, however, increases very quickly as the active set becomes optimal.

An RLP subproblem includes only the natural constraints (we may use A_N as the active set for all its constraints), and so we expect the warm-start quality of the SRLP algorithm to be much higher than that of STrLP. We again look at HIMMELBI, and measure w^k for the full active set of SRLP (which is equivalent to A_N). Figure 6.3 compares the warm-start quality of the two algorithms, and we indeed see a significant advantage for SRLP. (Note, though, that these solvers also required a different number of major iterations to solve HIMMELBI.)

This result is typical for our test set. The average warm-start quality over all iterations for each problem is given in Appendices B and C. These data are represented in Figure 6.4, which shows a cumulative distribution function of the number of problems with given warm-start qualities.

Figure 6.4: A cumulative distribution function of the average warm-start quality (over all iterations) for all problems in our test set. The results for both SRLP and STrLP are shown; we see a significantly higher fraction of problems with high average quality for SRLP due to the absence of trust-region constraints.

This plot clearly shows that, on our test set, the warm-start quality was indeed significantly higher when using RLP subproblems instead of TrLP. A higher warm-start quality should result in less work required to solve each subproblem. In our implementations, which use SQOPT as the subproblem solver, this translates into fewer minor iterations required per major iteration. Figure 6.5 shows the number of minor iterations required at each iteration—for both SRLP and STrLP—on HIMMELBI, and demonstrates a large reduction in subproblem cost for SRLP due to increased warm-start efficiency.
Figure 6.6 shows a cumulative distribution function of the average number of minor iterations per major iteration for each problem in our test set. We see here that the major iterations for SRLP were indeed cheaper on average than those of STrLP. It seems, then, that a significant advantage can be gained, due to increased warm-start efficiency, by replacing the TrLP subproblems with RLPs in an SL–QP algorithm.

Figure 6.5: The number of minor (SQOPT) iterations taken by both SRLP and STrLP at each iteration while solving HIMMELBI. Due to a much higher warm-start quality, SRLP takes advantage of significantly cheaper iterations than STrLP on this problem.

Figure 6.6: A cumulative distribution function of the average number of minor (SQOPT) iterations taken per major iteration for all problems in our test set. SRLP and STrLP results are shown. Major iterations for SRLP were cheaper over this test set, presumably due to an increased warm-start quality.

Chapter 7

Conclusions

In this thesis, we explored the effectiveness of a proposed variant of current SL–QP methods at solving very large-scale nonlinear programs. Instead of solving a TrLP subproblem (Equation 3.2) at each iteration to update a prediction of the optimal active set, this new method solves an RLP subproblem (Equation 4.1). Two algorithms were presented to facilitate a comparison between these subproblem choices: SRLP, which successively takes steps generated by solving an RLP subproblem (Algorithm 4.3), and STrLP, a benchmarking algorithm which takes TrLP steps (Algorithm 5.1).

Experiments using a diverse set of problems selected from the CUTEr set (Section 5.2) demonstrated that the trust-region constraints in TrLP subproblems can lead to significant inefficiencies. These constraints often greatly reduced the effectiveness of warm-starting successive subproblem solves, which led to more expensive subproblem solves (Section 6.3). They were also shown to induce a significant number of infeasible subproblems, which led to very expensive iterations (Section 6.1).

In contrast, it was shown that using an RLP subproblem instead of a TrLP largely eliminates these inefficiencies. The RLP subproblem contains no artificial constraints and, as such, enjoys a much higher average warm-start efficiency on subsequent solves—and thus cheaper subproblem solves—than TrLP subproblems. Also, regularization cannot induce infeasibility, and thus SRLP encountered far fewer infeasible subproblems than STrLP (Section 6.1).

Additionally, we showed that an RLP subproblem has another potential advantage over a TrLP: its dual problem is a bound-constrained least-squares problem (Section 6.2). This form is very conducive to gradient-projection-type solvers, which could prove significantly more effective than active-set subproblem solvers for certain large-scale problems. An RLP-based algorithm could greatly benefit from the ability to choose a gradient-projection-type subproblem solver.
Based on these results, it seems likely that our proposed variant of SL–QP will prove effective on very large-scale nonlinear programs. It is our intention that the SRLP algorithm presented here will serve as a base for such an algorithm, with an EQP step added to incorporate knowledge of second-order information of NLP. Some important considerations must be made when adding an EQP step, however: (i) a strategy for controlling the length of generated EQP steps, (ii) the selection of a non-degenerate subset of the active set to ensure that EQP is not inconsistent, and (iii) the choice of a trial step which combines the LP and EQP steps. These issues are discussed in detail by Byrd et al. [3] and Waltz [19].

Bibliography

[1] Paul T. Boggs and Jon W. Tolle. Sequential quadratic programming for large-scale nonlinear optimization. Journal of Computational and Applied Mathematics, 124(1-2):123–137, 2000.

[2] Brian Borchers and John E. Mitchell. An improved branch and bound algorithm for mixed integer nonlinear programs. Computers and Operations Research, 21(4):359–367, 1994.

[3] Richard H. Byrd, Nicholas I. M. Gould, Jorge Nocedal, and Richard A. Waltz. An algorithm for nonlinear optimization using linear programming and equality constrained subproblems. Math. Program., 100(1):27–48, 2004.

[4] Richard H. Byrd, Jorge Nocedal, and Richard A. Waltz. Steering exact penalty methods for nonlinear programming. Optimization Methods and Software, 23(2):197–213, 2008.

[5] Choong Ming Chin and Roger Fletcher. On the global convergence of an SLP-filter algorithm that takes EQP steps. Mathematical Programming, 96:161–177, 2003.

[6] Anthony V. Fiacco and Garth P. McCormick. The sequential unconstrained minimization technique for nonlinear programming, a primal-dual method. Management Science, 10(2):360–366, 1964.

[7] R. Fletcher and E. Sainz de la Maza. Nonlinear programming and nonsmooth optimization by successive linear programming. Math. Program., 43(3):235–256, 1989.

[8] Roger Fletcher and Sven Leyffer. Nonlinear programming without a penalty function. Mathematical Programming, 91:239–269, 2002.

[9] Michael P. Friedlander, Nick I. M. Gould, Sven Leyffer, and Todd S. Munson. A filter active-set trust-region method. Technical Report ANL/MCS-P1456-0907, September 2007.

[10] P. E. Gill, W. Murray, and M. A. Saunders. SNOPT: An SQP algorithm for large-scale constrained optimization. SIAM Review, 47:99–131, January 2005.

[11] Jacek Gondzio and Andreas Grothey. A new unblocking technique to warmstart interior point methods based on sensitivity analysis. SIAM Journal on Optimization, 19(3):1184–1210, 2008.

[12] Nicholas I. M. Gould, Dominique Orban, and Philippe L. Toint. CUTEr and SifDec: A constrained and unconstrained testing environment, revisited. ACM Trans. Math. Softw., 29(4):373–394, 2003.

[13] Jorge Nocedal and Stephen J. Wright. Numerical Optimization. Springer, New York, NY, 1999.

[14] F. Palacios-Gomez, L. Lasdon, and M. Engquist. Nonlinear optimization by successive linear programming. Management Science, 28(10):1106–1120, 1982.

[15] Philip E. Gill, Walter Murray, and Michael A. Saunders. User's Guide for SNOPT Version 7: Software for Large-Scale Nonlinear Programming. http://www.stanford.edu/group/SOL/snopt.htm, February 2008.

[16] Philip E. Gill, Walter Murray, and Michael A. Saunders. User's Guide for SQOPT Version 7: Software for Large-Scale Linear and Quadratic Programming. http://www.stanford.edu/group/SOL/sqopt.htm, June 2008.

[17] Stephen M. Robinson.
Perturbed Kuhn-Tucker points and rates of convergence for a class of nonlinear-programming algorithms. Mathematical Programming, 7:1–16, 1974.

[18] Andreas Wächter and Lorenz T. Biegler. On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Math. Program., 106(1):25–57, 2006.

[19] Richard A. Waltz. Algorithms for large-scale nonlinear optimization. http://www.eecs.northwestern.edu/rwaltz/articles.html, June 2002.

Appendix A

Derivation of the active sets for TrLP subproblems

We define an active set for NLP in Definition 1.3. TrLP is a specific case of NLP, and so this definition still holds; however, it requires us to reformulate TrLP into the format given by Equation 1.1. Instead we redefine the active set for TrLP. Section 6.3 defines two subsets of constraints, and accordingly two active sets. In this appendix we derive these definitions from Definition 1.3.

The natural bound constraints of TrLP, rewritten to match NLP, give d ≥ −x. We can thus simply write

    A_N := { i : d_i = −x_i }.

The trust-region constraints make the active set slightly more cumbersome. The bound constraints, together with these trust-region constraints for TrLP, can be written succinctly as

    max(−x_i, −Δ) ≤ d_i ≤ Δ.

Because a variable that is active at its lower bound should not be considered equivalent to one at its upper bound, we should begin by defining two separate active sets for TrLP:

    A⁻ := { i : d_i = max(−x_i, −Δ) }    and    A⁺ := { i : d_i = Δ }.

To simplify our notation, and to remain consistent with our earlier definition of an active set, however, we can combine these active sets by inserting variables at their upper bound with a negative index (in this way the definition of w^k in Equation 6.1 can be used). Our definition of the full active set for TrLP is, thus,

    A_F := A⁻ ∪ (−{A⁺}),    where −{C} := { −a : a ∈ C }.

Appendix B

SRLP Results

This appendix lists the results of the algorithm SRLP on each problem in the test set defined in Section 5.2.

    problem   name of CUTEr problem
    n         number of variables
    m         number of constraints
    maj       number of major iterations taken
    min       number of minor iterations taken
    fr        number of feasibility restoration steps taken
    time      time taken to solve in seconds
    w         measure of subproblem warm-start quality (Equation 6.1)
    f*        value of the objective at solution

    Table B.1: Definition of symbols used in Table B.2.

    problem    n   m   maj   min   fr  time   w        f*
    alsotame   2   1   7     7     0   0.257  95.238   8.21e-02
    bt13       5   1   57    62    0   0.304  99.573   00
    cantilvr   5   1   17    18    0   0.166  100.000  1.34e+00
    cb2        3   3   11    13    0   0.199  96.296   1.95e+00
    cb3        3   3   6     5     0   0.146  97.222   2.00e+00
    chaconn1   3   3   12    10    0   0.150  98.611   1.95e+00
    chaconn2   3   3   6     1     0   0.142  100.000  2.00e+00
    concon     15  11  4     13    0   0.174  99.038   -6.23e+03
    congigmz   3   5   65    79    0   0.290  98.566   2.80e+01
    csfi1      5   4   48    59    0   0.241  98.551   -4.91e+01
    csfi2      5   4   56    144   0   0.285  84.127   5.50e+01
    dembo7     16  20  1464  1583  0   2.586  99.837   1.75e+02
    demymalo   3   3   10    13    0   0.182  94.444   -3.00e+00
    dnieper    61  24  9     269   0   0.211  80.196   1.87e+04
    eqc        9   3   2     5     0   0.174  91.667   -8.30e+02
    expfita    5   22  34    78    0   0.225  98.810   1.14e-03
    gigomez1   3   3   9     11    0   0.181  94.444   -3.00e+00
    gigomez2   3   3   14    16    0   0.155  97.436   1.95e+00
    gigomez3   3   3   6     5     0   0.144  97.222   2.00e+00
    haifas     13  9   35    54    0   0.223  97.947   -4.50e-01

Appendix B.
SRLP Results problem haldinads himmelbk himmelp4 himmelp5 himmelp6 hong hslO hslO2 hslO3 hslO4 hslO6 hslO7 hslO8 hsll hslll hsll2 hsll4 hsll6 hsllT hsll9 hsl2 hsl3 hsl4 hsl5 hsl6 hsl7 hsl8 hsl9 hs2O hs22 hs23 hs24 hs29 hs3O hs3l hs32 hs33 hs34 hs36 hs37 hs4l hs43 hs54 hs55  n 6 24 2 2 2 4 2 7 7 8 8 9 9 2 10 10 10 13 15 16 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4 6 6  m 42 14 3 3 5 1 1 5 5 5 6 6 13 1 3 3 11 14 5 8 1 1 2 2 2 2 2 2 3 2 5 3 1 1 1 2 2 2 1 2 1 3 1 6  maj 12 50 283 28 110 26 16 956 284 94 21 20 25 17 20 663 707 46 351 11 180 14 5 17 4 11 103 5 3 4 20 8 248 15 19 14 9 13 2 9 6 89 2 2  mm 166 121 298 34 123 61 17 3571 1106 105 40 74 60 18 29 794 747 86 571 56 184 3 2 1 0 7 104 3 4 4 22 13 252 7 21 18 7 15 4 13 8 102 4 5  fr 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  time 0.193 0.250 0.614 0.180 0.341 0.209 0.191 3.060 0.835 0.318 0.209 0.200 0.210 0.200 0.200 1.236 1.263 0.250 0.786 0.202 0.567 0.154 0.177 0.175 0.140 0.153 0.289 0.140 0.176 0.194 0.199 0.178 0.557 0.189 0.161 0.188 0.149 0.186 0.170 0.182 0.178 0.313 0.177 0.173  44  f* 88.447 97.530 98.660 93.077 98.393 94.667 100.000 95.359 94.674 99.827 97.768 95.556 97.727 100.000 100.000 99.439 99.852 98.589 99.380 86.806 99.708 95.238 100.000 95.000 93.750 78.571 99.500 85.000 86.667 87.500 99.248 95.000 99.860 98.214 96.667 98.571 97.778 93.846 62.500 90.000 90.000 99.532 100.000 91.667  1.22e-04 5.18e-02 -5.90e+01 -5.90e-I-01 -5.90e+01 2.26e+01 -1.OOe+00 9.12e+02 5.44e+02 3.95e+00 9.53e+03 5.06e+03 -5.OOe-01 -8.50e+00 -4.78e+01 -4.78e+01 -1.76e+03 1.85e+02 3.23e+01 2.45e+02 -3.OOe+01 1.Ole+00 1.39e+00 3.06e+02 2.31e+01 01 5.OOe+00 -6.96e+03 4.02e+01 1.OOe+00 2.OOe+00 -1.OOe+00 -2.26e+01 1.OOe+00 6.OOe+00 1.OOe+00 -4.OOe+00 -8.34e-01 -3300 -3.46e+03 1.93e+00 -4.40e+01 -7.22e-34 6.67e+00  Appendix B. 
SRLP Results problem hs57 hs59 hs6O hs62 hs64 hs65 hs66 hs68 hs7l hs72 hs73 hs74 hs75 hs8O hs8l hs83 hs84 hs85 hs86 hs88 hs89 hs9O hs9l hs92 hs93 hs95 hs96 hs97 hs98 hs99 hubfit kiwcresc loadbal lsnnodoc madsen makelal makela2 matrix2 mconcon miff lini mifflin2 mininaxrb mistake mribasis  n 2 2 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 2 3 4 5 6 6 6 6 6 6 7 2 3 31 5 3 3 3 6 15 3 3 3 9 36  m 1 3 1 1 1 1 2 2 2 2 3 5 5 3 3 3 3 21 10 1 1 1 1 1 2 4 4 4 4 2 1 2 31 4 6 2 3 2 11 2 2 4 13 55  maj 8 88 18 163 44 129 15 363 14 8 7 20 10 8 7 3 69 364 36 192 204 448 407 44 1056 8 8 10 10 123 10 12 37 2 22 6 35 18 4 8 11 27 22 107  mm 10 133 20 181 71 155 19 375 18 27 11 28 11 12 7 5 116 397 81 195 208 453 413 51 1074 20 13 14 13 247 13 17 80 5 36 9 40 27 12 14 15 30 57 239  fr 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  time 0.187 0.376 0.197 0.389 0.207 0.339 0.189 0.712 0.189 0.145 0.181 0.200 0.172 0.180 0.143 0.182 0.284 0.758 0.230 0.794 0.853 1.994 2.032 0.366 1.783 0.181 0.146 0.151 0.151 0.359 0.196 0.196 0.233 0.171 0.201 0.176 0.223 0.200 0.182 0.185 0.187 0.210 0.208 0.377  45  w 100.000 91.786 100.000 100.000 98.333 98.101 96.000 99.898 95.000 100.000 89.796 98.889 96.296 100.000 100.000 83.333 97.750 99.947 96.410 100.000 100.000 100.000 100.000 100.000 99.973 91.250 91.250 92.000 92.000 100.000 92.593 95.556 99.478 83.333 96.732 93.333 99.048 97.500 99.038 93.333 95.556 98.901 96.717 99.486  3.06e-02 -7.80e+00 3.26e-02 -2.63e+04 6.30e+03 9.54e-01 5.18e-01 -9.20e-01 1.70e+01 5.47e+02 2.99e+01 5.13e+03 5.17e+03 5.39e-02 5.39e-02 -3.07e+04 -5.28e+06 -1.68e+00 -3.23e+01 1.36e+00 1.36e+00 1.36e+00 1,36e+00 1.29e+00 1.35e+02 1.56e-02 1.56e-02 4.07e+00 4.07e+00 -8.31e+08 1.69e-02 -4.89e-08 4.53e-01 1.23e+02 6.16e-01 -1.41e+00 7.20e+00 6.16e-07 -6.23e+03 -1.OOe+00 -1.OOe+00 4.51e.-16 -1.OOe+00 1.82e+01  Appendix B. 
SRLP Results problem odf its optcntrl pentagon polakl polak4 polak5 prodplo prodpll qc qcnew rk23 s365 snake stancinin synthesl synthes2 synthes3 tenbarsl tenbars2 tenbars3 truspyrl truspyr2 try—b twobars water womf let zecevic3 zecevic4 zy2 airport batch britgas corel core2 dallasiu eigmaxa eigmaxb  eigmina eigminb expfitb expfitc feedloc grouping haif am  n 10 32 6 3 3 3 60 60 9 9 17 7 2 3 6 11 17 18 18 18 11 11 2 2 31 3 2 2 3 84 48 450 65 157 196 101 101 101 101 5 5 90 100 99  m 6 20 15 2 3 2 29 29 4 3 11 5 2 2 6 14 23 9 8 8 4 11 1 2 10 3 2 2 2 42 73 360 59 134 151 101 101 101 101 102 502 259 125 150  maj 22 3 13 15 4 3 11 10 2 2 7 3 4 2 19 17 35 57 87 76 1854 9 7 15 155 23 22 33 9 72 27 12 19 82 453 2 7 2 7 26 36 152 1 67  mill  32 32 35 17 6 5 184 84 11 5 22 18 5 4 30 87 147 82 110 92 1895 30 1 17 197 36 40 45 10 1086 386 1063 112 365 1391 100 101 101 101 160 718 12375 50 355  fr 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0  time 0.206 0.184 0.199 0.192 0.181 0.175 0.199 0.152 0.171 0.178 0.180 0.176 0.217 0.173 0.204 0.202 0.239 0.469 0.469 0.516 3.238 0.240 0.179 0.212 0.430 0.203 0.219 0.189 0.182 0.462 0.236 0.328 0.202 0.370 1.667 0.698 0.187 0.693 0.186 0.227 0.433 1.924 0.179 0.465  46 w 100.000 85.897 98.901 100.000 91.667 100.000 85.815 92.759 73.077 91.667 97.024 77.778 93.750 90.000 95.833 88.727 95.096 99.581 99.768 99.737 99.982 95.960 100.000 98.077 99.179 96.667 92.857 97.115 93.333 94.520 91.804 98.066 98.005 99.444 99.857 97.525 99.929 97.525 99.929 99.439 99.750 96.584 100.000 99.743  -2.38e+03 5.50e+02 1.56e-04 2.72e+00 -2.66e-13 5.OOe+01 5.88e+01 3.57e+01 -9.57e+02 -8.07e+02 8.33e-02 00 2.32e-14 4.25e+00 7.59e-01 -5.54e-01 1.51e+01 2.30e+03 2.30e+03 2.25e+03 1.12e+01 1.12e+01 1.OOe+00 1.51e+00 1.05e+04 4.57e-08 9.73e+01 7.56e+00 2.OOe+00 4.80e+04 2.59e+05 00 9.lle+01 7.29e+01 -4.82e+04 -01 -9.67e-04 1.OOe+00 9.67e-04 5.02e-03 2.33e-02 6.39e-19 1.39e+01 -4.50e+01  Appendix B. SRLP Results problem himmelbi hydroels lakes leaknet reading6 robot ssebnln tfii tfi3 twirismi yorknet zamb2—i0 zanib2—ii zamb2-8 zamb2—9  n  m  100 169 90 156 102 14 194 3 3 343 312 270 270 138 138  12 168 78 153 50 2 96 101 101 313 256 96 96 48 48  maj 469 21 14 1 111 2 10 467 14 67 16 32 140 24 126  mm 773 95 147 157 376 8 342 596 178 602 532 896 1129 285 722  fr 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0  time 1.176 0.216 0.823 0.178 0.428 0.269 0.206 1.047 0.202 0.751 0.263 0.334 0.656 0.238 0.494  Table B.2: SRLP Results  47  99.505 99.407 99.628 100.000 99.485 81.250 92.931 99.705 98.077 99.665 96.420 95.807 98.815 96.006 98.183  -1.74e+03 -3.47e+06 3.51e+05 1.53e+01 -1.45e+02 5.46e+00 16170600 5.33e+00 4.30e+00 -1.Ole+00 2.44e+04 -1.57e+00 -1.lle+00 -1.52e-01 -3.50e-01  48  Appendix C  STrLP Results This appendix lists the results of the algorithm STrLP on each problem in the test set defined in Section 5.2. problem n m maj mm fr time w  f*  name of CUTEr problem number of variables number of constraints number of major iterations taken number of minor iterations taken number of feasibility restoration steps taken time taken to solve in seconds measure of subproblem warm-start quality (Equation 6.1) value of the objective at solution Table C.1: Definition of symbols used in Table C.2.  
problem alsotanie btl3 cantilvr cb2 cb3 chaconnl chaconn2 concon congigmz csf ii csfi2 dembo7 demymalo dnieper eqc expfita gigomezi gigomez2 gigomez3 haif as  n 2 5 5 3 3 3 3 15 3 5 5 16 3 61 9 5 3 3 3 13  rn 1 1 1 3 3 3 3 11 5 4 4 20 3 24 3 22 3 3 3 9  maj 4 25 28 38 6 38 6 2 16 53 2 2 9 8 1 31 11 38 6 49  mm 2 23 125 39 1 35 0 11 22 139 3 39 10 222 5 127 18 36 1 60  & 0 2 1 0 0 0 0 1 1 1 1 1 0 1 0 0 1 0 0 0  time 0.401 1.989 0.227 0.251 0.142 0.195 0.144 0.241 0.235 0.316 0.213 0.280 0.185 0.382 0.181 0.226 0.250 0.194 0.146 0.249  w 91.667 77.778 33.333 84.848 97.222 85.000 100.000 78.846 86.250 72.629 77.778 77.778 89.583 83.294 83.333 87.037 83.333 84.848 97.222 93.007  8.21e-02 00 1.34e+00 1.95e+00 2.00e+00 1.95e+00 2.00e+00 -6.23e+03 2.80e+01 -4.91e+01 5.50e+01 2.lOe+02 -3.OOe+00 1.87e+04 -8.30e+02 1.14e-03 -3.OOe+00 1.95e+00 2.OOe+00 -4.50e-01  Appendix C. STrLP Results problem haldmads himmelbk himnielp4 himmelp5 himmelp6 hong hslO hslO2 hslO3 hslO4 hslO6 hslO7 hslO8 hsll hslll hsll2 hsll4 hsll6 hsll7 hsll9 hsl2 hsl3 hsl4 hsl5 hsl6 hsl7 hsl8 hsl9 hs2O hs22 hs23 hs24 hs29 hs3O hs3l hs32 hs33 hs34 hs36 hs37 hs4l hs43 hsS4 hs55  n 6 24 2 2 2 4 2 7 7 8 8 9 9 2 10 10 10 13 15 16 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4 6 6  m 42 14 3 3 5 1 1 5 5 5 6 6 13 1 3 3 11 14 5 8 1 1 2 2 2 2 2 2 3 2 5 3 1 1 1 2 2 2 1 2 1 3 1 6  maj 21 27 624 15 317 43 35 28 24 131 21 38 44 27 29 353 679 5 550 26 80 14 8 7 4 14 51 6 3 8 15 2 542 29 37 3 4 15 2 39 22 174 2 1  mm 218 105 1242 18 630 144 37 47 41 544 60 65 265 28 223 2731 925 42 2926 126 133 2 8 1 0 6 52 4 4 10 16 5 1757 35 74 6 1 27 5 86 44 507 4 5  fr 1 0 0 1 2 0 1 1 1 1 0 0 1 1 0 0 0 1 0 0 0 1 0 1 0 0 2 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0  time 0.342 0.217 1.107 0.190 0.716 0.237 0.276 1.118 1.892 0.422 0.203 0.226 0.309 0.236 0.217 0.739 1.186 0.270 1.062 0.212 0.283 0.176 0.192 0.174 0.151 0.152 0.257 0.173 0.183 0.180 0.220 0.170 0.996 0.209 0.188 0.174 0.140 0.190 0.170 0.225 0.207 0.431 0.172 0.171  49 w 86.553 90.486 60.285 66.154 71.772 20.833 66.667 85.714 86.111 69.161 80.357 91.111 79.273 66.667 49.744 42.388 94.260 74.074 75.899 82.083 41.026 90.476 87.500 75.000 93.750 75.000 73.333 75.000 86.667 81.250 90.909 80.000 25.635 73.077 45.833 73.333 95.000 77.778 50.000 59.130 60.000 61.238 78.571 83.333  f* 1.22e-04 5.18e-02 -5.90e+01 -5.90e+01 -5.90e+01 2.26e+01 -1.OOe+00 1.28e+03 9.24e+02 3.95e+00 9.14e+03 5.06e+03 -6.75e-01 -8.50e+00 -4.78e+01 -4.78e+01 -1.77e+03 190 3.23e+01 2.45e+02 -3.OOe+01 1.Ole+00 1.39e+00 3.06e+02 2.31e+01 01 5.OOe+00 -6.96e+03 4.02e+01 1.OOe+00 2.OOe+00 -1.OOe+00 -2.26e+01 1.OOe+00 6.OOe+00 01 -4.OOe+00 -8.34e-01 -3300 -3.46e+03 1.93e+00 -4.40e+01 -1.43e-88 6.67e+00  Appendix C. 
STrLP Results problem hs57 hs59 hs6O hs62 hs64 hs65 hs66 hs68 hs7l hs72 hs73 hs74 hs75 hs8O hs8l hs83 hs84 hs85 hs86 hs88 hs89 hs9O hs9i hs92 hs93 hs95 hs96 hs97 hs98 hs99 hubfit kiwcresc loadbal isnuodoc madsen makelal makela2 matrix2 mconcon miff lini miff].in2 minmaxrb mistake mribasis  n 2 2 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 2 3 4 5 6 6 6 6 6 6 7 2 3 31 5 3 3 3 6 15 3 3 3 9 36  m 1 3 1 1 1 1 2 2 2 2 3 5 5 3 3 3 3 21 10 1 1 1 1 1 2 4 4 4 4 2 1 2 31 4 6 2 3 2 11 2 2 4 13 55  maj 39 54 37 168 7 773 30 376 37 7 3 35 5 26 30 3 9 8 42 27 51 94 70 136 613 1 1 4 4 85 31 34 35 2 38 44 29 23 2 35 78 23 42 58  mm 55 73 65 406 22 2330 46 744 40 14 7 39 3 49 63 5 45 55 119 33 112 316 263 697 2470 11 0 10 12 449 47 43 499 5 57 53 33 29 11 47 160 39 335 544  fr 0 2 0 0 1 1 0 0 0 1 1 1 1 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 0 2 0 0 1  time 0.222 0.310 0.228 0.381 0.196 1.357 0.209 0.741 0.226 0.206 0.210 0.266 0.188 0.237 0.191 0.178 0.183 0.186 0.237 0.294 0.457 0.422 0.658 0.589 1.135 0.170 0.137 0.156 0.171 0.313 0.253 0.309 0.241 0.171 0.225 0.228 0.210 0.295 0.242 0.218 0.373 0.201 0.240 0.505  50  f* 33.333 68.387 5L389 50.000 30.000 25.435 76.000 66.810 82.500 61.111 71.429 88.889 91.111 76.250 75.962 83.333 50.000 84.615 80.303 66.667 50.000 39.231 33.696 31.905 50.050 90.000 90.000 70.000 70.000 48.457 36.111 76.667 78.802 83.333 86.772 76.800 82.540 78.125 78.846 76.842 61.333 74.107 75.413 93.626  3.06e-02 -7.80e+00 3.26e-02 -2.63e+04 6.65e+03 9.54e-01 5.18e-01 -9.20e-01 1.70e+01 7.28e+02 2.99e+01 5.13e+03 5.17e+03 5.39e-02 5.39e-02 -3.07e+04 -2.38e+06 -1.25e-l-00 -3.23e+01 1.36e+00 1.36e+00 1.36e+00 1.36e+00 1.36e+00 1.35e+02 1.56e-02 1.56e-02 3.14e+00 3.14e+00 -8.31e+08 1.69e-02 -4.47e-08 4.53e-01 1.23e+02 6.16e-01 -1.41e+00 7.20e+00 1.03e-01 -6.23e+03 -1.OOe+00 -1.OOe+00 00 -1.OOe+00 3.15e+01  Appendix C. 
STrLP Results problem odf its optcntrl pentagon polaki polak4 polak5 prodpl0 prodpll qc qcnew rk23 s365 snake stancmin synthesl synthes2 synthes3 tenbarsi tenbars2 tenbars3 truspyrl truspyr2 try—b twobars water womf let zecevic3 zecevic4 zy2 airport batch britgas corel core2 dallasm eignlaxa eigiuaxb eigmina eigminb expfitb expfitc feedloc grouping haif am  n 10 32 6 3 3 3 60 60 9 9 17 7 2 3 6 11 17 18 18 18 11 11 2 2 31 3 2 2 3 84 48 450 65 157 196 101 101 101 101 5 5 90 100 99  m 6 20 15 2 3 2 29 29 4 3 11 5 2 2 6 14 23 9 8 8 4 11 1 2 10 3 2 2 2 42 73 360 59 134 151 101 101 101 101 102 502 259 125 150  maj 32 4 15 23 8 23 8 9 1 1 60 2 2 2 37 10 50 104 116 63 303 2 6 35 17 25 54 44 4 192 43 14 2 2 55 2 7 2 7 29 36 2 1 11  mm 148 49 48 25 18 22 71 109 11 5 112 7 3 3 66 61 272 553 606 396 2319 19 1 36 126 35 84 77 5 8015 380 1742 56 133 3570 100 101 101 101 381 1701 601 50 293  & 1 1 0 0 1 1 0 0 0 0 1 0 0 1 0 1 1 1 1 1 1 1 0 1 1 1 1 0 0 1 0 1 1 1 0 1 0 1 0 0 0 1 0 0  time 0.336 0.287 0.195 0.201 0.222 0.242 0.182 0.151 0.174 0.181 0.334 0.171 0.169 0.195 0.221 0.260 0.357 0.522 0.545 0.503 0.800 0.244 0.181 0.245 0.292 0.238 0.289 0.207 0.179 1.236 0.250 5.907 0.497 2.117 0.491 0.701 0.185 0.694 0.186 0.242 0.497 1.349 0.176 0.212  51  w 47.443 80.769 80.000 80.000 70.000 70.000 93.399 91.386 46.154 83.333 93.304 83.333 87.500 70.000 87.281 76.000 87.891 82.463 82.284 78.114 53.866 81.818 100.000 73.529 80.976 85.294 63.281 57.292 85.000 66.555 86.860 95.150 86.290 88.660 71.025 97.525 99.929 97.525 99.929 95.736 98.981 78.940 100.000 96.185  f* -2.38e+03 5.50e+02 1.37e-04 2.72e+00 2.24e+01 5.OOe-l-01 5.88e+01 3.57e+01 -9.57e+02 -8.07e+02 8.33e-02 00 1.20e-13 4.25e+00 7.59e-01 -5.54e-01 1.51e+01 2.30e+03 2.30e+03 2.25e+03 1.17e+01 1.12e+01 1.OOe+00 1.51e+00 1.07e+04 00 9.73e+01 7.56e+00 2.OOe+00 4.80e+04 2.59e+05 00 9.lle+01 7.29e+01 -4.82e+04 -01 -9.67e-04 1.OOe+00 9.67e-04 5.02e-03 2.33e-02 00 1.39e+01 9.98e+01  Appendix C. STrLP Results problem himjnelbi hydroels lakes leaknet reading6 robot ssebnln tfil tfi3 twirismi yorknet zamb2—1O zainb2—i1 zamb2—8 zamb2—9  n 100 169 90 156 102 14 194 3 3 343 312 270 270 138 138  m 12 168 78 153 50 2 96 101 101 313 256 96 96 48 48  maj 447 1 2 1 51 2 3 53 9 53 22 22 19 40 26  mm 11820 30 107 158 1231 8 392 647 478 2891 763 1099 1632 1537 1061  fr 0 0 1 0 1 1 1 0 0 0 1 0 0 0 0  time 3.763 0.178 0.821 0.179 0.752 0.272 1.028 0.316 0.208 0.847 2.341 0.268 0.257 0.303 0.230  Table C.2: STrLP Results  52  w 75.831 98.220 80.952 99.029 89.686 81.250 58.391 94.373 98.077 94.233 93.486 89.344 81.379 82.484 84.097  f* -1.74e+03 -3.43e+06 2.50e+11 8.98e+00 -1.45e+02 5.46e+00 2.06e+07 5.33e+00 4.30e+00 -1.Ole+00 1.45e+04 -1.58e+00 -1.12e+00 -1.53e-01 -3.54e-01  
