Nonlinearly constrained optimization via sequential regularized linear programming

by Mitch Crowe

B.Sc., The University of British Columbia, 2007

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in The Faculty of Graduate Studies (Computer Science)

The University of British Columbia (Vancouver)

August 2010

© Mitch Crowe 2010

Abstract

This thesis proposes a new active-set method for large-scale nonlinearly constrained optimization. The method solves a sequence of linear programs to generate search directions. The typical approach to globalization is based on damping the search directions with a trust-region constraint; our proposed approach is instead based on using a 2-norm regularization term in the objective.

Numerical evidence is presented which demonstrates scaling inefficiencies in current sequential linear programming algorithms that use a trust-region constraint. Specifically, we show that the trust-region constraints in the trust-region subproblems significantly reduce the warm-start efficiency of these subproblem solves, and also unnecessarily induce infeasible subproblems. We also show that the use of a regularized linear programming (RLP) step largely eliminates these inefficiencies and, additionally, that the dual problem to RLP is a bound-constrained least-squares problem, which may allow for very efficient subproblem solves using gradient-projection-type algorithms.

Two new algorithms were implemented and are presented in this thesis, based on solving sequences of RLPs and trust-region constrained LPs. These algorithms are used to demonstrate the effectiveness of each type of subproblem, which we extrapolate onto the effectiveness of an RLP-based algorithm with the addition of Newton-like steps.
All of the source code needed to reproduce the figures and tables presented in this thesis is available online at http://www.cs.ubc.ca/labs/scl/thesis/10crowe/

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgments
1 Preliminaries
  1.1 Introduction
  1.2 Notation and definitions
2 Current approaches for solving nonlinear programs
  2.1 Interior-point methods
  2.2 Active-set methods
  2.3 Sequential quadratic programming
3 Sequential linear-quadratic programming
  3.1 Local algorithm
  3.2 Globalization
  3.3 Infeasible subproblems
4 Sequential regularized linear programming
  4.1 Algorithm outline
  4.2 Infeasible subproblems
  4.3 Step acceptance
  4.4 Regularization parameter updates
  4.5 Lagrange multiplier estimates
  4.6 Algorithm statement
5 Experiments
  5.1 A benchmark trust-region algorithm
  5.2 Testing environment
  5.3 Implementation
  5.4 Results
6 Discussion
  6.1 Expensive infeasible subproblems
  6.2 Slow active-set updates
  6.3 Inefficient warm-starts
7 Conclusions
Bibliography
Appendices
A Derivation of the active sets for TrLP subproblems
B SRLP Results
C STrLP Results

List of Tables

4.1 First-order KKT conditions for NLP and RLP
5.1 Relevant SQOPT parameters used for solving RLP
5.2 Relevant SNOPT parameters used for solving MINFR
B.1 Definition of symbols used in Table B.2
B.2 SRLP Results
C.1 Definition of symbols used in Table C.2
C.2 STrLP Results

List of Figures

3.1 Infeasibility induced by a trust-region
6.1 Infeasible subproblems for SRLP and STrLP
6.2 Warm-start quality of STrLP on HIMMELBI
6.3 Warm-start quality of SRLP on HIMMELBI
6.4 Warm-start quality cumulative distribution function
6.5 Minor iterations per iteration on HIMMELBI
6.6 Minor iterations per iteration cumulative distribution function

Acknowledgments

I owe a huge debt of gratitude to my supervisor, Michael Friedlander. This work would not have been possible without his enthusiastic support and encouragement. Michael allowed me the freedom to explore and to truly enjoy this research, while at the same time managing to keep me on track and focused. I was very fortunate to have him as my supervisor. A special thanks goes to my labmate Ewout van den Berg for his tireless help and for being a great friend. Many thanks go also to Chen Greif, both for his helpful suggestions with regard to this work and for his very kind support throughout my degree.

Mitch Crowe
Vancouver, August 2010

Chapter 1  Preliminaries

1.1 Introduction

Nonlinear programs (NLP), which can be written generically as

\[
\text{NLP} \qquad \min_{x} \; f(x) \quad \text{subject to} \quad c(x) = 0, \quad x \ge 0, \tag{1.1}
\]

have a broad range of applications in engineering, science, and economics. In this thesis, we assume that \(f : \mathbb{R}^n \to \mathbb{R}\) and \(c : \mathbb{R}^n \to \mathbb{R}^m\) are twice-continuously differentiable functions. A huge body of work on solving NLP has been produced in the last 50 years, and a vast and diverse array of algorithms has been developed to solve problems of this form. Even so, one can easily choose an f or a c so that none of the current methods will suffice. In truth, NLP describes such a broad class of optimization problems that the best we can hope to do is to expand the small domain of those we can solve. As computing power grows, real-world applications are beginning to demand the solution of very large-scale NLPs, possibly including millions of variables and constraints.
Unfortunately, this is a domain in which current algorithms have proven to be insufficient. The focus of this thesis is to explore an algorithm which scales gracefully on such problems.

The sequential quadratic programming (SQP) method is a staple for solving NLP [1]. Here, an algorithm successively makes quadratic models of NLP and solves an inequality-constrained quadratic program (IQP) to generate a new iterate. The trouble is that solving a quadratic program can become very expensive as problem dimensions grow and, as such, SQP methods can be inefficient at solving large-scale NLPs.

More recently, a variant of SQP was proposed in order to help it scale to larger problems [7, 3, 5]. This is known as sequential linear-quadratic programming (SL-QP). The idea is rather simple: instead of solving an expensive IQP at each iteration, we solve a cheaper linear program (LP), followed by an equality-constrained quadratic program (EQP). Conclusive evidence has not yet been able to demonstrate the effectiveness of currently implemented SL-QP algorithms; however, it is our belief that this is due to several sources of inefficiencies in these algorithms which become bottlenecks as problem dimensions grow. This thesis will discuss these inefficiencies, and propose a significant modification to the basic SL-QP method which is meant to alleviate them. Specifically, we propose a different means of controlling the step length generated by the LP subproblem at each iteration: current methods use a trust-region constraint, where we use regularization.

In the next two chapters, we discuss the current landscape of algorithms for solving NLP. Chapter 4 then presents a new algorithm, which we call sequential regularized linear programming (SRLP). This is not an SL-QP algorithm, because it does not solve an EQP subproblem at each iteration. However, since convergence guarantees of current SL-QP algorithms rely solely on the LP step, with the EQP step added simply to improve convergence rates [3, 5], this algorithm will provide a suitable means of testing the effectiveness of regularized LP subproblems versus those with a trust-region constraint added. Chapter 5 describes the implementation details of the SRLP algorithm, together with our testing environment. Finally, Chapter 6 discusses the advantages of solving RLP versus TrLP subproblems. We demonstrate several sources of inefficiencies in current SL-QP algorithms, and show how an RLP-based algorithm would fare better.

1.2 Notation and definitions

This section presents a brief review of some of the optimization background required for a proper reading of this thesis. Along the way, we will encounter some important definitions and notation that are used throughout this text. This review is not meant to be comprehensive, and the curious reader is directed to the excellent book by Nocedal and Wright [13] for a more detailed background to this material.

1.2.1 Optimization basics

The field of optimization can be roughly summarized as the pursuit of a vector \(x \in \mathbb{R}^n\) which minimizes a function \(f : \mathbb{R}^n \to \mathbb{R}\), where x is limited to lie within a constraint set \(\mathcal{F}\). We can write this as the generic optimization problem

\[
\text{P} \qquad \min_x \; f(x) \quad \text{subject to} \quad x \in \mathcal{F}.
\]

Here, f(x) is the objective, and \(\mathcal{F}\) the feasible set. If \(\mathcal{F} = \emptyset\) (i.e., if there is no x which satisfies the condition \(x \in \mathcal{F}\)), we say that P is infeasible, or inconsistent. Otherwise it is feasible, or consistent.
1.2.2 Nonlinear programming

Problem NLP (1.1) is a special case of P. Here \(c : \mathbb{R}^n \to \mathbb{R}^m\) is referred to as the constraint function, the condition c(x) = 0 as the constraints, and the condition x ≥ 0 as the variable bounds. The feasible set for NLP, then, is the set of points which satisfy the constraints and the variable bounds, i.e.,

\[
\mathcal{F} = \{\, x : c(x) = 0 \ \text{and} \ x \ge 0 \,\}.
\]

Formulations with more generic constraints (such as inequality constraints, c(x) ≤ 0) can easily be rewritten in this form, and so we will be able to consider it without loss of generality.

The Lagrangian is an important concept for nonlinear programming.

Definition 1.1. The Lagrangian of NLP is

\[
\mathcal{L}(x, y, z) := f(x) - y^T c(x) - z^T x,
\]

where \(y \in \mathbb{R}^m\) and \(z \in \mathbb{R}^n\); its gradient with respect to x is given by

\[
g_{\mathcal{L}}(x, y, z) := \nabla_x \mathcal{L}(x, y, z) = \nabla f(x) - \nabla c(x)^T y - z;
\]

and its Hessian with respect to x is given by

\[
H(x, y, z) := \nabla^2_{xx} \mathcal{L}(x, y, z) = \nabla^2 f(x) - \sum_i y_i \nabla^2 c_i(x).
\]

The first-order KKT (Karush-Kuhn-Tucker) conditions must hold at an optimal point of NLP.

Definition 1.2. The point x satisfies the first-order KKT conditions (and is called a first-order KKT point) of NLP if and only if there exist vectors \(y \in \mathbb{R}^m\) and \(z \in \mathbb{R}^n\) such that

\[
g_{\mathcal{L}}(x, y, z) = 0, \quad c(x) = 0, \quad x \ge 0, \quad z \ge 0, \quad x \circ z = 0,
\]

where \((x \circ z)_i := x_i z_i\).

Under certain conditions on the constraints (the linear independence constraint qualification, for instance) it can be shown that any local solution, x*, to NLP must be a first-order KKT point; see [13, Theorem 12.1]. The converse is not necessarily guaranteed, however. This requires the satisfaction of stronger conditions, collectively known as the second-order sufficient conditions, which are outside the scope of our current discussion, or convexity (see [13, Theorem 12.6]).

This thesis focuses on active-set methods, which require the following definition:

Definition 1.3. The active set, \(\mathcal{A}(x)\), for NLP at a given point x is the set of variables which are at their bounds, i.e.,

\[
\mathcal{A}(x) := \{\, i : x_i = 0 \,\}.
\]

The active set at the solution to NLP, which we refer to as the optimal active set, is of special importance to active-set methods. We use the shorthand notation \(\mathcal{A}^* := \mathcal{A}(x^*)\).
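To make the standard form (1.1) and Definition 1.3 concrete, here is a small illustrative MATLAB example; fmincon is used only for convenience and is not part of the methods developed in this thesis.

```matlab
% Toy instance of NLP (1.1): one equality constraint and x >= 0.
% On the line x1 + x2 = 1 the objective is 2 + 2*t^2 with x2 = t >= 0,
% so the minimizer is x* = (1, 0) and the optimal active set is A* = {2}.
f       = @(x) (x(1) - 2)^2 + (x(2) - 1)^2;
nonlcon = @(x) deal([], x(1) + x(2) - 1);      % c(x) = x1 + x2 - 1 = 0
xstar   = fmincon(f, [0.5; 0.5], [], [], [], [], zeros(2,1), [], nonlcon);
activeSet = find(abs(xstar) < 1e-8);           % A(x*), Definition 1.3
```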
Chapter 2  Current approaches for solving nonlinear programs

There have been many different methods proposed for solving NLP (1.1). The methods that we consider can broadly be divided into two main classes: interior-point and active-set methods.

2.1 Interior-point methods

The variable bounds x ≥ 0 can add a significant cost to solving NLP: the number of ways x can be chosen active or inactive grows combinatorially with the number of variables in the problem. Interior-point methods reformulate these inequality constraints, penalizing their violation by adding a barrier function to the objective.

Many different barrier functions could be employed; however, a common choice is the log-barrier. The most basic log-barrier interior-point method repeatedly solves problems of the form

\[
\text{LBNLP}(\mu) \qquad \min_x \; \Phi_\mu(x) := f(x) - \mu \sum_i \ln(x_i) \quad \text{subject to} \quad c(x) = 0,
\]

where μ > 0 is a sequence of barrier parameters such that μ → 0. The objective \(\Phi_\mu\) has the property that \(\Phi_\mu(x) \to \infty\) as any \(x_i \to 0^+\), which forces each iterate to lie strictly in the interior of the feasible set (hence the term "interior point"). As μ → 0 the barrier weakens, and it can be shown under mild assumptions that the solution to LBNLP(μ) approaches the solution of NLP [13, Theorem 17.3]. Algorithm 2.1 shows a basic log-barrier interior-point method.

Algorithm 2.1: Basic log-barrier interior-point method
input: x, μ > 0
1 while x is not optimal for NLP do
2   x ← solve LBNLP(μ)
3   μ ← reduce barrier parameter
4 end

These methods were first introduced in the context of nonlinear programming by Fiacco and McCormick in 1964 [6], but were subsequently abandoned in favour of active-set methods such as sequential quadratic programming (see Section 2.3). More recently, however, they have enjoyed a strong resurgence, and have been shown to be quite effective on medium- to large-scale problems (see [18], for example).
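For illustration only, Algorithm 2.1 can be sketched in MATLAB as follows; fmincon stands in for the equality-constrained subproblem solver, and the barrier-reduction factor and tolerance are arbitrary choices, not those of any method studied in this thesis.

```matlab
% Minimal sketch of Algorithm 2.1. f and c are function handles, x0 > 0.
function x = logBarrierSketch(f, c, x0)
  x = x0;  mu = 1;
  while mu > 1e-8
    phi     = @(v) f(v) - mu*sum(log(v));   % barrier objective Phi_mu
    nonlcon = @(v) deal([], c(v));          % equality constraints c(v) = 0
    lb      = zeros(size(x));               % keeps iterates in the domain of log
    x  = fmincon(phi, x, [], [], [], [], lb, [], nonlcon);
    mu = mu/10;                             % reduce the barrier parameter
  end
end
```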
2.2 Active-set methods

If \(\mathcal{A}^*\), an optimal active set for NLP, were known a priori, one could simply determine an optimal point for NLP by fixing the variables in this set at their bounds, ignoring other bound constraints, and solving a much cheaper equality-constrained problem. In this sense, much of the cost of solving NLP can be said to be the determination of an optimal active set. Where interior-point methods skirt this issue by removing the bound constraints, active-set methods seek the optimal active set directly by successively updating a sequence of predictions.

Algorithm 2.2: Generic active-set method
input: A
1 while A ≠ A* (the optimal active set for NLP) do
2   A ← generate new active set prediction
3 end

Although interior-point methods can be very robust and efficient, one clear advantage of active-set methods is in warm-starting.

2.2.1 Warm-starting

When information received from solving one problem is used to speed the convergence of a second, related problem, we say that we have warm-started the second problem. Active-set methods are readily warm-started. Two problems which are only slightly perturbed from each other are likely to have very similar optimal active sets, and we can use the optimal active set from one as an initial prediction for the optimal active set of the second. In contrast, interior-point methods are very difficult to warm-start effectively. (Strong efforts have been made recently to improve interior-point warm-starting, but success is still limited compared to that enjoyed by active-set methods [11].)

There is a vast array of applications which require the solution to many similar NLPs. Consider, for instance, the branch-and-bound method for mixed-integer nonlinear programming, where a related NLP is solved at every major iteration [2]. In such applications, effective warm-starting can be crucial to the success of an algorithm: potentially, the cost of solving a series of related problems can be much less than that of solving them individually. As such, because they are readily warm-started, and in spite of the significant successes of interior-point methods, we believe the pursuit of active-set methods for large-scale NLPs to be a very important topic.

The algorithms in the remainder of this thesis, all of them active-set methods, make heavy use of warm-starting. Each of them successively solves a series of related subproblems. The optimal active set from the most recently solved subproblem, A, is used as an initial prediction of the optimal active set for the following subproblem. The efficiency of these warm-starts plays a crucial role in the effectiveness of the algorithm in question, as we discuss in Chapter 6.

As a reminder that we are using an active-set prediction for warm-starting our subproblems, algorithms in this thesis write this explicitly. When an algorithm states, for instance, "solve IQP(x)[A]", it is understood that the subproblem IQP(x) is solved using A as an initial active-set prediction. (An important technical detail must be addressed here. The set of active variables defined by some A, together with a set of equality constraints, c(x) = 0, may define an inconsistent system. In this case we say that the active set is degenerate. Practical algorithms choose a non-degenerate subset of A, a "working set" [3, 19, 10]. Unless otherwise stated, we assume throughout this thesis that the active sets generated by the algorithms in question are non-degenerate.)

Many successful active-set algorithms exist. Perhaps the most well known is the simplex method for linear programming [13, Chapter 13]. For nonlinear programming, however, sequential quadratic programming (SQP) [1] has proven to be one of the most effective active-set methods, as demonstrated by the success of algorithms such as SNOPT [10].

2.3 Sequential quadratic programming

SQP is, essentially, a generalization of Newton's method to the constrained optimization case. At each iteration, SQP methods form a quadratic model of NLP about the current iterate (a quadratic model of the Lagrangian, and a linear model of the constraints) and minimize this model to determine a new iterate. Its subproblems can be written as

\[
\text{IQP}(x) \qquad \min_d \; g(x)^T d + \tfrac12 d^T H(x) d \quad \text{subject to} \quad c(x) + J(x)d = 0, \quad x + d \ge 0,
\]

where \(g(x) := \nabla f(x)\) and J(x) is the Jacobian of the constraints.

Algorithm 2.3: Sequential quadratic programming
input: x, A
1 while x is not optimal for NLP do
2   d, A ← solve IQP(x)[A]
3   x ← x + d
4 end

Algorithm 2.3 lists a basic SQP algorithm (note that this is simply a local algorithm; we will discuss globalization techniques in Section 3.2). SQP allows for any method of solving IQP. Active-set quadratic-programming methods [16], however, are a common choice because of their warm-start efficiency. In this case, SQP itself is an active-set method, iteratively updating an estimate, A, of the optimal active set, A*, for NLP. Under mild conditions it can be shown that (after incorporating globalization techniques similar to those discussed in Section 3.2) iterates generated by SQP give A = A* within a finite number of steps [17], and that, once this occurs, they achieve quadratic convergence towards a solution, x* [13, Theorem 18.2].

Chapter 3  Sequential linear-quadratic programming

3.1 Local algorithm

As numbers of constraints and variables increase into the millions, each iteration of SQP can become very expensive [3]. This cost is dominated by solving the IQP subproblem, which may be very slow for large-scale problems considering the combinatorial difficulties implicit in the bound constraints, together with the possibility of a nonconvex quadratic objective to handle. The idea behind sequential linear-quadratic programming (SL-QP) was first introduced by Fletcher and Sainz de la Maza [7] as a means of reducing the cost of major iterations compared with SQP; more recently, algorithms were proposed both by Chin and Fletcher [5] and by Byrd, Gould, Nocedal and Waltz [3]. Instead of solving IQP, SL-QP solves at each iteration a linear model of NLP given by

\[
\text{LP}(x) \qquad \min_d \; g(x)^T d \quad \text{subject to} \quad c(x) + J(x)d = 0, \quad x + d \ge 0, \tag{3.1}
\]

in order to update its estimate of the optimal active set. Here, SL-QP bears a resemblance to sequential linear programming methods [14], which have been used for many years.
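Assuming the gradient g, Jacobian J, and constraint values c at the current iterate x are available as arrays, the LP subproblem (3.1) takes only a line with MATLAB's linprog (our implementation instead uses SQOPT; see Chapter 5):

```matlab
% Sketch of LP(x) (3.1). The bound x + d >= 0 becomes a lower bound on d.
% Note that without a trust region or regularization this LP may be unbounded.
d = linprog(g, [], [], J, -c, -x, inf(size(x)));   % J*d = -c, d >= -x
```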
However, SL-QP methods generate an additional search direction by solving the equality-constrained quadratic program

\[
\text{EQP}(x, \mathcal{A}) \qquad \min_d \; g(x)^T d + \tfrac12 d^T H(x) d \quad \text{subject to} \quad c(x) + J(x)d = 0, \quad d_i = 0 \ \text{for} \ i \in \mathcal{A}.
\]

This additional step has the benefit of allowing for a superlinear convergence rate, as experienced by SQP methods. Importantly, EQP, a quadratic model of NLP in which active-set variables are fixed and other bounds are ignored, can be much cheaper to solve than IQP, as it avoids handling variable bounds. Algorithm 3.1 lists a basic SL-QP algorithm.

Algorithm 3.1: Basic local SL-QP method
input: x, A
1 while x is not optimal for NLP do
2   A ← solve LP(x)[A]
3   d ← solve EQP(x, A)
4   x ← x + d
5 end

SQP methods have a single subproblem (IQP) that gives active-set prediction and fast convergence. SL-QP methods decouple these objectives into more specialized subproblems. The hope is that the LP step will give a good prediction of the optimal active set, without the significant cost of solving IQP (LP is a simpler and more structured problem than IQP, and linear-programming technology is more advanced than that of quadratic programming) and, as such, will handle large-scale problems more efficiently. Algorithm 3.1 on its own, however, is not guaranteed to converge. In the next two sections, we discuss features incorporated into Chin et al.'s algorithm and into Byrd et al.'s algorithm which allow for such guarantees.

3.2 Globalization

Algorithm 3.1 is a local algorithm. That is, starting at a point x arbitrarily far from a solution to NLP, it may never converge to a solution. To drive iterates towards a solution from an arbitrary starting point (globalization), iterative algorithms typically employ a linesearch or a trust region, both of which control the size of the step d taken at each iteration. The algorithms of Byrd et al. [3] and of Chin et al. [5] are both trust-region methods. Instead of solving LP, which may be a very poor model of NLP far from the current iterate x, they solve

\[
\text{TrLP}(x, \Delta) \qquad \min_d \; g(x)^T d \quad \text{subject to} \quad c(x) + J(x)d = 0, \quad x + d \ge 0, \quad \|d\|_\infty \le \Delta, \tag{3.2}
\]

where the step d is forced to lie within an infinity-norm ball of radius Δ (called the trust-region radius). The goal is to force the generated step to lie within a region in which we trust our model of NLP. In a trust-region approach, if a step is deemed to have made sufficient progress towards a solution, Δ is held constant, or even increased to allow bigger (bolder) steps. Otherwise, our model of NLP may be too poor, and Δ must be reduced to take smaller (safer) steps.
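Because the trust region is an infinity-norm ball, it folds directly into the simple bounds on d (a fact exploited again in Section 5.3.5). A sketch, under the same assumptions as the LP sketch above:

```matlab
% Sketch of TrLP(x, Delta) (3.2): the trust region and x + d >= 0
% reduce to componentwise bounds on the step d.
lb = max(-x, -Delta);                  % d >= -x and d >= -Delta
ub = Delta*ones(size(x));              % d <= Delta
d  = linprog(g, [], [], J, -c, lb, ub);
```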
But how do we determine whether a step gives sufficient progress? Clearly a step which reduces the objective and brings us closer to the feasible region for NLP should be taken. By the same token, one which increases the objective and moves us farther away from the feasible region should not be taken. However, what if a step generated by solving TrLP significantly reduces the objective, but moves us farther away from the feasible region of NLP? The answer is not obvious. Clearly some tradeoff between the goals of minimizing the objective and becoming feasible must be made. Answers to how to make this tradeoff have been the subject of a great deal of research, and this is an issue on which the algorithms of Byrd et al. [3] and of Chin et al. [5] differ. Byrd et al. employ a merit function and Chin et al. use a filter. We discuss both ideas now.

3.2.1 Merit functions

For unconstrained problems, sufficient progress towards a solution can simply be equated to sufficient descent in the objective; this single function defines the goal of a step. Merit functions seek to generalize this concept to constrained optimization. They combine the objective and constraint violation into a single function, φ, known as the merit function. Sufficient progress is then determined by whether or not a step generates sufficient descent on this combined function. One way to measure constraint violation is via the function

\[
h(x) := \|c(x)\|_1. \tag{3.3}
\]

(The algorithms we consider explicitly enforce bound constraints at each iteration, and so we do not consider bound violation here.) A common merit function, and the one employed by Byrd et al., is the ℓ1 merit function

\[
\phi_1(x, \nu) := f(x) + \nu h(x), \tag{3.4}
\]

where ν > 0. This is an exact merit function [13, Section 15.3].

Definition 3.1. A merit function φ(x, ν) is exact if there exists some ν* > 0 such that, for any ν > ν*, every local solution of NLP corresponds to a local minimizer of φ(x, ν) [13, Definition 15.1].

It can be shown that, for the ℓ1 merit function,

\[
\nu^* = \|y^*\|_\infty, \tag{3.5}
\]

where y* are the Lagrange multipliers at a solution to NLP, satisfies Definition 3.1 [13].

We refer to ν as the penalty parameter; it controls the amount of penalization given to constraint violations. Choosing an appropriate value, ν^k, for this penalty parameter at each iteration k is a difficult task. Convergence proofs for an algorithm which uses an exact merit function typically require that the algorithm eventually generates iterates with ν^k > ν*. Perhaps the simplest solution is to choose some ν0 >> ν* and to set ν^k = ν0 at every iteration. This is not effective in practice for several reasons, however. First, we do not know y* (or, consequently, ν*) until we've actually solved the problem, so it is impossible to know for sure whether ν0 is large enough. Second, there may be multiple solutions to NLP, and we may thus be required to know ν* for all of them. Third, it has been observed in practice that if ν^k is chosen too large, convergence can be very slow, and, so, choosing an arbitrarily large ν0 may be very inefficient [3]. Despite this difficulty, heuristics have been developed for choosing ν^k that have been shown to be effective in practice [4, 13]. We discuss one such heuristic in Section 4.3.
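In code, the ℓ1 merit function is a one-liner; the following sketch assumes f and c are function handles and checks only raw descent (the sufficient-descent test we actually use appears in Section 4.3):

```matlab
% Sketch: constraint violation (3.3) and the l1 merit function (3.4).
h    = @(x) norm(c(x), 1);
phi1 = @(x, nu) f(x) + nu*h(x);
descends = phi1(x + d, nu) < phi1(x, nu);   % raw descent only
```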
3.2.2 Filters

The ℓ1 merit function combines the goals of minimizing f(x) and h(x). In contrast, filter methods, introduced by Fletcher and Leyffer [8], view these as two separate and competing goals, i.e., minimize f(x) and minimize h(x). Instead of instituting an arbitrary trade-off between these goals, a filter method considers only whether the objective-infeasibility pair at a given point, (f1, h1) := (f(x1), h(x1)), is unequivocally worse than another, (f2, h2). If so, we say that the latter dominates the former.

Definition 3.2. A pair (f2, h2) dominates another pair (f1, h1) if and only if f2 ≤ f1 and h2 ≤ h1 [8, Definition 1].

In the basic filter method, a step is acceptable if the new objective-infeasibility pair it would give, (f, h), is not dominated by that of any previous step taken by the algorithm. This is determined by maintaining a list of previously accepted pairs: a filter. Each time a step is accepted, its pair may be added to the filter (and, for efficiency, any pairs in the filter which it dominates are subsequently removed).

Definition 3.3. A filter is a list of pairs (f, h) such that no pair dominates any other [8, Definition 2].

A step has sufficient progress if it is acceptable to the filter.

Definition 3.4. A step d generated about the current iterate x is acceptable to a filter if the pair (f(x + d), h(x + d)) is not dominated by any pair in the filter.

The SL-QP algorithm of Chin et al. [5] uses a filter. Two of the main draws of this globalization method are that fewer steps are rejected by a filter than by a merit function (it is, in some sense, as accepting as possible), and that no arbitrary parameter is required. However, while the basic idea is quite attractive due to its elegance, many less attractive features must be added in practice to ensure convergence [8]. In particular, the notion of an envelope (which introduces the very tradeoff between f and h we were trying to avoid), and the requirement that certain iterations neglect filter acceptability and instead give descent on the objective, detract from the elegance of the method. Effectiveness of filters in practice is also less established than that of merit functions.
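Definitions 3.2 through 3.4 translate directly into a few lines; in this sketch the filter F is simply a k-by-2 array of previously accepted (f, h) pairs:

```matlab
% Sketch: filter acceptance (Definition 3.4).
isDominated = @(F, p) any(F(:,1) <= p(1) & F(:,2) <= p(2));
trial       = [f(x + d), h(x + d)];
acceptable  = ~isDominated(F, trial);
```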
[Figure 3.1: A small trust region can induce a TrLP subproblem to be infeasible. Here, every point on the line c(x) + J(x)d = 0 is feasible for the corresponding LP subproblem. However, none of these points lie within the trust-region radius, Δ, of x, and so the TrLP subproblem is infeasible.]

3.3 Infeasible subproblems

Algorithm 3.1 assumes that we can solve TrLP at every iteration. In practice, however, this is not always possible. Assuming that the original NLP is feasible, there are two ways that the TrLP subproblem can be infeasible. First, the linearized constraints about a point, x, may themselves be infeasible: if the constraint gradients are degenerate, then the system J(x)d = -c(x) may simply have no solution. Second, if the trust region is too small, the TrLP subproblem may be infeasible even if the corresponding LP subproblem was not. Consider Figure 3.1. Here, the linearized constraints, c(x) + J(x)d = 0, while themselves consistent, cannot be satisfied within the trust-region radius, Δ, of x, and so the TrLP subproblem is infeasible.

If an infeasible subproblem is encountered, a successful algorithm must have a means of determining a new step and proceeding with its iterations. Two common techniques for this are feasibility restoration, which is utilized in Chin et al.'s algorithm [5], and penalty reformulation, which Byrd et al. use [3]. We explore these alternatives next.

3.3.1 Feasibility restoration

If x is feasible for NLP, then the subproblem TrLP(x, Δ) is guaranteed to be feasible for any Δ (we can see immediately that the step d = 0 satisfies the constraints in TrLP when c(x) = 0 and x ≥ 0). Feasibility restoration searches for such a feasible point, so that we can be sure that the next TrLP subproblem to solve will be feasible. It was introduced by Fletcher and Leyffer [8] specifically for use with the filter method; however, there is nothing in the method that prevents one from using it with a merit-function method. A feasibility restoration step attempts to find a feasible point by solving the problem

\[
\text{FR} \qquad \min_x \; h(x) \quad \text{subject to} \quad x \ge 0, \tag{3.6}
\]

where h is defined in Equation 3.3. If the objective at a solution to FR is nonzero, then we could not find a feasible point, and NLP is declared infeasible. Otherwise, we have found a feasible point for NLP, and may proceed with the regular iterations of our algorithm.

Any algorithm which can minimize a nonlinear objective subject to bound constraints is suitable to generate a feasibility restoration step. However, Fletcher and Leyffer [8] do describe a method designed specifically for this purpose. Unfortunately, feasibility restoration can be quite expensive. Indeed, FR can be almost as difficult to solve as our original NLP! Another potential weakness of feasibility restoration is that it has essentially given up on the objective and, as such, the step it generates may unnecessarily increase it.

3.3.2 Penalty reformulation

Penalty reformulation is a staple technique for handling infeasible subproblems. Its use in practice is described in [10], and it is also discussed in detail in [13]. Here, the view is that there is no need to satisfy the linearized constraints at every iteration. If an infeasible TrLP subproblem is encountered, a penalty reformulation method instead solves

\[
\text{PLP}(x, \mu) \qquad \min_d \; g(x)^T d + \mu \|c(x) + J(x)d\|_1 \quad \text{subject to} \quad x + d \ge 0, \quad \|d\|_\infty \le \Delta,
\]

which is formed by replacing the explicit constraints (c(x) + J(x)d = 0) with a term in the objective that penalizes their violation. Importantly, PLP is always feasible as long as x ≥ 0, which the algorithms we consider enforce explicitly at each iteration.

It is common to reformulate PLP as a linear program, thus taking advantage of the availability of highly tuned large-scale linear programming solvers. PLP can be rewritten as

\[
\text{PR}(x, \mu) \qquad \min_{d,q,r} \; g(x)^T d + \mu e^T q + \mu e^T r \quad \text{subject to} \quad c(x) + J(x)d = q - r, \quad x + d \ge 0, \quad q, r \ge 0, \quad \|d\|_\infty \le \Delta,
\]

where \(q, r \in \mathbb{R}^m\) are slack vectors, and \(e \in \mathbb{R}^m\) is a vector of ones.

Two advantages of penalty reformulation over feasibility restoration are that PR is much cheaper to solve than FR, and that it includes objective information instead of throwing it away. The step generated by PR, however, may not give as much of a reduction in infeasibility as FR. Also, penalty reformulation requires the choice of a parameter, μ. Furthermore, the effectiveness of a method using penalty reformulation can be very sensitive to the choice of this parameter, and finding an appropriate value may be quite expensive in practice [4]. While PR has the same linear-program form as TrLP, it has n + 2m variables compared with n for the latter, so it may be significantly more expensive to solve. This, together with a potentially expensive routine for selecting μ (Byrd et al. [3] suggest a full extra solve of PR!), makes penalty-reformulation iterations quite costly for an algorithm.

Chapter 4  Sequential regularized linear programming

This chapter presents a new active-set method, SRLP (sequential regularized linear programming), for solving large-scale NLPs.

4.1 Algorithm outline

In order to control the size of steps generated at each iteration, current SL-QP algorithms add a trust-region constraint to the LP subproblem, solving TrLP. In contrast, SRLP controls the step size and enforces global convergence by adding a regularization term to the LP subproblem, which penalizes overly large steps. At each iteration a regularized linear program,

\[
\text{RLP}(x, \rho) \qquad \min_d \; g(x)^T d + \tfrac{\rho}{2}\|d\|_2^2 \quad \text{subject to} \quad c(x) + J(x)d = 0, \quad x + d \ge 0, \tag{4.1}
\]

is solved. This idea was first proposed in a technical report by Friedlander, Gould, Leyffer, and Munson [9], but to our knowledge it has not yet been implemented.
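Since (4.1) is a convex quadratic program whose Hessian is the diagonal matrix ρI, any convex QP solver applies. A hedged sketch with MATLAB's quadprog (our implementation uses SQOPT; see Section 5.3.4):

```matlab
% Sketch of RLP(x, rho) (4.1): minimize g'*d + (rho/2)*||d||^2
% subject to J*d = -c and d >= -x.
n = numel(x);
d = quadprog(rho*speye(n), g, [], [], J, -c, -x, inf(n,1));
```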
Algorithm 4.1 lists an outline of our proposed SRLP algorithm.

Algorithm 4.1: Sequential regularized linear programming (outline)
input: x, ρ > 0
1 while x is not optimal for NLP do
2   if RLP(x, ρ) is infeasible then
3     d ← handle infeasible subproblem          /* Section 4.2 */
4   else
5     d ← solve RLP(x, ρ)[A]
6     while d is not acceptable do              /* Section 4.3 */
7       ρ ← update regularization parameter     /* Section 4.4 */
8       d ← find a new step                     /* Section 4.4 */
9     end
10  end
11  x ← x + d
12  update Lagrange multiplier estimates        /* Section 4.5 */
13 end

Lines 2 and 3 handle the possibility of encountering an infeasible RLP subproblem, as discussed in Section 4.2. Lines 6 through 9 are repeated until an acceptable step, the definition of which is given in Section 4.3, is generated. Within this loop, we update the regularization parameter. A discussion of how this is done is presented in Section 4.4. Finally, line 12 updates estimates of the optimal Lagrange multipliers of NLP, which is discussed in Section 4.5. (While SRLP does not explicitly require these estimates, they are used in practice in order to determine whether optimality conditions have been satisfied, and as part of our globalization technique.)

Notice that SRLP does not take an EQP step, and thus does not fully qualify as an SL-QP method. Convergence guarantees of current SL-QP algorithms, however, rely solely on the LP step, with the EQP step added only to increase the convergence rate [3, 5]. As such, SRLP provides a suitable means of determining the effectiveness of using regularization versus a trust region to control step size in an SL-QP method. A practical implementation, though, will likely include an EQP step, and the Lagrange multiplier estimates discussed in Section 4.5 will be important for computing the Hessian of the Lagrangian in this case.

4.2 Infeasible subproblems

The RLP subproblem encountered on a given iteration may be infeasible if the linearized constraints (those in Equation 3.1) are degenerate (see Section 3.3). SRLP generates a feasibility restoration step, which solves FR (3.6), when this occurs. There are many solutions to FR. We take the view that feasibility restoration should interfere with regular iterations as little as possible, and so choose the solution which is as close to our current iterate, in a 2-norm sense, as possible. Our feasibility restoration steps solve

\[
\text{MINFR}(x) \qquad \min_d \; \|d\|_2 \quad \text{subject to} \quad c(x + d) = 0, \quad x + d \ge 0. \tag{4.2}
\]

If MINFR(x) is infeasible, SRLP declares NLP itself to be infeasible, and iterations are terminated. Otherwise, its solution d is taken as a step in lieu of an RLP-generated step.
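As a sketch only, MINFR can be posed to a general constrained solver; fmincon is used here for brevity, whereas our implementation solves MINFR with SNOPT (Section 5.3.6):

```matlab
% Sketch of MINFR(x) (4.2): the 2-norm-nearest feasible point.
obj     = @(d) d'*d;                     % minimizing ||d||_2 squared is equivalent
nonlcon = @(d) deal([], c(x + d));       % c(x + d) = 0
d = fmincon(obj, zeros(size(x)), [], [], [], [], -x, [], nonlcon);
```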
4.3 Step acceptance

Successively accepting any step derived from the solution to RLP does not guarantee convergence to a first-order KKT point of NLP from an initial point arbitrarily far away. Instead, we must employ globalization techniques. This requirement is common for any algorithm in which local models of a problem are used as subproblems and, as such, a large body of work has been published analysing various strategies. See Section 3.2 for a review of some relevant techniques. Our globalization strategy will define what an acceptable step is for our algorithm. When a step is deemed acceptable, the SRLP algorithm takes it; otherwise either the RLP subproblem is modified or the step is modified until it is acceptable.

SRLP utilizes the ℓ1 exact merit function (φ1, see Equation 3.4) to enforce globalization. While this technique has the potential downside of requiring a good choice of penalty parameter, we have found the method we describe below to work very well in practice.

4.3.1 Penalty parameter choice

Our primary concern in choosing ν^k at each iteration is that φ1 be exact. Thus, following Equation 3.5, we impose the condition

\[
\nu^k \ge \nu_a := \|y^k\|_\infty,
\]

where y^k are the Lagrange multiplier estimates at the given iteration (see Section 4.5). Secondly, we require that ν^k be chosen so that a solution to RLP is at least a descent direction for the merit function φ1(x^k, ν^k). Steps d^k generated by SRLP are guaranteed to satisfy J(x^k) d^k = -c(x^k) (see Equation 4.1), and the directional derivative of φ1 in such a direction is given by

\[
D(\phi_1, d^k) = \nabla f(x^k)^T d^k - \nu^k \|c(x^k)\|_1
\]

[13, Lemma 18.2]. Our requirement that D(φ1, d^k) < 0 then gives us a second condition for the penalty parameter:

\[
\nu^k > \nu_b := \frac{\nabla f(x^k)^T d^k}{\|c(x^k)\|_1} \quad \text{for} \quad c(x^k) \ne 0.
\]

Notice that if x^k is feasible (i.e., \(\|c(x^k)\|_1 = 0\)), then descent on the merit function is not determined by ν^k. In this case, however, a step d which solves RLP is always a descent direction for the objective unless we are optimal, and so descent on the merit function will be guaranteed for any value of ν^k. Our final requirement amalgamates these two conditions:

\[
\nu^k \ge \nu_{\min} := \begin{cases} \max(\nu_a, \nu_b) & \text{if } \|c(x^k)\|_1 \ne 0; \\ \nu_a & \text{otherwise.} \end{cases} \tag{4.3}
\]

In practice, constants ε > 0 and c ≥ 1 are chosen so that

\[
\nu^k = c\,\nu_{\min} + \varepsilon \tag{4.4}
\]

is sufficiently large. (The specific choice of c and ε may have a large practical impact on an algorithm, and adjustment may be required to find appropriate values for a given implementation.)

4.3.2 An acceptable step

The use of an exact merit function defines a single goal for each iteration of SRLP, namely reduction of the merit function. It is well known, however, that simply achieving reduction in the merit function at each iteration may, instead of achieving convergence to a first-order KKT point of NLP, reach a non-optimal accumulation point. Instead we must require that each iteration gives at least a sufficient amount of reduction in the merit function. We define this to mean that the step achieves at least a fraction of the descent predicted by a linear model of φ1, defined by

\[
\ell(d, x, \nu) := f(x) + g(x)^T d + \nu \|c(x) + J(x)d\|_1.
\]

We are now prepared to define an acceptable step for SRLP.

Definition 4.1. A step d generated by SRLP about a current iterate x is considered acceptable if it satisfies

\[
\phi_1(x, \nu) - \phi_1(x + d, \nu) \ge \beta \left( \phi_1(x, \nu) - \ell(d, x, \nu) \right),
\]

where β ∈ (0, 1) is the required fraction of descent, and ν is chosen by Equation 4.4. (A similar condition is used in [5] and in [3].)

In our algorithm we use the notion of step quality. A high-quality step will be accepted, whereas a low-quality one will not.

Definition 4.2. The quality, q, of a step d is the fraction of predicted linear descent in φ1 achieved by the step:

\[
q := \frac{\phi_1(x, \nu) - \phi_1(x + d, \nu)}{\phi_1(x, \nu) - \ell(d, x, \nu)}. \tag{4.5}
\]

With q given by Equation 4.5, and β our chosen required fraction of descent, our step acceptance criterion becomes q ≥ β.
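The penalty-parameter rule and the quality measure reduce to a handful of lines. In this sketch, f and c are function handles; g, J, and y are the gradient, Jacobian, and multiplier estimates at x; and cs ≥ 1 and eps_v > 0 stand for the constants of Equation 4.4:

```matlab
% Sketch: penalty parameter (4.3)-(4.4) and step quality (4.5).
phi1 = @(v, nu) f(v) + nu*norm(c(v), 1);                 % merit function (3.4)
ell  = @(d, nu) f(x) + g'*d + nu*norm(c(x) + J*d, 1);    % linear model of phi1
nu_min = norm(y, inf);                                   % exactness condition
if norm(c(x), 1) ~= 0
    nu_min = max(nu_min, (g'*d)/norm(c(x), 1));          % descent condition
end
nu = cs*nu_min + eps_v;                                  % Equation (4.4)
q  = (phi1(x, nu) - phi1(x + d, nu)) / (phi1(x, nu) - ell(d, nu));
acceptable = q >= beta;                                  % criterion of Section 4.3.2
```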
4.4 Regularization parameter updates

Consider a given iteration k of SRLP with an iterate x^k, and where the current value of the regularization parameter is ρ^k. This section discusses how SRLP chooses ρ^{k+1}, and concurrently generates an acceptable step. For convenience, we will herein use the notation d_ρ to signify the solution to the subproblem RLP(x, ρ).

The regularization term in RLP replaces the trust-region constraint, \(\|d\|_\infty \le \Delta\), in TrLP. Like the trust region, its role is to control the length of the step d taken by the subproblem at each iteration. The regularization parameter, ρ, defines the degree of penalization of the step length in a 2-norm sense: ρ = 0 gives no penalization, and a step which freely solves LP(x); ρ = ∞ gives the smallest possible step that satisfies the linearized constraints of NLP about x, i.e.,

\[
\text{RLP}(x, \infty) \qquad \min_d \; \|d\|_2 \quad \text{subject to} \quad c(x) + J(x)d = 0, \quad x + d \ge 0. \tag{4.6}
\]

There is an important distinction between a trust-region constraint and regularization, however: a trust region imposes a direct and strict limit on step length, where regularization only imposes a degree of aversion to large steps. Thus, using regularization, SRLP does not have full control over the step length, and this has important consequences for the convergence of the algorithm. In particular, notice that it is not true in general that \(\|d_\rho\|_2 \to 0\) as ρ → ∞ (consider any instance where c(x) ≠ 0; in that case d = 0 cannot satisfy the constraints in Equation 4.6).

The primary aim of controlling step length, whether by regularization or a trust region, is to ensure that an acceptable step is taken at each iteration. This is required by globalization techniques and thus for convergence of an algorithm. SRLP's primary means of step-length control is through updates to the regularization parameter. When an unaccepted step d_{ρa} is generated, we predict that, in the large majority of cases, we can find some ρb > ρa so that d_{ρb} is acceptable. However, we cannot guarantee that such a ρb exists, and, if it does not, a secondary means of generating an acceptable step is required.

One possible choice for this secondary step is a linesearch on some d_ρ. Our choice of penalty parameter (Section 4.3) ensures that d_ρ (for any value of ρ) is a direction of descent for the merit function φ1. By the twice-continuous differentiability of f and c, then, we can see that some α > 0 exists such that d := α d_ρ satisfies our acceptance requirement (Definition 4.1); we can find an acceptable step by performing a linesearch on any step d_ρ. While any d_ρ will suffice, d_∞ (a solution to (4.6)) is in some sense the best choice: it defines the direction in which the linearized constraints are nearest, in a 2-norm sense, to being satisfied, and is thus likely to give good progress towards feasibility.

To maximize the efficiency of SRLP, we wish to minimize the number of RLP subproblems solved at each iteration. We expect that, on most iterations, the initial step d_{ρa} will be accepted. When it is not, we further expect that a modest increase in the regularization parameter, ρb = γρa (with γ > 1), will, in the majority of cases, generate an acceptable step d_{ρb}. Thus, SRLP attempts each of these steps in succession before any more expensive means of analyzing the subproblem are performed. If the second attempted step is still not acceptable (we expect this to be a rare occurrence in practice), we immediately seek to determine whether there exists any ρc > ρb for which d_{ρc} is acceptable. This test is performed by solving RLP a third time to obtain d_∞. If d_∞ is unacceptable, we know immediately that increasing ρ further is futile, and instead perform a linesearch on this direction. Otherwise, we know that there is some ρc large enough to generate an acceptable step, and we continue to search for it.
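The secondary linesearch is a plain backtracking loop; the halving factor below is an assumed choice for illustration, and stepQuality evaluates q from Equation 4.5 (this helper is formalized in Section 4.6):

```matlab
% Sketch: backtracking linesearch on the minimum-norm direction d_inf.
alpha = 1;
while stepQuality(alpha*d_inf, nu, x) < beta
    alpha = alpha/2;                 % shrink until Definition 4.1 holds
end
d = alpha*d_inf;
```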
Algorithm 4.2: Regularization parameter update and step selection
input: d_ρ, x, ρ > 0, γ > 1
1 if d_ρ is not acceptable then
2   ρ ← γρ                           /* try increasing ρ */
3   d_ρ ← solve RLP(x, ρ)
4   if d_ρ is not acceptable then
5     d_∞ ← solve RLP(x, ∞)          /* test ρ = ∞ step */
6     if d_∞ is not acceptable then
7       choose α ∈ (0, 1) such that α d_∞ is acceptable
8       d ← α d_∞
9     else
10      choose ρc > ρ such that d_{ρc} is acceptable
11      d ← solve RLP(x, ρc)
12      ρ ← ρc
13    end
14  end
15 else
16   possibly decrease ρ
17 end
18 x ← x + d

Algorithm 4.2 lists our regularization parameter update scheme.

4.5 Lagrange multiplier estimates

Estimates of the optimal Lagrange multipliers for NLP are used in order to update the penalty parameter ν (Section 4.3), to check whether optimality conditions are satisfied (Section 5.3.7), and, when an EQP step is added, to compute an estimate of the Hessian of the Lagrangian. We have three main criteria when choosing a source for these estimates. First, they should be inexpensive to compute. Second, they should provide a good estimate of the optimal Lagrange multipliers, based on the limited local information we have of the problem. Third, and most important, they should converge to the optimal Lagrange multipliers as we approach a solution to NLP.

SRLP uses the optimal Lagrange multipliers for the RLP subproblems (the RLP estimates; see Definition 4.3) as an estimate of the optimal Lagrange multipliers of NLP (the optimal multipliers) at each iteration. These have the immediate advantage that almost any solver of RLP will provide them at no extra cost over solving the subproblem.

Definition 4.3. The RLP Lagrange multiplier estimates (or RLP estimates) are the pair of vectors \(\hat{y} \in \mathbb{R}^m\) and \(\hat{z} \in \mathbb{R}^n\) which, together with the solution d, satisfy the first-order KKT conditions of the subproblem solved at a given iteration, RLP(x, ρ) (see Table 4.1).

The first-order KKT conditions for RLP and NLP give a system of equations that the RLP multipliers and the optimal multipliers must satisfy. Table 4.1 lists the relevant conditions for convenience.

Table 4.1: First-order KKT conditions satisfied by the optimal Lagrange multipliers for NLP at an optimal point x*, and by the RLP estimates at a given iterate x.

NLP:  g(x*) - J(x*)^T y* = z*;   z* ≥ 0;   z* ∘ x* = 0;   c(x*) = 0
RLP:  g(x) + ρd - J(x)^T ŷ = ẑ;   ẑ ≥ 0;   ẑ ∘ (x + d) = 0;   c(x) + J(x)d = 0

Consider the conditions in Table 4.1. As x → x* and, consequently, d → 0, we can see by inspection that (y*, z*) satisfy the conditions for RLP estimates. Furthermore, the RLP estimates are unique (by the convexity of RLP). Continuous differentiability of f and c, then, gives us \((\hat{y}, \hat{z}) \to (y^*, z^*)\) as x → x*; the RLP Lagrange multiplier estimates approach the optimal Lagrange multipliers for NLP as we approach a solution.
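For instance, if quadprog were the RLP solver, the RLP estimates of Definition 4.3 would come back with the solution at no extra cost (the field names below are quadprog's; sign conventions may need adjusting to match our formulation):

```matlab
% Sketch: RLP multiplier estimates straight from the subproblem solver.
[d, ~, ~, ~, lambda] = quadprog(rho*speye(n), g, [], [], J, -c, -x, []);
y_hat = lambda.eqlin;      % estimate of the equality multipliers y
z_hat = lambda.lower;      % estimate of the bound multipliers z
```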
4.6 Algorithm statement

A full listing of SRLP is given in Algorithm 4.3. This pseudo-code describes a process which iterates until an estimate x that is optimal for NLP is found. This stopping condition is left open to the implementation of the algorithm; we discuss the condition used in this thesis in Section 5.3.7. Throughout the rest of this thesis, we will refer to major iterations, which we define to be a full pass through the loop from lines 1 to 28, and to minor iterations, which we define to be the iterations taken by the RLP subproblem solver. The cost of these minor iterations will play a very important role in the effectiveness of a practical implementation of SRLP, and will be highly dependent on the choice of a solver for the RLP subproblems. We discuss this issue in the context of our implementation of SRLP in Section 5.3.4.

Note that the regularization parameter update scheme presented in Section 4.4 is directly incorporated into this flow. Algorithm 4.3 also utilizes a number of subroutine calls, which we now define.

d, y, A ← solve RLP(x, ρ): Solve the subproblem RLP(x, ρ) (Equation 4.1) to obtain a step, d, and Lagrange multiplier estimates, y (see Section 4.5). Compute the active set, A(x + d).

d ← solve MINFR(x): Solve the subproblem MINFR(x) for a step d.

ν ← penaltyParameter(d, y, x): Compute ν from Equation 4.4.

q ← stepQuality(d, ν, x): Compute q from Equation 4.5.

Algorithm 4.3: Sequential regularized linear programming
input: x, y, A, ρ > 0, ε > 0, q_high ≥ q_min
1 while x is not optimal for NLP do
2   if RLP(x, ρ) is feasible then                 /* RLP step */
3     d, y, A ← solve RLP(x, ρ)[A]
4     ν ← penaltyParameter(d, y, x)
5   else                                          /* feasibility restoration step */
6     d ← solve MINFR(x)
7   end
8   if stepQuality(d, ν, x) < q_min then          /* try increasing ρ */
9     ρ ← 2ρ
10    d, y ← solve RLP(x, ρ)[A]
11    if stepQuality(d, ν, x) < q_min then        /* test ρ = ∞ step */
12      d, y ← solve RLP(x, ∞)[A]
13      if stepQuality(d, ν, x) < q_min then      /* linesearch on ρ = ∞ step */
14        repeat
15          d ← d/2
16        until stepQuality(d, ν, x) ≥ q_min
17      else                                      /* increase ρ until acceptable step found */
18        repeat
19          ρ ← 2ρ
20          d, y ← solve RLP(x, ρ)[A]
21        until stepQuality(d, ν, x) ≥ q_min
22      end
23    end
24  else if stepQuality(d, ν, x) ≥ q_high then
25    ρ ← ρ/2
26  end
27  x ← x + d
28 end
Chapter 5  Experiments

This chapter discusses the details of the experiments performed for this thesis. The implementation of SRLP (Algorithm 4.3), as well as the testing environment used, is described. Determining the effectiveness of SRLP requires a benchmark, and we begin by presenting the benchmarking algorithm used in this thesis: a sequential trust-region linear programming method (STrLP).

5.1 A benchmark trust-region algorithm

Like SRLP, STrLP is an SL-QP algorithm without an EQP step. However, it generates steps at each iteration by solving a TrLP, rather than an RLP, subproblem.

Algorithm 5.1 lists STrLP. It is by no means novel; in fact, it closely resembles the SL-QP implementation by Byrd et al. [3]. The details of this algorithm are meant to mimic those of SRLP as closely as possible, so that the two may be fairly compared: globalization is similarly enforced by an ℓ1 merit function; the penalty parameter update method and step acceptance criterion are identical to those of SRLP; and it uses the same feasibility restoration routine as SRLP. The significant difference between STrLP and SRLP is the means with which they control the length of the step generated by subproblems. For this, STrLP uses a trust region. When the quality of a step (Equation 4.5) is deemed too low, the trust-region radius, Δ, is reduced; if it is very high, Δ may be increased.

Algorithm 5.1: Sequential trust-region linear programming
input: x, y, Δ > 0, q_high ≥ q_min
1 while x is not optimal for NLP do
2   if TrLP(x, Δ) is feasible then        /* TrLP step */
3     d, y, A ← solve TrLP(x, Δ)[A]
4   else                                   /* feasibility restoration step */
5     d ← solve MINFR(x)
6   end
7   ν ← penaltyParameter(d, y, x)
8   if stepQuality(d, ν, x) < q_min then   /* don't accept step */
9     Δ ← Δ/2
10  else                                   /* accept step */
11    if stepQuality(d, ν, x) ≥ q_high then
12      Δ ← 2Δ
13    end
14    x ← x + d
15  end
16 end

5.2 Testing environment

Determining the success of an optimization algorithm is quite difficult to do objectively. Undoubtedly, an algorithm will fare better on certain types of problems and worse on others. The Constrained and Unconstrained Testing Environment, revisited (CUTEr) [12] is a set of test problems which is commonly used as a benchmarking environment for optimization algorithms. We have selected a subset of the CUTEr problems as our test set for SRLP and STrLP. Only generally-constrained problems (nonlinear programs excluding linear and quadratic programs) for which n + m < 1000 were chosen. From the 225 matching problems, we removed a number which were found to be ill suited to our analysis:

- 1 unconstrained problem (this problem was labeled generally constrained, but we found it to contain no constraints or variable bounds),
- 3 problems which would not compile on our system,
- 5 problems we found to be poorly defined in regions visited by our algorithm (either the objective or constraints did not evaluate to a number),
- 11 problems which SNOPT declared infeasible, and
- 38 problems which required more than 3000 iterations to solve.

We were left with a set of 167 problems which we refer to as our test set. The list of CUTEr problems included in this test set is given in Appendix B.

Why use such small-scale problems for our analysis, when our goal is efficiency on very large-scale problems? First, we are not directly concerned with the efficiency of SRLP, but rather with the use of RLP subproblems in an SL-QP method. Small-scale test problems allow us to explore a broad range of problems, which helps us to properly analyze certain features of RLP subproblems in Chapter 6. Success on these features can then be extrapolated to the efficiency of an SL-QP algorithm on large-scale problems. Second, developing a solver which robustly and efficiently handles large-scale problems in practice is a very arduous task. Many issues, such as effective use of memory and optimized numerical routines, which can be safely forgotten on small-scale problems, require very careful attention as problems become large. At some point scaling an algorithm becomes an art rather than a science.

5.3 Implementation

This section describes the details of our implementation of the SRLP and STrLP algorithms. The main logic of these algorithms was implemented in the MATLAB programming language. The bulk of the computations, however, are left to highly optimized third-party Fortran routines.

5.3.1 Parameters

SRLP calls for three initial parameters; we have used ρ0 = 1, q_min = 0.2, and q_high = 0.8. For STrLP we used Δ0 = 20, q_min = 0.1, and q_high = 0.7. For both algorithms, x0 and y0 were provided by the CUTEr problem. The initial active-set prediction was calculated as A0 = A(x0).

5.3.2 Problem formulation

Instead of reformulating problems to the somewhat restrictive form of NLP (1.1), SRLP accepts problems with more generic bound constraints:

\[
\min_x \; f(x) \quad \text{subject to} \quad c(x) = 0, \quad b_l \le x \le b_u, \tag{5.1}
\]

where \(b_l \in \mathbb{R}^n\) and \(b_u \in \mathbb{R}^n\) are the lower and upper bounds, respectively. This generalization has little effect on the details of the algorithms, and allows us to avoid splitting problem variables. Moreover, our subproblem solvers can handle these generic bound constraints directly.

5.3.3 CUTEr problems

The problems in our test set were compiled using the SifDec 2 and CUTEr 2 packages [12]. In the compilation settings for SifDec, we used the g95 Fortran compiler and the "large scale problems" option. Compiled problems were accessed from MATLAB using the MEX interface provided with CUTEr.
CUTEr presents problems in the generic format

\[
\min_x \; f(x) \quad \text{subject to} \quad c_l \le c(x) \le c_u, \quad b_l \le x \le b_u.
\]

We reformulate these CUTEr problems into the form accepted by SRLP (5.1) by subtracting slack variables from the constraints:

\[
\min_{x,s} \; f(x) \quad \text{subject to} \quad c(x) - s = 0, \quad b_l \le x \le b_u, \quad c_l \le s \le c_u.
\]

5.3.4 Solving RLP subproblems

RLP subproblems (4.1) are solved using SQOPT version 7, a large-scale linear or quadratic programming algorithm [16]. SQOPT is not optimized specifically for handling problems of the form of RLP, but instead solves more general convex quadratic programs. However, it has been shown in practice to be a very robust and efficient solver, and is able to take advantage of the diagonal Hessian of RLP. In our implementation, the RLP subproblems solved at each iteration have generic bounds, and can be written as

\[
\min_d \; g(x)^T d + \tfrac{\rho}{2} d^T d \quad \text{subject to} \quad J(x)d = -c(x), \quad b_l \le x + d \le b_u.
\]

SQOPT accepts problems of the form

\[
\min_x \; g^T x + \tfrac12 x^T H x \quad \text{subject to} \quad c_l \le Ax \le c_u, \quad b_l \le x \le b_u. \tag{5.2}
\]

Reformulation from one form to the other is straightforward. Importantly, though, SQOPT does not accept the Hessian of our objective as a parameter. Instead, we provide a call-back function which evaluates products with this Hessian. Our call-back function for RLP evaluates Hx, where H := ρI.

SQOPT requires a number of parameters. Those relevant to our implementation are listed in Table 5.1.

Table 5.1: Relevant SQOPT parameters used for solving RLP.
Feasibility      1e-8
Optimality       1e-8
Max iterations   5000
Scale option     0

At the time of this writing, no MATLAB interface to SQOPT 7 was officially supported, and we developed a custom C MEX interface to the Fortran routines.

The exact process through which SQOPT solves a quadratic program is beyond the scope of this thesis (see instead [10, Section 4] for a detailed treatment). However, as the cost of an SRLP iteration is largely dominated by solving the RLP subproblem, it is important that we consider the source of this expense. In Section 4.6 we referred to minor iterations as the iterations taken by our subproblem solver. SQOPT obtains a search direction Δx by solving the system

\[
\begin{pmatrix} H & W^T \\ W & 0 \end{pmatrix} \begin{pmatrix} \Delta x \\ -y \end{pmatrix} = - \begin{pmatrix} g_q \\ 0 \end{pmatrix}, \quad \text{where} \quad g_q := g + H x^k, \tag{5.3}
\]

and the rows of W, the current working-set matrix, are a linearly independent subset of the constraint gradients of (5.2).

The special form of H in our RLP subproblem (i.e., H = ρI) would in principle allow us to obtain Δx via the least-squares problem

\[
\min_y \; \|W^T y - g_q\|_2, \qquad \text{and then} \qquad \Delta x = -\tfrac{1}{\rho}\left(g_q - W^T y^*\right).
\]

An RLP algorithm could solve this least-squares problem efficiently by maintaining a factorization of W. To compute a search direction for RLP, the algorithm would update this factorization. A similar factorization update is performed at each iteration by the simplex method for linear programming. The fundamental cost of solving RLP is, thus, comparable to that of solving TrLP.
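A dense sketch of this reduction follows (a practical solver would update a factorization of W rather than call backslash):

```matlab
% Sketch: with H = rho*I, the KKT system (5.3) collapses to least squares.
y  = (W') \ gq;                 % y = argmin || W'*y - gq ||_2
dx = -(gq - W'*y) / rho;        % then W*dx = 0 by the normal equations
```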
5.3.5 Solving TrLP subproblems

TrLP subproblems are handled almost identically to RLPs (see Section 5.3.4), with SQOPT used as the solver. A TrLP with generic bounds can be written as

    minimize   g(x)ᵀd
    subject to J(x)d = −c(x)
               b_l ≤ x + d ≤ b_u
               ‖d‖∞ ≤ Δ,

which we can rewrite to match the form accepted by SQOPT more closely:

    minimize   g(x)ᵀd
    subject to J(x)d = −c(x)
               b̄_l ≤ d ≤ b̄_u,

with b̄_l := max(b_l − x, −Δ) and b̄_u := min(b_u − x, Δ), where the max and min are taken componentwise. Implementation details are identical to those described in Section 5.3.4.

5.3.6 Solving MINFR

Feasibility restoration steps in SRLP and STrLP solve MINFR (4.2). With generic bound constraints, this problem can be written

    minimize   ‖d‖₂
    subject to c(x + d) = 0                                          (5.4)
               b_l ≤ x + d ≤ b_u.

We use SNOPT version 7 [15, 10], a general-purpose algorithm for constrained optimization, to solve these problems. SNOPT accepts problems of the generic form

    minimize   f(x)
    subject to c_l ≤ c(x) ≤ c_u
               b_l ≤ x ≤ b_u,

which easily accommodates problems in the form of Equation 5.4.

Relevant SNOPT parameters used in our implementation are listed in Table 5.2.

    Major feasibility   1e−8
    Optimality          1e−8
    Major iterations    8000
    Minor iterations    500
    Scale option        0

    Table 5.2: Relevant SNOPT parameters used for solving MINFR.

The return status from SNOPT is important to our algorithms. A status of 1 or 2 indicates a successful feasibility restoration step; 11, 12, or 13 indicates that NLP may be infeasible; any other status is considered an error. SNOPT Fortran routines were called from within the MATLAB environment using the MEX interface provided with the package.

5.3.7 Optimality tests

SRLP and STrLP are terminated at iteration k, and the primal-dual pair (x^k, y^k) is deemed optimal, if the following conditions hold:

    max( ‖c(x^k)‖∞ , ‖[x^k]⁻‖∞ )      ≤  ε (1 + ‖x^k‖₂),
    max( ‖[z^k]⁻‖∞ , ‖z^k ∘ x^k‖∞ )   ≤  ε (1 + ‖y^k‖₂),

where z^k := ∇f(x^k) − J(x^k)ᵀy^k, [a]⁻ := max(0, −a) is taken componentwise, and ∘ denotes the componentwise product. These are the optimality tests proposed by Byrd et al. [3]—rearranged to accommodate our formulation of a nonlinear program (which is different from that used by Byrd et al.). We have used the value ε = 10⁻⁶ in all our experiments.
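A MATLAB sketch of this test, for the x ≥ 0 formulation of NLP (1.1), is given below. The function name and argument list are our own; a generic-bound variant would replace the term ‖[x^k]⁻‖∞ with the corresponding bound violations.

    % A sketch of the termination test of Section 5.3.7. Inputs: iterate x,
    % multipliers y, constraint values c, objective gradient g, Jacobian J.
    function stop = isOptimal(x, y, c, g, J, epsilon)
      z    = g - J'*y;                 % reduced gradient z^k
      neg  = @(a) max(0, -a);          % [a]^- componentwise
      prim = max(norm(c, inf), norm(neg(x), inf));    % feasibility
      dual = max(norm(neg(z), inf), norm(z .* x, inf)); % dual feasibility and complementarity
      stop = prim <= epsilon*(1 + norm(x)) && ...
             dual <= epsilon*(1 + norm(y));
    end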
5.4 Results

Appendices B and C list the results for SRLP and STrLP, respectively, on our test set problems. For each problem, we list the number of major iterations required to solve the problem, the total number of minor iterations (SQOPT iterations) required, the number of feasibility restoration steps which had to be taken (equivalently, the number of infeasible subproblems which were encountered for this problem), and the total time taken to solve this problem, among other statistics described in the appendices.

Chapter 6

Discussion

The SL–QP method (Chapter 3) was developed in order to alleviate some of the potential inefficiencies experienced by the widely successful SQP method (Section 2.3) when problem dimensions become very large. Careful analysis of current SL–QP algorithms, however, reveals that they too contain many potential inefficiencies that may prove to be performance bottlenecks for very large-scale problems. This chapter discusses three such inefficiencies, and demonstrates how using an RLP subproblem (4.1) in place of a TrLP subproblem (3.2) could remove them.

6.1 Expensive infeasible subproblems

When an SL–QP algorithm encounters an infeasible subproblem, it must go to extra lengths to determine a new iterate. Section 3.3 discussed the two most common solutions to this problem: feasibility restoration and penalty reformulation. While there are tradeoffs between these two methods, it can be generally agreed that both incur significant expense compared with the cost of a standard iteration, especially for very large-scale problems. Clearly, an SL–QP algorithm should avoid infeasible subproblems whenever possible.

RLP and TrLP subproblems share the same linearized constraints (those which also appear in Equation 3.1), and, if the system these define is degenerate, both are infeasible. This is the only way an RLP subproblem can be infeasible. Section 3.3 explains, however, that the trust-region constraint could induce an otherwise feasible TrLP to be infeasible. An SL–QP algorithm which uses RLP subproblems will, therefore, encounter fewer infeasible subproblems than one which uses TrLP.

The benefit gained by avoiding these additional infeasible subproblems is, of course, completely dependent on how often they are encountered in practice. Appendices B and C list the number of infeasible subproblems handled for each problem in our test set (or, equivalently, the number of feasibility restoration steps taken) by SRLP and STrLP, respectively. These data are summarized in Figure 6.1, which shows that SRLP handled vastly fewer infeasible subproblems than STrLP—9 compared with 74. Given the high cost of feasibility restoration and penalty reformulation steps as problems become very large, using an RLP subproblem instead of a TrLP subproblem could greatly increase the efficiency of an SL–QP method due to the reduced number of infeasible subproblems it encounters.

Figure 6.1: The number of problems (out of 167) in our test set (see Section 5.2) on which the SRLP and STrLP algorithms encountered a given number of infeasible subproblems. SRLP, which solves RLP subproblems, only needed to handle 9 infeasible subproblems over this entire set; compare this with its trust-region-based counterpart, which handled 74.

6.2 Slow active-set updates

We have seen that SQP requires a (potentially very expensive) quadratic programming iteration for every variable added to, or removed from, the active set. SL–QP alleviates this cost by replacing these iterations with cheaper, linear programming ones. However, the very nature of using an active-set method to solve these subproblems (whether IQP or TrLP) requires that elements in the active set are updated one-by-one. Such slow updates could prove to be a bottleneck for current SL–QP methods as problem dimensions (and, thus, the combinatorics of the active set) grow. In this section we explore the possibility of solving SL–QP subproblems with a gradient-projection algorithm, which is able to activate many variables during each iteration.

Definition 6.1. A gradient-projection algorithm generates a new iterate x^{k+1} from its current iterate using

    x^{k+1} = [x^k − β^k ∇f(x^k)]_Ω,

where [·]_Ω denotes projection onto the feasible set Ω, and β^k ∈ R₊.

The cost of each gradient-projection step is often dominated by the projection onto the feasible set and, as such, these methods are known to be much more effective when Ω has a form which is particularly cheap to project onto.
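For a box feasible set, for instance, the projection is a componentwise clip, and one pass of the iteration in Definition 6.1 can be sketched in a few lines of MATLAB. The fixed step length beta and the iteration limit below are illustrative choices only; a practical method would select β^k by a projected line search.

    % A sketch of gradient projection on a box {x : bl <= x <= bu}.
    % gradf is a handle returning the objective gradient at x.
    function x = gradProj(gradf, x, bl, bu, beta, maxit)
      proj = @(v) min(max(v, bl), bu);   % projection onto the box, O(n)
      for k = 1:maxit
        x = proj(x - beta*gradf(x));     % the iteration of Definition 6.1
      end
    end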
Neither TrLP nor RLP is particularly well suited to gradient projection: projection onto their linear equality constraints is a quadratic program unto itself. Friedlander et al. [9], however, suggest that the dual problem to RLP has a form which is very well suited to gradient projection, as we now explore.

A satisfactory treatment of duality theory is outside the scope of this thesis, and is not required for this discussion. (The interested reader is directed towards [13, Section 16.8].) Instead, it is sufficient to understand that duality theory for convex quadratic programming tells us that there exists a problem related to RLP (called the dual of RLP) which has identical KKT conditions to it, and that it is possible to obtain a solution for RLP (the primal problem) by finding a solution to this related problem.

Theorem 6.2. The dual problem of RLP is given by

    NNQP(x, ρ)    minimize_{y,z}  (1/2ρ) ‖J(x)ᵀy + z − g(x)‖₂² + yᵀc(x) − zᵀx
                  subject to      z ≥ 0.

The dual of RLP, NNQP, is a convex quadratic program with non-negativity constraints. Importantly, its feasible set (Ω = {(y, z) : z ≥ 0}) allows for very cheap projections—simple truncations of z, with linear cost. Gradient projection is very effective on such problems. We can, thus, solve RLP subproblems (albeit indirectly) very efficiently using a gradient-projection algorithm. This will allow for very fast active-set updates, and could prove to be a significant advantage of using RLP over TrLP subproblems in an SL–QP algorithm.
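A single gradient-projection step on NNQP can be sketched as follows. The sketch assumes gy and gz hold the partial gradients of the NNQP objective at the current pair (y, z), and beta is an illustrative step length; only the z-block is constrained, so the projection reduces to a componentwise truncation.

    % One gradient-projection step on the dual pair (y, z) of NNQP.
    % The projection onto {z >= 0} is a truncation with linear cost,
    % which is what makes gradient projection attractive here.
    function [y, z] = nnqpStep(y, z, gy, gz, beta)
      y = y - beta*gy;            % multiplier block is unconstrained
      z = max(z - beta*gz, 0);    % truncate: projection onto {z >= 0}
    end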
6.3 Inefficient warm-starts

Warm-start efficiency (see Section 2.2.1) is crucially important for the effectiveness of an SL–QP algorithm. This efficiency is dependent on how good a prediction the active set at one iteration, A^{k−1}, is of that at a subsequent iteration, A^k. We measure this as warm-start quality, which we define in terms of the difference between these two sets.

Definition 6.3. The warm-start quality, w^k, of iteration k for an active-set method is given by

    w^k = ( |A^{k−1} ∩ A^k| / |A^k| ) × 100%,                        (6.1)

where A^{k−1} = A(x^{k−1}) and A^k = A(x^k) are the corresponding active-set predictions, and |·| denotes the cardinality of a set.

An SL–QP algorithm should seek to maximize w^k at each iteration. We suspect, however, that the trust-region constraints in the TrLP subproblems may significantly reduce it, thus impeding the efficiency of the algorithm.

Why would the trust-region adversely affect warm-start quality? Let us consider the constraints in TrLP in two subsets: the natural constraints (c(x) + J(x)d = 0, x + d ≥ 0), and the trust-region constraints (‖d‖∞ ≤ Δ). The natural constraints are a linearized model of those in NLP, and thus are related on subsequent steps. Consequently, we expect some coherence in the active set for these constraints from iteration to iteration, translating into a higher warm-start quality. The trust-region constraints, in contrast, are wholly artificial; they do not model any part of NLP. Instead, they hover around our iterate, x, and grow or shrink suddenly whenever Δ is updated. We expect much less coherence from these trust-region constraints, which would translate into a lower warm-start quality.

In order to quantify the effect of trust-region constraints on warm-start quality, we consider a subset of the active set of TrLP which includes only the natural constraints.

Definition 6.4. The natural active set of TrLP is the set of active natural constraints, and can be written

    A_N = { i : d_i = −x_i }.

We also consider the full active set.

Definition 6.5. The full active set of TrLP is the set of all active constraints in TrLP, including the natural and trust-region constraints. It can be written

    A_F = { i : d_i = max(−x_i, −Δ) } ∪ { −i : d_i = Δ }.

(Definitions 6.4 and 6.5 can be derived directly from Definition 1.3, which is shown in Appendix A.)

Let us first consider this effect on a sample problem from our CUTEr test set: HIMMELBI, which has 12 constraints and 100 bounded variables. We run the trust-region based solver, STrLP, on this problem, and measure the warm-start quality, w^k, at each iteration k using both A_N and A_F. Figure 6.2 shows the results. Notice how w^k for A_F never reaches 100%, as—even once we have reached the optimal active set for NLP—the trust-region constraints continue to change iteration by iteration. Warm-start quality for A_N, however, increases very quickly as the active set becomes optimal.

Figure 6.2: The warm-start quality, w^k, measured at each iteration for STrLP on the CUTEr problem HIMMELBI. The full active set (A_F) gives a much lower quality than the natural active set (A_N) because the trust-region constraints continue to vacillate as we approach an optimal point.

An RLP subproblem includes only the natural constraints (we may use A_N as the active set for all its constraints), and so we expect the warm-start quality of the SRLP algorithm to be much higher than that of STrLP. We again look at HIMMELBI, and measure w^k for the full active set of SRLP (which is equivalent to A_N). Figure 6.3 compares the warm-start quality of the two algorithms, and we indeed see a significant advantage for SRLP. (Note, though, that these solvers also required a different number of major iterations to solve HIMMELBI.)

Figure 6.3: The warm-start quality at each iteration for both STrLP and SRLP algorithms on the CUTEr problem HIMMELBI. The full active set (A_F) is used in both cases; the absence of trust-region constraints on SRLP's RLP subproblems leads to a much higher warm-start quality.

This result is typical for our test set. The average warm-start quality over all iterations for each problem is given in Appendices B and C. These data are represented in Figure 6.4, which shows a cumulative distribution function of the number of problems with given warm-start qualities. This plot clearly shows that, on our test set, the warm-start quality was indeed significantly higher when using RLP subproblems instead of TrLP.

Figure 6.4: A cumulative distribution function of the average warm-start quality (over all iterations) for all problems in our test set. The results for both SRLP and STrLP are shown—we see a significantly higher fraction of problems with high average quality in SRLP due to the absence of trust-region constraints.

A higher warm-start quality should result in less work required to solve each subproblem. In our implementations, which use SQOPT as the subproblem solver, this translates into fewer minor iterations required per major iteration. Figure 6.5 shows the number of minor iterations required for each iteration—for both SRLP and STrLP—on HIMMELBI, and demonstrates a large reduction in subproblem cost for SRLP due to increased warm-start efficiency. Figure 6.6 shows a cumulative distribution function of the average number of minor iterations per major iteration for each problem in our test set. We see here that the major iterations for SRLP were indeed cheaper on average than those of STrLP. It seems, then, that a significant advantage can be gained due to an increased warm-start efficiency by replacing the TrLP subproblems with RLPs in an SL–QP algorithm.

Figure 6.5: The number of minor (SQOPT) iterations taken by both SRLP and STrLP on each iteration while solving HIMMELBI. Due to a much higher warm-start quality, SRLP takes advantage of significantly cheaper iterations than STrLP on this problem.

Figure 6.6: A cumulative distribution function of the average number of minor (SQOPT) iterations taken per major iteration for all problems in our test set. SRLP and STrLP results are shown. Major iterations for SRLP were cheaper over this test set, presumably due to an increased warm-start quality.
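For reference, Equation 6.1 amounts to a one-line computation once active sets are stored as signed index vectors in the convention of Definition 6.5 (negative indices marking constraints active at their trust-region bounds); the sketch below uses MATLAB's intersect for the overlap.

    % A sketch of the warm-start quality measure (6.1) of Definition 6.3.
    function w = warmStartQuality(Aprev, Acur)
      % Fraction of the current active set already predicted by the
      % previous one, as a percentage.
      w = numel(intersect(Aprev, Acur)) / numel(Acur) * 100;
    end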
Chapter 7

Conclusions

In this thesis, we explored the effectiveness of a proposed variant of current SL–QP methods at solving very large-scale nonlinear programs. Instead of solving a TrLP subproblem (Equation 3.2) at each iteration to update a prediction of the optimal active set, this new method solves an RLP subproblem (Equation 4.1). Two algorithms were presented to facilitate a comparison between these subproblem choices: SRLP—which successively takes steps generated by solving an RLP subproblem (Algorithm 4.3)—along with STrLP, a benchmarking algorithm which takes TrLP steps (Algorithm 5.1).

Experiments using a diverse set of problems selected from the CUTEr set (Section 5.2) demonstrated that the trust-region constraints in TrLP subproblems can lead to significant inefficiencies. These constraints often greatly reduced the effectiveness of warm-starting successive subproblem solves, which led to more expensive subproblem solves (Section 6.3). They were also shown to induce a significant number of infeasible subproblems, which led to very expensive iterations (Section 6.1).

In contrast, it was shown that using an RLP subproblem instead of a TrLP largely eliminates these inefficiencies. The RLP subproblem contains no artificial constraints and, as such, enjoys a much higher average warm-start efficiency on subsequent solves—and thus cheaper subproblem solves—than TrLP subproblems. Also, regularization cannot induce infeasibility, and thus SRLP encountered far fewer infeasible subproblems than STrLP (Section 6.1).

Additionally, we showed that an RLP subproblem has another potential advantage over a TrLP: its dual problem is a bound-constrained least-squares problem (Section 6.2). This form is very conducive to gradient-projection-type solvers, which could prove significantly more effective than active-set subproblem solvers for certain large-scale problems. An RLP-based algorithm could greatly benefit from the ability to choose a gradient-projection-type subproblem solver.

Based on these results, it seems likely that our proposed variant of SL–QP will prove effective on very large-scale nonlinear programs. It is our intention that the SRLP algorithm presented here will serve as a base for such an algorithm, with an EQP step added to incorporate knowledge of second-order information of NLP.
Some important considerations must be made when adding an EQP step, however: (i) a strategy for controlling the length of generated EQP steps, (ii) the selection of a non-degenerate subset of the active set to ensure that EQP is not inconsistent, and (iii) the choice of a trial step which combines the LP and EQP steps. These issues are discussed in detail by Byrd et al. [3] and Waltz [19].

Bibliography

[1] Paul T. Boggs and Jon W. Tolle. Sequential quadratic programming for large-scale nonlinear optimization. Journal of Computational and Applied Mathematics, 124(1-2):123–137, 2000.

[2] Brian Borchers and John E. Mitchell. An improved branch and bound algorithm for mixed integer nonlinear programs. Computers and Operations Research, 21(4):359–367, 1994.

[3] Richard H. Byrd, Nicholas I. M. Gould, Jorge Nocedal, and Richard A. Waltz. An algorithm for nonlinear optimization using linear programming and equality constrained subproblems. Math. Program., 100(1):27–48, 2004.

[4] Richard H. Byrd, Jorge Nocedal, and Richard A. Waltz. Steering exact penalty methods for nonlinear programming. Optimization Methods and Software, 23(2):197–213, 2008.

[5] Choong Ming Chin and Roger Fletcher. On the global convergence of an SLP-filter algorithm that takes EQP steps. Mathematical Programming, 96:161–177, 2003.

[6] Anthony V. Fiacco and Garth P. McCormick. The sequential unconstrained minimization technique for nonlinear programming, a primal-dual method. Management Science, 10(2):360–366, 1964.

[7] R. Fletcher and E. Sainz de la Maza. Nonlinear programming and nonsmooth optimization by successive linear programming. Math. Program., 43(3):235–256, 1989.

[8] Roger Fletcher and Sven Leyffer. Nonlinear programming without a penalty function. Mathematical Programming, 91:239–269, 2000.

[9] Michael P. Friedlander, Nicholas I. M. Gould, Sven Leyffer, and Todd S. Munson. A filter active-set trust-region method. Technical Report ANL/MCS-P1456-0907, Argonne National Laboratory, September 2007.

[10] P. E. Gill, W. Murray, and M. A. Saunders. SNOPT: An SQP algorithm for large-scale constrained optimization. SIAM Review, 47:99–131, January 2005.

[11] Jacek Gondzio and Andreas Grothey. A new unblocking technique to warm-start interior point methods based on sensitivity analysis. SIAM Journal on Optimization, 19(3):1184–1210, 2008.

[12] Nicholas I. M. Gould, Dominique Orban, and Philippe L. Toint. CUTEr and SifDec: A constrained and unconstrained testing environment, revisited. ACM Trans. Math. Softw., 29(4):373–394, 2003.

[13] Jorge Nocedal and Stephen J. Wright. Numerical Optimization. Springer, New York, NY, 1999.

[14] F. Palacios-Gomez, L. Lasdon, and M. Engquist. Nonlinear optimization by successive linear programming. Management Science, 28(10):1106–1120, 1982.

[15] Philip E. Gill, Walter Murray, and Michael A. Saunders. User's Guide for SNOPT Version 7: Software for Large-Scale Nonlinear Programming. http://www.stanford.edu/group/SOL/snopt.htm, February 2008.

[16] Philip E. Gill, Walter Murray, and Michael A. Saunders. User's Guide for SQOPT Version 7: Software for Large-Scale Linear and Quadratic Programming. http://www.stanford.edu/group/SOL/sqopt.htm, June 2008.

[17] Stephen M. Robinson. Perturbed Kuhn-Tucker points and rates of convergence for a class of nonlinear-programming algorithms. Mathematical Programming, 7:1–16, 1974.

[18] Andreas Wächter and Lorenz T. Biegler. On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Math. Program., 106(1):25–57, 2006.

[19] Richard A. Waltz.
Algorithms for large-scale nonlinear optimization. http://www.eecs.northwestern.edu/rwaltz/articles.html, June 2002.

Appendix A

Derivation of the active sets for TrLP subproblems

We define an active set for NLP in Definition 1.3. TrLP is a specific case of NLP, and so this definition still holds; however, it requires us to reformulate TrLP into the format given by Equation 1.1. Instead, we redefine the active set for TrLP. Section 6.3 defines two subsets of constraints, and accordingly two active sets. In this appendix we derive these definitions from Definition 1.3.

The natural bound constraints of TrLP, rewritten to match NLP, give d ≥ −x. We can thus simply write

    A_N := { i : d_i = −x_i }.

The trust-region constraints make the active set slightly more cumbersome. The bound constraints, together with these trust-region constraints for TrLP, can be written succinctly as

    max(−x_i, −Δ) ≤ d_i ≤ Δ.

Because a variable that is active at its lower bound should not be considered equivalent to one at its upper bound, we should begin by defining two separate active sets for TrLP:

    { i : d_i = max(−x_i, −Δ) }    and    { i : d_i = Δ }.

To simplify our notation, and to remain consistent with our earlier definition of an active set, however, we can combine these active sets by inserting variables at their upper bound with a negative index (in this way the definition of w^k in Equation 6.1 can be used). Our definition of the full active set for TrLP is, thus,

    A_F := { i : d_i = max(−x_i, −Δ) } ∪ ( −{ i : d_i = Δ } ),    where −{C} := { −a : a ∈ C }.
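In floating point, membership in these sets is decided to within a small tolerance. A MATLAB sketch of both active sets, using the signed-index convention above, might read as follows; the tolerance tau is an illustrative device, not part of the definitions.

    % A sketch computing the natural and full active sets of a TrLP step d.
    % Negative indices in AF mark variables active at the trust-region
    % upper bound Delta.
    function [AN, AF] = trlpActiveSets(d, x, Delta, tau)
      AN = find(abs(d + x) <= tau);                  % natural: d_i = -x_i
      lo = find(abs(d - max(-x, -Delta)) <= tau);    % active lower bounds
      up = find(abs(d - Delta) <= tau);              % active trust-region bounds
      AF = [lo; -up];                                % full active set, signed
    end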
Appendix B

SRLP Results

This appendix lists the results of the algorithm SRLP on each problem in the test set defined in Section 5.2.

    problem   name of CUTEr problem
    n         number of variables
    m         number of constraints
    maj       number of major iterations taken
    min       number of minor iterations taken
    fr        number of feasibility restoration steps taken
    time      time taken to solve in seconds
    w         measure of subproblem warm-start quality (Equation 6.1)
    f*        value of the objective at solution

    Table B.1: Definition of symbols used in Table B.2.

problem n m maj min fr time w f*
alsotame 2 1 7 7 0 0.257 95.238 8.21e-02
bt13 5 1 57 62 0 0.304 99.573 00
cantilvr 5 1 17 18 0 0.166 100.000 1.34e+00
cb2 3 3 11 13 0 0.199 96.296 1.95e+00
cb3 3 3 6 5 0 0.146 97.222 2.00e+00
chaconn1 3 3 12 10 0 0.150 98.611 1.95e+00
chaconn2 3 3 6 1 0 0.142 100.000 2.00e+00
concon 15 11 4 13 0 0.174 99.038 -6.23e+03
congigmz 3 5 65 79 0 0.290 98.566 2.80e+01
csfi1 5 4 48 59 0 0.241 98.551 -4.91e+01
csfi2 5 4 56 144 0 0.285 84.127 5.50e+01
dembo7 16 20 1464 1583 0 2.586 99.837 1.75e+02
demymalo 3 3 10 13 0 0.182 94.444 -3.00e+00
dnieper 61 24 9 269 0 0.211 80.196 1.87e+04
eqc 9 3 2 5 0 0.174 91.667 -8.30e+02
expfita 5 22 34 78 0 0.225 98.810 1.14e-03
gigomez1 3 3 9 11 0 0.181 94.444 -3.00e+00
gigomez2 3 3 14 16 0 0.155 97.436 1.95e+00
gigomez3 3 3 6 5 0 0.144 97.222 2.00e+00
haifas 13 9 35 54 0 0.223 97.947 -4.50e-01
haldmads 6 42 12 166 0 0.193 88.447 1.22e-04
himmelbk 24 14 50 121 0 0.250 97.530 5.18e-02
himmelp4 2 3 283 298 0 0.614 98.660 -5.90e+01
himmelp5 2 3 28 34 0 0.180 93.077 -5.90e+01
himmelp6 2 5 110 123 0 0.341 98.393 -5.90e+01
hong 4 1 26 61 0 0.209 94.667 2.26e+01
hs10 2 1 16 17 0 0.191 100.000 -1.00e+00
hs102 7 5 956 3571 0 3.060 95.359 9.12e+02
hs103 7 5 284 1106 0 0.835 94.674 5.44e+02
hs104 8 5 94 105 0 0.318 99.827 3.95e+00
hs106 8 6 21 40 0 0.209 97.768 9.53e+03
hs107 9 6 20 74 0 0.200 95.556 5.06e+03
hs108 9 13 25 60 0 0.210 97.727 -5.00e-01
hs11 2 1 17 18 0 0.200 100.000 -8.50e+00
hs111 10 3 20 29 0 0.200 100.000 -4.78e+01
hs112 10 3 663 794 0 1.236 99.439 -4.78e+01
hs114 10 11 707 747 0 1.263 99.852 -1.76e+03
hs116 13 14 46 86 0 0.250 98.589 1.85e+02
hs117 15 5 351 571 0 0.786 99.380 3.23e+01
hs119 16 8 11 56 0 0.202 86.806 2.45e+02
hs12 2 1 180 184 0 0.567 99.708 -3.00e+01
hs13 2 1 14 3 0 0.154 95.238 1.01e+00
hs14 2 2 5 2 0 0.177 100.000 1.39e+00
hs15 2 2 17 1 0 0.175 95.000 3.06e+02
hs16 2 2 4 0 0 0.140 93.750 2.31e+01
hs17 2 2 11 7 0 0.153 78.571 01
hs18 2 2 103 104 0 0.289 99.500 5.00e+00
hs19 2 2 5 3 0 0.140 85.000 -6.96e+03
hs20 2 3 3 4 0 0.176 86.667 4.02e+01
hs22 2 2 4 4 0 0.194 87.500 1.00e+00
hs23 2 5 20 22 0 0.199 99.248 2.00e+00
hs24 2 3 8 13 0 0.178 95.000 -1.00e+00
hs29 3 1 248 252 0 0.557 99.860 -2.26e+01
hs30 3 1 15 7 0 0.189 98.214 1.00e+00
hs31 3 1 19 21 0 0.161 96.667 6.00e+00
hs32 3 2 14 18 0 0.188 98.571 1.00e+00
hs33 3 2 9 7 0 0.149 97.778 -4.00e+00
hs34 3 2 13 15 0 0.186 93.846 -8.34e-01
hs36 3 1 2 4 0 0.170 62.500 -3300
hs37 3 2 9 13 0 0.182 90.000 -3.46e+03
hs41 4 1 6 8 0 0.178 90.000 1.93e+00
hs43 4 3 89 102 0 0.313 99.532 -4.40e+01
hs54 6 1 2 4 0 0.177 100.000 -7.22e-34
hs55 6 6 2 5 0 0.173 91.667 6.67e+00
problem n m maj min fr time w f*
hs57 2 1 8 10 0 0.187 100.000 3.06e-02
hs59 2 3 88 133 0 0.376 91.786 -7.80e+00
hs60 3 1 18 20 0 0.197 100.000 3.26e-02
hs62 3 1 163 181 0 0.389 100.000 -2.63e+04
hs64 3 1 44 71 0 0.207 98.333 6.30e+03
hs65 3 1 129 155 0 0.339 98.101 9.54e-01
hs66 3 2 15 19 0 0.189 96.000 5.18e-01
hs68 4 2 363 375 0 0.712 99.898 -9.20e-01
hs71 4 2 14 18 0 0.189 95.000 1.70e+01
hs72 4 2 8 27 0 0.145 100.000 5.47e+02
hs73 4 3 7 11 0 0.181 89.796 2.99e+01
hs74 4 5 20 28 0 0.200 98.889 5.13e+03
hs75 4 5 10 11 0 0.172 96.296 5.17e+03
hs80 5 3 8 12 0 0.180 100.000 5.39e-02
hs81 5 3 7 7 0 0.143 100.000 5.39e-02
hs83 5 3 3 5 0 0.182 83.333 -3.07e+04
hs84 5 3 69 116 0 0.284 97.750 -5.28e+06
hs85 5 21 364 397 0 0.758 99.947 -1.68e+00
hs86 5 10 36 81 0 0.230 96.410 -3.23e+01
hs88 2 1 192 195 0 0.794 100.000 1.36e+00
hs89 3 1 204 208 0 0.853 100.000 1.36e+00
hs90 4 1 448 453 0 1.994 100.000 1.36e+00
hs91 5 1 407 413 0 2.032 100.000 1.36e+00
hs92 6 1 44 51 0 0.366 100.000 1.29e+00
hs93 6 2 1056 1074 0 1.783 99.973 1.35e+02
hs95 6 4 8 20 0 0.181 91.250 1.56e-02
hs96 6 4 8 13 0 0.146 91.250 1.56e-02
hs97 6 4 10 14 0 0.151 92.000 4.07e+00
hs98 6 4 10 13 0 0.151 92.000 4.07e+00
hs99 7 2 123 247 0 0.359 100.000 -8.31e+08
hubfit 2 1 10 13 0 0.196 92.593 1.69e-02
kiwcresc 3 2 12 17 0 0.196 95.556 -4.89e-08
loadbal 31 31 37 80 0 0.233 99.478 4.53e-01
lsnnodoc 5 4 2 5 0 0.171 83.333 1.23e+02
madsen 3 6 22 36 0 0.201 96.732 6.16e-01
makela1 3 2 6 9 0 0.176 93.333 -1.41e+00
makela2 3 3 35 40 0 0.223 99.048 7.20e+00
matrix2 6 2 18 27 0 0.200 97.500 6.16e-07
mconcon 15 11 4 12 0 0.182 99.038 -6.23e+03
mifflin1 3 2 8 14 0 0.185 93.333 -1.00e+00
mifflin2 3 2 11 15 0 0.187 95.556 -1.00e+00
minmaxrb 3 4 27 30 0 0.210 98.901 4.51e-16
mistake 9 13 22 57 0 0.208 96.717 -1.00e+00
mribasis 36 55 107 239 0 0.377 99.486 1.82e+01
problem n m maj min fr time w f*
odfits 10 6 22 32 0 0.206 100.000 -2.38e+03
optcntrl 32 20 3 32 0 0.184 85.897 5.50e+02
pentagon 6 15 13 35 0 0.199 98.901 1.56e-04
polak1 3 2 15 17 0 0.192 100.000 2.72e+00
polak4 3 3 4 6 0 0.181 91.667 -2.66e-13
polak5 3 2 3 5 0 0.175 100.000 5.00e+01
prodpl0 60 29 11 184 0 0.199 85.815 5.88e+01
prodpl1 60 29 10 84 0 0.152 92.759 3.57e+01
qc 9 4 2 11 0 0.171 73.077 -9.57e+02
qcnew 9 3 2 5 0 0.178 91.667 -8.07e+02
rk23 17 11 7 22 0 0.180 97.024 8.33e-02
s365 7 5 3 18 0 0.176 77.778 00
snake 2 2 4 5 0 0.217 93.750 2.32e-14
stancmin 3 2 2 4 0 0.173 90.000 4.25e+00
synthes1 6 6 19 30 0 0.204 95.833 7.59e-01
synthes2 11 14 17 87 0 0.202 88.727 -5.54e-01
synthes3 17 23 35 147 0 0.239 95.096 1.51e+01
tenbars1 18 9 57 82 1 0.469 99.581 2.30e+03
tenbars2 18 8 87 110 1 0.469 99.768 2.30e+03
tenbars3 18 8 76 92 1 0.516 99.737 2.25e+03
truspyr1 11 4 1854 1895 1 3.238 99.982 1.12e+01
truspyr2 11 11 9 30 1 0.240 95.960 1.12e+01
try-b 2 1 7 1 0 0.179 100.000 1.00e+00
twobars 2 2 15 17 0 0.212 98.077 1.51e+00
water 31 10 155 197 0 0.430 99.179 1.05e+04
womflet 3 3 23 36 0 0.203 96.667 4.57e-08
zecevic3 2 2 22 40 0 0.219 92.857 9.73e+01
zecevic4 2 2 33 45 0 0.189 97.115 7.56e+00
zy2 3 2 9 10 0 0.182 93.333 2.00e+00
airport 84 42 72 1086 0 0.462 94.520 4.80e+04
batch 48 73 27 386 0 0.236 91.804 2.59e+05
britgas 450 360 12 1063 0 0.328 98.066 00
core1 65 59 19 112 0 0.202 98.005 9.11e+01
core2 157 134 82 365 0 0.370 99.444 7.29e+01
dallasm 196 151 453 1391 0 1.667 99.857 -4.82e+04
eigmaxa 101 101 2 100 1 0.698 97.525 -01
eigmaxb 101 101 7 101 0 0.187 99.929 -9.67e-04
eigmina 101 101 2 101 1 0.693 97.525 1.00e+00
eigminb 101 101 7 101 0 0.186 99.929 9.67e-04
expfitb 5 102 26 160 0 0.227 99.439 5.02e-03
expfitc 5 502 36 718 0 0.433 99.750 2.33e-02
feedloc 90 259 152 12375 0 1.924 96.584 6.39e-19
grouping 100 125 1 50 0 0.179 100.000 1.39e+01
haifam 99 150 67 355 0 0.465 99.743 -4.50e+01
himmelbi 100 12 469 773 0 1.176 99.505 -1.74e+03
hydroels 169 168 21 95 0 0.216 99.407 -3.47e+06
lakes 90 78 14 147 1 0.823 99.628 3.51e+05
leaknet 156 153 1 157 0 0.178 100.000 1.53e+01
reading6 102 50 111 376 0 0.428 99.485 -1.45e+02
robot 14 2 2 8 1 0.269 81.250 5.46e+00
ssebnln 194 96 10 342 0 0.206 92.931 16170600
tfi1 3 101 467 596 0 1.047 99.705 5.33e+00
tfi3 3 101 14 178 0 0.202 98.077 4.30e+00
twirism1 343 313 67 602 0 0.751 99.665 -1.01e+00
yorknet 312 256 16 532 0 0.263 96.420 2.44e+04
zamb2-10 270 96 32 896 0 0.334 95.807 -1.57e+00
zamb2-11 270 96 140 1129 0 0.656 98.815 -1.11e+00
zamb2-8 138 48 24 285 0 0.238 96.006 -1.52e-01
zamb2-9 138 48 126 722 0 0.494 98.183 -3.50e-01

Table B.2: SRLP Results

Appendix C

STrLP Results

This appendix lists the results of the algorithm STrLP on each problem in the test set defined in Section 5.2.

    problem   name of CUTEr problem
    n         number of variables
    m         number of constraints
    maj       number of major iterations taken
    min       number of minor iterations taken
    fr        number of feasibility restoration steps taken
    time      time taken to solve in seconds
    w         measure of subproblem warm-start quality (Equation 6.1)
    f*        value of the objective at solution

    Table C.1: Definition of symbols used in Table C.2.
problem n m maj min fr time w f*
alsotame 2 1 4 2 0 0.401 91.667 8.21e-02
bt13 5 1 25 23 2 1.989 77.778 00
cantilvr 5 1 28 125 1 0.227 33.333 1.34e+00
cb2 3 3 38 39 0 0.251 84.848 1.95e+00
cb3 3 3 6 1 0 0.142 97.222 2.00e+00
chaconn1 3 3 38 35 0 0.195 85.000 1.95e+00
chaconn2 3 3 6 0 0 0.144 100.000 2.00e+00
concon 15 11 2 11 1 0.241 78.846 -6.23e+03
congigmz 3 5 16 22 1 0.235 86.250 2.80e+01
csfi1 5 4 53 139 1 0.316 72.629 -4.91e+01
csfi2 5 4 2 3 1 0.213 77.778 5.50e+01
dembo7 16 20 2 39 1 0.280 77.778 2.10e+02
demymalo 3 3 9 10 0 0.185 89.583 -3.00e+00
dnieper 61 24 8 222 1 0.382 83.294 1.87e+04
eqc 9 3 1 5 0 0.181 83.333 -8.30e+02
expfita 5 22 31 127 0 0.226 87.037 1.14e-03
gigomez1 3 3 11 18 1 0.250 83.333 -3.00e+00
gigomez2 3 3 38 36 0 0.194 84.848 1.95e+00
gigomez3 3 3 6 1 0 0.146 97.222 2.00e+00
haifas 13 9 49 60 0 0.249 93.007 -4.50e-01
haldmads 6 42 21 218 1 0.342 86.553 1.22e-04
himmelbk 24 14 27 105 0 0.217 90.486 5.18e-02
himmelp4 2 3 624 1242 0 1.107 60.285 -5.90e+01
himmelp5 2 3 15 18 1 0.190 66.154 -5.90e+01
himmelp6 2 5 317 630 2 0.716 71.772 -5.90e+01
hong 4 1 43 144 0 0.237 20.833 2.26e+01
hs10 2 1 35 37 1 0.276 66.667 -1.00e+00
hs102 7 5 28 47 1 1.118 85.714 1.28e+03
hs103 7 5 24 41 1 1.892 86.111 9.24e+02
hs104 8 5 131 544 1 0.422 69.161 3.95e+00
hs106 8 6 21 60 0 0.203 80.357 9.14e+03
hs107 9 6 38 65 0 0.226 91.111 5.06e+03
hs108 9 13 44 265 1 0.309 79.273 -6.75e-01
hs11 2 1 27 28 1 0.236 66.667 -8.50e+00
hs111 10 3 29 223 0 0.217 49.744 -4.78e+01
hs112 10 3 353 2731 0 0.739 42.388 -4.78e+01
hs114 10 11 679 925 0 1.186 94.260 -1.77e+03
hs116 13 14 5 42 1 0.270 74.074 190
hs117 15 5 550 2926 0 1.062 75.899 3.23e+01
hs119 16 8 26 126 0 0.212 82.083 2.45e+02
hs12 2 1 80 133 0 0.283 41.026 -3.00e+01
hs13 2 1 14 2 1 0.176 90.476 1.01e+00
hs14 2 2 8 8 0 0.192 87.500 1.39e+00
hs15 2 2 7 1 1 0.174 75.000 3.06e+02
hs16 2 2 4 0 0 0.151 93.750 2.31e+01
hs17 2 2 14 6 0 0.152 75.000 01
hs18 2 2 51 52 2 0.257 73.333 5.00e+00
hs19 2 2 6 4 1 0.173 75.000 -6.96e+03
hs20 2 3 3 4 0 0.183 86.667 4.02e+01
hs22 2 2 8 10 0 0.180 81.250 1.00e+00
hs23 2 5 15 16 1 0.220 90.909 2.00e+00
hs24 2 3 2 5 0 0.170 80.000 -1.00e+00
hs29 3 1 542 1757 0 0.996 25.635 -2.26e+01
hs30 3 1 29 35 0 0.209 73.077 1.00e+00
hs31 3 1 37 74 0 0.188 45.833 6.00e+00
hs32 3 2 3 6 0 0.174 73.333 01
hs33 3 2 4 1 0 0.140 95.000 -4.00e+00
hs34 3 2 15 27 0 0.190 77.778 -8.34e-01
hs36 3 1 2 5 0 0.170 50.000 -3300
hs37 3 2 39 86 0 0.225 59.130 -3.46e+03
hs41 4 1 22 44 0 0.207 60.000 1.93e+00
hs43 4 3 174 507 0 0.431 61.238 -4.40e+01
hs54 6 1 2 4 0 0.172 78.571 -1.43e-88
hs55 6 6 1 5 0 0.171 83.333 6.67e+00
problem n m maj min fr time w f*
hs57 2 1 39 55 0 0.222 33.333 3.06e-02
hs59 2 3 54 73 2 0.310 68.387 -7.80e+00
hs60 3 1 37 65 0 0.228 51.389 3.26e-02
hs62 3 1 168 406 0 0.381 50.000 -2.63e+04
hs64 3 1 7 22 1 0.196 30.000 6.65e+03
hs65 3 1 773 2330 1 1.357 25.435 9.54e-01
hs66 3 2 30 46 0 0.209 76.000 5.18e-01
hs68 4 2 376 744 0 0.741 66.810 -9.20e-01
hs71 4 2 37 40 0 0.226 82.500 1.70e+01
hs72 4 2 7 14 1 0.206 61.111 7.28e+02
hs73 4 3 3 7 1 0.210 71.429 2.99e+01
hs74 4 5 35 39 1 0.266 88.889 5.13e+03
hs75 4 5 5 3 1 0.188 91.111 5.17e+03
hs80 5 3 26 49 1 0.237 76.250 5.39e-02
hs81 5 3 30 63 0 0.191 75.962 5.39e-02
hs83 5 3 3 5 0 0.178 83.333 -3.07e+04
hs84 5 3 9 45 0 0.183 50.000 -2.38e+06
hs85 5 21 8 55 0 0.186 84.615 -1.25e+00
hs86 5 10 42 119 0 0.237 80.303 -3.23e+01
hs88 2 1 27 33 1 0.294 66.667 1.36e+00
hs89 3 1 51 112 1 0.457 50.000 1.36e+00
hs90 4 1 94 316 1 0.422 39.231 1.36e+00
hs91 5 1 70 263 1 0.658 33.696 1.36e+00
hs92 6 1 136 697 1 0.589 31.905 1.36e+00
hs93 6 2 613 2470 0 1.135 50.050 1.35e+02
hs95 6 4 1 11 0 0.170 90.000 1.56e-02
hs96 6 4 1 0 0 0.137 90.000 1.56e-02
hs97 6 4 4 10 0 0.156 70.000 3.14e+00
hs98 6 4 4 12 0 0.171 70.000 3.14e+00
hs99 7 2 85 449 0 0.313 48.457 -8.31e+08
hubfit 2 1 31 47 0 0.253 36.111 1.69e-02
kiwcresc 3 2 34 43 1 0.309 76.667 -4.47e-08
loadbal 31 31 35 499 0 0.241 78.802 4.53e-01
lsnnodoc 5 4 2 5 0 0.171 83.333 1.23e+02
madsen 3 6 38 57 0 0.225 86.772 6.16e-01
makela1 3 2 44 53 0 0.228 76.800 -1.41e+00
makela2 3 3 29 33 0 0.210 82.540 7.20e+00
matrix2 6 2 23 29 1 0.295 78.125 1.03e-01
mconcon 15 11 2 11 1 0.242 78.846 -6.23e+03
mifflin1 3 2 35 47 0 0.218 76.842 -1.00e+00
mifflin2 3 2 78 160 2 0.373 61.333 -1.00e+00
minmaxrb 3 4 23 39 0 0.201 74.107 00
mistake 9 13 42 335 0 0.240 75.413 -1.00e+00
mribasis 36 55 58 544 1 0.505 93.626 3.15e+01
problem n m maj min fr time w f*
odfits 10 6 32 148 1 0.336 47.443 -2.38e+03
optcntrl 32 20 4 49 1 0.287 80.769 5.50e+02
pentagon 6 15 15 48 0 0.195 80.000 1.37e-04
polak1 3 2 23 25 0 0.201 80.000 2.72e+00
polak4 3 3 8 18 1 0.222 70.000 2.24e+01
polak5 3 2 23 22 1 0.242 70.000 5.00e+01
prodpl0 60 29 8 71 0 0.182 93.399 5.88e+01
prodpl1 60 29 9 109 0 0.151 91.386 3.57e+01
qc 9 4 1 11 0 0.174 46.154 -9.57e+02
qcnew 9 3 1 5 0 0.181 83.333 -8.07e+02
rk23 17 11 60 112 1 0.334 93.304 8.33e-02
s365 7 5 2 7 0 0.171 83.333 00
snake 2 2 2 3 0 0.169 87.500 1.20e-13
stancmin 3 2 2 3 1 0.195 70.000 4.25e+00
synthes1 6 6 37 66 0 0.221 87.281 7.59e-01
synthes2 11 14 10 61 1 0.260 76.000 -5.54e-01
synthes3 17 23 50 272 1 0.357 87.891 1.51e+01
tenbars1 18 9 104 553 1 0.522 82.463 2.30e+03
tenbars2 18 8 116 606 1 0.545 82.284 2.30e+03
tenbars3 18 8 63 396 1 0.503 78.114 2.25e+03
truspyr1 11 4 303 2319 1 0.800 53.866 1.17e+01
truspyr2 11 11 2 19 1 0.244 81.818 1.12e+01
try-b 2 1 6 1 0 0.181 100.000 1.00e+00
twobars 2 2 35 36 1 0.245 73.529 1.51e+00
water 31 10 17 126 1 0.292 80.976 1.07e+04
womflet 3 3 25 35 1 0.238 85.294 00
zecevic3 2 2 54 84 1 0.289 63.281 9.73e+01
zecevic4 2 2 44 77 0 0.207 57.292 7.56e+00
zy2 3 2 4 5 0 0.179 85.000 2.00e+00
airport 84 42 192 8015 1 1.236 66.555 4.80e+04
batch 48 73 43 380 0 0.250 86.860 2.59e+05
britgas 450 360 14 1742 1 5.907 95.150 00
core1 65 59 2 56 1 0.497 86.290 9.11e+01
core2 157 134 2 133 1 2.117 88.660 7.29e+01
dallasm 196 151 55 3570 0 0.491 71.025 -4.82e+04
eigmaxa 101 101 2 100 1 0.701 97.525 -01
eigmaxb 101 101 7 101 0 0.185 99.929 -9.67e-04
eigmina 101 101 2 101 1 0.694 97.525 1.00e+00
eigminb 101 101 7 101 0 0.186 99.929 9.67e-04
expfitb 5 102 29 381 0 0.242 95.736 5.02e-03
expfitc 5 502 36 1701 0 0.497 98.981 2.33e-02
feedloc 90 259 2 601 1 1.349 78.940 00
grouping 100 125 1 50 0 0.176 100.000 1.39e+01
haifam 99 150 11 293 0 0.212 96.185 9.98e+01
himmelbi 100 12 447 11820 0 3.763 75.831 -1.74e+03
hydroels 169 168 1 30 0 0.178 98.220 -3.43e+06
lakes 90 78 2 107 1 0.821 80.952 2.50e+11
leaknet 156 153 1 158 0 0.179 99.029 8.98e+00
reading6 102 50 51 1231 1 0.752 89.686 -1.45e+02
robot 14 2 2 8 1 0.272 81.250 5.46e+00
ssebnln 194 96 3 392 1 1.028 58.391 2.06e+07
tfi1 3 101 53 647 0 0.316 94.373 5.33e+00
tfi3 3 101 9 478 0 0.208 98.077 4.30e+00
twirism1 343 313 53 2891 0 0.847 94.233 -1.01e+00
yorknet 312 256 22 763 1 2.341 93.486 1.45e+04
zamb2-10 270 96 22 1099 0 0.268 89.344 -1.58e+00
zamb2-11 270 96 19 1632 0 0.257 81.379 -1.12e+00
zamb2-8 138 48 40 1537 0 0.303 82.484 -1.53e-01
zamb2-9 138 48 26 1061 0 0.230 84.097 -3.54e-01

Table C.2: STrLP Results
