T H E I N T E R I O R P O I N T M E T H O D A N D ITS A P P L I C A T I O N T O T H E F R A C T I O N A L (#,/)-FACTOR P R O B L E M By Ping Shi B. Eng., Beijing University of Posts and Telecommunications, 1985 M . Eng., Beijing University of Posts and Telecommunications, 1988 A THESIS SUBMITTED IN PARTIAL F U L F I L L M E N T OF T H E REQUIREMENTS FOR T H E D E G R E E OF M A S T E R OF SCIENCE in T H E FACULTY OF G R A D U A T E STUDIES DEPARTMENT OF MATHEMATICS INSTITUTE OF APPLIED MATHEMATICS We accept this thesis as conforming to the required standard T H E UNIVERSITY OF BRITISH COLUMBIA September 1996 © Ping Shi, 1996 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department of Mathematics Institute of Applied Mathematics The University of British Columbia 2075 Wesbrook Place •Vancouver, Canada V6T 1Z1 Date: Abstract This thesis contains six chapters. From Chapter 1 to Chapter 5, we give an exposition of interiorpoint methods (IPM). In Chapter 1 a brief survey of the recent development in interior-point methods is given. Chapter 2 gives a detailed description on the history of the primal-dual pathfollowing methods for linear programming. Chapter 3 introduces several predictor-corrector methods. This is an efficient and powerful class of interior-point methods and a lot ofattention has been paid during recent years. Chapter 4 discusses the various efficient linear algebraic methods which are used for solving the Newton equations system. Practically, most of the computational effort in implementations of interior-point methods is taken up in solving the Newton equations system. Therefore an efficient method for solving the Newton equations system can make the whole algorithm very efficient. Chapter 5 talks about the presolving procedure which is used to reduce the size of the L P problem and.to make the system sparser before passing the problems to optimization. The presolve procedure is a very efficient adjunct for any L P solver. In chapter 6, we apply the interior-point method to the fractional (#,/)factor problem, we first exploit the special structure of the fractional (<?,/)-factor problem and propose two I P M algorithms to take advantage of the special structure of the fractional (5,/)-factor problem. In each of the two algorithms, we use a different linear programming model for the fractional /)-factor problem in order to make the corresponding algorithm more efficient. Then, we discuss three starting point approaches and the strategy for creating our test problems. Last, we use the high order predictor-corrector primal-dual path-following code HOPDM[45] to test the three starting point approaches and from the computational results we get the conclusion that a fractional (<7,/)-factor problem-specific starting point can save iterations of the I P M and make the I P M more efficient for solving fractional (g, /)-factor problem. ii Table of Contents Abstract ii List of Tables vi List of Figures vii Acknowledgement viii 1 Introduction to Interior Point Methods 1 2 Path-following Methods for Linear Programming 7 2.1 Preliminaries 2.2 7 2.1.1 Newton-Raphson's method for solving nonlinear equations 11 2.1.2 The Karush-Kuhn-Tucker Conditions 12 The Affine Scaling Direction 13 2.2.1 The Primal Affine Scaling Direction 13 2.2.2 The Dual Affine Scaling Direction . 2.3 The Centering Direction 14 15 2.3.1 The Primal Centering Direction 15 2.3.2 The Dual Centering Direction 16 2.4 The Primal Path-following Method 16 2.5 The Dual Path-following Method 19 2.6 The Primal-Dual Path-following Method 19 2.7 The Primal-Dual Infeasible-Interior-Point Method 25 iii 3 Predictor-Corrector Methods 29 3.1 Mehrotra's Predictor-Corrector Method 29 3.2 Carpenter's Higher-Order Predictor-Corrector Method . 34 3.3 Gondzio's Multiple Centrality Corrections Method 37 3.4 Starting Point 40 4 Solving the Newton Equations 4.1 4.2 42 Direct Methods 45 4.1.1 The Normal Equations Approach 45 4.1.2 The Augmented System Approach . . . 50 The Adaptive Approach 54 5 Presolve Analysis 5.1 Reduction of Problem Size 57 5.1.1 Simple Presolve Methods 57 5.1.2 Forcing and Redundant Constraints 58 Tightening Variable Bounds and Implied Free Variables 60 . 5.1.3 5.1.4 5.2 5.3 5.4 57 Bounds on Shadow Prices and Dominated Variables . Improving Sparsity of the Problem 63 5.2.1 Making the Matrix A Sparser 63 5.2.2 Splitting Dense Columns 63 Removing Linearly Dependent Rows and Columns 5.3.1 Linearly Dependent Rows 5.3.2 Duplicate Columns 64 . Postsolve • • • • The Parallel I P M for the Fractional (g, /)-factor Problem 6.1.1 64 65 • • 66 6 Solving the Fractional (</,/)-factor Problem 6.1 61 The Interior Point Algorithm 67 . 68 69 iv 6.1.2 Staircase Structure 71 6.1.3 The Fractional (p,/)-factor Problem 74 6.2 The Preconditioned Conjugate Gradient Algorithm 78 6.3 Computational Implementation 85 6.3.1 Starting Point 86 6.3.2 Test Problems 89 6.3.3 Computational Results 90 Bibliography . 9 5 List of Tables 6.1 The preconditioned conjugate gradient algorithm 82 6.2 Original problem size 91 6.3 Iteration counts for approach 2 93 6.4 Iteration counts for different starting point approaches 94 vi List of Figures 6.1 The layered graph G 76 6.2 The staircase matrix A of the layered graph G 77 vii Acknowledgement I would like to express my appreciation to my supervisor, Dr. R. P. Anstee, for his guidance and criticism throughout the preparation of this work. I would also like to thank Dr. S. T . McCormick for his helpful suggestions, comments and careful reading of the draft. viii Chapter 1 Introduction to Interior Point Methods Interior point methods for mathematical programming were present at least since the sixties. In 1967, Dikin[25] introduced an affine-scaling method for linear and quadratic programming. Then Fiacco and McCormick[31] developed the IPM as a tool for nonlinear programming. Although Fiacco and McCormick noted that their proposed methods could be applied in full measure to linear programming, neither they nor any other researchers at that time seriously proposed that interior point method would provide a viable alternative to the simplex method for actually solving linear programming problems. This is for two reasons. First, due to the storage limitations, the size of the problems solved at that time did not exceed a couple of hundred rows and columns, and for such sizes the simplex method is practically unbeatable. Secondly, there was no numerical linear algebra for solving large, sparse linear equation systems at that time. (This was done in the '70s and '80s by George and Liu [34].) Therefore the orthogonal projections in IPM were very expensive. IPM needs significantly more memory than the simplex method which was an unacceptable requirement at that time. Although the simplex method is efficient and elegant, it does not possess a property that became more and more important in the last two decades: polynomial complexity. It was proved that the simplex method has an exponential worst-case complexity[61]. In 1978 Khachiyan[60] provided the first polynomial worst-case complexity algorithm for linear programming by showing how to adapt the ellipsoid method for convex optimization provided by Shor[94] and Yudin and Nemirovskii[116] to linear programming which requires 0(n L) 4 time, where L is the bit size of the problem. This method has significant theoretical implications for mathematical programming but is not competitive with the simplex method in 1 Chapter 1. Introduction to Interior Point Methods 2 practice. In 1984, Karmarkar[57] published his projective algorithm, which was proven to require 0(nL) iterations and 0(n - L) time overall, lower than Khachian's. It was also announced 3 5 that it was more efficient than the simplex method in practice. Karmarkar's algorithm is essentially different from the simplex method in that it moves through the relative interior of the feasible set instead of following a sequence of vertices as does the simplex method. Karmarkar's work opened a new research field, the field of interior point methods, which has yielded many theoretical and practical achievements. During the last eleven years a substantial number of contributions have been made in this field, both theoretical and practical. Inspired by Karmarkar's publication many other polynomial methods were proposed by researchers, sometimes containing new ideas, sometimes containing only slight modifications or embellishments. Numerical results have shown that some interior point methods can solve large linear programming problems faster than the simplex method. For a complete survey, the reader can consult the following excellent articles or books by Andersen, Gondzio, Meszaros and Xu[4], Goldfarb and Todd[40], Gonzaga[50], Hertpg and Roos[51], Lustig, Marsten and Shanno[68], Todd[98, 99], Wright[108, 109] and Ye[114]. I also refer the reader to the following web sites: 1. Linear Programming - Frequently Asked Questions List: http://www.skypoint.com/subscribers/ashbury/linear-programming-faq.html. 2. Interior-Point Methods Online: http://www.mcs.anl.gov/home/otc/InteriorPoint/index.html. All variations of IPM can be classified into four main categories: 1. Projective potential reduction methods; 2. Affine scaling methods; Chapter 1. Introduction to Interior Point Methods 3 3. Path-following methods; 4. Affine potential reduction methods. Karmarkar's algorithm is the original projective potential reduction method. It was studied and improved by many researchers: simplification of the analysis, studies of limiting behavior, removing the initial assumptions, etc. From the practical point of view the early results, e.g.[100], were disappointing. Later more efficient algorithms were developed using better formulas of the linear programming problem. Anstreicher [10],[11], Gay[33], De Ghellinck and Vial[36], Gonzaga[48], Yamashita[110], Ye[112], Ye and Kojima[115] studied the projective potential reduction method. Actually, these earlier methods provide useful perspective on later path-following approaches and the projective algorithms soon became off the main stream of research. The practical merits of IPM became clear when several researchers proposed and implemented the so-called affine scaling method. In each iteration of the projective potential reduction method a projective transformation is carried out to 'center' the current iterate. Instead of doing this projective transformation, Vanderbei et al.[106] and Barnes[13] proposed to do a simple affine scaling to center the current iterate. It turned out that this natural simplification of Karmarkar's method was already proposed by Dikin[25] in 1967. Cavalier and Soyster[20] also studied this method. Several implementations of the primal and dual affine scaling methods (e.g.[2],[14] and[106]) showed promising results. The affine scaling methods have no known polynomial time complexity and, in fact, Megiddo and Shub[75] have indicated that these methods may require an exponential number of iterations if they are started "close" to the boundary of the feasible region. Also, Vanderbei[103] has pointed out that these methods can make it difficult to recover dual solutions and prove optimality when there is degeneracy. However, these affine method do work well in practice; see Chen[22] for the primal affine method, and Adler et al.[2],[l], Marsten[71], and Monma and Morton[80] for the dual affine method. Monterio et al.[83] have obtained a polynomial time bound for a primal-dual affine method. Chapter 1. Introduction to Interior Point Methods 4 Most recent research in the IPM has concentrated on path-following methods. This class of methods explicitly restricts the iterates to a neighborhood of the central path and follows the path to a solution of the linear programming problem. Both the classical logarithmic barrier method and the center method are called path-following methods. We now describe the main results obtained for this class of methods. Gill, Murray, Saunders, Tomlin and Wright[38] first built the bridge between Karmarkar's method and the logarithmic barrier method for linear programming. They showed that the search directions used in Karmarkar's method and in the logarithmic barrier method were closely related. They also implemented the logarithmic barrier method and obtained promising results. However, they did not prove polynomiality as Karmarkar did. The logarithmic barrier function approach is usually attributed to Frisch[32] and is formally studied in Fiacco and McCormick[31] in the context of nonlinear optimization. The presentation of a continuous trajectory was first proposed by Karmarkar and extensively studied by Bayer and Lagarias[16], Megiddo[74], and Megiddo and Shub[75]. Megiddo[73] relates these central paths to the classical barrier path in a primal-dual setting. Kojima, Mizuno and Yoshise[62] used this framework to present a primal-dual path-foDowing method which requires O(nL) iterations and overall 0(n L) time. Monteiro and Adler[81] improve upon 4 this result by using ideas of Gonzaga[46] and Karmarkar[57] to obtain a primal-dual short-step path-following method which require O(n L) iterations and overall 0(n L) time. 05 3 Roos and Vial[90] and Gonzaga[49] independently developed a natural and more practical version of the logarithmic barrier method for linear programming, the long step path-following method. In this method the barrier parameter may be reduced by an arbitrary factor between 0 and 1. This means that the iterates need not lie close to the central path and hence long steps can be taken. They also proved that this method has an O(nL) iteration bound. Hertog[53] gave a simple analysis for the logarithmic barrier method and showed that the overall complexity can be reduced by a factor y^n, using safeguarded line searches. Chapter 1. Introduction to Interior Point Methods 5 Better results are obtained by predictor-corrector methods. The predictor-corrector methods (adaptive primal-dual algorithms) were developed by Mizuno, Todd and Ye[79]. The more practical predictor-corrector algorithms were proposed by Mehrotra[76], Carpenter, Lustig, Mulvey and Shanno[19] and Zhang and Zhang[117]. Nowaday, almost all the existing interiorpoint codes for general-purpose linear programming problems are based on Mehrotra's predictorcorrector algorithm. These algorithms represent current state-of-the-art I P M . The center method was proposed by Huard[55] in 1967. Compared to the logarithmic barrier method, the center method has.received less attention. Renegar[87] proved in 1987 that a very special version of the center method for linear programming is polynomial. In fact, this was the first polynomial path-following method with an 0(y/nL) iteration bound. In this method very short steps are taken. Although this method for linear programming has an 0(y/nL) iteration bound, which is a factor yfn better than the projective potential reduction method, it is very inefficient in practice. Later on Vaidya[102] reduced the total complexity bound to 0(n L), 3 using approximate solutions and rank-one updates. Hertog[52] developed a natural and more practical version of the center method for linear programming, in which the relaxation factor may be any number between 0 and 1, and line searches are allowed. The affine potential reduction algorithm was developed by Gonzaga[47]. His algorithm has iteration complexity O(nL). The primal-dual potential function and algorithm were analyzed by Anstreicher and Bosch[12] and Ye[113]. These algorithms possess 0(y/nL) iteration complexity. Ye proved that by doing projected gradient steps (after affine scaling) a constant reduction for the potential function value can be obtained. His method was not a symmetric primal-dual method. Kojima et al.[63] developed a symmetric primal-dual version of this method for linear complementarity problems. The potential reduction methods take steps of the same form as do path-following methods, but they do not explicitly follow the path and can be motivated independently of the path. Chapter 1. Introduction to Interior Point Methods 6 They use a logarithmic potential function to measure the worth of each point in the primaldual relative interior of the feasible set and aim to achieve a certain fixed reduction in this function at each iteration. Chapter 2 Path-following Methods for Linear Programming 2.1 Preliminaries In this section we will give notation and some relevant theory for linear programming problems and I P M . We also present a conceptual path-following algorithm. We consider the linear programming problem in the standard form minimize c x, T subject to Ax = 6, (2-1) x > 0, where c,x £ lZ , b £ 1Z , A € lZ , n m mxn n > m. The corresponding dual problem is maximize subject to b y, T Ay T (2.2) < c. It can be changed to the following form maximize subject to b y, T Ay + z = c, T z where.?/ G K , m z € (2.3) > 0, 1l . n With respect to other forms of linear programs, an analogous analysis can be carried out. The linear problem with upper bounded variables will be discussed later. We assume that the primal feasible region T P = {x e Tl\ : Ax = b}, where 11+ = {x e 1Z : x > 0}, (2.4) Chapter 2. Path-following Methods for Linear Programming 8 is bounded and has a nonempty relative interior given by f = {x e 7e+ : Ax = 6}, P where TZ + = {x e U : x > 0}. ++ (2.5) The corresponding dual feasible region is T = {(y,z)en xn :A y m D n + z = c}, T + (2.6) and the dual relative interior is F = {(y, z)en x m D n\+ -.A + z = }. (2.7) T y c We also use T = T-p x Tv and T = Tp x Tp> to represent primal-dual feasible region and its relative interior, respectively. The duality gap is c x — b y and the complementarity gap is x z. T T If (x, y, z) is primal and T dual feasible then the duality and complementarity gap are identical. Our concern starts from the fact that I P M performs well by avoiding the boundary of the feasible set. It does this with the help of a classical resource first used in optimization by Frisch[32] in 1955: the logarithmic barrier function n xeTZ ,x>0 n =» J>(a:) = l o g s , n JJxj. =-log (2.8) This function grows indefinitely near the boundary of the feasible set T, and can be used as a penalty attached to those points. Combining P(x) with the objective makes points near the boundary expensive, and forces any minimization algorithm to avoid them. The unique minimizer of P(x) in .Fis the analytic center of T, and it coincides with the point that maximizes the product of the variables in T. Newton-Raphson's method can be adapted to determine a center with a prescribed precision with excellent theoretical and practical performance. The main idea behind all interior point method is that costs should try to be decreased and simultaneously move away from the boundary. Thus we examine combinations of cost and barrier function, in a traditional construction known as the internal penalized function: H €ll,x 6 111+ ' => f(x,p) = c x - /x^jTlogz, = c x + pP(x). T T (2.9) Chapter 2. Path-following Methods for Linear Programming 9 where p is a barrier parameter. For a given value of p, (2.1) becomes minimize f(x,p), (2.10) subject to Ax = b. Similarly, the internal penalized function for (2.3) is peTl,zeni => + g(y,z,p) = & j / + / i £ T log ^ , (2.11) thus transforming (2.3) to maximize subject to g(y,z,p), Ay T (2.12) + z = c. The internal penalized function was extensively studied by Fiacco and McCormick in their book[31], and described since then in all nonlinear programming textbooks. Now associate to each value of the parameter p a primal central point x(p) uniquely denned by x(p) = argmin f(x,p). (2.13) The curve (x(p) : yu € 72.) is the primal central path for the linear programming problem (2.1). The curve is smooth and has the important property that as p goes to zero it converges to an optimal solution of (2.1). In the same way we can define a dual central point (y(p), z(p)) as (»(/*)> *(/*))= The curve {(y(p),z argmin g(y,z,p). (2.14) (/x)) : p £ 11} is the dual central path for the linear prog (2.3). Assume that f ^ 0, i.e. both Tp ^ 0 and TD ^ 0- We define the primal-dual central path as foDows r = {(x,y, z)e where X = diag(xi,a;2,.. .,x ), n T :Xz = pe where x^ z p - —}, (2.15) and e denotes the vector of all ones that is e = ( 1 , 1 , . . . , 1) . Ye[114] gives the following theorem: T 10 Chapter 2. Path-following Methods for Linear Programming Theorem 2.1.1 Let (x(p), y(u-), z(p,)) G T(p) then 1. The central pair (x(fi), z(fx)) is bounded where 0 < /i < /x° /or any given u° > 0. 2. For 0 < fj,' < p, c x(p') < c x(fi) and b y(n') > b y(p). T 3. (x(n), T T T Z(H)) converges to an optimal solution pair for (2.1) and (2.3) as \x —• 0. The theorem shows the primal-dual central path is well defined and its limit point is an optimal solution to (2.1) and (2.3). Path-following algorithms follow the central path. Let (r(/x) : p 6 1Z) be a parameterization of the central path. A l l path-following algorithms follow the model below. Algorithm 2.1.Conceptual path-following: given an initial point (x°,y°,z°) with (x°,j/V°) £ j , u° € Ti, 7 = rv°). k := 0. REPEAT Choose Hk+i < Hk- Call an internal minimization algorithm to find (x , y , z ) = T(pk+i)k+1 k+1 k+1 k := k + 1. U N T I L convergence. The model simply generates a sequence of independent central points T(/Xfc),(A; = 1 , . . . ) . The parameterization and the updating rule characterize different algorithms. In the internal minimization algorithm, Newton-Raphson's algorithm can be used to find the central point for corresponding pk- We will discuss the choices of p, later. Exact central points cannot be exactly computed in finite time. The practical algorithms generate points in a neighborhood of the central path converging towards the optimal solution. Before we present the detailed description about the path-following algorithm, let us review some basic concepts which we will use in the following sections. Chapter 2. Path-following Methods for Linear Programming 2.1.1 11 Newton-Raphson's method for solving nonlinear equations For F(x) = 0, where x - (x x ,...,x ) , F(x) = (fi(x), f (x), ..., f (x)) T u 2 (2.16) T m 2 and /,-(x), (i = l , 2 , . . . , m ) are m difFerentiable at D C Tl . m The Taylor expansion of fi(x), (i x k = 1,2,..., m) at is /,(x)«/ (^) + f:^|^Ax i 9XJ I where Ax j = x j - Xj, (j — f c J , (2.17) 1,2,..., m). Combine (2.16) and (2.17) we get f^y^lx* 7= 1 =-Mz ) (i=l,2,...,m), k (2.18) ^ or in vector form F'(x )Ax k k = -F(x ), k (2.19) where / ... Ill ^ dA d X m F'(x ) = k dim d X X=X I m k and Ax = k (Ax ,...,Ax J , k k T and F'(x) is Jacobi matrix of F(x). Then =x k + A«*. (2.20) Newton's method can be applied to unconstrained minimization problems. To minimize L(x), we take F(x) = L\x) and use Newton's method to search for a zero of F(x). If L(x) is Chapter 2. Path-following Methods for Linear Programming 12 a real valued function of m variables, a local minimum of L will satisfy the following system of m equations in m variables: dL(x) dxi 8L(x) 8x = 0, = 0. r So the Newton iteration (2.19) becomes V L(x )Ax 2 k = -VL(x ), k (2.21) k where VL is the gradient of L and V L is the Hessian of L. If L is convex, then any local 2 minimizer is also a global minimizer. If x* is a local minimizer of L(x), i.e. V i ( x * ) = 0, and if W L(x*) has full rank, then Newton's method will converge to x*. 2 2.1.2 The Karush-Kuhn-Tucker Conditions We consider the N L P of the following form: NLP: minimize {f(x) : (x) > 0, for t = 1,..., m}. gi (2.22) The Karush-Kuhn-Tucker Conditions for N L P are m 1. V / ( z ) + 5>,-Va(a:) = 0. t=i 2. 9i(x) > 0 3. yigi{x) = 0 for for i=l,...,m; and y < 0. (2.23) i=l,...,m. If all the constraints in N L P are equations, the N L P becomes the problem of Lagrange multipliers(PLM). P L M can be stated as follows: P L M : minimize {f(x) : g (x) = 0, for i = l , . . . , m ) . t (2.24) The Lagrange function is m L(x,y) = f(x) + Y,yi9i(*)- (- ) 2 25 Chapter 2. Path-following Methods for Linear Programming 13 The Optimality Conditions for PLM A vector x € TZ is optimal for PLM only if there exists a vector y € TZ such that n m m V L(x,y) = Vf(x) + J2yiV {x) i=i 1. x 2. gi gi(x) — 0 for = 0. (2.26) i = 1,..., m; and y is sign free. The Lagrange function (2:25) transforms a constrained optimization problem with equality constraints into an unconstrained optimization problem. To solve (2.24) from the Lagrange function (2.25), we minimize the unconstrained function L(x,y) by solving the system of (n + m) equations in (n + m) variables: dL = df(x) A dg (x) - ^ - - £ ^ - ^ = °> = -gi(x) = 0, t TToyi . forj = l , . . . , n , for i = 1,.. .,m. These equations can be solved by Newton's method. 2.2 The Affine Scaling Direction 2.2.1 The Primal Affine Scaling Direction For x 6 Tp, the affme-scaling algorithm generates an improved feasible solution by scaling x k k to e, the vector of ones, and making a step in the projected steepest-descent direction for the transformed objective function in the transformed space. A scaling transformation is a change of variables x = Dx, where D is a positive diagonal matrix. Give a point x 6 Tp, scaling about x is the scaling transformation x — XkX, where k X = k k d\ag(x ,...,x ). k k The scaled version of (2.1) is then minimize cFi, subject to Ax = b, x > 0, (2.27) 14 Chapter 2. Path-following Methods for Linear Programming where A = AX k and c = X c are obtained by substitution of x — X x in (2.1). The feasible k k solution x — x of (2.1) corresponds to the feasible solution i = e of (2.27). k Dikin's[25] primal affine-scaling algorithm takes a step from x = e in the direction of the negative projected gradient. This direction is —c projected onto the null space of A p ff = -P c a where P AXk =1- X A (AX A )~ T k 2 T = -P X c, A AXk (2.28) k AX . 1 k k The affine-scaling direction in the original space corresponds to moving from x k in the direction P a f f = -X P c k = -X P X c. A k AXk (2.29) k This primal affine-scaling direction is used in the primal affine scaling methods[25, 2, 13, 106]. 2.2.2 T h e D u a l Affine Scaling Direction Similar to the primal affine scaling direction, the scaled version of (2.3) is maximize b y, T subject to Z " A y + z = Zl c, 1 T l (2.30) z > 0. Gonzaga[48] gave the following Lemma which transforms the dual problem to the primal problem. L e m m a 2.2.1 Formulation (2.30) is equivalent to minimize b z — bo, T subject to A z = c, T z > 0. where A T = P A Z -i, (2.31) Chapter 2. Path-following Methods for Linear Programming b = Z- A (AZ- A )- b, 0 = b (AZ- A )- AZ- c, c = b 1 T 2 T 2 T T 15 1 1 2 PAZ-I Moreover, given h G 7Z , consider the direction d? = P rh, then h is equivalent to the direction n A d for (2.3) given by d = -{AZ- A )- AZ~ h. 2 T l (2.32) 1 The dual affine scaling direction is the projected steepest-descent direction for problem (2.31). Use lemma 2.2.1 we get that d aff = (AZ~ A )- AZ~ b 2 T l = 1 {AZ~ A )- b. 2 T X (2.33) This direction is used in dual affine scaling methods. 2.3 2.3.1 The Centering Direction The Primal Centering Direction IP Ms try to avoid the boundary of feasible region T. A good way of avoiding the boundary is obtained by denning a center for the polytope T, and by simultaneous consideration of two objectives: reducing costs and centering. We now ignore the cost and turn to the problem of finding a "center" for the polytope T. Most IPMs use the analytic center which was defined by Sonnevend[95]. The analytic center of Tp is the unique point given by n X = argmin P(x) = argmin(- y^log Xj). (2.34) The analytic center is approached by solving the following problem n minimize subject to - 2 log Xj, (2.35) Ax — b. Chapter 2. Path-following Methods for Linear Programming 16 The Lagrangian is n L(x,y) = - ^ o g x - y ( A x - b ) . T j (2.36) The first order conditions are VL X = -X~ e- VL = - 4 x + 6 = 0. y Ay l T = 0, (2.37) Solving these equations by Newton's method, we get p = Ax k cent = X P e. k AXk (2.38) This is the primal centering direction. 2.3.2 The Dual Centering Direction Similar to the primal centering direction, the dual centering direction is obtained by solving the foDowing problem n minimize subject to - ^ log Zj, j=i Ay (2.39) + z = c. T The Newton direction for this problem is deem = -(AZ- A y AZ- e. 2 T 1 1 (2.40) This is the dual centering direction. 2.4 The Primal Path-following Method The primal path-following method was first studied by Gill et al. [38]. They consider the linear programming problem in the standard form (2.1). Chapter 2. Path-following Methods for Linear Programming 17 Replacing non-negativity of constraints in (2.1) with the primal logarithmic barrier penalty terms in the objective function, we get n c x — p^T^lnxj j=i minimize = f(x,u-), T subject to (2.41) Ax = b, where \x > 0 is the barrier penalty parameter. This technique is well known in the context of genera] constrained optimization problems. One solves the problem penalized by the logarithmic barrier function term for several values of the parameter //, with /x decreasing to zero, and the result is a sequence of feasible points converging to a solution of the original problem. The objective function /(x,/x) of problem (2.41) is a strictly convex function. This implies that this problem has at most one global minimum, and that this global minimum, if it exists, is completely characterized by the Karush-Kuhn-Tucker stationary conditions(first order conditions). The function f(x,p) achieves the minimal value in its domain (for fixed p) at a unique point, which is denoted by x([i), and called the p-center. The Lagrangian for (2.41) is 71 L(x, y, p) = c x - (i ]P i=i T j - V {Ax - b), m x (2.42) T arH the first order conditions(KKT conditions) for (2.41) are V X = c - pX~*e - A y VL = -Ax X y T = 0, + b = 0, (2 A3) where y is the vector of Lagrangian multipliers associated with the equality constraints of problem (2.41), and X denotes the diagonal matrix whose diagonal elements are the variables Xj, 1 < j < n, which is denoted by Xk when evaluated at the iterate x k (i.e. X = diag(x\,.. .,x )). n Now, we apply one iteration of Newton's method to find an approximate solution to (2.43) for a fixed value of \i. We get the search direction Ax k = -(l/flk)X PAX X C k k k + XP Xe k A k = (l/Pk)Paff + Pcent, (2.44) Chapter 2. 18 Path-following Methods for Linear Programming where PAX, = A new estimate optimal solution x {I-X A {AXlA )- AX ). T k l (2.45) k is defined by + x where a T k k+1 = x + a Ax , k (2.46) k k is respective step length in the primal spaces chosen to assure the positivity of the variable x. Rather than making several Newton steps to converge to the central trajectory for a fixed value of p, the algorithm reduces p by p +\ k = pp ,0 < p < 1, at each step, and the k algorithm continues. Gill et al. [38] and Gay[33] noted that Karmarkar's method is just a special case of primal logarithmic barrier methods. We find that (2.44) consist of a centering term to keep away from the boundary, and an affine term that leads toward an optimal solution. When p k -* 0 optimality dominates, whereas for large p , the method proceeds toward the analytic center of the feasible region. Den Hertog k and Roos[51] showed that most interior point methods have search directions that are linear combinations of these two vectors. The choice of a k and p is important for the convergence of the algorithm. Gonzaga[46] shows a certain choice which updates the barrier penalty parameter p by short steps, forces the algorithm to trace the central path, so that all points generated are near the central path. This algorithm converges in 0(y/nL) iterations assuming the initial point (a; , y°) is close to the 0 central path. The overall complexity is 0(n L). 3 Roos and Vial[91] and Gonzaga[49] propose a long-step path-following method based on the same direction (2.44). In their algorithms, the barrier parameter p is reduced by a large factor if the iterate lies close to the central path. Their algorithms also converges in 0(\/nL) iterations for certain values of p. Chapter 2. 2.5 Path-following Methods for Linear Programming 19 The Dual Path-following Method Renegar proposed a special case of dual logarithmic barrier method in[87]. We consider the dual linear programming problem (2.2). We do not use the logarithmic barrier function(2.11). We eliminate the inequality in(2.2) by incorporating it into a logarithmic barrier term appended to the objective to obtain the following transformed problem n minimize b y + /x ^T] ln(cj - ajy), (2-47) T where aj is the jth column of the matrix A. The first order conditions(KKT conditions) are b-pAZ^e = 0, (2.48) where Z is the n X n diagonal matrix with elements Zj = Cj — ajy- One step of Newton's method yields Ay = ^{AZ- A )- b 2 T l - (AZ- A )- AZ- e 2 T 1 1 = ^d aff + d , cent (2.49) where the first term in(2.49) represents a step toward optimality and the second term is a centering step in the dual space. 26 The Primal-Dual Path-following Method Most of the new interior-point methods can be seen as variants of the primal-dual logarithmic barrier method due to Megiddo[74] who used logarithmic barrier methods to solve the primal and dual problems simultaneously. Meggiddo's method was developed into a convergent primaldual logarithmic barrier algorithm by Kojima et al. [62]. This method was shown by Monteiro and Adler[82] to require 0(n L) 3 overall time and no interior point algorithm has been shown to have a better worst-case complexity bound. In practice the primal-dual algorithm outperforms the primal and dual barrier algorithms. Because in Chapter 6 we will use the primal-dual method to solve the fractional (g,f)factor problem which has upper bounded variables, now we consider the primal problem in the Chapter 2. Path-following Methods for Linear Programming 20 following form minimize c x, subject to Ax = b, T (2.50) 0 < x < u, where c, x,u G lZ , b G TZ , A G 1Z , n m n > m. Some or all of the upper bounds u may be mXn infinite. Adding slack variables to x < u, we get minimize c x, subject to Ax = b, T x + s — u, (2.51) x > 0, s > 0, where c, x, s,u£ 72™, and b G TZ , A G Tl , m n > m. mxn The corresponding dual problem is maximize subject to b y — u w, T Ay T - w + z = c, T (2.52) * > 0, w>0, where y G 1Z , and z, w G 72 . m 71 To derive the primal-dual algorithm, we eliminate the inequality constraints in(2.51) by incorporating them in a logarithmic barrier function as n minimize cx —u^ T subject to n In Xj — \i ^ In SJ , Ax = 6, (2.53) x + s = u. The Lagrangian for(2.53) is then n L(x, s, y, u;,//) = c a; - JJL T i=i n In a;j - u ^ j=i In Sj - 2/ (Ax — 6) + w ( x + s — u). T r (2.54) Chapter 2. Path-following Methods for Linear Programming 21 The first order optimality conditions ( K K T conditions) for(2.54) are Ax = b, x+ s = u, (2-55) A y + pX e — w = l pS~ e - w 1 Introducing the dual slack variable vector z = pX e, c, — 0. the first order optimality conditions(2.55) -1 give Ax — b, x+ s = u, Ay + z - w = c, XZe - /lie, SWe = pe, T (2.56) where X, 5, Z, and W are diagonal matrices with the elements and XJ,SJ,ZJ Wj, respectively, e is the n-vector of all ones and p is a barrier parameter. The set of solutions of(2.56) (x(p),s(p)) and (y(p),z(p),w(p)) defines the central path of the primal and dual problem, respectively. The central path is the set of points at which the product of each primal-dual variable pair is identical. For any primal and dual feasible solution (x.s) and (y, z, w) the quality of centrality is measured by n ^ , , , ^ . , ) =g ( ^ - ^ 9 If 6(x,s,y,z,w,p) g , ^ - ^ , ' . + (2,7, = 0 then (x,s) and (y,z,w) are on the central path. The smaller 6 is, the better the points are centered. From (2.56) we find the complementarity gap is rp c x rp — rp b y+ u w - rp . rp x z— x w+ u w T T z+ s w = X = x pX~ e - np + np = 2np. T 1 + s pS~ e T 1 (2.58) Chapter 2. Path-following Methods for Linear Programming Therefore when p —• 0 z(p), 22 the complementarity gap go to 0. It proves (x(p),s(p)) and (y(p), + w(p)) converge to the optimal solution for p —• 0 . Let us observe that the first three + equations of(2.56) are linear and force primal and dual feasibility of the solution. The last two equations are nonlinear and become the complementarity conditions for'// = 0, which together with the feasibility constraints provides optimality of the solutions. A single iteration of the basic primal-dual algorithm makes one step of Newton's method applied to the first order optimality conditions (2.56) with a given p and then p is decreased. The algorithm terminates when the complementarity gap is reduced below a predetermined tolerance. Assuming that the current estimate of the primal and dual variables satisfies both primal and dual feasibility, one step of Newton's method is applied to (2.56) by solving the following system of linear equations A 0 0 0 0 Ax 0 I 0 I 0 0 Ay 0 0. I -I As A 0 T = 0 Z 0 0 X 0 Az pe -XZe 0 0 w 0 5 Aw pe - SWe Solving (2.59) we get the search directions where 0 = ( S - 1 W.+ X~ Zy l l Ay = (AQA )~ AQp(p), Ax = e(A Ay-p(p)), Az = pX'^e- Ze-X^ZAx, Aw = S~ (pe- SWe + As - -Ax, T 1 T l and p{p) = p^S' 1 - X - 1 (2-60) WAx), )e - (W - Z)e. A new approximation Chapter 2. Path-following Methods for Linear Programming 23 to the minimizer is then determined as where i = x + ctpAx, s = s + apAs, y = y + a Ay, z = z + ct£)Az, w = tz; + a n A w , (2.61) D and Q £ > are step lengths in the primal and dual spaces respectively chosen to assure the positivity of the variables x,s,z and w. As in[66] ap and a® are determined as • <y • £• I m i n { — — , A X J < 0},min{^-,AXJ > 0} > , j — AXJ j AXJ J Z' T ^AV QP = 0.995ap, = 0.995a , I 10' { A 2 j < ° 'T ^' ^' < ° } { A } J' (2.62) D where the 0.995 factor keeps the iterates from actually touching the boundary. When the minimum index corresponds to the artificial variable, then the relevant a was set equal to one. The barrier parameter /i serves as a centering parameter, pulling the primal variables and dual slacks away from zero. Let <?ap(0) denote the current duality gap. From (2.58) we know the duality gap is gap(0) — c x - b y + u w = z x + w s. T T T T (2.63) T Let a < m i n { a p , a £ ) } , From (2.59) we know, after one iteration, the new duality gap is gap(a) = (z + aAz) (x + aAx) + (w + aAw) (s + aAs) T T = z x + w s + a(ZAx + XAz + WAs + SAw) e = gap(0) + a(2np - gap^)), T T T Thus the duality gap is reduced at each step as long as 2np - gap(0) < 0, i.e. gop(O) _ c x - b y + u w T < ^ 2n T 2n T 4 Chapter 2. Path-following Methods for Linear Programming 24 In the primal-dual algorithm, we choose p by the algorithm of McShane et al.[72] and Lustig[66], namely, for feasible x and y, cx — by + uw u- = -f, T T T 2.65 and n , if n < 5000, n,y/n, if n > 5000. 2 4>(n) = I (2.66) The algorithm is repeated until the relative duality gap satisfies cx — by + uw T T T for a user predetermined e. In comparing primal, dual, and primal-dual methods, we find that all search directions consist of a matrix of the form ADA , T where D is a diagonal matrix. The content of D varies, but the computational work does not. There are several advantages for the primal-dual method. First, primal feasibility may be obtained well before optimality. For the dual affine method, a primal feasible solution cannot in general be recovered until optimality. Thus if a run must be stopped short of optimality, no primal feasible solution can be guaranteed to be available for purely dual methods. Second, under primal and dual degeneracy, when large steps are taken, primal methods cannot be guaranteed to yield dual optimal solutions at optimality, nor can dual methods yield primal optimal solution in general[115]. The primal-dual method yields both under any degeneracy conditions. Third, for primal feasible (x, s) and dual feasible (y, z, w), the exact current duality gap (2.67) is always known. Thus an excellent measure of how close the given solution is to the optimal is always available. Finally, the primal-dual method allows for separate step lengths in the primal and. dual spaces as shown in (2.61). McShane, Monma, and Shanno[72] have proven that these separate step lengths significantly reduce the number of iterations to convergence. Chapter 2. Path-following Methods for Linear Programming 2.7 25 The Primal-Dual Infeasible-Interior-Point Method In the discussions so far, we have been assuming that a feasible solution (x°, s°) £ Tp of the primal problem and a feasible solution (y°, z ° , w°) € To of the dual problem are available so that we can immediately start the algorithm. In general, it is necessary to provide a means for determining an 5°, y°, z°, w°) 6 Tp X Try as an initial point for the algorithm. Lustig[65] has used the following pair of primal and dual linear programs to find the solution of a problem for which an initial interior feasible point is not known. The augmented primal system is minimize c x + cx, T a a subject to Ax + dpx = b, a x + s = u, dJ)X + X b X, 5, X , a (2.68) = b, a ^ 0. X b The dual of this system is maximize b .y — u w + b y , T T a a subject to A y + d^y + z — w = c, T a d y+ z = c, (2.69) T P a a y + *b = 0, a 2, where d P = b- Ax° and d - A y° + z°-w°-c, T D W , Z a , Z b > 0, x° > 0, z° > 0, w° > 0 and y° are arbitrary initial points. We need to take c and b satisfying a a c >d%y = (b-Ax°) y , 0 T a (2.70) 0 and b > d x° T a D = (A y° T + z°-w°- c) x°, T (2.71) Chapter 2. Path-following Methods for Linear Programming so that (x°,xl,s°,x%) and (y°, 26 z°, z ° , z$, w°) are feasible solution of (2.68) and (2.69), re- spectively, where = X , = 1, = b - s° = u — x°, y° - if , y° - -i, x° a a dpx° 0 = *°, Z° = W° 4 - c = 1. a - dpy° Therefore, we can apply the algorithm to the artificial problems (2.68) and (2.69) with these initial feasible solutions. In the limit as b and c approach infinity, the search directions of (2.68) and (2.69) satisfy a a the following system AAx A Ay T = x dp, a Ax + A s = 0, + Az - Aw = yd , ZAx + XAz = -XZe + pe, WAs + SAw = -SWe + pie. a (2.72) D From (2.68), we get x dp — b—Ax, a (2.73) Chapter 2. 27 Path-following Methods for Linear Programming and from (2.69), we get yd a = c - Ay - z + w. T D (2.74) Lustig[65] proposes a primal-dual infeasible-interior-point method which relaxes the primal and dual feasibility assumption on the primal-dual logarithmic barrier method, and shows this method is equivalent to (2.72), which is derived using artificial vectors when the artificial cost become infinitely large. He applies Newton's method to the first order conditions (2.56) without requiring primal and dual feasibility. He assumes only that x and s satisfy x + s = u with x > 0 and s > 0, which implies that A s still satisfies A s = —Ax. The defining Newton equations for the change in the variables are then AAx — b- Ax + As A Ay T Ax, = 0, + Az-Aw = c - A y - z + w, ZAx + XAz = -XZe + pe, WAs + SAw = -SWe + pe. (2.75) T The resulting search directions are Ay = Ax = Q(A Ay - p(p) + Az = pX~ e- Ze- Aw = S - \ n e - SWe As where 0 = {S^W + X' 1 = Z)' , 1 (AQA^-HAQipi^-r^-iAx-b)}, T l r ), D X-^ZAx, + (2.76) WAx), -Ax, p(p) = piS' - X~ )e 1 l - (W - Z)e, and r = Ay + z-w-c. T D In comparing (2.72) and (2.75), we find they are identical. Therefore, we get the conclusion that the primal-dual logarithmic barrier method using the limiting feasible direction (i.e. b -,c —* oo) leads to the primal-dual infeasible-interior-point method. a a The infeasible-interior-point method requires only the positivity of ( x ° , . s ° ) and (z°,w°) x ° + s ° = u. The generated sequence (x ,s ,y ,z ,w ) k k k k k and is not restricted to the interior of Chapter 2. 28 Path-following Methods for Linear Programming the feasible region. A n iterate (x ,s ,y ,z ,w ) is required to satisfy neither the equality constraints Ax = b of primal problem nor A y - w + z = c of the dual problem, but only the k k k k k T positivity x > 0 and z > 0 and x + s = u. From (2.75) we see that the search direction is still a Newton step toward the point on the central path. It tries to get rid of all the infeasibility in the equality constraints in a single step. If a full step is ever taken (i.e. a = 1), the residuals rp = Ax — b and rp — A y T + z —w - c become zero, and all subsequent iterates remain strictly feasible. The choice of step length parameter ap and a n are same as in primal-dual method described in last section. The barrier parameter p is chosen as: for feasible x and y, cx - by + T A*= T T7~\ uw T <p(n) , (2.77) and when feasibility has not been obtained c x - b y + u w + M6i + M6 —— , <p(n) T [i- T , . (Z.tb) T 2 where S 1 = \\b- Ax\\/\\b- 6 2 = \\c-A y-z Ax°\\, (2.79) + w\\/\\c-A y -z T T 0 0 + w°\\, (2.80) and n, ct>(n) = { if n < 5000, ~ 2 ny/n, (2.81) if n > 5000. Here M is chosen as in [72], namely M = p</>(n)max< max {|c,|}, max {l<j<n l<i<m \ , (2.82) J and p is a scalar multiplier. The algorithm is repeated until the relative duality gap satisfies c x — b y -f u w < e, 1 + \b y — u w\ T T T for a user predetermined e. T T (2.83) Chapter 3 Predictor-Corrector Methods Mizuno, Todd, and Ye[78] developed the adaptive primal-dual algorithm which is also called a predictor-corrector algorithm. In this algorithm the predictor step is a affine scaling (damped Newton) step for problem (2.56), producing a new strictly feasible iterate. This affine scaling step can make a large reduction in complementarity gap. The subsequent corrector step is a centered Newton step to move the point back to the central path. In this corrector step, the choice of the barrier parameter a is based on the predictor step. Both the predictor and corrector steps require essentially the same amount of work, the evaluation and factorization of the Jacobian matrix. Later on, Mehrotra[76] proposed a more practical predictor-corrector algorithm. A common feature in these two predictor-corrector algorithms is that the value of the barrier parameter in the corrector step depends on the predictor step. However, unlike Mizuno, Todd, and Ye's corrector step, Mehrotra's corrector step does not evaluate a new Jacobian matrix. Instead, it reuses the Jacobian matrix used by the predictor step. Mehrotra's predictor-corrector method has been used by almost all L P interior-point implementations [76]. In some sense these predictor-corrector methods follow the central path in primal-dual form, but certain of algorithms allow a very loose approximation to the central path. 3.1 Mehrotra's Predictor-Corrector Method Mehrotra[76] proposed a predictor-corrector method which takes a single "corrector" step to the central path.after each "predictor" step to decrease a. 29 Lustig present an extension of Chapter 3. Predictor-Corrector Methods 30 Mehrotra's method in [67]. His method can be derived directly from the first order conditions (2.56). Rather than applying Newton's method to (2.56) to generate correction term to the current estimate, Lustig substitute the new point into (2.56) directly, yielding A (y T A(x + A x ) = 6, (x + A x ) + (s + A s ) = u, + Ay) + (z + Az) - f> + Aw) = c, (X + AX)(Z + AZ)e = /ie, (S + AS)(W + AW)e = pe, where AX, AS, AZ, and AW are diagonal matrices with the elements AXJ,ASJ,AZJ (3.1) and AWJ, respectively. Simple algebra reduces (3.1) to the equivalent system AAx = b - Ax, = u — x — s, + Az-Aw = c - A y - z + w, ZAx + XAz = pe-XZe- AXAZe, WAs + SAw = /Lie - SWe - ASAWe. Ax + A s A Ay T T (3.2) The left-hand side of (3.2) is identical to (2.75),while the right-hand side has two distinct differences. The first is the upper bounds equation. In (2.75) we always choose x ° and s ° satisfying x° + s ° = u with x ° > 0 and s ° > 0. This means the right-hand side of the second equation is always zero. For problems with small upper bounds this requirement will bring computational instability. Therefore we assume only that x ° > 0 and s ° > 0 but do not require that the upper-bound constraint x ° + s ° = u be satisfied initially. We allow the method to iterate to bound feasibility in precisely the same manner it iterates to primal and dual feasibility. The second difference between (3.2) and (2.75) is the presence of the nonlinear terms and ASAWe AXAZe in the right-hand side of the last two equations in (3.2). Therefore Mehrotra's predictor-corrector technique incorporates higher order information when approximating the Chapter 3. Predictor-Corrector Methods 31 central trajectory and computing the direction step. This method looks for a parametric representation of the trajectory that goes from a given (infeasible) point in primal and dual spaces to a solution of (2.51) and (2.52). The predictor-corrector technique decomposes direction step A = (Ax, As, Ay, Az, Aw) into two parts A = A i.e., combine affine-scaling, A a + A , a (3.3) c and centering, A components. The term A c a (affine direction) is obtained by solving AA x = b — Ax, Ax + As = u — x - s, = c- Ay a a AAy a + Az T a - Aw a a : T ZA x + XA z = -XZe, WA s + SA w = -SWe. a a a a Actually the affine scaling direction A a - z + w, (3.4) is the predictor direction. It then used in two ways. One is to approximate the nonlinear terms in the right-hand side of (3.2). The other one is to dynamically estimate /x. Mehrotra use the affine scaling direction A a to performs the standard ratio test on both the primal and dual variables to determine the maximum stepsizes 6p and tip which preserve nonnegativity of (x,s) and (z,w). min{——,A Xj j -A Xj a a min{—- —,A Zj 1 j -A Zj a bp = 0.99995<5 , 8 = 0.99995^ , D a < 0},min{——,A Sj j -A Sj a < 0},mm{—-—,A Wj j < 0} a -A Wj a < 0} a P c where the 0.99995 factor keeps the iterates from actually touching the boundary. Chapter 3. Predictor-Corrector Methods 32 The predicted complementarity gap that would result from a step in the affine direction is 9a = (x + S A x) (z T P a + 6 A z) D + (s + S A s) (w + 6 A w). T a P a D (3.6) a Lustig's[67] estimate for p is then where g = x z T + sw T denotes current complementarity gap. The term g /g a measures the achievable progress in the affine scaling direction. Now (3.7) chooses a small p when good progress can be made in the affine direction and a large p when the affine direction produces little improvement. This means the poor progress in the affine direction generally indicates the need for more centering and hence a large value of p. If the affine scaling direction offers considerable progress in the reduction of the complementarity gap, then a more optimistic target is chosen. The actual new step (Ax, Ay, Az, As, Aw) is then chosen as the solution to = b- = u — x — s, = c - A y - z + w, ZAx + XAz = pe- XZe - A XA Ze, WAs + SAw = pe - SWe - A SA We. AAx Ax + A s A Ay-\-Az-Aw T Ax, (3.8) T Note that the full primal-dual affine scaling terms, A XA Ze a the right-hand side of (3.8). The step lengths S P a a a a a and A SA We, a a are added to and 8p defined by (3.5) are used to compute p but not to modify the affine scaling correction terms. Thus the affine correction added is one that results from taking a full step of length one in the affine direction. Whenever a primal and dual full step of length one can be taken, primal feasibility and dual feasibility are achieved exactly within the numerical accuracy of the computations. So the solution to (3.8) can be Chapter 3. Predictor-Corrector Methods 33 written as where A x, A y, A z, A w a A y, c A z, c a a Aw = Ay + A y, Ax = Ax + A x, Az = Az c Aw = Aw + As = As + and A s a and A s c Ay c a a c + A z, a (3.9) A w, a c A s, a c are the solution to (3.4) and the correction terms a A x, c satisfy c AA x = 0, Ax + As = 0, + Az - Aw = 0, c c A Ay T c c c c (3.10) ZA x + XA z = fie - A XA Ze, WA s + SA w = fie- A SA We. c c c c a a a a According to (3.4) xz + Axz T + Azx T = sw + Azw T a T a T a + Aws T a = 0. (3.11) Tf the full step of one were achieved on the affine scaling step, the new complementarity gap would be (ar + A x) (z + A z) T a rr* — x z + Ax a = AxAz T a + (s + A s) (w T a a rri a z + Az a rrt x + Ax a + A w) a rrt Az + s w + As a a rri w + Aw a rry s + As a + As a a (3.12) T a a a a Aw + A s A w. Therefore (3.10) could describe a centered Newton step from the point x + A x,y A z,s f-rt and w + A w a except that the corrections A x, A y, A z, A s a a a a + A y,z a and A w a + have not been added to the diagonal matrices on the left-hand side of (3.10). Thus the centering correction terms are a Newton step from the point achieved by a full affine step, but using Chapter 3. Predictor-Corrector Methods the second-derivative matrix at the current point x,y,z,w 34 and s as an approximation to the Hessian at the point corresponding to a full affine step. From (3.4) and (3.10), we see that a single iteration of the predictor-corrector primal-dual method needs two solutions of the same large, sparse linear system for two different right hand sides. The iterate obtained from the predictor-corrector method can be viewed as a point on a quadratic path which approximates the central trajectory and pushes an iterate towards an optimum along such an approximation. Tapia et al.[96] demonstrate that Mehrotra's predictorcorrector methods is the level-1 composite Newton variant of the Newton interior-point method and prove that it has a quadratic local convergence rate. The computational results of Lustig[67] show that the predictor-corrector method almost always reduces the iteration count and usually reduces computation time. Furthermore, as problem size and complexity increase, the improvements in both iteration count and execution time become greater. Thus the predictor-corrector method is generally very computationally efficient. 3.2 Carpenter's Higher-Order Predictor-Corrector Method As we described in previous section, Mehrotra's predictor-corrector method first solves (3.4) for the predictor direction A , then it computes the barrier parameter fi as a function of both the a current point (x, s), (y, z, w) and A , and finally, it uses A a a to correct the centered direction that would be obtained by solving (3.10). The systems we solve to obtain the predictor or the corrector direction each involve the same matrix. Each of these directions is obtained based on the evaluation of the same derivative. The idea of the predictor-corrector procedure is to reduce the work required in the primal-dual interior point procedure by reusing the factorization required to solve the Newton system (3.4). The factorization phase is computationally much more expensive than the solve phase. The method in last section performs only one correction in obtaining the direction. It can Chapter 3. Predictor-Corrector Methods 35 easily be extended to perform several corrections in a multiple predictor-corrector procedure. Instead of solving (3.8) once at each step of the primal-dual interior point method, it can be solved repetitively with each direction corrected based on the previous direction. That is, we substitute at each step the Ax, Ay, Az, Aw and As terms found by solving (3.8) back into the right-hand side A . 0 Now, we outline the multiple predictor-corrector algorithm [19] which is invoked at each iteration of the I P M . Algorithm MPC (Multiple Predictor-Corrector) Given v = (x , s , y , z , w ) with x ,s ,z , k k k k k k k k w > 0. k k Step 1: Solve (3.4) for the affine direction A v. a Step 2: Compute p(v ,A v) k a using (3.7). Step 3: For i = 1 , . . . , nik (m : the number of corrections in iteration k) do k solve the following system for Av : 1 AAx - 1 Ax + As 1 -- u - x — s, + Az - Aw { = c - A y - z + w, = ue-XZe- 1 A Ay T i b - Ax, 1 ZAx i + XAz WAs { + SAw i l = (3.13) T AX^AZ^e, . u-e-SWe-AS^AW^e. end do. Define Av = A / / " * . Step 4: Perform the ratio test to determine primal and dual steplength ap and ap. Chapter 3. Predictor-Corrector Methods Step 5: Move to the new point v k+1 36 defined by k+l = x + otpAx, k+l = s + apAs, y = y + a Ay, k+i = z + a Az, = w + aoAw. x s h+1 z k k k k (3-14) D D The primal and dual step lengths ap and ap> are chosen to insure the nonnegativity of ' the variables x,s,z and w. Given v and a direction Av, the ratio functions ap(v, Av) and ap(v,Av) are defined as ap(v, Av) = min ^ m i n { — ^ — , A x j < 0 } , m i n { — ^ — , A S J < 0 } | , a (v,Av) = min «j m j n { — ^ j - , AZJ < 0 } , m i n { - ^ - , AWJ < 0} j , a P = 0.99995ap(i/, Av), a D = 0.99995d£>(i/,Ai/), D ^ ^ ap and ap> are the steps that are actually taken in M P C . Carpenter et al. [19] give a heuristic strategy for determining m^: the number of corrections in iteration k. They make different consideration on feasible point and infeasible point. For feasible points, let g be the complementarity that would result from taking a step in l the direction obtained after i corrections. If g < g ~ and i is less than some maximum number 1 l l df corrections, then an (z + l)st correction is performed. If g > g ~ , then we stop correcting l and use the direction Av ' . l l Therefore, another correction is considered only if the previous 1 1 one decreased the complementarity. When correcting from an infeasible point, the value rrik is chosen so that the algorithm reduces complementarity while reducing infeasibility. Carpenter et al. [19] define G' = g + (1 + a )\\dv\\ + (1 - apXWdplU + \\d \U). x D x u (3.16) Chapter 3. Predictor-Corrector Methods 37 where dp, do and d are the primal, dual and upper bound infeasibilities at the current point u which are denned as dp — b — Ax, dp, = c - A y - z + w, d u — x — s. u — T (3-17) If i is less than the allowable maximum and G < G ~ , then an (z + l)st correction is performed. x l l Carpenter et al. proved that the multiple predictor-corrector algorithm is equivalent to the level-m composite Newton method. Although the use of higher order terms usually results in the reduction of the number of iterations to reach optimality, it does not guarantee the total time savings due to the increased effort in a single iteration. Computational results indicate that on average, the second order methods seem to be the most efficient. In the case that Cholesky factorization is very expensive compared with a single solve of Newton equations, we prefer to use higher (than two) order method to save the number of factorizations. But Carpenter's higher-order method has difficulty of building an accurate higher order approximation of the central trajectory. The method presented in the next section solves this problem. 3.3 Gondzio's Multiple Centrality Corrections Method Gondzio offered a multiple centrality corrections method in [43] . This method combines the multiple centrality correction with a choice of reasonable well centered targets that are supposed to be easier to reach than perfectly centered analytic centers. The algorithm takes special care on restoring centrality of the next iterate and, at the same time, to increase stepsize in the primal and dual spaces. The effort of multiple corrections does not concentrate on reducing the complementarity gap. He believe the complementarity gap will be sufficiently reduced if a long step along a primal-dual affine scaling direction is made. Driving the primal-dual point as close to the central path as possible is thus an investment that is expected to pay off in the ability to make a larger step in the next iterate. Chapter 3. Predictor-Corrector 38 Methods This method uses the second order predictor-corrector technique which introduced in Section 3.1 to produce the predictor direction. The following corrector terms concerne improving the centrality of the subsequent iterate and with increasing the stepsizes in primal and dual spaces. For a given point (x, s) and (y, z, w), the quality of centrality is measured by ^ , , , ^ . ± ^ . 0 ) ± [ ^ . ^ ' , + (3,8, or 6(x,s,y,z,w,fi) = \\Xz-pe\\ + \\X~ Z~ e x - x + \\Sw -e\\ - pe\\ + WS^W^e- p -e\\. (3.19) p What really hurts a primal-dual algorithm is a large discrepancy between complementarity products XjZj and SjWj. with their average p Both too small and too large complementarity products, compared = (x z T a + s w)/(2n), are undesirable. T The step A of (2.75) aims at drawing all complementarity products to the same value p. Usually, there is little hope to reach such a goal. Gondzio's[43] approach combines the choice of targets[56] that are supposed to be easier to reach with the use of multiple correctors. Assume (x, s) and (y, z, w) are primal and dual solutions at a given iteration of the primaldual algorithm, and x,s,z and w are strictly positive. We also assume that the predictor direction A at this point, and the maximum stepsizes in primal ap and dual a n are determined. p The algorithm looks for a corrector direction A m such that larger stepsizes in primal and dual spaces are allowed for a composite direction A = A + A p To enlarge these stepsizes from a P 6 , l ) , respectively, a corrector term A a m . (3.20) and OLD to dtp = min(ap + 6 , 1) and dx> = min(ajr, + a m has to compensate for the negative components in the primal and dual variables (i,S) (y,z,w) = (x,s) = (y,z,w) + ap{A x,A s), p + a (A y, D p p A z,A w). p p Chapter 3. Predictor-Corrector Methods 39 The algorithm tries to reach the goal by adding the corrector term A m that drives from this exterior trial point to the next iterate (x, s, y, z, w) lying in the vicinity of the central path. However, there is little chance to attain the analytic center in one step, i.e., to reach the point v = (pe,pe) e Tl , 2n (3.22) in the space of complementarity products. Hence the algorithm computes the complementarity products i) = (Xz, Sw G TZ ) and 2n concentrates its effort only on correcting their outliers. Thus, these complementarity products are projected component-wise on a hypercube H = [/?minAi,/3maxM] "- This gives the following target 2 v = v(v\H) € n . 2n (3.23) t The corrector term of the direction is defined by the following system of linear equations 0 A O 0 0 7 / 0 0 Ax m 0 0 Ay m 0 -I A s 0 0 A 0 I Z 0 0 X 0 A 0 0 W 0 s Aw T m M (3.24) 2 v -v t m The right hand side in the above system of equations has nonzero elements only in a subset of positions of v — v that refer to the complementarity products which do not belong to the t interval (/Smin/^AnaxA )1 Once corrector term A m is computed, the new stepsizes ap and &D are determined for the composite direction A = A + A p m . (3.25) Then the primal-dual algorithm can move to the next iteration. The correcting process can easily be repeated a desirable number of times. Direction A of (3.25) becomes a new predictor A p for which a new trial point is computed from (3.21) The point in the complementarity products space is then used to define the new target (3.23). Then, Chapter 3. Predictor-Corrector Methods a new modified centering direction A m 40 from (3.24) is computed and added to the predictor term as in (3.25). In this algorithm, correcting terminates when the stepsizes in the primal and dual spaces ap and ap determined for a composite direction (3.25) do not increase sufficiently compared with the stepsizes ap and a n found earlier for a predictor direction. That is, we stop correcting if ap < ap + -yS or a ap, < ao + fS , a (3.26) where 7 is some prescribed tolerance. The computational experience of [43] shows that this method saves significant G P U time over the Mehrotra's predictor-corrector method. The Higher Order Primal-Dual Method ( H O P D M ) [45] which we will use in Chapter 6 is an implementation of this algorithm. Starting Point 3.4 All interior point methods(including the infeasible IPMs) are very sensitive to the choice of an starting point. The choice of a good starting point for an I P M is still an open question. Good guesses of (a: , s ° , y°, z°, w°) can reduce the computational effort considerably. One would 0 like the starting point to be well centered and to be as close to primal and dual feasibility as possible. Surprisingly, points that are relatively close to the optimal solution (but are not well centered) often lead to bad performance and/or numerical difficulties. The starting point in most implementations of primal-dual infeasible IPMs [45, 76, 66] are some variation of the approximate solution of the following auxiliary Q P problem which is given in [4]: minimize subject to c x + ^(x x T T + Ax = 6, s s), T (3.27) x + s = u, where g is a predetermined weight parameter. A solution of (3.27) can be given by an explicit Chapter 3. Predictor-Corrector Methods 41 formula and can be computed at a cost comparable to a single interior point iteration. It is supposed to minimize the norm of the primal solution (x,s) and promote points that are better in the sense of the L P objective. As the solution of (3.27) may have negative components in x and s, those negative components are pushed towards positive values sufficiently bounded away from zero (all elements smaller than 6 are replaced by 6, say, 6 = 1). Independently, an initial dual solution (y,z,t) is chosen similarly to satisfy y = 0 and the dual constraint. Again, all elements of z and t smaller than 6 are replaced by 6. Chapter 4 S o l v i n g the N e w t o n E q u a t i o n s In section 2.5, we obtained Newton equations (2.75). If we do not require that the upper bound constraint a; + s ° = u is satisfied initially, i.e. we allow the method to iterate to bound 0 feasibility in precisely the same manner it iterates to primal and dual feasibility, then we get the Newton equations system A 0 0 0 0 Ax I 0 J 0 0 Ay 0 I -I As 0 A T 6> = Z 0 0 X 0 Az lie - XZe 0 0 w 0 s Aw fie - SWe where £ c - b- Ax, = c - Ay T - z + w, denote the violation of the primal, the upper bound and the dual constraints, respectively. After elimination of Az = X~\fie- AS = £ - Aw = S^ifie- u XZe - ZAX), Ax, SWe- WAs) = S~\fie 42 - SWe- W{ u + WAx), Chapter 4. Solving the Newton Equations 43 it reduces to D~ A Ax r A 0 Ay h 2 T where D = (X^Z r = Z -X-\ue-XZe) h = 6- 2 + S- W)-\ l c + S- (ue-SWe)-S- W£ , 1 1 u This form of the Newton equations system usually is known as the augmented system. The coefficient matrix of this system is symmetric indefinite and is easier and cheaper to factorize than the original form. We can further reduce (4.2) to the normal equations (AD A )Ay 2 T = g = AD r 2 + h (4.3) by eliminating A x from (4.2). The matrix for normal equations is positive definite and symmetric, has smaller size (m x m), and may be more dense. The solution of the Newton equations system is the computationally most involved task in any interior point method. All IPMs solve an identical system of linear equations. difference is in the value of the diagonal matrix D 2 The only and the right-hand side. The coefficient matrix in the Newton equations system is usually large and sparse, since the constraint matrix A is itself large and sparse in most applications. Both linear systems (4.2) and (4.3) can be solved using either direct or iterative methods. These two approaches are not completely disjoint; the process of constructing a preconditioner for iterative methods is in many ways similar to direct factorization [41]. The choice of the method (direct or iterative) depends on a large number of factors. Karmarkar in [58] gave a detailed discussion of these factors and their influence on the choice of the method. Direct methods involve the factorization of the system coefficient matrix, usually obtained through Gaussian elimination. Implementations of methods in this class require the use of Chapter 4. Solving the Newton Equations 44 specific data structures and special pivoting strategies, in an attempt to reduce the amount of fill-in during Gaussian elimination. Notable examples of software using this type of technique are SPARSPAK[34], YSMP[30], and MA27[29, 28]. Iterative methods generate a sequence of approximate solutions to the system of linear equations usually involving only matrix-vector multiplications in the computation of each iterate. Methods like Jacobi, Gauss-Seidel, Chebychev, Lanczos (see [41] for a description of these methods) and the conjugate gradient are attractive by virtue of their low storage requirements, but these methods have slow convergence unless an effective preconditioner is applied. For IPMs, iterative methods, e.g., conjugate gradient algorithms, are not competitive in the general case due to the difficulties in choosing a good and computationally cheap preconditioner. Some success with iterative methods for special L P problems has been obtained [84, 89, 85]. Resende indicated in [88] that the direct method is not appropriate for solving Newton equations system that occurs in network flow problems. This is because the AD A 2 T and its Cholesky factor are much denser than the matrix A, even if the incidence matrix A has no dense columns. Portugal and Resende[85] used an iterative method to solve the minimum cost network flow problem. They used a heuristic to select the preconditioner. The initial selection is the diagonal preconditioner, since it tends to outperform the other preconditioners during the initial interior point iterations. If the number of conjugate gradient iterations exceeds \frn~, the diagonal preconditioner is changed to the spanning tree preconditioner. For general purpose IPMs, a direct method is better than an iterative method to solve the Newton equations. Hence, almost all the available implementations of the general purpose IPMs use a direct approach to solve the Newton equations and they all use some variant of the symmetric triangular LAL T decomposition, where L is a lower triangular matrix and A is a block diagonal matrix with blocks of size 1 or 2. A n alternative direct approach is the QR decomposition of A. Although this approach uses orthogonal transformations and guarantees high accuracy of solutions, it cannot be used in practice since it is prohibitively expensive. Another more efficient approach for solving Newton equations system is called the adaptive Chapter 4. Solving the Newton Equations 45 approach. This approach combines the advantages of both the direct and iterative methods. It uses an adaptive automated procedure to determine whether to use a direct or iterative method. Computational results show this approach is more efficient than a pure direct method. 4.1 Direct Methods From the previous discussion, we get the conclusion that one practical approach to solving the Newton equations in general purpose I P M codes is the direct method using LAL T decomposi- tion. There are two major alternative approaches corresponding to the solving of the normal equations (4.3) and the augmented system (4.2). t. 4.1.1 The Normal Equations Approach Many current implementations [57, 38, 67, 72, 76] solve the normal equations directly by using Gaussian elimination to compute the Cholesky factorization LAL T — AD A . 2 T A major advan- tage of this approach is.that it is based on a symmetric positive definite matrix guaranteed to yield a factorization of the form LAL , where L is lower triangular with unit diagonal and A is T diagonal with strictly positive entries. Given the LAL T decomposition, A y is easily obtained from (4.3) by solving two triangular systems. Moreover, the sparsity pattern in the decomposition is independent of the value of D 2 and hence is constant in all I P M iterations. Therefore, a good sparsity-preserving pivot order can be chosen with much care, even if it involves considerable computational effort, since it will be used extensively throughout the whole solution process. The sparsity-preserving sequence of pivots (ordering) in which the symmetric Gaussian elimination is performed can be defined in advance, i.e., before the numerical operations start. Thus the analysis phase is completely separated from the factorize phase [27] and the analysis phase has to be done only once, before the whole solution process. The success of the implementation of the Cholesky factorization depends on the quality of its analysis phase, i.e. reordering for sparsity. Its goal is to find a permutation matrix P such that the Cholesky factor of PAD 2 AP 1 T is the sparsest possible. Chapter 4. Solving the Newton Equations 46 The overall solution of the normal equations is typically divided into four distinct independent phases: 1. Find an appropriate ordering P for AD A . 2 T 2. Set up a data structure for L, the Cholesky factor of 3. Numerically factor PAD A P 2 4. Solve (LAL )(PAy) T T T into PAD A P . 2 T T LAL . T = Pg. Step 1 and step 2 are the analysis phase. They depend only on the structure of AD A 2 T and are independent of its numerical values. Step3 and step 4 are the factorize phase. Finding the permutation that yields the minimum fill-in in the Cholesky factors is an NPhard problem[lll]. As a practical consequence, we cannot expect to find an efficient algorithm that identifies an optimal ordering. Hence, heuristics are devised that efficiently compute a permutation that approximates the effect of the optimal one. Two such heuristics, namely the minimum degree and the minimum local fill-in orderings [27, 34, 35] are particularly useful in the context of I P M implementations. Now we give a brief description of these two heuristics. • Minimum Degree Ordering The minimum degree ordering algorithm follows the Markowitz criterion[69], which is designed for unsymmetric matrices. Markowitz observed that in the kth step of an sparse Gaussian elimination the local best pivot candidate a j is the one that minimizes t fa = (n - i)( Cj - l), (4.4) where ri and Cj are the numbers of nonzero entries of row i and column j in the kth Schur complement. The value fa gives the number of floating point operations required by the A;th step of Gaussian elimination and, at the same time, estimates the potential fill-in caused by this step. Chapter 4. Solving the Newton Equations 47 Tinney and Walker[97] applied this strategy to symmetric matrices which results in the minimum degree ordering heuristic[92, 97]. This heuristic can be regarded as an attempt to minimize the number of arithmetic operations in the next stage of the Gaussian elimination algorithm, and as an attempt to select the pivot column that introduces the least fill-in in the next stage. This local minimization strategy does not guarantee a global minimum for either the amount of fill-in or total number of arithmetic operations performed during Gaussian elimination. Nevertheless, the strategy has been proved to be very effective in reducing arithmetic operations and fill-ins. When symmetric positive definite systems are considered, pivot selection is restricted to diagonal elements, hence the Markowitz merit function simplifies to f = {c -lf 3 3 ' " (4.5) which leads to a simple rule that the best candidate for the pivot column is, among the columns in the portion of the matrix still to be factored, the one with the minimum number of nonzero entries. Interpreting this process in terms of the elimination graph[35] one sees that this is equivalent to the choice of the node that has the minimum degree, which gave the name to this heuristic. The minimum degree ordering algorithm can be implemented efficiently both in terms of speed and storage requirements. George and Liu in[35] described several powerful enhancements of the basic algorithm. The various methods that have been developed to improve the performance of the basic minimum degree algorithm are mass elimination, indistinguishable nodes(supernodes), incomplete degree update, element absorption and multiple elimination. George and Liu[35] also discussed the tie-breaking strategy used to select the next pivot column among all the columns matching the minimum number of nonzero entries. • Minimum Fill-in Ordering The minimum fill-in ordering heuristic results from the effort of trying to reduce the Chapter 4. Solving the Newton Equations 48 amount of fill-in further than the Markowitz criterion. The function (4.5) considerably overestimates the expected number of fill-ins in a given iteration of the Gaussian elimination because it does not take into account the fact that in many positions of the predicted fill-in, nonzero entries already exist. It is possible that another node, although more expensive in term of (4.5), would produce less fill-in as the elimination step would mainly update already existing nonzero entries of the the Schur complement. The minimum local fill-in ordering heuristic selects, at each stage of the Gaussian elimination procedure, the pivot element that introduces the minimum amount of fill-in. To count predicted fill-in one has to simulate the elimination step, which is quite an expensive operation. Generally, the minimum local fill-in algorithm produces a sparse factorization but at higher initial cost to obtain the ordering. The motivation for introducing the minimum local fill-in heuristic in I P M is straightforward. IPMs solve a sequence of systems of linear equations sharing an identical nonzero structure. Therefore, the ordering procedure will be executed only once at the beginning of the algorithm, and the resulting permutation remains valid for the remaining iterations. By contrast, the Gaussian elimination procedure is repeated in every iteration of the I P M and any computational savings achieved by a better ordering is multiplied by the total number of iterations of the algorithm. Moreover, the effort of ordering a matrix according to either heuristic does not constitute a significant portion of the algorithm's total computation effort. Therefore, the extra effort involved in performing the minimum local fill-in heuristic does not impact significantly the algorithm's running time. The normal equations approach has been observed to work well for many linear programs. Nevertheless, it is known to encounter difficulties in two common situations. The first drawback of the normal equations approach is that dense columns in A can create unacceptably dense matrix AD A 2 T and thus unacceptably dense Cholesky factors L. There are several methods to solve this problem. Adler[2] removes the dense columns from A, and then corrects for their absence by a conjugate gradient method to solve the normal equations using Chapter 4. Solving the Newton Equations 49 the incomplete Cholesky factors as preconditioners. Gill et al.[37] and Choi et al.[24] employ the Schur complement algorithm and partition matrix A into sparse and dense columns. Let A = [A x where A\ G TZ ( - ) and A2 G TZ mX n k mxk A ], (4.6) 2 are matrices built of sparse and dense columns, respec- tively. The Schur complement mechanism based on (4.6) and an explicit decomposition of the matrix AD A 2 into a presumably sparse part A\D\A\ T = AiD Aj + A D\A 2 T 2 2 (4.7) and a significantly denser symmetric rank k update of it. A Cholesky decomposition is then computed for the "sparse" part and the dense rank-A; update is handled via the Sherman-Morrison-Woodbury formula. This method is not guaranteed to work correctly because the sparse part may be rank deficient. Whenever this happens, the Cholesky decomposition of A\D\A\ does not exist and the Sherman-Morrison-Woodbury update is not well defined. Therefore in a practical implementation a small diagonal regularization term is added to A\D\A\ such that the decomposition exists. The method usually works satisfactorily for a small number of dense columns. Andersen[5] proposed a remedy to the rank deficiency arising in the Schur complement mechanism. Lustig[67] proposed an algorithm which combines Adler's and Gill's method efficiently. This algorithm monitors the spread between the largest and the smallest diagonal elements of the factorization L\AL\ = A\D\Aj. When this spread becomes large, all smaller diagonal elements are set to 1 and the remaining elements of these columns are set to 0. These Cholesky factors are used as preconditioners for a preconditioned conjugate gradient method with the dense columns added to the matrix A. This method eliminates rank deficiency and removes much of the instability. A more numericaDy sound alternative is based on expanded normal equations formed by "splitting" dense columns[42, 104], but the efficiency of this kind of modification is not fully established. In any case, all of these approaches have the drawback of requiring that dense columns be identified in A in some arbitrary way. Chapter 4. 50 Solving the Newton Equations The second difficulty occurs when the linear program contains "free variables" which are not explicitly bounded above or below. To transform the problem to the standard form, any free variable has to be replaced with a difference of two nonnegative variables: Xf = x + — x~. The presence of logarithmic terms in the objective function causes very fast growth of both split brothers. Although their difference may be kept relatively close to the optimal value of xj, both x + and x~ tend to infinity. This results in a serious loss of accuracy in (4.6). A remedy used in many I P M implementations is to prevent excessive growth of x + and x~. The difficulties of the normal equations motivated several researchers to pay special attention to the augmented system form of the Newton equations which allows more freedom in the pivot choice. 4.1.2 T h e Augmented System Approach A n alternative approach that avoids the difficulties of the normal equations is to apply elimination directly to factor the larger "augmented system" (4.2). This approach is based on performing a Bunch-Parlett[18] factorization to the symmetric indefinite matrix -D~ A A 0 2 K This factorization has the form PKP T = LAL , T T (4.8) where P is a permutation matrix and A is an symmetric indefinite block diagonal matrix with l x l and 2 x 2 blocks. Because the matrix is symmetric indefinite, the equations (4.2) can not be solved quite so easily as the normal equations. There exists a stable elimination procedure for symmetric indefinite systems, due to Bunch and Parlett[18], but it must allow for certain 2 x 2 pivots along the diagonal in addition to the usual l x l pivot elements. The factorization LAL T is guaranteed to exist[39], but unlike the positive definite case(normal equations approach), the pivot order cannot be computed symbolically because the permutation choice is based on numerical values and both the sparsity and stability of the triangular factor. Thus, the pivot order cannot be fixed once in the analysis phase, but must be recomputed as x and z are updated in each Chapter 4. iteration. Solving the Newton Equations 51 On the other hand, due to the greater freedom in the choice of the pivot order, the augmented system factorization may produce significantly sparser factors than that of the normal equations. Actually the normal equations are a special case of the augmented system (4.8) in which the first n pivots are chosen from the D 2 part regardless their stability properties and without any concern about the fill-in they produce. Since D is positive, no zero pivots are encountered, and after the nth stage of symmetric elimination the partially reduced matrix has the form -D- 0 2 0 (4.9) AD A 2 T Each stage of the elimination performs a rank-one update of the lower right mx m submatrix, and after n stages, AD A 2 T is explicitly formed. The remaining matrix is positive definite and the factorization Z . D £ ^ of AD A 2 m m T is guaranteed to exist. Moreover, the last m pivots can be performed in any order, using whatever efficient pivoting strategy is preferred. For example, the minimum degree or the minimum local fill ordering heuristics might be used. When the symmetric elimination is complete, we have K = LkDkLj where Dk is an (n + m) X {n + m) diagonal matrix of the form -D~ 0 0 D 7 0 -AD L 2 (4.10) m and 2 (4.11) m Hence, the normal equations approach is embedded in the augmented system approach as one particular pivot strategy. Vanderbei[105] call this pivot strategy the conservative pivot strategy. Solving (4.2) and solving (4.3) are roughly equivalent under the conservative strategy. Observing (4.9), when A has dense column ay, a,aJbecomes a fully dense m x m matrix. Consequently, AD A 2 T formed in the lower right hand corner is also fully dense. So if A has dense column, the conservative pivot strategy can dramatically impair the efficiency of solving the augmented system. Chapter 4. Solving the Newton Equations 52 The success of the augmented system factorization depends highly on the efficiency of the pivot selection rule. More freedom in the pivot choice opens the possibility of getting sparser factors than in the normal equations approach (e.g., the degrading influence of dense columns can be avoided). A popular strategy for pivot selection rules is to detect "dense" columns and then to pivot out the columns in the diagonal positions of D~ 2 corresponding to the nondense columns of A. A difficulty arises with the choice of a threshold density used to group columns of A into the sparse and the dense parts in (4.6). A fixed threshold value approach works well only in a case when dense columns are easily identifiable. Whenever more complicated sparsity structure appears in A, a more sophisticated heuristic is needed. Maros and Meszaros [70] give a detailed analysis of this issue. Instead of (4.6), they consider the following partition of the L P constraint matrix r (4-12) A A A\ A n 12 2 22 where An is supposed to be very sparse and additionally it is assumed to create a sparse adjacency structure AnA , A\ u 2 is a presumably small set of "difficult" columns, e.g., dense columns or columns referring to free variables, and [^21^22] is a set of "difficult" rows. A n efficient heuristic to find such a partition is given in [70]. Therefore (4.2) becomes 2 0 2 T 21 Axi A A T 2 u 22 A Ax T2 2 An Au 0 0 A 2/1 hi A\ A22 0 0 Aj/2 h2 2 Elimination of A 0 -Di (4.13) causes very limited fill-in and reduces the matrix to A A T T •D2- 2 22 12 A A A AnD A AnDJAl A A D\A A D\Al 2 12 22 u 2l n 2l x (4-14) Chapter 4. The ehmination of the D block should be delayed after all attractive pivot candidates from 2 2 A\\D\A\ and A \D A blocks are exploited. Fill-in can be reduced dramatically by pivoting 2 X 53 Solving the Newton Equations 2 2L out the dense columns of A as late as possible. The normal equations approach makes no such a distinction and pivots out both D, and D 2 blocks. 2 2 This technique for delaying the elimination of dense columns is algebraically equivalent to the well-known Schur-complement technique for handling dense columns. Observe that the normal equations An _ ' An A ^21 D\ T 0 _ 22 A 0 9i 21 A A T AJ 2 ^22 . A y 2 . (4.15) 92 _ can be replaced with the following system A^DJAT, AuD A A D Aj, A D\A DA DA 2 21 2 T 21 2X T 2 2X 2 A2/1 A22D2 A2/2 12 2 -I r l2 AD 22 9i = 92 Ai/ (4.16) 0 in which all "difficult" columns are handled as a symmetric rank-k update of an "easy" part (cf.(4.7)) A u A\ D 2 D ilA 2 n A12 2 A* } + [D Al 2 2 D A ]. T 2 2 22 (4.17) AD 22 2 We can find that the matrix involved in the system (4.16) has exactly the same sparsity pattern as that in (4.14). In the implementation of an augmented system approach, to save on the expensive analysis phase, the pivot order is reused in subsequent I P M iterations and only occasionally updated when the numerical properties of the Newton equation matrix have changed considerably. In every I P M iteration, the augmented system approach tends to require fewer arithmetic operations than the normal equations approach to solve for ( A x , A j / ) , because it computes sparser factors. Summing up, the augmented system approach has the following advantages: 1. Good accuracy properties. Chapter 4. Solving the Newton Equations 54 2. Easily generalizable to exploit the sparsity of K K T systems arising in nonseparable quadratic programming and linear complementarity problems. 3. Naturally handles free variables. If Xj £ (—oo,+oo), then D~ 2 in (4.2) in replaced by zero. 4. Dense columns of A do not degrade its efficiency, do not lead to significant fill-in. It has two important disadvantages: 1. It is more complicated to implement than the normal equations approach. 2. The analysis and factorize phases cannot be separated and the factorization is more expensive than in the normal equations approach. Computational practice [4] shows that both methods are important and have their own advantages. It would be beneficial to have both of them implemented as well as to have an analyzer that is able to determine which of them should be used [70]. 4.2 The Adaptive Approach Wang et al.[107] propose an efficient adaptive approach for solving the Newton equations in I P M . They use an adaptive automated procedure for determining whether to use a direct or iterative solver, whether to reinitialize or update the preconditioner, and how many updates to apply. This algorithm exploits the advantages of both the direct and the iterative methods and avoids the disadvantages of both of them. This algorithm uses normal equations to solve the Newton equations. The ideas in this algorithm can be extended to the augmented system. Now we give a brief description of Wang's algorithm. In the first iteration of the algorithm, the Newton equations are solved using a direct method. Starting from the second iteration, the algorithm uses a preconditioned conjugate gradient method. The preconditioner for each iteration is determined by factoring the current Chapter 4. 55 Solving the Newton Equations matrix K = AD A 2 T or by updating the current preconditioner. The determination of whether to update the current preconditioner or refactor the matrix AD A 2 T to obtain a new precondi- tioner based on the cost of determining the previous search direction or the predicted cost of determining the current iteration. If either of the two costs is high, the algorithm reinitializes the preconditioner by factoring the current matrix K = AD A . 2 T Otherwise, it updates the current preconditioner. The number of Cholesky updates is changed adaptively over the course of the algorithm in order to improve efficiency. It is increased if the previous search direction took many iterations and decreased if the previous search direction took a very small number of iterations. After computing the preconditioner, the algorithm solves the normal equations using the preconditioned conjugate gradient method. It gives a maximum number of iterations allowed. If the preconditioned conjugate gradient iteration number exceeds the maximum number, then the current preconditioner is abandoned and a new preconditioner is determined by Cholesky factorization. If this happens twice, the iterative method is not suitable and the algorithm switch to a direct method. The iterative method will continue until the relative duality gap is small enough. In this situation, the iterates are close to the optimal solution and accuracy requirements are high. The elements in matrix D vary significantly and make the matrix K = AD A 2 T very ill-conditioned. The Cholesky factorization of K may not generate a good preconditioner, even if stable methods are used. For all of these reasons, a direct method is used to determine thefinalsearch direction. It also switches to a direct method when a Cholesky factorization has a zero on the diagonal (i.e. the diagonal of the preconditioner is singular). The algorithm prefers to use the preconditioned conjugate gradient method in early iterations of IPM because of the following advantages. 1. The conjugate gradient method has better stability than the direct solver. Convergence can be achieved even if the factorization is quite inaccurate. 2. The storage requirement of the preconditioned conjugate gradient method is quite low, Chapter 4. Solving the Newton Equations 56 amounting to a few vectors of length m. 3. Since the accuracy requirements for the search direction in the beginning iterations of the I P M are quite low, only a few conjugate gradient iterations are required. 4. A good preconditioner may dramatically accelerate the convergence rate and gain great computational saving. Computational results in [107] demonstrated that this algorithm can enhance performance of interior point algorithms on large sparse problems. Chapter 5 Presolve Analysis Many LP problem can be simplified to a form that is faster to solve. It is beneficial to implement a presolve procedure in an LP solver to detect redundancy and to remove it. A properly formulated LP problem must contain as few variables, constraints, and nonzeros as possible, it must be well-scaled and the constraints must be linearly independent. The issue of presolve analysis has been addressed by many authors [1], [17], [68], [3], [101]. Andersen[3] and Gondzio[44] proposed the presolve procedures for IPM. We consider the primal and dual LP problems (2.50) and (2.52). The computational complexity of IPM algorithms is dependent on the number of constraints (m) and variables (n) in (2.50). The sparsity of A is more important because only nonzeros in A are stored and the work is dependent on the number of nonzeros. These require that the presolve procedure should reduce the size of A without creating any new nonzeros in A. The presolve procedures for IPM can be categorized into three classes: 1. Reduction of problem size. 2. Improving sparsity of the problem. 3. Removing linear dependent rows and columns. 5.1 Reduction of Problem Size 5.1.1 Simple Presolve Methods • An empty row: 3i:aij = 0,Vj. 57 (5.1) Chapter 5. Presolve Analysis 58 A n empty row must must either be redundant or infeasible. • A n empty column: 3j : dij = 0 , V i . . (5.2) Dependent oil the bounds on variable Xj and its objective coefficient Cj, variable Xj is fixed at one of its bounds or the problem is unbounded. • A n infeasible variable: 3j : lj > Uj. , (5.3) The problem is trivially infeasible! • A fixed variable: (5.4) 3j : lj = Vj. Variable Xj can be substituted out of the problem. • A singleton row: 3(i,k) : dij = 0,Vj A n equality type singleton row fixes variable ^ k,dik Xk at level ^ 0. bi/dik- (5.5) If t\/a -fc < Ik or &;/a fc > t 4 Uk, then the problem is infeasible. The elimination of an equality type singleton row may create new singleton rows and the process is repeated. A n inequality type singleton row introduces a new bound on an appropriate variable. Such a bound can be redundant, tighter than one already existing or infeasible. 5.1.2 Forcing and Redundant Constraints The simple bounds on the primal variables (2.50) are exploited to detect forcing and redundant constraints. We define the following lower and upper limits for a constraint i Ps= ]C (5.6) 59 Chapter 5. Presolve Analysis and b = YI iJ h a (5-7) u t jeP,. where P,» = {j : a,j > 0} and iV,* - {j : a,-j < 0}. We have bi<Y2dijXj < bi, (5.8) J for any Xj that satisfies 0 < XJ < u . 3 We now consider the "less than or equal to" constraint Y^ ii j a < f x (5.9) b j ' ' " • A similar analysis applies to a "greater than or equal to" constraint. The following four possibilities may occur. • A n infeasible constraint: 3i : b, < bi. (5.10) The problem is infeasible. • A n redundant constraint: (5.11) 3i:b~<b . t Constraint i is redundant, row i can be eliminated. • A forcing constraint: 3i:bi = bi. (5.12) If 6i = bi then due to linearity the only feasible value of Xj is 0 if j € Pi*(j € Ni*). Therefore, we can fix all variables in the ith constraint. • A n non constructive constraint: 3i:b <b 1 l <b~i. Constraint i cannot be eliminated. This case is nonconstructive. (5.13) 60 Chapter 5. Presolve Analysis 5.1.3 Tightening Variable Bounds and Implied Free Variables The non constructive case bi<bi<F (5.14) t can often be used to generate implied bounds on variables involved by (5.9). Equation (5.6) and (5.9) imply x < u' = (6; - bj)/a , Vfc G k k (5.15) ik and x >l' k k = u + {b ~b )la ^k£N^. k l i (5.16) ik These new bounds can be constructive (tighten appropriate bound), redundant or contradictory to the existing bound. When an implied finite bound is imposed on a free variable, the free variable does not have to be split and represented as the difference of two nonnegative variables. If an upper bound in (5.6) is infinite, i.e. u = +00 for some k € N{*, then bj_ is infinite. k There.is still an implied bound on x . Gondzio[44] gives k Xk > l' = (bi k ^2 if dijUj)/aik, k e N'i*. (5-17) jeN..-{k) i Implied free variables: If 0 < l' < u' < u , then the original bound 0 < x k k k k < u can be removed and x becomes k k free. • A free singleton column: 3(k,j) : (aij = 0,Vt ^ k,a kj ^ 0) A (lj = -oo) A (u = -foo). 3 For equality constraints, substitute Xj = (b — ^ a x )/a j k kg q k (5.18) in to the objective function, and the constraint k can be eliminated. Thus one constraint and one variable are removed from the problem without generating any fill-ins in A, although the objective function is modified. Chapter 5. Presolve Analysis 61 • A n implied free singleton column: 3(k,j) : (a - = 0 0,V»V k,a ± kj 0) A (0 < 1'3 < u) < u ) 3 (5.19) where l'j and u' - are computed according to (5.15) and (5.16) or (5.17). Then the variable } Xj can be treated as a free column singleton and the constraint k and variable Xj are removed from the problem. 5.1.4 Bounds on Shadow Prices and Dominated Variables All presolve procedures discussed in the previous section are concerned with the analysis of a primal formulation (2.50). The similar analysis can be used on the dual problem (2.52). If the upper bound of a primal variable is infinite, we get if UJ = +oo then Wj = 0. (5.20) The associated constraint in (2.52) becomes an inequality al ]V < CJ. (5.21) If a„j is a singleton column with an entry a j, that is k 3(k,j):(a = 0,\/i^k,a ^0), i3 k3 (5.22) then the inequality (5.21) become a singleton row for the dual problem (2.52) and can be used to produce a bound on the shadow price y k Assume all dual variables y have explicit (possibly infinite) bounds Pi<Vi<qi, t=l,2,---,m. (5.23) Define the lower and upper limits for all dual constraints as follows £/= o,ijPi+ 53 °»J9»> (5.24) Chapter 5. Presolve Analysis 62 and X) j = r where P*j = {i : a j > 0} and JV„j = {i : t + Yl HPi, (5.25) a < 0). Any feasible y satisfies Si ^ ]C 'j2/<- (5.26) a i The following possibihties may occur: 1. Cj < Cj (5.21) can not be satisfied, the problem is dual infeasible. 2. cj < CJ The induced cost Zj is strictly positive, so the variable Xj is dominated. It can be fixed at its lower bound Xj = 0 and eliminated from the primal problem. The dual constraint j becomes redundant. 3. c] = Cj (5.21) is always satisfied with Zj > 0. 4. Cj < Cj < cj The variable Xj cannot be eliminated. If the third case cj = Cj occurs, the variable Xj is called weakly dominated. It is claimed in [3] and [44] that under some additional conditions the variable can be eliminated. If Cj < Cj < c], we can apply the same technique as in last section to derive new bounds on the dual variables from (5.21),(5.24) and (5.26), we get: Vk < q'k = Pk + (CJ - Cj)/a j, k V7c € P*j, (5-27) and Vk > p' = Q + (CJ - cj)/a j, k k k VAreJV.j, (5.28) for dkj ^ 0. If the new bound improves the older one, we call it constructive. If it is redundant, we omit it. Finally, if it contradicts the older one, the problem is dual infeasible. The new bounds on y are used to recompute Cj and cj which in turn are used to detect more dominated columns. Chapter 5. 5.2 Presolve Analysis 63 Improving Sparsity of the Problem 5.2.1 Making the Matrix A Sparser For solving the Newton equation system, both the normal equations and the augmented system are strongly influenced by the sparsity structure of A. Therefore, we want to make the matrix A as sparse as possible. This Sparsity Problem(SP) can be described as: try to find a nonsingular matrix M G TZ mXm such that the matrix MA is the sparsest possible. Unless some simplifying assumptions are added, this problem is NP-hard[54]. Some efficient heuristics are proposed in [21]. These methods are relative expensive to reduce the number of nonzeros of A. Gondzio[44] gives an considerably simpler heuristic for improving the sparsity of A. He analyze every equality type row, say, a « and to look for all L P constraints with the sparsity t pattern being the superset of a;*. If a * is a superset of a;», then for any real a, a' = a * + aa^ k krt has the same sparsity pattern as a *. If a * is replaced by a' k k km k and b is replaced by b' = b + abi, k k k the changed linear program is clearly equivalent to the original one. A n appropriate,choice of a can cause elimination of at least one nonzero entry from a *, or k more, thus producing a sparser a' . kx Adler et al.[l] also gives a very simple and fast heuristic for improving the sparsity of A. 5.2.2 Splitting Dense Columns Sometimes even very sparse A may produce relative dense factors in the normal equations or the augmented system. If dense columns exist in A, the factors of normal equations can have a lot of fill-ins. Therefore if the number of dense columns is not excessive, we can split them into shorter pieces. A dense column d of A with ..., nonzero elements is replaced with a set of columns d\,d , 2 d such that k k . d = Y,dii=l (5-29) 64 Chapter 5. Presolve Analysis Adding the linking constraints 4-4 =0, +1 z= l , 2 , . . . , f c - 1 , (5.30) to force the variables xj, x ,,..., x^ associated with all the d to be the same. We have no longer 2 t large dense block of size structure AQA T x created by the dense column d. Instead, the new adjacency of the transformed problem contains k smaller dense windows resulting from piecesrf,-,for which we pay by adding k — 1 constraints (5.30). 5.3 Removing Linearly Dependent Rows and Columns 5.3.1 Linearly Dependent Rows Linearly dependent rows in A are undesirable because in an interior-point algorithm a certain linear system must be solved in each iteration. If A contains linearly dependent rows, the linear system has a singular matrix and does not have a unique solution. To find all the linearly dependent rows in A is computational too expensive. Therefore, most IPM codes try to find only simple linear dependent rows. One simple form of linear dependent rows is called duplicate row by Tomlin and Welch[101]. The definition is 3(i,k) : dij = j / a ; t j i ^ k . (5.31) The heuristic of section 3.6 to make A sparser naturally detects this situation if at least one of the constraints i and k is of the equality type. It must be noticed that due to primal degeneracy, even if the original A has full row rank, the normal equations matrix becomes rank deficient when the optimum is approached. Therefore, it is necessary to correct small rank deficiency of A in the optimization procedure, not only in the presolve procedure. Chapter 5. Presolve Analysis 5.3.2 65 Duplicate Columns We define columns j and k to be duplicate if they satisfy 3(j,k) : a,-j = va , j # k,Vi, ik (5.32) where v is a scalar multipher. We have that t = Cj: - 1/ yi ik a i V(Z% ~W* - C) = Cj + = Cj - vc + v(z* - w ). k k k k (5.33) k • Fixing a duplicate column: Cj--vc ^0 (5.34) k In this case, under some additional assumptions, one of the columns dominates the other. If Cj > vc then k (5.35) z*-w*>v{zi-wt). Now, if u •= +oo (i.e. w* = 0) and v > 0, or l k k k = - o o (i.e. z* = 0) and v < 0 then k (5.35) yields z* - w* > 0. (5.36) Variable Xj can then be fixed to its lower bound lj. If lj = — oo then the problem is dual infeasible. A n analogous analysis applies to the opposite case Cj < vc . k • Replacing two duplicate columns by one: Cj - vc — 0 k (5.37) Chapter 5. Presolve Analysis 66 Let us study the primal problem (2.50), we have minimize h CjXj + c x k subject to k + • • •, h a.jXj -f a. x + • • • = b, k ij Xj h < X k k ^**^ < (5.38) 1Jb j ^ U. K Define x = -VXJ + x' , k (5.39) k substitute (5.39) into (5.38) we obtain minimize h c x' + • • •, subject to (- a x'u + • • • = b, k k k h < %' K VXJ (5.40) < u. k Bounds of x' depend on the sign of v: k if v >0 then l' = l + vlj if v <0 then l' = l + k k k k VUJ and u' = u + k k VUJ, and u' = u + vlj. k k (5-41) (5-42) Therefore, the duplicate column procedure removes variable £ j from the problem and modifies the bounds on variable x according to (5.41) and (5.42). k 5.4 Postsolve After presolving the original problem (2.50), the reduced problem is solved by an I P M algorithm. In general, the optimal solution to the reduced problem is in different variables than the original problem. Therefore, a restoration procedure is needed to "undo" the presolving. To perform the "undo" procedure efficiently, one has to store some trace information on all presolve modifications done to the primal and dual variables, respectively. Gondzio [44] gave detailed description about the postsolve procedure. Andersen[3] and Chapter 6 Solving the Fractional (g, /)-factor Problem A (g, /)-factor of a graph G is a subgraph of G whose valencies are bounded between g and /. More formaUy, let G — (V(G), E(G)) be a graph on m vertices with multiple edges and loops allowed, let A,j denote the multiplicity of the edge G E(G) and let dc(i) denote the degree of vertex i in G. A loop counts 2 towards the degree. Let g = (#,- : i G V(G)) and / = (/,•: i G V'(Cr)) be vectors of positive integers satisfying Vi G V(G) : 0 < g < /,• < t d (i). G (6.1) A (g, /)-factor is a subgraph H of G with Vi G V(G): g < d„(i) < /,-. t If we allow the subgraph H to have fractional edges, i.e., a real vector x — (Xij '• (6.2) S E(G)) The fractional (g, /)-factor problem can be solved using network flow algorithms[6]. In this chapter, we use I P M to solve the fractional (g, /)-factor problem. We will exploit the special structure of the fractional (g,/)-factor problem and propose two I P M algorithms to take advantage of the special structure of the fractional (g, /)-factor problem. In Section 6.1 we develop Choi's parallel interior point algorithm for Network flow problem[23] to the algorithm for the fractional (p,/)-factor problem. In Section 6.2 we propose a preconditioned conjugate gradient based implementation of I P M which uses a efficient preconditioner for the fractional (g, /)-factor problem. 67 Chapter 6. 68 Solving the Fractional (g, f)-factor Problem There are several different ways to formulate the fractional (g, /)-factor problem as a linear programming problem. In order to make the corresponding algorithm in Section 6.1 and Section 6.2 more efficient, we use two different linear programming models of the fractional (g, /)-factor problem in these two sections. The model we use in Section 6.1 has relative fewer variables and is good for the parallel algorithm. The model we use in Section 6.2 has more variables, but for this model we can use a very efficient preconditions and therefore make the algorithm more efficient for solving the fractional (g, /)-factor problem. 6.1 The Parallel IPM for the Fractional (g, /)-factor Problem The fractional (g, /)-factor problem can be formulated as the following linear program: minimize ^ /,• + ^ i=l subject to — hi, i-l ^2 Xij ~ h < ~9i, f° i X Xij ~ hi < fi, for r i = 1, • • - ,"x, = 1,.. . , m , (6.3) (i,j)eE(G) o< x where X = (Xij = o - t o (i,j)e E(G), /,• > 0, for i = 1,..., m, hi > 0, for i = 1,..., m, G TZ , I = (/,• : i G V(G)) G Tl € E(G)) G TZ , gj n m m and h = (hi : i G V(G)) G TZ (U and hi represent the lower and upper deficiency of vertex i respectively). Here m we try to reduce the deficiency to zero. In matrix notation, the above L P problem can be formulated as the foDowing form: minimize c x, subject to i i < i, T • Xij<*ij, x>0, for (i,j)e _ (6-4) E(G), Chapter 6. Solving the Fractional (g, f)-factor Problem 69 where A b x C - = -A -I o A 0 -/ g 7£2mx(n+2m) e 7e , fV 2m T l-9 T G n , n+2m = lx T l T h] T T = tor -'n 2eL m } G 7^™+2m T c 0 denotes the vector of all zeros in lZ , e2m denotes the vector of all ones in 7Z , and A is the n n mxn 2m node-arc incidence matrix of the graph G = (V(G), E(G)), i.e., for each arc G E(G) y there is an associated column in matrix A with exactly two nonzero entries: an entry 1 in row i and an entry 1 in row j. Choi and Goldfarb propose a parallel interior point algorithm in [23]. They show that their algorithm can exploit various special structures of linear programs. We develop this algorithm for our fractional (g,/)-factor problem by taking advantage of the special structure of the fractional /)-factor problem. For convenience, we reproduce Choi's resultant algorithm in Section 6.1.1 and 6.1.2. In Section 6.1.2, we suggest using George's algorithm[34] to solve the tridiagonal linear equations system in order to save the required storage for the algorithm. This suggestion is not mentioned in [23]. In Section 6.1.3, the layering routine is proposed by Choi and Goldfarb[23] for the network flow problem. We add a modifying routine to construct the staircase matrix A for our fractional (g, /)-factor problem and make Choi's algorithm suitable for the fractional (g, /)-factor problem. 6.1.1 . The Interior Point Algorithm We consider a linear program of the following form: minimize subject to c x, T Ax = 6, Ex < u, x > 0, (6.5) 70 Chapter 6. Solving the Fractional (g, f)-factor Problem where c, x £ K , be Tl , u £ TZ , A n Comparing m k £ 7e and JY e l l mXn k x n . (6.5) with (2.50), we can see the only difference is: in (6.5) we change the upper bound constraints x < u in (2.50) into Hx < u. We study problem in this form because it captures the special structures of the fractional (g, /)-factor problem. We will give detailed description in Section 6.1.3. Adding slack variables to Hx < u, we get minimize c x, subject to Ax = b, T (6.6) Hx + s = u, x > 0, s>0, where c,x € TZ , s,ueTl ,be n TZ , A £ TZ k m mxn and H £ TZ . kxn The corresponding dual problem is maximize b y — u w, T T subject to A y — H w + z = c, T T (6.7) z > 0, w > 0, where ?/ £ 72 , 2 € K m n and w £ Tl . k Using the same procedure as in Section 2.6, we can get the Newton search directions: Ay = (AQ-*A )-\AQ-*p(p), Ax = Q-\A Ay-p{p)), Az = pX^en Aw = S-^pek - SWe As = T . T - Ze n -X^ZAx, k + (6.8) WHAx), -HAx, where Q = X~*Z + HTS^WH, (6.9) Chapter 6. Solving the Fractional (g, f)-factor Problem 71 and p(p) = pH S T e l k e - H We - pX T 1 k n + Ze . (6.10) n Using the Sherman-Morrison-Woodbury updating formula[41], we can write Q = (X Q- 1 _ 1 Z + ^S^WH)- = XZ' 1 1 - XZ~ H K- HZ- X, l T l 1 as (6.11) l where W~ S + K = (6.12) HXZ~ H . l l T In the predictor-corrector I P M , the search directions can be computed the same way. Now we consider the case where the matrix H has a block diagonal structure. In particular, let H have t blocks H 2 H = where H{ € TZ ', k,Xn H S~ WH T i — 1,...,/, 17Ji=i = k and YA=I i n and Q are square block diagonal matrices with n, x l = - n Then, the matrices diagonal blocks and K is a square block diagonal matrix with k{ x k{ diagonal blocks. Because of the block diagonal structure of Q and K, operating involving Q - 1 requires the solution of several smaller systems of equations, all of which can be done in parallel. Moreover, when H has a large number of rows relative to A, i.e. k > m, the speed-up obtainable by using parallel computation can be substantial. 6.1.2 Staircase Structure IPMs work well for staircase linear programs. If matrix A is staircase then the matrix AA T is block tridiagonal so that no fill-in occurs outside of the diagonal and off-diagonal blocks during Cholesky factorization. Thus, the search direction for such problems can be computed efficiently. Chapter 6. Solving the Fractional (</, f)-factor Problem 72 The algorithm described in Section 6.1.1 can take advantage of a constraint matrix A* that has the staircase structure Ai Aii Ai A 2 A ; 2 23 A* = I A A 32 A 3 e 7e(™+*)*". 34 A q+l,2q A i 2 2q+ We assume that A* has an odd number of row blocks. We also assume that all the constraints are equalities and we shall use m to denote the number of rows in the ith row block, i.e., in t submatrices of the form A{.. We decompose the constraint matrix into of rows of A* A* (A G TZ where m = J2i=o 2i+i) mXn (H € Tl and let H be the even blocks of rows of m where k - E L i kxn as follows: let A consist of the odd blocks 2i)- m Obviously, both A and H are block diagonal. Therefore, K and Q~ l are both square block diagonal matrices which can be easily obtained by (6.12) and (6.11). Let the diagonal matrix D = XZ~ be symmetrically partitioned to conform with the column partition (block l structure) of A*, i.e., D = diag{Di, Di , D , D , D ,..., 2 2 23 D 3 2 q t 2 q + i , D +i}, 2q and let W, S and K be symmetrically partitioned to conform with the row partition of H. Then, the q diagonal blocks of K and the 2q + 1 diagonal blocks of Q~ l K = W- S + H D HT, 1 i i i i (<2 )2i+i = D i, -1 are l,...,g, t= i = 2i+ 0,l,...,q, and (Q- ) = D -D H[K- H D , 1 where Hi = [A i-i A 2i<2 2i Moreover, AQ~ A l (AQ~ A ) i 1 T i+ T i = 1 2t i A i] l l i and Di = diag{D -i i, 2it2i+ 2i t2 l,...,q, D i, D 2 i} for i = 1 , . . . , q. 2it2i+ is block tridiagonal with diagonal blocks •= A i iD 2i+i(I 2i+ t2 2ii - A K~ A i i iD i i i)Al l 2i<2i+1 2 t2 + 2 t2 + i+12i Chapter 6. Solving the Fractional (g, f)-factor Problem 73 +^2t+.1^2t'+l'^2i+l + ^.2«+l,2i+2^2«'+l,2t'+2 X (-f ^+2,2t+l^t>\^2i+2,2t>1^2t>l,2t>2)^2t+l,2J+2> - for i = 0 , . . . , q, where the first and last term is vacuous for the cases i = 0 and i = q, and off-diagonal blocks (AQ- A ) 1 = T iti+1 = {AQ- A ) l T i+u ^ 2 i - l , 2 t ' - C 2 t - l , 2 i ^ 2 ! ' , 2 t - l ^ T ^21,21+1 ^2t',2t+1^4-2i+l,2t5. * 1 ) = 1> •••>?• We see that splitting the staircase matrix A* into A and # so that we can apply our algorithm allows us to compute a step direction by factorizing an m X m symmetric block tridiagonal matrix (with diagonal blocks of size m i+i x m,2i+i,i = 0,...,q and off-diagonal 2 blocks of size m^i-x X mii+\, i = 1 , . . . , q) and q smaller matrices K{ of size m,2i x m i. We note 2 that the A'; factorizations can all be done in parallel. This compares with the straightforward approach which involves factorizing an (m + k) x (m + k) symmetric block tridiagonal matrix of the form A*Q~ A* l T where Q is diagonal. This matrix has 2q + 1 diagonal blocks of size m; x mi, i = 1 , . . . , 2q -f 1 and off-diagonal blocks of size m; x m;+i, i = l,...,q. Certainly, when m is much smaller than k, substantial savings in computational effort are afforded by our approach. For this purpose, we can control the size of AQ~ A , making it small at the expense X T of making the blocks of H large. To solve the linear equations system AQ' A Ay 1 need to factorize the matrix AQ -1 T = t, where t = AQ' /?(/i), we do not 1 A . We know R = AQ A T _1 T is a block tridiagonal matrix. George[34] point out that the factorization of such a matrix R can be written in the form fill #21 R12 / #22 #23 ff #32 #33 #21 • _ Pi 1 1 fi32P _1 2 Rmm R\i F 2 / #3 ' I where Pi - Pi = Rii-, Rn - Ri,i-iPr-\Ri-i,i, #23 t = 2,3,...,m. #34 P™ Chapter 6. Solving the Fractional (g, f)-factor Problem 74 To use this form to solve a set of equations RAy — t requires the forward substitution steps v\ = 1*1, Vi = U-Ri,i-iPr_\vi--i, _ i = 2,3,...,ro, (6.13) followed by the back substitution steps Ay m Ayi — P v, m = m P~ {v, l ; R Ay ), ltl+1 i = m-l,...,l. i+1 (6.14) It therefore suffices to keep the off-diagonal blocks of R unmodified and the diagonal blocks Pi in factorized form Pi = LiUi, i = 1,2,.. . , m . Therefore, all fill-in is avoided in off-diagonal blocks and the required storage is saved overall. 6.1.3 The Fractional (<7,/)-factor Problem To apply the I P M to the fractional (g,/)-factor problem, we want to construct staircase structured constraints in (6.4). We use the following layering routine on the graph G = ( V ( G ) , E(G)), to reorder the node-arc incident matrix A into a staircase matrix. Layering routine: Step 1 Starting from a chosen set of nodes, say JVi, perform undirected breadth first search to construct a layered graph, say with 2q + 1 layers. Step 2 Let Ri be the set of arcs in layer i , and Ri i+i be the set of arcs between layer i and t i + 1. Furthermore, let TV,- be the set of nodes in layer i. Step 3 Permute the node-arc incidence matrix A of the graph G = (V(G),E(G)) the rows of A are ordered as N\, J V , . . . , i V 2 R\ i R\,2, R2, R2,3, • • • 5 R2q+1 • 2 g + 1 so that , and the columns of A are ordered as Chapter 6. Solving the Fractional (g, f)-factor Problem 75 The node-arc incidence matrix A produced by the above routine has a staircase structure, since this routine creates a layered graph, i.e., there are no arcs between layer i and layer j for — 1. Note that the choice of N\ greatly affects the number of layers. We j' ^ i + 1 or j i would like to choose N\ so that q is large and rh — YA=Q I -^2t'+i I is small. For given N\, we can find the layered structure by performing breadth first search. But finding the best layered graph with respect to the above criteria is difficult because of the exponentially large number of possible starting sets N\. necessary best) Therefore, we wiD use a heuristic method to find a better (not N\. Now we use the following modifying routine to reorder A of (6.4) into a staircase matrix based on A . Modifying routine -A in (6.4) into a staircase by moving each row in the bottom matrix A Step 1 Order to the position right after the corresponding row in the upper matrix —A. Step 2 For each vector Ri : i = 1 , . . . , 2q + 1, move all the columns corresponding to (Ik hk) : VA; € Ni to the position at end of each vector Ri. After this modifying routine we get a new staircase matrix A. For example, the staircase matrix A of the graph in Figure 6.1 is shown in Figure 6.2. Now applying the method described in Section 6.1.2, we decompose the staircase constraint matrix A into A . . . Adding slack variables r in the constraints corresponding to matrix A , H therefore (6.4) becomes cT x, minimize subject to [A I]x = bi, [H 0]x < b , (6.15) 2 Xij<\j, x > 0, for (i,j)e E(G), Chapter 6. Solving the Fractional (g, f)-factor Problem 76 elO Figure 6.1: The layered graph G. where x = [x T l T h T r] T G 72™+" with n = n + 2m and m = 2m, A € 72™ \ H G 72* , T x xfi and r G 72™. We can write (6.15) in the following form: minimize c x, subject to Ax = b\, T Hx < (6.16) b, 2 x > 0, where A — [A I x 6i 0 H H &2 g I] \b\ \W G 72" G 7£(^+ ) ("+" ) n x i 0 0 = ^mx(n+m) G 72*+" +m G 72™ To get Newton search directions for (6.16), we need to compute Q- 1 = [X~ Z l + H S- WH]-\ T 1 (6.17) 77 Chapter 6. Solving the Fractional (g, f)-factor Problem R12 Rl el II h i 12 h2 e2 e3 e4 -1 -1 1 NI R2 e5 e6 e7 R23 13 h3 14 h4 15 h5 e8 R3 e9 elO 16 -1 -1 h6 17 h7 1 -1 -1 1 -1. 1 -1 -1 -1 1 1 1 -1 1 -1 -1 N2 1 -1 1 1 N3 -1 -1 -1 1 -1 Figure 6.2: The staircase matrix A of the layered graph G. We can see I H H S~ WH T l s-'Wi 0 0 0 H 0 5 " VF 1 2 2 0 SW 0 0 0 l 2 2 0 0 H'S^WiH + Sl 0 0 0 where S~W 1 0 0 2 2 0 0 0 / 0 0 Chapter 6. Solving the Fractional (g, f)-factor Problem 78 Therefore 0 X^Zi Q- = 1 0 X " Z [X7 Z o 2 T x 1 o 1 1 ~\ - i o + jV 5 - VF iY + ft]" l Let + 1 2 j^s^wiJf + n o 1 =- X f ^ ! + # 5 f Wi 7i + ft, thus 1 T x Q- 1 = 0 - QH HQ, T where 0 = ( X f ^ + ft.)" and if = VPf 5j + #0J7 . 1 1 1 T Note that 0 is a diagonal matrix and that K is a block diagonal matrix, therefore QT is a 1 block diagonal matrix. We have QT o 0 Q~ 1 AQ~ A X where Q^ 1 = [X^ /^] 1 = [A T - 1 I] ' A T ' = AQT A 1 X 2 T + Q , l 2 I is diagonal matrix. Therefore, AQ, A 1 T and AQ~ A l T are block tridiagonal matrices with dimension m. We can use the parallel algorithm which is described in Section 6.1.2 to solve the Newton search directions (6.8). 6.2 The Preconditioned Conjugate Gradient Algorithm In this section, we first use the result of [7] and [8] to reduce the fractional (g, /)-factor problem to a minimum cost network flow(MCNF) problem. Then we use a preconditioned conjugate gradient(PCG) based implementation of IPM to solve the MCNF problem. Resende and Veiga[89] have shown that the PCG based IPM makes the IPM very efficient for MCNF problem. The preconditioner we suggest is a combination of Mehrotra and Wang's preconditioner for MCNF problem[77] and Wang and O'Leary's adaptive preconditioner [107]. According to the results of [7] and [8], the network Q = ( V , £ ) for computing the fractional (<7,/)-factor problem is constructed as follows: Chapter 6. Solving the Fractional (g, f)-factor Problem 79 Let the nodes V = {dummy} \j{R :ieV} { U {Si :ie V}, (6.18) and the set E of arcs given as follows V(i,j) £ E(G) (Ri,Sj),(Rj,Si) € El, Vz € V(G) (dummy, R{), (Si, dummy) € E2, both with capacity g,- and cost - 1, both with capacity A,j and cost 0, (dummy, Ri), (Si, dummy) € E3, both with capacity /,• — and cost 0, (dummy, Ri), (Si, dummy) £ EA, both with capacity oo and cost 1. Therefore, the set of arcs € - El U E2 U £ 3 U EA. Proposition 6.2.1 (Anstee[7]) A minimum costflowin Q yields a minimum deficiency fractional subgraph H using V(i,j)eE(G) xo- = ^ ( / ( ^ . ^ ) + /C^,5,-)), where f(p,q) is the flow on the arc (p,q) G £• Let Xij = the flow on arc (Ri, Sj) G El, hi = the flow on arc (dummy, Ri) G E2, hi = the flow on arc (Si, dummy) G E2, mti = the flow on arc (dummy, Ri) £ J53, msi = the flow on arc (Si,dummy) G ES, hri = the flow on arc (dummy, Ri) G EA, hsi = the flow on arc (Si, dummy) £ J54, then the minimum cost network flow problem of Q can be formulated as the following L P Chapter 6. Solving the Fractional (g, f)-factor Problem 80 problem, minimize i=i y) subject to Xij — ' t — ™>T{ — hri = 0, for i = 1,. r («\j)e£i Xij m _ 't ~ 5 (iJ)eEl m s t — 0, _ for i = 1,. ,,m, m Y2(lri + mri + hri) - Y2(lsi + t=l + hsi) = 0, i=l 0 < x.j < for(z,;)G£(G), 0 < /r, < g{, for z = 1 .,m, 0 < /s < g;, for z = 1 .,m, 0 < mri < fi — for i = 1 .,m, 0 < ms; < / , - <7;, for z = 1 ,,m, t /ir, > 0, for i= l .,m, hs{ > 0, for i= l ., m. To solve (6.19), if the optimal objective value is equal to Y^TLi ~9ii w e (6.19) get a fractional (g,f)- factor H with the edge values given as Proposition 6.2.1. Otherwise, we get a minimum deficiency fractional subgraph H. The network flow problem (6.19) can be transformed to the following form: minimize subject to c x, T Ax = b, (6.20) x < u, £ > 0, where. A is the (2m + 1) x (2n + 6m) node-arc incidence matrix of the network Q = ( V , £ ) , i.e. for each arc (i,j) G £ there is an associated column in matrix A with exactly two nonzero entries: an entry 1 in row i and an entry - 1 in row j. i{j denotes the flow on edge (z, j) G £ , Cij is the cost of transporting one unit of flow on edge (i,j) flow (i.e. capacity) on edge (i,j) G £ and u,j is the upper bound on G £ . For each vertex j G V , bj denotes the flow produced or Chapter 6. Solving.the Fractional (</, f)-factor Problem consumed at vertex j. 81 If bj > 0, vertex j is a source. If bj < 0, vertex j is a sink. Otherwise bj = 0, vertex j is a transshipment vertex. In the formulation (6.20), bj = 0 for all j £ V. As we stated in Chapter 4, the major work in a single iteration of any I P M is related to build and update the matrix AD A 2 T and to solve the Newton equations system (AD A )Ay 2 = o T (6.21) that determines the search direction at each iteration of I P M . There are two methods to solve the Newton equations system: the direct method and the iterative method. Resende and Veiga[89] have shown that the direct method performs poorly on even small instances of network flow problems. Tnis is because that even though the node-arc incidence matrix A is sparse, the factorization of AD A 2 can produce considerable fill-in, which will T cause prohibitively high computational effort and storage requirements during Cholesky factorization in direct method. Therefore, the most fundamental requirement of an interior point implementation for network flows is an efficient implementation of an iterative method.to solve the Newton equations system. We pay our attention to the conjugate gradient(CG) method. Because the matrices AD A 2 in IPMs show a tendency to be extremely ill conditioned, a T pure conjugate gradient algorithm shows hopelessly slow convergence. Additionally, the use of finite precision results in round-off errors which sometimes make it impossible for the algorithm to converge. To remedy this we use a preconditioner, i.e., a symmetric positive definite matrix M , such that the preconditioned matrix • is less ill-conditioned than AD A . 2 and M~ l T is easy to compute. M~ (AD A ) rl 2 T For this purpose, M should satisfy that M~ ' « l (AD A ) 2 T _ 1 The preconditioner improves the efficiency of the conjugate gradient algorithm by reducing the number of iterations it takes to find a satisfiable direction. In the preconditioned conjugate gradient(PCG) algorithm, we use the determined preconditioner Chapter 6. Solving the Fractional (g, f)-factor Problem 82 Procedure P C G ( 1 , D, a, e, Ay) 1. Ay 2. r = a- 3. 4. v = M 0 = Ay; (AD A )Ay ; 2 0 0 T 0 _ 1 r ; 0 Po = v ; 0 t = 0; 5. 6. While the stopping criterion is not satisfied do: 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. end q = a = Ay (AD A )p ; vfri/pjqi; = Ayi + aip ; = Ti- aiqi; v i = M- r ; Pi = vf r. /vfri; 2 T t t { i+1 { l i+ i+1 +1 i+1 Pi+i = v i + Pipr, i = i + 1; end do; Ay = Ayi PCG; i+ Table 6.1: The preconditioned conjugate gradient algorithm to solve the following system of linear equations. M- (AD A )Ay l 2 T = M~ o. x The pseudo-code of P C G algorithm is given in Table 6.1. Resende and Veiga[89] have shown that the special structure of network flow problems can be exploited to determine an easily invertible preconditioner for the P C G algorithm. They proposed a preconditioner which is a combination of diagonal and maximum weighted spanning tree preconditioned. M = diag(AD A ), 2 T In the earlier iterations of I P M they use the diagonal preconditioner and switch to the maximum weighted spanning tree preconditioner when the diagonal preconditioner becomes ineffective. • The spanning tree preconditioner Chapter 6. Solving the Fractional (g, f)-factor Problem 83 A basis in the network simplex method correspond to a spanning tree of the graph Q. A nondegenerate minimum cost network flow problem has a unique optimal solution and a corresponding optimal spanning tree T*. Let A = [Aj* AjV>], where M* = Q\T*. is easy to see that De —• (Dj*e,0) and the matrix AD A It —• Aj»D\,Aj., when the I P M goes to the optimal solution. Hence one may expect that Aq-D\Aj, where T is a 2 T maximum weighted spanning tree, would serve as a good preconditioner as the algorithm converges. The maximum weighted spanning tree can be identified using as weights the diagonal elements of the current scaling matrix w = De where e is a unit vector. Based on Resende and Veiga's preconditioner, Mehrotra and Wang[77] proposed a improved preconditioner as follows: Let T be a maximum weighted spanning tree of G, then AD A 2 T = [A D Aj 2 T T + A/fDJrAfr] « A D\A r 7 r + A, where A is a nonnegative diagonal matrix. A simple choice of A which has worked well in practice is to take A.= px diag(A^fD%-Ajf), where p is computed adaptively. Mehrotra and Wang[77] proved that the Cholesky factor of the preconditioner ATDJA^- + A can be computed in 0(m) computational steps. Actually, the Cholesky factor of the preconditioner is an incomplete Cholesky factor of AD A . 2 T The computational results given in [77] show that Mehrotra and Wang's preconditioner is better than Resende and Veiga's preconditioner. Wang and O'Leary[107] proposed an adaptive approach to determine a preconditioner. FoDowing their approach, we use the scheme in which we first perform a procedure to determine whether to update the current preconditioner or to recompute a new preconditioner ArDq-A^ + A for the matrix AD A . a P C G method. 2 T Next, we use the determined preconditioner to implement Chapter 6. Solving the Fractional (g, f)-factor Problem 84 The updated preconditioner is computed by updating the previous preconditioner by a smallrank change, when p, changes. Let D be the current diagonal matrix and D be the one for which we have a preconditioner ATD\AJ AD = D 2 —D 2 + A and the Cholesky factor L of the preconditioner. Define and let a be the i-th column of matrix A. Since 8 AD A 2 T = AD A « A £>TIT + A + = LL 2 + T AADA T T AADA T 2n+6m T + ^2 Aduaiof, t=i we may obtain an improved preconditioner LL by applying a rank-a update to LL . T T This update may be computed as in [15] and [26]. . The determination of whether to use the updated preconditioner or to recompute a new preconditioner is based on the approximate cost of the previous iteration including the cost of any updates that were made to the preconditioner. If the cost of the previous search direction was high, we recompute a new preconditioner for the current matrix AD A . 2 T If the cost of the previous search direction was not high, the preconditioner is obtained by updating the previous one. Wang and O'Leary[107] demonstrate that the adaptive preconditioner can enhance the performance of interior point algorithms on large sparse problems. In the P C G algorithm, the initial direction Aj/o is the direction Ay which is produced in the previous call to the P C G algorithm, i.e., during the previous interior point iteration. This is done with the expectation that D and a change little between consecutive interior point iterations, and consequently the direction produced in an iteration should be close to the one produced in the previous iteration. When first time P C G is called, let Aj/o = (0,.. .,0). In this algorithm, we use the stopping criterion suggested by Karmarkar and Ramakrishnan [59] and Resende and Veiga [89]. Let Ay be an approximate solution of (6.21) and . cos v — | o (AD A )Ay | =— . || o || • || (AD A )Ay \\ T 2 T 2 T 85 Chapter 6. Solving the Fractional (g, f)-factor Problem where 6 is the angle between (AD A )Ay 2 and o. T We terminate P C G iterations when | 1 — cos# |< e, i.e., 6 « 0, where e is some small -tolerance (typically e = 1 0 ) . -8 This stopping criterion effectively halts the conjugate gradient algorithm when a good enough direction is on hand. 6.3 Computational Implementation We use Gondzio's code H O P D M to solve the fractional (g, /)-factor problem. H O P D M stands for Higher Order Primal Dual Method. It is an implementation of Gondzio's multiple centrality corrections method which has been described in Section 3.3. We use the formulation (6.3) to solve the fractional (g, /)-factor problem. The objective function of (6.3) represents the total deficiency for the current x = (Xij '• (*>J) £ B{G)) € TZ n value. If we can find an optimal solution in which the objective value is less than 1, then the current x is a fractional (g, /)-factor. Otherwise the objective value will be an integer K which is equal to or larger than 1. In this case, we can get a (g, /)-barrier (see below) and a minimum deficiency fractional subgraph xTo introduce the definition of (g, /)-barrier, we cite the following theorem. Theorem 6.3.1 (Anstee[6]) There exists a fractional (g, f)-factor in G if and only if Ef>+ for all partitions S,T,U E of vertex 1,2,..:, m € *.-;>5> ( - ) 6 22 V(G). If there is a partition S, T, U violating (6.22), this partition is a (g, /)-barrier. . After executing H O P D M , If the objective value is equal to or larger than 1, we can get a (g, /)-barrier S, T, U as follows: T = {i:l >0, 5 = {i:hi> VieF(G)}, t 0, Vi € V(G)}, Chapter 6. Solving the Fractional (g, f)-factor Problem U = 86 V(G)\(TuS). In this case, the objective value K will be K = ^29i~ Actually, for all • i G T,j Xij-^fi- 13 i€T ieT,j€TuU i£S G T U U, Xij ~ 1- Therefore, ^3 Xij 1 S approximately the ierjeTuu number of edges (i, j): 6.3.1 i G T,j G T U U. Starting Point All interior point methods are sensitive to the starting point. The best choice of a default starting point is still an open question. In this section we use three different starting point approaches to run the H O P D M code for fractional (g, /)-factor problem and compare the results. Now we give a detailed description of these three starting point approaches. • Approach 1 For L P (2.51) and (2.52), the only requirement for the starting point of Gondzio's multiple centrality corrections method is x° > 0 , s ° > 0,z° > 0, and w° > 0. Gondzio generates the starting point by using the solution of a primal least squares problem, minimize | xs T subject to |, Ax = b, (6.23) x + s = u, and a dual least squares problem, minimize subject to | z w I, ' T Ay T (6.24) + z — w = c. The Lagrangian for (6.23) is then L(x,s,y,w)=\ xs T | -v (Ax T - b) + t (x T + s - u). (6.25) Chapter 6. Solving the Fractional (g, f)-factor Problem 87 The first order optimality conditions(KKT conditions) for (6.23) are s- Av +t = 0, x+ t = 0, Ax - b = 0, = 0. T X + 5- U (6.26) We can easily solve the linear equations system (6.26). w '= (A/4 )- ( 4u-26), x = ( « — /l «)/2, s = u — x. T 1 J (6.27) T Similarly, we can find the solution of (6.24) as y = (AA )- Ac, z = T w 1 (A y-c)/2, (6.28) T —z. The stepsizes in primal space dp and dual space do are determined by dp = max( —1.5 X min{xj}, —1.5 x min{sj}, 0), and do = max( —1.5 X min{zj}, -1.5 X min{u>j}, 0), then (a: + d e) (z T dp = dp + 0.5 X P + d e) + (s + d e) (w T D P + d e) D and . ; , « £ ) = « £ > + 0.5 X n c + d e) (z T P + d£,e) + (s + d e) {w + cfoe) ; . T P Chapter 6. Solving the Fractional (g, f)-factor Problem 88 The starting point is then given by + x° = x dpe, s° = s + dpe, z° = z + d e, w° — w + djje, y° = (6.29) D y, (unaltered). We can prove that the starting point satisfies x° > 0,5° > 0, z° > 0, and w° > 0. • Approach 2 Anstee gives a method for choosing the starting point in [9]. For every vertex i G V(G), the interval and §i,fi 9i fi Ld (0' d (i)\ Ii G are defined as follows [9i,fi. G We can see if a fractional subgraph x has Xij 6 li^Ij, V(i, j) G E(G) then x is (g,/)-factor. Therefore if Ii CI lj ^ 0, we choose Xij 6 7,- n lj. If we choose Xij — fi + 2 ^ ( ^ ' - ^ ) ' w 2 a n fractional n Jj = 0 and fi < g~j, here d(Ii,Ij) is the distance between In (6.3), / - : i G ^ ( G ) are determined by gi — Yl(i,j)^E{G)Xiji a and 7,-. d hi : i £ V(G) are determined by £ ( , \ j ) e £ ( G ) X i j ; - /,-. Now we have determined a; = [ x ° 0 T l° T h° ] of (6.4). T The starting value of the primal upper bound slack variables s of (6.4) is determined by s° = u - x°, where u is the upper bound vector. We give a threshold value 6, and adjust all a; and s° that are below this threshold value 0 to the threshold value 6. This guarantees that all x° and s° are sufficiently bounded away from zero. This threshold value S is also used to bound dual variables z° and w° away from zero. For the dual problem, we set y° = 0. z° and w° are determined as follows Chapter 6. Solving the Fractional (g, f)-factor Problem 89 For Xi unbounded: c- if Cj > 6 6 if Cj < 6 0 (6.30) < For Xi bounded: z° = j C + < w° = z° - u>9, if CJ > 0 Cj if Cj < 0 (6.31) • Approach 3 We give a threshold value 6. For primal problem, we set x° = 6 and s° = 6. The dual variables are determined the same way as approach 2. 6.3.2 Test Problems A merely random (g, /)-factor problem is not good. To show this, let us review some available results. We consider a random graph G on n vertices where each vertex chooses randomly a set of r neighbors. Posa [86] proved that a random graph G with f(n) = crclogn edges is Hamiltonian with probability tending to 1 if c > 3. Obviously, if a graph of even order is Hamiltonian, then the graph has two disjoint 1-factors. In [93] Shamir and TJpfal found a sufficient condition for a 1-factor. They proved that if the minimum vertex degree reaches 6, almost every random graph G has a 1-factor. Komlos and Szemeredi[64] proved that when a random graph is chosen by choosing edges independently with the same probability, Hamiltonian cycles appear (in a probabilistic sense) at the same time that the minimum vertex degree reaches 2. This requires about l/2nlogrc + n log log n edges. This means that if in a random graph G each valency is at least 2, then the graph G contains a Hamiltonian cycle "almost surely". From these results, we get the conclusion that a random graph with enough large number of edges almost surely has a 1-factor. We expect that in a similar way a random graph G with "large enough" degree almost surely has a (<7, /)-factor. A proof could involve showing that the cut conditions (6.22) are almost Chapter 6. Solving the Fractional (g, f)-factor Problem 90 surely satisfied. Therefore in order to generate "interesting" (g, /)-factor problems, we do not think it is appropriate to use a wholly random graph. We have generated problems that either contain a (g, /)-barrier or something close a /)- barrier, i.e. a "small" cut. To do this we choose a partition S,T,U of the vertex set V(G) at random and then introduce edges and the data values / , , <7, for every vertex i in V(G) to make the partition S, T, U either violate or nearly violate (6.22), namely equality in (6.22). 6.3.3 Computational Results According to [9], we allow an error of size < 1 for our fractional (g, /)-factor problem. In pur algorithm, when the objective value < 1, we stop the algorithm and get a fractional (g, /)-factor. Otherwise, for a given optimality tolerance e, when the relative duality gap satisfies cx — by + T uw T T 1+ | b y — u w | T T ' we stop the algorithm. In this case, the algorithm converge to an optimal solution with an integral objective value K > 1 and we get a (g, /)-barrier. In this section, we test fourteen (g, /)-factor problems using the three starting point approaches and compare the computational results. Problem 1 to Problem 10 are feasible (having fractional (g, /)-factor) or infeasible problems with small size. Problem 11, 12 and 13 are disjoint union of Problem 1 and Problem 2, Problem 3 and Problem 4, and Problem 3 and Problem 5, respectively. They are the following three types of union: infeasible and infeasible, feasible and infeasible, and feasible and feasible. Problem 14 is a disjoint union of Problem 1 and Problem 2 plus 9 edges, so that there is a cut of capacity 0, i.e. a cut satisfying equality in (6.22). Table 6.2 gives the sizes of the fourteen (g, /)-factor problems and their optimal objective values. Column 1 and Column 2 of this table indicate the numbers of vertices and edges in the original graph. Column 3 indicates the number of constraints in each problem. Column 4 indicates the number of variables in (6.4) and column 5 indicates the number of variables after adding slack variables to (6.4). Column 6 gives the number of nonzero elements in the Chapter 6. Problems Solving the Fractional (g, f)-factor Problem 1 V(G) | 1 E(G) 1 24 55 |Rows 48 2 27 70 54 . Structural cols. 91 Columns Nonzeros 103 151 316 5 124 178 388 4 Opt. values 3 22 45 44 89 133 268 0 4 30 70 60 130 190 400 1 5 24 55 0 7 8 9 70 14 103 124 316 27 48 54 151 6 18 14 18 50 388 92 92 0 1 9 32 32 178 50 0 9 11 19 22 41 63 120 0 . 10 8 11 16 27 43 76 0 11 51 125 102 227 329 704 12 52 115 104 219 46 100 92 192 668 584 9 1 13 14 323 284 0 51 134 102 236 338 740 0 Table 6.2: Original problem size coefficient matrix after adding slack variables. Column 7 gives the optimal objective value of every problem. Table 6.3 shows the number of iterations of IPM using several different threshold value <5 in the starting point approach 2. It shows the results for two different optimality tolerance e. From the total iterations in Table 6.3, we can see that the best threshold value of approach 2 is 6 = 0.4. Table 6.4 compares the iteration numbers of the three starting point approaches using two different optimality tolerances e. From the computational results of approach 2 in Table 6.3, we know that the best threshold value is 6 = 0.4, and the second best threshold value is <5 = 0.5. Therefore, we test S = 0.4 and 6 — 0.5 in approach 3 to see the difference between approach 2 and approach 3. From the total iterations in Table 6.4, we can see that among the three starting point approaches, the best approach is approach 2 with 6 = 0.4. In Table 6.3 and Table 6.4, the iteration counts for all the feasible problems are relatively small. This is because as soon as the objective value is smaller than one, we stop the algorithm Chapter 6. Solving the Fractional (g, f)-factor Problem 92 and get a fractional (g, /)-factor; otherwise, we stop the algorithm only when the relative duality gap is smaller than the optimality tolerance e. From the computational results in Table 6.3 and Table 6.4, we see that doubling the problem size for both feasible and infeasible problems has little effect on the iteration count. The effect is unpredictable. In some cases the iteration count decreases, and in some cases it increases. But doubling the problem size does not affect our conclusion that the /)-factor problem-specific starting point (approach 2 with 6 = 0.4.) is a good starting point and makes the IPM more efficient for solving the fractional (<?,/)-factor problem. All the test problems were solved extremely fast (in a few seconds for the largest problems), and so these methods might apply to problems much larger than these tested here. Nonetheless we imagine a direct networkflowalgorithm would outcompete these methods for all but very large problems. It is shown in [89], [85] and [88] that for many classes of large scale problems interior point methods are currently the most efficient computational solution approaches. Therefore, interior point methods provide, in addition to the direct network flow algorithm, new ways to solve large scale fractional (g, /)-factor problems. 9.3 Chapter 6. Solving the Fractional (g, f)-factor Problem Problem 6 -- 0.05 0.1 0.2 0.3 0.4 £ = ID'- 1 2 3 4 5 6 7 8 9 TO 11 12 13 14 Total 7 10 3 8 4 4 7 1 1 1 8 8 5 4 71 8 8 5 9 5 5 6 3 1 1 9 9 5 6 80 7 7 3 8 4 4 6 2 1 1 7 9 4 5 68 7 10 3 8 3 3 6 2 1 1 7. 7 3 4 65 0.5 0.6 0.7 0.8 0.9 7 8 2 8 3 2 7 2 1 2 8 9 3 4 66 7 8 2 9 3 2 7 2 1 2 8 9 3 4 67 7 8 2 9 3 2 7 2 1 2 8 10 3 4 68 7 9 2 7 3 2 7 2 2 2 8 7 3 4 65 7 8 2 7 3 2 7 2 2 2 8 .7 3 4 64 6 8 2 7 3 2 6 2 1 2 7 8 3 4 61 6 7 2 8 3 2 6 2 1 2 7 8 3 4 61 6 8 2 7 3 2 6 2 1 2 8 8 3 4 62 7 8 2 7• 3 2 6 2 2 2 7 7 3 4 62 7 8 2 7 3 2 6 2 2 2 8 73 4 63 8 7 7 2 7 3 3 7 2 1 1 8 8 3 3 62 = ID - 6 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Total 6 9 3 •7 4 4 6 1 1 1 7 8 5 4 66 8 8 5 8 5 5 6 3 1 1 8 8 5 6 77 6 7 3 8 4 4 5 2 1 1 7 8 4 .5 65 6 .8 3 7 ' 3 3 5 2 1 1 6 7 3 4 59 6 7 2 6 3 3 6 2 1 1 7 7 3 3 57 Table 6.3: Iteration counts for approach 2 94 Chapter 6. Solving the Fractional (g, f)-factor Problem Problem Approach 1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Total 8 13 2 8 3 2 8 2 2 1 9 8 3 4 73 1 2 3 4 5 6 •7 8 9 10 11 12 13 . 14 Total 7 11 2 7 3 2 7 2 2 1 8 7 3 4 66 Approach 2 <S = 0.4 Approach 3 S = 0.4 6 = 0.5 e = ID - 8 7 7 2 7 3 3 7 2 1 1 8 8 3 3 62 = 123-6 6 7 2 6 3 3 6 2 1 1 7 7 3 3 57 8 8 2 9 3 3 6 2 2 2 9 9 4 4 71 8 8 2 8 3 3 7 2 1 2 8 9 3 4 68 7 8 2 7 3 3 6 2 2 2 8 8 4 4 66 7 8 2 8 3 3 6 2 1 2 8. 8 3 4 65 Table 6.4: Iteration counts for different starting point approaches Bibliography [1] I. Adler, N . K . Karmarkar, M . G . C . Resende, and G . Veiga. Data structures and programming techniques for the implementation of Karmarkar's algorithm. ORSA Journal on Computing, 1:84-106, 1989. [2] I. Adler, N . K . Karmarkar, M . G . C . Resende, and G . Veiga. A n implementation of Karmarkar's algorithm for linear programming. Mathematical Programming, 44:297-335, 1989. Errata in Mathematical Programming, 50:415, 1991. [3] E . D. Andersen and K . D. Andersen. Presolving in linear programming. Technical report, Dept. of Math, and Computer Sci., Odense University, 1993. To appear in Math Programming. [4] E . D . Andersen, J . Gondzio, C . Meszaros, and X . X u . Implementation of interior point methods for large scale linear programming. In T . Terlaky, editor, Interior Point Methods in Mathematical Programming. Kluwer Academic Publisher, 1996. [5] K . D. Andersen. A modified schur complement method for handling dense columns in interior point methods for linear programming. Technical report, Dept. of Math, and Computer Sci., Odense University, 1994. Submitted to A C M Transaction on Mathematical Software. [6] R. P. Anstee. An algorithmic proof of Tutte's /-factor theorem. Journal of Algorithms, 6:112-131, 1985. [7] R. P. Anstee. Matching theory: fractional to integral. New Zealand Journal of Mathematics, 12:17-32,1992. [8] R. P. Anstee. Minimum vertex weighted deficiency of (g, /)-factors: A greedy algorithm. Discrete Applied Mathematics, 44:247-260, 1993. [9] R. P. Anstee and Yunsun Nam. More sufficient conditions for a graph to have factors, preprint, Department of Mathematics, The University of British Columbia, Vancouver, B . C . , Canada. [10] K . M . Anstreicher. A monotonic projective algorithm for fractional linear programming. Algorithmica, l(4):483-498, 1986. [11] K . M . Anstreicher. A standard form variant and safeguarded linesearch for the modified Karmarkar algorithm. Mathematical Programming, 47:337-351, 1990. [12] K . M . Anstreicher and R. A . Bosch. Long steps in a 0(n L) gramming. Mathematical Programming, 54:251-265, 1992. z 95 algorithm for linear pro- Bibliography 96 [13] E . R. Barnes. A variation on Karmarkar's algorithm for solving linear programming problems. Mathematical Programming, 36:174-182, 1986. [14] E . R. Barnes, S. Chopra, and D. J . Jensen. The affine scaling method with centering. Technical Report, Dept. of Mathematical Sciences, I B M T . J . Watson Research Center, P. 0. Box 218, Yorktown Heights, N Y 10598, U S A , 1988. [15] Richard Bartels and Linda Kaufman. Cholesky factor updating techniques for rank 2 matrix modifications. SIAM Journal on Matrix Analysis and Applications, 10(4):557592, 1989. [16] D. A . Bayer and J . C . Lagarias. The nonlinear geometry of linear programming, Part I : Affine and projective scaling trajectories. Transactions of the American Mathematical Society, 314(2):499-526, 1989. [17] A . L . Breariey, G . Mitra, and H . P. Williams. Analysis of mathematical programming problems prior to applying the simplex algorithm. Mathematical Programming, 15:54-83, 1975. [18] J . R. Bunch and B. N . Parlett. Direct methods for solving symmetric indefinite systems of linear equations. SIAM J. Numer. Anal., 8:639-655, 1971. [19] T . J . Carpenter, I. J . lustig, J . M . Mulvey, and D. F . Shanno. Higher order predictorcorrector interior point methods with application to quadratic objectives. SIAM Journal on Optimization, 3:696-725, 1993. [20] T . M . Cavalier and A . L . Soyster. Some computational experience and a modification of the Karmarkar algorithm. Working Paper 85-105, Dept. of Industrial and Management Systems Engineering, Pennsylvania State University, University Park, P A 16802, U S A , 1985. [21] S.F. Chang and S.T. McCormick. A hierarchical algorithm for making sparse matrices sparser. Mathematical Programming, 56:1-30, 1992. [22] S. Chen. Computational experience with the Karmarkar algorithm. Talk held at the O R S A / T I M S Joint National Meeting in Los Angeles, C A , U S A , A T & T Bell Laboratories, Holmdel, NJ 07733, U S A , April 1986. [23] I. C . Choi and D. Goldfarb. Exploiting special structure in a primal-dual path-following algorithm. Mathematical Programming, 58:33-52, 1993. [24] I. C . Choi, C . L . Monma, and D. F . Shanno. Further development of a primal-dual interior point method. ORSA Journal on Computing, 2:304-311, 1990. [25] I. I. Dikin. Iterative solution of problems of linear and quadratic programming. Doklady Akademii Nauk SSSR, 174:747-748, 1967. Translated in : Soviet Mathematics Doklady, 8:674-675, 1967; Bibliography [26] J . J . Dongarra, J . R. Bunch, C . B. Moler, and G . W . Stewart. UNPACK S I A M , Philadelphia, 1979. 97 User's Guide. [27] I. S. Duff, A . M . Erisman, and J . K . Reid. Direct methods for sparse matrices. Oxford University Press, New York, 1989. [28] I. S. Duff and J . K . Reid. Ma27-a set of Fortran subroutines for solving sparse symmetric sets of linear equations. In Report AERE R-10533, Computer Sciences and Systems Division. A E R E Harwell, 1982. [29] I. S. Duff and J . K . Reid. The multifrontal solution of indefinite sparse symmetric linear equations. ACM Transactions on Mathematical Software, 9:302-325, 1983. [30] S. C . Eisenstat, M . C . Gurshy, M . H . Schultz, and A . H . Sherman. The Yale sparse matrix package, i. the symmetric codes. International Journal for Numerical Methods in Engineering, 18:1145-1151, 1982. [31] A . V . Fiacco and G . P. McCormick. Nonlinear Programming : Sequential Unconstrained Minimization Techniques. John Wiley & Sons, New York, 1968. Reprint : Volume 4 of SIAM Classics in Applied Mathematics, SIAM Publications, Philadelphia, P A 191042688, U S A , 1990. [32] K . R. Frisch. The logarithmic potential method for convex programming. Unpublished manuscript, Institute of Economics, University of Oslo, Oslo, Norway, May 1955. [33] D . M . Gay. A variant of Karmarkar's linear programming algorithm for problems in standard form. Mathematical Programming, 37:81-90, 1987. Errata in Mathematical Programming, 40:111, 1988. [34] A . George and J . W . H . Liu. Computing solution of large sparse positive definite systems. Prentice-Hall, Engliwood Cliffs, N J , 1981. [35] A . George and J . W . H . Liu. The evolution of the minimum degree ordering algorithm. SIAM Rev., 31(1-19), 1989. [36] G . de Ghellinck and J . P. Vial. A polynomial Newton method for linear programming. Algorithmic^, l(4):425-453, 1986. [37] P. E . Gill, W . Murray, and M . A . Saunders. A single - phase dual barrier method for linear programming. Technical Report SOL 88-10, Systems Optimization Laboratory, Dept. of Operations Research, Stanford University, Stanford, C A 94305, U S A , August 1988. [38] P. E . Gill, W . Murray, M . A . Saunders, J . A . Tomlin, and M . H . Wright. On projected Newton barrier methods for linear programming and an equivalence to Karmarkar's projective method. Mathematical Programming, 36:183-209, 1986. Bibliography 98 [39] P. E . Gill, W . Murray, and M . H . Wright. Numerical Linear Algebra and Optimization, Vol. 2. Addison Wesley Publishing Company, Amsterdam, The Netherlands, 1992. Chapter on interior point methods. [40] D . Goldfarb and M . J . Todd. Linear Programming. In G . L . Nemhauser, A . H . G . Rinnooy Kan, and M . J . Todd, editors, Optimization, volume 1 of Handbooks in Operations Research and Management Science, pages 141-170. North Holland, Amsterdam, The Netherlands, 1989. [41] G . H . Golub and C . F . Van Laon. Matrix Computations. The Johns Hopkins University Press, Baltimore, M D , 1989. [42] J . Gondzio. Splitting dense columns of constraint matrix in interior point methods for large scale linear programming. Optimization, 24:285-297, 1992. [43] J . Gondzio. Multiple centrality corrections in primal-dual method for linear programming. Technical Report 20, Department of Management Studies, University of Geneva, Switzerland, November 1994. [44] J . Gondzio. Presolve analysis of linear programs prior to applying an interior point method. Technical Report 1994.3, Department of Management Studies, University of Geneva, Switzerland, 1994. [45] J . Gondzio. Hopdm (version 2.12) - a fast lp solver based on a primal-dual interior point method. European Journal of Operational Research, 85:221-225, 1995. [46] C . C . Gonzaga. A n algorithm for solving linear programming problems in 0(n L) operations. In N . Megiddo, editor, Progress in Mathematical Programming : Interior Point and Related Methods, pages 1-28. Springer Verlag, New York, 1989. 3 [47] C . C . Gonzaga. Polynomial affine algorithms for linear programming. Mathematical Programming, 49:7-21, 1990. [48] C . C . Gonzaga. Interior point algorithms for linear programming problems with inequality constraints. Mathematical Programming, 52:209-225, 1991. [49] C . C . Gonzaga. Large steps path-following methods for linear programming, Part I : Barrier function method. SIAM Journal on Optimization, 1:268-279, 1991. [50] C . C . Gonzaga. Path following methods for linear programming. SIAM Review, 34(2):167-227, 1992. [51] D. den Hertog and C . Roos. A survey of search directions in interior point methods for linear programming. Mathematical Programming, 52:481-509, 1991. [52] D. den Hertog, C . Roos, and T . Terlaky. A large-step analytic center method for a class of smooth convex programming problems. SIAM Journal on Optimization, 2:55-70, 1992. Bibliography 99 [53], D . den Hertog, C . Roos, and J . P. Vial. A complexity reduction for the long-step pathfollowing algorithm for linear programming. SIAM Journal on Optimization, 2:71-87, .1992. [54] A . J . Hoffman and S. T . McCormick. A fast algorithm that makes matrices optimally sparse. Technical Report S O L 82-13, Systems Optimization Laboratory, Stanford University, Stanford, C A , U S A , 1982. [55] P. Huard. Resolution of mathematical programming with nonlinear constraints by the method of centers. In J . Abadie, editor, Nonlinear Programming, pages 207-219. North Holland, Amsterdam, The Netherlands, 1967. [56] B. Jansen, C . Roos, T . Terlaky, and J . Vial. Primal-dual target following algorithms for linear programming. Technical Report 93-107, Faculty of Technical Mathematics and Informatics, T U Delft, NL-2600 G A Delft, The Netherlands, 1993. [57] N . K . Karmarkar. A new polynomial-time algorithm for linear programming. Combina- torica, 4:373-395, 1984. [58] N . K . Karmarkar and K . G . Ramakrishnan. Implementation and computational aspects of the Karmarkar algorithm for linear programming, using an iterative method for computing projections. Technical Memorandum, A T & T Bell Laboratories, Murray Hill, N J 07974, U S A , 1989. See Karmarkar and Ramakrishnan [59]. [59] N . K . Karmarkar and K . G . Ramakrishnan. Computational results of an interior point algorithm for large scale linear programming. Mathematical Programming, 52:555-586, 1991. [60] L . G . Khachian. A polynomial algorithm for linear programming. Doklady Akad. Nauk USSR, Translated in Soviet Math. Doklady, 244:1093-1096, 1979. '" [01] V . Klee and G . J . Minty. How good is the simplex method. In O.Shisha, editor, Inequalities III. Academic Press, New York, 1972. [62] M . Kojima, S. Mizuno, and A . Yoshise. A primal-dual interior point algorithm for linear programming. In N . Megiddo, editor, Progress in Mathematical Programming : Interior Point and Related Methods, pages 29-47. Springer Verlag, New York, 1989. [63] M . Kojima, S. Mizuno, and A . Yoshise. An 0(y/nL) gorithm for linear complementarity problems. iteration potential reduction al- Mathematical Programming, 50:331-342, 1991. [64] J . Komlos and E . Szemeredi. Limit distribution for the existence of hamiltonian cycles in random graphs. Discrete Mathematics, 43:55-63, 1983. [65] I. J . Lustig. Feasibility issues in a primal-dual interior point method for linear programming. Mathematical Programming, 49:145-162, 1990/91. Bibliography 100 I. J . Lustig, R. E . Marsten, and D. F . Shanno. Computational experience with a primaldual interior point method for linear programming. Linear Algebra and Its Applications, 152:191-222, 1991. I. J . Lustig, R. E . Marsten, and D. F . Shanno. On implementing Mehrotra's predictorcorrector interior point method for linear programming. SIAM Journal on Optimization, 2:435-449,1992. I. J . Lustig, R. E . Marsten, and D. F . Shanno. Interior point method for linear programming: Computational state of the art. ORSA Journal on Computing, 6:1-14, 1994. H . M . Markowitz. The elimination form of the inverse and its application to linear programming. Management Sci., 3:255-269, 1957. I. Maros and Cs. Meszaros. The role of the augmented system in interior point methods. Technical Report T R / 0 6 / 9 5 , Department of mathematics and statistics, Brunei University, London,1995. R. E . Marsten. The IMP System on personal computers. Talk held at the O R S A / T I M S Joint National Meeting in St. Louis, U S A , Dept. of MIS, University of Arizona, Tucson, A Z 85721, U S A , October 1987. K. A . McShane, C . L . Monma, and D. F . Shanno. A n implementation of a primal-dual interior point method for linear programming. ORSA Journal on Computing, 1:70-83, 1989. • N. Megiddo. A variation on Karmarkar's algorithm. Preliminary Report, I B M San Jose Research Laboratory, San Jose, C A 95193, U S A , 1984. N. Megiddo. Pathways to the optimal set in linear programming. In N . Megiddo, editor, Progress in Mathematical Programming : Interior Point and Related Methods, pages 131158. Springer Verlag, New York, 1989. Identical version in : Proceedings of the 6th Mathematical Programming Symposium of Japan, Nagoya, Japan, pages 1-35, N . Megiddo and M . Shub. 1986. Boundary behavior of interior point algorithms in linear programming. Mathematics of Operations Research, 14:97-146, 1989. S. Mehrotraand J . Sun. On the implementation of a (primal-dual) interior point method. SIAM Journal on Optimization, 2(4):575-601, 1992. S. Mehrotra and J . S. Wang. Conjugate gradient based implementation of interior point methods for network flow problems. Technical Report 95-70.1, Department of Indus- trial Engineering and Management Sciences, Northwestern University, Evanston, October 1995. [78] S. Mizuno, M . J . Todd, and Y . Ye. Anticipated behavior of interior point algorithms for linear programming. Technical Report, School of Operations Research and Industrial Engineering, Cornell University, Ithaca, N Y 14853-3801, U S A , 1990. Incorrect citation, elsewhere. Bibliography 101 [79] S. Mizuno, M . J . Todd, and Y . Ye. On adaptive step primal-dual interior-point algorithms for linear programming. Mathematics of Operations Research, 18:964-981, 1993. [80] C . L . Monma and A . J . Morton. Computational experience with the dual affine variant of Karmarkar's method for linear programming. Operations Research Letters, 6:261-267, 1987. [81] R. D . C . Monteiro and I. Adler. Interior path following primal-dual algorithms : Part I : Linear programming. Mathematical Programming, 44:27-41, 1989. [82] R. D . C . Monteiro and I. Adler. A n extension of Karmarkar-type algorithm to a class of convex separable programming problems with global linear rate of convergence. Mathematics of Operations Research, 15:408-422, 1990. [83] R. D . C . Monteiro, I. Adler, and M . G . C . Resende. A polynomial-time primal-duaj affine scaling algorithm for linear and convex quadratic programming and its power series extension. Mathematics of Operations Research, 15:191-214, 1990. [84] L . Portugal, F . Bastos, J . Jiidice, J . Paixao, and T . Terlaky. A n investigation of interior point algorithms for the linear transportation problems. Technical Report 93-100, Faculteit der technische wiskunde en informatica, Technische Universiteit Delft, Nederlands, 1993. [85] L . F . Portugal, M . G . C . Resende, G . Veiga, and J . J . Judice. A truncated primalinfeasible dual-feasible network interior point method. Technical Report, A T &: T Bell Laboratories, Murray Hill, N J , U S A , 1995. [86] L . Posa. Hamiltonian circuits in random graphs. Discrete Mathematics, 14:359-364, 1976. [871 J- Renegar. A polynomial-time algorithm, based on Newton's method, for linear pro- gramming. Mathematical Programming, 40:59-93, 1988. [88] M . G . C . Resende and P. M . Pardalos. Interior point algorithms for network flow problems. Technical report, A T & T Bell Laboratories, Murray Hill, N J , U S A , 1996. [89] M . G . C . Resende and G . Veiga. A n efficient implementation of a network interior point method. In D.S. Johnson and C . C . McGeoch, editors, Network Flows and Matching: First DIM ACS Implementation Challenge, DIM ACS Series on.Discrete Mathematics and Theoretical Computer Science, volume 12, pages 299-348. American Mathematical Society, 1993. [90] C . Roos and J . P. Vial. Analytic centers in linear programming. Technical Report 88-74, Faculty of Mathematics and Informatics, T U Delft, NL-2628 B L Delft, The Netherlands, 1988. Bibliography 102 [91] C . Roos and J . P. Vial. Long steps with the logarithmic penalty barrier function in linear programming. In J . Gabszevwicz, J . F . Richard, and L . Wolsey, editors, Economic Decision-Making : Games, Economics and Optimization, dedicated to J. H. Dreze, pages 433-441. Elsevier Science Publisher B . V . , Amsterdam, The Netherlands, 1989. [92] D . J . Rose. Symmetric elimination on sparse positive definite systems and potential flow network problem. PhD thesis, Harvard University, Cambridge, M A , 1970. [93] E . Shamir and E . Upfal. One-factor in random graphs based on vertex choice. Discrete Mathematics, 41:281-286,1982. [94] N . Shor. Utilization of the operation of space dilatation in the minimization of convex functions. Kibernetica, 1:6-12, 1970. English translation: Cybernetics 6, 7-15. [95] G . Sonnevend. A n "analytic center" for polyhedrons and new classes of global algorithms for linear (smooth, convex) programming. In A . Prekopa, J . Szelezsan, and B . Strazicky, editors, System Modeling and Optimization : Proceedings of the 12th IFIP-Conference held in Budapest, Hungary, September 1985, volume 84 of Lecture Notes in Control and Information Sciences, pages 866-876. Springer Verlag, Berlin, West-Germany, 1986. [96] R. A . Tapia, Y . Zhang, M . Saltzman, and A . Weiser. The predictor-corrector interiorpoint method as a composite Newton method. Technical Report T R 90-17, Dept. of Mathematical Sciences, Rice University, Houston, T X 77251, U S A , 1990. [97] W . F . Tinney and J . W . Walker. Direct solution of sparse network equations by optimally ordered triangular factorization. In Proceedings of IEEE 55, volume 55, pages 1801-1809, 1967. [98] M . J . Todd. Recent developments and new directions in linear programming. In M . Iri and K . Tanabe, editors, Mathematical Programming : Recent Developments and Applications, pages 109-157. Kluwer Academic Press, Dordrecht, The Netherlands, 1989. [99] M . J . Todd. Potential-reduction methods in mathematical programming. Technical Report, School of Operations Research and Industrial Engineering, Cornell University, Ithaca, N Y 14853, 1995. [100] J . A . Tomlin. A n experimental approach to Karmarkar's projective method for linear programming. Mathematical Programming Study, .31:175—191, 1987. [101] J . A . Tomlin and J . S. Welch. Finding duplicate rows in a linear program. Operations research letters, 5:7-11, 1986. [102] P. M . Vaidya. +(m + n) nL) 15 A n algorithm for linear programming which requires 0((m arithmetic operations. + n)n 2 Mathematical Programming, 47:175-201, 1990. Condensed version in : Proceedings of the 19th Annual ACM Symposium on Theory of Computing, pages 29-38, 1987. Bibliography 103 [103] R. J . Vanderbei. Affine-scaling for linear programs with free variables. Programming, 43:31-44, 1989. Mathematical [104] R. J . Vanderbei. Splitting dense columns in sparse linear systems. Linear Algebra and Its Applications, 152:107-117, 1991. [105] R. J . Vanderbei and T . J . Carpenter. Symmetric indefinite systems for interior point methods. Mathematical Programming, 58:1-32, 1993. [106] R. J . Vanderbei, M . S. Meketon, and B. A . Freedman. A modification of Karmarkar's linear programming algorithm. Algorithmica, l(4):395-407, 1986. [107] W . Wang and D . P. O'Leary. Adaptive use of iterative methods in interior point methods for linear programming. Technical Report, Applied Mathematics Program, University of Maryland, College Park, M D 20742, 1995. [108] M . H . Wright. A brief history of linear programming. SIAM News, 18(3):4, November 1985. [109] M . H . Wright. Interior methods for constrained optimization. In A . Iserles, editor, Acta Numerica 1992, pages 341-407. Cambridge University Press, New York, U S A , 1992. [110] H . Yamashita. A polynomially and quadratically convergent method for linear programming. Working Paper, Mathematical Systems Institute, Inc., Tokyo, Japan, 1986. [Ill] M . Yannakakis. Computing the minimum fill-in is NP-complete. SIAM Journal on Algebraic Discrete Methods, 2:77-79, 1981. [112] Y . Ye. A class of potential functions for linear programming. Management Science Working Paper 88-13, Dept. of Management Science, University of Iowa, Iowa City, IA 52242, U S A , 1988: [113] Y . Ye. A n 0(n L) 3 potential reduction algorithm for linear programming. Mathematical Programming, 50:239-258, 1991. [114] Y . Ye. Interior point algorithm: Theory and practice, lecture notes, Department of Management Sciences, The University of Iowa, 1995. [115] Y . Ye and M . Kojima. Recovering optimal dual solutions in Karmarkar's polynomial algorithm for linear programming. Mathematical Programming, 39:305-317, 1987. [116] D . Yudin and A . S. Nemirovsky. Informational complexity and efficient methods for the solution of convex extremal problems. Ekonomika i Mathematicheskie Metody, 12:357- 369, 1976. English translation: Matekon 13(2), 3-25. [117] Y . Zhang and D . Zhang. On polynomiality of the Mehrotra-type predictor-corrector interior-point algorithms. Mathematical Programming, 68:303-318, 1995.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- The interior point method and its application to the...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
The interior point method and its application to the fractional (g.f)-factor problem Shi, Ping 1996
pdf
Page Metadata
Item Metadata
Title | The interior point method and its application to the fractional (g.f)-factor problem |
Creator |
Shi, Ping |
Date Issued | 1996 |
Description | This thesis contains six chapters. From Chapter 1 to Chapter 5, we give an exposition of interiorpoint methods (IPM). In Chapter 1 a brief survey of the recent development in interior-point methods is given. Chapter 2 gives a detailed description on the history of the primal-dual pathfollowing methods for linear programming. Chapter 3 introduces several predictor-corrector methods. This is an efficient and powerful class of interior-point methods and a lot ofattention has been paid during recent years. Chapter 4 discusses the various efficient linear algebraic methods which are used for solving the Newton equations system. Practically, most of the computational effort in implementations of interior-point methods is taken up in solving the Newton equations system. Therefore an efficient method for solving the Newton equations system can make the whole algorithm very efficient. Chapter 5 talks about the presolving procedure which is used to reduce the size of the LP problem and to make the system sparser before passing the problems to optimization. The presolve procedure is a very efficient adjunct for any LP solver. In chapter 6, we apply the interior-point method to the fractional (g,f)-factor problem, we first exploit the special structure of the fractional (g,f)-factor problem and propose two IPM algorithms to take advantage of the special structure of the fractional (g,f)-factor problem. In each of the two algorithms, we use a different linear programming model for the fractional (g/f)-factor problem in order to make the corresponding algorithm more efficient. Then, we discuss three starting point approaches and the strategy for creating our test problems. Last, we use the high order predictor-corrector primal-dual path-following code HOPDM[45] to test the three starting point approaches and from the computational results we get the conclusion that a fractional (g/f)-factor problem-specific starting point can save iterations of the IPM and make the IPM more efficient for solving fractional (g,f)-factor problem. |
Extent | 4528763 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
File Format | application/pdf |
Language | eng |
Date Available | 2009-03-14 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0079841 |
URI | http://hdl.handle.net/2429/6038 |
Degree |
Master of Science - MSc |
Program |
Mathematics |
Affiliation |
Science, Faculty of Mathematics, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 1996-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- 831-ubc_1996-0493.pdf [ 4.32MB ]
- Metadata
- JSON: 831-1.0079841.json
- JSON-LD: 831-1.0079841-ld.json
- RDF/XML (Pretty): 831-1.0079841-rdf.xml
- RDF/JSON: 831-1.0079841-rdf.json
- Turtle: 831-1.0079841-turtle.txt
- N-Triples: 831-1.0079841-rdf-ntriples.txt
- Original Record: 831-1.0079841-source.json
- Full Text
- 831-1.0079841-fulltext.txt
- Citation
- 831-1.0079841.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0079841/manifest