UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

A robust linear program solver for projectahedra Laza, Marius 2002

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata


831-ubc_2002-0147.pdf [ 3.73MB ]
JSON: 831-1.0051539.json
JSON-LD: 831-1.0051539-ld.json
RDF/XML (Pretty): 831-1.0051539-rdf.xml
RDF/JSON: 831-1.0051539-rdf.json
Turtle: 831-1.0051539-turtle.txt
N-Triples: 831-1.0051539-rdf-ntriples.txt
Original Record: 831-1.0051539-source.json
Full Text

Full Text

A Robust Linear Program Solver for Projectahedra by Marius Laza M.Sc, Polytechnic Institute of Bucharest, 1991  A THESIS SUBMITTED IN PARTIAL F U L F I L L M E N T OF THE REQUIREMENTS FOR THE D E G R E E OF M a s t e r of Science in THE FACULTY OF GRADUATE STUDIES (Department of Computer Science) We accept this thesis as conforming to the required standard  The University of British Columbia December 2001 © Marius Laza, 2001  In p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t of the requirements f o r an advanced degree at the U n i v e r s i t y of B r i t i s h Columbia, I agree t h a t the L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and study. I f u r t h e r agree t h a t p e r m i s s i o n f o r e x t e n s i v e copying of t h i s t h e s i s f o r s c h o l a r l y purposes may be granted by the head of my department or by h i s or her r e p r e s e n t a t i v e s . I t i s understood t h a t copying or p u b l i c a t i o n of t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l not be a l l o w e d without my w r i t t e n p e r m i s s i o n .  Department of The U n i v e r s i t y of B r i t i s h Columbia Vancouver, Canada  Abstract Linear programming has a wide range of applications, optimization-related problems being one of them. Important concerns in linear programming are efficiency, robustness, and accuracy. Linear programming is used in a reachability analysis tool called Coho [GM99] for dynamical systems. Previous experience has shown that linear programs in this tool lead to highly ill-conditioned linear systems which prevented successful reachability analysis. This thesis presents a robust linear program solver with provable error bounds that exploits the special structure of the linear programs that result in the reachability tool. This contribution is of interest for the particular application for which it was developed. Furthermore, it shows how duality and combinatorial aspects of linear programming can be exploited to achieve greater efficiency, robustness, and accuracy.  u  Contents Abstract  11  Contents  "i  List of Figures  vi  Acknowledgements  vii  Dedication 1  viii  Introduction  1  1.1  Motivation  1  1.2  Contribution  3  1.3  Outline  3  2 Background  3  6  2.1  Projectahedra  6  2.2  Verification as Reachability  7  2.3  Coho  8  Linear Programs  15  3.1  Problem Definition  15  3.2  Linear Programs in Standard Form  21  3.2.1  21  Feasible Region iii  3.3  4  21  3.2.3  Pivoting  22  3.2.4  Cycling  24  3.2.5  T h e Simplex Tableau  25  Linear Programs in Coho Form  26  29  4.1  Lazy Tableau Generation  31  4.2  Efficient Computation of Tableau Columns  32  4.2.1  Reduction to a Cycle  32  4.2.2  Solving a Cycle  34  Running Error Analysis  36  Analytical Attack on the Error 5.1  5.2  6  Bases  Combinatorial Simplex for Coho  4.3  5  3.2.2  39  Error Bound on Cycle Solution  41  5.1.1  Algorithm for Solving Cycles  41  5.1.2  Estimation of Cycle Condition Number  44  5.1.3  Error Bound on Solution to Cycle  46  5.1.4  Summary  51  Estimation of Optimal Cost  52  5.2.1  Use of Nonoptimal Basis  55  5.2.2  Error Introduced by Dropping One Constraint  59  5.2.3  Use of the Bounding Box  65  5.2.4  Error Bound for Coho Cycles  72  5.2.5  Summary  80  Implementation  82  6.1  F i n d i n g an Initial Invertible Basis  82  6.2  Finding an Initial Feasible Basis  85  6.3  Dealing with Uncertainty and Avoidance of Cycling  87  iv  6.4 Conserving Structure after Moving Forward in Time 7  91 93  Conclusions 7.1  What has been Accomplished  93  7.2  Suggestions for Further Research  95  Bibliography  97  Appendix A Definitions and Notations  99  A.l  Notations  99  A.2 Definitions  100  v  List of Figures 2.1  A three dimensional "projectahedron"  7  2.2  T h e creation of a bloated linear program for the convex hull of the projectahedron  9  2.3  A time step of Coho  10  3.1  Cases of L P feasible region and cost  18  3.2  Linear program with two optimal vertices  26  3.3  Types of vertices in a Coho L P  28  4.1  Non-zero structure of a cycle  34  4.2  Cycles in a matrix  34  5.1  Types of optimal 2D vertices  56  5.2  Halfline emanating from inside a box  70  5.3  T h e best approximating vertex  73  6.1  Subgraph that corresponds to an structurally singular matrix.  6.2  Subgraph that corresponds to an invertible matrix  vi  . . .  83 84  Acknowledgements T h i s thesis would not have been possible without substantial help from several individuals. It is difficult to overstate my gratitude to my supervisor, Mark Greenstreet, for his extensive support and encouragement. I wish to thank David Kirkpatrick, James Varah, and E l d a d Haber for their contribution of time and ideas to my thesis. I am grateful to A l a n H u for the encouragement and advice that he provided to me when I needed them the most. I would like to thank my wife, M i r a , for constantly supporting me and for believing in my eventual success even when I despaired.  MARIUS  The University of British Columbia December 2001  vii  L A Z A  T o my parents, who wanted it more than anybody else.  viii  Chapter  1  Introduction 1.1  Motivation  T h e problem of verification is that of showing that a design satisfies its specification. T h e design may be of an electronic circuit, a computer program, a network or security protocol, a chemical plant, an airplane, etc. For our purposes, the specification describes the desired behaviors of the design: that the circuit implements a particular finite state machine, that the security protocol does not disclose passwords, that the chemical plant does not explode, etc.  T h e goal of formal verification is  to produce a formal, mathematical proof that the design has the desired properties. For this approach, both the design and the specification must be modeled in a mathematical framework where such a proof is mathematically meaningful. T h i s thesis is concerned with verification where the design is modeled by a system of non-linear, ordinary differential equations description of the possible initial states of the system.  (non-linear O D E s ) and a T h e specification  describes  a "safe" region in which the trajectories for all solutions to the model must be contained.  T h e verification task is to determine whether from a possible initial  state the system can ever reach a forbidden state or not. T h i s type of verification is termed reachability analysis. For non-trivial systems, reachability analysis tools must use approximation  1  techniques:  closed-form solutions do not exist.  scribed in this thesis, is one such tool.  Coho, the verification system de-  It computes over-approximations of the  reachable space to provide a sound verification of safety properties: Coho may fail to verify a correct system, but it will not erroneously verify an incorrect system. A s part of computing the evolution in time of the reachable state space of a system, Coho solves a large number of linear programs. T h e soundness of Coho relies on computing accurate error bounds for the solutions of these linear programs.  To  avoid false negatives, it is desirable that these solutions be as accurate as practical. Linear programming is a well-studied problem that has a classical solution, namely the Simplex algorithm. O n any real machine, errors are an inherent part of floating-point computations.  In this thesis, we address the impact of numeri-  cal errors from floating-point computations on the accuracy and robustness of the Simplex algorithm, as applied to our verification system. T h e effect of errors in the input and in the intermediate computations on the result of a problem is measured by its conditioning. T h e result of an ill-conditioned problem may be affected by large errors even though the errors i n the input or in the intermediate computations are small. Many of the linear programs that arise in Coho are ill-conditioned problems. For such problems Simplex tends to yield solutions that contain large errors, compromising the applicability of Coho.  Moreover, no error bounds are available on  the solutions, preventing the tool from producing a guarantee of correctness for the system being analyzed. A noteworthy property of Coho linear systems is that their structure is special. More precisely, the feasible region of a Coho linear program is the intersection of set of orthogonal back-projections into the full-dimensional space of 2D polygons. T h i s means that any inequality in the definition of the feasible region contains only two variables. T h i s thesis explores ways of exploiting this special structure in order to obtain greater accuracy and robustness while keeping efficiency reasonable.  2  1.2  Contribution  The special structure of the linear programs arising from orthogonal projections allows the implementation of an efficient and numerically robust version of the Simplex algorithm. The main contributions of this research are the following: 1. A n implementation of Simplex where the combinatorial representation of bases remains explicit. • The key to practicality is an 0(n) linear system solver, where n is the number of variables in the system. 2. Numerical robustness achieved by computing accurate error bounds. • The combinatorial approach above allows us to avoid numeric error propagation between steps, thus keeping the error bounds reasonably tight. • When the optimal basis is ill-conditioned, it is shown that a pivot can be made to another basis that has nearly the same cost and is wellconditioned with respect to the cost function. 3. A n error bound for the optimal cost that is independent of the numerical value appearing in the constraints. 4. Implementation  1.3  Outline  This thesis presents an efficient and numerically robust method of solving the linear programs that arise in the Coho verification tool.  3  • Chapter 2 presents the context in which linear programs with a particular structure arise. First, the reachability analysis class of problems and its applicability to verification are introduced.  T h e n a system that  implements  reachability analysis is described with emphasis on its use of linear programs. Finally, the special structure of these linear programs and their impact on the usability of the system are underlined. • Chapter 3 reviews the Simplex algorithm for solving linear programs. B o t h the geometric and the combinatorial aspects of the problem are presented.  The  concepts of basis, pivoting, duality, and cycling receive particular attention. T h e standard implementation of the Simplex algorithm is presented briefly, pointing out its problems in the case at hand. • Chapter 4 presents the main features of the proposed modified version of Simplex.  These include the lazy computation of tableau columns and the  linear-time algorithm that accomplishes the computation.  T h e presentation  of the method of computing error bounds on the results of the floating-point operations proposed to be used by the linear program solver concludes the chapter. • Chapter 5 reviews the numerical accuracy and stability of the new algorithm for solving the particular type of linear systems that arise i n Coho. A n error bound on the solution to such linear system as computed by the new algorithm is established. T h e rest of the chapter focuses on ways of obtaining a good approximation of the optimal cost of a linear program. T h e geometrical meaning of an optimal ill-conditioned basis is analyzed and a method of approximating its cost is proposed and then analyzed.  T h e n the application of this method to the  Coho linear programs is studied. • Chapter 6 presents the implementation details that were found to be significant  4  during algorithm implementation.  A chapter of conclusions and suggestions for further research completes the core of the thesis, followed by an appendix containing the definitions of the mathematical notations and definitions used in the thesis.  5  Chapter  2  B a c k g r o u n d  2.1  Projectahedra  A projectahedron is a high-dimensional polyhedron represented by its projections onto two-dimensional subspaces, where these projections are not required to be convex. T h e high-dimensional object is the largest set of points that satisfies each projection. A full-dimensional polyhedron can be obtained from its projections by back-projecting each into a prism in R  d  and computing the intersection of those  prisms. E a c h (1-dimensional) edge of a projection polygon corresponds to a (d — 1dimensional) face of the projectahedron. T h e intersection computation for arbitrary polyhedra in high dimensions is computationally hard. Projectahedra are a restricted class of high-dimensional polyhedra and the complexity of computing the intersection of projectahedra does not appear to have been studied. In the work described in this thesis, the intersections of projectahedra are never explicitly nor exactly computed. Instead, operations on projectahedra are performed projectionwise.  Sometimes this leads to an overap-  proximation of the result projectahedron. We choose these operations in such a way as to preserve the soundness of Coho. Whereas the projectionwise computation of common projectahedra operations like union typically leads to overapproximation, the same method yields the exact result in the case of intersection.  6  2.2  Verification as Reachability  Consider a system whose dimension (i.e. number of variables) is d. The continuous state space of the system is R . Suppose the behavior of the system is described by d  the differential inclusion:  x  G  F(x)  where x G R . The inclusion models uncertainty in the model, environment etc. d  Given two regions, AQ C B C R , the reachability problem is to determine d  whether all trajectories that start m AQ aX t = Q remain in B, either during some time interval, [0,t [], or for all time. enc  For example, we can find At C R such that x(t) G A . d  t  The reachability  problem is satisfied if A C B, Vt G [0, t /]. t  en(  A related problem is the following: given a time t\ and a region A\, show that at t = ti, all trajectories are inside Ai. This can be reduced to the first problem by including time in the state with i = l [AL94]. 7  Many verification problems can be formulated as reachability analysis problems. Consider for example a system consisting of two aircraft [TPS98]. Given the possible initial positions of the aircraft and their equations of motion, the question is whether the distance between the two aircraft remains above a lower bound for all times of interest. Modeling each aircraft as a point in R , the state of the system 3  is a point in R . T h e safety requirement partitions R 6  6  into safe and unsafe regions.  Solving the verification problem boils down to determining whether At intersects the unsafe part of R  6  for any time t of interest.  A n important problem in modern circuit design is determining whether a circuit correctly implements its high-level specification. Reachability analysis can be used for verifying that circuits, as modeled by non-linear O D E ' s , correctly implement discrete specifications.  2.3  Coho  Coho is a verification system that performs reachability analysis [GM99]. Closed form solutions to reachability problems exist only for a few special cases. tems.  Consequently, approximation techniques are used to analyze real sys-  These techniques ensure that the approximations always lead to an over-  approximation of the reachable space.  Every point that actually is reachable is  included in the approximation computed by Coho.  T h e approximation may also  include points that cannot be reached by the real system.  Thus, the verification  performed is sound - an incorrect design will never pass verification, but a correct one might fail it because of the approximations. A general representation of high-dimensional objects is intractable. For this reason, Coho uses projectahedra to approximate high-dimensional objects, such as the initial region and the reachable regions at various times. T h e Coho reachability computation is an iterative, computation algorithm. A single time-step of this algorithm proceeds as follows [GM98]:  8  Projections ft  0  -1 01 , r  roi  ro.i  LPbloat,j/^  Figure 2.2: The creation of a bloated linear program for the convex hull of the projectahedron  9  EdgeLP  bloat  "-1 0 - 1 " 1 0 1 = 1 0 -1 -1 0 1  "-2" 2 < 0 2  X  V z  "0.1" 0.1 0.1 0.1  project  assemble edge projections  x G F(x)  F(x)  C  {p  |x e  | 3u G  F(x)  LPedge  b l o a t  U s.t. ii = Ax + b + A,b,u  Figure 2.3: A time step of Coho  10  u}  1. The time step begins by loading a polygon and its convex hull for each projection of the system. The convex hulls are then bloated outward slightly to ensure that they contain all possible trajectories for the next time step. Each projection's bloated convex hull can be translated into a set of linear inequalities in the projection's two coordinates. The conjunction of all the projections' linear inequalities describes a convex region containing the projectahedron. At this point, the movement of each edge of each projection's polygon can be computed independently. Each edge corresponds to a face of the projectahedron, and the objective is to compute the furthest outward that points on the face could move during a time step. For each face, the following computations occur: 2.  (a) Restriction: The convex region computed from the convex hulls is further restricted to a box around the edge in the coordinates of the edge as described by four more linear inequalities. In the full dimensional space this is equivalent to constructing a slab around the face being examined. The slab is a conservative estimate of the convex hull of the bloated face. (b) Linearize Model: The slab's description is available in terms of the collection of linear inequalities computed in the previous step. The derivative function for the model is assumed to be autonomous (i.e. independent of time) and finitely piecewise continuous (therefore locally bounded). A linearization of the system derivatives that is valid in the slab is computed. This model includes linear and constant terms, and gives bounds on the error introduced by the linearization within the slab. Typically, this linearization is based on bounds for the variables in the model and bounds on linear combinations of these variables. These bounds are computed by solving the corresponding linear programming problem. More formally, let W be the slab represented as a set of inequalities. The  11  non-linear model is approximated with the differential inclusion:  xeW=>ieAx where A G R  d x d  +b+U  (2.1)  is a matrix, b G R is a vector and U G (R x R) is a d  d  hypercube (i.e. a Cartesian product of intervals). (c) Advance Time: The linear model is used to move the slab forward in time according to the first two terms of equation (2.1). The U term is handled as an inhomogeneous stimulus to the system as described in step 2e. The forward time transformation is performed by exponentiating the A matrix. This new convex region contains any point reachable from the convex hull of the face at the end of the time step (ignoring U). Moving the slab model forward in time amounts to right multiplying the left-hand side of its inequalities by a matrix that transforms points at the end of the time step back to their location at the beginning of the step and also modifying their right-hand side. The application of matrix multiplication to the left-hand side would lead to the modification of its structure. For reasons of computational efficiency, the implementation transforms the cost function instead of the inequalities. This is described in detail in chapter 6. (d) Project Back: The slab's end-of-step shape is described by a collection of linear inequalities after time is advanced. Building a polygon from these inequalities requires projecting the region that they contain back onto the basis for the projection polygon corresponding to the face. This projection is computed by running a series of linear programs on the time-advanced set of inequalities. The cost functions used by the linear programs are directions contained in the plane of the polygon. (e) Add Errors: So far, the slab's movement is entirely controlled by the linearized model. To treat the error, we add a constant derivative offset 12  within the error bounds throughout the time step, in such a way as to bloat the slab's projection outward as much as possible. T h i s involves the computation of bounds on of  over the slab.  ||ar||1  A n over-approximation  is computed based on the extremes of each i j over the slab.  Computing these extremes again involves solving linear programs. 3. Each edge of each projection's polygon therefore produces an "edge polygon" at the end of the time step; this polygon contains the projection of all points that could be reached from the corresponding face within the time step. T h e outer boundary of the union of all such polygons is the projection of an overapproximation of the projectahedron at the end of the time step. Clearly the use of linear programming by Coho is heavy.  In fact, a major  limitation to the applicability of this tool stems from the failure of the classical Simplex algorithm to compute sufficiently accurate solutions to the linear programs that arise in Coho. E a c h vertex of the feasible region of a linear program lies at the intersection of d hyperplanes, where d is the dimension of the space. If the normal to at least one of these hyperplanes is almost a linear combination of the normals to the other hyperplanes, the vertex is ill-conditioned: the use of a typical implementation of Simplex or other L P algorithm for the determination of the vertex position leads to a result likely to be affected by large errors. However, the feasible region of any Coho linear program is special. A s any convex polyhedron, the feasible region can be described by the matrix inequality:  Ax > b where A  GR  d x d  and b G R . d  E a c h row of this inequality represents a halfspace that  corresponds to at most one face of the polyhedron. A s the feasible region represents the intersection of back-projections into R  d  of two-dimensional convex polygons,  each row of matrix A contains at most two non-zero elements. 13  T h e above observations lead to the idea of exploiting the special structure of the feasible region of Coho linear programs in order to solve them more accurately, thus enhancing the usability of the system.  14  Chapter 3  Linear Programs Linear programs play an important role in the Coho system. T h i s thesis presents a method for computing better solutions to the particular category of linear programs that arise in Coho. T h i s chapter introduces the mathematical description of linear programs and of an algorithm to solve them.  T h e n , section 3.3 describes the particular linear  programs that arise in Coho.  3.1  Problem Definition  Definition 1 Let m, n be positive integers, A G R bGR  m  m X T l  an m x n matrix of reals,  an m-vector of reals, c G R™ an n-vector of reals, M C { 1 , . . . , m) a set of  indices of rows of matrix A, N C { 1 , . . . , n} a set of indices of columns of matrix A.  Let M = { 1 , . . . , m} \ M and N = {!,...,n}\ N. Let s G {+1, - 1 } .  15  Then the following problem is an instance of a general linear program:  min sc x subject to Ai,-,x = bi  ieM  Ai -x > bi  i  t  XJ  > 0  G  (3.1)  M  JEN j G N  Xj unconstrained  Such an instance of the general linear program is denoted by L~P(A, b, c, M, N, s).  Column matrix c is called the cost vector or the optimization direction of the linear program. The value x t for which the minimum is attained, if it exists, is called the op  optimal solution of the linear program. The value sc x T  opt  is called the optimal cost of the linear program.  The sign s specifies whether the problem is one of minimization (+1) or one of maximization (—1). The following trivial transformations enable the reduction of other forms of linear programs to the one above: • A maximization problem can be turned into one of minimization by negating the cost vector: • T maxc T x = — min— cx  • A n inequality of the form: a x < b, T  a,x  G  R™, b G R  is equivalent to: —a x> —b T  16  A point x G R  n  that satisfies all the constraints of the linear program is  called a feasible point of the linear program. T h e set of all feasible points represents the feasible region of the linear program, denoted by feas(LP). A linear program is called feasible if its feasible region is non-empty. Otherwise it is called infeasible. As an intersection of convex sets (hyperplanes and closed halfspaces), the feasible region of an L P is a convex set. A n optimal solution lies on the boundary of the feasible region. In general, the optimal solution might not be unique. Consider the trivial case with c — 0: all feasible points are optimal. Let x t be an optimal solution to a linear program. If the affine subspace o p  that is normal to the cost vector and contains x  o p t  contains other feasible points,  they too are optimal. Recall that a general linear program was defined as a problem of minimization. T h i s means that the optimal point is the feasible point that lies the farthest in the negative direction of the cost vector (see fig. 3.1). If a linear program consists of a minimization problem and its feasible region is unbounded in the negative direction of the cost vector, then the cost function can take arbitrarily large negative values and the linear program is said to be unbounded. If the feasible region of a linear program is non-empty and bounded i n the negative direction of the cost vector, then the linear program has a finite optimum. A n important particular case of a general program is when M = { 1 , . . . , m} and N — { 1 , . . . ,n}, i.e. when all constraints are equalities and all variables must be positive: min c x T  Ax = b  (3.2)  x>0 Such a linear program is said to be in standard form and is denoted by S L P ( A , 6, c):  S L P ( A , b, c) = LP(A, b, c, { 1 , . . . , m}, { 1 , . . . , n}, +1)  17  Figure 3.1: Types of maximization linear programs: a) bounded, non-empty feasible region; b) unbounded feasible region bounded in the optimization direction; c) unbounded feasible region unbounded in the optimization direction; T h e arrow indicates the optimization direction, which for maximization problems coincides with that of the cost vector. T h e shading indicates the outer (infeasible) side of each line.  Throughout the rest of the discussion about linear programs it is assumed that  m < n and rank(A) = m. If rank(A) < m, then m — rank(A) rows of [A\b] can be deleted without changing the problem. A linear program in standard form is amenable to solution using the Simplex algorithm [PS82, p.26]. A linear program in general form can be reduced to standard form by using the following straightforward transformations: • A variable Xj that is unrestricted as to sign can be replaced with the difference of two non-negative variables:  Xj = xj - xj,  • A n inequality constraint  Y%=i  xj > 0, xj > 0  ^-ii i — ^ x  c  a  n  D  e  converted into the equation:  n  ^ ] AijXj + Sj = b j ,  Sj > 0  i=l  T h e variable s, is called a surplus variable. T h e similar transformation for a "less-than" constraint introduces a slack variable.  18  A l l linear programs that arise in Coho are of the form: max c x (3.3)  Ax>b x unconstrained  Throughout this thesis this form of linear program is termed Coho form. A linear program in this form is called a Coho linear program, denoted by CLP(^4, b, c). The following equation relates a Coho linear program to a general linear program:  CLP(_4,6,c) = LP(_4,&,c, 0,0,-1) A Coho L P can obviously be reduced to an L P in the standard form. Consider a Coho L P that has / inequalities and d variables. Each inequality and each variable in the original system requires the addition of an extra variable in the equivalent L P in standard form. The equivalent system would have d + / + d variables and / equations. Moreover, the special structure of the original L P would be destroyed by the transformation. However, the Coho L P can be solved without reducing it to the standard form by using a general characteristic of linear programs called duality. For an L P in general form, called the primal, the following construction defines another LP, called its dual: Primal  Dual  min sc x  max sb y  T  T  Ai -x = bi  i GM  Ai, x >bi  i GM  t  :  yi unconstrained Vi>0  Xj > 0  j G N  A.. j y  Xj unconstrained  j GN  A j y = Cj  T  t  < Cj  :  The dual can also be rewritten in the following way:  19  (3.4)  eN  j  (3.5)  o  i eM  yi unconstrained  i EM  Vi >  T h i s is to say that:  d u a l ( L P ( A , b, c, M, N, s)) = L P (  A ,-c, T  b,N,M,-s)  T h e attributes "primal" and "dual" are interchangeable: the dual of the dual is the primal. A n y primal-dual pair of linear programs has the following remarkable property: • If the dual has a finite optimum, then so does the primal and their optimal costs are equal. T h e optimal point of the dual can be easily computed from the optimal point of the primal and vice versa. • If the dual is infeasible, then the primal is either infeasible or unbounded. • If the dual is unbounded, then the primal is infeasible. T h i s property means that the solving of an L P can be replaced with the computation of the solution to its dual, with almost no loss of information. T h e only case where a precise verdict cannot be given for the primal is when the dual is infeasible.  However, in many cases, knowing that a linear program doesn't have a  finite optimum suffices. In fact, in the systems we are examining, the linear programs that arise cannot be unbounded.  20  The dual of a Coho linear program is easily seen to be a linear program i n standard form: dual(CLP(A,6,c))  =  dua_(LP(_4,6, c, 0 , 0 , - 1 ) )  =  LP(-A ,  =  LP(-A ,-c,b,{l,...,f},{l,...,d},+l)  - c , b, { 1 , . . . , /} \ 0, { 1 , . . . , d} \ 0, +1)  T  T  =  SLP(-A ,-c,b)  =  SLP{A ,c,b)  (3-6)  T  T  Therefore the solution to CLP (A, b, c) can be obtained by solving S L P ( A , c, b). T  It is clear that no variables are added and the structure of matrix A remains intact.  3.2 3.2.1  Linear Programs in Standard Form Feasible Region  Consider an instance of a linear program i n standard form S L P ( J 4 , b, c) with / variables and d equations.  It will be seen later that this S L P corresponds to a poly-  hedron with / faces i n the d-dimensional space, hence the new choice of letters for the dimensions of the linear program. T h e feasible region of S L P is the portion of a ci-dimensional affine subspace of R^ that lies inside the non-negative orthant. If the feasible region is non-empty, then an optimal point exists at the intersection of this subspace with one of the positive semiaxes of H . d  3.2.2  Bases  A set of d linearly independent columns of matrix A is called a basis. A basis is described either as a set of column indices, more precisely called the basic set corresponding to the basis:  B = {JU---,Jd}  21  or through the restriction of the linear program's matrix A to the basic set of columns: B =  A  B  The terms "basic set" and "basis" are used interchangeably when there is no chance of confusion. The columns that belong to a given basis are called basic columns, whereas the others are called non-basic. Each column of matrix A of an L P in standard form corresponds to a variable. The attribute "basic" extends to variables in the natural way. The values of the basic variables are: to =  B~ b 1  The basic solution corresponding to a basis B is a vector x G R / obtained by expanding the vector of basic variables in the natural way:  Xi  =  0  if j 0 B  to,k  ifj  <  = Bk  If an L P in standard form has an optimal solution, it also has a basic optimal solution. A basic solution that has no negative components represents a feasible point for the L P and is called feasible. Otherwise it is called infeasible. The attribute "feasible" extends to bases in the natural way. The situation in which a basic variable is equal to 0 is called degeneracy. The corresponding basis and the basic solution are said to be degenerate. More than one basis can correspond to the same degenerate solution, all such bases being degenerate.  3.2.3  Pivoting  A well-known algorithm for solving linear programs is called Simplex. Simplex operates on linear programs in standard form. 22  In addition to the description of the linear program to be solved, Simplex must be supplied with a feasible basis for that program. Finding a feasible basis is non-trivial, but it will be dealt with later. Simplex is a greedy algorithm. During each step, it tries to replace one of current basic columns with a new column in order to obtain a new feasible basis of lower cost. The search ends at the optimal basis, which is the cheapest feasible basis. If a non-degenerate feasible basis is not optimal, then there exists at least one non-basic column whose introduction into the basis results in a decrease in the cost. Let tj be the column vector defined by:  tj =  B-%  The quantity —  j  c  =  T  j ~ B^j  C  c  is called the relative cost of column j with respect to basis B. The introduction of column j in basis B might be favorable (reduce the cost) if the relative cost of the column is negative. Once a column with negative relative cost is found, the algorithm must determine which column to evict from the basis. The decision is guided by the requirement that the new basis must be feasible and is accomplished by the following computation: ,  .  *0,i  k = arg mm —-  i tj,i  where k is the index of the column to be evicted. In the presence of degeneracy k may not be uniquely defined. The action of moving from one feasible basis to another is called pivoting. The cost decrease achieved by pivoting as described by the computations above is  to,k '•j,k 23  The following observations are in order regarding pivoting: • If tjj < 0, Vi = 1 , . . . , d, the feasible region of the linear program is unbounded in the negative direction of the cost vector. Feasible points of arbitrary low cost exist. • In the presence of degeneracy, the quantity  can be 0 and so can the  decrease in cost: EH, j s.t.  t ,i 0  = 0 A CJ  <  0At  j:i  > 0 => —  = 0 =>  — C j  = 0  This is equivalent to a change of basis without a change in the basic solution. It is said that the new column  enters  the  basis  at  zero  level.  In the absence of degeneracy, Simplex works because: • There exists a finite number of feasible bases. • At every step the cost decreases monotonically, ensuring that a basis is never visited again. • The optimal solution is among the basic feasible solutions. Simplex is by no means guaranteed to produce the shortest path from an initial feasible basis to the optimal basis, but it typically performs well in practice.  3.2.4  Cycling  In the presence of degeneracy, Simplex is no longer guaranteed to work with any choice of a favorable column. It is possible that, once arrived at a degenerate basis, the algorithm takes a sequence of favorable pivots to subsequent degenerate bases. These pivots do not decrease the cost and the algorithm can eventually return to the first degenerate basis. Obviously, the algorithm can loop indefinitely through a set of degenerate bases that all yield the same solution unless special precautions are taken. This phenomenon is called cycling. 24  Cycling occurs when the cost function is a positive combination of less that d columns of a basis. Cycling avoidance is achieved by Bland's anticycling algorithm [Bla77], [PS82, p.50] that chooses the lexically first pivot at first step: • T h e columns that enters the basis is the lowest numbered one. • In case of a tie in the computation of the column that leaves the basis, the lowest numbered column is selected.  3.2.5  The Simplex Tableau  A t each step, the Simplex algorithm makes pivoting decisions based on the values in the matrix  T = B~ [A\b] 1  T h e matrix T is called the Simplex tableau. For simplicity, its last column is indexed by 0. Computing the Simplex tableau from the input data every time pivoting occurs would render Simplex prohibitively expensive: f — d right-hand sides takes 0(S(f  solving a d x d linear system with  — d)) in the general case.  In practice, this expensive solution is replaced with the computation of each new tableau from the previous one at the lower cost of adding one row to each of the others  (0(d(f - d))).  T h e tableau corresponding to the initial basis still needs  to be computed from the initial data. Let B' be the basis obtained by replacing column j = B(l) with column j' in basis B and T and T' the tableaus corresponding to bases B and B',  respectively.  T h e n T' can be obtained from T through the following operations:  T' .  =  T'i,  =  (  Ti./Tiji Ti.-Tij.T't,  T h e use of a tableau presents the disadvantage that numerical errors accumulate as the algorithm proceeds. 25  Figure 3.2: Linear program with two optimal vertices  3.3  Linear Programs in Coho Form  T h e Coho form of a linear program can offer more insight, particularly as regards the geometric meaning of linear programs. Consider a primal linear program in Coho form with d variables and / inequalities C L P ( A , C  A  s  = (A ) , C  T  b, c) c  c  and its dual in standard form  SLP(A  S  ,b ,c ), s  s  where  b = c, c = b. s  c  s  c  T h e feasible region of a Coho L P is a closed convex polyhedron:  feas(CLP(A ,6 ,c )) = PH(^ ,6 ) c  c  c  C  C  The optimal point of a Coho L P is a vertex of the feasible region. A s illustrated by fig. 3.2, the optimal vertex might not be unique. For example, all the points in a hyperplane normal to the cost vector have the same cost. If the vector of the cost function is normal to a face of the polyhedron and oriented towards its interior, all the vertices on that face are optimal. Each constraint in the primal defines a halfspace whose boundary is a hyperplane. E a c h such hyperplane contains a face of the feasible region, unless it is redundant. E a c h row of the primal A  c  (i.e. each column of the dual A ) s  represents  a normal to a face of the feasible region oriented towards the interior of the feasible region. Such a normal is called an inward face normal. 26  A basis in the standard-form dual represents a set of d halfspaces in the Coho primal. T h e intersection of their boundaries determines a point in R , which d  is the primal solution associated with that basis. T h e  basic primal (Coho) solution  corresponding to a basic set B is:  Basic primal solutions will be termed  vertices  by abuse of terminology, as in general  they do not represent vertices of the feasible region of the Coho L P . Those of them that are actual vertices of the feasible region will be termed  proper vertices.  T h e intersection of the basic halfspaces of the Coho L P is a cone whose vertex is the basic primal solution. This cone represents the feasible region with respect to the constraints comprised in the basis and is called the  basic feasible cone.  A feasible basis of the standard dual represents a set of halfspaces of the Coho primal such that the primal cost vector is a positive combination of their inward face normals. In other words, the primal cost vector must lie inside the cone generated by the basic inward face normals. This cone is called the  basic cost cone. It  natural to consider both the basic cost cone and the primal cost vector originated at the basic primal solution. T h e bases that Simplex visits along the way to an optimal solution (other than the optimal one) represent feasible suboptimal, solutions in the standard form. In the Coho form, they represent infeasible supraoptimal solutions (see fig. 3.3 b). Monotonicity is preserved, however: the cost decreases monotonically in the standard form, whereas the infeasibility (expressed as the distance to the closest feasible vertex along the cost vector) decreases monotonically in the Coho form. T h e optimal vertex of an L P in Coho form is an intersection point of d halfspace boundaries that satisfies both of the following properties (see fig. 3.3 a): • T h e cost vector is a positive combination of the inward face normals. A l l non-optimal vertices lead to at least one negative component.  27  is  Figure 3.3: Types of vertices in a Coho L P : a) optimal; b) Coho-infeasible and standard-suboptimal, at least one constraint is violated; c) Coho-suboptimal and standard-infeasible, the cost vector does not lie within the cost cone • It satisfies all the constraints, i.e. it belongs to the feasible region. Any nonvertex intersection point breaks at least one constraint.  28  Chapter 4  Combinatorial Simplex for Coho A n important obstacle in the way of the verification of systems with moderately high dimensionality (5-20 variables) by Coho is the need to solve linear programs with sufficient accuracy. Coho allows that the solution to any L P be an overapproximation of the feasible region in the direction of the cost function, like in figure 3.3 b. Underapproximation (figure 3.3 c), however, is not allowed - otherwise Coho might incorrectly label faulty systems as correct. T h e amount of overapproximation must be kept low, or Coho might fail to verify correct systems. Coho has previously employed an implementation of the classical Simplex algorithm for its linear programming needs.  Oftentimes, the optimal solution to  a Coho L P represents the solution to a highly ill-conditioned linear system.  In  such cases the solutions computed by classical Simplex tend to contain substantial errors for which no bounds are provided, thus preventing Coho from functioning. In the previous implementation, these large errors often led to arithmetic exceptions preventing Coho from generating any results. From a purely mathematical point of view, Simplex works by taking favorable pivots until it reaches a basis from which no favorable pivot can be taken. T h i s basis is optimal and the solution that corresponds to it is the optimal solution of the linear program.  29  T h e mathematical view of Simplex implies that the arithmetic operations with real numbers are performed with infinite precision. Computers, however, use floating-point arithmetic, which uses only limited precision. T h e favorability of a pivot is determined based on the values in the Simplex tableau.  T h e errors that affect these values can lead to incorrect decisions about  the favorability of a pivot. T h i s in turn can result in incorrect determination of the optimal basis or in numerical cycling. Even if the optimal basis is determined correctly, the optimal solution, which is itself a tableau column, is affected by errors for which no bound is available. These problems are addressed as follows: T h e special structure of the bases that arise in Coho linear programs is exploited in order to make the computation of tableau columns directly from the input data feasible, thus reducing errors. T w o methods for determining error bounds on tableau columns are presented, one relying on the use of running error analysis and the other analytical. T h e analytical method is presented in the next chapter, with the rest of the aforementioned material forming the topic of the current chapter. Even with improved accuracy in the computation of the tableau columns and availability of error bounds, it is still possible that all the pivots from a basis are neither clearly favorable nor clearly unfavorable, which renders the basis neither clearly optimal nor clearly suboptimal. In such a case, branching of the computation path is used in order to guarantee the visitation of the optimal basis.  T h i s is  presented in detail in chapter 6. As discussed in chapter 3, it is advantageous to solve the dual of a Coho linear program instead of reducing the primal problem to standard form at the price of altering its structure. T h i s better solution is assumed to be used throughout the rest of this chapter. A t the same time, the fact that the matrices of the Coho linear program and of its standard form dual are identical up to a transposition enables us to refer to the original Coho L P when that is advantageous to understanding the  30  system. T h e remainder is structured as follows: T h e first section presents the idea of computing tableau columns only when access to them is required by the program. T h e n the linear-time computation of such a column is examined. T h e description of a technique called running error analysis, used in order to obtain error bounds on the solution, concludes the chapter.  4.1  Lazy Tableau Generation  Simplex arrives at the optimal solution by taking a series of favorable pivots. Taking a pivot amounts to identifying a non-basic column that replaces a column in the basis, thus producing a lower-cost feasible basis. Depending on how the selection of the column that enters the basis is made, the need to know some of the tableau columns might not arise during a particular pivoting operation. In particular, it is enough to discover the lowest-numbered favorable column: if this column is chosen to enter the basis, then knowledge of the higher-numbered columns of the tableau is unnecessary in the current step of Simplex and their computation can be omitted. T h i s policy, called lazy tableau generation, is the one followed in the version of Simplex tailored for Coho. T h e computation of a part of the tableau is thus avoided. It must be emphasized that with incomplete tableau generation, the necessary tableau columns are computed from the input data rather than from the incomplete tableau of the previous pivot. T h e only data that is passed from one pivot to the next is the new basis.  A basis is a collection of integers, so error  propagation and accumulation across pivots is eliminated. As mentioned in subsection 3.2.5, the computation of tableau columns from the input data is in general undesirable because of the high computational cost. For Coho linear programs however, a more efficient algorithm is available: a tableau column can be computed in linear rather than cubic time. T h i s makes the algorithm very practical.  31  4.2  Efficient Computation of Tableau Columns  Each inequality in a Coho linear program C L P ( A , b , c ) represents the halfplane C  c  c  corresponding to the backprojection of a side of a two dimensional polygon back into the full-dimensional space. As a result, each row of matrix A  c  contains either  one or two non-zero elements. The one non-zero case occurs when the polygon side is parallel to one of the coordinate axes that determine the plane that contains it. Hence matrix A  s  = {A ) of the standard-form dual of a Coho L P contains either C  T  one or two non-zeros in each column. Let B be an arbitrary basis of the linear program in standard form. As B represents a subset of the columns of matrix A , B is a square matrix that, like A , s  s  has either one or two non-zero elements per column. A tableau column is described by the equation: B- A .. , i f j V O l  s  tj  B~ b , l  s  if j = 0  The task at hand is to solve a linear system whose left-hand side is B. Such a linear system is henceforth referred to as a Coho linear system. Its solution can be determined in two stages that are described in the subsections that follow.  4.2.1  Reduction to a Cycle  Whereas the columns of B are restricted to containing 1 or 2 non-zero elements, the rows of B are not under a similar constraint. However, if a row of matrix B contains no non-zero elements, B is trivially singular and a solution cannot be determined. If a row i of matrix B contains exactly one non-zero element, which lies in column j, the value of the variable Xj can be determined immediately. If column j contains another non-zero element in a row i', variable Xj can be eliminated from row i' by the appropriate substitution. Row i and column j of matrix B can then  32  be deleted, as the value of Xj has been determined and no other equation depends on this value. The rows of B momentarily containing one non-zero element each can thus be eliminated one by one. The key to keeping the running time of this computation linear is to check the number of non-zero elements left in row i' after deleting row i. If there are zero non-zero elements left in row i', matrix B is trivially singular and the algorithm terminates. If there is one non-zero left, the algorithm proceeds with the solving and deletion of row i'. In the end, no rows of B with less than 2 elements are left. It is possible that matrix B has become empty, in which case a solution to the problem has already been found. Now consider the case where B is non-empty. One row and one column have been deleted from matrix B during each step of the algorithm, so B must still be a square matrix. Let its dimension be n. • By the termination condition of this part of the algorithm, each row contains at least 2 non-zero elements. Consequently, B must contain at least 2n non-zero elements. • By hypothesis, each column of B contains at most 2 non-zero elements at the start of this part of the algorithm. As the algorithm proceeds, the number of non-zero elements in columns that remain in the matrix is unchanged. Consequently, B must contain at most 2n non-zero elements. Clearly, both inequalities can be satisfied only if each column and each row of B contains exactly 2 non-zero elements. Consider a graph where the vertices correspond to the rows and there is an edge from vertex ii to vertex 12 if and only if there exists a column of matrix B whose non-zero elements are in rows i\ and  Because every row has exactly 2 non-  zero elements in it, every vertex in the corresponding graph has degree 2. Therefore the graph is a collection of disjoint simple cycles. 33  o o o  o  o o o o o  o  Figure 4.1: Non-zero structure of a cycle  Figure 4.2: Cycles in a matrix  E a c h simple cycle corresponds to a linear system that can be solved independently from the others. B y a suitable permutation, matrix B can be rearranged such that all its non-zero elements are grouped in square blocks along the main diagonal. Blocks are henceforth termed cycles, as each block represents a simple cycle in the graph derived from the matrix. E a c h n x n block A can be permuted such that its non-zero elements are on the main diagonal, right above the main diagonal and in the lower left corner, as shown in F i g . 4.1:  A  i>:j  ^0  j =iv j = (i mod n) + 1  (4.1)  The partitioning of matrix B into cycles is achieved by a greedy walk through the graph corresponding to B. T h i s is easily doable in linear time.  4.2.2  Solving a Cycle  In order to complete the solving of a Coho linear system, solutions to its cycles must be found. Let A be a cycle with the structure described by (4.1) and let  Ax = y be the corresponding linear subsystem to be solved.  34  (4.2)  The rows of matrix A can be scaled to obtain its normalized form: 1  -al  0  0  1  -a  0  0 ....  2  0  A=  (4.3) 0  ....  0  1  'On  0  ..  0  -a -i n  1  which is equivalent to: if j = i  1  Ai j — <— a* if j — i mod n + 1 t  0  (4.4)  otherwise  Let Yli=i i  iffc = l , . . . , n  1  if k = 0  a  Pk= {  (4.5)  It is obvious that: Pk = otkPk-i, Vfc = l , . . . , n  (4.6)  The first row in equation (4.2) yields: X2  yi  (4.7)  Oil  More generally, rows 1,..., i yield:  Xi+l  (4.8)  =  Finally, combining this with the last row of equation (4-2) gives the formula for x±:  Xl  l-Pn 35  (4.9)  Thus, the solution for x\ is ill-conditioned if P is close to 1. Chapter 5 presents n  a more detailed error analysis. Clearly, equation (4.9) can be rewritten to obtain similar formulas for the other Xj's, all with the same denominator. Rather then computing each x, separately, it is more efficient to compute x\ as per formula (4.9), and then use the recurrence: =  ^Vi^  i  e  {!,... , - l } n  (4.10)  Clearly, the above algorithm runs in 0(n) time. Other algorithms that employ elimination, like L U decomposition, can be used in order to achieve the same running time. However, the one presented in this subsection has the advantage of yielding the result in a concise form that is appropriate for the analysis of the numerical stability of the system.  4.3  Running Error Analysis  A n important problem with the Coho LPs is the need for an error bound on the computed solutions. In addition to the solution proper to the Coho LP, error bounds on the solution are necessary if the verification is to be sound. More precisely, it is important to never underapproximate the feasible region of the L P and the optimal cost. Pivoting decisions in the Simplex algorithm involve comparisons between tableau elements. In the presence of ill-conditioning, these comparisons might yield uncertain results. The availability of error bounds on tableau elements and other quantities enables the algorithm to recognize cases of uncertainty in the result of a comparison. The Simplex algorithm and the algorithm that computes tableau columns make use of elementary operations only. This leads to the approach of computing an error bound on the result of each elementary operation. The computation of an error bound along with the result proper of each operation leads to an error bound  36  on the final outcome. T h i s approach to error analysis is called running error analysis [Hig96, p.72]. A slight modification of the method is called interval arithmetic. In many cases, interval arithmetic doesn't work because of the explosion of the interval as the computation proceeds through a long algorithm. T h e slightly better running error analysis does work for Simplex because, with the computation of tableau columns from input data proposed in section 4.1, there is no data propagated from one pivot of the algorithm to the next.  floating-point  A l t h o u g h running  error analysis tends to lead to overly pessimistic error bounds when applied to a long algorithm, it can in some cases provide sharper,  a posteriori bounds than an a  priori analysis can provide [Hig96, p.73]. T h e arithmetic operations on floating-point numbers are generally subject to rounding errors when executed on digital computers. T h i s is caused essentially by the fact that floating-point numbers are stored with only a fixed number of digits, whereas the result of an operation might require more digits than the particular numeric format has available. A l l the arithmetic operations executed on a computer follow the fundamental rule of the arithmetic of the computer: fl(a;opy)  =  (xopy){l + 5),  \5\<u,  opG{+,-,x,^}  where &(x op y) is the result of x op y computed by that arithmetic and u is a constant for a particular arithmetic called the unit roundoff. For the I E E E double precision arithmetic, u = 2 "  53  « 1.1 x I O  - 1 6  .  W h e n the operands are themselves affected by errors, they can be regarded as intervals on the real axes. Consider the pair (x, e) to be the representation of the interval [x — e, x + e]. Let (x,e)  = (x\, ei) op(x2, ef)- T h e n x = Q(x\ o p £ 2 ) and  if op e { + , - } e\\x2\ + e2\xi\ + u\x\ +  eie2  if op 6  {x}  Division can be regarded as the inversion of denominator followed by the multipli-  37  cation of its result with the numerator. Let (x, e) = (xo, eo)  1  .  T h e n x = 1/XQ  and e= <  ko|(|a;o|-eo)  +00  if | x | > eo 0  otherwise  Clearly, the computation of the error bounds is itself affected by rounding errors. Fortunately, the error bound does not need to be known with high precision: its order of magnitude will often suffice.  Moreover, disastrous cancellation cannot  occur in the computation of error bounds: all the numbers involved are positive and subtraction does not occur. There are some problems with the use of running error analysis. T h e first is that ideally one would like to have a simple formula to compute error bounds for the solutions of a linear system.  T h e other issue is that the computation of  error bounds on the result of each arithmetic operation along with the actual result increases the running time of the program. However, the penalty is a constant factor, not a deterioration of the asymptotic running time. A s verification is executed as an off-line process, an increase of the running time by a constant factor can be seen as a reasonable price to pay if the algorithm would otherwise fail.  38  Chapter 5  Analytical Attack on the Error As emphasized in previous chapters, the success of Coho verification depends strongly on the accuracy with which the linear programs that it produces are solved. The main floating-point (hence error-prone) computation that is performed as part of Simplex is the determination of the tableau columns corresponding to a basis. However, for bases other than the optimal one, only enough accuracy is needed to be able to determine the departing and the entering basic variables. This is the case because no floating-point values computed at a basis are subsequently reused in our version of Simplex. For most of the bases encountered during a run of Simplex, the computation of tableau columns using the linear-time algorithm presented in section 4.2 along with running error analysis produces satisfactory results. On the other hand, there can exist suboptimal bases at which the error bounds on the tableau columns are not tight enough to establish whether the basis is optimal and, if it is not, which pivot leads closer to optimality. Whereas such a situation can be dealt with by branching the computation path, it is clear that obtaining sufficiently sharp error bounds to be able to decide with certainty is preferable. The combinatorial solution to a linear program consists of its optimal basis.  39  Once the optimal basis has been found, the optimal solution of the standard-form dual is determined as column 0 of the tableau. However, if the solution to the Coho primal is sought after, a slightly different linear system has to be solved. Whereas the solution to the standard dual is:  ar = (A )5  s  b  1  (5.1)  s  :tB  the solution to the Coho primal is:  x = [A ,)c  C  1  B  b  = ((Asfr b s 1  c B  C  (5.2)  T h e left-hand sides of two linear systems differ only through a transposition. T h e conditioning of any matrix is the same as that of its transpose.  Moreover, only  trivial changes are needed to an algorithm that solves linear systems of the first type to make it work for the second type.  Thus it is sufficient to analyze linear  systems of the first type, under the implicit assumption that the results also apply to the extraction of the optimal solution to the primal. Whereas in some cases the optimal solution of a linear program is the result of interest, there are instances where the optimal cost is the sought-after answer to the problem. In such a case, the components of the error in the optimal solution that are orthogonal to the optimization direction are harmless. T h i s opens the possibility of trading accuracy of the optimal solution in directions that are orthogonal to the optimization direction for accuracy in the optimization direction. Ideally, we would like to be able to characterize the accuracy of the computed optimal solution through a closed-form expression depending on the machine precision and on the matrix structure. In the case of the optimal cost, the comparative flexibility of the constraints suggests that an error bound can be established that depends only on the machine precision and on the dimension of the system. Whereas these problems have not been solved completely, some inroads have been made into them.  These advances form the subject of this chapter. 40  A s the  reduction of an independent Coho linear subsystem to independent cycles has been described in the previous chapter, the focus here will be on the solving of the cycles. T h e first section presents a new algorithm for solving cycles as part of computing tableau columns and the error bound that is thus achieved. T h e second section is concerned with a way of obtaining a better estimate of the optimal cost in cases where the optimal basis is ill-conditioned.  5.1  Error Bound on Cycle Solution  Section 4.2 presents a linear-time algorithm for solving the cycles that appear in , Coho linear systems. T h e key to keeping the running time linear is the calculation of only one component of the solution by means of a direct formula that takes 0(n) time. T h e other components are computed recursively starting from the first component at the cost of 0(1) each. Clearly, the recursive computation accumulates error. T h i s can be avoided by the direct use of formulas similar to (4.9) in order to determine each component of the solution. Obviously this increases the running time of the algorithm to  0(n ). 2  However, even with this modification, only running  error bounds on the solution to the cycle are available. T h i s section examines an alternative way of solving a cycle and of computing a bound on the error in the solution.  5.1.1  A l g o r i t h m for Solving Cycles  Let the cycle to solve be described by the equation:  Ax = y where A G R " " , x  y G R".  (5.3)  Matrix A is supposed to be in normalized form as per  equation (4.4). T h e n it is possible to express A as:  A = SBS-  1  41  (5.4)  where S is a diagonal matrix: S = diag(s) and  (5.5)  B is a particular case of a cyclic matrix in which all off-diagonal elements are  equal:  ( Bi,j = <  1  if j  = i  —fi  if  0  otherwise  (5.6)  j = [i  mod  n) + 1  F r o m equation (5.4) it results immediately that:  B = S~ AS l  (5.7)  Simple computations show that:  iij=i  1  (S^AS)  = {— a, ^  hj  j£  modn)+l  5  0  ^  • _  m  o  c  j  n  )  (5.8)  -j- 1  otherwise  v  Equation (5.7) implies that:  Bi,{i  mod  (S AS) ( 1  =  i  i  m  o  d  y  i  (5.9)  Vi = l , . . . , n  which is equivalent to: ^  =  a  ,*(imodn)+l  =  (5.10)  1  Si  Memberwise multiplication of the equations above for i = 1 , . . . ,n yield:  n p  n  =  d  „  n  )  +  1  )  =  {  T  J  i=l  Il (imodn)+l  n  s  a  i  y ^ _  } =  n  JJ , a  (  5  n  )  i=l  i=l Thus, (5.12)  42  The combination of equations (4.5) and (5.12) yields: P=yp~n  (5-13)  The values of s can be obtained from (5.10): if  it =  1 (5.14)  ^  = 0^1  ifl<fc<n  Matrix S is not unique: multiplication of S by any non-zero real number yields another matrix that satisfies (5.3). Matrix B with the structure as defined by (5.6) falls within the category of circulant matrices or, for short, circulants [Hig96, p.469]. As a circulant, matrix B has the property that it is diagonalized by the Fourier transform matrix F : n  F BF-  r  n  = A = diag(A)  (5.15)  Also as a property of circulants, vector A contains the eigenvalues of matrix B and it satisfies: A = Fb  (5.16)  n  where b is the first column of matrix B: b= B  :tl  = c o l ( l , 0 , . . . , 0,-/3)  (5.17)  By definition:  (F ) n  = e^  1  i>}  2*«-i)C7-i)/n  (5.18)  As b contains only two non-zero elements, a closed form of the eigenvalues of B is easy to obtain: A; = 1 - pe^  Mi-V/n  1  (5.19)  Equation (5.15) implies that B can be expressed as: B = F~ KF l  n  43  (5.20)  and B -  1  =  F -  l  K -  l  (5.21)  F n,  From (5.4) and because S is diagonal it follows that:  A'  1  =  S B ^ S '  (5.22)  1  Equations (5.21) and (5.22) combined yield:  A '  =  1  —l  (5.23)  S F ^ A ^ F n S  and, for the solution to the cycle:  x = AS  =  SF- tr F S~ y x  x  (5.24)  x  n  T h e computations of s and A take linear time. Fast Fourier transform algorithms take 0(nlogn)  time. Consequently (5.24) defines an 0(nlogn)  time algorithm for  solving a cycle.  5.1.2  Estimation of Cycle Condition Number  T h e direct and inverse Fourier transforms are known to be quite stable. T h e algorithm defined by (5.24) might offer higher accuracy than the linear-time one presented earlier on. A s it is more expensive, it will be used only when the linear-time algorithm fails to give satisfactory results. T h e expression of cycle matrix A as the product of matrices with simpler structure enables the determination of a closed-form expression for the error in the solution to the cycle. Error analysis is straightforward for diagonal matrices. Numerical stability properties are also known for circulant matrices. For one thing, the singular values of a circulant are the absolute values of its eigenvalues. Quantity P  n  is the n  t h  is a product of real numbers, so it is itself a real number. Quantity f3  root of P „ , so it can be expressed as: j3 = | / 3 | e ^ ( ( P +  2fc  W"),  where k £ { 0 , . . . , n44  1}  (5.25)  and [o P= \  ifP„>0  [1  ifP <0  (5.26) n  T h e eigenvalues of matrix B are given by (5.19). B y combining it with (5.25) the following equation results for its singular values:  d = y/l + \/3\ - 2|/3|cos6>  i = l,...,n  2  (5.27)  where  e = (P + 2 * + 2 ( » - I ) K  ( 5 - 2 8 )  n A classical measure of the sensitivity of a linear system to numerical errors is the condition number of its left-hand side, which is the ratio of its largest to its smallest singular value: K(B)  = ^  (5.29)  Cmin  where a  = max^ o , and <r i = min,CT,. -  m a x  m  It is easily seen that c r x  a  n  function of 6 is realized for the lowest value of cos 6,  s a  ma  i.e the value of 6 that is closest to 7r (mod 2n): 1 + 1/31  if (P > 0) = (n is even) n  (5.30)  [^/(l + |/3|) -2|/3|(1-cos | )  if (P > 0) = (n is odd)  2  n  Equation (5.30) can be expressed more simply as: crmax < 1 + |/3|  (5.31)  The value a ; , as a function of 9, is realized for the highest value of cosd, i.e the m  n  value of 9 that is closest to 0 (mod 2TT) :  =<  y i / 3 | - 2|/3|co | + 1 2  S  45  ifP„<0  For the P  n  < 0 case, the value of K(B)  can be determined from equations (5.31)  and (5.32):  K{B)  (5-33) ^  |/3| -2|/3|cos- + l 2  T h e m i n i m u m value of this expression for |/3| > 0 is:  iY(B) = - V  (- ) 5 34  sin —  2n  which shows that the system cannot be ill-conditioned when P „ < 0 for the values of n of interest in Coho (n < 20). T h i s agrees with equation (4.9), which suggests that ill-conditioning is related to P  n  being close to 1. Only the case P  n  > 0 is considered  henceforth. For this case, < 7 j simplifies to: m  n  *„_.„ = | 1 - | 0 | |  (5-35)  F r o m equations (5.31) and (5.35) it follows that:  K(B) < jl + Jii  (5.36)  T h e condition number of a diagonal matrix is the ratio of its largest diagonal element in absolute value to its smallest: max K(S)  =  \si\  \ , , min | S i |  (5.37)  i  A s with any matrix, K(S) = K(S~ ). 1  This enables us to establish an upper bound on the condition number of A:  K(A) < K{S)K{B)K{S- ) = K{B)K{S) 1  5.1.3  2  (5.38)  E r r o r Bound on Solution to Cycle  Formula (5.38), although it has the merit of establishing an upper bound on the error in the solution to a cycle, can be rather pessimistic. K n o w n results on circulant systems will be combined with error computations for the scaling matrices to obtain a tighter error bound on the cycle solution.  46  Forward Error Bound for the Cycle Circulant In addition to the properties of circulant matrices presented in the previous section, more results about them can be found in [Lin92].  More specifically, a normwise  forward error bound is established for circulant systems. T h e result, which holds if the input data is free from errors, the only source of errors being roundoff i n the algorithm, is the following:  m  ^ ~  x  l  1  < u(y  (n)(K (B)+2)+c o)  FFT  F:2  C  = /*(»,«)  (5.39)  where fl(:r) denotes the floating-point approximation of x, u denotes the machine precision, ceo = V2~ + 4 is a small constant, Kp^B)  is a pseudo-condition number  defined as:  J E W  k and ^FFT(?I) is a function that characterizes the stability of the particular F F T algorithm employed to solve the circulant system. For example, the radix 2, CooleyTukey algorithm has [Lin92]: *FFT(TI) < c<plog n  (5-41)  2  where  = 1.06 x 4 / . From (5.19) it follows that: 3  2  |A | = l + 9 - 2 0 c o s 2  2  f c  2 7 r ( f c  )  n  ~  (5.42)  1 )  so (5.43)  =  n(l + 0 ) - 2 9 E L i C o s 2  ;  2 7 r ( /  ;~  1 )  A well-known trigonometric result is that: ^  c  o  s  k=l  47  2  ^  ^  (5.44)  which, introduced in (5.43), yields:  X>*| = n ( l + / ? ) 2  (5.45)  2  k=l Similarly to (5.35), we have that: min|A | > fc  I1-I0H  (5.46)  k  The substitution of (5.45) and (5.35) in (5.40) yields: <  KMB)  {5  .47)  The introduction of equation (5.41) and inequality (5.47) into inequality (5.39) yields the following forward error bound for matrix B: fa{n,u)<u  l^c* log n (^^_^  + 2j + c <fj  2  c  (5.48)  E r r o r s Affecting the Scaling M a t r i x As a cycle matrix A consists of matrix B pre- and postmultiplied with matrix 5 and S  1-1  , respectively, the errors introduced by S and 5  _ 1  have to be considered as  well. The coefficients a, result from row scaling, so they will be affected by error: I fl(aj) — ctA , .—- < u 1  (5.49)  v  \Cti\  The application of the error composition rule for multiplication leads to:  \3i^hlz31  \/k=l,...,n  < (2k - l)u < (2n - l)u  (5.50)  The computation of the quantity: P = (Pn)™  (5-51)  k  involves the power operation, which is not elementary. In general, the quantity x , y  where x and y are real numbers, is computed as e  ylnx  and, when the inputs are free  from errors, is affected by the following error [Mul97, p.179]: |fl(a;»)  \ y\  -x*\  = | «» *_l| l n  e  x  48  (5.52)  or, to the first order:  |fhV)  x y\  uylnx  \xv\  (5.53)  In order to account for the error in the inputs to the power, we use its linearization:  (x(l + S ))^ ^  « x (l + y(5 + lnx S ))  1+  (5.54)  v  x  x  v  B o t h k and n are error-free integers, so the error that affects their computed ratio is:  ^ |  f  l  ^~^  (  <u  In I  (5.55)  Based on the rules expressed by (5.53) and (5.54) and on the error bounds on the inputs given by (5.50) and (5.55), the following bound is obtained for f3 computed k  by formula (5.51):  ' ^ l ? ^ ^ -((2*-l)u+|ln|P 1  \p \  n  ||  «) + - | l n | P „ | |  n  K  u  (5.56)  n  which, because ^ < 1, simplifies to: ' Formula (5.14) of s  k  1  |j  P f c |  ' < (2n + 2 | l n | P | l ) u  (5.57)  n  can be rewritten as:  Sk = y  (5-58)  k  W h e n computed by the above formula, the error that affects s  k  <  (2n + 2 | l n | P | | ) u + ( 2 A : - l ) u + u  <  (4n + 2 | l n | P | | ) «  k  k  \s \ k  Wk)-p \, rzz-, hu \Pk\  "<  \m )-v \, rzr, 1 \P \ '  \fL{ )-s \ 8k  is:  k  k  k  (5.59)  n  n  The introduction of (5.13) into (5.59) leads to: |fl(s  f7 }  I *I  gfc|  < (4n + 2n| In |/3||)u  (5.60)  s  It is easy to see that the same error bounds can be attained for the components of  S~ . l  49  Error Bound for Cycle Solution Let us now establish an error bound on the solution of the cycle. Consider that the linear system is solved in the following steps:  t' =  S~ y l  t"  =  B~H'  x  = St"  (5.61)  For all three steps of the algorithm we have to consider two types of error in the output. T h e first type is due to unavoidable roundoff, which is present even if the input is error-free. T h e second type is due to the propagation of input error into the output. T h e relative error in the input appears in the output magnified by the condition number of the system matrix: a;  =  out  Mx- =*  < K(M) ^f l  m  I Pout II  (5.62) ll-^inll  The first step of the algorithm consists of elementwise multiplication: A  = kVk s  (5-63)  Quantity y is considered free from errors. A bound on the componentwise relative error affecting t' follows immediately:  WU-lUMj-^+u l* l fc  (5.64) 1**1  The introduction of inequality (5.60) into the one above yields:  ** ~**  |fl(  )  1  < ((4n + l)+2n\ln\fJ\\)u ^ f (n,u) d  s  B o t h types of error can occur during multiplication by  MM<  / B (  „,„  ) + K ( B )  (5.65)  B~ : l  m^i  (5  ,  6)  Componentwise bounds translate directly into normwise bounds, so it is possible to introduce (5.65) in the inequality above: |lflf/"1 _ / " | |  "y  f||  " < f (n,u)+K(B)f (n,u) B  50  s  (5.67)  T h e error bound for the final result is obtained by considering the multiplication by  S: " <fs(n,u)+K(Sy -  ||x||  \> \\t»,,  (5.68)  11  B y combining the equation above with (5.67) it results that: l|fl  ?,7 < K(S)f (n,u) X|1  + l ) / ( n , « ) ^ g(B,S,n,u)  + (K(S)K(B)  B  (5.69)  5  T h e introduction of (5.36), (5.37), (5.48), and (5.65) into (5.69) leads to the error bound for the cycle solution: max  S^S,n,u)  \ Si\  !^(^log n(^%+p 2) c  =  2  / i i / max s j  +  +  1 +• - • 11 -  +  C  o  N  (5.70)  Nx  + 1  ((4n + l) + 2 n | l n | 0 | | ) u  From (5.70), we see that errors can be large if \/3\ w 1 or  |*'| is large.  As  mentioned earlier, the characterization of the errors caused by solving the circulant matrix assumes that (3  n  is a positive number. W h e n / 3 « 1, the denominator of n  (4.9) is close to zero, reflecting the near singularity of the system. W h e n  |^'| is  large, the system is ill-conditioned due to extreme scaling that can lead to serious cancellation. This analysis has not considered the fact that matrix B is affected by errors  d \\t"\\ in the calculated value of fi.  This can be achieved by computing  ^  " . A first  d WBW step towards this can be the evaluation of  ^  11  . This remains a topic for future  research.  5.1.4  Summary  This section has analyzed the errors that affect the solutions of Coho linear systems. T h e left-hand side of such a linear system is a cyclic matrix A. In this section we have made progress on getting a more detailed characterization of the conditioning of A 51  Cyclic matrices have been shown to be expressible as the product A — S^BS, where S is a diagonal matrix and B is a circulant matrix. Closed-form formulas for the eigenvalues, eigenvectors, singular values, and singular vectors of matrices B and S have been presented. The singular values of matrix B are proportional to the eigenvalues of the same matrix. Matrix B has at most one small eigenvalue and at most one small singular value. The eigenvalues and eigenvectors of matrix A coincide with those of matrix B. Therefore matrix A has at most one small eigenvalue. The conditioning of matrix A is determined by the conditioning of both B and S. Unlike B, matrix S can.have several small singular values. The precise characterization of the conditioning of matrix A requires knowledge of its singular values. Whereas a time-effective method for this computation has not been found, the condition number of A can be bounded by the product of the condition numbers of B and S, which we know exactly. A n analysis of the forward error that affects the solution of a cycle has been presented. However, this analysis does not lead to a closed-form expression for the error bound, which means that further research is necessary in this direction. The most significant improvement can potentially result from the determination of the singular values of matrix A. This too remains a topic for further research.  5.2  Estimation of Optimal Cost  In this section we seek ways to compute the optimal cost of a Coho linear program as accurately as possible. The computation of the optimal cost continues naturally from that of the optimal solution. Consider a Coho linear program CLP(A ,b ,c ) whose optimal basis is B. C  c  c  The optimal cost can be computed as the dot product of the cost vector and the  52  optimal solution either in the Coho primal:  (c ) c  x  T  c  = (c ) c  (A B,-.~  T  C  1  O B)  (5.71)  C  or in the standard-form dual:  (c f s  x  s  = (b f((A , )^c ) c  c  B  T  (5.72)  c  B  T h e second method will be considered as it builds directly on the results of the previous section. Moreover, it is not difficult to show that the same results apply to the first method. O f the two operands of the dot product in (5.71), the dual cost vector is free from errors. T h e dual optimal solution, however, is affected by an error that has been bounded in the previous section. T h e following forward error bound is known for the dot product of two vectors [Hig96, p.69]:  \a{x y)-x y\ T  T  <  \x\ \y\ 1 — nu n  U  T  where i , t / £ R"  T h i s shows that high accuracy is not guaranteed if \x y\ <s£ | x | | y | . T  r  (5.73)  O n the other  hand, the accuracy of the computed dot product is high if all X j j / , terms have the same sign. A n optimal solution of a linear program in standard form is feasible, so all the components of x  s  must be non-negative. O n the other hand, the signs of b can c  in general be arbitrary, holding the potential for disastrous cancellation i n the dot product with the dual optimal solution. T h i s problem can be circumvented by making a change of variables such that the origin lies inside every projection polygon. T h i s will result i n the origin lying inside the feasible region of the linear program. T h e feasible region of the Coho L P is defined by the inequality: Ax c  c  > b  c  53  (5.74)  A s the origin is inside the feasible region, the above inequality must hold for it:  A0  > b  (5.75)  b  < 0  (5.76)  c  c  i.e.: c  Because c  s  = b , this implies that: c  \{c fx \ s  = \{c f\\x \  s  s  s  (5.77)  T h e application of rule (5.73) to equation (5.77) leads to the following formula for the relative error epp  in the computed cost due to the dot product operation:  T h e approximation | —  = N  U  1 introduces negligible errors for for the values of  n < 20 typical of Coho. Formula (5.78) bounds the error that would affect the computed optimal cost if the box c and x were free from errors. A s it has been seen, a: is a computed quantity affected by an error bounded by formula (5.69). T h e error i n the optimal solution translates into the following absolute error in the optimal cost:  E = \c ft{x) -c x\ T  = \c {H(x)-x)\  T  <  T  x  ||c|| ||fl(a;)  (5.79)  A s the magnitude of the cost vector is irrelevant, we can assume ||c|| = 1 and (5.79) becomes: E  < ||fl(x) - x||  x  (5.80)  T h e introduction of (5.69) into (5.80) yields:  E  x  < g(B,S,n,u)\\x\\  (5.81)  T h e total absolute error E in the cost is obtained by adding errors eop  E = E  x  + e  cx T  DP  54  <E  + nu c x T  x  and  E: x  (5.82)  Equation (5.82) says that, assuming a unit cost vector, the error in the computed cost is bounded by the error in computing the optimal point plus nu times the exact cost. If the optimal basis is ill-conditioned, the error in calculating the optimal point may be large. This motivates considering nearby branches where the error can be reduced. 5.2.1  Use of N o n o p t i m a l Basis  When the optimal basis of a linear program is highly ill-conditioned, the error bound on the optimal cost computed by the methods discussed so far might be unacceptably large. Therefore a better over approximation of the optimal cost may be obtainable by computing it for a slightly primal-infeasible well-conditioned basis. More concretely, consider a linear program with cost vector c and optimal vertex x. Let the optimal cost be approximated by the cost of the vertex x'. Let the computed costs of x and x' be &(c x) and Q.(c x'), respectively. The computed T  T  cost of x' differs from the true optimal cost by: &(c x')-c x\ T  T  =  \{&{c x')-c x')  <  I H(c x') - c x'\ + \c x' - c x\  T  T  + {c x'  T  T  T  T  -c x)\ T  (5.83)  T  If this quantity is less than the error H(c x) — c x that affects the computed cost T  T  of x, then x' offers a better approximation of the true optimal cost of the linear program. The error component fi(c x') —c x' depends on the conditioning of the basis T  T  associated with x'. The error component due to the difference between the true costs is: c x — c x\=  x —x  proj(. x'—x)  C  (5.84)  Therefore in order for a vertex x' to yield a good approximation of the optimal cost, x' must satisfy the following properties: • x is not far from the optimal vertex x 1  55  Figure 5.1: Types of optimal 2D vertices: a) highly obtuse optimal vertex and highly acute cost cone; b) highly acute optimal vertex and highly obtuse cost cone  • x  1  is on a near-normal to the cost vector through the optimal vertex  • the basis corresponding to x' is well-conditioned A square matrix is ill conditioned if at least one of its rows is nearly equal to a linear combination of the other rows. T h e maximum number of rows that are nearly equal to linear combinations of other rows represents the degree of ill conditioning. Clearly the degree of ill conditioning depends on what is meant by "nearly equal". A matrix and its transpose share the same conditioning. T h i s is reflected by the fact that the characterization of matrix ill-conditioning with respect to rows is also true of columns. In the case of Coho linear programs, basic columns in the dual represent inward halfspace normals in the primal. T h e feasibility of a basis means that the primal cost vector is a positive combination of basic columns.  Therefore at any  feasible basis the primal cost vector lies inside the cone generated by the basic inward halfspace normals, also called the basic cost cone. Let us now consider the simple case of a two-dimensional linear program  56  whose optimal basis is ill-conditioned. Ill conditioning has a simple geometric interpretation in this case: the two lines that determine the optimal vertex are nearly parallel. This means that the optimal vertex is either highly obtuse or highly acute. It is easily seen that the angle of the optimal vertex is the supplement of the angle of the optimal cost cone. Therefore when the optimal cost cone is highly acute the optimal vertex is highly obtuse and vice versa. The optimal basis is feasible, so the optimization direction lies inside the optimal cost cone. Consequently the angle between the optimization direction and either side of the optimal vertex cannot differ from 7 r / 2 by more than the angle of the optimal cost cone. Therefore, when the optimal cost cone is very acute, i.e. when the optimal vertex is very obtuse, the sides of the optimal vertex are nearly perpendicular to the optimization direction. This suggests that vertices that lie on the lines that determine the optimal vertex are good choices for approximating the optimal cost. Moreover, the more ill-conditioned the system is, the better the approximation offered by such points. A vertex on one of the lines that determine the optimal vertex is obtained by replacing the constraint corresponding to the other line with some other constraint. If the optimal vertex is highly acute, then the optimal cost cone is very wide, so the orientation of the optimization direction relative to the boundary lines cannot be characterized. However, highly acute vertices can be precluded by an appropriate scaling of the LP, not discussed further in this thesis. Let us now try to extend the approximation idea presented above to higher dimensions. Consider a Coho linear program of dimension d whose optimal basis is ill-conditioned. For simplicity, we assume its degree of ill-conditioning to be 1, but we expect the generalization to higher degrees to be straightforward. The i l l conditioning of the basis means that there exists a subset of its columns such that any column in the subset is almost a linear combination of the other columns in the subset. We term the columns in the aforesaid subset interde-  57  pendent and the other columns independent. Let Gh be the linear subspace generated by all the basic columns except fc. This subspace has dimension d — 1, so it is, in fact, a hyperplane. Let k be one of the interdependent columns. The fact that column k is almost a linear combination of the columns that generate G means that the projection of column fc onto the k  normal to G is very small. This implies that, for any two interdependent columns k  fci andfc2,hyperplanes G  kl  and G^ are nearly parallel. 2  Each of the d hyperplanes generated by a set of d — 1 inward face normals contains a face of the optimal cost cone. Hence the faces of the optimal cone corresponding to interdependent basic columns are nearly parallel to one another. Therefore the wedge formed by two such faces can be either very thin or very wide. The cost vector must lie inside the wedge formed by a pair of adjacent faces of the optimal cost cone. If the wedge is very wide, no inference can be made about the direction of the cost vector. This is the case for highly acute vertices, and can be excluded by scaling. For a highly obtuse vertex, the wedge is very thin, and the cost vector is nearly parallel to either face of the wedge. A normal to a face of the wedge will be almost perpendicular to the cost vector. Therefore the points on such a normal may suitably approximate the optimal cost. A vertex on the normal to the face of the optimal cost cone contained in hyperplane G can be obtained by replacing basic column k with another column. k  In the Coho LP, this amounts to replacing one of the constraints in the optimal basis. If an ill-conditioned optimal basis is characterized by the existence of a pair of faces of the optimal cost cone that form a very thin, wedge, then the optimal vertex is said to be highly obtuse and the optimal cost cone is termed highly acute. Otherwise, the vertex is said to be highly acute and the optimal cost cone is termed highly obtuse.  58  The error introduced by using a non-optimal basis is proportional to the distance of the approximating vertex to the optimal vertex and to the projection of the cost vector onto the line passing through the two vertices. In order to bound the approximation error, we have to bound these two quantities.  5.2.2  E r r o r Introduced by Dropping One Constraint  This subsection considers the case of a (i-dimensional cone and a vector inside the cone. A bound on the projection of this vector onto the normal to an arbitrary face of the cone is established. Practically, we are interested in very flat and narrow cones as described in the proposed approximation solution. L e m m a 1 Let u and v be vectors in R " \ { 0 } . Let  6 = -_(u,v)  (5.85)  Let w = w{9) be a function defined as:  (5.86)  Then  u +v Proof  < w{6)  (5.87)  It is simple to establish that: (5.88)  Division by ||u|| turns equation (5.88) into: (5.89) Let: v  59  (5.90)  Substitution of  (5.90)  into  yields:  (5.89)  u +  v  =  yj 1 +  u  jjfi + Ip, cos 9  T h e value of n for which the expression of  Mmin =  1 S  (5.91)  minimized is:  -COSC?  (5.92)  Because both | | u | | and | | v | | are positive, n is restricted to positive values. B y bringing in this restriction the value of u for which ^ i " ^ ^ is minimized becomes:  IMI /.min =  max(/i[  n i n  , 0) =  max(-  cos  0  if 9 < T T / 2  -cose?  if 6» > T T / 2  9,0)  (5.93)  A^min — \  If  t9 <  TT/2, then  ^"1,"*",^  (5.94)  is minimized with fi  =  0  and:  (5.95)  l|u|| If 9 > 7r/2, then ^"M^I^ u + u  Equations  v  1 S  = y/(-cos9)  (5.95)  2  and  (5.96)  minimized with JJ, = cos 9 and:  + 2  cos 9{-cos9)  +  1 =  y/l - cos 9 = sin9 2  (5.96)  establish the result of the lemma.  • The proposed approximation relies on the cost vector being nearly parallel to any face of the optimal cone. This nearness is measured by the projection of the cost function on the normal to a face of the optimal cone. T h e following theorem establishes a bound on this projection:  60  Theorem 1 Let {UJ : i = Ui,..., u  n  _ i  1 , . . . , n} be a set of n-dimensional unit vectors such that  are linearly independent.  Let dx be a unit vector such that d_L-Ui=0,  Vi = l , . . . , n - 1  u „ •d  (5.97)  > 0  ±  (5.98)  Let = u „ • dx  (5.99)  Un,, = u „ - u d±  (5.100)  u  n±  and n±  let  ( <p =  max 1  Let f e R  n  (5.101)  n  fei  V  1  1  \  _ 1  U„ , V b j u ,  L  (6 ,...,&„- )eR;-  "  /  6e a unit vector such that n  f = J2<H"i  (5.102)  i=l  where ai>0,  Vi = l , . . . , n - 1  (5.103)  Let f± = p r o j  d ±  f  (5.104)  Then 0 < f  ±  <  " " i  y/i£ +W {<p) 2  ±  61  _  =  (5.105)  (l-U J 2  Proof  T h e combination of equations (5.102) and (5.104) yields: n  n  f± = f • dx = ( ^ a ; U j ) • dj_ = ^ a ; ( u , • dj_) i=i i=i  (5.106)  T h e introduction of definition (5.97) into equation (5.106) yields:  = a„(u„ • dx)  (5.107)  B y definition (5.99) equation (5.107) becomes: (5.108) B y definitions (5.98) and (5.99):  u _ >0  (5.109)  nj  Thus, definition (5.103) and inequality (5.109) yield f±_ > 0, establishing the first inequality of the theorem. Let f|| = f - / ± d x  (5.110)  T h e introduction of (5.102) and (5.108) into (5.110) yields: n f  ll  =  7i—l  yi i ' a  u  n—1  OjU, + a„(u„ - U n ± d x )  ~ n"nj,dx = ^ a j U j + anUn-anUn^dx = ^ a  i=l  i=l  i=l  (5.111) T h e combination of (5.100) and (5.111) leads to: n-l  fj| = a „ u  n | |  + ^  ajU,-  (5.112)  i=l T h e application of lemma 1 to the vectors in equation (5.112) results in: /  fu  >  /  n-i  \ \ (5.113)  W t=l  B y (5.103) a „ must be positive, so: n-l  n-l (5.114)  t=i  i=i 62  By definition (5.101): n-1  ^  U  , ^2 i i a  n | |  (5.115)  ^  u  i=i  The function w is monotonically decreasing, so the inequality above can be turned into: w (z. (u , ]T] a,i\ii\ \ > w (cp)  (5.116)  ni]  i=i  Combined, inequalities (5.113) and (5.116) yield: fii  w  Q"n^n\\  >  (cp)  (5.117)  In general, for any 2 vectors v and w , the Pythagorean theorem holds: !|v||  For f and  - ||proj  2  w  v||  2  + ||v  - proj  w  v||  2  (5.118)  it becomes:  if ir = i  (5.119)  or: (5.120) By combining inequality (5.117) with equation (5.120) we get: )  (5.121)  (a w(<f) u„ )  (5.122)  / i _ < y 1 - (a w(ip) u n  n||  The introduction of (5.108) into (5.121) yields: a u _ < \Jln  ni  n  (|  By the Pythagorean theorem for u„ and dx: 2 Un,,  = I|U„||  2  - ul  L  = 1  - U *  x  (5.123)  The introduction of (5.123) into (5.122) yields: au n  < xjl- alw {<p) (1 - u. ) 2  n±  63  2  (5.124)  A s the left side of inequality (5.124) has been shown to be positive, we can square it to get: an un ,_ 2  2  n  <Tl 1 < -a  2  2  n±  2  n n,?(,n\C\ w2(<  p ) ( l _- u )  2  (5.125)  2  n  T h i s leads to the following bound on quantity a : n  a  <  n  (5.126)  1  ^ u  +w {<p)  2  {l-u J  2  ±  2  According to (5.109), quantity u _ is positive. nj  (5.126) with u  n  ±  T h e multiplication of inequality  results in:  a  n  u  n  x  < •  U  y j u  2  n  +W (<p) 2  ±  ^  (5.127) (1 ~ U  2  J  T h e introduction of equation (5.108) into the left hand of inequality (5.127) leads to: f  L  <  (5.128) yjul  +W (if)  (1 -  2  ±  U2J  T h i s establishes the second inequality of the proof.  • T h e quantity u  n  ±  measures how close to singularity, i.e. ill-conditioned, the  system is, with lower values representing worse conditioning.  T h e cases i n which  the proposed approximation works are the ones where / j _ is very low. T h e theorem shows that, if  < 7 r / 2 , then: f±  < u  n  ±  "  (5.129)  T h e condition <p < TT/2 represents an obtuseness requirement for the associated vertex. T h i s requirement is not very restrictive. We assume this condition to hold during the rest of this discussion.  64  5.2.3  Use of the Bounding Box  In order to bound the approximation error, a well-conditioned overapproximating basis within suitable distance of the true optimal basis is needed. Such a basis can be guaranteed to exist by adding bounding box constraints to the linear program. T h e fact that each time step of the Coho algorithm starts from a set of projection polygons makes the computation of a bounding box for the feasible region of the linear program trivial. The feasible region of a face undergoes bloating and intersection with other projectahedra, but these operations have obvious equivalents for the bounding box. Intersection operations might result in a slight overapproximation of the bounding box, but this is deemed acceptable. T h e optimal solution of the linear program lies inside the bounding box. Consider a line that is the intersection of ti — 1 of the hyperplanes that define the optimal vertex. T h e optimal point divides the line into two halflines. T h e points that belong to one of the halflines are characterized by higher-than-optimal cost, as needed for the type of approximation that we are seeking. This halfline intersects (at most) half of the bounding box hyperplanes. We shall establish that at least one bounding box hyperplane exists such that its intersection with the halfline is well-conditioned and their intersection point is either on the bounding box or very close to it. In the arguments that follow, the conditioning of the intersection of the halfline with the hyperplane is measured by the projection of the direction of the halfline onto the normal to the hyperplane. Low values indicate that the hyperplane and the halfline are nearly parallel, so their intersection is ill-conditioned.  Definition 2 If .  K  A vector  f  G  R  d  is e-perpendicular to another vector  g  G  R  d  |  IILJ if ,! <_ e. We write f J_ g. 1  I|f|| l|g||  e  T h e above definition gives a quantitative measure for how close two vectors are to  65  iff  being perpendicular. It is easy to see that f l o g O f 1 g.  L e m m a 2 Let f be a vector in R  d  \  {0}.  Let e be a non-negative real number such that 1  (5.130)  Then f cannot be e-perpendicular to all the directions of an orthogonal basis.  Proof  Let E — { e i , . . . , e^} be an orthonormal basis of R . Then: d  d f = ^ 2 ^ i=l  '  e i  w  h  e  r  e  fi = ' i f  e  (  Because basis E is orthonormal:  fi = f • ei  (5.132)  and (5.133) Suppose f _L ej, Vi = 1 ... ,d. B y (5.132), this implies that: £  /i<e||f||,  Vt = l , . . . d  (5.134)  T h e introduction of (5.134) into (5.133) leads to: d  (5.135)  or 1 < eVd  (5.136)  which contradicts hypothesis (5.130).  • 66  Lemma  3 Let f  Let b G R  +  GR  d  be a unit vector such that fi>0,  Vi  = 1,...,  d.  be a d-dimensional point.  Let e G R + be a non-negative real such that e < -4=.  vd Then 3k G { 1 , . . . , d} and 3X > 0 such that: Xfk = h  (5.137)  fk > e  (5-138)  and and <  A  Proof  M  (5.139)  Vl - (d -  l)e  2  A l l the numbers used in the proof are non-negative, either by definition  or by being a sum, product, or quotient of non-negative numbers. Non-negativity will not be stated explicitly in the rest of the proof. Without loss of generality, the unit vectors of the basis can be renumbered such that  h< J±L, Ji Ji+1 b  V* =  l,...,d  (5.140)  Let k=  m i n i : - . ( / _L e*)  (5.141)  e  \<i<d B y lemma 2, such a k exists. Let A = ^  (5.142)  Jk  By  hypothesis:  B y the definition of k: Vi = l , . . . , f c :  /i± ei e  67  =>  <e  (5.144)  T h e introduction of (5.144) in (5.143) yields:  d £ /  2  > l - ( f c - l ) e  (5.145)  2  i=k B y the definition of the Euclidean norm:  ^6 <||6|| 2  (5.146)  2  i=k Combining (5.145) and (5.146):  d  i=k I  JLZ  - (fc -  iw  i=k B y the definition of fc:  -f< i  b  h  Ik  Ji  Vi =  fc,...,d  (5.148)  T h i s can be rewritten as: ^j- =  fk  where v± < 1,  Vz = fc,... ,d  (5.149)  fi  The square of the equation above is:  §=4?'  vi=&,...,d  ( - °) 5  i 5  Summing the numerators and the denominators of the right-hand side of the equation above for i — fc,..., d yields:  u2  2 - -  fk  d  i=k 68  ~  ( 5  1 5 1  )  T h e definition of  by (5.149) as a subunitary number leads to:  d  d Vi = * : , . . . , d  i—k  (5.152)  i=k  B y combining (5.151) and (5.152) it follows that:  d  33  &fc\ ^ i=k 2  <  fk  (5.153)  /  2  From (5.147) and (5.153) we obtain:  bk\\  \\b\\  2  JkJ  l-(fc-l)e  :  T h e introduction of (5.142) in the inequality above yields A <  ,  ll?>11  y/1 - (k - l)e  2  (5.155)  The maximum of the right-hand side of the equation above is achieved for k = d: A<  ,  "»»  Vl-(d-l)e  (5.156) 2  • A halfline H L that emanates from the origin in a direction whose components are all positive is contained in the positive orthant of the space. E a c h point b in R  d  can be seen as the corner of the box whose diagonally opposite corner is at the origin. E a c h face of the box is included in a hyperplane that is normal to a coordinate axis. If point b is situated in the positive orthant, then halfline H L must intersect at least one of the box hyperplanes that pass through b. L e m m a 3 shows that such a hyperplane exists such that its intersection with H L is both well-conditioned and situated not much farther from the origin than the diameter of the box.  69  Figure 5.2: Halfline emanating from inside a box: first intersection with a box line is P I , which is ill-conditioned; second intersection is P2, which is well-conditioned and close to the bounding box.  Theorem 2 Let H C be a d-dimensional hypercube of diameter D. Let H L be a halfline emanating from a point XQ inside the hypercube in direction f: H L = {x G R  d  :x = x  + A / , A > 0}  0  (5.157)  Let e G R+ be a non-negative real such that e < -4=. vd Then an intersection point x of H L with a hyperplane H P of the hypercube H C e  exists such that: - . ( / _L n e  where  TIHP  H P  is the normal to hyperplane  )  HP,  l!s -s ||<—_ V l - (fl -  (5.158)  and: (5.159)  D  e  Proof  0  l)e  Trivial transformations that do not modify distances (translations and  reflections) can move XQ to the origin and / into the positive orthant. Let b the corner of the hypercube now situated in the positive orthant.  L e m m a 3 can be  applied to the current values of / and b to obtain a real number A with the properties stated by the lemma. Let: x  e  = A/ 70  (5.160)  Because xo — 0 and A > 0, x  e  G H L by the definition of H L .  Let: H P = {x G R  d  :x  = b}  k  (5.161)  k  B y the definition of x : e  (x ) e  = A/  k  (5.162)  f c  B y the definition of A (lemma 3, (5.137)), (5.162) can be rewritten as: (x ) e  which means that x  e  = b  k  (5.163)  k  G H P . T h e point x  belongs to both H P and H L , so it is an  e  intersection point of these two sets. One normal to H P is nnp = e . T h e angle between nup and / is characterized by k  the quantity: |/-n 1 1 / 1 1  |  H P  _  \f \ k  i  I K P I I  •  =  h  ( J U 6 4 )  i  The introduction of (5.138) into (5.164) leads to:  1/  • ™ H P |  ^  / ,  ,ar\  IIIHPII  1 1 / 1 1  from which (5.158) immediately follows. A s the point xo is now at the origin, the distance between XQ and x  e  | | a ; e - ^ | | = ||x || = | | A / | | = A | | / | | = A e  is: (5.166)  B y requirement (5.139) of lemma 3: A <  , ^l-(d-l)e  (5.167)  1 1 6 1 1  2  The distance from a point inside a hypercube to a corner of the hypercube cannot be larger than the diagonal of the hypercube. A s the origin is contained inside H C and  H&ll  is in fact the distance from the origin to corner  H&ll < D  b of  H C , it follows that: (5.168)  The combination of equation (5.166) and inequalities (5.167) and (5.168) establishes inequality (5.159). 71  • The theorem shows that any halfline emanating from a point inside a bounding box has an intersection with a bounding box hyperplane that is well-conditioned and situated not far from the point (fig. 5.2). Formula (5.159) expresses the tradeoff that exists between the conditioning of the intersection point x  e  of the halfline with the bounding box, measured by e,  and the distance from the origin XQ of the halfline to x . e  Characteristic values of d  do not exceed 20. Consider that e is required to be at least 0.1, which guarantees the good conditioning of the intersection. Then theorem 2 guarantees that: \\x - z | | < lD e  (5.169)  0  where:  ^ ^ - l l , ! , . " - "  1  ( 5  '  1 7 0 )  The value of e could be set to a much lower value and still represent a wellconditioned intersection while driving 7 very close to 1. That, however, wouldn't introduce a qualitative change to the argument. In the case of Coho LPs, well-conditioned bases that approximate the optimal solution more closely than any basis that contains a bounding box hyperplane may exist. However, it is the use of the bounding box that provides an upper bound on the distance from the true optimal vertex.  5.2.4  Error Bound for Coho Cycles  Having described a way to approximate the cost function for a particular class of illconditioned optimal bases, we shall now examine the application of this idea to Coho linear programs. A basis represents a matrix that may contain several independent cycles, each of which can be considered separately. In order to bound the approximation error, a bound on the projection of an arbitrary basic column onto the normal to all the other basic columns is needed. 72  best approximating vertex when the bounding box is not used  best approximating vertex when using the bounding box  optimal vertex  \ " cost = const.  ^ \ c o s t vector!  f^  Figure 5.3: The best approximating vertex, with and without the bounding box First the direction of a normal to face of the basic cost cone is determined: Lemma 4 Let A be a matrix representing a cycle in normalized form as described  by (4.4). Let d be a vector defined as d = col(Po,...,  d =P k  k  U  P„_i),  i.e.:  Vfc = l , . . . , n  (5.171)  Then d is orthogonal to columns 2 , . . . , n of matrix A:  d±A ,  Vj = 2 , . . . , n  :d  Proof  (5.172)  By the definition of the dot product: dA T  :d  n = J2diA  (5.173)  id  i=l  The introduction of the definition of A-j results in: j-2 d Aj T  :  n  = ^2d 0 + dj-i(-a - ) i=l i  j  1  + d l+ j  73  ^ i ° = i'• ~ j-i j-i i=j+l d  d  a  d  (5.174)  From the definition of d it follows that: dA  = P,-_! - ctj-iPj-2  T  :d  (5.175)  The use of (4.6) leads to the final result: dA  =0  T  :d  (5.176)  • The next step is to determine the projection of a basic column onto the normal to the others:  Theorem 3 In the conditions of lemma 4  :  \Pn - II  UprojrfdirAi  (5.177)  n-l  V Proof  1  + n* a  The columns A j,j = 2 , . . . , n are linearly independent. Together with d, :>  which is normal to each A j , they form a basis for R". Consequently it is possible to express A-^\ as: n  A  = ^2c A.  :tl  j  mtj  + 6d  (5.178)  3=2  where Cj e R, Vj £ { 2 , . . . , n} and iJeR. The first row of equation (5.178) yields: 1 = c ( - a i ) + 5P 2  0  (5.179)  Considering that a\ = P\ and PQ = 1 = PQ , this can be rewritten as: 1 = - c P i + (5P 2  2  0  (5.180)  From rows 2 . . . n — 1 of equation (5.178) it follows that: 0 = a + c i(-a )+6Pi- , i+  i  1  74  i = l,...,n-2  (5.181)  Multiplication of the above with P j _ i and the equality a j P j _ i — Pi lead to: 0 =  0 ^ - 1  i=  - c Pi + 5P?_ i+1  l:  l,...,n-2  (5.182)  T h e memberwise summation of all of the above equations yields: n-1  +  0 = E ( * i - l c  P  i-l)  6P  t=2  n—1  n—1  n—1  = 5>**-i) - E ( ^ + i * ) + p  t=2  i=2 n  n—1 = E ( ^ - i ) i=2  - E ^  5  E  ^ - i  i=2 n—2 - i )  p  (5.183)  +^ E ^ i=l  t=3  2  n-2  = c Pi-c„P„_i+c5E^  2  2  t=i  T h e last row of equation (5.178) can be expressed as:  -a-. = c„ + SPn-i Multiplication with P -i n  (5.184)  and the equality a P - \ n  = P „ enable us to rewrite the  n  above : -P„  + &Pl_  = CnPn-X  (5.185)  x  T h e memberwise summation of (5.180), (5.183), and (5.185) is: l + 0 - P  = ( - c P i + c5P ) 2  n  2  0  n-2 (csPx-c^-i + cSE^ ) i=l  ( -  2  + (c P„_ J l  1  5  1 8 6  )  + M ^ )  T h i s simplifies to: n-1  1-P„  = 5 E ^  (5-187)  2  i=0  or, equivalently: 8=  -—^  L  n—1 i=0  75  (5.188) ' v  P  2  The orthogonality of d onto Cj S R, Vj = 2,... , n implies that: (5.189) and for the direction of A ±: :  proj dir A d  =  j X  proj A i  5d  U,i\\  \\A \  d  (5.190)  :il  The substitution of the norms of d and A- \ yields:  d  n-l  \Pn - I I  |proj dir A  \Pn - I I  (5.191)  :  *  p  \ »=0  2  '  n-l  + "n  1  i=0  E^  2  i=0  • A low value of quantity ||proj dir A d  : )  i||  means that a bounding-box approximation  of the cost achieved by dropping column 1 of matrix A will be accurate. Equation (5.4) associates a circulant matrix B to matrix A. Equation (5.191) shows that the ill conditioning of matrix B, which is reflected in a low value of \P — 1|, implies that a pivot to the bounding box will yield a satisfactory apn  proximation of the cost. The following theorem establishes a bound on the error introduced by pivoting to the bounding box in a Coho L P : Theorem 4 Let C L P ( A , b , c ) be a Coho linear program with n variables and m C  c  c  constraints, where the cost vector, c , is a unit vector. c  Let B be an optimal basis of C L P and x be the optimal vertex of C L P corresponding to B. Let A = {A ,) C  T  B  Suppose that A is a cycle in normalized form as per equation  (5.192) (4.4). Moreover, sup-  pose that the maximum angle between A \ and any positive combination ofA-, 2, • • •, A:>  does not exceed | . 76  t  >n  Let D be the diameter of the bounding box of the feasible region of the LP. The inequalities representing the bounding box are supposed to be among the constraints that define the feasible region of the linear program. Let a Coho boundary hyperplane be defined as: CHP(fc) = {x e R " : A . c  k  x =b}  (5.193)  c  k  Let d\ be the intersection of boundary hyperplanes B{2),...,  B(n):  n  p| CHP(B(*))  d = x  (5.194)  k=2 Let e be a positive number such that e < 4=  (5-195)  Then there exists a basis B' with corresponding basic solution x' such that: B'{k)=B(k)  Vfc = 2 , . . . , n  (5.196)  and -.(di - L A , £  s  ( 1 )  .)  (5.197)  and (c ) x'-(c fx\< c  T  c  1  |  P  y/l-{n-l)e  2  n  _  1  1  (5.198)  n-1  where Pi is defined in equation (4.5). Proof  From the definition of CHP, it follows immediately that: A . c  k>  _L CHP(A;) Vfc = l , . . . , m  The definition of A implies that A  :<k  —A  C B  (5.199)  ^ ) . , which, introduced into (5.199),  yields: A  f e  _L CHP(B(fc)) 77  VJfc = 2 , . . . , n  (5.200)  Definition (5.194) can be restated as: di C C H P ( £ ( f c ) )  Vfc = 2 , . . . , n  (5.201)  T h e combination of (5.200) and (5.201) yields:  d ±A.. 1  V/c = 2 , . . . , n  jk  Feasibility requires that the cost vector c  c  (5.202)  be a positive combination of the columns  of matrix A. Each column of matrix A can be turned into a unit vector by scaling it by a positive factor. Therefore c  is also a positive combination of the unit vectors  c  of the columns of matrix A. Along with (5.202) and with the obtuseness hypothesis, this enables the application of theorem 1 to bound the projection of vector c the normal d\ to the hyperplane generated by dir A 2, • • • , dir A; :>  >n  c  onto  by the projection  of vector dir A-^\ onto the same direction: ||proj c || < ||proj dirA i|| c  d l  d l  : i  (5.203)  T h e right-hand side of the above inequality is determined by theorem 3. T h e introduction of its result into (5.203) leads to: ||proj  c || c  dl  <  |  " ~ ^  P  (5.204)  n-l  Vl + a  2  \  i=0  Let H L be one of the two halflines that point x determines on line d\. T h e fact that the optimal vertex must lie inside the bounding box of the feasible region, along with (5.195), means that the conditions of theorem 2 are met. B y hypothesis, the bounding box inequalities are part of the linear program. Let the hyperplane in theorem 2 be:  HP = CHP(r)  (5.205)  a normal to which is: n p = A H  78  c r  -  (5.206)  Let the intersection point of H P and H L be x'. Then:  --(di - L A , )  (5.207)  c  £  r  and \\x' — aril <  "  "  .  = (n - l ) e  (5.208)  D  Vl-  -  2  A vertex of a Coho L P represents the intersection of the n boundary hyperplanes that correspond to its basis.  Line d\ represents the intersection of n — 1 bound-  ary hyperplanes. A vertex of the L P is to be found at the intersection of d\ with any other boundary hyperplane, the corresponding basis being formed by the hyperplanes that contain d\ and the hyperplane that d\ intersects. Vertex x is to be found the intersection of di with CHP(23(1)). Vertex x' represents the intersection of di and C H P ( r ) . This is to say that x' is the vertex that corresponds to the basis obtained by replacing 23(1) with r:  B'(k) = {  r  if  k=  B{k)  iffc = 2 , . . . , n  1 (5.209)  T h i s definition satisfies (5.196) and, introduced in (5.207), yields (5.197). T h e difference in cost between vertices x' and x is:  \(c ) x' - (c fx\ c  T  c  = \\x'-x\\  ||proj ,_ c || c  x  x  (5.210)  T h e fact that x' £ d\ and x 6 d\ implies that (x' — x) || d\, so equation (5.210) can be rewritten as:  \(c fx' c  - (c fx\ c  = \\x' - x\\  ||proj  c || c  dl  (5.211)  T h e introduction of inequalities (5.208) and (5.204) into (5.211) establish result (5.198) of the theorem.  •  79  Theorem 4 establishes an upper bound on the error introduced into the cost by pivoting to the bounding box of the feasible region. In general, the solution to an n-dimensional linear system can be seen as the intersection of n hyperplanes. If the linear system is non-degenerate, the intersection of n — 1 of these hyperplanes represents a line.  T h e solution of the system is the intersection of this line with  the other hyperplane. Theorem 4 establishes a lower bound on the angle between the intersection line of n — 1 hyperplanes and the normal to the n as a measure of the conditioning of the system.  t h  hyperplane  G o o d conditioning and a fairly  low distance between the optimal and the approximating vertex can be obtained simultaneously. T h e theorem also shows that the approximation error decreases as the conditioning of the circulant matrix corresponding to the optimal basis worsens, i.e. as \P — 1| approaches 0. T h e approximating solution was required to be a basic n  solution of the L P in order to enable its discovery by the Simplex algorithm. Ill-conditioning due to the scaling matrix S does not appear to be curable by pivoting to the bounding box. However, we suspect that obtuse ill-conditioning of the optimal basis should always be nearly orthogonal to the cost vector. T h i s gives us hope for the discovery of better methods resulted from the exploitation of this property. Theorem 4 refers to the case of a matrix with one cycle and so do the results presented earlier in this chapter.  A n n-dimensional square matrix can have at  most n / 2 non-trivial cycles. B y the triangle inequality, the total error affecting the computed cost is bounded by the sum of the errors for the cycles. T h i s observation, along with the error bounds established for cycles, leads immediately to an error bound for the computed cost of a general, multicycle matrix.  5.2.5  Summary  In this section we have studied the approximation of the optimal cost through a pivot to the bounding box. A class of optimal vertices for which such an approximation  80  introduces small errors has been identified. T h e n a bound on the component of the cost vector which is proportional to the approximation error has been determined. T h e approximation technique relies on the existence of a well-conditioned basis formed by replacing an optimal constraint with a bounding box constraint. T h e existence of such a basis has been proven. T h e approximation error is proportional to the distance from the true optimal vertex to the approximating vertex. A bound on this distance has been established when a bounding box for the feasible region is used. T h e case of Coho linear systems has been examined in the final part of the section, with the computation of a bound on the quantity that measures the error introduced in the cost by pivoting to the bounding box. T h e formula for this quantity has shown that the bounding box approximation method works for one type of ill conditioning, whereas the other type remains an open problem.  81  Chapter 6  Implementation T h e previous chapters described how the special structure of the linear programs arising in Coho can be exploited to produce an efficient and robust version of Simplex. T h i s chapter addresses three remaining issues for a practical implementation. First, the problem of finding an initial feasible basis is addressed. Second, the way the algorithm deals with uncertainty in the results of intermediate is presented.  computations  T h i r d , the solution of linear programs whose feasible region is an  arbitrary linear transformation of an actual projectahedron is described.  6.1  Finding an Initial Invertible Basis  Let S L P ( y l s , bs, cs)  be an instance of a linear program in standard form whose  matrix As exhibits the Coho-specific structure (either one or two non-zero elements in each column). of matrix As,  Let d be the number of rows and / be the number of columns  where d < f.  which to start pivoting.  T h e Simplex algorithm needs a feasible basis Bo from  In order for a selection of columns of As to represent a  feasible basis, it must first represent a non-singular matrix. Due to the sparsity of the matrix As of a Coho L P , finding an structurally non-singular column selection is not trivial. T h e structure of matrix As can be seen as a graph G: each row corresponds  82  1  2  3  5  4  X X X X X X  1  2  X X X X  3 4  5  Figure 6.1: Subgraph that corresponds to an structurally singular matrix. to a vertex, whereas each column turns into an edge. The number of non-zero elements in any column must be either 1 or 2. A column whose non-zeros are in rows i\ and i represents an edge between vertices i\ and i . A column whose only 2  2  non-zero element is in row i represents an edge between vertex i and itself. Graph G has d vertices and / edges. Clearly, more that one edge can exist between a pair of vertices. Let GB be the subgraph of G that corresponds to the submatrix (As). 0  Bo  Matrix {As), g contains all the d rows and d of the / columns of As, so GB contains 0  all the d vertices and d of the / edges of graph G. Consider the linear system LS defined by the equation {As). X = y. Each BQ  connected component of GB corresponds to an independent subsystem of LS. The 0  structural non-singularity of matrix {As).  Bo  means that LS must be neither under-  nor overdetermined. In turn, this implies that any independent subsystem of LS must have a square left-hand side: one with more rows that columns is overdetermined, whereas one with more columns than rows is underdetermined. Therefore, any connected component G of GB must represent a square matrix. By the conP  0  struction rules for G and GB , this means that the graph has equal numbers (d ) of 0  p  edges and vertices. The connectedness of G requires the use of d — 1 edges to link P  p  the d vertices together in a tree structure. The d -th edge can join two arbitrary p  p  83  Figure 6.2: Subgraph that corresponds to an invertible matrix. vertices, thus creating a cycle with some overhanging trees. This suggests the following way offindingan invertible column selection: a set of trees is constructed by depth-first search such that every vertex is assigned to a tree; then a cycle is introduced in every tree. Any edge between two different vertices represents a column of matrix AsIn turn, each column of As corresponds to a side of a projection polygon. Moreover, all sides of a projection polygon represent edges between the same pair of vertices of G. So for each edge there will exist at least two more edges between the same vertices. Therefore it is always possible to create a 2 x 2 cycle in any existing tree by adding one more edge between two vertices that are already connected to one another. In addition to satisfying the condition described above, each cycle must give rise to a well-conditioned linear system. We assume that all projection polygons are of a low enough degree that it cannot be the case that all vertices of one of them are highly obtuse. This assumption does not exclude any polygons that other components of Coho would handle conveniently: in order for all its vertices to be highly obtuse, a polygon must be at least of degree 10 , which would render the 9  computational cost of its manipulation prohibitive. 84  As shown in chapter 5, the cycles embedded in the linear systems that arise in Coho can be affected by two types of ill-conditioning: one is caused by the quantity P  n  being close to 1; the other is the result of the bad scaling of the coefficients a , of  the cycle. T h e following method of picking the initial basis guarantees the avoidance of ill conditioning of the first type: Suppose that the cycle is to be created between vertices i\ and 12 between which an edge is known to exist.  Vertices i\ and 12  define a projection plane and the edges between i\ and %2 represent the edges of the corresponding projection polygon. T h e projection polygons are assumed to have no highly acute vertices.  Therefore, no projection polygon encountered in Coho can  have the property that every two of its sides are nearly parallel to one another. It follows that it must be possible to pick a pair of edges of the projection polygon corresponding to the pair ( i i , 1^2) such that the angle between them is not close to either 0 or TT. T h e n the cycle formed by this pair of edges will of necessity be free from ill conditioning of the first type. It is also possible to create a cycle in a tree by adding an edge from a node to itself, if one such edge, i.e. a constraint with only one variable, is present. T h e conditioning is guaranteed to be good in this case. T h e method of finding an initial feasible basis described above does not guarantee the absence of ill conditioning caused by the bad scaling of the a , coefficients of the cycle. Instead, this problem is handled by the branching method described in section 6.3.  6.2  Finding an Initial Feasible Basis  T h e basis B identified in the previous section might not have all the properties required by the Simplex algorithm: some components of T-.fi = B~ b 1  negative, i.e.  B might be infeasible.  might be  T h i s calls for another computational step  towards a feasible basis. Classical solutions to the problem do exist, but they involve the introduction  85  of auxiliary variables that increase the overall computational cost and, worse, destroy the special structure of the LP's matrix As- These disadvantages render a solution tailored specifically for the case at hand more desirable. This new solution employs a helper linear program SLP (Ag , bg , Cg ) constructed as follows: H  -{A )  if B{i) = j and T < 0  {As).j  otherwise  s  :d  ifi  1  if B(i) = j and T < 0  0  otherwise  ifl  <  The columns of Ag corresponding to the negative components of T fi ap:  pear negated in Ag, thus ensuring that B is a feasible basis for SLP *. The same 7  construction ensures that the distribution of the non-zero elements stays the same in Ag as in As- S L P  ff  being completely similar to SLP, the same computational  techniques developed for SLP can be applied to S L P . ff  A variable that appears negated in the helper L P is called undesirable. Our objective is to obtain a feasible basis that contains no undesirable variable. The cost function of the helper L P makes the undesirable variables expensive, whereas all the other variables have 0 cost. In order for a pivot to be favorable, it must drive one of the undesirable variables out of the basis. Given a linear program and a feasible basis for it, the basis remains feasible if arbitrary changes are made to the non-basic columns and to the cost function. This enables us to flip the sign of the undesirable variable that has left the basis while keeping the basis feasible. A variable ceases to be undesirable when its sign is flipped. The cost of the variable is made 0, as the variable doesn't need to be kept out of the basis anymore. Eventually all the undesirable variables are driven out of the basis or the original L P was infeasible. The optimal basis B that we end up with contains only H  86  variables that appear with the same sign in the original linear program SLP. As the right-hand sides 65 and bg are also identical, B  H  must be a feasible basis for SLP  as well. The problem of finding an initial feasible basis for SLP(Ag, bs, cs) is thus solved.  6.3  Dealing with Uncertainty and Avoidance of Cycling  Earlier on in this thesis as well as in most textbook presentations, the description of the Simplex algorithm assumes that the numeric operations executed on various floating-point numbers are free of errors. Unfortunately, on real machines, this is not the case. Unavoidable rounding errors introduce an uncertainty with which any floating-point number is known. Floating-point numbers in computations can be thought of more accurately as real intervals: x ± e. Errors in the result of a computation can diminish the usefulness of the result, occasionally rendering it worthless. The problem can be even more complicated when a decision in the program has to be made based on the computed value of some floating-point number. Any comparison between floating-point numbers can be reduced to the comparison between their difference and 0: I O J « ( I - ! , ) O 0 ,  V o £ { < , <, >, >, =,  so the case of comparisons of computed floating-point numbers with 0 will be considered henceforth. The result of the comparison a;±e ° 0 is undetermined if e > |a:|. This is tantamount to saying that the sign of x is uncertain. Comparisons between computed floating-point numbers are used in Simplex in order to decide whether a column shall enter the basis and which column shall leave the basis. In the presence of ill-conditioning, computed quantities occasionally have uncertain signs. Clearly, deciding to take the wrong branch can make the algorithm fail:  87  • If the wrong column is evicted from the basis, an infeasible basis is reached. U p o n checking the sign of the new basic variables, the algorithm signals failure. • If the wrong column enters the basis, a (slightly) more costly basis is reached. T h e algorithm may end up caught in a cycle of bases: at some of them a correct pivot is taken, at at least one other base the pivot is wrong, the overall outcome being cycling of a type different that the one treated by Bland's anticycling algorithm. Bland's anticycling algorithm essentially provides ways of dealing with the fact that two or more computed quantities are equal.  W i t h floating-point  computer arithmetic, two computed quantities are very seldom equal, even in cases where error-free computation would have lead to equal results. A n obvious solution to the problems raised by uncertainty in the results of comparisons is to try both possible paths of the computation. T h e arrival at an infeasible basis or an increase in the cost mean that the current path of computation is in fact wrong and must be abandoned. Clearly this could potentially lead to an explosion in the running time of the algorithm, as each of the n comparisons that the algorithm would effect on an error-free machine can in principle turn into a node of a computation tree with 2 " nodes. In practice, however, the number of uncertain comparisons is expected to be low. W h e n selecting a column to enter the basis, a clearly favorable column shall always be chosen even if possibly but unclearly favorable columns do exist. Similarly, if a clearly favorable column is found but the identity of the column that must leave the basis is uncertain, another clearly favorable column can be given preference if the column to be evicted is clear in its case. Overall, exploring multiple branches is likely to be necessary only in the neighborhood of an ill-conditioned optimal basis. A record of the visited bases is maintained such that various paths of the computations are not explored more than once. Clearly, this also solves the problem of cycling.  88  W h e n the optimal basis of an L P is ill-conditioned, it might not be possible to label it as "clearly optimal", but only as "possibly optimal" instead.  If the ill-  conditioning of the optimal basis is particularly strong, the optimal basis may seem to be numerically singular: the error bound on some of the basic variables simply becomes infinite.  Moreover, bases whose cost is close to but different from the  optimal one, may not appear to be suboptimal or infeasible, but "possibly optimal" instead. T h e algorithm terminates the search for an optimal basis when a clearly optimal basis is detected or when the exploration of the paths arising from comparisons with unclear result is finished. In order to compute a good estimate of the optimal cost, the algorithm must visit the optimal basis or, if this one is ill-conditioned, a standard-suboptimal well-conditioned basis whose cost is very close to the optimum. T h e algorithm guarantees that, for any visited feasible basis, a neighboring basis of lower cost will be visited if one exists. T h i s holds true even if the true sign of the cost difference between the bases is not clear from the computation. Applied recursively, this invariant leads to the fact that the algorithm is guaranteed to visit the optimal basis, which has no feasible and less costly neighbor. E a c h type of basis provides the following information about the primal L P in standard form: • A feasible, clearly standard-suboptimal basis provides an upper bound on the cost. • A n infeasible basis that yields a feasible solution to the dual provides a lower bound on the cost. • A clearly optimal basis provides both an upper and a lower bound on the cost. • A maximal set of potentially optimal bases provides an upper and a lower bound on the cost. T h e cost of each basis in the set can be computed as an  89  interval. T h e ends of the union of all these intervals represent bounds on the true optimal cost. • A numerically singular basis B that has been reached by taking a clearly favorable pivot from another basis S  p r e  d must have a lower cost than the basis  that preceded it (23 d)pre  A n upper (and sometimes also a lower) bound on the optimal cost can be computed by examining the record of visited bases.  T h e upper bound and the  corresponding solution is the information that Coho expects. T h e accuracy of the bound depends directly on the closeness to optimality of the visited bases and on the error bounds on the solutions that correspond to these bases. Clearly a truly optimal basis is the best result of the linear program as regards cost. W h e n the optimal basis is ill-conditioned and consequently produces a solution with large (possibly infinite) error bounds, it is necessary to replace it with the feasible suboptimal well-conditioned basis that has the lowest cost. T h i s slightly suboptimal solution to the standard L P under consideration represents a slightly infeasible solution to its Coho dual. So the true feasible region of the Coho L P is over approximated as required. Ill conditioning can affect the optimal basis of the helper L P used for determining an initial feasible basis for the actual L P . In this case the optimal basis rather than the optimal cost is of interest. Ill conditioning translates into the discovery of more than one possibly optimal basis rather than one clearly optimal one. T h e solution to this type of uncertainty is to try out all the possibly optimal bases of the helper L P as initial feasible bases of the actual L P .  90  6.4  Conserving Structure after Moving Forward in Time  Consider a Coho projectahedron described by the following equation at the beginning of the time step:  Ax > b  (6.1)  0  where  A£R  , b£R  d x d  rf  and  xo represents  a feasible point at the beginning of the  time step. The linearized model for the time step has the form:  x = Mx + q where  M £ R  d x d  , q£R . d  (6.2)  T h e duration of the time step is A t . Let  x  e  be  the position position at the end of the time step of a point whose position at the beginning of the time step was XQ. B y integrating (6.2) for the time step we obtain:  x =e x MAt  e  + (e  - I)M~ q  MAt  0  (6.3)  l  T h e combination of (6.1) and (6.3) yields the equation of the region resulted from moving the projectahedron forward i n time:  AEx  >b  e  where E = ~ e  M  A  t  (6.4)  e  and b = b + A(I - e~ )M~ q. MAt  l  e  Matrix E results from matrix  exponentiation and is therefore non-singular. A t the end of the time step, the region described by  AEx  e  >b  e  needs to  be projected back onto various projection planes. T h i s amounts to solving several Coho linear programs of the form  CLP{AE,b ,c) e  (6.5)  where c is some cost function. In general, the postmultiplication of A by E yields a matrix that no longer presents the two non-zeros per row structure that describes a projectahedron. Clearly, 91  the machinery developed for linear programs whose feasible regions are projectahedra cannot be applied directly to linear programs of the type described by (6.5). However, the following transformation enables the reduction of the latter type to the former: in the original problem:  cx T  min AEx>b  (6.6)  the following change of variable is effected:  y = Ex&x  = E~ y l  (6.7)  which yields: min  dy T  (6.8)  Ay>b where d = E~ c l  is the transformed cost function. A s matrix A is the left-hand side  of the inequality that describes a projectahedron, this transformed problem can be solved using the techniques presented earlier. Let y t be the optimal solution to the transformed problem. T h e optimal op  solution to the original problem is:  x  = E~ y l  opt  opt  (6-9)  T h e multiplication of the cost vector and of the optimal solution by the quantity  E~ — e l  MAt  are the only operations that need to be added to the Coho  L P solver in order to enable it to deal with the kind of L P s described by (6.5).  92  Chapter 7  Conclusions 7.1  What has been Accomplished  The verification performed by Coho makes heavy use of linear programming. The applicability of the tool depends critically on the accuracy of the solutions to the linear programs that it has to solve. Linear programming has a well-known mathematical solution that is the Simplex algorithm. However, numerical errors that are inherent to floating point computation sometimes make this algorithm produce unacceptably imprecise results. This thesis studied ways of exploiting the special structure of the linear programs that arise in Coho in order to compute better solutions to them. A Coho linear program cannot be solved by Simplex directly. The standard solution of introducing additional variables in order to bring it to standard form would have destroyed its special structure. This has been circumvented by observing that the dual of a Coho linear program is a linear program in standard form that exhibits the same special structure and to which Simplex can be applied directly. Moreover, the solution of the Coho L P is straightforward to compute from that of its dual. As part of solving Coho linear programs, it is necessary to compute Simplex tableau columns. The computation of a tableau column implies solving a linear 93  system. T h e structure of such linear systems is closely related to that of the linear programs from which they arise. Because of their structure it was possible to devise a linear-time algorithm for solving Coho linear systems. Based on the linear-time solver, Simplex was modified such as to prevent the propagation of numerical errors between steps. T h e main numerical problem that affects the Simplex algorithm as modified for Coho is the errors that affect the solutions to linear systems. Therefore research effort has been directed towards computing more accurate solutions to these systems. A characterization of the numerical stability of the Coho linear systems has been obtained. Whereas in quantitative terms it is not complete, it does shed light on the possible sources of ill-conditioning. A possible way of determining a forward error bound on the solution to a Coho linear system has been presented. In the case of most of the L P s encountered in Coho, their optimal cost, rather than their optimal solution, is of interest. T h e research has identified a class of linear programs for which the computation of the optimal cost can be achieved with a small error, although the optimal vertex is replaced with the vertex obtained by pivoting from the optimal basis to the bounding box of the feasible region. T h i s class has been shown to contain a part of the Coho linear programs. T h e error bounds that are computed on the results of floating-point operations permit the linear program solver to identify the situations where the use of a computed value could potentially lead to an incorrect computation path.  In  such situations, the solver tries out both possible computation paths if the first one fails. T h i s strategy prevents the solver from failing by reaching an infeasible basis and guarantees that the optimal basis of the linear program to solve is eventually visited. A n implementation of Simplex that incorporates the modifications proposed in the thesis was written in Java and integrated within the Coho verification tool. T h e implementation raises several non-trivial issues that have been presented in  94  detail in the body of the thesis. In conclusion, this research has resulted in progress towards better solutions to Coho linear systems and, by way of consequence, to Coho linear programs. Consequently, this is expected to increase the usability of Coho. However, some important questions have remained open.  7.2  Suggestions for Further Research  The study of the Coho linear programs is not finished yet. As mentioned in the body of the thesis, some further explorations are necessary. The fact that the optimal cost is computed in two ways which yield errors that vary in opposite directions with the conditioning of the optimal basis suggests that it should be possible to determine an error bound on the optimal cost of a Coho linear program that depends only on the unit roundoff and on the dimension of the space. In order to get a tighter bound on the error in the optimal solution, a better estimate of the condition number of cyclic matrices is needed. Experimental evidence suggests that the available estimate overapproximates the true condition number by several orders of magnitude when the ill conditioning is high. Moreover, there are linear systems for which the condition number is an overly conservative error predictor [CF88]. The identification of such a situation requires knowledge of the singular value decomposition (SVD) of the system's matrix. General methods for SVD computation are not linear time. A linear-time or nearly linear-time SVD method tailored for Coho cyclic matrices is a desirable first step towards more precise error prediction. If highly acute ill-conditioned vertices can be shown to occur, a method for dealing with them is certainly needed. Rescaling the system appears to be a promising approach. This is a topic for future research. When the optimal basis B of a linear program represents a highly obtuse and 95  ill-conditioned vertex, a slightly suboptimal basis B' is guaranteed to exist if the bounding box of the feasible region is made part of the linear program.  Unfortu-  nately, a theoretic proof that B' is reached during the search for optimality is yet to be found. T h e experimental evidence currently available is inconclusive. Alternatively, the algorithm could be modified to ensure the visitation of basis B'.  T h e slightly suboptimal basis B' becomes the optimal basis if the cost  vector is tilted slightly in an appropriate way. Such a slight modification to the cost function would guarantee that basis B' is discovered as the optimal basis. Running error analysis is presently used for all the Simplex operations that involve real numbers. In the absence of ill-conditioning, this is wasteful. A conservative estimate of the condition number of a left-hand side of a Coho linear system can be computed in linear time. T h e ability to recognize cases where error analysis is unnecessary based on the condition number estimate can improve the running time of a program by a constant factor. T h e use of Givens rotations [Hig96, p.371] for the solving of the Coho linear systems has been suggested by scientific computations experts [HabOl]. T h e problem seems to be that Givens rotations work best when various rotations are independent, which is not true in our case. Another suggestion was the replacement of Simplex with the interior point  method  as the algorithm used for solving linear programs. T h i s avenue is entirely  unexplored.  96  Bibliography [AL94]  Martin Abadi and Leslie Lamport. An old-fashioned recipe for real time. ACM Transactions on Programming Languages and Systems, 16(5):15431571, September 1994.  [Bla77]  R. Bland. New finite pivoting rules for the simplex method. Mathematics of Operations Research, 2:103-107, 1977.  [CF88]  Tony F. Chan and David E. Foulser. Effectively well-conditioned linear systems. SI AM Journal on Scientific and Statistical Computations, 9(6):963969, November 1988.  [GM98] Mark R. Greenstreet and Ian Mitchell. Integrating projections. In Thomas A. Henzinger and Shankar Sastry, editors, Proceedings of the First International Workshop on Hybrid Systems: Computation and Control,  pages 159-174, Berkeley, California, April 1998. [GM99] Mark R. Greenstreet and Ian Mitchell. Reachability analysis using polygonal projections. In Proceedings of the Second International Workshop on Hybrid Systems: Computation and Control, pages 103-116, Berg en Dal, The Netherlands, March 1999. Springer. LNCS 1569. [HabOl] Eldad Haber. Personal communication, 2001. [Hig96] Nicholas J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, 1996. [Lin92]  Elliot Linzer. On the stability of transform-based circular deconvolution. SIAM Journal on Numerical Analysis, 29(5):1482-1492, October 1992.  [Mul97] Jean-Michel Muller. Elementary Functions: Algorithms and Implementation. Birkhauser, Boston, Basel, Berlin, 1997. [PS82]  Christos H. Papadimitriou and Keneth Steiglitz. Combinatorial Optimization. Prentice-Hall, Englewood Cliffs, New Jersey 07632, 1982. 97  [TPS98] Claire Tomlin, George Pappas, and Shankar Sastry. Conflict resolution in air traffic management: A study in multi-agent hybrid systems. IEEE Transactions on Automatic Control, 43(4):509-521, April 1998.  98  Appendix A  Definitions and Notations A.l  Notations  T h e notation for matrix element selection follows the Matlab style, with the selector appearing as a subscript rather than as an argument.  matrix element Aij  : the element of matrix A at the intersection of column i and row j  matrix row Ai : : row i of matrix A t  matrix column A-j  : column j of matrix A  set of matrix columns A j, :  where  J —  ( j i , . . .  ,jc}  with C columns such that S-  is a set of indices of columns of iC  = Aj , : c  element of row or column bk : element k of the row or column b  set of column elements 99  Vc = 1 , . . . , C  A  : a matrix  S  bj, where J — {ji,...,  jc} is a set of indices of elements of b : a column matrix  S with C elements such that S = bj , Vc — 1,..., C c  c  vertical matrix join [A\B] : a matrix obtained by appending each row of matrix B to the sameindex row of A; A and B must have the same number of rows  identity matrix I  n  A.2  : the n x n identity matrix; n is omitted when it results from the context  Definitions  linear subspace A linear subspace S of R  d  is a subset of R  d  closed under vector addition and  scalar multiplication.  afRne subspace A n affine subspace A of R  d  is a linear subspace S translated by a vector u:  A = {u + x:x G S} orthogonal complement of a subspace T h e orthogonal complement of a subspace S C R™ is defined by:  5  X  =  {y G R" : y x = 0 for T  all  x G S}  orthogonality of a vector to a subspace A vector v is orthogonal to a linear subspace W if v is orthogonal to every vector in W.  projection of a vector onto a subspace Let W be a subspace of R . Let {u\,... d  100  ,Uh} be an orthonormal basis for W.  If v is a vector in R , the projection of v onto W is denoted p r o j ^ u and is d  defined by h  pro)  w  v = ^2{v • ujui i=l  distance from a point to a subspace T h e distance from a point x to a subspace W is defined as:  d(x, W) = vaind(x,y) y€W  A n important property is that: d(x, W) = \\x - projiy x\\  positive combination  p  Given x G R  d  vectors  x\,... ,x  p  G  R , a positive combination of them is a vector d  of the form: P X  x=l  cone  xi,...,x  T h e cone generated by a set of vectors  GR  p  rf  is the set of all their  positive combinations. hyperplane  A hyperplane in R  HP(a,  d  is a set of points defined by:  b) = {x G R  d  :ax T  = b}  A hyperplane is an affine subspace of H  d  aGR  d  \ {0},  bG  R  of dimension d — 1.  hyperplane normal  A n y vector Aa, where A G R \ {0}, is a normal to the hyperplane H P ( a , b). 101  halfspace A hyperplane H P (a, b) in R  d  represents the common boundary of two closed  halfspaces, which are defined as:  HS>(a,6) = {x G R : a x > b} d  T  RS<{a,b) = {x G R : a x < b} d  T  halfspace normal A halfspace normal is a normal to the boundary of a halfspace.  inward halfspace normal A inward halfspace normal n of a halfspace HS is a halfspace normal such that Vx  G H S , VA G [0, oo)  x + An  G HS  Intuitively, a inward normal points from the boundary of the halfspace towards its interior. T h e vector a is a inward normal for the halfplane HS>(a, b). T h e vector —a is a inward normal for the halfplane HS<(a, b).  closed convex polyhedron A closed convex polyhedron is the intersection of a finite number of halfspaces: / P H = P| HS>(ai, bi), where a G R t  d  \ {0}  and b G R t  t=i  In matrix form: PH(A, where  A GR  fxd  and  b) = {x e R : Ax > b} d  bGR . d  Each hyperplane may correspond to a face of the polyhedron, although some of them might be redundant. T h e polyhedron may be empty or unbounded in some directions.  102  hypercube A hypercube in R  d  is the Cartesian product of d closed real intervals.  standard basis T h e standard basis of the R  d  vector space is the basis  {e; : i = where  I 0,  otherwise  vector norm T h e norm of a vector v in R  d  is denned as:  d  i=l unit vector A unit vector is a vector whose norm is 1. T h e unit vector, i.e. the direction, of a vector v is defined by:  dh"u  =  77—TT  ll'"ll  projection of a vector onto another vector T h e projection of a vector v onto another vector u is defined by:  proi,, F  "  J  v— u = (v • divu) d i r u u-u ' v  vector angle T h e angle between two vectors u and v is defined by:  A. (u, v) = arccos  " ..^  = arccosfdir u • dir v)  \\u\\ \\v\\ 103  distance between points T h e distance between two points x and y in R  d  d{x,y) = \\x - y\\  104  is:  


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items