A Block Preconditioning Cost Analysis for Solving Saddle-Point Linear Systems

by

Sarah Claire MacKinnon-Cormier
B.Sc., Dalhousie University, 2003

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in The Faculty of Graduate Studies (Computer Science)

The University of British Columbia

October 2005

© Sarah Claire MacKinnon-Cormier 2005

Abstract

We investigate the cost of preconditioning when solving large sparse saddle-point linear systems with Krylov subspace methods. To use the block structure of the original matrix, we apply one of two block preconditioners. Algebraic eigenvalue analysis is given for a particular case of the preconditioners. We also give eigenvalue bounds for the preconditioned matrix when the preconditioner is block diagonal and positive definite. We treat linear solves involving the preconditioner as a subproblem which we solve iteratively. In order to minimize cost, we implement a fixed inner tolerance and a varying inner tolerance based on bounds developed by Simoncini and Szyld (2003) and van den Eshof, Sleijpen, and van Gijzen (2000). Numerical experiments compare the cost of preconditioning for various iterative solvers and block preconditioners. We also experiment with different tolerances for the iterative solution of linear solves involving the preconditioner.

Contents

Abstract
Contents
List of Tables
List of Figures
Acknowledgements
1 Introduction
2 Background
  2.1 Saddle-Point Systems
  2.2 Krylov Subspace Solvers
    2.2.1 Conjugate Gradient Method
    2.2.2 Minimal Residual Method
    2.2.3 Generalized Minimal Residual Method
3 Block Preconditioners
  3.1 Constraint Preconditioner
    3.1.1 Factorizations of M_c
    3.1.2 Eigenvalues of M_c
  3.2 Positive-Definite Preconditioner
4 Eigenvalue Analysis
  4.1 Connections Between the Eigenvalues of M_c and M_p
  4.2 Eigenvalues of M_c when G = I
  4.3 Eigenvalue Bounds
    4.3.1 Bounding the positive eigenvalues
    4.3.2 Bounding the negative eigenvalues
5 Computational Cost of Preconditioning
  5.1 Solving Mu = f
  5.2 Cost of solving the preconditioned linear system with GMRES
  5.3 Cost of solving the preconditioned linear system with MINRES
6 Inner Tolerances
  6.1 Constant Inner Tolerance
  6.2 Varying Inner Tolerance
7 Numerical Experiments
  7.1 Test Problems
    7.1.1 stokes2d(0,60)
    7.1.2 mosarqp2
  7.2 Comparison of various G values
  7.3 The Dominance of the Schur Complement Solve
  7.4 Comparison of Krylov subspace solvers
  7.5 Setting the Inner Tolerance
  7.6 The Bottom Line
8 Conclusions and Future Work
  8.1 Future Research
Bibliography
A The Matlab Program
  A.1 OURGMRES
  A.2 OURMINRES

List of Tables

7.1 Checking validity of estimated cost functions
7.2 Determining ratio of work for S-solves for stokes2d(0,60)
7.3 Determining ratio of work for S-solves for mosarqp2, GMRES(15), M_p, and G equal to the identity
7.4 Multiple of extra work required for GMRES(15) to converge based on GMRES convergence
7.5 Multiple of extra work required for MINRES to converge based on GMRES convergence
7.6 Comparison of number of S-solve iterations required when using fixed inner tolerance 10^-10 and varying inner tolerance
7.7 Number of S-solve iterations for stokes2d(0,60) when G equals the identity
7.8 Number of S-solve iterations for stokes2d(0,60) when G equals the diagonal of A
7.9 Number of S-solve iterations for stokes2d(0,60) when G is the incomplete Cholesky factorization of A
7.10 Number of S-solve iterations for mosarqp2 when G equals the identity
7.11 Number of S-solve iterations for mosarqp2 when G equals the diagonal of A
7.12 Number of S-solve iterations for mosarqp2 when G is the incomplete Cholesky factorization of A

List of Figures

6.1 Comparison of clamped versus unclamped varying inner tolerance for stokes2d(0,5)
7.1 Sparsity pattern of H formed by stokes2d(0,60)
7.2 Eigenvalues of H, M_c^{-1}H, and M_p^{-1}H for stokes2d(0,60)
7.3 Sparsity pattern of H formed by mosarqp2
7.4 Eigenvalues of H, M_c^{-1}H, and M_p^{-1}H for mosarqp2
7.5 Comparison of various values for G
7.6 Differences in convergence behaviour for full GMRES, GMRES(15), and MINRES
7.7 Total number of S-solve iterations needed at various inner tolerances when using GMRES with M_c
7.8 Total number of S-solve iterations needed at various inner tolerances when using GMRES with M_p
7.9 Total number of S-solve iterations needed at various inner tolerances when using GMRES(15) with M_c
7.10 Total number of S-solve iterations needed at various inner tolerances when using GMRES(15) with M_p
7.11 Total number of S-solve iterations needed at various inner tolerances when using MINRES
7.12 X-shaped graphs demonstrating rates at which inner tolerance increases and relative residual decreases as GMRES converges

Acknowledgements

First and foremost, I wish to thank my supervisor, Dr. Chen Greif. Thank you! Thank you! Thank you! I could double the length of this thesis with thank-yous to you alone, Chen, and still feel indebted. Your guidance, support, encouragement, patience, and kindness will never be forgotten. It has been a true honour working with you.

To my second reader, Dr. Michael Friedlander, thank you too. Your thoroughness and willingness to work under such strict time constraints have certainly not gone unnoticed.

Infinite thanks are owed to my family, whose love and support have never wavered, even when I wouldn't have blamed them if they had. Mum: we did it! You are an inspiration and I am so proud of you. I look forward to our well-earned adventures in Portugal together.

Papa, you are my rock. My anchor. I appreciate your advice, your support, and above all your love.

Sebastien and Julien, where do I start? I am the middle child sandwiched between two brothers; no wonder I ended up in computer science. You taught me to hold my own, to say what I mean, and to be true to myself. I love you guys.

Many thanks to the best office mates a girl could ask for. Micheline and Greg, your friendship alone would have made the last two years worthwhile. Luckily, I got a thesis out of it too. Bullpen 238, you all made me feel so welcome right from the start. Yeah, that's right, that's you Vance, Jim, Jeanette, Juan, and Richard.

Lastly, I apologize to my roommate and best friend Jackie. Living with me these last few months has been hell. I promise to make it up to you.

Chapter 1

Introduction

We investigate iterative techniques for computing the solution of the linear system

         [ A    B ] [ x ]   [ c ]
    Hu = [ B^T  0 ] [ y ] = [ d ] = b,        (1.1)

where A ∈ R^{n×n}, B ∈ R^{n×m}, and m < n.
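For concreteness, a system of the form (1.1) can be assembled in sparse form, after which an iterative solver needs nothing beyond products with it. This is an illustration only; the sizes, densities, and random data below are arbitrary and not from the thesis's test problems:

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)
n, m = 200, 50

# Sparse SPD A and (almost surely) full-column-rank B, for illustration.
A = sp.random(n, n, density=0.05, format="csr", random_state=rng)
A = A @ A.T + n * sp.eye(n)
B = sp.random(n, m, density=0.2, format="csr", random_state=rng)

# Saddle-point matrix of (1.1): H = [A B; B^T 0].
H = sp.bmat([[A, B], [B.T, None]], format="csr")

u = rng.standard_normal(n + m)
v = H @ u                                    # the only operation a Krylov method needs
sym_err = np.abs((H - H.T).toarray()).max()  # H is symmetric (and indefinite)
print(H.shape, v.shape, sym_err)
```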
The above saddle-point linear system is also commonly known as a Karush-Kuhn-Tucker (KKT) system. We will concentrate on saddle-point systems where A is symmetric positive definite, B has full column rank, and m < n (possibly m << n). We assume that H is large and sparse. Hence, the solution must be found with an iterative method where matrix-vector products are the only convenient operation to perform.

In recent years, preconditioning has been the focus of much active research. The idea behind preconditioning is quite simple: transform a linear system into another one that is easier to solve. Left preconditioning (1.1) would give us the new linear system

    M^{-1} H u = M^{-1} b,        (1.2)

where M ∈ R^{(n+m)×(n+m)} is a preconditioner. When a linear system is altered by post-multiplying by a preconditioner, the system is said to be right preconditioned. Although right preconditioning is a common preconditioning technique, we will only focus on left preconditioning throughout this thesis. Therefore, left preconditioning will be referred to simply as preconditioning.

Since we are working with linear systems with a particular block structure, we wish to exploit that structure when preconditioning. We investigate the following block preconditioners:

    M_c = [ G    B ]        (1.3)
          [ B^T  0 ],

and

    M_p = [ G    0            ]        (1.4)
          [ 0    B^T G^{-1} B ],

where G is an approximation to A. Within this thesis, M_c is called a constraint preconditioner while M_p is called a positive-definite preconditioner. The latter of the two names is appropriate as we will only be considering matrices A and G that are positive definite. In that case, M_p is positive definite, while M_c is indefinite.

The goal of this thesis is to compare the cost of preconditioning a saddle-point linear system with the constraint and the positive-definite preconditioners when iteratively solving with the Krylov subspace methods GMRES [30] and MINRES [27].
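The definiteness claims above are easy to see numerically. In this sketch (an illustration, not one of the thesis's experiments; the sizes are arbitrary and G = diag(A) is one of the thesis's three later choices), M_c has n positive and m negative eigenvalues while M_p is positive definite:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3

# Random SPD A and full-column-rank B (illustrative sizes only).
M0 = rng.standard_normal((n, n))
A = M0 @ M0.T + n * np.eye(n)
B = rng.standard_normal((n, m))

G = np.diag(np.diag(A))             # an SPD approximation to A
Ginv = np.diag(1.0 / np.diag(A))
S = B.T @ Ginv @ B                  # B^T G^{-1} B, SPD since B has full column rank

Mc = np.block([[G, B], [B.T, np.zeros((m, m))]])               # (1.3)
Mp = np.block([[G, np.zeros((n, m))], [np.zeros((m, n)), S]])  # (1.4)

ev_c = np.linalg.eigvalsh(Mc)
ev_p = np.linalg.eigvalsh(Mp)
print((ev_c > 0).sum(), (ev_c < 0).sum())  # M_c indefinite: n positive, m negative
print((ev_p > 0).sum())                    # M_p positive definite: all n + m positive
```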
In Krylov subspace methods, solves of the form (1.2) are performed at each iteration. Instead of performing this solve with MATLAB's built-in backslash operator, we treat linear solves involving the preconditioner as a subproblem. We solve the subproblem with a second iterative solver embedded into the first. We define the iterations taken by the first solver to be the outer iterations, while those taken by the embedded solver are defined to be the inner iterations.

We derive cost functions for solving (1.2) based on the block structure of the preconditioner used. The cost functions are broken down into the total number of matrix-vector products (MVPs) with B or B^T, as well as G-solves and S-solves. As the name implies, a G-solve is the solution of a linear system involving G. An S-solve is the solution of a linear system involving the negation of the Schur complement of G in M_c, namely S = B^T G^{-1} B. G-solves and S-solves are central to this thesis and are properly defined in Chapter 5.

Inner iterations can (and should) be solved with different input parameters than outer iterations. We concentrate on two different ways of setting the inner tolerance in the hopes of reducing the overall cost of preconditioning.

The first method involves simply setting the inner tolerance to a fixed constant independent of the outer iteration. MINRES responds nicely to this method and has an optimal fixed tolerance of 10^-2 for our test problems. On the other hand, full GMRES often stagnates when the inner tolerance is larger than the outer tolerance. Similar behaviour can be seen when applying GMRES(15) to the same problems.

Stagnation in full GMRES and GMRES(15) can be resolved by introducing a varying inner tolerance. The idea is to force a tight inner tolerance at the beginning of the solve and to relax the tolerance as the iterative method converges.
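The inner-outer structure just described can be sketched as follows. This is a minimal illustration, not the thesis's MATLAB implementation: SciPy's GMRES serves as the outer solver, a hand-rolled CG applies M_p^{-1} as the inner solver, and for the demonstration G = A is taken so that outer convergence is essentially immediate (a choice Chapter 3 notes is impractical in general). All sizes and tolerances are placeholders:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

rng = np.random.default_rng(1)
n, m = 30, 10
M0 = rng.standard_normal((n, n))
A = M0 @ M0.T + n * np.eye(n)                 # SPD (1,1) block
B = rng.standard_normal((n, m))               # full column rank
H = np.block([[A, B], [B.T, np.zeros((m, m))]])
b = rng.standard_normal(n + m)

# Inner-solve target: M_p with G = A (demo only), so S = B^T A^{-1} B.
S = B.T @ np.linalg.solve(A, B)
Mp = np.block([[A, np.zeros((n, m))],
               [np.zeros((m, n)), S]])

inner_iters = 0

def inner_cg(Mmat, r, tol=1e-10, maxiter=1000):
    """Plain CG for the SPD inner system Mmat z = r (the 'inner iterations')."""
    global inner_iters
    z = np.zeros_like(r)
    res = r.copy()
    p = res.copy()
    rs = res @ res
    nr = np.sqrt(rs)
    for _ in range(maxiter):
        if np.sqrt(rs) <= tol * nr:
            break
        q = Mmat @ p
        alpha = rs / (p @ q)
        z += alpha * p
        res -= alpha * q
        rs_new = res @ res
        p = res + (rs_new / rs) * p
        rs = rs_new
        inner_iters += 1
    return z

# Outer solver: GMRES preconditioned by M_p^{-1}, applied via inner CG.
Minv = LinearOperator((n + m, n + m), matvec=lambda r: inner_cg(Mp, r))
x, info = gmres(H, b, M=Minv)
rel = np.linalg.norm(H @ x - b) / np.linalg.norm(b)
print(info, inner_iters, rel)
```

The total inner iteration count (here, CG steps, standing in for the thesis's S-solve and G-solve counts) is what the fixed versus varying inner-tolerance strategies aim to reduce.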
Simoncini and Szyld [31-33], Notay [26], and van den Eshof, Sleijpen, and van Gijzen [35] all developed similar inner-tolerance bounds for inexact matrix-vector products in Krylov subspace methods. Despite different inner iteration definitions, we are able to extend this theory to meet our needs. We implement the varying inner tolerance for both full GMRES and restarted GMRES. For our test problems, the results are favourable. In all cases where stagnation had occurred with a fixed inner tolerance, GMRES and GMRES(15) with a varying inner tolerance converged.

All numerical examples are performed with one of three choices for G: the identity, the diagonal of A, or the incomplete Cholesky factorization of A. In the case that G is diagonal, we assume that all diagonal entries are positive and therefore that G is positive definite.

The outline of this thesis is as follows. In Chapter 2 we introduce saddle-point linear systems, their origins, and the fields in which they most often arise. Although it is assumed the reader has some knowledge of iterative methods, an overview of Krylov subspace solvers is given. More specifically, we revisit the conjugate gradient (CG) method, the minimal residual (MINRES) method, and the generalized minimal residual (GMRES) method.

In Chapter 3, we discuss important properties of M_c and M_p. We present block factorizations and pre-existing eigenvalue analysis for the two block preconditioners and the generalized eigenvalue problem formed when applying them to (1.1).

In Chapter 4 we discuss a new algebraic result which links the eigenvalues of the two preconditioners when we take G equal to the identity. We also show a connection between the eigenvalues of M_c and the singular values of B when G is the identity. Then, we closely follow theory developed by Rusten and Winther [29] to find algebraic bounds for the eigenvalues of M^{-1}H, where M is a block-diagonal positive-definite matrix.
The bounds are based on the singular values of the (1,2) block of M^{-1}H.

Cost functions to quantify the cost of preconditioning GMRES or MINRES with M_c or M_p are given in terms of MVPs, G-solves, and S-solves in Chapter 5. When we solve the preconditioned saddle-point system with a fixed inner tolerance, these cost functions can be drastically simplified by using an average iteration number for an S-solve.

Chapter 6 describes the two ways of setting the inner tolerance: fixed and varying. We also show some small cost savings when using a clamped varying inner tolerance (as opposed to an unclamped one).

Two test problems are chosen to demonstrate our findings in the numerical experiments chapter. We address the differences between the values for G, the dominance of the Schur complement solve, the differences between the convergence of the three Krylov subspace solvers, fixed versus varying inner tolerances, and most importantly, the difference between M_c and M_p. We conclude with an overview of future research.

Chapter 2

Background

We assume the reader has a fairly extensive knowledge of linear systems and the various ways of iteratively computing their solutions. This chapter is included to introduce the origin and importance of a particular linear system: the saddle-point linear system. A refresher on Krylov subspace methods is also given, as they are the methods of choice throughout this thesis.

2.1 Saddle-Point Systems

The KKT system (1.1) often arises from the first-order optimality conditions for the following classical quadratic program:

    min (1/2) x^T A x - c^T x   subject to   B^T x = d.

In the quadratic programming formulation, y represents the vector of Lagrange multipliers. A solution (x, y) of (1.1) is a saddle point of the Lagrangian function

    phi(x, y) = (1/2) x^T A x - x^T c + y^T (B^T x - d).

This explains where and why saddle-point systems get their name. To compute the saddle points of phi(x, y), solve grad phi(x, y) = 0.
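Setting grad phi = 0 yields exactly system (1.1): the x-gradient gives Ax + By = c and the y-gradient gives B^T x = d. A quick numerical sanity check of this (illustrative only; the data below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 6, 2
M0 = rng.standard_normal((n, n))
A = M0 @ M0.T + n * np.eye(n)          # SPD Hessian
B = rng.standard_normal((n, m))
c = rng.standard_normal(n)
d = rng.standard_normal(m)

# KKT system (1.1): [A B; B^T 0][x; y] = [c; d]
H = np.block([[A, B], [B.T, np.zeros((m, m))]])
u = np.linalg.solve(H, np.concatenate([c, d]))
x, y = u[:n], u[n:]

# Gradient of phi(x, y) = 0.5 x^T A x - x^T c + y^T (B^T x - d):
grad_x = A @ x - c + B @ y
grad_y = B.T @ x - d
print(np.linalg.norm(grad_x), np.linalg.norm(grad_y))  # both ~ 0
```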
For a more in-depth look at the quadratic program and its origin, we refer the reader to [25, Chapter 16].

Benzi, Golub, and Liesen [3] provide a survey of numerical solution techniques used to solve saddle-point systems. They also offer the following extensive, though not exhaustive, list of fields where such systems arise:

• Circuit analysis
• Computational fluid dynamics
• Constrained and weighted least-squares estimation
• Constrained optimization
• Data interpolation and surface fitting
• Economics
• Electromagnetism
• Finance
• Image reconstruction
• Image registration
• Interior-point methods
• Interpolation of scattered data
• Linear elasticity
• Mesh generation for computer graphics
• Mixed finite-element approximations of elliptic PDEs
• Model order reduction for dynamical systems
• Optimal control
• Parameter identification
• Structural analysis

Typically, saddle-point systems arise when a certain quantity needs to be minimized while ensuring that a set of linear constraints is met. In other words, we need to minimize a function of n variables subject to m linear constraints. In some cases, such as in mixed finite-element formulations, constraints are sometimes added to unconstrained problems to acquire stability. Our numerical experiments will be done using saddle-point systems that arise in either optimization or fluid dynamics, more specifically, the Stokes problem. We have added the above lengthy list to convey the importance and abundance of saddle-point systems.

2.2 Krylov Subspace Solvers

The literature is rich with ways of solving the saddle-point linear system (1.1). If the system is of a reasonable size, direct methods can be used [7]. There are also null-space methods [14], classical and inexact Uzawa methods [1, 11], splitting schemes [10], and Krylov subspace solvers. We are interested in the solution of (1.1) by preconditioned Krylov subspace solvers.
Since the conjugate gradient (CG) method, the minimal residual (MINRES) method, and the generalized minimal residual (GMRES) method play a significant role in this thesis, a short overview of Krylov subspace methods is included.

Krylov subspace methods date back several decades. Two ground-breaking papers were published in 1952, one by Lanczos [21] and the other by Hestenes and Stiefel [17], describing what we now know today as the CG method. Oddly enough, Hestenes, who was based at UCLA, and Stiefel, who was based in Zurich at the time, developed the same algorithm independently and only collaborated on a publication after realizing the coincidence. Initially, Krylov subspace methods were not viewed as iterative methods. Despite their early discovery, it was not until 1971, when Reid [28] demonstrated the importance of the finite termination property, that Krylov subspace methods were popularized.

Like most iterative methods, Krylov subspace methods are used to solve the linear system Ax = b when A ∈ R^{n×n} is large and sparse, or when we only have access to a subroutine that returns the matrix-vector product of A and a given vector.

Definition 1. Let A ∈ R^{n×n} and b ∈ R^n (b ≠ 0). Then,

    K_k(A, b) = span{b, Ab, A^2 b, ..., A^{k-1} b}

is called the k-dimensional Krylov subspace associated with A and b.

To understand how Krylov subspace methods work, we first define minimal polynomials.

Definition 2. The minimal polynomial p(x) of A ∈ R^{n×n}, also known as the annihilating polynomial, is the monic polynomial of smallest degree such that p(A) = 0.

Suppose p(x) is the minimal polynomial of A. Then, we have several desirable properties:

• The roots of p(x) are equal to the eigenvalues of A.
• p(x) is unique up to scaling.
• If A is nonsingular, the solution lies in a Krylov subspace of dimension k, where k is equal to the degree of p(x).

Hence, the fewer the number of distinct eigenvalues, the smaller the Krylov subspace we must search for the solution. This leads to faster convergence.

The algorithmic sketch of Krylov subspace methods is as follows: Guess an initial solution x^(0) (typically, x^(0) = 0). Set the initial residual r^(0) = b - Ax^(0). At every iteration i, produce an approximation x^(i) of Ax = b such that x^(i) ∈ x^(0) + K_i(A, r^(0)) satisfies a certain optimality property.

As with all iterative methods, it is essential to use some form of preconditioning, as will be done in this thesis.

2.2.1 Conjugate Gradient Method

As previously mentioned, Hestenes and Stiefel are responsible for the discovery of the CG method. Today, the CG method is the method of choice for solving symmetric positive-definite (spd) linear systems. It is strongly related to the Lanczos algorithm for tridiagonalizing matrices. If exact arithmetic is assumed, it can be shown that the conjugate gradient method is guaranteed to converge in n iterations when applied to n × n matrices. Reid was the first to use the fact that the solution is refined at each iteration. Hence, inexact solutions can be found well before completing all n iterations.

At each iteration, an approximation x^(i) of Ax = b is produced by minimizing the energy norm of the error. Namely, we are minimizing (x^(i) - x)^T A (x^(i) - x), where x is the exact solution.

Algorithm 1 Pseudocode for the conjugate gradient method

    r^(0) = b - Ax^(0)
    p^(1) = r^(0)
    for i = 1, 2, ... do
        if i > 1 then
            beta_i = r^(i-1)T r^(i-1) / r^(i-2)T r^(i-2)
            p^(i) = r^(i-1) + beta_i p^(i-1)
        end if
        q^(i) = A p^(i)
        alpha_i = r^(i-1)T r^(i-1) / p^(i)T q^(i)
        x^(i) = x^(i-1) + alpha_i p^(i)
        r^(i) = r^(i-1) - alpha_i q^(i)
    end for

2.2.2 Minimal Residual Method

MINRES is a Lanczos-based method developed in 1975 by Paige and Saunders [27]. It is a variant of the conjugate gradient algorithm that can be directly applied to symmetric indefinite systems. Instead of minimizing the energy norm (which is not valid in the indefinite case), the 2-norm of the residual is minimized.

The pseudocode for unpreconditioned MINRES in Algorithm 2 is given in [36].

2.2.3 Generalized Minimal Residual Method

GMRES was developed in 1986 by Saad and Schultz [30]. It is an extension of MINRES that is applicable to unsymmetric matrices. Although the algorithm still generates a sequence of orthogonal vectors, this can no longer be accomplished with short recurrences. The added storage requirements prompted the development of restarted GMRES. See Algorithms 3 and 4 for pseudocode.

Algorithm 2 Pseudocode for the MINRES method

    Compute v^(1) = b - Ax^(0) for some initial guess x^(0)
    beta_1 = ||v^(1)||_2; eta = beta_1
    gamma_1 = gamma_0 = 1
    sigma_1 = sigma_0 = 0
    v^(0) = 0
    w^(0) = w^(-1) = 0
    for i = 1, 2, ... do
        v^(i) = v^(i)/beta_i
        alpha_i = (Av^(i), v^(i))
        v^(i+1) = Av^(i) - alpha_i v^(i) - beta_i v^(i-1)
        beta_{i+1} = ||v^(i+1)||_2
        delta = gamma_i alpha_i - gamma_{i-1} sigma_i beta_i
        rho_1 = sqrt(delta^2 + beta_{i+1}^2)
        rho_2 = sigma_i alpha_i + gamma_{i-1} gamma_i beta_i
        rho_3 = sigma_{i-1} beta_i
        gamma_{i+1} = delta/rho_1
        sigma_{i+1} = beta_{i+1}/rho_1
        w^(i) = (v^(i) - rho_3 w^(i-2) - rho_2 w^(i-1))/rho_1
        x^(i) = x^(i-1) + gamma_{i+1} eta w^(i)
        ||r^(i)||_2 = |sigma_{i+1}| ||r^(i-1)||_2
        Check convergence; continue if necessary.
        eta = -sigma_{i+1} eta
    end for

Algorithm 3 Pseudocode for the restarted GMRES method

    x^(0) is an initial guess
    for j = 1, 2, ... do
        solve r from Mr = b - Ax^(0)
        v^(1) = r/||r||_2
        s := ||r||_2 e_1
        for i = 1, 2, ..., m do
            solve w from Mw = Av^(i)
            for k = 1, 2, ..., i do
                h_{k,i} = (w, v^(k))
                w = w - h_{k,i} v^(k)
            end for
            h_{i+1,i} = ||w||_2
            v^(i+1) = w/h_{i+1,i}
            apply J_1, ..., J_{i-1} on (h_{1,i}, ..., h_{i+1,i})
            construct J_i, acting on the i-th and (i+1)-st components of h_{.,i},
                such that the (i+1)-st component of J_i h_{.,i} is 0
            s := J_i s
            if s(i+1) is small enough then
                UPDATE(x~, i)
                quit
            end if
        end for
        UPDATE(x~, m)
        check convergence; continue if necessary
    end for

Algorithm 4 Pseudocode for UPDATE(x~, i)

    Compute y as the solution of Hy = s~, in which the upper i × i triangular
        part of H has h_{k,l} as its elements (in least-squares sense if H is
        singular); s~ represents the first i components of s
    x~ = x^(0) + y_1 v^(1) + y_2 v^(2) + ... + y_i v^(i)
    s^(i+1) = ||b - Ax~||_2
    if x~ is an accurate enough approximation then
        quit
    else
        x^(0) = x~
    end if

Chapter 3

Block Preconditioners

Since we solve (1.2) iteratively, there is no need to form M^{-1}H explicitly. Doing so defeats the purpose of preconditioning altogether. At each iteration of our iterative solver, we solve a linear system of the form Mz = r. The computational cost of this solve is given in Chapter 5 for two particular choices of M.

Choosing M is no easy task. Our selection of M should reduce the overall number of iterations within which our iterative method converges while keeping the cost of each iteration relatively low. We aim for M^{-1}H to have nicer spectral properties than H. In other words, we want M^{-1}H to have fewer distinct eigenvalues than H and for those distinct eigenvalues to be closely clustered together [16]. However, we also hope Mz = r can be solved cheaply for z at each iteration of the method. These two conditions tend not to coincide. A good choice for M should satisfy both conditions as reasonably as possible.

Let us look at two extreme cases. If M is equal to H, then M^{-1}H = I has wonderful spectral properties: all eigenvalues are equal to 1. However, solving Mz = r is the same as solving Hz = r. We only precondition in the first place if we assume the original matrix is hard to solve for. On the other hand, if M is equal to the identity, then the linear system Iz = r could not be easier to solve. However, the spectral properties of M^{-1}H are identical to the spectral properties of H.
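The clustering argument can be seen numerically (an illustration, not from the thesis): full GMRES applied to a matrix with only three distinct eigenvalues terminates in three iterations regardless of dimension, since the minimal polynomial has degree three. The bare-bones GMRES below is written from scratch for the demonstration:

```python
import numpy as np

def gmres_iterations(A, b, tol=1e-10, maxit=None):
    """Bare-bones full GMRES; returns the iteration count at convergence."""
    n = len(b)
    maxit = maxit or n
    Q = np.zeros((n, maxit + 1))
    Hb = np.zeros((maxit + 1, maxit))
    beta = np.linalg.norm(b)
    Q[:, 0] = b / beta
    for i in range(maxit):
        w = A @ Q[:, i]
        for k in range(i + 1):              # Arnoldi orthogonalization
            Hb[k, i] = w @ Q[:, k]
            w -= Hb[k, i] * Q[:, k]
        Hb[i + 1, i] = np.linalg.norm(w)
        if Hb[i + 1, i] > 1e-14:
            Q[:, i + 1] = w / Hb[i + 1, i]
        # residual of the least-squares problem min ||beta e1 - H y||
        e1 = np.zeros(i + 2); e1[0] = beta
        y, *_ = np.linalg.lstsq(Hb[:i + 2, :i + 1], e1, rcond=None)
        r = np.linalg.norm(Hb[:i + 2, :i + 1] @ y - e1)
        if r <= tol * beta:
            return i + 1
    return maxit

rng = np.random.default_rng(3)
n = 50
Q0, _ = np.linalg.qr(rng.standard_normal((n, n)))
D = np.diag(rng.choice([1.0, 2.0, 5.0], size=n))   # three distinct eigenvalues
A = Q0 @ D @ Q0.T
b = rng.standard_normal(n)
iters = gmres_iterations(A, b)
print(iters)
```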
We wish to precondition the saddle-point system in such a way that we take advantage of its block structure. This is accomplished through the use of block preconditioners. We focus on the constraint preconditioner (M_c) and the positive-definite preconditioner (M_p) defined in (1.3) and (1.4) respectively. G ∈ R^{n×n} is a symmetric positive-definite approximation to A.

Regardless of whether we employ M_c or M_p, a choice for the (1,1) block G needs to be made. Again, this choice is not trivial. The algebraic analysis presented is with a general G or with G equal to the identity. All of our numerical experiments are performed with G equal to the identity, the incomplete Cholesky factorization of A, or the diagonal of A. It is assumed in the last case that all diagonal entries of A are positive and that G is positive definite.

3.1 Constraint Preconditioner

M_c, also known as a constraint preconditioner, was given its name and popularized by Keller, Gould, and Wathen [19]. Much of their work stems from [22], where a preconditioner of the same form is used in an optimization setting to precondition Newton's method. One of the most attractive properties of the constraint preconditioner is that the constraints are left intact: the solution always stays on the constraint manifold.

3.1.1 Factorizations of M_c

The popularity of the constraint preconditioner can be somewhat attributed to the fact that it has several nice block factorizations. The first that we will present is

    M_c = [ I            0 ] [ G   0              ] [ I   G^{-1}B ]
          [ B^T G^{-1}   I ] [ 0   -B^T G^{-1} B  ] [ 0   I       ].        (3.1)

Dollar and Wathen [8, 9] have recently published a factorization developed by Wil Schilders. Since the factorization was unpublished at the time, they credited Schilders by naming the factorization after him. The Schilders factorization is as follows:

          [ G_11    G_12    B_1 ]   [ B_1   0     L_1 ] [ D_1   0     I ] [ B_1^T   B_2^T   0 ]
    M_c = [ G_21    G_22    B_2 ] = [ B_2   L_2   E   ] [ 0     D_2   0 ] [ 0       L_2^T   0 ]
          [ B_1^T   B_2^T   0   ]   [ 0     0     I   ] [ I     0     0 ] [ L_1^T   E^T     I ],

where

    D_1 = B_1^{-1} (G_11 - L_1 B_1^T - B_1 L_1^T) B_1^{-T},
    D_2 = L_2^{-1} (G_22 - B_2 D_1 B_2^T - E B_2^T - B_2 E^T) L_2^{-T},
    E   = (G_21 - B_2 L_1^T - B_2 D_1 B_1^T) B_1^{-T}.

G can be chosen to define D_1, D_2, and E; or D_1, D_2, and E can be chosen to define G. In either case, two conditions must be met:

• G must be nonsingular;
• Z^T G Z, where Z is an n × (n - m) matrix spanning the nullspace of B^T, must be symmetric positive definite.

The matrices L_1 ∈ R^{m×m} and L_2 ∈ R^{(n-m)×(n-m)} must be chosen so that L_2 is nonsingular. One of the difficulties of working with Schilders's factorization is that a permutation of rows must be devised that splits B into [B_1; B_2] such that B_1 ∈ R^{m×m} is nonsingular. On the other hand, the 3 × 3 block form gives a lot of flexibility, and several families of preconditioners can be derived. Schilders's factorization is not discussed further.

We return to our block factorization of interest, namely (3.1). The (2,2) block of the middle matrix P is -S, the Schur complement of G in M_c, where we denote

    S = B^T G^{-1} B.

3.1.2 Eigenvalues of M_c

Suppose (v, w) is an eigenvector of the preconditioned matrix M_c^{-1}H with associated eigenvalue lambda. This leads to the following generalized eigenvalue problem:

    [ A     B ] [ v ]            [ G     B ] [ v ]
    [ B^T   0 ] [ w ] = lambda   [ B^T   0 ] [ w ].

Let

    B = Q [ T ] = ( Y   Z ) [ T ]
          [ 0 ]             [ 0 ]

be an orthogonal factorization of B, where T ∈ R^{m×m} is upper triangular, Y ∈ R^{n×m} is a basis for the range space of B, and Z ∈ R^{n×(n-m)} is a basis for the nullspace of B^T. Then, we can decompose v as follows:

    v = Y v_y + Z v_z

for some vectors v_y and v_z.

The following useful proposition was proved in [19].

Proposition 1 (Keller et al. [19]). Preconditioning H by M_c, where G ≠ A is symmetric, implies that M_c^{-1}H has an eigenvalue 1 with algebraic multiplicity 2m, and that the other n - m eigenvalues are defined by the generalized eigenvalue problem

    Z^T A Z v_z = lambda Z^T G Z v_z.

In the same paper, Keller et al. also showed that the n - m eigenvalues of M_c^{-1}H not equal to 1 are bounded by the extreme eigenvalues of G^{-1}A. Let alpha_1, alpha_2, ..., alpha_n be the eigenvalues of G^{-1}A. Then,

    alpha_k ≤ lambda_k ≤ alpha_{k+m},    k = 1, 2, ..., n - m.

The latter result illustrates the importance of the selection of G. Unlike the next preconditioner we are about to introduce, M_c is indefinite.

3.2 Positive-Definite Preconditioner

Since we only allow choices of G that are positive definite, and we assume that B has full column rank, we know that B^T G^{-1} B is also positive definite. M_p is a block-diagonal matrix whose diagonal blocks are both positive definite. Therefore, M_p is also positive definite.

Numerous forms of block-diagonal preconditioners have been used in the past to solve saddle-point problems arising in a variety of applications. As early as 1993, Wathen and Silvester [37] used a positive-definite block preconditioner to precondition stabilized Stokes systems. Battermann and Heinkenschloss [2] used a form of M_p to precondition linear systems arising in optimal control. Kuznetsov's [20] need for a block preconditioner had to do with problems in elliptic finite elements on non-matching grids. Only a few references are mentioned here, for there are many. We direct the reader to [3] for a comprehensive list.

Our reason for choosing to pursue M_c and M_p in this thesis has to do with the fact that the middle matrix of factorization (3.1), otherwise known as P, differs from M_p only by a negative sign in the (2,2) block. We exploit this fact in Chapter 4 in order to draw another parallel between the two preconditioners.
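Both factorization (3.1) and Proposition 1 can be confirmed numerically. The sketch below (arbitrary small problem; G = diag(A), one of the thesis's later choices) checks that the three factors of (3.1) multiply back to M_c, counts the unit eigenvalues of M_c^{-1}H, and compares the remaining ones against the projected pencil (Z^T A Z, Z^T G Z):

```python
import numpy as np
from scipy.linalg import eig, null_space

rng = np.random.default_rng(5)
n, m = 9, 3
M0 = rng.standard_normal((n, n))
A = M0 @ M0.T + n * np.eye(n)
B = rng.standard_normal((n, m))
G = np.diag(np.diag(A))                       # SPD choice for G, G != A
Ginv = np.linalg.inv(G)
S = B.T @ Ginv @ B

H  = np.block([[A, B], [B.T, np.zeros((m, m))]])
Mc = np.block([[G, B], [B.T, np.zeros((m, m))]])

# Factorization (3.1): M_c = R P R^T
R = np.block([[np.eye(n), np.zeros((n, m))], [B.T @ Ginv, np.eye(m)]])
P = np.block([[G, np.zeros((n, m))], [np.zeros((m, n)), -S]])
fact_err = np.linalg.norm(R @ P @ R.T - Mc)

# Proposition 1: eigenvalue 1 with algebraic multiplicity 2m; the rest solve
# Z^T A Z v = lambda Z^T G Z v, with Z a basis for null(B^T).
ev = np.linalg.eigvals(np.linalg.solve(Mc, H))
unit = np.abs(ev - 1.0) < 1e-4                # loose tolerance: 1 may be defective
Z = null_space(B.T)
mu = np.sort(eig(Z.T @ A @ Z, Z.T @ G @ Z)[0].real)
rest = np.sort(ev[~unit].real)
print(fact_err, unit.sum(), np.max(np.abs(rest - mu)))
```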
The similarity between P and M_p also allows us to make comparisons between the cost of forming the two block preconditioners that are otherwise much harder to conduct.

Murphy, Golub, and Wathen [24] showed some elegant algebraic results when G = A. In this case, the preconditioned matrix M_p^{-1}H has at most three distinct eigenvalues: 1, (1 ± sqrt(5))/2. Although analytically elegant, in practice G = A is not used. If A^{-1} were easily computed, then there wouldn't be a need for preconditioning; and besides, forming the Schur complement would be equivalent to solving the original linear system.

Chapter 4

Eigenvalue Analysis

In Chapter 3 we presented some of the analytic eigenvalue theory known for M_c and M_p in the literature. This chapter presents three new results concerning the eigenvalues of our chosen block preconditioners. The first draws a parallel between the eigenvalues of the constraint preconditioner and the positive-definite preconditioner when G is the identity. The second is specific to the constraint preconditioner when G equals the identity. The third result is generic to all positive-definite block-diagonal preconditioners and is based on work done by Rusten and Winther [29].

4.1 Connections Between the Eigenvalues of M_c and M_p

Since M_c is symmetric, it is unitarily diagonalizable. Let the spectral decomposition of M_c be

    M_c = V Lambda V^T.

Here, Lambda is a diagonal matrix whose entries are the eigenvalues of M_c, and V is an orthogonal matrix whose columns are the eigenvectors of M_c. We can write this in terms of the block factorization (3.1):

    R P R^T = V Lambda V^T
    R^T R P = R^T V Lambda V^T R^{-T}
    R^T R P = (R^T V) Lambda (R^T V)^{-1}
    R^T R P = W Lambda W^{-1},

where W = R^T V. Since R P R^T and R^T R P are both similarity transformations of Lambda, we have just proved the following proposition.

Proposition 2. R P R^T and R^T R P have the same eigenvalues.

For an alternate (and more general) proof of Proposition 2, we refer the reader to [18, Theorem 1.3.20, page 53].

It is beneficial to work with R^T R P instead of R P R^T. To find the eigenvalues of M_c, we consider the generalized eigenvalue problem

    P v = lambda (R^T R)^{-1} v.        (4.1)

Let C = G^{-1}B. Then,

    R^T R = [ I + CC^T   C ]
            [ C^T        I ].

Since R^T R is symmetric, we know its inverse will also be symmetric. Therefore, we are searching for matrices X, Y, and Z such that

    [ I + CC^T   C ] [ X     Y ]   [ I   0 ]
    [ C^T        I ] [ Y^T   Z ] = [ 0   I ].

It suffices to solve the following four equations:

    (I + CC^T)X + CY^T = I        (4.2)
    (I + CC^T)Y + CZ = 0          (4.3)
    C^T X + Y^T = 0               (4.4)
    C^T Y + Z = I.                (4.5)

Solve (4.5) for Z:

    Z = I - C^T Y.                (4.6)

Substitute (4.6) into (4.3):

    Y + CC^T Y + C - CC^T Y = 0.

This gives us Y = -C, which we proceed to plug into (4.4):

    C^T X - C^T = 0.

Hence, X = I. The validity of (4.2) is trivially satisfied. Therefore,

    [ X     Y ]   [ I      -C        ]
    [ Y^T   Z ] = [ -C^T   I + C^T C ].

The generalized eigenvalue problem (4.1) can be rewritten as

    [ G   0                ] [ v_1 ]            [ I      -C        ] [ v_1 ]
    [ 0   -B^T G^{-1} B    ] [ v_2 ] = lambda   [ -C^T   I + C^T C ] [ v_2 ],

where v = (v_1, v_2).

Despite many failed attempts, we can say nothing further about the general generalized eigenvalue problem. However, if we allow G = I, this is not the case. Doing so simplifies C to B.

Let lambda ≠ 1 be an eigenvalue of M_c (with G = I), and hence of R^T R P. Our generalized eigenvalue problem is now reduced to

    [ I   0      ] [ x ]            [ I      -B        ] [ x ]
    [ 0   -B^T B ] [ y ] = lambda   [ -B^T   I + B^T B ] [ y ],
Then,

    [ I   0      ] [ p ]     [ p ]
    [ 0   -B^T B ] [ q ] = ρ [ q ].    (4.9)

P and M_p differ only in the fact that their (2,2) blocks are the negations of each other. So, if ρ is an eigenvalue of P, then -ρ is an eigenvalue of M_p. We can rewrite (4.9) as

    p = ρ p,
    -B^T B q = ρ q.

If we focus solely on the second equation, we see that for m eigenvalues

    -ρ = λ² - λ.

In the case where G = I, we have found a connection between m eigenvalues of M_c and the eigenvalues of M_p.

4.2 Eigenvalues of M_c when G = I

If we allow the (1,1) block of the constraint preconditioner to equal the identity once again, we come up with an equality relating the eigenvalues of M_c to the singular values of B. Let B = U Σ V^T be the singular value decomposition of B. Rewrite our preconditioner as

    M_c = [ I            U Σ V^T ]   [ U   0 ] [ I     Σ ] [ U^T   0   ]
          [ V Σ^T U^T    0       ] = [ 0   V ] [ Σ^T   0 ] [ 0     V^T ].

Not only do we have a block factorization of M_c, we have a similarity transformation. Therefore, the eigenvalues of M_c are equal to the eigenvalues of

    J = [ I     Σ ]
        [ Σ^T   0 ].

Let

    P_π = [ e_1  e_{n+1}  e_2  e_{n+2}  ... ],

where e_i denotes the i-th column of the identity matrix. P_π is a permutation matrix. Therefore, pre-multiplying J by P_π and post-multiplying by P_π^T does not affect the eigenvalues of J. It also leaves us with a more desirable block structure, namely a block-diagonal matrix with m leading 2×2 blocks

    [ 1     σ_i ]
    [ σ_i   0   ],   i = 1, ..., m,

followed by the identity block I_{n-m}. Since we have a block-diagonal matrix, the eigenvalues are simply the eigenvalues of the blocks. We already know n - m eigenvalues are equal to 1. The other 2m eigenvalues can be trivially solved for, since we are dealing with 2×2 blocks. The characteristic equation for each block is

    (1 - λ_i)(-λ_i) - σ_i² = 0   for i = 1, ..., m,

which can be rewritten as

    λ_i² - λ_i - σ_i² = 0   for i = 1, ..., m.

By the quadratic formula, we can conclude that the rest of the eigenvalues are

    λ = (1 ± √(1 + 4σ_i²)) / 2.

In the situation where n ≫ m, it is feasible that one would have easier access to the singular values of B than to the eigenvalues of the constraint preconditioner.
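This eigenvalue/singular-value equality can be verified on a small random example. A NumPy sketch (sizes arbitrary), comparing the spectrum of M_c with G = I against the values 1 and (1 ± √(1 + 4σ_i²))/2 predicted above:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 7, 3
B = rng.standard_normal((n, m))            # full column rank with probability 1
Mc = np.block([[np.eye(n), B],
               [B.T, np.zeros((m, m))]])   # constraint preconditioner, G = I

sigma = np.linalg.svd(B, compute_uv=False)
predicted = np.concatenate([
    np.ones(n - m),                            # n - m eigenvalues equal to 1
    (1 + np.sqrt(1 + 4 * sigma**2)) / 2,       # m positive roots
    (1 - np.sqrt(1 + 4 * sigma**2)) / 2,       # m negative roots
])

assert np.allclose(np.sort(np.linalg.eigvalsh(Mc)), np.sort(predicted))
```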
If G equals the identity, these two pieces of information are now one and the same.

4.3 Eigenvalue Bounds

Using the same energy estimates procedure as Rusten and Winther [29], we now derive bounds for the eigenvalues of M^{-1}H, where M is a block-diagonal positive-definite preconditioner. Since M^{-1}H is an indefinite matrix, we will be able to bound the negative and the positive eigenvalues separately in terms of eigenvalues of the (1,1) block and singular values of the (1,2) block of a matrix similar to M^{-1}H. Let

    M = [ P   0 ]
        [ 0   Q ],

where P ∈ R^{n×n} and Q ∈ R^{m×m} are positive definite. We can factor M^{-1} in the following manner:

    M^{-1} = [ P^{-1}   0      ]   [ P^{-1/2}   0        ] [ P^{-1/2}   0        ]
             [ 0        Q^{-1} ] = [ 0          Q^{-1/2} ] [ 0          Q^{-1/2} ].

By Proposition 2, we know that M^{-1}H has the same eigenvalues as

    C = [ P^{-1/2}   0        ] [ A     B ] [ P^{-1/2}   0        ]   [ P^{-1/2} A P^{-1/2}     P^{-1/2} B Q^{-1/2} ]
        [ 0          Q^{-1/2} ] [ B^T   0 ] [ 0          Q^{-1/2} ] = [ Q^{-1/2} B^T P^{-1/2}   0                   ].

Let λ ∈ Λ(C) with corresponding eigenvector (x, y). Then,

    P^{-1/2} A P^{-1/2} x + P^{-1/2} B Q^{-1/2} y = λ x,    (4.10)
    Q^{-1/2} B^T P^{-1/2} x = λ y.                          (4.11)

Note that x ≠ 0. This is due to the fact that we assume B, and hence P^{-1/2} B Q^{-1/2}, has full column rank.

Proposition 3. Suppose A ∈ R^{n×n} is symmetric positive definite with eigenvalues μ_1 ≥ μ_2 ≥ ... ≥ μ_n > 0. Then,

    μ_1 ‖x‖² ≥ (Ax, x) ≥ μ_n ‖x‖².

Proof. Let A = V Λ V^T be the eigenvalue decomposition of A. The columns of V, namely v_1, v_2, ..., v_n, are the eigenvectors associated with μ_1, μ_2, ..., μ_n respectively. Write x in the coordinates of V by letting y = V^T x. Then,

    (Ax, x) / (x, x) = (x^T V Λ V^T x) / (x^T V V^T x) = (y^T Λ y) / (y^T y) = (μ_1 y_1² + ... + μ_n y_n²) / (y_1² + ... + y_n²).

Hence,

    μ_1 ≥ (Ax, x) / ‖x‖² ≥ μ_n,

so μ_1 ‖x‖² ≥ (Ax, x) ≥ μ_n ‖x‖². ∎

Let P^{-1/2} A P^{-1/2} have eigenvalues μ_1 ≥ μ_2 ≥ ... ≥ μ_n > 0. Then, by Proposition 3, we have

    μ_n ‖x‖² ≤ (P^{-1/2} A P^{-1/2} x, x) ≤ μ_1 ‖x‖²   for all x ∈ R^n.
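The Rayleigh-quotient bounds of Proposition 3 are cheap to sanity-check. A NumPy sketch with an arbitrary random SPD matrix and random test vectors:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8
A = rng.standard_normal((n, n))
A = A @ A.T + n * np.eye(n)          # symmetric positive definite
mu = np.linalg.eigvalsh(A)           # ascending: mu[0] = mu_n, mu[-1] = mu_1

for _ in range(100):
    x = rng.standard_normal(n)
    rayleigh = x @ A @ x / (x @ x)
    # mu_n <= (Ax, x) / ||x||^2 <= mu_1
    assert mu[0] - 1e-10 <= rayleigh <= mu[-1] + 1e-10
```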
Since B has full column rank, so does P^{-1/2} B Q^{-1/2}. Let P^{-1/2} B Q^{-1/2} have singular values σ_1 ≥ σ_2 ≥ ... ≥ σ_m > 0. Then we can use Proposition 3 to show that

    σ_m ‖y‖ ≤ ‖P^{-1/2} B Q^{-1/2} y‖ ≤ σ_1 ‖y‖   for all y ∈ R^m,

and

    σ_m ‖x‖ ≤ ‖Q^{-1/2} B^T P^{-1/2} x‖ ≤ σ_1 ‖x‖   for all x ∈ range(P^{-1/2} B Q^{-1/2}).

4.3.1 Bounding the positive eigenvalues

Let λ be a positive eigenvalue of M^{-1}H. Pre-multiply (4.10) by x^T to obtain

    (P^{-1/2} A P^{-1/2} x, x) + (Q^{-1/2} B^T P^{-1/2} x)^T y = λ ‖x‖².

Use (4.11) to simplify the second term on the left-hand side of the above equation and obtain

    (P^{-1/2} A P^{-1/2} x, x) + λ ‖y‖² = λ ‖x‖².

So,

    (P^{-1/2} A P^{-1/2} x, x) = λ ‖x‖² - λ ‖y‖².    (4.12)

We now use (4.12), along with the inequality

    μ_n ‖x‖² ≤ (P^{-1/2} A P^{-1/2} x, x),

to get

    (μ_n - λ) ‖x‖² ≤ -λ ‖y‖².    (4.13)

Because λ > 0, (4.13) implies that λ ≥ μ_n. This gives us a lower bound for the positive eigenvalues.

Next, we derive an upper bound for the positive eigenvalues. To do so, we look at equation (4.11). We know that

    ‖Q^{-1/2} B^T P^{-1/2} x‖ = λ ‖y‖.

Square both sides and solve for ‖y‖²:

    ‖y‖² = (1/λ²) ‖Q^{-1/2} B^T P^{-1/2} x‖².    (4.14)

Substitute (4.14) into (4.12) to get

    (P^{-1/2} A P^{-1/2} x, x) = λ ‖x‖² - (1/λ) ‖Q^{-1/2} B^T P^{-1/2} x‖².

To eliminate the denominator, we multiply all terms by λ, leaving us with

    λ (P^{-1/2} A P^{-1/2} x, x) = λ² ‖x‖² - ‖Q^{-1/2} B^T P^{-1/2} x‖².    (4.15)

Substitute the inequalities

    (P^{-1/2} A P^{-1/2} x, x) ≤ μ_1 ‖x‖²   and   ‖Q^{-1/2} B^T P^{-1/2} x‖² ≤ σ_1² ‖x‖²

from Proposition 3 into (4.15) to get

    (λ² - λ μ_1 - σ_1²) ‖x‖² ≤ 0.

Since ‖x‖² > 0, this leaves us with

    λ² - λ μ_1 - σ_1² ≤ 0.

By the quadratic formula, λ lies between the two roots

    (μ_1 ± √(μ_1² + 4σ_1²)) / 2.

We have assumed λ is a positive eigenvalue. Therefore, we ignore one of the inequalities and conclude that the positive eigenvalues are bounded from above by

    (μ_1 + √(μ_1² + 4σ_1²)) / 2.

4.3.2 Bounding the negative eigenvalues

Let λ be a negative eigenvalue of M^{-1}H.
Substitute the inequalities from Proposition 3 into (4.15), noting that λ < 0 reverses the first one, so that λ (P^{-1/2} A P^{-1/2} x, x) ≤ λ μ_n ‖x‖² and ‖Q^{-1/2} B^T P^{-1/2} x‖² ≤ σ_1² ‖x‖², to get

    (λ² - λ μ_n - σ_1²) ‖x‖² ≤ 0.

Since ‖x‖² > 0, this leaves us with

    λ² - λ μ_n - σ_1² ≤ 0.

By the quadratic formula, λ lies between the two roots

    (μ_n ± √(μ_n² + 4σ_1²)) / 2.

We have assumed that λ is negative. Therefore, we ignore one of the inequalities and conclude that the negative eigenvalues are bounded from below by

    (μ_n - √(μ_n² + 4σ_1²)) / 2.

Finally, we derive an upper bound for the negative eigenvalues. Let x = v + w, where v ∈ range(P^{-1/2} B Q^{-1/2}) and w ∈ null(Q^{-1/2} B^T P^{-1/2}). Note that (P^{-1/2} A P^{-1/2} v, w) = (P^{-1/2} A P^{-1/2} w, v), since P^{-1/2} A P^{-1/2} is symmetric. Rewrite equations (4.10) and (4.11) in terms of v and w:

    P^{-1/2} A P^{-1/2} v + P^{-1/2} A P^{-1/2} w + P^{-1/2} B Q^{-1/2} y = λ v + λ w,    (4.16)
    Q^{-1/2} B^T P^{-1/2} v = λ y.                                                       (4.17)

Pre-multiply (4.16) by v^T:

    v^T P^{-1/2} A P^{-1/2} v + v^T P^{-1/2} A P^{-1/2} w + v^T P^{-1/2} B Q^{-1/2} y = λ v^T v + λ v^T w.

We simplify by writing with inner products and norms, remembering that v^T w = 0:

    (P^{-1/2} A P^{-1/2} v, v) + (P^{-1/2} A P^{-1/2} w, v) + (Q^{-1/2} B^T P^{-1/2} v)^T y = λ ‖v‖².    (4.18)

Use the same strategy as we did for the positive eigenvalues. Solve (4.17) for y:

    y = (1/λ) Q^{-1/2} B^T P^{-1/2} v.

Plug this into (4.18) to get

    (P^{-1/2} A P^{-1/2} v, v) + (P^{-1/2} A P^{-1/2} w, v) + (1/λ) ‖Q^{-1/2} B^T P^{-1/2} v‖² = λ ‖v‖².

Use the inequalities

    (P^{-1/2} A P^{-1/2} v, v) ≤ μ_1 ‖v‖²   and   ‖Q^{-1/2} B^T P^{-1/2} v‖² ≥ σ_m² ‖v‖²

from Proposition 3 (recalling that 1/λ < 0) to get

    (P^{-1/2} A P^{-1/2} w, v) ≥ (λ - μ_1 - σ_m²/λ) ‖v‖².

We now have a lower bound for (P^{-1/2} A P^{-1/2} w, v). We next need an upper bound for (P^{-1/2} A P^{-1/2} v, w) = (P^{-1/2} A P^{-1/2} w, v). Pre-multiply (4.16) by w^T to get

    w^T P^{-1/2} A P^{-1/2} v + w^T P^{-1/2} A P^{-1/2} w + w^T P^{-1/2} B Q^{-1/2} y = λ w^T v + λ w^T w.

We can simplify by writing with inner products and norms, remembering that w^T v = 0 and (P^{-1/2} B Q^{-1/2})^T w = 0:

    (P^{-1/2} A P^{-1/2} v, w) + (P^{-1/2} A P^{-1/2} w, w) = λ ‖w‖².    (4.19)

Substitute the inequality

    (P^{-1/2} A P^{-1/2} w, w) ≥ μ_n ‖w‖²

from Proposition 3 into (4.19) to get

    (P^{-1/2} A P^{-1/2} v, w) ≤ (λ - μ_n) ‖w‖².

Since we assume λ is negative, we know the above quantity is less than or equal to zero.
Putting it all together, we have

    0 ≥ (λ - μ_n) ‖w‖² ≥ (P^{-1/2} A P^{-1/2} v, w) = (P^{-1/2} A P^{-1/2} w, v) ≥ (λ - μ_1 - σ_m²/λ) ‖v‖².

If we ignore the middle terms, we are left with

    0 ≥ (λ - μ_1 - σ_m²/λ) ‖v‖².

By equation (4.17), if v = 0, then y = 0. However, this reduces equation (4.10) to P^{-1/2} A P^{-1/2} x = λ x. Since P^{-1/2} A P^{-1/2} is positive definite, λ must be positive, a contradiction to our original assumption. So v ≠ 0, and we can safely say that

    0 ≥ λ - μ_1 - σ_m²/λ.

Multiply both sides by λ (which is negative, reversing the inequality):

    0 ≤ λ² - λ μ_1 - σ_m².

By the quadratic formula, λ lies outside the open interval between the two roots

    (μ_1 ± √(μ_1² + 4σ_m²)) / 2.

We assume λ is a negative eigenvalue. Therefore, we ignore one of the inequalities and conclude that the negative eigenvalues are bounded from above by

    (μ_1 - √(μ_1² + 4σ_m²)) / 2.

We summarize the eigenvalue bounds just derived in the following theorem.

Theorem 1. Suppose that

    M = [ P   0 ]
        [ 0   Q ],

where P ∈ R^{n×n} and Q ∈ R^{m×m} are positive definite, is used to precondition

    H = [ A     B ]
        [ B^T   0 ],

where A ∈ R^{n×n} is positive definite and B ∈ R^{n×m} has full column rank. Let μ_1 ≥ μ_2 ≥ ... ≥ μ_n > 0 be the eigenvalues of P^{-1/2} A P^{-1/2} and let σ_1 ≥ σ_2 ≥ ... ≥ σ_m > 0 be the singular values of P^{-1/2} B Q^{-1/2}. Then, the eigenvalues of M^{-1}H are in one of two intervals:

    [ (μ_n - √(μ_n² + 4σ_1²)) / 2 ,  (μ_1 - √(μ_1² + 4σ_m²)) / 2 ]   and   [ μ_n ,  (μ_1 + √(μ_1² + 4σ_1²)) / 2 ].

Chapter 5

Computational Cost of Preconditioning

Unfortunately, preconditioning is not free. There is a cost involved in applying a preconditioner to a linear system. An effective preconditioner is one whose additional cost is inconsequential when compared with the overall savings of solving a linear system with improved spectral properties. When solving a preconditioned linear system like (1.2) with an iterative solver such as GMRES or MINRES, the additional cost of applying the preconditioner is directly related to the cost of solving

    M u = f,    (5.1)

where M is the preconditioner and f is a given right-hand side. We begin this chapter with two definitions that we refer to often.

Definition 3.
A G-solve is the process of solving a linear system of the form

    G x = y

for x ∈ R^n. The matrix G ∈ R^{n×n} is defined by the preconditioner M_c or M_p introduced in (1.3) and (1.4) respectively.

Definition 4. An S-solve is the process of solving a linear system of the form

    S x = B^T G^{-1} B x = y

for x ∈ R^m. The matrix B ∈ R^{n×m} is the (1,2) block of H, and G ∈ R^{n×n} is defined by the preconditioner M_c or M_p introduced in (1.3) and (1.4) respectively.

Note: an S-solve involves G-solves. If the S-solve is performed iteratively, then a G-solve must be performed at each iteration.

Using the above definitions, we can now formulate cost functions associated with solving (5.1), where M is either the constraint preconditioner or the positive-definite preconditioner.

5.1 Solving M u = f

Let us first consider the case of the constraint preconditioner M_c. The linear system of interest is

    M_c u = f,    (5.2)

which, writing u = (v, w) and f = (h_1, h_2), can be solved in two stages as

    B^T G^{-1} B w = B^T G^{-1} h_1 - h_2,    (5.3)
    G v = h_1 - B w.                          (5.4)

We first describe the solution of (5.3) for w. Solve G k = h_1 for k (a G-solve). Next, form the right-hand side by performing a matrix-vector product with B^T. Namely, let l = B^T k - h_2. We are left with the S-solve

    B^T G^{-1} B w = l.    (5.5)

We will discuss the cost of iteratively solving the S-solve shortly. For now, we assume that we can find a solution w for (5.5). Next, solve (5.4) for v. Form the right-hand side by performing a matrix-vector product with B. Namely, let r = h_1 - B w. Then, solve G v = r for v (yet another G-solve).

By using the block structure, the overall cost of solving (5.2) can be compressed into equation form as

    2 G-solves + 1 S-solve + 2 MVP,    (5.6)

where MVP stands for matrix-vector products with B or B^T.

Suppose we wish to solve (5.5) iteratively.
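Before turning to the iterative S-solve, the two-stage procedure (5.3)-(5.4) can be verified on a small dense example. This is a Python/NumPy sketch (the thesis implementation is in MATLAB and uses sparse matrices); the dimensions and random data are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)
n, m = 6, 3
G = rng.standard_normal((n, n)); G = G @ G.T + n * np.eye(n)   # SPD
B = rng.standard_normal((n, m))                                 # full column rank
Mc = np.block([[G, B], [B.T, np.zeros((m, m))]])

h1, h2 = rng.standard_normal(n), rng.standard_normal(m)
# Stage 1: G-solve G k = h1, then the S-solve (5.3) for w.
k = np.linalg.solve(G, h1)
w = np.linalg.solve(B.T @ np.linalg.solve(G, B), B.T @ k - h2)
# Stage 2: G-solve (5.4) for v.
v = np.linalg.solve(G, h1 - B @ w)

# (v, w) solves M_c u = f with f = (h1, h2).
assert np.allclose(Mc @ np.concatenate([v, w]), np.concatenate([h1, h2]))
```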
Since we assume that G is positive definite and B is full rank, the negation of the Schur complement of G, namely S = B^T G^{-1} B, is also positive definite. The conjugate gradient method is therefore appropriate as an iterative solver. We are given w_0, an initial guess for w. At iteration i of the conjugate gradient method, we do the following twice:

  - Let p_i = B v_i (an MVP).
  - Solve G q_i = p_i for q_i (a G-solve).
  - Form B^T q_i (an MVP).

Suppose γ iterations of the conjugate gradient method are needed for convergence. Then, the cost of solving (5.5) is

    2γ (2 MVP + 1 G-solve).    (5.7)

Incorporate (5.7) into (5.6) to get a function for the cost of solving (5.2) in terms of only G-solves and MVPs:

    2γ (2 MVP + 1 G-solve) + 2 MVP + 2 G-solves = (4γ + 2) MVP + (2γ + 2) G-solves.    (5.8)

Now, let us consider the case of the positive-definite preconditioner M_p. The linear system of interest is

    M_p u = f,    (5.9)

which can be solved in two stages as

    G v = h_1,
    B^T G^{-1} B w = h_2.

Here, the order in which the equations are solved does not matter. Simple observation tells us that the cost of solving (5.9) is

    1 S-solve + 1 G-solve.    (5.10)

In (5.7) we derived the cost of an iterative S-solve. Thus, the cost of solving (5.9) in terms of only G-solves and matrix-vector products with B (and B^T) is

    2γ (2 MVP + 1 G-solve) + 1 G-solve = 4γ MVP + (2γ + 1) G-solves.    (5.11)

When we compare (5.8) and (5.11), we see that solving M_c u = f requires two more matrix-vector products and one more G-solve than solving M_p u = f. However, as we will see in Chapter 7 (Numerical Experiments), the S-solves account for roughly 98% of the total work of solving (5.1). This can also be seen by examining the following ratios:

    4γ / (4γ + 2)   and   (2γ + 1) / (2γ + 2).

The numbers of MVPs and G-solves needed to solve M_p u = f appear in the numerators of the first and second ratios respectively.
The numbers of MVPs and G-solves needed to solve M_c u = f appear in the denominators of the first and second ratios respectively. Because γ > 0, both ratios are less than one, and as γ → ∞, the ratios approach 1. Thus, when γ is large (typically the case), the cost difference between solving M_p u = f and M_c u = f is minor.

Since we are dealing with presumably large matrices, we wish to solve the original preconditioned linear system (1.2) with an iterative solver. We explore two iterative Krylov subspace methods: GMRES and MINRES. Instead of using MATLAB's built-in backslash operator for equations of the form (5.1), we use the block factorizations just described.

5.2 Cost of solving preconditioned linear system with GMRES

The iterative solver GMRES is suitable for any square linear system. We have altered MATLAB's built-in GMRES function to take advantage of the factorizations of the previous section. Consequently, we have altered the input and output parameters in the following manner:

    [u,flag,relres,iter,resvec,cost] = ourgmres(H,b,restart,tol,maxit,G1,G2,u0,pcflag,Sflag,pcSflag,tolflag)

The new or altered parameters, highlighted by a box in the original call signature, include cost, G1, G2, pcflag, Sflag, pcSflag, and tolflag. For more implementation details, we refer the reader to the appendix.

The output parameter iter returned by ourgmres is a 2×1 vector. The first entry is the number of outer iterations and the second entry is the number of inner iterations. Let iter = [α β], where 0 < α ≤ maxit and 0 ≤ β ≤ restart. If we are using full GMRES, α = 1 and the total number of inner iterations is β. Otherwise, the total number of inner iterations is (α - 1) × restart + β. Let

    r = { 0         if restart = [ ],
        { restart   otherwise.

Preconditioned GMRES solves a linear system of the form (5.1) at every outer and every inner iteration. As we have already shown, there is an S-solve involved in solving (5.1), regardless of whether M_c or M_p is chosen.
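Each such S-solve is carried out with the conjugate gradient method, touching G and B only through G-solves and matrix-vector products. The sketch below (Python/NumPy rather than the thesis's MATLAB; sizes and data arbitrary) implements a textbook CG on the Schur complement with explicit cost counters. Note that this textbook variant applies S once per iteration, whereas the accounting in Section 5.1 counts two applications per iteration.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 40, 15
G = rng.standard_normal((n, n)); G = G @ G.T + n * np.eye(n)   # SPD
B = rng.standard_normal((n, m))                                 # full column rank

counts = {"mvp": 0, "gsolve": 0}

def apply_S(p):
    # One application of S = B^T G^{-1} B costs 2 MVPs and 1 G-solve.
    counts["mvp"] += 2
    counts["gsolve"] += 1
    return B.T @ np.linalg.solve(G, B @ p)

def schur_cg(rhs, tol=1e-12, maxiter=200):
    # Textbook conjugate gradient on the SPD Schur complement.
    w = np.zeros(m)
    r = rhs.copy()
    p = r.copy()
    rr = r @ r
    for _ in range(maxiter):
        Sp = apply_S(p)
        alpha = rr / (p @ Sp)
        w += alpha * p
        r -= alpha * Sp
        rr_new = r @ r
        if np.sqrt(rr_new) <= tol * np.linalg.norm(rhs):
            break
        p = r + (rr_new / rr) * p
        rr = rr_new
    return w

l = rng.standard_normal(m)
w = schur_cg(l)
assert np.allclose(apply_S(w), l, atol=1e-8)
assert counts["gsolve"] >= 1   # every CG iteration triggered a G-solve
```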
Let γ_ij be the number of iterations required for the S-solve at iterations α = i and β = j; a subscript of 0 indicates the iteration count has not yet commenced for β. The cost of preconditioning (1.1) with M_c is

    Σ_{i=1}^{α_c} ((4γ_{i0} + 2) MVP + (2γ_{i0} + 2) G-solves)
    + Σ_{i=1}^{α_c - 1} Σ_{j=1}^{r} ((4γ_{ij} + 2) MVP + (2γ_{ij} + 2) G-solves)
    + Σ_{j=1}^{β_c} ((4γ_{α_c j} + 2) MVP + (2γ_{α_c j} + 2) G-solves),    (5.12)

whose constant terms collect to 2(α_c + α_c r - r + β_c) MVP and 2(α_c + α_c r - r + β_c) G-solves, when GMRES (full or restarted) is the iterative solver used. The subscript c on α and β indicates that the preconditioner is M_c. The cost of preconditioning (1.1) with M_p is

    Σ_{i=1}^{α_p} (4γ_{i0} MVP + (2γ_{i0} + 1) G-solves)
    + Σ_{i=1}^{α_p - 1} Σ_{j=1}^{r} (4γ_{ij} MVP + (2γ_{ij} + 1) G-solves)
    + Σ_{j=1}^{β_p} (4γ_{α_p j} MVP + (2γ_{α_p j} + 1) G-solves),    (5.13)

whose constant terms collect to (α_p + α_p r - r + β_p) G-solves, when GMRES (full or restarted) is the iterative solver used. The subscript p on α and β indicates that the preconditioner is M_p.

The right-hand side is the only element of the S-solve changing from one iteration to the next, and γ_ij is roughly the same regardless of i and j. To get a more manageable cost function, let γ be the average number of iterations needed for a successful S-solve. Then, the cost of preconditioning (1.1) with M_c and the iterative solver GMRES is approximately

    (α_c + β_c + (α_c - 1)r) ((4γ + 2) MVP + (2γ + 2) G-solves)
    = (4γ (α_c + β_c + (α_c - 1)r) + 2 (α_c + β_c + (α_c - 1)r)) MVP
      + (2γ (α_c + β_c + (α_c - 1)r) + 2 (α_c + β_c + (α_c - 1)r)) G-solves.    (5.14)

The cost of preconditioning (1.1) with M_p and the iterative solver GMRES is approximately

    (α_p + β_p + (α_p - 1)r) (4γ MVP + (2γ + 1) G-solves)
    = 4γ (α_p + β_p + (α_p - 1)r) MVP
      + (2γ (α_p + β_p + (α_p - 1)r) + (α_p + β_p + (α_p - 1)r)) G-solves.    (5.15)

Let us compare the number of matrix-vector products with B (or B^T) involved in the preconditioning of (1.1) under the two preconditioners by dividing the number found in (5.15) by the number found in (5.14):

    4γ (α_p + β_p + (α_p - 1)r) / (4γ (α_c + β_c + (α_c - 1)r) + 2 (α_c + β_c + (α_c - 1)r)).    (5.16)

We expect γ to be large. Therefore, the second term of the denominator is menial in comparison with the first term. Hence, (5.16) is approximately equal to

    4γ (α_p + β_p + (α_p - 1)r) / (4γ (α_c + β_c + (α_c - 1)r)) = (α_p + β_p + (α_p - 1)r) / (α_c + β_c + (α_c - 1)r).    (5.17)

We can do the same kind of comparison with G-solves by dividing the number of G-solves found in (5.15) by the number found in (5.14):

    (2γ (α_p + β_p + (α_p - 1)r) + (α_p + β_p + (α_p - 1)r)) / (2γ (α_c + β_c + (α_c - 1)r) + 2 (α_c + β_c + (α_c - 1)r)).    (5.18)

Once again, the second terms of both the numerator and the denominator can be omitted, since their contribution is minor. Hence, (5.18) is approximately equal to

    2γ (α_p + β_p + (α_p - 1)r) / (2γ (α_c + β_c + (α_c - 1)r)) = (α_p + β_p + (α_p - 1)r) / (α_c + β_c + (α_c - 1)r).    (5.19)

The important thing to notice when looking at (5.17) and (5.19) is that neither ratio depends on γ. The overall work involved in preconditioning our saddle-point linear system is the amount of work per iteration ((5.8) or (5.11), depending on whether we are preconditioning with M_c or M_p respectively) times the total number of iterations. We know the ratios for work per iteration are fairly close.
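The approximate cost functions (5.14) and (5.15) are simple enough to tabulate directly. A small Python sketch (the helper names and the sample iteration counts are hypothetical, chosen only for illustration) showing that for large γ the MVP ratio collapses to the ratio of solve counts:

```python
def solves(alpha, beta, r):
    # number of M u = f solves performed by restarted GMRES:
    # alpha + beta + (alpha - 1) * r
    return alpha + beta + (alpha - 1) * r

def cost_Mc(alpha, beta, r, gamma):
    # equation (5.14)
    k = solves(alpha, beta, r)
    return {"mvp": (4 * gamma + 2) * k, "gsolve": (2 * gamma + 2) * k}

def cost_Mp(alpha, beta, r, gamma):
    # equation (5.15)
    k = solves(alpha, beta, r)
    return {"mvp": 4 * gamma * k, "gsolve": (2 * gamma + 1) * k}

# Hypothetical iteration counts; gamma large, as is typical.
a_c, b_c, a_p, b_p, r, gamma = 4, 10, 5, 12, 15, 500
ratio_mvp = cost_Mp(a_p, b_p, r, gamma)["mvp"] / cost_Mc(a_c, b_c, r, gamma)["mvp"]
approx = solves(a_p, b_p, r) / solves(a_c, b_c, r)   # the gamma-free ratio (5.17)
assert abs(ratio_mvp - approx) < 1e-2
```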
Therefore, the difference in overall cost depends almost entirely on the iteration counts α and β.

5.3 Cost of solving preconditioned linear system with MINRES

The iterative solver MINRES is suitable for any symmetric linear system. Preconditioners must be symmetric positive definite; hence, we only discuss preconditioning with M_p. We have altered MATLAB's built-in MINRES function to take advantage of the factorization in Section 5.1. Consequently, we have altered the input and output parameters in the following manner:

    [u,flag,relres,iter,resvec,resveccg,cost] = ourminres(H,b,tol,maxit,G1,G2,u0,Sflag,pcSflag)

The new or altered parameters, highlighted by a box in the original call signature, include cost, G1, G2, Sflag, and pcSflag. Let iter = φ be the number of MINRES iterations. We must solve a linear system of the form (5.1) twice before entering the iterative loop and once at every iteration of the loop. Let γ_i be the number of iterations required for a successful S-solve when φ = i. Then, the cost of preconditioning (1.1) with M_p when solving with MINRES is

    4γ_a MVP + (2γ_a + 1) G-solves + 4γ_b MVP + (2γ_b + 1) G-solves
    + Σ_{i=1}^{φ} (4γ_i MVP + (2γ_i + 1) G-solves)
    = 4(γ_a + γ_b) MVP + (2(γ_a + γ_b) + 2) G-solves + Σ_{i=1}^{φ} (4γ_i MVP + (2γ_i + 1) G-solves),    (5.20)

where γ_a and γ_b represent the numbers of conjugate gradient iterations needed to solve the S-solves occurring in the first and second solves before entering the loop. Once again, to get a more manageable cost function, let γ be the average number of iterations needed for a successful S-solve. Then, the cost of preconditioning (1.1) with M_p is

    (φ + 2) (4γ MVP + (2γ + 1) G-solves) = (4φγ + 8γ) MVP + (2φγ + 4γ + φ + 2) G-solves    (5.21)

when using MINRES as the iterative solver.

An analytic comparison between the cost functions arising from using GMRES (with M_p) and MINRES (with M_p) is not easy.
The iteration counts α_p, β_p, and φ play significant roles in the total work, and we cannot relate the iteration counts of the two solvers. Because of its three-term recurrence, MINRES is more susceptible to rounding errors than GMRES. In [34], it is shown that the rounding errors propagated to the solution are proportional to κ(H)² for MINRES, while for GMRES these errors are proportional only to κ(H). This may explain why MINRES tends to need more iterations than GMRES for convergence in all the numerical experiments performed in Chapter 7.

Chapter 6

Inner Tolerances

Often, solving a linear system with an inexact method leads to solving a subproblem at each iteration. A second iterative solver can be embedded into the original iterative solver to find the inexact solution of such a subproblem. This concept is known as inner-outer iteration. The outer iterations are associated with the original problem, while the inner iterations are associated with the subproblems. The subproblem we are concerned with is the solution of the linear system (5.1), which we must solve at least once for every outer iteration of all Krylov subspace methods. The subproblem involves an S-solve, which we solve iteratively using the conjugate gradient method.

There is no reason to restrict ourselves to using the input parameters of the outer method for the inner method. In our implementation, we allow for different tolerances and maximum iteration numbers for the outer and inner methods. In this chapter, we explore two ways of setting the inner tolerance for the inner solve at each iteration of the outer method. We assume all G-solves are done with MATLAB's built-in backslash operator (as opposed to iteratively).

6.1 Constant Inner Tolerance

The first way of setting the inner tolerance is quite simple. We just fix the inner tolerance over all outer iterations, independently of the outer tolerance. We solve each test problem several times with different inner tolerances (which remain fixed throughout the outer iterations).
We quantify the cost of each of these runs with the cost functions presented in the previous chapter. We expect the number of outer iterations to be smaller when the inner tolerance is quite strict, leading to more inner iterations. Alternatively, we expect the number of outer iterations to be larger with a loose inner tolerance, leading to fewer inner iterations per outer iteration. As we allow our inner tolerance to grow, we expect convergence to approach that of an unpreconditioned system. As we will see, this is not the case. Full GMRES, and sometimes restarted GMRES(15), stagnates when the inner tolerance is chosen to be greater than the predetermined outer tolerance. Oddly enough, MINRES does not suffer from such stagnation issues. As expected, there is no benefit in solving the inner iterations with a smaller tolerance than the outer tolerance. Furthermore, when the inner tolerance approaches machine precision, the conjugate gradient method stagnates.

6.2 Varying Inner Tolerance

The second approach allows the inner tolerance to differ from one outer iteration to the next. Initially, the inner tolerance needs to be as strict as the outer tolerance. However, as we approach convergence, the inner tolerance can be relaxed. In this case, GMRES does not suffer from stagnation as it does in the constant-tolerance case. We have implemented an inner tolerance bound proposed by Simoncini and Szyld in [32] and by van den Eshof, Sleijpen, and van Gijzen in [35]. In both papers, inner iterations compute matrix-vector products inexactly. This concept of inner iteration comes from the promising numerical experiments carried out by Bouras and Fraysse [5]. Their experiments implemented inexact matrix-vector product inner iterations for numerous Krylov subspace methods, including GMRES. Theory developed by Simoncini and Szyld [31] is only applicable to full GMRES. Despite their disclaimer, we were able to successfully apply it to GMRES(15).
We do not even attempt to apply the theory to MINRES, since MINRES does not stagnate when the inner tolerance is large. Although our definition of an inner iteration is different, the theory is still applicable. If we solve the unpreconditioned system (1.1) with a Krylov subspace method, at each outer iteration i a matrix-vector product involving H is formed. Instead of forming H v_i exactly,

    (H + E_i) v_i    (6.1)

is formed, where E_i is an error matrix. Denote by z_i = H v_i the exact matrix-vector product and by ẑ_i the computed matrix-vector product. Then,

    ‖z_i - ẑ_i‖ = ‖E_i v_i‖ ≤ ‖E_i‖ ‖v_i‖.

Since v_i (and therefore ‖v_i‖) is given, convergence depends on E_i.

We wish to solve the preconditioned linear system (1.2) with a Krylov subspace method. At each outer iteration i, a matrix-vector product involving M^{-1}H is formed. Let

    y_i = M^{-1} H v_i,    (6.2)

and pre-multiply both sides of (6.2) by M to get M y_i = H v_i. As with the unpreconditioned linear solve, let z_i = H v_i. Then, we have M y_i = z_i. We need to solve for y_i, a solve which coincides with the subproblem that defines our inner iteration. We perform this solve inexactly. So,

    M ŷ_i = z_i + δ_i,

where ŷ_i is the computed solution. Solving for ŷ_i gives

    ŷ_i = M^{-1} z_i + M^{-1} δ_i = M^{-1} H v_i + M^{-1} δ_i.

Then,

    ‖y_i - ŷ_i‖ = ‖M^{-1} δ_i‖ ≤ ‖M^{-1}‖ ‖δ_i‖.

Since M (and therefore ‖M^{-1}‖) is given, convergence can be guaranteed by monitoring δ_i closely. Simoncini and Szyld prove the following proposition, which bounds the error matrix in (6.1).

Proposition 4 (Simoncini and Szyld [31]). Let ε > 0. Assume that m iterations of the inexact Arnoldi method have been carried out, and let x_m^{gm} be the GMRES solution. Let r_m^{gm} = r_0 - A x_m^{gm} be the GMRES residual after m iterations of the inexact Arnoldi method. If for every k ≤ m,

    ‖E_k‖ ≤ (σ_m(H_m) / m) (ε / ‖r_{k-1}^{gm}‖),

then ‖r_m^{gm} - r̃_m^{gm}‖ ≤ ε. Moreover, if for every k ≤ m,

    ‖E_k‖ ≤ (1 / (m κ(H_m))) (ε / ‖r_{k-1}^{gm}‖),

then ‖(V_{m+1} H_m)^T r_m^{gm}‖ ≤ ε.
Here, V_{m+1} is the matrix whose columns are the Arnoldi vectors, H_m is the upper Hessenberg matrix formed in Algorithm 3, and κ(H_m) is the condition number of the Hessenberg matrix. The magnitude of σ_m(H_m) varies significantly depending on m, indicating sensitivity. Hence, we would like to avoid having to compute κ(H_m) or σ_m(H_m) at each outer iteration. The bound from Proposition 4 is simplified to

    ‖E_k‖ ≤ ℓ_m (ε / ‖r̃_{k-1}‖),

where ℓ_m replaces the factor σ_m(H_m)/m. This bound guarantees overall convergence below the given outer tolerance. Bouras and Fraysse [5] simplify the bound even further by allowing ℓ_m = 1, leaving

    ‖E_k‖ ≤ ε / ‖r̃_{k-1}‖,

where k is the outer iteration number, tol_k = ‖E_k‖ is the inner tolerance allowed at outer iteration k, r̃_{k-1} is the computed residual at outer iteration k - 1, and ε > 0 is a predetermined value. We choose ε equal to the outer tolerance. We demonstrate in the numerical experiments section that using an inner tolerance that is smaller than the outer tolerance is futile. Therefore, we clamp our inner tolerance in the following manner:

    tol_k = max { ε / ‖r̃_{k-1}‖ , ε }.

To illustrate the benefit of clamping the inner tolerance, we have solved the problem mosarqp1 both with a clamped and with an unclamped inner tolerance (see Figure 6.1).

[Figure 6.1: A comparison of clamped versus unclamped varying inner tolerances used to solve mosarqp1 with full GMRES, outer tolerance ε = 10^{-10}, M_c, and G = I.]

The overall cost savings due to clamping, based on the cost functions of Chapter 5, is only about 1.1%. However, this is a cost savings nonetheless.

Chapter 7

Numerical Experiments

All numerical experiments presented in this chapter were run using version 7.0.1 of MATLAB. Throughout, the outer tolerance is set to 10^{-10}. When restarted GMRES is used, it is done so with a restart value of 15. We begin by presenting the two test problems on which all numerical experiments are performed.
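The clamped tolerance schedule above is a one-liner; a Python sketch (function name is illustrative, not from the thesis code) showing how the inner tolerance relaxes as the outer residual shrinks, but never drops below the outer tolerance:

```python
def inner_tolerance(residual_norm, eps):
    # Bouras-Fraysse-style relaxation, clamped from below at the outer
    # tolerance eps: tol_k = max(eps / ||r_{k-1}||, eps).
    return max(eps / residual_norm, eps)

eps = 1e-10
assert inner_tolerance(1.0, eps) == eps            # early iterations: strict
assert abs(inner_tolerance(1e-6, eps) - 1e-4) < 1e-12   # later: relaxed
assert inner_tolerance(1e-12, eps) > 1.0           # near convergence: very loose
```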
7.1 Test Problems

7.1.1 stokes2d(0,60)

Our first test problem is courtesy of Robert Bridson [6]:

    [A,B,K,f] = stokes2d(k,n)

creates a Stokes problem whose domain length is n and where k is inversely proportional to the time step. If k = 0, we are solving the stationary Stokes problem. The matrices A and B are the (1,1) and (1,2) blocks, respectively, of the KKT matrix H = K, and f is the right-hand side. We have selected the values k = 0 and n = 60. This selection leaves us with a KKT matrix whose sparsity pattern is depicted in Figure 7.1. It is worth mentioning that excellent problem-dependent preconditioners exist for Stokes problems [12]; we focus purely on an algebraic investigation.

[Figure 7.1: Sparsity pattern of H formed by stokes2d(0,60).]

Properties of A: dimension 7080 × 7080, rank 7080, condition number 1482, number of nonzero entries 34924.
Properties of B: dimension 7080 × 3540, rank 3540, condition number 107, number of nonzero entries 13982.
Properties of K: dimension 10620 × 10620, rank 10620, condition number 746, number of nonzero entries 62888.
Properties of f: dimension 10620 × 1, number of nonzero entries 169.

Krylov subspace method convergence is strongly related to eigenvalue clustering. We have computed all the eigenvalues of a small Stokes problem, stokes2d(0,5). All three graphs of Figure 7.2 display much better clustering when using M_c. This is mainly due to the fact that at least 2m of the eigenvalues of M_c^{-1}K are equal to one, in keeping with the theory of Chapter 3 and [19]. One of the advantages of preconditioning with M_p is that it does not change the inertia of the initial matrix.

7.1.2 mosarqp2

The second test problem comes from the Constrained and Unconstrained Testing Environment (CUTE) [4].
This test problem was constructed by Morales-Perez and Sargent in [23] as an academic problem purely for testing purposes, and is a convex quadratic program:

    min  x^T A x + c^T x
    subject to  d_1 ≤ W x ≤ d_2,
                l ≤ x ≤ u,

where A is symmetric positive definite and B is large and sparse. Because we are only interested in the saddle-point linear system, we only consider the equality-constrained problem, where we ignore the bounds. Therefore, the saddle-point system (7.1) gives the first-order conditions of

    min  x^T A x + c^T x   subject to   W x = d,

with B = -W^T. As a right-hand side, we define b using e ∈ R^{n+m}, a vector of ones. The sparsity pattern of the KKT matrix is depicted in Figure 7.3.

Properties of A: dimension 900 × 900, rank 900, condition number 38, number of nonzero entries 990.
Properties of B: dimension 900 × 600, rank 600, condition number 192, number of nonzero entries 2930.
Properties of H: dimension 1500 × 1500, rank 1500, condition number 77677, number of nonzero entries 6850.
Properties of b: dimension 1500 × 1, number of nonzero entries 968.

The 1500 eigenvalues of H for mosarqp2 are depicted in Figure 7.4.
Each G-solve performed with the incomplete factorization consists of two rounds of back-substitution at a cost of O(n), since G is sparse. Each G-solve performed with the diagonal of A costs n. Hence, G-solves where G is the incomplete Cholesky factorization of A are not substantially more expensive than when G is the diagonal of A. There is no cost to a G-solve where G is the identity.

The leftmost graph of Figure 7.5 appears to be missing data from inner tolerance 10^-1 to 10^-9. This is due to the fact that GMRES stagnated at those inner tolerances. All data points accentuated by a square indicate that the inner method (conjugate gradient) stagnated but the outer method was still able to converge.

7.3 The Dominance of the Schur Complement Solve

In Chapter 5 we defined the notions of G-solves and S-solves. We were also able to give cost functions for how much additional work is done when preconditioning with M_c or M_p, based entirely on G-solves and matrix-vector products with B (and B^T). Since S-solves are composed of G-solves and matrix-vector products, we wish to quantify the percentage of G-solves and matrix-vector products with B (and B^T) done within S-solves. These percentages determine the dominance of the Schur complement solves in terms of total cost.

Table 7.1 validates the use of the simpler cost functions (5.14), (5.15), and (5.21). Therefore, as long as a fixed inner tolerance is used, γ can be averaged and the simpler cost functions should be used.

Figure 7.2: Eigenvalues of K, M_c^-1 K, and M_p^-1 K for stokes2d(0,5). (a) G is the identity; (b) G is the diagonal of A; (c) G is the incomplete Cholesky factorization of A.
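The composition just described — every S-solve iteration spending one G-solve plus one product each with B and B^T — can be sketched in Python/SciPy with a matrix-free Schur complement (toy data and a conjugate gradient inner solver; the operator S = B^T G^-1 B is assumed from the cost analysis of Chapter 5 and is never formed explicitly):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, m = 60, 20
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
B = sp.eye(n, m, format="csr")        # toy (1,2) block with full column rank
G_diag = A.diagonal()                 # G = diag(A)

counts = {"G_solves": 0, "MVPs": 0}

def apply_S(y):
    """One application of S = B^T G^{-1} B: two MVPs and one G-solve."""
    counts["MVPs"] += 1
    w = B @ y                         # product with B
    counts["G_solves"] += 1
    w = w / G_diag                    # G-solve (diagonal G here)
    counts["MVPs"] += 1
    return B.T @ w                    # product with B^T

S = spla.LinearOperator((m, m), matvec=apply_S)
g = np.ones(m)
y, info = spla.cg(S, g)               # the S-solve: an inner CG iteration
```

Every CG iteration on S adds one G-solve and two MVPs to the tallies, which is exactly why the S-solves dominate the counts reported in this section.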
Figure 7.3: Sparsity pattern of H formed by mosarqp2

Figure 7.4: Eigenvalues of H, M_c^-1 H, and M_p^-1 H for mosarqp2. (a) Eigenvalues of H. (b) Top row: eigenvalues of M_c^-1 H where G is the identity, the diagonal of A, and the incomplete Cholesky factorization of A, respectively. Bottom row: eigenvalues of M_p^-1 H where G is the identity, the diagonal of A, and the incomplete Cholesky factorization of A, respectively.

(a) Solving mosarqp2 with full GMRES using an outer tolerance of 10^-10 and several inner tolerances; the preconditioner is M_c. (b) Solving stokes2d(0,60) with MINRES using an outer tolerance of 10^-10 and several inner tolerances; the preconditioner is M_p.
Preconditioner is Figure 7.5: Comparison of various values for G Iterative p/c G iter 7 solver GMRES GMRES M c M p Actual Estimated Actual # Estimated # # MVPs # MVPs G-solves G-solves identity [1 202] 344 279622 279734 140014 140070 diagonal [1 198] 336 267994 267854 134196 134126 Cholesky [1 61] 108 26852 26908 13488 13516 identity - - - - - - diagonal - - - - - - identity [24 11] 345 525492 524500 263126 262300 diagonal [31 14] 334 663034 661440 332012 330780 Cholesky [6 11] 108 40068 39808 20126 19936 identity [477 12] 329 10028064 10039764 5021661 5027511 diagonal [177 12] 319 3610928 3609804 1808292 1807731 Cholesky [26 13] 105 173192 173880 87010 87354 identity 1029 348 1435620 1435152 718841 718607 diagonal 637 343 876884 876708 439081 438993 Cholesky 179 108 77956 78192 39159 39277 Cholesky GMRES (15) GMRES(15) MINRES M c M v Mp Table 7.1: Checking validity of approximations (5.14), (5.15), and (5.21) of (5.12), (5.13), and (5.20) respectively by solving stokes2d(0,60) with various iterative solvers, both available block preconditioners, and all three G values. Chapter 7. Numerical Experiments 56 The missing data from rows 4, 5, and 6 of Table 7.3 means the given method was unable to converge in a reasonable number of outer iterations due to stagnation. Table 7.3 demonstrates that the 5-solves are a dominant part of the total work involved in preconditioning a saddle-point linear system regardless of the test problem or preconditioner. It appears that the better G is an approximation to A, the less the percentage of total work is devoted to 5-solves. Upwards of 99.88% of all G-solves are done within 5-solves. Upwards of 99.86% of all matrixvector products were performed within an 5-solve when preconditioning with the constraint preconditioner. All matrix-vector products performed are done within 5-solves when preconditioning with the positive-definite preconditioner. See (5.10). 5-solves are done as part of an inner iteration. 
Therefore, it would make sense that the fraction of work done for S-solves reduces as the inner tolerance increases. To show that this is indeed the case, we have included Table 7.3. Inner tolerances with an asterisk beside them indicate that the inner method stagnated but that GMRES(15) was still able to converge to the desired outer tolerance. Despite a rather large range of inner tolerances, 89.42% to 91.64% of all G-solves are still performed within S-solves.

Iterative    p/c   G          % G-solves    % MVPs
solver                        in S-solves   in S-solves
GMRES        M_c   identity   99.71%        99.85%
GMRES        M_c   diagonal   99.70%        99.85%
GMRES        M_c   Cholesky   99.08%        99.54%
GMRES        M_p   identity   -             -
GMRES        M_p   diagonal   -             -
GMRES        M_p   Cholesky   -             -
GMRES(15)    M_c   identity   99.71%        99.86%
GMRES(15)    M_c   diagonal   99.70%        99.85%
GMRES(15)    M_c   Cholesky   99.09%        99.54%
GMRES(15)    M_p   identity   93.88%        100%
GMRES(15)    M_p   diagonal   93.49%        100%
GMRES(15)    M_p   Cholesky   93.45%        100%
MINRES       M_p   identity   99.86%        100%
MINRES       M_p   diagonal   99.85%        100%
MINRES       M_p   Cholesky   99.54%        100%

Table 7.2: Determining the ratio of work done for S-solves for stokes2d(0,60) when the inner tolerance equals the outer tolerance.

Inner        iter      γ     Total #     % of G-solves
tolerance                    G-solves    in S-solves
10^-1        [100 9]   29    82434       89.42%
10^-2        [70 6]    142   42326       89.94%
10^-3        [43 15]   272   36948       91.66%
10^-4        [40 14]   351   41236       90.83%
10^-5        [35 15]   413   46127       91.24%
10^-6        [33 15]   446   50663       91.64%
10^-7        [44 9]    472   52300       91.56%
10^-8        [45 14]   501   54522       91.56%
10^-9        [38 15]   523   56170       91.58%
10^-10       [38 15]   540   57758       91.59%
10^-11       [39 8]    554   59096       91.57%
10^-12 *     [39 8]    565   60264       91.56%
10^-13 *     [39 8]    575   61360       91.61%
10^-14 *     [39 8]    586   62814       91.68%
10^-15 *     [39 8]    587   62814       91.64%

Table 7.3: Determining the ratio of work done for S-solves for mosarqp2 preconditioned with M_p, where G is the identity, solving with restarted GMRES(15). Asterisks indicate that the conjugate gradient method stagnated for those particular inner tolerances.

7.4 Comparison of Krylov subspace solvers

The three Krylov subspace methods we implemented are full GMRES, restarted GMRES, and MINRES. From the previous section, we know that the bulk of the work associated with preconditioning (1.1) with either M_c or M_p is in the S-solves. Hence all comparisons in this section will be done based on the total number of iterations needed for the S-solves over all outer iterations. Since MINRES requires the preconditioner to be positive definite, we will not be able to show a MINRES convergence plot when M_c is used to precondition. Missing data points, which occur only when the inner tolerance is large, indicate the outer method stagnated and was unable to converge in a reasonable number of iterations. Data points that are accentuated with a square indicate the inner method stagnated yet the outer method was still able to converge.

In all three graphs of Figure 7.6, the full GMRES method is unable to converge when the inner tolerance is greater than the outer tolerance of 10^-10. We see that the circles, which indicate data points associated with the full GMRES method, commence at an inner tolerance of 10^-11 and are located directly below the data points for MINRES. When an inner tolerance that is smaller than or matches the outer tolerance is used, GMRES is consistently the cheapest method. When restarting actually does take place, GMRES outperforms GMRES(15) and MINRES, as can be seen in Tables 7.4 and 7.5.

                 M_c                            M_p
                 G: eye   G: diag   G: chol    G: eye   G: diag   G: chol
mosarqp2         1.80     1.00      1.00       2.36     1.15      1.00
stokes2d(0,60)   2.25     2.48      1.49       8.56     4.20      2.25

Table 7.4: Multiple of extra work required for GMRES(15) to converge. Based on average GMRES convergence over inner tolerances 10^-10, 10^-11, and 10^-12.

We use full GMRES as the basis and show how much additional work must be done for GMRES(15) and MINRES to converge.
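As a concrete reading of the multipliers (a sketch only; the numbers are copied from Tables 7.4 and 7.9, and the agreement is approximate because Table 7.4 averages over inner tolerances 10^-10 to 10^-12):

```python
# Table 7.4 multipliers for stokes2d(0,60) with M_c.
factor_gmres15 = {"eye": 2.25, "diag": 2.48, "chol": 1.49}

# Full GMRES with a fixed inner tolerance and G = incomplete Cholesky
# needs 6682 S-solve iterations (Table 7.9).
w_gmres = 6682
w_gmres15_estimate = factor_gmres15["chol"] * w_gmres   # about 9956

# Table 7.9 reports 9971 S-solve iterations for GMRES(15): within 0.2%.
relative_gap = abs(w_gmres15_estimate - 9971) / 9971
```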
Hence, one must multiply a value in one of the tables by the overall cost function of solving with full GMRES to get a cost function for solving the same problem with MINRES or GMRES(15). Note: values were calculated as an average over inner tolerances 10^-10, 10^-11, and 10^-12, since GMRES did not converge for larger inner tolerances and the inner method stagnated for smaller ones, for all iterative methods. The terms eye, diag, and chol are short forms for G equal to the identity, the diagonal of A, or the incomplete Cholesky factorization of A, respectively.

Figure 7.6: Differences in convergence behaviour for full GMRES, GMRES(15), and MINRES (where applicable). (a) Solving mosarqp2 with M_c where G is the identity. (b) Solving mosarqp2 with M_p where G is the diagonal of A. (c) Solving stokes2d(0,60) with M_p where G is the incomplete Cholesky factorization of A.

                 M_p
                 G: eye   G: diag   G: chol
mosarqp2         1.15     1.04      1.00
stokes2d(0,60)   1.01     1.02      1.00

Table 7.5: Multiple of extra work required for MINRES to converge. Based on average GMRES convergence over inner tolerances 10^-10, 10^-11, and 10^-12.

As can be seen in the first row of Table 7.4, GMRES and GMRES(15) need virtually the same number of S-solves for convergence when G is a good approximation to A. This is due to the fact that mosarqp2 is the smallest of the test problems and converges before restarting actually takes place. The restarted GMRES method does not suffer from stagnation for the three examples shown. However, this is not always the case, as we will see in subsequent numerical experiments. Restarted GMRES is used when memory restrictions arise. But the memory savings come at another cost.
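That cost — potentially more iterations in exchange for a smaller stored subspace — can be observed directly with SciPy's restarted GMRES on a toy system (an illustrative sketch, not the thesis's OURGMRES; the matrix and restart lengths are arbitrary):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# A mildly nonsymmetric, well-conditioned toy system.
n = 200
rng = np.random.default_rng(0)
A = sp.eye(n, format="csr") + 0.05 * sp.random(n, n, density=0.02, random_state=1)
b = rng.standard_normal(n)

def inner_iterations(restart):
    """Count inner GMRES iterations needed to converge for a given restart."""
    count = [0]
    def cb(res_norm):
        count[0] += 1
    x, info = spla.gmres(A, b, restart=restart, maxiter=1000,
                         callback=cb, callback_type="pr_norm")
    assert info == 0
    return count[0]

small = inner_iterations(5)    # GMRES(5): small stored basis
large = inner_iterations(50)   # larger subspace between restarts
```

For the thesis problems, Tables 7.4 and 7.5 quantify this trade-off; Embree's examples [13] show the ordering can even reverse.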
At the k-th iteration of full GMRES, the solution estimate is computed over a subspace of dimension k. At the k-th iteration of GMRES(restart), the solution estimate is computed over a subspace of dimension at most restart (restart < k). Theoretically, the larger our restart value, the fewer iterations are needed for convergence. However, this is not always the case: Embree [13] presented two examples where GMRES(1) easily converged while GMRES(2) stagnated. We choose restart = 15. For the test problems mosarqp2 and stokes2d(0,60), GMRES(15) is just as stable as full GMRES (i.e., it does not suffer from stagnation issues), if not more so.

MINRES converges even when the inner tolerance is as large as 10^-2. This is consistent with observations made by Notay [26] and Golub and Overton [15] for the conjugate gradient method, another Krylov subspace method whose basis vectors are formed implicitly.

Upon observing Table 7.5, it appears that MINRES and GMRES require almost the same amount of work to converge. Theoretically, the two methods should converge identically, since solving a symmetric system with an Arnoldi-based method such as GMRES should reduce to a Lanczos-type solution. We attribute the small difference to the rounding errors mentioned at the end of Chapter 5.

7.5 Setting the Inner Tolerance

To experimentally find the optimal fixed inner tolerance, we solve each of our test problems several times, each time with a different value for the inner tolerance. The inner tolerances are chosen from the range 10^-i where i ∈ (0,15]. Using an inner tolerance that is too small may lead to stagnation in the inner method, since the inner tolerance is too close to machine precision, as seen by the data points accentuated by squares in previous figures. Using a tolerance that is too large can lead to stagnation in the outer method, making convergence impossible.
There is some variation in the optimal fixed inner tolerances in Figures 7.7-7.11. For our test problems, the following fixed inner tolerances have worked well:

    GMRES     -> 10^-10
    GMRES(15) -> 10^-10
    MINRES    -> 10^-2

As described in Chapter 6, we have implemented a clamped varying inner tolerance. In order to illustrate the rate at which the inner tolerance varies as GMRES or GMRES(15) converges, we plot the inner tolerance and the residual of the outer method at outer iteration k for two examples. In the previous section, we observed 15 out of 24 cases where full GMRES or restarted GMRES(15) stagnated and was unable to converge when the inner tolerance was set larger than the outer tolerance. Table 7.6 shows the total number of S-solve iterations needed to solve those very problems when using the optimal inner tolerance versus using the varying inner tolerance proposed by Simoncini and Szyld [31].

Test             Iterative   p/c   G      # S-solve itns    # S-solve itns
problem          solver                   fixed inner tol   varying inner tol
mosarqp2         GMRES       M_c   eye    19464             17169
mosarqp2         GMRES       M_c   diag   5615              5490
mosarqp2         GMRES       M_c   chol   1611              1611
mosarqp2         GMRES       M_p   eye    140198            117742
mosarqp2         GMRES       M_p   diag   18453             17251
mosarqp2         GMRES       M_p   chol   3209              3209
mosarqp2         GMRES(15)   M_c   diag   5635              5185
mosarqp2         GMRES(15)   M_c   chol   1611              1611
mosarqp2         GMRES(15)   M_p   chol   3209              3119
stokes2d(0,60)   GMRES       M_c   eye    69804             51255
stokes2d(0,60)   GMRES       M_c   diag   66899             47984
stokes2d(0,60)   GMRES       M_c   chol   6682              5085
stokes2d(0,60)   GMRES       M_p   eye    -                 1791668
stokes2d(0,60)   GMRES       M_p   diag   -                 846833
stokes2d(0,60)   GMRES       M_p   chol   -                 29260

Table 7.6: Comparison of the number of S-solve iterations required for convergence when using a fixed inner tolerance of 10^-10 and the varying inner tolerance described in Chapter 6.

In every case of Table 7.6, allowing the inner tolerance to vary requires the
same number of S-solve iterations as solving with the optimal fixed inner tolerance of 10^-10, or fewer. Missing data indicates that the iterative solver stagnated and was unable to converge even when the inner tolerance equalled the outer tolerance. When the incomplete Cholesky factorization of A is used for G, convergence occurs in so few outer iterations that the inner tolerance does not have a chance to be increased, causing the values of the two rightmost columns to be the same. For mosarqp2 the cost savings range from 0% to 16.03%; for stokes2d(0,60), from 23.9% to 28.4%. Loosely speaking, the larger and more poorly conditioned the problem, the greater the cost savings from using the varying inner tolerance as opposed to a fixed one.

Figure 7.7: Total number of S-solve iterations needed for convergence at various inner tolerances when using GMRES with M_c. In the first column, G is the identity; in the second, the diagonal of A; in the third, the incomplete Cholesky factorization of A.

Figure 7.8: Total number of S-solve iterations needed for convergence at various inner tolerances when using GMRES with M_p. In the first column, G is the identity; in the second, the diagonal of A; in the third, the incomplete Cholesky factorization of A.

Now that we have seen that allowing the inner tolerance to vary is a better alternative than simply setting it equal to the outer tolerance when using GMRES or GMRES(15), we should see how the total cost measures up against the total number of S-solve iterations performed when solving the same problem with MINRES and its optimal inner tolerance of 10^-2. Of course, these comparisons can only be done for the cases where the positive-definite preconditioner M_p is used. Let us revisit the seven such cases that appear in Table 7.6.

Figure 7.9: Total number of S-solve iterations needed for convergence at various inner tolerances when using GMRES(15) with M_c. In the first column, G is the identity; in the second, the diagonal of A; in the third, the incomplete Cholesky factorization of A.

Case 1: mosarqp2, G is the identity matrix
    Full GMRES with varying inner tol requires 117742 S-solve itns
    MINRES requires 39118 S-solve itns

Case 2: mosarqp2, G is the diagonal of A
    Full GMRES with varying inner tol requires 17251 S-solve itns
    MINRES requires 13330 S-solve itns

Figure 7.10: Total number of S-solve iterations needed for convergence at various inner tolerances when using GMRES(15) with M_p. In the first column, G is the identity; in the second, the diagonal of A; in the third, the incomplete Cholesky factorization of A.

Cases 3 and 4: mosarqp2, G is the incomplete Cholesky factorization of A
    Full GMRES with varying inner tol requires 3209 S-solve itns
    GMRES(15) with varying inner tol requires 3119 S-solve itns
    MINRES requires 3219 S-solve itns

Case 5: stokes2d(0,60), G is the identity matrix
    Full GMRES with varying inner tol requires 1791668 S-solve itns
    MINRES requires 140375 S-solve itns

Figure 7.11: Total number of S-solve iterations needed for convergence at various inner tolerances when using MINRES (necessarily with M_p). In the first column, G is the identity. In the second column, G is the diagonal of A.
In the third column, G is the incomplete Cholesky factorization of A.

Case 6: stokes2d(0,60), G is the diagonal of A
    Full GMRES with varying inner tol requires 846833 S-solve itns
    MINRES requires 107262 S-solve itns

Case 7: stokes2d(0,60), G is the incomplete Cholesky factorization of A
    Full GMRES with varying inner tol requires 29260 S-solve itns
    MINRES requires 7460 S-solve itns

Figure 7.12: X-shaped graphs demonstrating the rates at which the inner tolerance increases and the relative residual decreases as GMRES converges. (a) mosarqp2 solved with GMRES and a varying inner tolerance, preconditioning with M_p where G is the identity. (b) stokes2d(0,60) solved with GMRES and a varying inner tolerance, preconditioning with M_c where G is the diagonal of A.

Iterative solver   Inner tolerance   Preconditioner   # S-solve itns
GMRES              fixed             M_c              69804
GMRES              varying           M_c              52376
GMRES(15)          fixed             M_c              131183
GMRES(15)          varying           M_c              100192
GMRES              fixed             M_p              -
GMRES              varying           M_p              1791668
GMRES(15)          fixed             M_p              2507016
GMRES(15)          varying           M_p              264385
MINRES             fixed             M_p              140375

Table 7.7: Number of S-solve iterations needed for stokes2d(0,60) to converge given that G is the identity.

Cases 3 and 4 were added for completeness. However, they are not very interesting, as convergence occurs almost immediately and the benefits of a large inner tolerance cannot be seen here. Cases 1 and 2, however, strongly suggest that the optimal inner tolerance of 10^-2 allowed by MINRES leads to significant cost savings when preconditioning (1.1) with M_p.

7.6 The Bottom Line

After all has been said and done, all we really care about is preconditioning as cheaply as possible.
Since the cost of an S-solve iteration differs greatly from one value of G to another, we list the cost of preconditioning both our test problems for each of the three values of G, for both preconditioners, all three iterative Krylov subspace solvers, and fixed and varying inner tolerances where applicable. All values for solves with GMRES or GMRES(15) and a fixed inner tolerance are based on a fixed inner tolerance of 10^-10. All values for solves with MINRES are based on a fixed inner tolerance of 10^-2.

Iterative solver   Inner tolerance   Preconditioner   # S-solve itns
GMRES              fixed             M_c              66899
GMRES              varying           M_c              47984
GMRES(15)          fixed             M_c              165511
GMRES(15)          varying           M_c              119244
GMRES              fixed             M_p              -
GMRES              varying           M_p              846833
GMRES(15)          fixed             M_p              902732
GMRES(15)          varying           M_p              190073
MINRES             fixed             M_p              107262

Table 7.8: Number of S-solve iterations needed for stokes2d(0,60) to converge given that G is the diagonal of A.

Iterative solver   Inner tolerance   Preconditioner   # S-solve itns
GMRES              fixed             M_c              6682
GMRES              varying           M_c              5085
GMRES(15)          fixed             M_c              9971
GMRES(15)          varying           M_c              7349
GMRES              fixed             M_p              -
GMRES              varying           M_p              29260
GMRES(15)          fixed             M_p              43298
GMRES(15)          varying           M_p              14260
MINRES             fixed             M_p              7460

Table 7.9: Number of S-solve iterations needed for stokes2d(0,60) to converge given that G is the incomplete Cholesky factorization of A.

Iterative solver   Inner tolerance   Preconditioner   # S-solve itns
GMRES              fixed             M_c              19464
GMRES              varying           M_c              17169
GMRES(15)          fixed             M_c              32980
GMRES(15)          varying           M_c              28287
GMRES              fixed             M_p              140198
GMRES              varying           M_p              117742
GMRES(15)          fixed             M_p              328457
GMRES(15)          varying           M_p              282139
MINRES             fixed             M_p              39118

Table 7.10: Number of S-solve iterations needed for mosarqp2 to converge given that G is the identity.
Iterative solver   Inner tolerance   Preconditioner   # S-solve itns
GMRES              fixed             M_c              5615
GMRES              varying           M_c              5490
GMRES(15)          fixed             M_c              5635
GMRES(15)          varying           M_c              5185
GMRES              fixed             M_p              18453
GMRES              varying           M_p              17251
GMRES(15)          fixed             M_p              28861
GMRES(15)          varying           M_p              24929
MINRES             fixed             M_p              13330

Table 7.11: Number of S-solve iterations needed for mosarqp2 to converge given that G is the diagonal of A.

Iterative solver   Inner tolerance   Preconditioner   # S-solve itns
GMRES              fixed             M_c              1611
GMRES              varying           M_c              1611
GMRES(15)          fixed             M_c              1611
GMRES(15)          varying           M_c              1611
GMRES              fixed             M_p              3209
GMRES              varying           M_p              3209
GMRES(15)          fixed             M_p              3209
GMRES(15)          varying           M_p              3209
MINRES             fixed             M_p              3219

Table 7.12: Number of S-solve iterations needed for mosarqp2 to converge given that G is the incomplete Cholesky factorization of A.

For the test problems we examined, preconditioning with M_c is more cost effective than preconditioning with M_p, even if M_p is used with a very large fixed inner tolerance and MINRES. The iterative solver that appears to require the fewest S-solve iterations is GMRES. However, GMRES requires a lot of memory, which explains the need for restarted GMRES. Also, it does not take advantage of the symmetry of the saddle-point linear system as MINRES does.

Chapter 8

Conclusions and Future Work

The contributions of this thesis commence in Chapter 4, where we show three analytic results concerning eigenvalues and our block preconditioners of interest. The first of these results relates m eigenvalues of M_c to those of M_p when G is equal to the identity. Hence, if the eigenvalues of one of the block preconditioners are known, m eigenvalues of the second preconditioner are also known. The second result relates the eigenvalues of the constraint preconditioner to singular values of B when G is the identity. The final result of Chapter 4 is based on work done by Rusten and Winther [29].
However, instead of just finding eigenvalue bounds for H, we are able to find eigenvalue bounds for M^-1 H, where P ∈ R^(n x n) and Q ∈ R^(m x m) are both positive definite. The bounds are based on the extremal eigenvalues of P^(-1/2) A P^(-1/2) and the extremal singular values of P^(-1/2) B Q^(-1/2).

The concepts of G-solves and S-solves are vital. Using these concepts, we are able to quantify the cost of solving M_c u = f and M_p u = f in a way that takes advantage of the block structure of the preconditioners. Within the Krylov subspace solvers MINRES and GMRES, we define solves of the type (5.2) and (5.9) to be our inner iteration. Hence, by multiplying the total number of outer iterations by our cost function for one such solve, we derive a cost function for the total cost involved in preconditioning a saddle-point linear system with the constraint or positive-definite preconditioner.

In order to minimize the above cost functions, we try setting the inner tolerance in two different ways. When we allow the inner tolerance to be fixed over all outer iterations, we can use a simpler cost function involving γ, the average number of inner iterations needed for an S-solve. For our test problems, MINRES responded well to solving the inner iteration with a large inner tolerance. GMRES, on the other hand, did not: when the inner tolerance was set larger than the outer tolerance, GMRES stagnated and was unable to converge. This phenomenon also occurred for GMRES(15), albeit less often. This stagnation prompted the implementation of a varying inner tolerance. Simoncini and Szyld [31], as well as van den Eshof, Sleijpen, and van Gijzen [35], developed an inner tolerance bound dependent on the previous outer iteration number and residual. Although their definition of an inner iteration differs slightly from ours, we were able to extend their theory to our inner iteration.
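One common clamped form of such a relaxation rule can be sketched as follows (a hedged illustration of the idea, not the exact bound from [31] or Chapter 6; the ceiling value is arbitrary): the inner tolerance is inversely proportional to the current outer residual norm, clamped between the outer tolerance and a fixed ceiling.

```python
def varying_inner_tol(outer_tol, outer_resnorm, ceiling=1e-2):
    """Loosen the inner tolerance as the outer residual shrinks (sketch)."""
    tol = outer_tol / max(outer_resnorm, outer_tol)
    return min(max(tol, outer_tol), ceiling)

# As the outer residual drops from 1 toward the outer tolerance of 1e-10,
# the inner tolerance relaxes from 1e-10 up to the 1e-2 ceiling.
tols = [varying_inner_tol(1e-10, r) for r in (1.0, 1e-4, 1e-8, 1e-10)]
```

Early outer iterations therefore pay for accurate S-solves, while later ones get away with cheap inexact solves — the source of the savings observed in Chapter 7.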
Neither GMRES nor GMRES(15) stagnated when solving (1.2) with a varying inner tolerance for stokes2d(0,60) and mosarqp2. On the contrary, it was shown in Tables 7.7-7.11 that GMRES with a varying inner tolerance and the preconditioner M_c was by far the cheapest iterative solver in terms of preconditioning. For our test problems, if M_p is chosen as the preconditioner, solving with MINRES and a large inner tolerance is the ideal way of solving (1.1). However, if the choice of preconditioner is not fixed, solving with GMRES, M_c, and a varying inner tolerance reduces the amount of work needed for preconditioning to a minimum. This, however, does not take memory constraints into consideration.

8.1 Future Research

The choice of G greatly affects the total number of S-solve iterations. The better G approximates A, the fewer iterations are needed. However, each iteration is more expensive. Although we only explored three options for G, these do not constitute a complete list of the available choices. It might be worthwhile to run some tests with alternate G values.

For restarted GMRES, our choice of restart = 15 was rather arbitrary. It might also be worthwhile to re-run the same test problems with a different value for restart. As Embree [13] showed, we might see some very different convergence behaviour.

Numerically, we observed that GMRES (and sometimes GMRES(15)) stagnates when solving saddle-point linear systems with a fixed inner tolerance that is greater than the outer tolerance, while MINRES does not. The theory developed by Simoncini and Szyld [31] is based on an Arnoldi basis that is explicitly formed, whereas the (Lanczos) basis vectors are implicitly formed for MINRES. We suspect this is one of the reasons behind the different convergence behaviours of GMRES and MINRES. A further investigation into this matter would be informative.

Preconditioning the S-solve greatly reduces the total number of S-solve iterations.
However, the additional cost of this preconditioning has not been worked out and therefore has not been added to the cost functions of Chapter 5. Another layer needs to be added to the cost function for a fair comparison between preconditioned S-solves and non-preconditioned S-solves.

Although this thesis focuses solely on M_c and M_p, it should be mentioned that these are not the only two block preconditioners available for solving saddle-point linear systems. OURGMRES and OURMINRES could be extended to accommodate other block preconditioners as well (only positive-definite block preconditioners for OURMINRES). Of course, new cost functions would have to be derived for the total cost of preconditioning.

Bibliography

[1] K. Arrow, L. Hurwicz, and H. Uzawa. Studies in Nonlinear Programming. Stanford University Press, Stanford, CA, 1958.

[2] A. Batterman and M. Heinkenschloss. Preconditioners for Karush-Kuhn-Tucker systems arising in the optimal control of distributed systems. Optimal Control of Partial Differential Equations, pages 15-32, 1998.

[3] M. Benzi, G. H. Golub, and J. Liesen. Numerical solution of saddle point problems. Acta Numerica, 14:1-137, 2005.

[4] I. Bongartz, A. R. Conn, Nick Gould, and P. L. Toint. CUTE: Constrained and unconstrained testing environment. ACM Trans. Math. Softw., 21(1):123-160, 1995.

[5] A. Bouras and V. Fraysse. Inexact matrix-vector products in Krylov methods for solving linear systems: a relaxation strategy. SIAM J. Matrix Anal. Appl., 26:660-678, 2005.

[6] R. Bridson. stokes2d(k,n). Private communication, 2005.

[7] J. R. Bunch and B. N. Parlett. Direct methods for solving symmetric indefinite systems of linear equations. SIAM J. Numer. Anal., 8:639-655, 1971.

[8] H. S. Dollar and A. J. Wathen. Approximate factorization constraint preconditioners for saddle-point matrices. SIAM J. Sci. Comput., 2004. To appear.

[9] H. S. Dollar and A. J. Wathen.
Incomplete factorization constraint preconditioners for saddle-point matrices. Technical Report NA-04/01, Oxford University, Numerical Analysis Group, 2004.

[10] N. Dyn and W. E. Ferguson. The numerical solution of equality-constrained quadratic programming problems. Math. Comp., 41:165-170, 1983.

[11] H. C. Elman and G. H. Golub. Inexact and preconditioned Uzawa algorithms for saddle point problems. SIAM J. Numer. Anal., 31:1645-1661, 1994.

[12] H. C. Elman, D. J. Silvester, and A. J. Wathen. Finite Elements and Fast Iterative Solvers. Oxford University Press, 2005.

[13] M. Embree. The tortoise and the hare restart GMRES. SIAM Review, 45:259-266, 2003.

[14] R. Fletcher and T. Johnson. On the stability of null-space methods for KKT systems. Technical Report NA/167, University of Dundee, Department of Mathematics and Computer Science, 1995.

[15] G. H. Golub and M. L. Overton. The convergence of inexact Chebyshev and Richardson iterative methods for solving linear systems. Numer. Math., 53:571-593, 1988.

[16] A. Greenbaum. Iterative Methods for Solving Linear Systems. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1997.

[17] M. R. Hestenes and E. Stiefel. Methods of conjugate gradients for solving linear systems. J. Research Nat. Bur. Standards, 49:409-436, 1952.

[18] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, New York, NY, USA, 1986.

[19] C. Keller, N. I. M. Gould, and A. J. Wathen. Constraint preconditioning for indefinite linear systems. SIAM J. Matrix Anal. Appl., 21:1300-1317, 2000.

[20] Y. A. Kuznetsov. Efficient iterative solvers for elliptic finite element problems on nonmatching grids. Russian J. Numer. Math. Modelling, 10:187-211, 1995.

[21] C. Lanczos. Solution of systems of linear equations by minimized iterations. J. Research Nat. Bur. Standards, 49:33-53, 1952.

[22] L. Luksan and J. Vlcek.
Indefinitely preconditioned inexact Newton method for large sparse equality constrained non-linear programming problems. Numer. Linear Algebra Appl., 5:219-247, 1998.

[23] J. L. Morales-Perez and R. W. H. Sargent. On the implementation and performance of an interior point method for large sparse convex quadratic programming. Optim. Methods Softw., 1:153-168, 1992.

[24] M. F. Murphy, G. H. Golub, and A. J. Wathen. A note on preconditioning for indefinite linear systems. SIAM J. Sci. Comput., 21:1969-1972, 2000.

[25] J. Nocedal and S. J. Wright. Numerical Optimization. Springer Series in Operations Research. Springer-Verlag, New York, 1999.

[26] Y. Notay. Flexible conjugate gradients. SIAM J. Sci. Comput., 22:1444-1460, 2000.

[27] C. C. Paige and M. A. Saunders. Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal., 12:617-629, 1975.

[28] J. K. Reid. On the method of conjugate gradients for the solution of large sparse linear equations. In J. K. Reid, editor, Large Sparse Sets of Linear Equations, pages 231-254. Academic Press, New York, NY, USA, 1971.

[29] T. Rusten and R. Winther. A preconditioned iterative method for saddle-point problems. SIAM J. Matrix Anal. Appl., 13:887-904, 1992.

[30] Y. Saad and M. H. Schultz. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 7:856-869, 1986.

[31] V. Simoncini and D. B. Szyld. Flexible inner-outer Krylov subspace methods. SIAM J. Numer. Anal., 40:2219-2239, 2003.

[32] V. Simoncini and D. B. Szyld. Theory of inexact Krylov subspace methods and applications to scientific computing. SIAM J. Sci. Comput., 25:454-477, 2003.

[33] V. Simoncini and D. B. Szyld. On the occurrence of superlinear convergence of exact and inexact Krylov subspace methods. SIAM Review, 47:247-272, 2005.

[34] G. L. G. Sleijpen, H. A. van der Vorst, and J. Modersitzki.
Differences in the effects of rounding error in Krylov solvers for symmetric indefinite linear systems. SIAM J. Matrix Anal. Appl., 22:726-751, 2000.
[35] J. van den Eshof, G. L. G. Sleijpen, and M. B. van Gijzen. Relaxation strategies for nested Krylov methods. Journal of Computational and Applied Mathematics, 177:347-365, 2005.
[36] H. A. van der Vorst. Iterative Krylov Methods for Large Linear Systems. Cambridge University Press, 2003.
[37] A. J. Wathen and D. J. Silvester. Fast iterative solution of stabilised Stokes systems Part I: Using simple diagonal preconditioners. SIAM J. Numer. Anal., 30:630-649, 1993.

Appendix A  The Matlab Program

All numerical experiments were performed with an altered version of MATLAB's built-in GMRES or MINRES functions, which we cleverly call OURGMRES and OURMINRES respectively. Both new functions take advantage of the structure of the block preconditioner and use the factorizations discussed in Chapter 5 to reduce the amount of work for the linear solves of the form Mu = f.

A.1 OURGMRES

In order to make OURGMRES as robust as possible we have altered the input and output arguments in the following manner:

[u,flag,relres,iter,resvec,cost] = ourgmres(H,b,restart,tol,maxit,G1,G2,u0,pcflag,Sflag,pcSflag,tolflag)

Parameters that are new or altered relative to MATLAB's built-in GMRES include the inputs G1, G2, pcflag, Sflag, pcSflag, and tolflag and the output cost.

Input Parameters:

H        KKT matrix of dimension (n + m) x (n + m).
b        Right-hand side of the unpreconditioned KKT linear system; b is of dimension (n + m) x 1.
restart  Restarts the method every restart iterations. If restart is n or [ ], then OURGMRES uses the full version of the method.
tol      Specifies the tolerances. If tol is [ ], a default value of 1e-6 is placed in tol(1). tol(2) is the tolerance used if the S solves are solved inexactly; if tol(2) is [ ], tol(1) is the default value.
tol(3) is the tolerance used if the G solves are solved inexactly; if tol(3) is [ ], tol(2) is used as a default. Note: if tolflag = 1, the value of tol(2) is obsolete.
maxit    Specifies the maximum number of iterations. maxit(2) is the maximum number of iterations for the S solves if they are done inexactly; if maxit(2) is [ ], maxit(1) is the default value. maxit(3) is the maximum number of iterations for the G solves if they are done inexactly.
G1       If G2 is [ ], then G = G1 is the (1,1) block of the preconditioner; it must be positive definite and of dimensions n x n. If G1 is [ ], then a preconditioner is not applied.
G2       If G2 is nonempty, then G = G1*G2 is the (1,1) block of the preconditioner; G must be positive definite. G2 is of dimensions n x n.
u0       Specifies the initial guess. If u0 is [ ], OURGMRES uses the default, an all-zero vector of dimension (n + m) x 1.
pcflag   Determines the preconditioner:
         0 -> Mc (constraint preconditioner).
         1 -> Mp (positive-definite preconditioner).
Sflag    Determines the method of solving the S solves:
         0 -> uses MATLAB's built-in backslash operator.
         1 -> uses the conjugate gradient method.
         Note: if Sflag = 0, the value of tolflag is obsolete.
pcSflag  Determines whether we precondition the S solves (assuming they are solved inexactly):
         0 -> no preconditioner.
         1 -> precondition with B^T B.
         2 -> precondition with B^T diag(G)^{-1} B.
tolflag  Determines whether the inner tolerance is fixed or varying (assuming Sflag = 1):
         0 -> inner tolerance is fixed (user specified).
         1 -> inner tolerance varies according to the Simoncini and Szyld analysis.

Output Parameters:

u        Solution of the linear system Hu = b. Vector of dimension (n + m) x 1.
flag     Convergence flag:
         0 -> OURGMRES converged to the desired tolerance within maxit iterations.
         1 -> OURGMRES iterated maxit times but did not converge.
         2 -> the preconditioner was ill-conditioned.
         3 -> OURGMRES stagnated.
relres   Relative residual.
iter     1x2 vector whose (1,1) entry is the number of outer OURGMRES iterations and whose (1,2) entry is the number of inner OURGMRES iterations needed to compute u. 0 <= iter(1) <= maxit and 0 <= iter(2) <= restart. If we are using the unrestarted OURGMRES method, iter(1) = 1. Note: the terms inner and outer iterations do not have the same meaning in this context as they do elsewhere in this thesis.
resvec   Vector of residual norms at each iteration.
cost     Structure that keeps track of the number of solves (G and S) and the number of matrix-vector products.

A.2 OURMINRES

In order to make OURMINRES as robust as possible we have altered the input and output arguments in the following manner:

[u,flag,relres,iter,resvec,resveccg,cost] = ourminres(H,b,tol,maxit,G1,G2,u0,Sflag,pcSflag)

Parameters that are new or altered relative to MATLAB's built-in MINRES include the inputs G1, G2, Sflag, and pcSflag and the outputs resveccg and cost.

Input Parameters:

H        KKT matrix of dimension (n + m) x (n + m).
b        Right-hand side of the unpreconditioned KKT linear system; b is of dimension (n + m) x 1.
tol      Specifies the tolerances. If tol is [ ], a default value of 1e-6 is placed in tol(1). tol(2) is the tolerance used if the S solves are solved inexactly; if tol(2) is [ ], tol(1) is the default value. tol(3) is the tolerance used if the G solves are solved inexactly; if tol(3) is [ ], tol(2) is used as a default.
maxit    Specifies the maximum number of iterations. maxit(2) is the maximum number of iterations for the S solves if they are done inexactly; if maxit(2) is [ ], maxit(1) is the default value. maxit(3) is the maximum number of iterations for the G solves if they are done inexactly.
G1       If G2 is [ ], then G = G1 is the (1,1) block of the preconditioner; it must be positive definite and of dimensions n x n. If G1 is [ ], then a preconditioner is not applied.
G2       If G2 is nonempty, then G = G1*G2 is the (1,1) block of the preconditioner.
G must be positive definite. G2 is of dimensions n x n.
u0       Specifies the initial guess. If u0 is [ ], OURMINRES uses the default, an all-zero vector of dimension (n + m) x 1.
Sflag    Determines the method of solving the S solves:
         0 -> uses MATLAB's built-in backslash operator.
         1 -> uses the conjugate gradient method.
pcSflag  Determines whether we precondition the S solves (assuming they are solved inexactly):
         0 -> no preconditioner.
         1 -> precondition with B^T B.
         2 -> precondition with B^T diag(G)^{-1} B.

Output Parameters:

u        Solution of the linear system Hu = b. Vector of dimension (n + m) x 1.
flag     Convergence flag:
         0 -> OURMINRES converged to the desired tolerance within maxit iterations.
         1 -> OURMINRES iterated maxit times but did not converge.
         2 -> the preconditioner was ill-conditioned.
         3 -> OURMINRES stagnated.
         4 -> one of the scalar quantities calculated during OURMINRES became too small or too large to continue computing.
         5 -> the preconditioner is not symmetric positive definite.
relres   Relative residual.
iter     Iteration number at which u was computed. 0 <= iter <= maxit.
resvec   Vector of residual norms at each iteration.
resveccg Vector of estimates of the conjugate gradient residual norms at each iteration.
cost     Structure that keeps track of the number of solves (G and S) and the number of matrix-vector products.
Item Metadata

Title: A block preconditioning cost analysis for solving saddle-point linear systems
Creator: MacKinnon-Cormier, Sarah Claire
Date Issued: 2005
Language: English
Degree: Master of Science (MSc), Computer Science
Degree Grantor: University of British Columbia
Graduation Date: 2005-11
DOI: 10.14288/1.0051203
URI: http://hdl.handle.net/2429/16638