Numerical Solution of the Time-harmonic Maxwell Equations and Incompressible Magnetohydrodynamics Problems by Dan Li B.Eng., Beijing University of Aeronautics and Astronautics, 1999 M.Eng., Beijing University of Aeronautics and Astronautics, 2002 M.A.Sc., The University of British Columbia, 2005 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Doctor of Philosophy in THE FACULTY OF GRADUATE STUDIES (Computer Science) The University Of British Columbia (Vancouver) November 2010 c© Dan Li, 2010 Abstract The goal of this thesis is to develop efficient numerical solvers for the time-harmonic Maxwell equations and for incompressible magnetohydrodynamics problems. The thesis consists of three components. In the first part, we present a fully scalable parallel iterative solver for the time-harmonic Maxwell equations in mixed form with small wave numbers. We use the lowest order Nédélec elements of the first kind for the approximation of the vector field and standard nodal elements for the Lagrange multiplier associated with the divergence constraint. The correspond- ing linear system has a saddle point form, with inner iterations solved by precondi- tioned conjugate gradients. We demonstrate the performance of our parallel solver on problems with constant and variable coefficients with up to approximately 40 million degrees of freedom. Our numerical results indicate very good scalability with the mesh size, on uniform, unstructured and locally refined meshes. In the second part, we introduce and analyze a mixed finite element method for the numerical discretization of a stationary incompressible magnetohydrodynam- ics problem, in two and three dimensions. The velocity field is discretized using divergence-conforming Brezzi-Douglas-Marini (BDM) elements and the magnetic field is approximated by curl-conforming Nédélec elements. Key features of the method are that it produces exactly divergence-free velocity approximations, and that it correctly captures the strongest magnetic singularities in non-convex poly- hedral domains. We prove that the energy norm of the error is convergent in the mesh size in general Lipschitz polyhedra under minimal regularity assumptions, and derive nearly optimal a-priori error estimates for the two-dimensional case. We present a comprehensive set of numerical experiments, which indicate optimal con- vergence of the proposed method for two-dimensional as well as three-dimensional ii problems. Finally, in the third part we investigate preconditioned Krylov iterations for the discretized stationary incompressible magnetohydrodynamics problems. We propose a preconditioner based on efficient preconditioners for the Maxwell and Navier-Stokes sub-systems. We show that many of the eigenvalues of the precondi- tioned system are tightly clustered, and hence, rapid convergence is accomplished. Our numerical results show that this approach performs quite well. iii Preface This thesis describes results in three papers. i. C. Greif, D. Li, D. Schötzau, and X. Wei. A mixed finite element method with exactly divergence-free velocities for incompressible magnetohydrodynamics. Computer Methods in Applied Mechanics and Engineering, 199:2840-2855, 2010. ii. D. Li. Parallel numerical solution of the time-harmonic Maxwell equations. Lecture Notes in Computer Science on High Performance Computing and Ap- plications, 5938:224-229, 2010. iii. D. Li, C. Greif, and D. Schötzau. Parallel numerical solution of the time- harmonic Maxwell equations in mixed form. Numerical Linear Algebra with Applications, Submitted (19 pages), 2010. The first two are published and the third is in review. Chapter 2 describes the work in [ii, iii], and Chapter 3 is based on the work published in [i]. I have taken a central role in all aspects of the work on these papers, as fol- lows. Paper [ii] is a single-authored conference publication, and I worked on it and prepared it on my own. I was the lead author in Paper [iii], which is joint with my research supervisors, Chen Greif and Dominik Schötzau. Paper [i] is joint with my research supervisors and with Xiaoxi Wei, who is a PhD student supervised by Dominik Schötzau in the UBC Mathematics Department. My focus in this paper was on the numerical issues and the implementation, while my colleague Xiaoxi Wei has focused on the finite element analysis. For this reason, my thesis mainly discusses numerical results for this part, whereas some of the theoretical results will be presented in X. Wei’s thesis [95]. iv Table of Contents Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Krylov subspace methods . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Scientific parallel computing . . . . . . . . . . . . . . . . . . . . 5 1.3 The problems and their physics . . . . . . . . . . . . . . . . . . . 8 1.3.1 The time-harmonic Maxwell equations . . . . . . . . . . 8 1.3.2 Incompressible magnetohydrodynamics . . . . . . . . . . 11 1.4 Thesis overview and contributions . . . . . . . . . . . . . . . . . 14 2 Solution of the discretized time-harmonic Maxwell equations in mixed form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.1 Background and related work . . . . . . . . . . . . . . . . . . . . 18 2.2 Finite element discretization . . . . . . . . . . . . . . . . . . . . 20 2.2.1 Variational formulation . . . . . . . . . . . . . . . . . . . 20 v 2.2.2 Mixed discretization . . . . . . . . . . . . . . . . . . . . 21 2.2.3 Properties of the discrete operators . . . . . . . . . . . . . 23 2.3 The solver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 2.3.1 The outer solver . . . . . . . . . . . . . . . . . . . . . . 26 2.3.2 The inner solver . . . . . . . . . . . . . . . . . . . . . . 29 2.3.3 Solution algorithm . . . . . . . . . . . . . . . . . . . . . 30 2.3.4 Scalable parallel implementation . . . . . . . . . . . . . . 31 2.4 Numerical results . . . . . . . . . . . . . . . . . . . . . . . . . . 34 2.4.1 Example 1: structured mesh . . . . . . . . . . . . . . . . 34 2.4.2 Example 2: unstructured mesh . . . . . . . . . . . . . . . 40 2.4.3 Example 3: locally refined mesh . . . . . . . . . . . . . . 42 3 A new mixed finite element method for the incompressible MHD equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 3.1 Background and related work . . . . . . . . . . . . . . . . . . . . 48 3.2 Mixed finite element discretization . . . . . . . . . . . . . . . . . 51 3.2.1 Variational formulation . . . . . . . . . . . . . . . . . . . 51 3.2.2 Mixed discretization . . . . . . . . . . . . . . . . . . . . 52 3.2.3 Theoretical results . . . . . . . . . . . . . . . . . . . . . 56 3.3 Numerical results . . . . . . . . . . . . . . . . . . . . . . . . . . 59 3.3.1 Example 1: two-dimensional problem with a smooth solution 60 3.3.2 Example 2: two-dimensional problem with a singular so- lution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 3.3.3 Example 3: two-dimensional Hartmann flow . . . . . . . 65 3.3.4 Example 4: three-dimensional Hartmann flow . . . . . . . 67 3.3.5 Example 5: two-dimensional driven cavity flow . . . . . . 70 3.3.6 Example 6: three-dimensional driven cavity flow . . . . . 72 3.3.7 Example 7: two-dimensional MHD flow over a step . . . . 73 4 Solution of the discretized incompressible MHD equations . . . . . . 76 4.1 Background and related work . . . . . . . . . . . . . . . . . . . . 77 4.2 The solver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 4.2.1 Some ideas for preconditioning . . . . . . . . . . . . . . 79 vi 4.2.2 Preliminary eigenvalue analysis . . . . . . . . . . . . . . 84 4.3 Numerical results . . . . . . . . . . . . . . . . . . . . . . . . . . 88 4.3.1 Example 1: two-dimensional problem with a smooth solution 88 4.3.2 Example 2: two-dimensional problem with a singular so- lution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 4.3.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 94 5 Conclusions and future work . . . . . . . . . . . . . . . . . . . . . . 95 5.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 5.1.1 The Maxwell problem . . . . . . . . . . . . . . . . . . . 95 5.1.2 The MHD problem . . . . . . . . . . . . . . . . . . . . . 96 5.2 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 vii List of Tables Table 2.1 Example 1. Number of elements (Nel) and the size of the linear systems (n+m) for grids C1–C3. . . . . . . . . . . . . . . . . 38 Table 2.2 Example 1. Partitioning of grid C1. . . . . . . . . . . . . . . . 38 Table 2.3 Example 1. Iteration counts and computation times for various grids, k = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 Table 2.4 Example 1. Iteration counts for various values of k. . . . . . . 39 Table 2.5 Example 1. Iteration counts and computation times for grid C1 on 16 processors for inexact inner iterations, k = 0. . . . . . . 39 Table 2.6 Example 1. Iteration counts for various values of a, k = 0. . . . 40 Table 2.7 Example 2. Number of elements (Nel) and the size of the linear systems (n+m) for grids G1–G4. . . . . . . . . . . . . . . . . 42 Table 2.8 Example 2. Iteration counts and computation times for various grids, k = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 Table 2.9 Example 2. Iteration counts for various values of k. . . . . . . 42 Table 2.10 Example 2. Iteration counts for various values of a, k = 0. . . . 43 Table 2.11 Example 3. Number of elements (Nel) and the size of the linear systems (n+m) for grids F1–F4. . . . . . . . . . . . . . . . . 44 Table 2.12 Example 3. Iteration counts and computation times for various grids, k = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 Table 2.13 Example 3. Iteration counts for various values of k. . . . . . . 45 Table 2.14 Example 3. Iteration counts for various values of a, k = 0. . . . 46 Table 3.1 Example 1. Convergence of ‖u− uh‖L2(Ω), ‖u− uh‖1,h, and ‖p− ph‖L2(Ω). . . . . . . . . . . . . . . . . . . . . . . . . . . 60 Table 3.2 Example 1. Convergence of ‖b−bh‖L2(Ω) and ‖b−bh‖H(curl;Ω). 60 viii Table 3.3 Example 1. Convergence of ‖r− rh‖L2(Ω) and ‖∇(r− rh)‖L2(Ω). 61 Table 3.4 Example 2. Convergence of ‖u− uh‖L2(Ω), ‖u− uh‖1,h, and ‖p− ph‖L2(Ω). . . . . . . . . . . . . . . . . . . . . . . . . . . 63 Table 3.5 Example 2. Convergence of ‖b− bh‖L2(Ω), ‖b− bh‖H(curl;Ω), and ‖rh‖L2(Ω). . . . . . . . . . . . . . . . . . . . . . . . . . . 63 Table 3.6 Example 3. Convergence of ‖u− uh‖L2(Ω), ‖u− uh‖1,h, and ‖p− ph‖L2(Ω). . . . . . . . . . . . . . . . . . . . . . . . . . . 66 Table 3.7 Example 3. Convergence of ‖b− bh‖L2(Ω), ‖b− bh‖H(curl;Ω), and ‖rh‖L2(Ω). . . . . . . . . . . . . . . . . . . . . . . . . . . 66 Table 3.8 Example 4. Convergence of ‖u− uh‖L2(Ω), ‖u− uh‖1,h, and ‖p− ph‖L2(Ω). . . . . . . . . . . . . . . . . . . . . . . . . . . 69 Table 3.9 Example 4. Convergence of ‖b− bh‖L2(Ω), ‖b− bh‖H(curl;Ω), and ‖rh‖L2(Ω). . . . . . . . . . . . . . . . . . . . . . . . . . . 69 Table 4.1 Example 1. Picard-type iteration counts for various values of ν . We denote by (i) the standard Picard linearization in (3.7), by (ii) and (iii) the Picard-type linearizations in (4.9) and (4.11), respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 Table 4.2 Example 1. Iteration counts for various values of ν . The cou- pling terms are treated explicitly. . . . . . . . . . . . . . . . . 89 Table 4.3 Example 1. Iteration counts for various values of ν . The cou- pling and convection terms are treated explicitly. . . . . . . . . 90 Table 4.4 Example 1. Inner and outer iteration counts for various values of ν , usingPC. . . . . . . . . . . . . . . . . . . . . . . . . . 91 Table 4.5 Example 1. Iteration counts for various values of ν , usingPA. 91 Table 4.6 Example 2. Picard-type iteration counts for various values of ν . We denote by (i) the standard Picard linearization in (3.7), by (ii) and (iii) the Picard-type linearizations in (4.9) and (4.11), respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 Table 4.7 Example 2. Iteration counts for various values of ν . The cou- pling terms are treated explicitly. . . . . . . . . . . . . . . . . 93 Table 4.8 Example 2. Inner and outer iteration counts for various values of ν , usingPC. . . . . . . . . . . . . . . . . . . . . . . . . . 93 ix Table 4.9 Example 2. Iteration counts for various values of ν , usingPA. 94 x List of Figures Figure 1.1 Organization of the PETSc library. The objects above the line are manipulated by PETSc. Objects at a higher level of ab- straction are based on objects below them. . . . . . . . . . . . 7 Figure 2.1 A graphical illustration of the degrees of freedom for the low- est order Nédélec element in 2D and 3D. Degrees of freedom are average values of tangential component of vector fields on each edge. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 Figure 2.2 Eigenvalues of (a) the matrix K and (b) the preconditioned matrix P−1M,LK , for k = 0. For the non-preconditioned ma- trixK , the m = 225 negative eigenvalues range in value from -2.80 to -0.20 and the remaining n = 736 positive eigenval- ues range from 0.19 to 764.89. On the other hand, most of the eigenvalues of the preconditioned matrix P−1M,LK are ex- tremely close or equal to either −1 or 1. . . . . . . . . . . . 29 Figure 2.3 Mesh partitioning example. (a) Original mesh after step 1. The thick line (in all four figures) indicates the partitioning of ele- ments after step 2. (b) Reordering of elements after step 3. (c) Distribution of degrees of freedom after step 4. (d) Reordered mesh after step 5. . . . . . . . . . . . . . . . . . . . . . . . . 35 Figure 2.4 Example 1. Distribution of material coefficients. . . . . . . . 37 Figure 2.5 Example 1. (a) Structured mesh. (b) grid C1 partitioned on 3 processors. . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 Figure 2.6 Example 2. Distribution of material coefficients. . . . . . . . 41 xi Figure 2.7 Example 2. (a) Quasi-uniform mesh on the gear. (b) grid G1 partitioned on 4 processors. . . . . . . . . . . . . . . . . . . 41 Figure 2.8 Example 3. Distribution of material coefficients. . . . . . . . 43 Figure 2.9 Example 3. A sequence of locally refined meshes. . . . . . . . 44 Figure 2.10 Example 3. Illustration of grid F1 partitioned on 4 processors. 44 Figure 3.1 Failure of H1-conforming elements. (a) Contours of the first component of the strongest magnetic singularity in an L-shaped domain; (b) nodal approximation with piecewise linear ele- ments. It is evident from the plots that, on a sufficiently fine mesh, nodal elements cannot correctly resolve the magnetic field. 49 Figure 3.2 A graphical illustration of the degrees of freedom for the low- est order BDM element in 2D and 3D. . . . . . . . . . . . . . 54 Figure 3.3 Example 1. Convergence history of the Picard iteration for the grid sequence defined in Tables 3.1–3.3. . . . . . . . . . . . . 62 Figure 3.4 Example 2. Numerical approximations of (a) velocity; (b) pressure contours. . . . . . . . . . . . . . . . . . . . . . . . . 64 Figure 3.5 Example 2. Numerical approximations of (a) magnetic field; (b) contours of the first component of the magnetic field; (c) contours of the second component of the magnetic field. . . . 64 Figure 3.6 Example 3. Slices along x = 5, −1≤ y≤ 1: (a) velocity com- ponent u(y); (b) magnetic component b(y). . . . . . . . . . . 67 Figure 3.7 Example 3. Numerical approximations of (a) velocity; (b) nor- malized magnetic field. . . . . . . . . . . . . . . . . . . . . . 67 Figure 3.8 Example 4. Slices along x = 5, −2 ≤ y ≤ 2, and z = 0: (a) velocity component u(y,0); (b) magnetic component b(y,0). . 70 Figure 3.9 Example 4. Numerical approximations of (a) velocity; (b) nor- malized magnetic field. . . . . . . . . . . . . . . . . . . . . . 71 Figure 3.10 Example 5. Numerical approximations of (a) velocity; (b) nor- malized magnetic field. . . . . . . . . . . . . . . . . . . . . . 72 Figure 3.11 Example 5. Numerical approximations of (a) contours of first velocity components; (b) streamlines of velocity. . . . . . . . 73 xii Figure 3.12 Example 6. Numerical approximations of (a) velocity; (b) nor- malized magnetic field. . . . . . . . . . . . . . . . . . . . . . 74 Figure 3.13 Example 7. Numerical approximations of (a) velocity; (b) nor- malized magnetic field; (c) pressure contours. . . . . . . . . . 75 Figure 3.14 Example 7. Velocity flow vectors and streamlines zoomed in behind the step for (a) κ=2.5e4; (b) κ=1e5. . . . . . . . . . . 75 Figure 4.1 Example 1. Plot of the eigenvalues of the preconditioned ma- trix P̂−1C K , DOFs = 721. . . . . . . . . . . . . . . . . . . . 92 xiii Acknowledgments I would like to thank my supervisors, Chen Greif and Dominik Schötzau, for their wonderful guidance and support. I am greatly indebted for the countless hours of discussions, and for the conversations that encouraged me to continue when I was worn down. Without their help, this thesis would not have been possible. I am grateful to my supervisory committee, including Robert Bridson and Carl Ollivier-Gooch for giving me insightful and constructive feedback. I am thankful to my external examiner Sabine Le Borne, as well as my university examiners Felix Herrmann and Brian Wetton, for their valuable feedback. I would also like to thank my friends. My graduate study would not have been so pleasant and memorable without them. I am thankful to all my friends in the Scientific Computing Lab. I am also thankful to Rong Yi, Haijun Qiao, Kangkang Yin, and Xiaoxi Wei. Finally, I would like to thank my family. Their continuous love, support, en- couragement and confidence in me have been invaluable. I could not have done this work without them. Mom and dad, this dissertation is also yours. Though you did not have a chance to pursue a PhD degree, you deserve one! Grandma and grandpa, I hope I have made you proud. My dearest husband, Penglong Wang, thank you for always being there when I need you. You are the sunshine in my life. xiv Dedication To my husband, parents and grandparents xv Chapter 1 Introduction The main topic of this thesis is the efficient numerical solution of the time-harmonic Maxwell and incompressible magnetohydrodynamics equations. These two prob- lems are central themes in the study of physical phenomena in the field of compu- tational electromagnetics [55, 60, 73]. In this introductory chapter, we first give a brief overview of Krylov subspace methods. Following that we present an intro- duction to scientific parallel computing. We then provide the physics of the two problems we study. Finally, an overview of the thesis structure and contributions is given. 1.1 Krylov subspace methods For solving linear systems of a reasonable size, direct methods are very popular in many applications in science and engineering. For large systems, iterative methods are more appropriate. We are interested in iterative solvers, particularly, precondi- tioned Krylov subspace solvers. Suppose we are to solve Ax = b iteratively, with an initial guess x0 and initial residual r0 = b−Ax0. For any real, n× n nonsingular matrix A and vector y of the same length, the k-dimensional 1 Krylov subspace of A with respect to y is defined by Kk(A;y) = span{y,Ay,A2y, . . . ,Ak−1y}. Krylov subspace methods are based on finding a solution within the shifted Krylov subspace x0 +Kk(A;r0) which satisfies a certain optimality criterion. There are three important building blocks for the family of Krylov subspace solvers: 1. construct an orthogonal basis for the Krylov subspace; 2. define an optimality property; 3. use an effective preconditioner. Computing an orthogonal basis. The vectors {r0,Ar0,A2r0, . . . ,Ak−1r0} span the Krylov subspace Kk(A,r0), but as j increases, A jb approaches the dominant eigenvector of A and the vectors become more linearly dependent. As a result, it is more appropriate to consider an orthonormal basis forKk(A,r0). If A is unsym- metric, the Arnoldi process is used to construct the basis iteratively. Each iteration adds an orthogonal vector to the basis. This process can be written as AQk = Qk+1Hk+1,k, where Qk+1 is the matrix containing the k+ 1 vectors of the orthogonal basis for the Krylov subspace, Qk is the same matrix but with the first k columns only, and Hk+1,k is a matrix in upper Hessenberg form of size (k+1)× k. It is easy to show that QTk AQk = Hk,k, where Hk,k a matrix containing the first k rows of Hk+1,k. When A is symmetric, Hk,k must be symmetric. Therefore, it must be tridiagonal; let us denote it by Tk,k. In this case the Arnoldi process reduces to the well known Lanczos method. This process can be written as AQk = Qk+1Tk+1,k. Defining an optimality property. Now that the orthogonal basis is available, we need to define an optimality criterion for a Krylov subspace solver. There are 2 various alternatives for deciding on the type of solution we are looking for within the Krylov subspace. Two particularly popular criteria are the following: • seek a residual with minimum `2-norm within the Krylov subspace, • force the residual to be orthogonal to the Krylov subspace. The first approach leads to MINRES (for minimum residual) when A is symmet- ric [79] and GMRES (for generalized minimum residual) when A is nonsymmet- ric [86]. The second approach leads to the SYMMLQ algorithm [79] when A is symmetric. When A is nonsymmetric, the FOM (for full orthogonalization method) algorithm works [85]. When A is symmetric positive definite, the second approach is equivalent to minimizing the energy norm of the error, ||xk − x||A [33, Theo- rem 6.8] and after some manipulation we get the CG (conjugate gradient) algo- rithm [53]. In this thesis, the numerical solvers we use are MINRES, GMRES and CG. We discuss these three algorithms briefly. • GMRES for general matrices. By the Arnoldi algorithm, we have that AQk = Qk+1Hk+1,k. We can write xk = x0+Qkz, and aim to find xk which minimizes ||b−Axk||. We solve the corresponding least-squares problem through the QR factorization and derive that z = R−1k,kU T k+1,k||r0||e1. Here, Hk+1,k =Uk+1,kRk,r, where Rk,k is an upper triangular matrix and Uk+1,k consists of k orthonormal vectors. The vector e1 is (1,0, . . . ,0)T . Once we have z, we compute the kth iteration simply by setting xk = x0+Qkz. • MINRES for symmetric matrices. For symmetric matrices A, Arnoldi is replaced by Lanczos and the upper Hessenberg matrix is just tridiagonal. 3 The same mechanism as GMRES can be applied but the resulting iterative method is simpler and requires short three-term recurrence relations; this is MINRES. • CG for symmetric positive definite matrices. We write xk = x0+Qkz, and impose QTKrk = 0. It can be shown that z = T−1k,k ||r0||e1. Further manipulations related to the Cholesky decomposition of Tk,k lead to the elegant CG algorithm; see, e.g., [33, Chatper 6]. Preconditioning The convergence rate of Krylov subspace solvers depends on the distribution of A’s eigenvalues. Preconditioning means replacing the system Ax = b with the system M−1Ax = M−1b, where M is an approximation to A. There are several important properties a preconditioner should have [10, 33, 93]: • the system Mz = s must be easy to solve, • the matrix M−1A is well conditioned and/or has strongly clustered eigenval- ues, • the matrix M must not be too difficult to construct. Loosely speaking, there are two general approaches to constructing precon- ditioners [10]. One approach is to design specialized algorithms in applications involving partial differential equations (PDEs); see, e.g., [37]. These algorithms are usually optimal (or nearly so) for a narrow class of problems, but they require complete knowledge of the problem at hand, including the original (continuous) equations, the domain of integration, the boundary conditions, details of the dis- cretization, and so forth. The second approach is to design general-purpose, purely algebraic methods that use only information contained in the coefficient matrix A; see, e.g., [85]. In this thesis, since we have an underlying PDE, we design our preconditioners based on the first methodology. 4 1.2 Scientific parallel computing One of the core tasks in the numerical simulation of mathematical models is solv- ing very large linear systems of equations. Indeed, the number of equations may reach millions very quickly as we refine the mesh, especially in three dimensions. Sequential single-processor solution procedures are often prohibitive in terms of CPU time and memory requirements. In this thesis we develop a parallel comput- ing paradigm to derive efficient scalable linear solvers. Parallel computing is the application of two or more processing units to solve a single problem [81, 90]. These units can be physical processors or logical pro- cesses. Our parallel numerical examples are run on a cluster with distributed mem- ory machines with multiple processors. Each processor executes the same program, but with different data. To accomplish the task of developing an efficient parallel code, a few issues need to be addressed. Data access. Since processors calculate values that are needed by other pro- cessors, we need a way to distinguish among those data at different processors. It is useful to have a processor-centric mechanism whereby the data of a processor un- der consideration are considered on-processor data and all the rest are off-processor data. Accessing off-processor data is often referred to as communication. In us- ing multiple processors to solve a problem, varying degrees of coordination are required. Coordination primarily resolves accessing off-processor data required by some processors to compute its tasks. There are two ways to access off-processor data. 1. Direct memory access. In this method, the processor requesting information will access the off-processor information directly from memory where it was written. This requires some form of synchronization so that the value in memory is accessed when it is valid. The approach is typically used in shared memory parallelism. An intrinsic characteristic of shared memory machines is a strategy for memory coherence and a fast tightly coupled network for distributing data from a commonly accessible memory system. 2. Data exchange via messages. This is another method to acquire off-processor information using messages. In a message, the required information is pack- 5 aged, identified, and sent from the processor that defined it to the processor that requires it. Thus, messages have both the information and synchroniza- tion. This approach to accessing off-processor data is typically used in dis- tributed memory architectures. We use messages to exchange data between processors in this thesis. Load balancing. The overall time spent in parallel programs includes the com- putational time and communication time. Load balancing is very important in par- allel computing. If the work is not distributed equally, then one processor may end up taking longer than the others. Since we are doing a cooperative project, the entire job cannot be finished until the slowest subtask is finished. To reduce the computational time, in addition to optimizing the parallel algorithms, one require- ment is that the work to be done by each processor needs to be balanced among all the processors. Reducing communication cost is another import factor affecting efficiency of parallel programs. Even if the load is well balanced, the communica- tion cost can be high, if all processors need to access a lot of off-processor data. To reduce the communication time, we need to design the data structures that each processor will require minimum off-processor data. Numerical libraries. To write sophisticated parallel application codes that in- volve the numerical simulation of large-scale engineering applications, we require the use of portable, efficient numerical libraries. Over the last few decades, various agencies world-wide have invested heavily in making parallel computing usable for more than very special high-budget custom projects. There are many numeri- cal libraries available for this purpose, such as MUMPS [1], SuperLU DIST [71], PARDISO [87], Trilinos [52], Hypre [38], and PETSc [6–8]. We adopt PETSc (the Portable, Extensible Toolkit for Scientific Computation) as the framework of our parallel code in this thesis. PETSc uses the MPI standard [46] for all message passing communications. To accomplish maximum flexibility PETSc utilizes the object-oriented methodology, which in turn seamlessly allows for a high level of abstraction; see Figure 1.1. The user initiates communications of sequential and parallel phases of computations, but PETSc handles the detailed message pass- ing required during the coordination of the computations. This provides a good balance between ease of use and efficiency of implementation. Each part of the 6 PETSc library manipulates a particular family of objects (vectors, matrices, Krylov subspaces, preconditioners, etc.) and operations that can be performed on them. Matrices are distributed as block rows in PETSc (i.e., block distribution, see [81, Chapter 5] for details), such that each process holds an array of contiguous rows of the matrix and owns complete rows. Our data structures are designed to meet this data distribution scheme; see Section 2.3.4 for details. There are many open source finite element libraries available, such as FEM- STER [16], NGSolve [88], and Deal.II [9]. Instead of using any of these libraries we have written our own code, because it gives us easy access to the matrices and allows us to numerically test the various solvers studied in this work. Matrices KSP (Krylov Subspace Methods) PC (Preconditioners) Vectors Index Sets (Linear Equations Solvers) SLES LAPACKBLAS Level of Abstraction Application Codes (Time Stepping) TS (Nonlinear Equations Solvers) SNES PDE Solvers MPI Draw FIGURE 2. Organization of the PETSc Library The use of communicators in parallel software libraries is extremely important, since it enables all communication for a particular operation (e.g., a matrix-vector product) to be isolated from communication in other parts of code. Such encapsulation eliminates the problem of colliding tags (for example, when two libraries inadvertently use the same tag on dierent messages, one library may incorrectly receive a message intended for the other library), which was a serious limitation of older message-passing systems. The underlying communicators in PETSc objects ensure that communications for dierent com- putations are separate. We achieve this segregation upon object creation by immediately duplicating via MPI Comm dup() (an MPI function that makes a copy of a given communicator) any communica- tor that is not already a \PETSc communicator" and then denoting it as such by inserting an MPI attribute via MPI Attr put(). An MPI attribute is simply any collection of data a user chooses to attach to a communicator. This PETSc attribute essentially contains a tag number that is assigned to the PETSc object. The tag number is then decremented to ensure that each PETSc object that shares a common communicator has a unique tag (or tags) for use in its internal communication. 4 Six Guiding Principles As introduced in Section 1, the six guiding principles in the development of the parallel PETSc software are strongly interrelated. This section discusses each principle, while the following section describes their integration into the PETSc design. 4.1 Managing the Communication in the Context of Higher-Level Operations on Parallel Objects Raw message-passing code is often extremely dicult to understand and debug because, unless the code is very carefully documented, it is often unclear what specic message is associated with a par- ticular operation or data structure in the code. PETSc is designed so that application programmers generally need not worry about writing individual message-passing calls. Instead, they can direct communication as part of higher-level operations on a parallel object or objects. For example, the matrix-vector product interface routine, given by 8 Figure 1.1: Organization of the PETSc library. The objects above the line are manipulated by PETSc. Objects at a higher level of abstracti n are based on objects below them. 7 1.3 The problems and their physics 1.3.1 The time-harmonic Maxwell equations The modern theory of electromagnetism was founded by James Clerk Maxwell (1831-1879) in 1873 with the publication of his A Treatise on Electricity and Mag- netism [72]. The Maxwell equations are a system of PDEs that describe how the electric and magnetic fields relate to their sources, charge density and current den- sity. The classical macroscopic electromagnetic field is described by four vector functions of position x ∈ R3 and time t ∈ R, denoted by E , D , H and B [73]. The fundamental field vectors are E andH , called the electric and magnetic field intensities, respectively. The vector functions D and B are called the electric displacement and magnetic induction, respectively. They will later be eliminated from the description of the electromagnetic field via suitable constitutive relations. Our exposition in this section mainly follows [32, 73]. An electromagnetic field is created by a distribution of sources consisting of static electric charges and the directed flow of electric charge, the current. The distribution of charges is given by a scalar charge density function ρ , while currents are described by the vector current density function J . The Maxwell equations then state that the field variables and sources are related by the following equations: Farady’s law: ∂B ∂ t +∇×E = 0, (1.1a) Coulomb’s law: ∇ ·D = ρ, (1.1b) Ampère’s law: ∂D ∂ t −∇×H =−J , (1.1c) Gauss’s law: ∇ ·B = 0. (1.1d) Since charge is conserved, ρ andJ are connected by the relation ∇ ·J + ∂ρ ∂ t = 0. (1.2) 8 We assume the current density and charge density are time-harmonic: J (x, t) = R(exp(−iωt)Ĵ(x)), ρ(x, t) = R(exp(−iωt)ρ̂(x)), where i = √−1, ω > 0 is the temporal frequency andR(·) denotes the real part of the expression. Then the electromagnetic fields are time-harmonic: E (x, t) = R(exp(−iωt)Ê(x)), D(x, t) = R(exp(−iωt)D̂(x)), H (x, t) = R(exp(−iωt)Ĥ(x)), B(x, t) = R(exp(−iωt)B̂(x)). The field Ê is now a complex-valued vector function of position but not time. Substituting these relations into (1.1) leads to the time-harmonic Maxwell equa- tions: −iωB̂+∇× Ê = 0, (1.3a) ∇ · D̂ = ρ̂, (1.3b) −iωD̂−∇× Ĥ =−Ĵ, (1.3c) ∇ · B̂ = 0, (1.3d) where the time-harmonic charge density ρ̂ is given via charge conservation (1.2) and hence can be eliminated from the equations. The equations in (1.3) must be augmented by two constitutive laws that relate Ê and Ĥ to D̂ and B̂, respectively. These laws depend on the properties of the matter in the domain occupied by the electromagnetic field. We assume the linear relation D̂ = εÊ and B̂ = µĤ, (1.4) where ε and µ denote, respectively, the permittivity (measured in F/m in SI units) and permeability (H/m), which are positive, bounded, scalar functions of position. In a vacuum, we have ε ≡ ε0 and µ ≡ µ0, where ε0 = 8.854× 10−12 F/m and 9 µ0 = 4pi×10−7 H/m. There is one additional constitutive relation. In a conducting material, the elec- tromagnetic field itself gives rise to currents. If the field strengths are not large, we can assume that Ohm’s law holds so that: Ĵ = σ Ê+ Ĵa, (1.5) where σ is called the conductivity (S/m) and is a non-negative function of position. The vector function Ĵa describes the applied current density. Using the constitutive equations in (1.4) and the constitutive relation for the currents in (1.5), we use (1.3) to obtain the following time-harmonic Maxwell sys- tem: −iωµĤ+∇× Ê = 0, (1.6a) ∇ · (εÊ) = 1 iω ∇ · (σ Ê+ Ĵa), (1.6b) −iωεÊ+σ Ê−∇× Ĥ =−Ĵa, (1.6c) ∇ · (µĤ) = 0. (1.6d) Define E = ε1/20 Ê and H = µ 1/2 0 Ĥ. Using (1.6) and defining the relative permittivity and permeability by εr = 1 ε0 (ε+ iσ ω ) and µr = µ µ0 , the following version of the first-order Maxwell system is obtained: − ikµrH+∇×E = 0, (1.7a) −ikεrE−∇×H = − 1ik f, (1.7b) where f= ikµ1/20 Ĵa and the wave number is k=ω √ε0µ0. The following divergence 10 conditions follow from (1.6): ∇ · (εrE) = − 1k2∇ · f, ∇ · (µrH) = 0. The magnetic field H can be eliminated by solving (1.7a) for H and substituting into (1.7b). Thus, we obtain the second-order Maxwell system ∇× (µ−1r ∇×E)− k2εrE = f. (1.8) Equation (1.8) is the model problem we will consider in Chapter 2. We will con- sider a lossless medium (σ = 0) in which case the material coefficients are real; and the real and imaginary parts of (1.8) decouple into two problems of the same form. From now on, we assume E and f are real. Computer simulation has become an essential tool in the design and analysis of electronic and electromechanical devices, such as micro-chips, medical equipment, transformers, sensors and coils. Nowadays, there are many commercial software packages available, such as Maxwell by Ansoft (www.ansoft.com), Microwave Of- fice by AWR (web.awrcorp.com), CST EM studio by CST (www.cst.com), and COMSOL Multiphysics by COMSOL (www.comsol.com). There are a number of solvers available for (1.8). For example, a parallel solver based on virtual space preconditioners has been implemented in Hypre [38] and a Maxwell solver based on AMG preconditioners has been developed in [39]. 1.3.2 Incompressible magnetohydrodynamics The field of magnetohydrodynamics (MHD) studies the behavior of electrically conducting fluids (such as liquid metals, plasmas, salt water, etc.) in electromag- netic fields [30, 42, 74]. The equations of electromagnetics and fluid dynamics are coupled through two fundamental effects: first, the motion of a conducting mate- rial in the presence of a magnetic field induces an electric current that modifies the existing electromagnetic field (this effect is often referred to as electromotive force (EMF)). Secondly, the current and the magnetic field generate the Lorentz force, which accelerates the fluid particles in the direction normal to both the magnetic 11 field and the electric current. Our focus is on incompressible viscous fluids whose electric resistivity is non-negligible [42]. The corresponding incompressible MHD model is a system of PDEs, where the Navier-Stokes equations are coupled with the Maxwell equations. Incompressible MHD has a number of technological and industrial applications such as metallurgical engineering, electromagnetic pump- ing, stirring of liquid metals, and measuring flow quantities based on induction; cf. [28, 42]. In the MHD equations, the non-relativistic motion of Newtonian fluids is gov- erned by the Navier-Stokes equations as follows (see, e.g., [18]) ∂U ∂ t −νv∆U +(U ·∇)U − 1ρ∇P̄ = fL+ f, (1.9a) ∇ ·U = 0. (1.9b) Here U is the velocity field, νv is the kinematic viscosity of the fluid, P̄ is the pressure of the fluid, fL is the Lorentz force per mass, and f is the external body force per mass. The constraint (1.9b) corresponds to conservation of mass. Electromagnetic effects are governed by the Maxwell equations in (1.1). Recall that ε and µ are functions of position only. We assume D = εE and B = µH . (1.10) If the densities of positive and negative charges are assumed to be equal in any sizable region (quasi-neutrality assumption, see, e.g., [59] for details), Ohm’s law simplifies to: J = σ(E +U ×B). (1.11) Here, U ×B represents the electric field induced by the fluid flow. If we further assume that phenomena involving high frequency are not consid- ered, the displacement current ∂D∂ t in (1.1c) can be neglected, leading to J = ∇×H . (1.12) 12 Combining (1.11)–(1.12), we solve for E E = 1 σ J −U ×B = 1 σ ∇×H −U ×B = 1 σµ ∇×B−U ×B. (1.13) We substitute (1.13) into (1.1a) and obtain ∂B ∂ t + 1 σµ ∇×∇×B−∇× (U ×B) = 0, where the term ∇× (U ×B) counts for the electromotive force. On the other hand, the Lorentz’s force fL is the body force exerted by the elec- tromagnetic effects on the fluid. Based on (1.10) and (1.12), it is given by fL = 1 ρ J ×B = 1 ρµ (∇×B)×B. (1.14) Substituting (1.14) into (1.9a), we obtain ∂U ∂ t −νv∆U +(U ·∇)U − 1ρ∇P̄ = 1 ρµ (∇×B)×B+ f. It is convenient to rewrite the MHD equations in non-dimensional form. For this purpose, we introduce characteristic values: the characteristic magnetic induc- tion B, characteristic velocity U and characteristic length L. The resulting sys- tem of the incompressible MHD equations is cast in terms of the non-dimensional space variable xL , time variable tU L (again denoted by x and t), as well as the non-dimensional fields u = UU , b = B B , p = P̄ ρU2 and f = fL U2 . The resulting non- 13 dimensionalized system of equations is ∂u ∂ t −ν ∆u+(u ·∇)u+∇p−κ (∇×b)×b = f, ∂b ∂ t +νm∇× (∇×b)−∇× (u×b) = 0, ∇ ·u = 0, ∇ ·b = 0, where ν = 1Re , νm = 1 Rm , and Re, Rm and κ are three non-dimensional numbers. The parameter Re = ULνv is the hydrodynamic Reynolds number. The parameter Rm = σµUL is the magnetic Reynolds number. The parameter κ = B 2 µρU2 is the coupling number. For further discussion of these parameters and their typical val- ues, we refer the reader to [3, 42, 84]. In this thesis, we are interested in the steady-state MHD equations given by −ν ∆u+(u ·∇)u+∇p−κ (∇×b)×b = f, (1.15a) νm∇× (∇×b)−∇× (u×b) = 0, (1.15b) ∇ ·u = 0, (1.15c) ∇ ·b = 0. (1.15d) The PDE system (1.15) is the focus of our interest in Chapters 3 and 4. We will transform the system into mixed form, making it amenable to discretized curl- confirming elements. 1.4 Thesis overview and contributions This thesis is organized in five chapters. In Chapter 2, we discuss the numeri- cal solution of the Maxwell equations in (1.8) and present a fully scalable parallel iterative solver. It is based on a mixed finite element discretization on tetrahe- dral meshes, using the lowest order Nédélec elements of the first kind [76] for the approximation of the vector field, and standard nodal elements for the Lagrange multiplier associated with the divergence constraint. The corresponding linear sys- 14 tem has a saddle point form, and is solved using preconditioned MINRES [79]. We adopt and extend the block diagonal preconditioning approach derived in [43], with a nodal auxiliary space preconditioning technique as an inner iteration for the shifted curl-curl operator [56]. Our main contribution is the development and parallel implementation of this approach for large-scale problems in complex three-dimensional domains with variable coefficients. We demonstrate the performance of our parallel solver on problems with up to approximately 40 million degrees of freedom. Our numerical results indicate very good scalability with the mesh size, on uniform, unstructured and locally refined meshes. A preliminary version of this work has been published in the conference proceedings of the second international conference on High Per- formance Computing and Applications (Shanghai, China, 2009) in Lecture Notes in Computer Science on High Performance Computing and Applications [69]. A complete version has been submitted to Numerical Linear Algebra with Applica- tions in June, 2010 [70]. In Chapter 3, we move on to consider the more complex incompressible MHD problem (1.15), of which the Maxwell equations discussed in Chapter 2 are a sub- problem. We propose a new finite element method for the numerical discretization of a stationary incompressible MHD problem in two and three dimensions. The novelty of the method is two-fold: first, the method produces exactly divergence- free velocity approximations and the resulting discretization is provably energy- stable; secondly, it captures the strongest magnetic singularities. The velocity field is discretized using divergence-conforming Brezzi-Douglas-Marini (BDM) elements [15] and the magnetic field is approximated by curl-conforming Nédélec elements. The H1-continuity of the velocity field is enforced by a discontinuous Galerkin (DG) approach. The energy norm of the error is convergent in the mesh size in general possibly non-convex Lipschitz polyhedra under minimal regular- ity assumptions, and we have nearly optimal a-priori error estimates for the two- dimensional case, and slightly suboptimal ones for the three-dimensional case. We present a comprehensive set of numerical experiments, which indicate optimal con- vergence of the proposed method for two-dimensional as well as three-dimensional problems. We numerically confirm that the strongest magnetic singularities can be correctly captured. This work has been published in Computer Methods in Applied 15 Mechanics and Engineering [44]. In Chapter 4, we focus on the specific issue of developing efficient solvers for the linear system arising from our finite element discretization, introduced in Chap- ter 3. Here, the main difficulty is the presence of possibly strong skew-symmetric coupling terms. We present preliminary results for preconditioning techniques applied to the discretized MHD equations. We propose a preconditioning ap- proach based on combining efficient preconditioners of the Maxwell sub-system (discussed in Chapter 2) and Navier-Stokes sub-system [37, Chapter 8]. We pro- vide some preliminary analysis for our approach. We present numerical results to show that our approach performs reasonably well. We did not try to parallelize the solver for the MHD problems. This is left as an interesting topic for future work. Conclusions and an outline of related future work are provided in Chapter 5. 16 Chapter 2 Solution of the discretized time-harmonic Maxwell equations in mixed form In this chapter we develop a fully scalable parallel solver for the time-harmonic Maxwell equations in complicated domains in three dimensions. The precondi- tioned iterations are based on outer iterations of the form introduced in [43], ex- tended to the variable coefficient case, and inner iterations of the form derived in [56]. We use algebraic multigrid solvers for each elliptic problem, and accom- plish almost linear complexity in the number of degrees of freedom. With our implementation we can solve problems of dimensions of up to approximately 40 million degrees of freedom. Our numerical results scale very well with the mesh size, on uniform, unstructured, and locally refined meshes. This chapter is structured as follows. First, we discuss related work in Sec- tion 2.1. In Section 2.2 we present the mixed finite element discretization and analyze the properties of the discrete operators. The preconditioning approach is presented in Section 2.3. In Section 2.4 we provide numerical examples to demon- strate the scalability and performance of the proposed solvers. 17 2.1 Background and related work To find the solution to (1.8), we need to know the boundary conditions associated with the domain Ω. We assume ∂Ω is a perfectly conducting connected boundary of Ω, and have the following boundary condition: n×E = 0 on ∂Ω, where n denotes the outward normal unit vector to ∂Ω. We are interested in numerically solving (1.8) using mixed finite element meth- ods. Mixed formulations of the Maxwell equations have been used in [17, 80, 94]. Such formulations may improve the stability for vanishing wave numbers [94] and yield a well defined problem. A mixed approach has been used in [17] for treat- ing non-matching meshes. To derive a mixed formulation we apply the Helmholtz decomposition [80], and write E = u+∇p̂, where εru is divergence-free, and p̂ has homogeneous Dirichlet boundary condi- tions. Setting p = −k2 p̂, we obtain the following mixed problem: find the vector field u and the scalar multiplier p such that ∇×µ−1r ∇×u− k2εru+ εr∇p = f in Ω, (2.1a) ∇ · (εru) = 0 in Ω, (2.1b) n×u = 0 on ∂Ω, (2.1c) p = 0 on ∂Ω. (2.1d) We assume that 0< µmin ≤ µr ≤ µmax < ∞ and 0< εmin ≤ εr ≤ εmax < ∞. The wave number k is assumed small, k 1. 18 In the two-dimensional case, the curl operator ∇× applied to a vector u = (u1,u2) is defined as ∇× u = ∂u2∂x − ∂u1∂y , while the curl of a scalar function p is determined by ∇× p = ( ∂ p∂y ,− ∂ p∂x ). Similarly, the cross product of two vectors u = (u1,u2) and v = (v1,v2) is given by u×v = u1v2−u2v1. Finite element discretization using Nédélec elements of the first kind [76] for the approximation of u and standard nodal elements for p yields a saddle point linear system of the form K x≡ ( A− k2M BT B 0 )( u p ) = ( f 0 ) ≡ b, (2.2) whose size is (m+ n)× (m+ n). The matrix A ∈ Rn×n corresponds to the µ−1r - weighted discrete curl-curl operator; B ∈ Rm×n is the εr-weighted divergence op- erator with full row rank; M ∈Rn×n is the εr-weighted vector mass matrix; f ∈Rn is now the load vector associated with the right-hand side in (2.1a), and the vec- tors u ∈ Rn and p ∈ Rm represent the finite element coefficients. Note that A is symmetric positive semidefinite with nullity m. In [43], block diagonal preconditioners were designed for iteratively solving system (2.2) with constant coefficients. These preconditioners were motivated by spectral equivalence properties. Each iteration of the scheme requires inverting A+ γM, where γ > 0 is a given parameter. There exist several effective multigrid meth- ods for doing so. When a hierarchy of structured meshes is available, geometric multigrid can be applied [54]; for unstructured meshes, algebraic multigrid (AMG) approaches have been explored and implemented in [13, 39, 61, 83], using the smoothers introduced in [54]. A different approach has been proposed in [56]: a nodal auxiliary space preconditioner, which reduces the problem into solving two scalar elliptic problems on the nodal finite element space. A parallel solver based on the nodal auxiliary space preconditioners is implemented in [38, 67]. In this chapter we develop a fully scalable parallel implementation for solv- ing (2.2) in complicated domains in three dimensions. The preconditioned itera- tions are based on outer iterations of the form introduced in [43], extended to the 19 variable coefficient case, and inner iterations of the form derived in [56]. We use algebraic multigrid solvers for each elliptic problem, and accomplish almost lin- ear complexity in the number of degrees of freedom. With our implementation we can solve problems of dimensions of up to approximately 40 million degrees of freedom. Our numerical results scale very well with the mesh size, on uniform, unstructured, and locally refined meshes. 2.2 Finite element discretization In this section, we present a mixed finite element method that employs curl-conforming elements. Then, we discuss properties of the discrete operators. 2.2.1 Variational formulation We assumeΩ is a bounded simply-connected Lipschitz polytope inRd (d = 2 or 3), with a connected boundary ∂Ω. To write (2.2) in weak form, we denote by (·, ·)Ω the inner product in L2(Ω) or L2(Ω)d . For the computational domain Ω⊂ Rd , we let C = H0(curl;Ω) = {u ∈ L2(Ω)d : ∇×u ∈ L2(Ω)d ,n×u = 0 on ∂Ω}, S = H10 (Ω) = {p ∈ H1(Ω) : p = 0 on ∂Ω}. (2.3) The variational formulation of the Maxwell equations (2.1) is to find (u, p)∈C×S such that A(u,v)− k2M(u,v)+B(v, p) = (f,v)Ω, (2.4a) B(u,q) = 0, (2.4b) for all (v,q) ∈ C×S. The variational forms are given by A(u,v) = ∫ Ω µ−1r (∇×u) · (∇×v)dx, M(u,v) = ∫ Ω εru ·vdx, B(u,q) = ∫ Ω εru ·∇qdx. 20 Recall that the wave number k is given by k=ω√ε0µ0 andω 6= 0 is the angular frequency. We assume throughout that k2εr is not a Maxwell eigenvalue. The well- posedness of (2.1) has been proved in [32, 80]. Note that (2.1) is well-posed for k = 0. In this case, if εr ≡ 1, this corresponds to the mixed formulation of the magnetostatic problem in terms of the vector potential u, along with Coulomb’s gauge ∇ ·u= 0. In our numerical tests, we will also consider k = 0 and variable εr. 2.2.2 Mixed discretization We consider a family of regular and quasi-uniform triangulations Th of a suffi- ciently small mesh size h that partition the domain Ω into simplices {K} (i.e., triangles for d = 2 and tetrahedra for d = 3). For k ≥ 1, we wish to approximate the solution of (2.1) by finite element func- tions (uh, ph) ∈ Ch×Sh, where Ch = {u ∈ H0(curl;Ω) : u|K ∈Pk−1(K)d⊕Rk(K), K ∈Th }, Sh = { p ∈ H10 (Ω) : p|K ∈Pk(K), K ∈Th }. (2.5) Here, we denote by Pk(K) the space of polynomials of total degree at most k on element K, and by Rk(K) the space of homogeneous vector polynomials of total degree k that are orthogonal to the position vector x. The space Ch represents the first family of curl-conforming Nédélec elements (cf. [76, Chapter 5]); its degrees of freedom are defined for the tangential compo- nents of functions along faces. Figure 2.1 shows the degrees of freedom on the lowest order Nédélec elements in 2D and 3D. Now we consider the following finite element method: find (uh, ph) ∈ Ch×Sh such that A(uh,v)− k2M(uh,v)+B(v, ph) = (f,v)Ω, (2.6a) B(uh,q) = 0, (2.6b) for all (v,q) ∈ Ch×Sh. Let 〈ψ j〉nj=1 and 〈φi〉mi=1 be finite element bases for the spaces Ch and Sh re- 21 (a) 2D (b) 3D Figure 2.1: A graphical illustration of the degrees of freedom for the lowest order Nédélec element in 2D and 3D. Degrees of freedom are average values of tangential component of vector fields on each edge. spectively: Ch = span〈ψ j〉nj=1, Sh = span〈φi〉mi=1. (2.7) Then, the weak formulation in (2.4) yields a linear system of the form (2.2), where the entries of the matrices and the load vector are given by Ai, j = ∫ Ω µ−1r (∇×ψ j) · (∇×ψi) dx, 1≤ i, j ≤ n, Mi, j = ∫ Ω εrψ j ·ψi dx, 1≤ i, j ≤ n, Bi, j = ∫ Ω εrψ j ·∇φi dx, 1≤ i≤ m, 1≤ j ≤ n, fi = ∫ Ω f ·ψi dx, 1≤ i≤ n. The mixed finite element discretization (2.6) has a unique solution; see the proof in [32]. Let us introduce a few additional matrices that play an important role in this 22 formulation. First, note that ∇Sh ⊂ Ch, and define the matrix C ∈ Rn×m by ∇φ j = n ∑ i=1 Ci, jψi, j = 1, . . . ,m. (2.8) For a function qh ∈ Sh given by qh = ∑mj=1 q jφ j, we then have ∇qh = n ∑ i=1 m ∑ j=1 Ci, jq jψi, so Cq is the coefficient vector of ∇qh in the basis 〈ψi〉ni=1. In the lowest order case, the entries of C are Ci, j = 1 if node j is the head of edge i, −1 if node j is the tail of edge i, 0 otherwise. Define the εr-weighted scalar Laplacian on Sh as L = (Li, j)mi, j=1 ∈ Rm×m with Li, j = ∫ Ω εr∇φ j ·∇φi dx. (2.9) Finally, we set Q= (Qi, j)mi, j=1 ∈Rm×m as the εr-weighted scalar mass matrix on Sh, that is Qi, j = ∫ Ω εrφ j ·φi dx. 2.2.3 Properties of the discrete operators Let us state a few stability results that extend the analysis in [43] from constant material coefficients to the variable coefficient case. Denote by 〈·, ·〉 the standard Euclidean inner product in Rn or Rm, and by null(·) the null space of a matrix. For a given positive (semi)definite matrix W and a vector x, we define the (semi)norm |x|W = √ 〈Wx,x〉. Proposition 1 The following stability properties of the matrices A and B hold: 23 (i) Continuity of A: |〈Au,v〉| ≤ |u|A|v|A, u,v ∈ Rn. (ii) Continuity of B: |〈Bv,q〉| ≤ |v|M|q|L, v ∈ Rn, q ∈ Rm. (iii) The matrix A is positive definite on null(B) and 〈Au,u〉 ≥ α (|u|2A+ |u|2M) , u ∈ null(B), with a stability constant α which is independent of the mesh size. (iv) The matrix B satisfies the discrete inf-sup condition inf 06=q∈Rm sup 06=v∈null(A) 〈Bv,q〉 |v|M|q|L ≥ 1. Proof The first two properties follow directly from the Cauchy-Schwarz inequal- ity. To show (iii), we first recall the discrete Poincaré–Friedrichs inequality from [55, Theorem 4.7]. Let u ∈ null(B) and let uh be the associated finite element function. Then, we have ∫ Ω |∇×uh|2 dx≥ β ∫ Ω |uh|2 dx, where β > 0 is independent of the mesh size. 24 Consequently, we bound 〈Au,u〉 as follows 〈Au,u〉= 1 2 〈Au,u〉+ 1 2 〈Au,u〉 = 1 2 |u|2A+ 1 2 ∫ Ω µ−1r |∇×uh|2 dx ≥ 1 2 |u|2A+ 1 2µmax ∫ Ω |∇×uh|2 dx ≥ 1 2 |u|2A+ β 2µmax ∫ Ω |uh|2 dx ≥ 1 2 |u|2A+ β 2µmaxεmax ∫ Ω εr|uh|2 dx = 1 2 |u|2A+ β 2µmaxεmax |u|2M, ≥min ( 1 2 , β 2µmaxεmax ) (|u|2A+ |u|2M) = α(|u|2A+ |u|2M), where α = min ( 1 2 , β 2µmaxεmax ) . Note that, since 〈Au,u〉 = |u|2A, we must have 0 < α < 1, and then also |u|2A ≥ ᾱ|u|2M, u ∈ null(B), (2.10) with ᾱ = α 1−α . (2.11) To prove (iv), let 0 6= qh ∈ Sh and v be the coefficient vector of vh = ∇qh in the basis 〈ψi〉ni=1. Then it follows that v ∈ null(A) and sup 06=v∈null(A) 〈Bv,q〉 |v|M|q|L = sup06=v∈null(A) ∫ Ω εrvh ·∇qh dx ( ∫ Ω εrvh ·vhdx) 1 2 |q|L ≥ ∫ Ω εr∇qh ·∇qh dx ( ∫ Ω εr∇qh ·∇qhdx) 1 2 |q|L = |q|2L |q|2L = 1, which shows (iv). The properties stated in Proposition 1 and the theory of mixed finite element methods [15, Chapter 2] ensure that the saddle point system (2.2) is invertible 25 (provided that the mesh size is sufficiently small). 2.3 The solver To iteratively solve the saddle point system (2.2) we use MINRES as an outer solver. This is discussed in Section 2.3.1. To perform each outer iteration, we apply an inner solver based on [56] and presented in Section 2.3.2. In Section 2.3.3, we outline the complete solution procedure, and provide details on our scalable parallel implementation in Section 2.3.4. 2.3.1 The outer solver Following the analysis for constant coefficients in [43], we propose the following block diagonal preconditioner to iteratively solve (2.2): PM,L = ( PM 0 0 L ) , (2.12) where PM = A+ γM (2.13) and γ = 1− k2. (2.14) We have the following result: Theorem 2 The preconditioned matrixP−1M,LK has two eigenvalues λ+ = 1 and λ−=− 11−k2 , each with algebraic multiplicity m. The remaining eigenvalues satisfy the bound ᾱ− k2 ᾱ+1− k2 < λ < 1, (2.15) where ᾱ is the constant in (2.11). Proof The proof follows along the lines of [43, Theorem 5.2]. First, we verify that the following relations hold: (1) Rn = null(A)⊕null(B). 26 (2) For any u ∈ null(A), there is a unique q ∈ Rm such that u =Cq. (3) BC = L. (4) MC = BT . (5) 〈Mu,Cq〉= 〈Bu,q〉 for u ∈ Rn and q ∈ Rm. (6) 〈MCp,Cq〉= 〈Lp,q〉 for p,q ∈ Rm. (7) Let u ∈ null(A) with u =Cp. Then |u|M = |p|L. (8) For any u ∈ null(A), we have 〈BT L−1Bu,u〉= |u|M. The relations (1) and (2) follow from the discrete Helmholtz decomposition [73, Section 7.2.1]. To show (3), we have, for 1≤ i, j ≤ m, (BC)i, j = n ∑ k=1 Bi,kCk, j = ∫ Ω ( n ∑ k=1 Ck, jεrψk ) ·∇φi dx = ∫ Ω εr∇φ j ·∇φi dx = Li, j. Relation (4) follows similarly. Relation (5) follows from relation (4). To show (6), if ∇ph and ∇qh are the finite element functions associated with the vectors Cp and Cq, then we have 〈MCp,Cq〉= ∫ Ω εr∇ph ·∇qh dx = 〈Lp,q〉. Relation (7) follows from (6). Finally, to see (8), from relation (2), we have u = Cp for a vector p ∈ Rm. Using relations (3) and (7), we obtain 〈BT L−1Bu,u〉= 〈L−1Bu,Bu〉= 〈L−1BCp,BCp〉= 〈Lp, p〉= |p|L = |u|M. An orthogonality property with respect to the inner product 〈M·, ·〉 is obtained as follows. Let uA ∈ null(A) and uB ∈ null(B). Setting uA =Cq, we have 〈MuA,uB〉= 〈MuB,Cq〉= 〈BuB,q〉= 0, (2.16) 27 by relation (5). Second, the eigenvalue problem forP−1M,LK now is( A− k2M BT B 0 )( v q ) = λ ( A+(1− k2)M 0 0 L )( v q ) . Setting q = 1λ L −1Bv and multiplying the resulting equation for v by λ , we have [ (λ 2−λ )A+((1− k2)λ 2+ k2λ )M]v = BT L−1Bv. Suppose v = vA+vB, where vA ∈ null(A) and vB ∈ null(B), by relation (1). We then have (λ 2−λ )AvB+((1− k2)λ 2+ k2λ )M(vA+ vB) = BT L−1BvA. There are at least m linearly independent vectors v that satisfy vA 6= 0. For m such vectors, taking the inner product with vA and noting that by relation (8) 〈BT L−1BuA,uA〉= |u|2M and that by (2.16) we have 〈M(vA+vB),vA〉= 〈MvA,vA〉= |vA|2M, we get (1− k2)λ 2|vA|2M + k2λ |vA|2M = |vA|2M, from which λ+ and λ− are obtained as solutions of the quadratic equation (1− k2)λ 2+ k2λ = 1. For the remaining eigenvectors we must have vB 6= 0. Then, we take the inner product with vB and use that 〈BT L−1BvA,vB〉 = 〈L−1BvA,BvB〉 = 0 and by (2.16) 〈M(vA+ vB),vB〉= 〈MvB,vB〉= |vB|2M. It follows that (λ −λ 2)|vB|2A = ( (1− k2)λ 2+ k2λ) |vB|2M. Recall that we assume k 1 and λ 6= 0. Therefore, we must have λ < 1. Apply- ing (2.10), we obtain (2.15). In Figure 2.2, we show the eigenvalues ofK andP−1M,LK . We use the lowest order Nédélec elements onΩ= (−1,1)2 and compute the eigenvalues on a uniform 28 triangular mesh of moderate size, for k = 0. Figure 2.2 illustrates the effect of preconditioning, as described in Theorem 2: the preconditioned eigenvalues are strongly clustered, whereas the original ones become unbounded with the mesh size. 0 100 200 300 400 500 600 700 800 900 1000 −100 0 100 200 300 400 500 600 700 800 0 100 200 300 400 500 600 700 800 900 1000 −1.5 −1 −0.5 0 0.5 1 1.5 (a) (b) Figure 2.2: Eigenvalues of (a) the matrixK and (b) the preconditioned ma- trix P−1M,LK , for k = 0. For the non-preconditioned matrix K , the m = 225 negative eigenvalues range in value from -2.80 to -0.20 and the remaining n = 736 positive eigenvalues range from 0.19 to 764.89. On the other hand, most of the eigenvalues of the preconditioned matrix P−1M,LK are extremely close or equal to either −1 or 1. 2.3.2 The inner solver The overall computational cost of usingPM,L depends on the ability to efficiently solve linear systems whose associated matrices arePM in (2.13) and L in (2.9). The linear system L arises from a standard scalar elliptic problem, for which many efficient solution methods exist. On the other hand, efficiently inverting PM is the computational bottleneck in the inner iteration. Recently, Hiptmair and Xu proposed effective auxiliary space preconditioners for linear systems arising from conforming finite element discretizations of H(curl)-elliptic variational prob- lems [56], based on fictitious spaces as developed in [45, 96]. The preconditioner is: P−1V = diag(PM) −1+P(L̄+ γQ̄)−1PT + γ−1C(L−1)CT , (2.17) 29 with γ as in (2.14). The matrix L̄= diag(L,L,L) is the εr-weighted vector Laplacian on S3h, Q̄= diag(Q,Q,Q) is the εr-weighted vector mass matrix on S 3 h, C is the null- space matrix in (2.8), and P is the matrix representation of the nodal interpolation operator Πcurlh : S 3 h → Ch. In the lowest order case, the operator Πcurlh is based on path integrals along edges; for a finite element function wh ∈ S3h it is given by Πcurlh wh =∑ j (∫ e j wh ·ds ) ψ j, where e j is the interior edge associated with the basis function ψ j. We have P = [P(1),P(2),P(3)], where P ∈ Rn×3m and P(k) (k = 1,2,3) are matrices in Rn×m. The entries of P(k) are given by P(k)i, j = { 0.5di t (k) i if node j is the head/tail of edge i, 0 otherwise, where t(k)i is the k-th component of the unit tangential vector on edge i, and di is the length of edge i. In the constant coefficient case, it was shown in [56, Theorem 7.1] that for 0< γ ≤ 1, the spectral condition number κ2(P−1V PM) is independent of the mesh size. Even though there seems to be no theoretical analysis available for the variable coefficient case, the preconditioner PV was experimentally shown to be effective in this case as well [56]. 2.3.3 Solution algorithm We run preconditioned MINRES as the outer solver for the linear system (2.2). The preconditioner is the block diagonal matrixPM,L, defined in (2.12). For each outer iteration, we need to solve a linear system of the form( PM 0 0 L )( v q ) = ( c d ) . (2.18) 30 Two Krylov subspace solvers are applied as the inner iterations. The linear system associated with the (1,1) block, PMv = c, (2.19) is solved using CG with the preconditionerPV , which is defined in (2.17). In each CG iteration, we need to solve a linear system of the form PV w = r. (2.20) Following (2.17), this can be done by solving the two linear systems (L̄+ γQ̄)y = s, (2.21a) L̄z = t, (2.21b) where s = PT r and t =CT r. We run one AMG V-cycle to compute y and z, and we set w = diag(PM)−1r+Py+ γ−1Cz. (2.22) The linear system associated with the (2,2) block of (2.18), Lq = d, (2.23) is solved using CG with an AMG preconditioner. Our approach is summarized in Algorithm 1. The inner iteration for (2.19) is initialized in line 4 and laid out in lines 5–10, where CG iterations preconditioned with PV are used. The inner iteration for (2.23) is initialized in line 11 and pro- vided in lines 12–15, where a CG scheme with an AMG preconditioner is used. Once the two iterative solvers converge, we update the approximated solution x for the next outer iterate in line 16. The stopping criteria are discussed in Section 2.4. 2.3.4 Scalable parallel implementation We solve the Maxwell equations using the lowest order discretization with Nédélec elements of the first family and linear nodal elements. The degrees of freedom 31 Algorithm 1 SolveK x = b in (2.2) 1: initialize MINRES for (2.2) 2: while MINRES not converged do 3: set c, d to be the right-hand side for the current inner iteration; see (2.18) 4: initialize CG for (2.19) 5: while CG not converged do 6: run one AMG V-cycle to approximate (L̄+ γQ̄)−1; update y in (2.21a) 7: run one AMG V-cycle to approximate L−1; update z in (2.21b) 8: update w in (2.20), using (2.22) 9: update v in (2.19) 10: end while 11: initialize CG for (2.23) 12: while CG not converged do 13: apply AMG preconditioner to approximate L 14: update q in (2.23) 15: end while 16: update x in (2.2) 17: end while in this discretization are associated with vertices and edges of the tetrahedra; cf. Figure 2.1. To distribute data on parallel computers, a standard domain partitioning approach is used, see, e.g., [65]. To illustrate the procedure for accomplishing load- balanced data distribution, we provide a few details on the steps that are taken: (1) Meshing. We use our own mesher to generate structured meshes, or use Tet- Gen [91] for unstructured and locally refined meshes. (2) Element partitioning. Given the mesh connectivity graph, we use METIS [62] to partition the elements into non-overlapping subdomains. There are two pri- mary metrics in judging the quality of a partition: the subdomain size and the number of edge cuts in the connectivity graph. Since our meshes are comprised of tetrahedra only, each subdomain should contain a roughly equal number of elements so that the resulting domain partition is load-balanced across all pro- cessors. The number of edge cuts indicates the interprocessor communication required by the parallel solver, which is minimized in METIS [63]. 32 (3) Element re-numbering. We re-number all elements according to the rank of the processor they belong to. Elements stored on a processor with a lower rank have smaller indices. Elements stored on the same processor have contiguous indices. (4) Distribution of degrees of freedom. Based on the partitioning of elements, we distribute all degrees of freedom across all processors. We do so by looping over all elements sequentially, starting from the ones stored on the processor with the lowest rank. (5) Re-numbering of degrees of freedom. We re-number all degrees of freedom such that the numbering is contiguous on the processor that owns them. The reordering is done sequentially, starting with the processor with the lowest rank in the communicator. Once a partition is obtained, we assemble all system matrices and vectors in par- allel. We use PETSc [7], BoomerAMG [51] and Hypre [38] in our own precon- ditioned iterative solution code. The way we order the degrees of freedom and assign them to processors is consistent with the sparse matrix partitioning scheme employed in PETSc. Example Figure 2.3 gives an example of how the mesh partitioning is done. The numbers bounded by squares are indices of elements; the circled numbers are in- dices of vertices; and the rest are indices of edges. Figure 2.3(a) shows the original numbering of elements, vertices, and edges as the output of a mesher. Suppose that we partition the mesh across two processors. Based on the two metrics described in step 2 and the connectivity graph provided to METIS, it outputs the partitioning. For this example we get that processor 1 owns elements 1, 2, 6, 8 and processor 2 owns elements 3–5, 7. We now as per step 3 reorder the elements. The new indices of elements 1–8 become 1, 2, 7, 6, 5, 3, 8, 4 as shown in Figure 2.3(b). Next we consider how to distribute the degrees of freedom associated with the vertices and edges across the two processors. According to step 4, we loop over all elements starting from those stored on processor 1, whose new indices are 1–4. For element 1, vertices 1, 4, 9 and edges 1, 2, 3 are associated with it. They have not been assigned to any processor previously, so they belong to processor 1. Element 2 has 33 vertices 4, 5, 9 and edges 2, 4, 5. Vertices 4, 9 and edge 2 have been allocated to processor 1, so we only need to store vertex 5 and edges 4, 5 to processor 1. We continue to do this until we have looped over all elements. The degrees of freedom associated with the indices with underlines in Figure 2.3(c) are stored on processor 1. The remaining ones are stored on processor 2. Once the partition is obtained, we perform the reordering procedure as in step 5. We start with elements stored on processor 1 from element 1. Now the new indices of vertices 1, 4, 9 become 1, 2, 3 and the indices of edges 1, 2, 3 remain the same. For element 2, vertices 4, 5, 9 are associated with it. Since vertices 4 and 9 have been re-numbered in element 1, we only need to re-number vertex 5, which now becomes 4. Edge 2 has been re-numbered in element 1. We only need to renumber edges 4 and 5. They have the same indices in the re-ordering procedure. Once we loop over all elements, we have the reordered mesh as shown in Figure 2.3(d). In our implementation, steps 4–5 are combined together for better efficiency. 2.4 Numerical results This section is devoted to assessing the numerical performance and parallel scala- bility of our implementation on different test cases. In all experiments, the relative residual of the outer iteration is set to 1e-6 and the relative residual of the inner it- eration is set to 1e-8, unless explicitly specified. The code is executed on a cluster with up to 12 nodes. Each node has eight 2.6GHz Intel processors and 16G RAM. The following notation is used to record our results: np denotes the number of processors of the run, its is the number of outer MINRES iterations, itsi1 is the number of inner CG iterations for solvingPM, itsi2 is the number of CG iterations for solving L, while ts and ta denote the average times needed for the solve phase and the assemble phase in seconds, respectively. The parameter tAMG is the time spent in seconds in one BoomerAMG V-cycle for solving L. 2.4.1 Example 1: structured mesh The first example is a simple domain with a structured mesh. The domain is a cube, Ω = (−1,1)3. We test both homogeneous and inhomogeneous coefficient cases. In the homogeneous case, we set µr = εr = 1. In the variable coefficient 34 17 3 8 6 4 5 2 287 3 5 9 641 9 15 1 12 3 2 4 13 16 5 6 11 10 8 7 14 1 8 7 4 3 6 5 2 287 3 5 9 641 9 15 1 12 3 2 4 13 16 5 6 11 10 8 7 14 (a) (b) 1 8 7 4 3 6 5 2 287 3 5 9 641 9 15 1 12 3 2 4 13 16 5 6 11 10 8 7 14 1 8 7 4 3 6 5 2 987 6 4 3 521 13 16 1 6 3 2 4 7 9 5 8 11 10 12 14 15 (c) (d) Figure 2.3: Mesh partitioning example. (a) Original mesh after step 1. The thick line (in all four figures) indicates the partitioning of elements af- ter step 2. (b) Reordering of elements after step 3. (c) Distribution of degrees of freedom after step 4. (d) Reordered mesh after step 5. 35 case, we assume that there are eight subdomains in the cube as shown in Figure 2.4 and each subdomain has piecewise constant coefficients. The coefficients are µr = εr = 1a if x< 0 and y< 0 and z< 0, 2a if x> 0 and y< 0 and z< 0, 3a if x< 0 and y> 0 and z< 0, 4a if x> 0 and y> 0 and z< 0, 5a if x< 0 and y< 0 and z> 0, 6a if x> 0 and y< 0 and z> 0, 7a if x< 0 and y> 0 and z> 0, 8a otherwise, (2.24) where a is a constant. We set the right-hand side so that the solution of (2.1) is given by u(x,y,z) = u1(x,y,z)u2(x,y,z) u3(x,y,z) = (1− y 2)(1− z2) (1− x2)(1− z2) (1− x2)(1− y2) (2.25) and p(x,y,z) = (1− x2)(1− y2)(1− z2). (2.26) In this example, the homogeneous boundary conditions in (2.1) are satisfied. Uniformly refined meshes are constructed as shown in Figure 2.5(a). The num- ber of elements and matrix sizes are given in Table 2.1. Figure 2.5(b) shows how grid C1 is partitioned across 3 processors. Elements with the same color are stored on the same processor. Elements with the same color are clustered together, which means the communication cost is minimal. Table 2.2 shows the local numbers of elements and degrees of freedom on each processor for grid C1. The number of degrees of freedom on each processor is roughly the same, which indicates the load is balanced. In our first experiment, material coefficients are homogeneous. Scalability re- sults are shown in Table 2.3 and results with different values of k are shown in Table 2.4. To test scalability, we refine the mesh and increase the number of pro- cessors in a proportional manner, so that the problem size per processor remains 36 Figure 2.4: Example 1. Distribution of material coefficients. (a) (b) Figure 2.5: Example 1. (a) Structured mesh. (b) grid C1 partitioned on 3 processors. constant. Full scalability would then imply that the computation time also remains constant. We observe in Table 2.3 that when the mesh is refined, the numbers of outer and inner iterations stay constant, which demonstrates the scalability of our method. The time spent in the assembly also scales very well. The time spent in the 37 grid Nel n+m C1 7,146,096 9,393,931 C2 14,436,624 19,034,163 C3 29,478,000 38,958,219 Table 2.1: Example 1. Number of elements (Nel) and the size of the linear systems (n+m) for grids C1–C3. processor local elements local DOFs 1 2,359,736 3,119,317 2 2,424,224 3,199,406 3 2,362,136 3,075,208 Table 2.2: Example 1. Partitioning of grid C1. solve increases slightly. This is because each BoomerAMG V-cycle seems to take more time when the mesh is refined. For different values of k, we have observed very similar computation times as in Table 2.3. Table 2.4 shows that the iteration counts stay the same for the values of k that we have selected. This demonstrates the scalability of our solver. Notice that the highest wave number in the tables, k = 14 , corresponds to fre- quency ω = 7.9e7. The wave length is then 8pi , which is still considerably larger than our domain. Thus, our discretization well resolves the length scale of the time-harmonic waves. np grid its itsi1 itsi2 ts(sec) ta(sec) tAMG(sec) 3 C1 5 34 7 1,473.58 44.83 16.03 6 C2 5 35 9 1,634.17 45.26 20.55 12 C3 5 34 9 1,879.06 48.93 25.39 Table 2.3: Example 1. Iteration counts and computation times for various grids, k = 0. Setting the outer tolerance as 1e-6, we test our solver with different inner tol- erances. The results are given in Table 2.5. We see that when the inner tolerance is looser, fewer inner iterations are required, however if the tolerance is too loose, more outer iterations are required. For this example, 1e-6 is the optimal inner 38 k = 0 k = 18 k = 1 4 np grid its itsi1 itsi2 its itsi1 itsi2 its itsi1 itsi2 3 C1 5 34 7 5 34 7 5 34 7 6 C2 5 35 9 5 35 9 5 35 9 12 C3 5 34 9 5 34 9 5 34 9 Table 2.4: Example 1. Iteration counts for various values of k. tolerance. In the remaining examples, we stick to 1e-6 and 1e-8 as outer and inner tol- erances respectively. We select a tight inner tolerance since one of our goals is to investigate the speed of convergence of outer iterations. inner tol its itsi1 itsi2 ts(sec) 1e-10 5 43 9 557.08 1e-9 5 39 8 505.30 1e-8 5 35 8 440.92 1e-7 5 30 7 383.34 1e-6 6 21 7 375.91 1e-5 8 20 6 409.57 1e-4 21 16 5 794.15 Table 2.5: Example 1. Iteration counts and computation times for grid C1 on 16 processors for inexact inner iterations, k = 0. Next we test the variable coefficient case. Table 2.6 shows the iteration counts for different variable coefficient cases. Note that the larger a is, the more variant the coefficients are in different regions. As expected, the eigenvalue bound depends on the coefficients. Table 2.6 shows that as a increases, the material coefficients vary more dramatically, the eigenvalue bounds in Theorem 2 get looser and the iteration counts increase. In the variable coefficient case, both the inner and outer iterations are not sensitive to changes of the mesh size, which demonstrates that our method is scalable in the variable coefficient case. 39 a = 1 a = 10 a = 20 np grid its itsi1 itsi2 its itsi1 itsi2 its itsi1 itsi2 3 C1 10 81 8 58 147 8 107 135 8 6 C2 10 87 10 59 154 10 105 145 10 12 C3 10 87 11 58 178 10 107 187 10 Table 2.6: Example 1. Iteration counts for various values of a, k = 0. 2.4.2 Example 2: unstructured mesh In this example, we test the problem in a complicated domain with a quasi-uniform mesh. The domain is a complicated 3D gear, which is bounded in (0.025,0.975)× (0.025,0.975)× (0.025,0.15292). We test two different cases: constant and vari- able coefficient cases. In the constant coefficient test, we set µr = εr = 1. In the variable coefficient case, we assume that there are four subdomains in the gear as shown in Figure 2.6 and each subdomain has piecewise constant coefficients. The coefficients are µr = εr = 1a if x< 0.5 and y< 0.5, 2a if x> 0.5 and y< 0.5, 3a if x< 0.5 and y> 0.5, 4a otherwise, where a is a constant. In both tests, we set the right-hand side function so that the exact solution is given by (2.25) and (2.26), and enforce the inhomogeneous boundary conditions in a standard way. The domain is meshed with quasi-uniform tetrahedra as shown in Figure 2.7(a). The number of elements and matrix sizes are given in Table 2.7. Figure 2.7(b) shows how to partition G1 across 4 processors. First, we test our approach with constant coefficients. The scalability results are shown in Table 2.8. Results with different values of k are shown in Table 2.9. We observe that when the mesh is refined, both the inner and outer iteration counts stay constant. Again, the time spent in the assembly scales very well. The time spent in the solve is increasing slightly, which can be explained by the increase in tAMG. We note that the computational times for different values of k are similar to 40 Figure 2.6: Example 2. Distribution of material coefficients. (a) (b) Figure 2.7: Example 2. (a) Quasi-uniform mesh on the gear. (b) grid G1 partitioned on 4 processors. those reported for k = 0 in Table 2.8. Table 2.9 shows that the iteration counts do not change for different k values. Next we test the variable coefficient example. Table 2.10 shows the results for different variable coefficient cases. Again, we observe that when the variation of 41 grid Nel n+m G1 723,594 894,615 G2 1,446,403 1,810,413 G3 2,889,085 3,650,047 G4 5,778,001 7,354,886 Table 2.7: Example 2. Number of elements (Nel) and the size of the linear systems (n+m) for grids G1–G4. np grid its itsi1 itsi2 ts(sec) ta(sec) tAMG(sec) 4 G1 4 58 8 146.15 5.86 0.73 8 G2 4 61 9 345.32 8.07 1.23 16 G3 4 61 9 399.49 7.92 1.66 32 G4 4 64 10 467.57 8.17 2.12 Table 2.8: Example 2. Iteration counts and computation times for various grids, k = 0. k = 0 k = 18 k = 1 4 np grid its itsi1 itsi2 its itsi1 itsi2 its itsi1 itsi2 4 G1 4 58 8 4 58 8 4 58 8 8 G2 4 61 9 4 61 9 4 61 9 16 G3 4 61 9 4 61 9 4 61 9 32 G4 4 64 10 4 64 10 4 64 10 Table 2.9: Example 2. Iteration counts for various values of k. the material coefficients increases, the iteration counts increase. We also observe that both the inner and outer iterations are not sensitive to changes of the mesh size for the example with unstructured meshes. 2.4.3 Example 3: locally refined mesh The last example is the Fichera corner problem. We are interested in testing our solver on a series of locally refined meshes. The domain Ω = (−1,1)3\[0,1)× [0,−1)× [0,1) is a cube with a missing corner. We also test both homogeneous and inhomogeneous coefficients. In the homogeneous coefficient case, we set µr = 42 a = 1 a = 10 a = 20 np grid its itsi1 itsi2 its itsi1 itsi2 its itsi1 itsi2 4 G1 5 159 8 21 294 8 35 250 8 8 G2 5 167 9 20 328 9 35 296 9 16 G3 5 177 9 20 389 9 34 342 9 32 G4 5 185 10 20 435 10 34 404 10 Table 2.10: Example 2. Iteration counts for various values of a, k = 0. εr = 1. In the inhomogeneous case, we assume that there are seven subdomains in the domain as shown in Figure 2.8 and each subdomain has piecewise constant coefficients. The coefficients are the same as (2.24). In both tests, we set the right- hand side function so that the exact solution is given by (2.25) and (2.26), and enforce the inhomogeneous boundary conditions in a standard way. Figure 2.8: Example 3. Distribution of material coefficients. The domain is discretized with locally refined meshes towards the corner. Fig- ure 2.9 shows an example of a sequence of locally refined meshes. The number of elements and matrix sizes are given in Table 2.11. Figure 2.10 shows the partition- ing of F1 on 4 processors. First, we assume that the material coefficients are homogeneous. The scalabil- 43 (a) (b) (c) Figure 2.9: Example 3. A sequence of locally refined meshes. Figure 2.10: Example 3. Illustration of grid F1 partitioned on 4 processors. grid Nel n+m F1 781,614 957,277 F2 1,543,937 1,917,649 F3 3,053,426 3,832,895 F4 6,072,325 7,689,953 Table 2.11: Example 3. Number of elements (Nel) and the size of the linear systems (n+m) for grids F1–F4. 44 ity results are shown in Table 2.12. The results with different values of k are shown in Table 2.13. On the locally refined meshes, we also observe that when the mesh is refined, both the inner and outer solvers are scalable. Again, the time spent in the assembly scales very well. The time spent in the solve is increasing, which is due to the increasing cost of BoomerAMG V-cycles. Table 2.13 shows that the iteration counts do not change for different wave numbers. As in the previous examples, in our experiments the computational times are also roughly the same as in Table 2.12 for different values of k. np grid its itsi1 itsi2 ts(sec) ta(sec) tAMG(sec) 4 F1 5 59 8 204.82 6.54 0.88 8 F2 5 56 9 213.16 6.39 1.17 16 F3 5 58 10 253.81 6.29 1.65 32 F4 5 62 10 314.88 6.38 1.99 Table 2.12: Example 3. Iteration counts and computation times for various grids, k = 0. k = 0 k = 18 k = 1 4 np grid its itsi1 itsi2 its itsi1 itsi2 its itsi1 itsi2 4 F1 5 59 8 5 59 8 5 59 8 8 F2 5 56 9 5 56 9 5 56 9 16 F3 5 58 10 5 58 10 5 58 10 32 F4 5 62 10 5 62 10 4 63 10 Table 2.13: Example 3. Iteration counts for various values of k. Next, we test the variable coefficient case. Table 2.14 shows the results for different variable coefficient cases. Again, we observe that when the coefficients vary more, the iteration counts increase, but the scalability is very good. 45 a = 1 a = 10 a = 20 np grid its itsi1 itsi2 its itsi1 itsi2 its itsi1 itsi2 4 F1 7 85 9 40 157 9 74 134 9 8 F2 7 92 9 40 178 9 73 157 9 16 F3 7 100 10 39 192 10 72 167 10 32 F4 7 107 10 39 216 10 71 206 10 Table 2.14: Example 3. Iteration counts for various values of a, k = 0. 46 Chapter 3 A new mixed finite element method for the incompressible MHD equations In this chapter1 we design and numerically test a new finite element discretization for the steady-state incompressible MHD equations that model electrically con- ducting fluids in the presence of a magnetic field. We use divergence-confirming elements for the approximation of the velocity field and curl-confirming elements for the magnetic field. This discretization yields exactly divergence-free velocity approximations. Here, we recapitulate the theoretical properties of our method. This chapter is structured as follows. We start by discussing related work in Section 3.1. Section 3.2 is devoted to the finite element discretization. In Sec- tion 3.3 we present a series of numerical experiments validating the theoretical results. 1 The notation in this chapter is slightly different than the notation in Chapter 2. The variables u and p are now used to denote the velocity field and pressure field in the MHD system, respectively. The forms A, B, M are also different than in Chapter 2, but the meaning should be obvious from the context. 47 3.1 Background and related work We consider a standard form of the incompressible MHD equations as derived in (1.15); see also [3, Section 2] and [41, 42, 49]. That is, we neglect phenomena involving high frequency as well as the convection current, and consider a non- polarizable, non-magnetizable and homogeneous medium. In addition, to make the curl-curl operator arising in the Maxwell equations amenable to discretization with Nédélec elements, we use the mixed formulation proposed in [89]. The governing equations are then of the form −ν ∆u+(u ·∇)u+∇p−κ (∇×b)×b = f in Ω, (3.1a) κνm∇× (∇×b)+∇r−κ∇× (u×b) = g in Ω, (3.1b) ∇ ·u = 0 in Ω, (3.1c) ∇ ·b = 0 in Ω. (3.1d) Here, u is the velocity, b the magnetic field, p the hydrodynamic pressure, and r is a Lagrange multiplier associated with the divergence constraint on the magnetic field b. The functions f and g represent external force terms. We consider the following homogeneous Dirichlet boundary conditions: u = 0 on ∂Ω, (3.2a) n×b = 0 on ∂Ω, (3.2b) r = 0 on ∂Ω, (3.2c) with n being the unit outward normal on ∂Ω. By taking the divergence of the magnetostatic equation (3.1b), we obtain the Poisson problem ∆r = ∇ ·g in Ω, r = 0 on ∂Ω. (3.3) Since g is typically divergence-free in physical applications, the multiplier r is typically zero and its primary purpose is to ensure stability; see [32, Section 3]. Various finite element methods for discretizing linear and nonlinear MHD sys- tems can be found in the literature. The magnetic field is often approximated by 48 standard nodal (i.e., H1-conforming) finite elements [3, 41, 47–49]. However, since the strongest magnetic singularities have regularity below H1 straightforwardly ap- plied nodal elements may fail to resolve them in non-convex polyhedral domains; see [26] and the references therein. This is illustrated in Figure 3.1, for the strongest magnetic singularity in an L-shaped domain in two dimensions. A number of reme- dies have been proposed for electromagnetic problems, for example the weighted regularization approach in [27] or the approach in [14], whereby the divergence of the electric field is stabilized in H−α with 12 < α < 1. In [50], weighted regular- ization has been applied to a full incompressible MHD system. −1 −0.5 0 0.5 1 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 −1 −0.5 0 0.5 1 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 (a) (b) Figure 3.1: Failure of H1-conforming elements. (a) Contours of the first com- ponent of the strongest magnetic singularity in an L-shaped domain; (b) nodal approximation with piecewise linear elements. It is evident from the plots that, on a sufficiently fine mesh, nodal elements cannot cor- rectly resolve the magnetic field. In the mixed formulation of [89] the above mentioned difficulties associated with nodal elements are seamlessly avoided without the need for stabilizing the divergence. This approach amounts to introducing the Lagrange multiplier r, and yields the PDE system (3.1). As a result, it is possible to use curl-conforming Nédélec elements for approximating the magnetic field. This makes this approach 49 feasible in situations of highly singular magnetic fields [55, 73, 76]. In the con- text of incompressible magnetohydrodynamics, a related mixed approach for the discretization of the magnetic unknowns was presented in [34]. We are interested in discretizations for incompressible MHD problems that are based on DG methods; see, e.g., the surveys [20, 21, 31] and the references therein. In [48], an interior penalty technique is applied to enforce continuity of the magnetic variable across domains with different electromagnetic properties, while nodal elements are employed in the interior. A full DG method is proposed in [58] for a linearized variant of the system (3.1), whereby all the variables are approximated in discontinuous finite element spaces, based on existing discretiza- tions for the Oseen and Maxwell equations [22, 23, 57]. However, this approach requires a large number of degrees of freedom. Furthermore, a straightforward extension to the nonlinear setting in a locally conservative fashion would require a post-processing procedure for smoothing the DG velocity approximations through- out the nonlinear iteration [23]. In this chapter, we design a new finite element discretization, in an attempt to overcome the above mentioned difficulties. Instead of discontinuous elements for all unknowns, we use divergence-conforming BDM elements [15, 24] for the ap- proximation of the velocity field, and curl-conforming Nédélec elements [76] for the magnetic field, thereby substantially reducing the total number of the coupled degrees of freedom. The H1-continuity of the velocity field is again enforced by a DG technique. A central feature of this discretization is that it yields exactly divergence-free velocity approximations, guaranteeing stability of the linearized system within each Picard iteration, without any other modifications. We note that divergence-conforming discretizations were originally proposed and analyzed for the incompressible Navier-Stokes equations in [24]. For the magnetic approxima- tion we have a discrete version of the desirable property (3.3), in contrast to the full DG method presented in [58]. Our method correctly captures the strongest magnetic singularities in non-convex domains. We have nearly optimal a-priori error estimates for the two-dimensional case, and slightly suboptimal ones for the three-dimensional case. A full description of our theoretical results can be found in [44, 95]. Here we recapitulate the main results and present an extensive set of numerical tests. The numerical results clearly 50 indicate optimal convergence rates in two and three dimensions. 3.2 Mixed finite element discretization In this section, we introduce a mixed finite element method that employs divergence- conforming elements for the approximation of the velocity field and curl-conforming elements for the magnetic field. The H1-continuity of the velocity is enforced by a DG technique. 3.2.1 Variational formulation We assumeΩ is a bounded simply-connected Lipschitz polytope inRd (d = 2 or 3), with a connected boundary ∂Ω. Recall the spaces C and S defined in (2.3). Upon setting V = H10 (Ω) d = {u ∈ H1(Ω)d : u = 0 on ∂Ω}, Q = L20(Ω) = { p ∈ L2(Ω) : (p ,1)Ω = 0}, the variational formulation of the incompressible MHD system (3.1)–(3.2) amounts to finding (u,b, p,r) ∈ V×C×Q×S such that A(u,v)+O(u,u,v)+C(b,v,b)+B(v, p) = (f,v)Ω, (3.4a) M(b,c)−C(b,u,c)+D(c,r) = (g,c)Ω, (3.4b) B(u,q) = 0, (3.4c) D(b,s) = 0, (3.4d) for all (v,c,q,s) ∈ V×C×Q×S. The variational forms are given by A(u,v) = ∫ Ω ν∇u : ∇vdx, O(w,u,v) = ∫ Ω (w ·∇)u ·vdx, M(b,c) = ∫ Ω κνm(∇×b) · (∇× c)dx, C(d,v,b) = ∫ Ω κ (v×d) · (∇×b)dx, B(u,q) =− ∫ Ω (∇ ·u)qdx, D(b,s) = ∫ Ω b ·∇sdx. 51 Furthermore, we define the norm of the source terms by |||(f,g)|||= ( ‖f‖2L2(Ω)+‖g‖2L2(Ω) ) 1 2 . Finally, we introduce the parameters ν̄ = min{ν ,κνm}, κ̄ = max{1,κ}. We assume that there is a constant c1 > 0 only depending on Ω such that for small data with c1 κ̄ ν̄2 |||(f,g)|||< 1, (3.5) the MHD problem (3.4) has a unique solution (u,b, p,r) in V×C×Q×S; see the proof in [89, Corollary 2.18 and Remark 2.14]. 3.2.2 Mixed discretization We consider a family of regular and quasi-uniform triangulationsTh of mesh size h that partition the domain Ω into simplices {K} (i.e., triangles for d = 2 and tetra- hedra for d = 3). We denote byFh the set of all edges (d = 2) or faces (d = 3) of Th. In the following, we generically refer to elements in Fh as faces. As usual, hK denotes the diameter of the element K, and hF is the diameter of the face F . Finally, we write nK for the unit outward normal vector on the boundary ∂K of K. The average and jump operators are defined as follows. Let F = ∂K ∩∂K′ be an interior face shared by K and K′, and let x ∈ F . Let φ be a generic piecewise smooth function and denote by φ and φ ′ the traces of φ on F taken from within the interior of K and K′, respectively. Then, we define the mean value {{·}} at x ∈ F as {{φ}}= 1 2 (φ +φ ′). Furthermore, for a piecewise smooth vector-valued function φ , we define the jump: [[φ ]] = φ ⊗nK +φ ′⊗nK′ , 52 where φ⊗n= (φin j)1≤i, j≤d . On a boundary face F = ∂K∩∂Ω, we set accordingly {{φ}}= φ , [[φ ]] = φ ⊗n. For k≥ 1, we wish to approximate the solution of (3.1)–(3.2) by finite element functions (uh,bh, ph,rh) ∈ Vh×Ch×Qh×Sh, where Vh = {u ∈ H0(div;Ω) : u|K ∈Pk(K)d , K ∈Th }, Qh = { p ∈ L20(Ω) : p|K ∈Pk−1(K), K ∈Th }, and the spaces Ch and Sh are defined in (2.5). Here, we denote by H0(div;Ω) the space H0(div;Ω) = { u ∈ L2(Ω)d : ∇ ·u ∈ L2(Ω), u ·n = 0 on ∂Ω } . The space Vh is the divergence-conforming BDM space (see [15, Section III.3] for details); it has degrees of freedom specified for the normal components of func- tions along faces. Figure 3.2 shows the degrees of freedom on the lowest order BDM elements in 2D and 3D. We notice that the finite element spaces Ch, Qh and Sh are conforming in C, Q and S, respectively, while Vh is non-conforming in V. Now we consider the following finite element method: find (uh,bh, ph,rh) ∈ Vh×Ch×Qh×Sh such that Ah(uh,v)+Oh(uh,uh,v)+C(bh,v,bh)+B(v, ph) = (f,v)Ω, (3.6a) M(bh,c)−C(bh,uh,c)+D(c,rh) = (g,c)Ω, (3.6b) B(uh,q) = 0, (3.6c) D(bh,s) = 0, (3.6d) for all (v,c,q,s) ∈ Vh×Ch×Qh×Sh. The form Ah associated with the Laplacian is chosen as the standard interior 53 (a) 2D (b) 3D Figure 3.2: A graphical illustration of the degrees of freedom for the lowest order BDM element in 2D and 3D. penalty form [4, 5]: Ah(u,v) = ∫ Ω ν∇hu : ∇hvdx− ∑ F∈Fh ∫ F {{ν∇hu}} : [[v]]ds − ∑ F∈Fh ∫ F {{ν∇hv}} : [[u]]ds+ ∑ F∈Fh a0ν hF ∫ F [[u]] : [[v]]ds. Here, ∇h is the elementwise gradient operator, and a0 > 0 is the interior penalty stabilization parameter; it has to be chosen larger than a threshold value which is independent of h, ν , κ and νm. For the convection form, we take the standard upwind form [68]: Oh(w,u,v) = ∑ K∈Th ∫ K (w ·∇)u ·vdx + ∑ K∈Th ∫ ∂K\∂Ω 1 2 (w ·nK−|w ·nK |)(ue−u) ·vds − ∫ ∂Ω 1 2 (w ·n−|w ·n|)u ·vds. Here, ue is the trace of u taken from the exterior of K. The remaining forms are 54 the same as in the continuous case. Notice that due to the presence of the upwind terms the form Oh(w,u,v) is not linear in the first argument. By choosing the divergence-conforming BDM elements as the approximating space for the velocity, the method gives exactly divergence-free velocity approx- imations; cf. [24]. Moreover, the Lagrange multiplier rh vanishes identically for divergence-free source terms, thereby mimicking the continuous property in (3.3). Proposition 3 Let (uh, bh, ph, rh) solve (3.6). Then we have: (i) ∇ ·uh = 0 in Ω. (ii) the Lagrange multiplier rh is the solution of (∇rh,∇s)Ω = (g,∇s)Ω ∀ s ∈ Sh. In particular, if g is solenoidal, then rh ≡ 0. Proof To prove item (i), we proceed as in [24]. We note that ∇ ·uh has vanishing mean value on Ω, and is a discontinuous polynomial of degree k− 1. Thus, we have ∇ · uh ∈ Qh. Equation (3.6c) then implies that ∇ · uh is orthogonal to all functions q ∈ Qh. Therefore, it is equal to zero. To prove item (ii), we take c = ∇s in equation (3.6b) (noting that ∇Sh ⊂ Ch) and obtain (g,∇s)Ω = M(bh,∇s)−C(bh,uh,∇s)+D(∇s,rh) = D(∇s,rh). Here, we have used the fact that ∇×∇s = 0. Therefore, rh satisfies (∇rh,∇s)Ω = (g,∇s)Ω ∀ s ∈ Sh. Since (g,∇s)Ω = (∇ ·g,s)Ω, we have rh ≡ 0 provided that ∇ ·g = 0. The solution of (3.6) can be found by employing the following Picard-type iteration: given (uk−1h ,b k−1 h ) ∈Vh×Ch, let (ukh,bkh, pkh,rkh) in Vh×Ch×Qh×Sh be 55 the solution of the linearized Oseen-type problem Ah(ukh,v)+Oh(u k−1 h ,u k h,v)+C(b k−1 h ,v,b k h)+B(v, p k h) = (f,v)Ω, (3.7a) B(ukh,q) = 0, (3.7b) M(bkh,c)−C(bk−1h ,ukh,c)+D(c,rkh) = (g,c)Ω, (3.7c) D(bkh,s) = 0, (3.7d) for all (v,c,q,s) ∈ Vh×Ch×Qh×Sh. Theorem 3.2 in [44] guarantees the convergence of the iterates {(ukh,bkh, pkh,rkh)}, for k= 1,2, . . ., to the solution (uh,bh, ph,rh) of (3.6) for any initial guess (u0h,b 0 h)∈ Vh×Ch with exactly divergence-free u0h, provided that the small data assumption in (3.5) is satisfied. However, the scheme is only linearly convergent, as we illus- trate in Section 3.3. Remark 4 A more efficient nonlinear solver such as Newton’s method can also be used for solving (3.6); see, e.g., [40, 42, 49]. When upwinding is not incorporated, Newton’s method can be straightforwardly applied. However, when upwind terms are included, adapting the nonlinear iteration to our discretization is more delicate, since it requires additional linearization of the convection form Oh(w,u,v) in the first argument. This remains an item for future investigation. 3.2.3 Theoretical results Our first result is a convergence result. To state it, we suppose the solution (u,b, p,r) of (3.1)–(3.2) possesses the smoothness (u, p) ∈ Hσ+1(Ω)d×Hσ (Ω), (3.8a) (b,∇×b,r) ∈ Hτ(Ω)d×Hτ(Ω)d×Hτ+1(Ω), (3.8b) for σ ,τ > 12 . Remark 5 The regularity assumption (3.8b) is minimal in the sense that it is satis- fied by the strongest singularities of the Maxwell operator in polyhedral domains; 56 cf. [26, 27]. Similarly, the regularity (3.8a) holds true for the strongest singular- ities of the Stokes operator in polyhedral domains; see [2, 29]. In view of these results, we expect (3.8) to be the minimal smoothness of solutions to the MHD system (3.1)-(3.2) in general Lipschitz polyhedra. However, we do not have a full proof of this conjecture. To discuss the convergence of the finite element formulation (3.6), we intro- duce the product norms ‖(u,b)‖Vh×Ch = ( ν‖u‖21,h+κνm‖b‖2H(curl;Ω) ) 1 2 , ‖(p,r)‖Q×S = ( 1 ν ‖p‖2L2(Ω)+ 1 κνm ‖r‖2H1(Ω) ) 1 2 , Here, the discrete H1-norm for the hydrodynamic velocity is defined by ‖u‖1,h = ( ∑ K∈Th ‖∇u‖2L2(K)+ ∑ F∈Fh h−1F ‖[[u]]‖2L2(F) ) 1 2 , and the curl-norm for the magnetic field is defined by ‖b‖H(curl;Ω) = ( ‖b‖2L2(Ω)+‖∇×b‖2L2(Ω) ) 1 2 . Theorem 6 Let (u,b, p,r) and (uh,bh, ph,rh) be the solutions of (3.1)–(3.2) and (3.6), respectively, obtained on a sequence of quasi-uniform meshes {Th}h>0 of mesh size h. Assume (3.8) and that κ̄ ν̄−2|||(f,g)||| is sufficiently small. Then we have lim h→0 ‖(u−uh,b−bh)‖Vh×Ch = 0, limh→0‖(p− ph,r− rh)‖Q×S = 0. Theorem 6 guarantees that the method (3.6) gives correct solutions provided that the (minimal) smoothness assumption (3.8) is satisfied and the data is suffi- ciently small; see the proof in [44, Section 4.4]. In particular, it ensures conver- gence in situations where straightforwardly applied nodal elements for the approx- imation of b are not capable of correctly capturing the singular solution compo- 57 nents. Next, we present a-priori error estimates for the two- and three-dimensional cases. To do so, we introduce the function `(d,h) given by `(d,h) = { h−ε , d = 2, h− 1 2 , d = 3. Here, ε > 0 is a fixed number. The function `(d,h) will indicate the deterioration of convergence rates for both the two-dimensional and three-dimensional cases. We also denote by Cd > 0 a generic constant independent of h, ν , κ and νm, but dependent on the dimension d. In particular, for d = 2 it depends on ε and might be unbounded as ε → 0. Theorem 7 LetΩ⊂Rd (d = 2 or 3) be a simply-connected Lipschitz polygon with a connected boundary ∂Ω. Under the same assumption as in Theorem 6, we have the following error estimates: ‖(u−uh,b−bh)‖Vh×Ch ≤Cd`(d,h)hmin{σ ,τ,k} ( ν 1 2 ‖u‖Hσ+1(Ω)+(κνm) 1 2 ‖b‖Hτ (Ω)+(κνm) 1 2 ‖∇×b‖Hτ (Ω) ) +Chmin{σ ,τ,k} ( ν− 1 2 ‖p‖Hσ (Ω)+(κνm)− 1 2 ‖r‖Hτ+1(Ω) ) , and ‖(p− ph,r− rh)‖Q×S ≤Cd`(d,h)2hmin{σ ,τ,k} ( ν 1 2 ‖u‖Hσ+1(Ω)+(κνm) 1 2 ‖b‖Hτ (Ω)+(κνm) 1 2 ‖∇×b‖Hτ (Ω) ) +Cd`(d,h)hmin{σ ,τ,k} ( ν− 1 2 ‖p‖Hσ (Ω)+(κνm)− 1 2 ‖r‖Hτ+1(Ω) ) . Here, the constants C is independent of h, ν , κ and νm. We provide the proof of Theorem 7 in [44, Section 4.5]. The convergence rates in Theorem 7 are optimal in the mesh size, up to a loss of O(hε) for ε arbitrarily small in two dimensions, and up to a loss of O(h 1 2 ) in three dimensions. In addi- tion, in the two-dimensional case, the constant Cd might become unbounded as ε 58 tends to zero. However, in our numerical experiments this constant is observed to stay bounded. In fact, we observe optimal rates of convergence in all our tests, for both smooth and non-smooth solutions. Full details are given in Section 3.3. This gap between the theoretical and numerical results indicates that the former could be further improved. For the linearized variant of the MHD system (3.1), our method converges op- timally in the mesh size h, as follows from [58, Remark 3.3]. That is, the estimates of Theorem 7 hold true without any loss, both in two and three dimensions. How- ever, there we make stronger smoothness assumptions on the linearized magnetic field. Therefore, this optimality cannot be straightforwardly carried over to the nonlinear setting. 3.3 Numerical results In this section we present a series of numerical experiments. Our computations have been carried out using MATLAB, with direct linear solvers; iterative solvers will be discussed in Chapter 4. The primary purpose of our experiments is to confirm optimal convergence rates of our method. We start by considering one problem with a smooth solution and a second one with a singular solution. Then, we consider the numerical approximations of two- and three-dimensional Hart- mann channel flow and driven cavity flow problems. Finally, we present results for another benchmark problem: MHD flow over a step in two dimensions. Throughout this section, the lowest order BDM and Nédélec elements are em- ployed and the interior penalty stabilization parameter is a0 = 10. The Picard itera- tion described in (3.7) is used to solve the nonlinear systems. For all the examples, we solve a Stokes problem and the Maxwell equations, decoupled, to obtain an initial guess. The tolerance for the Picard iterations is chosen as 1e-5. We test our method on problems with mixed Dirichlet and Neumann bound- ary conditions in the hydrodynamic variables, even though the analysis has been carried out solely for the Dirichlet case. Throughout this section, ΓN denotes the Neumann boundary, and ΓD the Dirichlet boundary. On Neumann boundaries, we specify the value of (pI−ν∇u)n, where I is the identity matrix. 59 3.3.1 Example 1: two-dimensional problem with a smooth solution First, we verify the theoretical results stated in Theorems 6 and 7 for a problem with a smooth analytical solution. We consider the following two-dimensional problem. We set Ω = (−1,1)2 with ΓN = {(1,y) : y ∈ (−1,1)}, ΓD = ∂Ω\ΓN , ν = κ = 1, νm = 1e4, and choose the source terms f, g and the boundary conditions so that the analytical solution is of the form u(x,y) = (y2,x2), p(x,y) = x, b(x,y) = (1− y2,1− x2), r(x,y) = (1− x2)(1− y2). We construct this example with r 6= 0 to show the convergence rate in rh; later examples will feature a divergence-free g and a vanishing r; cf. Proposition 3. DOFs in uh/ph ‖u−uh‖L2(Ω) l ‖u−uh‖1,h l ‖p− ph‖L2(Ω) l 112/32 3.893e-2 – 8.297e-1 – 1.297 – 416/128 1.016e-2 1.94 4.105e-1 1.01 3.734e-1 1.78 1,600/512 2.707e-3 1.91 2.045e-1 1.01 1.293e-1 1.53 6,272/2,048 7.087e-4 1.93 1.021e-1 1.00 5.475e-2 1.24 24,832/8,192 1.813e-4 1.97 5.104e-2 1.00 2.597e-2 1.08 98,816/32,768 4.578e-5 1.99 2.552e-2 1.00 1.281e-2 1.02 Table 3.1: Example 1. Convergence of ‖u−uh‖L2(Ω), ‖u−uh‖1,h, and ‖p− ph‖L2(Ω). DOFs in bh/rh ‖b−bh‖L2(Ω) l ‖b−bh‖H(curl;Ω) l 56/25 4.720e-1 – 9.431e-1 – 208/81 2.358e-1 1.00 4.714e-1 1.00 800/289 1.179e-1 1.00 2.357e-1 1.00 3,136/1,089 5.893e-2 1.00 1.179e-1 1.00 12,416/4,225 2.946e-2 1.00 5.893e-2 1.00 49,408/16,641 1.473e-2 1.00 2.946e-2 1.00 Table 3.2: Example 1. Convergence of ‖b−bh‖L2(Ω) and ‖b−bh‖H(curl;Ω). 60 DOFs in bh/rh ‖r− rh‖L2(Ω) l ‖∇(r− rh)‖L2(Ω) l 56/25 1.673e-1 – 9.391e-1 – 208/81 4.433e-2 1.92 4.824e-1 0.96 800/289 1.125e-2 1.98 2.429e-1 0.99 3,136/1,089 2.822e-3 1.99 1.216e-1 1.00 12,416/4,225 7.062e-4 2.00 6.085e-2 1.00 49,408/16,641 1.766e-4 2.00 3.043e-2 1.00 Table 3.3: Example 1. Convergence of ‖r− rh‖L2(Ω) and ‖∇(r− rh)‖L2(Ω). In Tables 3.1–3.3, we investigate the asymptotic rates of convergence of the errors in the approximations of the hydrodynamic and magnetic variables; here, l denotes the experimental rate of convergence. We observe that ‖u− uh‖1,h, ‖p− ph‖L2(Ω), ‖b−bh‖H(curl;Ω) and ‖∇(r− rh)‖L2(Ω) converge to zero as the mesh is refined, in accordance with Theorem 6. The rate of convergence is O(h). Notice that we obtain the optimal rate in this numerical experiment, even though Theo- rem 7 predicts a suboptimal rate with a loss of O(hε). Additionally, ‖u−uh‖L2(Ω) and ‖r− rh‖L2(Ω) converge at rate O(h2) as h tends to zero, which is also optimal. In Figure 3.3 we show the convergence history of the Picard iterations for the grid sequence considered in this example. The plot depicts the number of iterations against the differences between consecutive iterates corresponding to the approxi- mated vector coefficients, measured in a normalized discrete 2-norm and labeled as ‘Tolerance’ in the plot. As expected, convergence is linear and the iteration count is fairly insensitive to the size of the grid. A very similar behavior has been observed in all of our other experiments, in 2D as well as in 3D. 3.3.2 Example 2: two-dimensional problem with a singular solution In order to verify the capability of the proposed method to capture singularities in two dimensions, we consider a problem in the L-shaped domain Ω = (−1,1)2 \ ([0,1)× (−1,0]) with ΓN = {(1,y) : y ∈ (0,1)}, ΓD = ∂Ω\ΓN , and set ν = κ = 1, νm = 1e4. We choose the forcing terms and the boundary conditions such that the analytic solution is given by the strongest corner singularities for the underlying elliptic operators. In polar coordinates (ρ,θ), the hydrodynamic solution compo- 61 1 2 3 4 5 10−7 10−6 10−5 10−4 10−3 10−2 10−1 100 101 Iteration To le ra nc e Grid 1 Grid 2 Grid 3 Grid 4 Grid 5 Grid 6 Figure 3.3: Example 1. Convergence history of the Picard iteration for the grid sequence defined in Tables 3.1–3.3. nents u and p are then given by u(ρ,θ) = [ ρλ ((1+λ )sin(θ)ξ (θ)+ cos(θ)ξ ′(θ)) ρλ (−(1+λ )cos(θ)ξ (θ)+ sin(θ)ξ ′(θ)) ] , p(ρ,θ) =−ρλ−1((1+λ )2ξ ′(θ)+ξ ′′′(θ))/(1−λ ), where ξ (θ) =sin((1+λ )θ)cos(λω)/(1+λ )− cos((1+λ )θ) − sin((1−λ )θ)cos(λω)/(1−λ )+ cos((1−λ )θ), ω = 32pi and λ ≈ 0.54448373678246. The magnetic pair (b,r) is given by b(ρ,θ) = ∇(ρ2/3 sin(2/3θ)), r(ρ,θ)≡ 0. For this example, we have that (u, p) ∈ H1+λ (Ω)2 × Hλ (Ω) and b∈H2/3(Ω)2. Note that straightforwardly applied nodal elements cannot correctly resolve the magnetic field. In Tables 3.4–3.5, we investigate the asymptotic rates of conver- gence of the errors in the approximations of the hydrodynamic and magnetic vari- 62 ables. Again, we observe that the discrete solution converges to the exact one as the mesh size h approaches zero, in accordance with Theorem 6. The results show full agreement with the optimal rates for ‖u−uh‖1,h and ‖b−bh‖H(curl;Ω). For the pressure, we also see that the rate for ‖p− ph‖L2(Ω) is approaching the optimal rate, albeit more slowly. Additionally, we observe the L2-norm of r is zero because g is divergence-free, in accordance with Proposition 3. In Figures 3.4–3.5, we show the solution computed on the finest mesh with 24,576 elements; the total number of degrees of freedom employed in the finite element space Vh×Ch×Qh× Sh is 148,481. The results show that our solution captures the strongest corner singularities and are comparable to the results in [58]. DOFs in uh/ph ‖u−uh‖L2(Ω) l ‖u−uh‖1,h l ‖p− ph‖L2(Ω) l 88/24 2.159e-1 – 2.468 – 15.91 – 320/96 1.781e-1 0.28 1.880 0.39 9.328 0.77 1,216/384 1.204e-1 0.56 1.368 0.46 5.387 0.79 4,736/1,536 6.816e-1 0.82 0.9588 0.51 3.301 0.71 18,688/6,144 3.490e-2 0.97 0.6627 0.53 2.124 0.64 74,240/24,576 1.705e-2 1.03 0.4559 0.54 1.408 0.59 Table 3.4: Example 2. Convergence of ‖u−uh‖L2(Ω), ‖u−uh‖1,h, and ‖p− ph‖L2(Ω). DOFs in bh/rh ‖b−bh‖L2(Ω) l ‖b−bh‖H(curl;Ω) l ‖rh‖L2(Ω) 44/21 2.796e-1 – 2.796e-1 – 2.162e-12 160/65 1.814e-1 0.62 1.814e-1 0.62 6.188e-12 608/225 1.169e-1 0.63 1.169e-1 0.63 2.289e-11 2,368/833 7.473e-2 0.65 7.473e-2 0.65 4.260e-11 9,344/3,201 4.754e-2 0.65 4.754e-2 0.65 1.406e-10 37,120/12,545 3.013e-2 0.66 3.013e-2 0.66 3.018e-10 Table 3.5: Example 2. Convergence of ‖b−bh‖L2(Ω), ‖b−bh‖H(curl;Ω), and ‖rh‖L2(Ω). 63 −1 −0.5 0 0.5 1 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 −1 −0.5 0 0.5 1 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 (a) (b) Figure 3.4: Example 2. Numerical approximations of (a) velocity; (b) pres- sure contours. −1 −0.5 0 0.5 1 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 −1 −0.5 0 0.5 1 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 −1 −0.5 0 0.5 1 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 (a) (b) (c) Figure 3.5: Example 2. Numerical approximations of (a) magnetic field; (b) contours of the first component of the magnetic field; (c) contours of the second component of the magnetic field. 64 3.3.3 Example 3: two-dimensional Hartmann flow Next, we consider Hartmann channel flow problems in two and three dimensions; cf. [42]. In these examples, we denote by Ha the Hartmann number, which is defined as Ha = √ κ ννm . Consider the two-dimensional Hartmann flow problem, which involves a steady unidirectional flow in the channel Ω= (0,10)× (−1,1) under the influence of the constant transverse magnetic field bD = (0,1). The MHD solution then takes the form: u(x,y) = (u(y),0), p(x,y) =−Gx+ p0(y), b(x,y) = (b(y),1), r(x,y)≡ 0. (3.9) We impose the following boundary conditions: u = 0 on y =±1, (pI−ν∇u)n = pNn on x = 0 and x = 10, n×b = n×bD on ∂Ω, r = 0 on ∂Ω, where pN(x,y) = p(x,y) =−Gx− G 2 2κ ( sinh(yHa) sinh(Ha) − y )2 . The exact solution is given by (3.9) with u(y) = G νHatanh(Ha) ( 1− cosh(yHa) cosh(Ha) ) , b(y) = G κ ( sinh(yHa) sinh(Ha) − y ) , p0(y) =−G 2 2κ ( sinh(yHa) sinh(Ha) − y )2 . We note that p0(y) and −κb(y) 2 2 are the same up to an additive constant. 65 In Tables 3.6–3.7 and Figures 3.6–3.7, we set ν = κ = 1, νm = 1e4, and G = 10. We observe that rh ≡ 0, as predicted in Proposition 3, and ‖u−uh‖1,h, ‖p− ph‖L2(Ω) and ‖b−bh‖H(curl;Ω) converge to zero at the optimal rateO(h) as the mesh is refined. Moreover, we note that the L2-norms of the errors in the approximations of u, b and p tend to zero optimally as well. In Figures 3.6–3.7 we show the slices and the vector fields of the solution com- puted on the mesh with 32,768 elements; the total number of degrees of freedom employed in the finite element space Vh×Ch×Qh× Sh is 197,633. In order to show the directions of vectors, in Figure 3.7(b) and later figures, b is normalized such that the largest magnitude of each component is 1 in the computational do- main. The computed and analytical solutions of the first components in the velocity and magnetic fields are virtually indistinguishable; see Figure 3.6. DOFs in uh/ph ‖u−uh‖L2(Ω) l ‖u−uh‖1,h l ‖p− ph‖L2(Ω) l 416/128 2.028e-1 – 3.215 – 13.97 – 1,600/512 5.169e-2 1.97 1.611 1.00 6.986 1.00 6,272/2,048 1.306e-2 1.99 0.8061 1.00 3.493 1.00 24,832/8,192 3.282e-3 1.99 0.4033 1.00 1.747 1.00 98,816/32,768 8.227e-4 2.00 0.2017 1.00 0.8734 1.00 Table 3.6: Example 3. Convergence of ‖u−uh‖L2(Ω), ‖u−uh‖1,h, and ‖p− ph‖L2(Ω). DOFs in bh/rh ‖b−bh‖L2(Ω) l ‖b−bh‖H(curl;Ω) l ‖rh‖L2(Ω) 208/81 1.679e-4 – 2.259e-4 – 3.868e-12 800/289 8.605e-5 0.96 1.148e-4 0.98 1.746e-11 3,136/1,089 4.328e-5 0.99 5.761e-4 0.99 3.627e-11 12,416/4,225 2.167e-5 1.00 2.883e-5 1.00 9.424e-11 49,408/16,641 1.084e-5 1.00 1.442e-5 1.00 2.401e-10 Table 3.7: Example 3. Convergence of ‖b−bh‖L2(Ω), ‖b−bh‖H(curl;Ω), and ‖rh‖L2(Ω). 66 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 0 1 2 3 4 5 6 y u (y) FE Solution Analytical Solution −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 −8 −6 −4 −2 0 2 4 6 8 x 10−5 y b(y ) FE Solution Analytical Solution (a) (b) Figure 3.6: Example 3. Slices along x = 5, −1≤ y≤ 1: (a) velocity compo- nent u(y); (b) magnetic component b(y). (a) (b) Figure 3.7: Example 3. Numerical approximations of (a) velocity; (b) nor- malized magnetic field. 3.3.4 Example 4: three-dimensional Hartmann flow In this example, we consider the three-dimensional unidirectional flow in the rect- angular duct given by Ω= (0,L)× (−y0,y0)× (−z0,z0) with y0,z0 L under the 67 influence of the constant transverse magnetic field bD = (0,1,0). We take f= g= 0 and consider solutions of the form u(x,y,z) = (u(y,z),0,0), p(x,y,z) =−Gx+ p0(y,z), b(x,y,z) = (b(y,z),1,0), r(x,y,z)≡ 0. We enforce the boundary conditions u = 0 for y =±y0 and z =±z0, (pI−ν∇u)n = pNn for x = 0 and x = L, n×b = n×bD on ∂Ω, r = 0 on ∂Ω, with pN(x,y,z) =−Gx− κb(y,z) 2 2 +10. The function b(y,z) is given by the Fourier series b(y,z) = ∞ ∑ n=0 bn(y)cos(λnz), where λn = (2n+1)pi 2z0 , bn(y) = ν κ ( An λ 2n − p21 p1 sinh(p1y)+Bn λ 2n − p22 p2 sinh(p2y) ) , p21,2 = λ 2 n +Ha 2/2±Ha √ λ 2n +Ha2/4, An = −p1(λ 2n − p22) ∆n un(y0)sinh(p2y0), Bn = p2(λ 2n − p21) ∆n un(y0)sinh(p1y0), ∆n = p2(λ 2n − p21)sinh(p1y0)cosh(p2y0)− p1(λ 2n − p22)sinh(p2y0)cosh(p1y0), un(y0) = −2G νλ 3n z0 sin(λnz0). 68 DOFs in uh/ph ‖u−uh‖L2(Ω) l ‖u−uh‖1,h l ‖p− ph‖L2(Ω) l 360/48 3.959e-1 – 1.829 – 30.89 – 2,592/384 1.320e-1 1.58 0.9561 0.94 8.194 1.91 19,584/3,072 3.609e-2 1.87 0.4903 0.96 2.837 1.53 152,064/24,576 9.590e-3 1.91 0.2484 0.98 1.091 1.38 Table 3.8: Example 4. Convergence of ‖u−uh‖L2(Ω), ‖u−uh‖1,h, and ‖p− ph‖L2(Ω). DOFs in bh/rh ‖b−bh‖L2(Ω) l ‖b−bh‖H(curl;Ω) l ‖rh‖L2(Ω) 98/27 1.850e-5 – 3.219e-5 – 9.855e-12 604/125 1.565e-5 0.24 2.579e-5 0.32 1.013e-10 4,184/729 8.592e-6 0.86 1.464e-5 0.82 4.098e-10 31,024/4,913 4.411e-6 0.96 7.543e-6 0.96 1.795e-9 Table 3.9: Example 4. Convergence of ‖b−bh‖L2(Ω), ‖b−bh‖H(curl;Ω), and ‖rh‖L2(Ω). The functions u(y,z) and p0(y,z) can be also expressed by Fourier series; for de- tails; see [40]. In fact, p0(y,z) and−κb(y,z) 2 2 are identical up to an additive constant. Note also that p(x,y,z) = pN(x,y,z). In our tests, we set L = 10, y0 = 2, z0 = 1, ν = κ = 1, νm = 1e4 and G = 0.5. In Tables 3.8–3.9, we investigate the asymptotic rates of convergence of the errors in the approximations of the hydrodynamic and magnetic variables. Again, we observe that the finite element solution converges to the exact solution as the mesh size h approaches zero, in accordance with Theorem 6. We observe the results show good agreement with the optimal rates for ‖u−uh‖1,h and ‖b−bh‖H(curl;Ω). For the pressure, we also see that the rate for ‖p− ph‖L2(Ω) is approaching the optimal rate, although more slowly. Additionally, we observe the L2-norm of r is zero because g is divergence-free, in accordance with Proposition 3. In Figures 3.8–3.9 we show the slices and the vector fields of the solution computed on a uniform tetrahedral mesh of 24,576 elements; this results in a total of 212,577 degrees of freedom in the finite element space Vh×Ch×Qh×Sh. We observe that the computed and analytical solutions are in good agreement on this 69 relatively coarse mesh; see Figure 3.8. −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 0 0.05 0.1 0.15 0.2 0.25 y u (y, 0) FE Solution Analytical Solution −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 −3 −2 −1 0 1 2 3 x 10−6 y b(y ,0) FE Solution Analytical Solution (a) (b) Figure 3.8: Example 4. Slices along x = 5, −2 ≤ y ≤ 2, and z = 0: (a) velocity component u(y,0); (b) magnetic component b(y,0). 3.3.5 Example 5: two-dimensional driven cavity flow Let us consider a classic test problem used in fluid dynamics, known as driven- cavity flow. It is a model of the flow in a cavity with the lid moving in one direction; cf. [37, Chapter 5.1.3] and [77]. In this example, we consider the two-dimensional domain Ω = (−1,1)2 with ΓD = ∂Ω, and set the source terms to be zero. The boundary conditions are pre- scribed as follows: u = 0 on x =±1 and y =−1, u = (1,0) on y = 1, n×b = n×bD on ∂Ω, r = 0 on ∂Ω, where bD = (1,0). We set ν = 1e-2, νm = 1e5, κ = 1e5, which simulates liquid metal type flows. Figures 3.10–3.11 show the solution computed on a mesh with 8,192 elements 70 (a) (b) Figure 3.9: Example 4. Numerical approximations of (a) velocity; (b) nor- malized magnetic field. and 49,665 degrees of freedom. Figure 3.10(a) shows the circulation created by the moving lid; Figure 3.10(b) shows that the magnetic field changes direction due to the coupling effect. Figure 3.11(a) depicts the boundary layer formation in terms of the first component of the velocity. Streamlines for the velocity field are 71 displayed in Figure 3.11(b). The computed solution agrees with the solution in the literature [77]. −1 −0.5 0 0.5 1 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 −1 −0.5 0 0.5 1 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 (a) (b) Figure 3.10: Example 5. Numerical approximations of (a) velocity; (b) nor- malized magnetic field. 3.3.6 Example 6: three-dimensional driven cavity flow The problem we consider is the three-dimensional driven cavity flow in the domain Ω = (−1,1)3 with ΓD = ∂Ω. The source terms are set to be zero. The boundary conditions are prescribed as follows: u = 0 on x =±1, y =±1 and z =−1, u = (1,0,0) on z = 1, n×b = n×bD on ∂Ω, r = 0 on ∂Ω, where bD = (1,0,0). We set ν = 1e-2, νm = 1e5, κ = 1e5 and obtain Figure 3.12 on a uniform tetrahedral mesh comprising 24,576 elements; this results in a total of 212,577 degrees of freedom. The flow vectors on slices demonstrate a similar behavior to 72 −1 −0.5 0 0.5 1 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 −1 −0.5 0 0.5 1 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 (a) (b) Figure 3.11: Example 5. Numerical approximations of (a) contours of first velocity components; (b) streamlines of velocity. the two-dimensional scenario in Section 3.3.5; see Figure 3.10. 3.3.7 Example 7: two-dimensional MHD flow over a step The example we present here is another classical problem of a flow over a step under a transverse magnetic field; cf. [41]. The magnetic field tends to damp the vortex of the fluid after the step. The domain isΩ= (−0.25,0.75)×(−0.125,0.125)\(−0.25,0]×(−0.125,0], with ΓN = {(0.75,y) : y ∈ (−0.125,0.125)} and ΓD = ∂Ω\ΓN . We set f = g = 0, and choose ν = 1e-2, νm = 1e5, κ = 2.5e4. The boundary data are given by u = 0 on y =±0.125 and {(x,0) : x ∈ (−0.25,0)}, u = 0 on {(0,y): y ∈ (−0.125,0)}, u = (−25.6y(y−0.125),0) on x =−0.25, (pI−ν∇u)n = pNn on x = 0.75, n×b = n×bD on ∂Ω, r = 0 on ∂Ω, 73 (a) (b) Figure 3.12: Example 6. Numerical approximations of (a) velocity; (b) nor- malized magnetic field. where pN = 0 and bD = (0,1). Figures 3.13–3.14 show the solution computed on a mesh with 7,168 elements and 43,649 degrees of freedom. It is evident from Figure 3.13 that the flow field is correctly captured; the magnetic field changes directions due to the coupling effect; the pressure drops behind the step. Figure 3.14 shows the velocity field in terms of stream lines. The recirculation after the step decreases as the coupling coefficient κ increases. We observe that our numerical method reproduces this damping effect without any oscillation in the numerical solution. The computed solutions agree with the solutions in the literature [25, 41]. 74 (a) (b) (c) Figure 3.13: Example 7. Numerical approximations of (a) velocity; (b) nor- malized magnetic field; (c) pressure contours. −0.05 0 0.05 −0.12 −0.1 −0.08 −0.06 −0.04 −0.02 0 −0.05 0 0.05 −0.12 −0.1 −0.08 −0.06 −0.04 −0.02 0 (a) (b) Figure 3.14: Example 7. Velocity flow vectors and streamlines zoomed in behind the step for (a) κ=2.5e4; (b) κ=1e5. 75 Chapter 4 Solution of the discretized incompressible MHD equations In this chapter, we present preliminary results for preconditioning techniques for the discretized MHD equations in mixed form presented in Chapter 3. The MHD system includes the Maxwell sub-system (discussed in Chapter 2), the Navier- Stokes sub-system (mentioned in Chapter 1) and the coupling terms. A consid- erable effort has been made to develop efficient and robust solution algorithms for the Maxwell equations and the incompressible Navier-Stokes equations. We present a few preconditioning ideas for the discretized MHD equations, which are based on combining preconditioners for the sub-systems. We provide some eigen- value analysis for our approach and show that many of the eigenvalues are tightly clustered. Numerical results show that the resulting scheme is rapidly convergent, and is reasonably scalable. This chapter is structured as follows. First, we discuss related work in Sec- tion 4.1. We propose a few ideas of preconditioning in Section 4.2 and compare their performance in the numerical experiments of Section 4.3. 76 4.1 Background and related work To transform the Picard iteration (3.7) into matrix form, let 〈ϕ j〉n̂j=1 and 〈ζi〉m̂i=1 be finite element bases for the spaces Vh and Qh respectively: Vh = span〈ϕ j〉n̂j=1, Qh = span〈ζi〉m̂i=1. (4.1) Recall from (2.7) that 〈ψ j〉nj=1 and 〈φi〉mi=1 are finite element bases for the spaces Ch and Sh respectively. We identify finite element functions (ukh,b k h, p k h,r k h) ∈ Vh×Ch×Qh× Sh with their coefficient vectors u=(u1, . . . ,un̂)∈Rn̂, b=(b1, . . . ,bn)∈Rn, p=(p1, . . . , pm̂)∈ Rm̂, and r = (r1, . . . ,rm) ∈ Rm, with respect to the bases (2.7) and (4.1). The solu- tion of (3.7) can now be computed by solving the linear system K x≡ A+O BT CT 0 B 0 0 0 −C 0 M DT 0 0 D 0 u p b r = f 0 g 0 . (4.2) Here, A, B, C, D, M and O are matrices associated the forms Ah(ukh,v), B(u k h,q), C(bk−1h ,v,b k h), D(b k h,s), M(b k h,c) and Oh(u k−1 h ,u k h,v) defined in Section 3.2. Vec- tors f and g are associated with the forms (f,v)Ω and (g,c)Ω. The system (4.2) is in a generalized saddle point form. It is unsymmetric due to the convection term O (note also the skew-symmetric term C, which can be easily symmetrized by manipulating signs). We are interested in solving (4.2) iteratively using preconditioned GMRES. To the best of our knowledge, preconditioned ap- proaches for solving (4.2) have not been previously considered in the literature. We denote the coefficient matrix for the Navier-Stokes sub-problem by KN = ( F BT B 0 ) , (4.3) 77 where F = A+O, and the coefficient matrix for the Maxwell sub-system is KM = ( M DT D 0 ) . (4.4) We have already presented preconditioning techniques for the time-harmonic Maxwell equations in Chapter 2. Thus, we discuss some related work for precon- ditioning the Navier-Stokes system and hence focus on solution algorithms for the linear system associated with the coefficient matrix (4.3). Consider the LDU block factorization of the coefficient matrix, KN = ( I 0 BF−1 I )( F 0 0 −S )( I F−1BT 0 I ) , where S = BF−1BT is the Schur complement (of F in KN). We will investigate a preconditioning operator of the form M ≡ ( F 0 0 S ) . (4.5) The preconditioned linear system M−1KN has exactly three eigenvalues. Thus GMRES will converge in three iterations; see [75]. However, this idealized situ- ation is impractical, since the preconditioning operation requires the action of the inverse of S. Effective approximate block preconditioners are often based on a good approximation to S (e.g., [11, 12, 19, 37, 64]). Since the linear system is nonsymmetric, we could also consider a block trian- gular version of (4.5) M ≡ ( F BT 0 −MS ) , (4.6) where MS is an approximation to S. Krylov subspace iteration with the block trian- gular preconditioner requires approximately half as many as steps as with the block diagonal preconditioner; see [35]. A choice for MS is the pressure mass matrix K. If F only consists of the discrete Laplacian operator (i.e., the convection term O 78 is negligible), with the choice MS = 1ν K, the preconditioner M makes it possi- ble to obtain mesh-independent convergence rates; see [35, 66]. However, it does not take into account the effects of convection on the Schur complement operator S, and convergence rates deteriorate as the convection increases. Pressure convection- diffusion preconditioners are proposed in [64, 92]. These preconditioners are more faithful approximations to the Schur complement matrix. Numerous theoretical and numerical studies have demonstrated mesh independent convergence on sev- eral problems and the overall efficacy of this method. A drawback, however, is that it requires the convection-diffusion operation projected onto the discrete pressure space. As an alternative, the least-squares commutator preconditioner is proposed in [36]. This method automatically generates an approximation to S by solving the normal equation associated with a certain least-squares problem derived from the commutator. A comparison and summary of convergence results for the pressure convection-diffusion preconditioner and the least-squares commutator precondi- tioner is given in [78]. In this chapter, we investigate efficient preconditioners for (4.2). We consider a few different approaches and propose a preconditioner combining the least-squares commutator preconditioner for the Navier-Stokes sub-system and the block diag- onal preconditioner in (2.12) for the Maxwell sub-system. We have some prelim- inary eigenvalue analysis for this approach. Our numerical results show that this approach performs quite well, though it is not independent of the mesh size. 4.2 The solver The MHD problem combines the Maxwell sub-system and the Navier-Stokes sub- system with coupling terms. We first discuss ad-hoc preconditioners for these sub- systems. Then, we propose a few ideas of combining them for the discretized MHD problem. 4.2.1 Some ideas for preconditioning For the Maxwell sub-system, the resulting linear system is associated with the coefficient matrix (4.4). Note this corresponds to the discretized time-harmonic Maxwell equations in (2.2) with εr = 1, µr = 1κνm and k = 0. We can apply the 79 preconditioner defined in (2.12), which is N ≡ ( M+X 0 0 L ) . (4.7) Here, L is the Laplacian on Sh defined as L = (Li, j)mi, j=1 ∈ Rm×m with Li, j = ∫ Ω ∇φ j ·∇φi dx, and X = (Xi, j)ni, j=1 ∈ Rn×n is the mass matrix on Ch, defined as Xi, j = ∫ Ω ψ j ·ψi dx. For the Navier-Stokes sub-system with the coefficient matrix (4.3), we apply the least-squares commutator preconditioner [36, 37]. This approach is based on the assumption that BF−1BT ≈ (BQ−1BT )(BQ−1FQ−1BT )−1(BQ−1BT ), where Q = (Qi, j)n̂i, j=1 ∈ Rn̂×n̂ is the mass matrix on Vh and is defined as Qi, j = ∫ Ω ϕ j ·ϕi dx. In the lowest order case, the mass matrix Q is diagonal; for higher order elements, it is block-diagonal. In the latter case, if necessary Q can be replaced by Q̂ = diag(Q) to enhance sparsity; see [37, Remark 6.4]. This would give the following approximation to the Schur complement: MS = (BQ̂−1BT )(BQ̂−1FQ̂−1BT )−1(BQ̂−1BT ). (4.8) The least-squares commutator preconditioner has the form in (4.6) with (4.8) as the (2,2) block. Applying this preconditioner involves the inverse M−1S = (BQ̂ −1BT )−1(BQ̂−1FQ̂−1BT )(BQ̂−1BT )−1, 80 and so, it involves two Poisson solves, for which many efficient solution methods are available. Next, we discuss three different approaches to combining the preconditioners for the sub-systems. (1) Combining preconditioners for the sub-systems without coupling. First we consider applying the ad-hoc preconditioners for the sub-systems directly. We treat the coupling term C explicitly in the Picard iteration in (3.7). We obtain the following Picard-type linearization Ah(ukh,v)+Oh(u k−1 h ,u k h,v)+B(v, p k h) = (f,v)Ω−C(bk−1h ,v,bk−1h ), B(ukh,q) = 0, M(bkh,c)+D(c,r k h) = (g,c)Ω+C(b k−1 h ,u k−1 h ,c), D(bkh,s) = 0. (4.9) The coefficient matrix associated with this iteration is KE = F BT 0 0 B 0 0 0 0 0 M DT 0 0 D 0 . (4.10) The coefficient matrix (4.10) is still unsymmetric, but the two sub-systems are decoupled. Let us refer to (4.10) as a 2× 2 block matrix. The (1, 1) block corresponds to the Navier-Stokes sub-system and the (2, 2) block corresponds to the Maxwell sub-system. We can apply GMRES preconditioned with (4.6) to the (1,1) block and apply MINRES preconditioned with (4.7) to the (2, 2) block. If we also treat the convection term O explicitly and move it to the right-hand- 81 side, we obtain the following Picard-type iteration Ah(ukh,v)+B(v, p k h) = (f,v)Ω−Oh(uk−1h ,uk−1h ,v)−C(bk−1h ,v,bk−1h ), B(ukh,q) = 0, M(bkh,c)+D(c,r k h) = (g,c)Ω+C(b k−1 h ,u k−1 h ,c), D(bkh,s) = 0. (4.11) The coefficient matrix is symmetric: KS = A BT 0 0 B 0 0 0 0 0 M DT 0 0 D 0 . We can use a symmetric solve. Similar to the previous case, the system is decoupled. We now apply MINRES to both of the sub-systems. For the Navier- Stokes sub-system, we use the preconditioner( A 0 0 MS ) . For the Maxwell sub-system, we apply MINRES preconditioned with (4.7). We expect to see fast convergence for the last two approaches. However, be- cause the convection and coupling terms are treated explicitly, the Picard-type linearization may require more iterations to converge or may not converge, if the convection and/or coupling is strong. (2) Combining preconditioners for the sub-systems with coupling. In order to de- velop a more faithful approximation to the discretized MHD system in (4.2), we need to incorporate effects of the coupling terms in the preconditioner. We 82 consider the following preconditioner PC = F BT CT 0 0 −MS 0 0 −C 0 M+X 0 0 0 0 L . (4.12) We expect this preconditioner to perform better when the coupling and/or con- vection is strong. In order to apply the preconditioning approach, we need to solve linear systems associated withPC. We propose to use GMRES with the following inner preconditioner PI = F BT 0 0 0 −MS 0 0 0 0 M+X 0 0 0 0 L . (4.13) We provide some preliminary eigenvalue analysis for this preconditioning ap- proach in Section 4.2.2. (3) Applying the least-squares preconditioner to the entire MHD system. The sys- tem (4.2) can be written as F CT BT 0 −C M 0 DT B 0 0 0 0 D 0 0 u b p r = f g 0 0 . (4.14) Rewrite the coefficient matrix in (4.14) as a 2×2 block matrix as the following( F BT B 0 ) , (4.15) where F = ( F CT −C M ) , 83 and B = ( B 0 0 D ) . Note that (4.15) has the same form as (4.3). Our finite element formulation is inf-sup stable; see [44, Section 3.3]. The operator F has a component F , which is a convection-diffusion type operator and the size of F is bigger than M. We assume F preserves some properties of the convection-diffusion operator. We therefore consider applying the least-squares commutator preconditioner directly to the entire discretized MHD system. The preconditioner is PA = ( F B 0 −MS ) . HereMS is defined as MS = (BQ̂ −1BT )−1(BQ̂−1FQ̂−1BT )(BQ̂−1BT )−1, where Q̂ = diag ( Q 0 0 X ) . For the three approaches discussed above, if the coupling and convection are not very strong, we can treat them explicitly. Then the Navier-Stokes and Maxwell sub-systems are decoupled, and we apply the ad-hoc preconditioners for them sep- arately. If the coupling and convection are strong, we applyPC as a preconditioner for GMRES. As later shown in the numerical examples, the approach is not com- pletely independent of the mesh size, but shows reasonably good performance. For the approach of applying the least-squares commutator preconditioner directly to the discretized MHD system, it is not as good as the other approaches. 4.2.2 Preliminary eigenvalue analysis Eigenvalue analysis of the Maxwell system is given in Theorem 2. It is generally difficult to derive convergence analysis for nonsymmetric systems. As far as we know, there is no complete analysis of the convergence characteristics of the least- 84 squares preconditioner. Thus convergence analysis for PA is also a very difficult problem. In this section, we provide preliminary eigenvalue analysis for the precondi- tioner PC in (4.12). Note that the (3,3) block M +X is spectrally equivalent to M+DT L−1D; see [43, Theorem 3.3]. In the analysis, we consider an ideal version of (4.12) P̂C = F BT CT 0 0 −MS 0 0 −C 0 M+DT L−1D 0 0 0 0 L . Theorem 8 The matrix P̂−1C K has an eigenvalue λ = 1 with algebraic multiplic- ity of at least n+ n̂ and an eigenvalue λ =−1 with algebraic multiplicity of at least m. Proof The corresponding eigenvalue problem is F BT CT 0 B 0 0 0 −C 0 M DT 0 0 D 0 u p b r = λ F BT CT 0 0 −MS 0 0 −C 0 M+DT L−1D 0 0 0 0 L u p b r . The four block rows can be written as (1−λ )(Fu+BT p+CT b) = 0, (4.16) Bu = −λMS p, (4.17) (λ −1)Cu+(1−λ )Mb−λDT L−1Db+DT r = 0, (4.18) Db = λLr. (4.19) If λ = 1, (4.16) is satisfied automatically. Equation (4.17) shows p =−M−1S Bu. Equation (4.19) leads to r = L−1Db. 85 If this holds, (4.18) is satisfied. Therefore, (u,−M−1S Bu,b,L−1Db) is an eigenvector. There exist n̂ linearly independent such u’s and n linearly independent such b’s. There are at least n̂+n linearly independent non-zero vectors satisfying the eigenvalue problem when λ = 1. Therefore λ = 1 is an eigenvalue with algebraic multiplicity of at least n+ n̂. If λ =−1, (4.19) leads to r =−L−1Db. Substituting it into (4.18), we obtain Cu = Mb. If b is a discrete gradient in Ch, Mb= 0 and CT b= 0. If we take u= 0, then Cu= 0. The requirement Cu = Mb is satisfied. If u = 0 and b is a discrete gradient, (4.16) becomes BT p = 0. Since B has full row rank, this implies p = 0. Therefore, if b is a discrete gradient, (0,0,b,−L−1Db) is an eigenvector corre- sponding to λ =−1. There are m discrete gradients in Ch. Therefore λ =−1 is an eigenvalue with at least algebraic multiplicity m. Remark 9 In our experiments, we have observed the eigenvalue λ = 1 has alge- braic multiplicity of exactly n+ n̂ and the eigenvalue λ = −1 has algebraic mul- tiplicity of exactly m. Proving this may be difficult due to complicated generalized eigenvalue problems that other eigenvalues satisfy. Theorem 8 shows that P̂−1C K has some tightly clustered eigenvalues and GM- RES should converge quickly. We expect a similar behavior forP−1C K . Next, we show some preliminary eigenvalue analysis for the inner GMRES iterations. Theorem 10 The matrix P−1I PC has an eigenvalue λ = 1 with algebraic multi- plicity of at least m̂+ n̂+3m−n. 86 Proof The corresponding eigenvalue problem is F BT CT 0 0 −MS 0 0 −C 0 M+X 0 0 0 0 L u p b r = λ F BT 0 0 0 −MS 0 0 0 0 M+X 0 0 0 0 L u p b r . The four block rows can be written as (1−λ )Fu+(1−λ )BT p+CT b = 0, (4.20) (1−λ )MS p = 0, (4.21) −Cu+(1−λ )(M+X)b = 0, (4.22) (1−λ )Lr = 0. (4.23) If λ = 1, (4.21) and (4.23) are satisfied automatically. Equation (4.20) becomes CT b = 0. If b is a discrete gradient, this is satisfied. Equation (4.22) becomes Cu = 0. If u is in the null space of C, we have Cu = 0. Therefore, (u, p,b,r) is an eigenvector associated with eigenvalue λ = 1, if u is in the null space of C and b is a discrete gradient. The matrix C is n× n̂, so we have rank(CT )+nullity(CT ) = n, (4.24) rank(C)+nullity(C) = n̂. (4.25) There are m distinct discrete gradients in Ch. The nullity of CT is at least m. Ac- cording to (4.24), the rank of CT is at most n−m. Matrix C has the same rank as CT . According to (4.25), the nullity of C is at least n̂− n+m. We have m̂ lin- early independent p in Qh and m linearly independent r in Sh. So, there are at least 87 m̂+ n̂+3m−n linearly independent eigenvectors. Therefore, the eigenvalue λ = 1 has algebraic multiplicity of at least m̂+ n̂+3m−n. Theorem 10 shows that many eigenvalues ofP−1I PC are tightly clustered. We expect GMRES to converge within a few iterations. 4.3 Numerical results In this section we present a series of numerical experiments. The primary pur- pose of our experiments is to compare the performance of various preconditioners presented in Section 4.2.1. We use inner-outer Krylov iterations for each Picard iteration. In all experiments, which are carried out using MATLAB, the relative residual of the outer iteration is set to 1e-6 and the relative residual of the inner iteration is set to 1e-8. The tolerance for the Picard iterations is chosen as 1e-5. For outer iterations, we show iteration counts in the last Picard iteration. For inner iterations, we show iterations counts for the last inner iteration. In the numerical experiments, we denote by (i), (ii), and (iii) the three Picard-type of linearization schemes, where (i) denotes the standard Picard linearization in (3.7), (ii) and (iii) denote the Picard-type linearizations in (4.9) and (4.11), respectively. 4.3.1 Example 1: two-dimensional problem with a smooth solution In this example, we apply our solvers for linear systems arising from Section 3.3.1. We assume κ = 1, νm = 1e4. Table 4.1 shows the number of iterations required for different Picard-type lin- earization schemes. The symbol nc indicates the linearization does not converge. We run the experiments with different values of ν . Note that ν only shows up in front of the diffusion operator A and reducing it, both the convection and the coupling get relatively stronger. Table 4.1 shows that when the convection and coupling get stronger, lineariza- tion (iii) in (4.11) does not always converge to the solution, while the other two schemes converge. Scheme (ii) in (4.9) behaves like the standard Picard lineariza- tion. The numbers of steps required to converge by these two schemes are the same in this example. 88 ν = 1 ν = 0.1 ν = 0.01 DOFs (i) (ii) (iii) (i) (ii) (iii) (i) (ii) (iii) 169 5 5 5 8 8 nc 13 13 nc 721 5 5 5 9 9 nc 15 15 nc 2,977 5 5 5 9 9 nc 15 15 nc 12,097 5 5 5 8 8 nc 14 14 nc Table 4.1: Example 1. Picard-type iteration counts for various values of ν . We denote by (i) the standard Picard linearization in (3.7), by (ii) and (iii) the Picard-type linearizations in (4.9) and (4.11), respectively. If we treat the coupling terms explicitly, the linear system is decoupled. We solve the Navier-Stokes and Maxwell sub-systems separately. Table 4.2 shows the iteration counts for the two sub-problems. We solve the Navier-Stokes sub-system with GMRES and use the least-squares commutator preconditioner. We denote by its1 the GMRES iterations. The Maxwell system is solved with MINRES with the block diagonal preconditioner in (4.7). We denote by its2 the MINRES iterations. When the mesh is refined, more GMRES iterations are required, while the number of MINRES iterations remains the same. This shows our preconditioner for the Maxwell sub-system is independent of the mesh size, but the least-squares com- mutator preconditioner is not (yet it is not very sensitive to a change of the mesh size). When the convection and coupling get stronger, typically more iterations are required. The fast convergence of the Maxwell problem can be attributed to the fact that the eigenvalues appear in two extremely tight clusters. ν = 1 ν = 0.1 ν = 0.01 DOFs its1 its2 its1 its2 its1 its2 169 28 2 27 2 24 2 721 51 2 48 2 57 2 2,977 58 2 58 2 85 2 12,097 66 2 65 2 99 2 Table 4.2: Example 1. Iteration counts for various values of ν . The coupling terms are treated explicitly. 89 When both the coupling and convection terms are treated explicitly, we show the number of MINRES iterations required for the two sub-problems. We only show results for ν = 1, because in the other cases, this linearization scheme does not converge. When the mesh is refined, we observe that the number of iteration counts for the Navier-Stokes problem increases, while that for the Maxwell system remains the same. ν = 1 ν = 0.1 ν = 0.01 DOFs its1 its2 its1 its2 its1 its2 169 38 2 nc nc nc nc 721 42 2 nc nc nc nc 2,977 44 2 nc nc nc nc 12,097 54 2 nc nc nc nc Table 4.3: Example 1. Iteration counts for various values of ν . The coupling and convection terms are treated explicitly. Table 4.4 shows the inner and outer GMRES iteration counts, using the outer preconditioner PC in (4.12) and the inner preconditioner PI in (4.13). Here, its denotes the number of outer GMRES iterations, and itsi denotes the number of in- ner GMRES iterations. When the mesh is refined, the outer GMRES solve requires more iterations and the iteration counts are comparable to the iteration counts for the Navier-Stokes sub-system in Table 4.2. The preconditionerPC is not indepen- dent of the mesh size, but it is not very sensitive to the change. When ν decreases, more outer iterations are required. The inner iterations show mesh-size indepen- dent behavior and they are not sensitive to a change of ν , either. The real eigenvalues of the preconditioned linear system P̂−1C K are shown in Figure 4.1. In this experiment, we take ν = 1 and use a mesh with 721 degrees of freedom, which corresponds to n = 176, m = 49, n̂ = 368 and m̂ = 128. Accord- ing to Theorem 8, we expect to see λ = 1 with algebraic multiplicity of at least n+ n̂ = 544 and λ =−1 with algebraic multiplicity of at least m = 49. In our ex- periment, we obtain λ = 1 with algebraic multiplicity of exactly 544, and λ =−1 with algebraic multiplicity of exactly 49. For the inner preconditioner, we expect to see λ = 1 with algebraic multiplicity of at least m̂+ n̂+3m−n= 467 of the pre- 90 ν = 1 ν = 0.1 ν = 0.01 DOFs its itsi its itsi its itsi 169 29 3 28 4 26 5 721 54 3 52 3 61 5 2,977 62 3 63 3 94 4 12,097 75 3 70 3 110 4 Table 4.4: Example 1. Inner and outer iteration counts for various values of ν , usingPC. conditioned linear system P−1I PC. Numerically, we obtain 469 real eigenvalues and they are all 1. Table 4.5 shows iteration counts for GMRES preconditioned with PA. The number of iteration counts grows up very quickly, as we refine the mesh. This preconditioning approach is more sensitive to a change of the mesh size, compared with the previous approaches. DOFs ν = 1 ν = 0.1 ν = 0.01 169 37 35 33 721 98 97 106 2,977 nc 283 310 12,097 nc nc nc Table 4.5: Example 1. Iteration counts for various values of ν , usingPA. 4.3.2 Example 2: two-dimensional problem with a singular solution In this example, we apply our solvers for linear systems arising from Section 3.3.2. We assume κ = 1, νm = 1e4. Table 4.6 shows the number of iterations required for different Picard-type lin- earization schemes. Again, we observe that scheme (iii), which is the Picard- type linearization in (4.11), converges more slowly than the other two linearization schemes. If the coupling and convection get stronger, this linearization does not converge. Scheme (ii) given in (4.9) shows a similar behavior to the standard Pi- card linearization. However, because the convection term is treated explicitly in 91 0 100 200 300 400 500 600 700 800 −10 0 10 20 30 40 50 60 70 Figure 4.1: Example 1. Plot of the eigenvalues of the preconditioned matrix P̂−1C K , DOFs = 721. this scheme, it will fail to converge if convection is strong. Indeed, for this exam- ple, taking ν = 0.01, νm = 1e4 and κ = 1e6, scheme (ii) fails to converge, while the standard Picard linearization still converges. Table 4.7 shows the iteration counts for the two sub-problems, when the cou- pling terms are treated explicitly. Here, we also find that the least-squares commu- tator preconditioner is not completely independent of the mesh size, but it is not very sensitive to a change of the mesh size. For the Maxwell sub-problem, our scheme is not sensitive to a change of the mesh size. Table 4.8 shows the inner and outer GMRES iteration counts, using the outer preconditionerPC in (4.12) and the inner preconditionerPI in (4.13). The outer 92 ν = 1 ν = 0.1 ν = 0.01 DOFs (i) (ii) (iii) (i) (ii) (iii) (i) (ii) (iii) 117 4 4 7 6 6 nc 8 8 nc 521 5 5 9 7 7 nc 10 10 nc 2,193 4 4 9 8 8 nc 10 10 nc 8,893 4 4 9 7 7 nc 11 11 nc Table 4.6: Example 2. Picard-type iteration counts for various values of ν . We denote by (i) the standard Picard linearization in (3.7), by (ii) and (iii) the Picard-type linearizations in (4.9) and (4.11), respectively. ν = 1 ν = 0.1 ν = 0.01 DOFs its1 its2 its1 its2 its1 its2 117 22 1 20 1 18 1 521 46 1 50 1 40 1 2,193 56 1 84 1 83 1 8,893 63 1 88 1 99 1 Table 4.7: Example 2. Iteration counts for various values of ν . The coupling terms are treated explicitly. iterations demonstrate a similar behavior to the iterations of the Navier-Stokes sub- system in Table 4.7. The inner iterations are not sensitive to a change of the mesh size and ν . ν = 1 ν = 0.1 ν = 0.01 DOFs its itsi its itsi its itsi 117 22 3 20 3 18 3 521 46 3 50 3 41 3 2,193 56 3 84 3 89 3 8,893 68 3 106 3 138 3 Table 4.8: Example 2. Inner and outer iteration counts for various values of ν , usingPC. Table 4.9 shows iteration counts for GMRES preconditioned with PA. Here, we also find that the iteration does not scale very well when we refine the mesh. 93 DOFs ν = 1 ν = 0.1 ν = 0.01 117 27 25 23 521 79 83 73 2,193 217 245 244 8,893 nc nc nc Table 4.9: Example 2. Iteration counts for various values of ν , usingPA. 4.3.3 Discussion We have considered three different approaches to combining the preconditioners for the sub-systems. We choose the preconditioner based on how strong the con- vection and coupling are. If they are not very strong, we take the first approach, where preconditioners for the sub-systems are combined together directly without coupling. If the coupling terms are small, we can treat them explicitly. As shown in Tables 4.1 and 4.6, the linearization scheme (ii) requires the same iterations as the standard Picard linearization (i). If both of the coupling and convection terms are negligible, we can treat both of them explicitly as in scheme (iii). When the mesh is refined, schemes (ii) and (iii) are not completely mesh-independent, but they are not very sensitive to a change of the mesh size. However, if the coupling and convection are strong, these schemes will converge more slowly compared with the standard Picard scheme, or may not converge. We propose to combine the pre- conditioners for the sub-systems with the coupling terms as in PC. As shown in Tables 4.4 and 4.8, this approach performs qualitatively similar to the least-squares commutator preconditioner for the Navier-Stoke sub-problem shown in Tables 4.2 and 4.7. The third approach, where the least-squares commutator is applied to the entire discretized MHD system, does not work as well as the other two approaches, as shown in Tables 4.5 and 4.9. 94 Chapter 5 Conclusions and future work 5.1 Conclusions In this thesis, we have studied the numerical solutions of the time-harmonic Maxwell equations and incompressible MHD equations. For the Maxwell equations, we have proposed and analyzed an efficient solver based on inner-outer iterations. We have demonstrated very good scalability in our numerical experiments. For the MHD equations, we have designed a new finite element discretization. This dis- cretization yields exactly divergence-free velocity approximations and captures the strongest magnetic singularities. We have also investigated preconditioning tech- niques for the discretized MHD equations. We have proposed a preconditioner based on efficient preconditioners of the Maxwell and Navier-Stokes sub-systems. Our numerical results clearly indicate optimal convergence rates in two and three dimensions and show that our preconditioning approach performs reasonably well. In the following subsections, we summarize the results in the thesis and draw some conclusions. 5.1.1 The Maxwell problem We have developed and implemented a fully scalable parallel iterative solver for the time-harmonic Maxwell equations with variable coefficients, on unstructured com- plicated domains with heterogeneous coefficients. Our code for preconditioned iterative solvers is specialized, dealing with the specific features of the discretiza- 95 tion, such as the use of edge elements. We use state-of-the-art software packages (PETSc [7], BoomerAMG [51], Hypre [38], and METIS [62]) to optimize the per- formance of our solvers. We have also developed our own mesher for structured meshes, and we use TetGen [91] for unstructured and locally refined meshes. The preconditioned solvers are based on inner-outer iterations. For outer it- erations we extend and generalize [43] to the variable coefficient case. The inner iterations are based on the auxiliary spaces technique developed in [56]. We have shown that the preconditioned matrix is well conditioned and its eigenvalues are tightly clustered; this is key to the effectiveness of the proposed approach. As our extensive numerical experiments indicate, the inner and outer iterations are scalable in terms of iteration counts and computation times, and the solver is minimally sensitive to changes in the mesh size. Highly varying coef- ficients result in higher iteration counts, but those barely change as the mesh is refined. 5.1.2 The MHD problem We have introduced a new mixed finite element method for the numerical dis- cretization of a stationary incompressible magnetohydrodynamics problem, with divergence-conforming BDM elements and curl-conforming Nédélec elements for the velocity and magnetic fields, respectively. The approximation of the velocity field is exactly mass conservative. The discrete formulation is well-posed under a standard small data assumption, and the approximations are convergent under minimal regularity assumptions. We have presented an extensive set of numerical tests in Section 3.3. The numerical experiments show optimal convergence in two and three dimensions, and indicate that the constant Cd in Theorem 7 stays bounded in all cases, even though this is not guaranteed by the analysis. Altogether, the computed results are in excellent agreement with results in the literature, and the method correctly resolves the strongest magnetic singularities in non-convex domains. Based on the theoretical results in [82], we expect the same good performance of our discretization and solution techniques to carry over to the dynamic problem, provided that the nonlinear terms are treated (semi)implicitly. We also mention the 96 issue of higher order elements. Here, too, we do not expect any deviation from our current computational results. In particular, we expect to see optimal convergence rates for smooth solutions. We have done a preliminary study in efficient solvers for the discretized MHD equations based on Krylov methods. We have investigated a few precondition- ing ideas. If the coupling and convection are not very strong, we can treat them explicitly. Then the Navier-Stokes and Maxwell sub-systems are decoupled, and we apply preconditioners for them separately. If the coupling and convection are strong, we have developed a preconditioner PC for GMRES based on combining the preconditioners for the sub-systems and the coupling terms. We have shown that many of the eigenvalues are well-clustered. We have also investigated a pre- conditioner for the inner iterations. Our numerical results show that this approach scales reasonably well, though is not fully scalable. 5.2 Future work The scope of our work can be broadened in a number of directions: 1. Our results for the solution of the Maxwell equations apply to low wave num- bers; dealing with high wave numbers is a primary challenge of much interest. If the wave number is high, the inf-sup constant for our mixed formulation is large [32], which results in numerical difficulties. Therefore, solving problems with high wave numbers exhibits several complications that need to be dealt with using a possibly different methodology. 2. In our parallel solver for the Maxwell equations, the meshing part is sequen- tial and each processor stores the entire mesh. A dense mesh may require high memory resources. Integrating our solver with parallel distributed mesh genera- tors could relieve the memory consumption and improve the performance of our solvers. Parallel mesh generation methods require the partitioning of the mesh in a way that accommodates load balancing. However, this goal may not always accommodate other important factors, such as the quality of the mesh. As a re- sult, the overall computational time may increase. Optimizing these issues is a challenging and involved task. 97 3. Our preconditioners for the MHD equations perform reasonably well, but are not fully scalable. Deriving effective and scalable preconditioners remains an item for future research. While there are efficient solution techniques for the Navier-Stokes equations as well as for the curl-curl operator, the primary chal- lenge here is how to deal with the skew-symmetric coupling terms, especially when coupling is strong. 4. While for the discretized Maxwell problem we have a parallelized and efficient code, we do not have such a code yet for the discretized MHD equations. When the mesh is refined, the size of the resulting linear system increases very quickly and efficient parallel implementations are crucial for obtaining a solution within a reasonable computational time. 5. Another item for future work is the derivation of a non-linear solver for the MHD equations, which potentially converges more rapidly than the Picard it- eration used in our experiments. As we have pointed out in Remark 4 of Sec- tion 3.2.2, developing the Newton iteration for our discretization is somewhat delicate and is the subject of ongoing investigation. 6. For the preconditioning techniques for the discretized MHD problems, we pro- vide some preliminary eigenvalue analysis. The complete eigenvalue analysis is very challenging, because it involves complicated generalized eigenvalue prob- lems including the coupling terms. A complete analysis may require a better understanding of the differential properties of the coupling terms. 98 Bibliography [1] P. R. Amestoy, I. S. Duff, J.-Y. L’Excellent, and J. Koster. A fully asynchronous multifrontal solver using distributed dynamic scheduling. SIAM Journal on Matrix Analysis and Applications, 23(1):15–41, 2001. [2] C. Amrouche, C. Bernardi, M. Dauge, and V. Girault. Vector potentials in three-dimensional non-smooth domains. Mathematical Methods in The Applied Sciences, 21:823–864, 1998. [3] F. Armero and J. Simo. Long-term dissipativity of time-stepping algorithms for an abstract evolution equation with applications to the incompressible MHD and Navier-Stokes equations. Computer Methods in Applied Mechanics and Engineering, 131:41–90, 1996. [4] D. Arnold. An interior penalty finite element method with discontinuous elements. SIAM Journal on Numerical Analysis, 19:742–760, 1982. [5] D. Arnold, F. Brezzi, B. Cockburn, and L. Marini. Unified analysis of discontinuous Galerkin methods for elliptic problems. SIAM Journal on Numerical Analysis, 39:1749–1779, 2001. [6] S. Balay, W. D. Gropp, L. C. McInnes, and B. F. Smith. Efficienct management of parallelism in object oriented numerical software libraries. In E. Arge, A. M. Bruaset, and H. P. Langtangen, editors, Modern Software Tools in Scientific Computing, pages 163–202. Birkhauser Press, 1997. [7] S. Balay, W. D. Gropp, L. C. McInnes, and B. F. Smith. PETSc users manual. Technical Report ANL-95/11 - Revision 2.3.2, Argonne National Laboratory, 2006. [8] S. Balay, K. Buschelman, W. Gropp, D. Kaushik, M. Knepley, L. McInnes, B. Smith, and H. Zhang. PETSc Web page. http://www.mcs.anl.gov/petsc, 2009. 99 [9] W. Bangerth, R. Hartmann, and G. Kanschat. deal.II Web page. http://www.dealii.org/, 2010. [10] M. Benzi. Preconditioning techniques for large linear systems: a survey. Journal of Computational Physics, 182(2):418–477, 2002. [11] M. Benzi and G. H. Golub. A preconditioner for generalized saddle point problems. SIAM Journal on Matrix Analysis and Applications, 26(1):20–41, 2005. [12] M. Benzi, G. H. Golub, and J. Liesen. Numerical solution of saddle point problems. Acta Numerica, 14:1–137, 2005. [13] P. Bochev, C. Garasi, J. Hu, A. Robinson, and R. Tuminaro. An improved algebraic multigrid method for solving Maxwell’s equations. SIAM Journal on Scientific Computing, 25:623–642, 2003. [14] A. Bonito and J.-L. Guermond. Approximation of the eigenvalue problem for the time harmonic Maxwell system by continuous Lagrange finite elements. Technical Report 2009-121, Institute for Applied Mathematics and Computational Science, Texas A&M University, 2009. [15] F. Brezzi and M. Fortin. Mixed and Hybrid Finite Element Methods. In Springer Series in Computational Mathematics, volume 15. Springer-Verlag, New York, 1991. [16] P. Castillo, R. Rieben, and D. White. FEMSTER: An object-oriented class library of high-order discrete differential forms. ACM Transactions on Mathematical Software, 31:425–457, 2005. [17] Z. Chen, Q. Du, and J. Zou. Finite element methods with matching and nonmatching meshes for Maxwell equations with discontinuous coefficients. SIAM Journal on Numerical Analysis, 37(5):1542–1570, 2000. [18] A. Chorin and J. Marsden. A Mathematical Introduction to Fluid Mechanics. Springer-Verlag, 1990. [19] E. Chow and Y. Saad. Approximate inverse techniques for block-partitioned matrices. SIAM Journal on Scientific Computing, 18(6):1657–1675, 1997. [20] B. Cockburn and C.-W. Shu. Runge-Kutta discontinuous Galerkin methods for convection-dominated problems. Journal of Scientific Computing, 16: 173–261, 2001. 100 [21] B. Cockburn, G. Karniadakis, and C.-W. Shu, editors. Discontinuous Galerkin Methods. Theory, Computation and Applications, volume 11 of Lecture Notes in Computational Science and Engineering. Springer-Verlag, 2000. [22] B. Cockburn, G. Kanschat, and D. Schötzau. Local discontinuous Galerkin methods for the Oseen equations. Mathematics of Computation, 73: 569–593, 2004. [23] B. Cockburn, G. Kanschat, and D. Schötzau. A locally conservative LDG method for the incompressible Navier-Stokes equations. Mathematics of Computation, 74:1067–1095, 2005. [24] B. Cockburn, G. Kanschat, and D. Schötzau. A note on discontinuous Galerkin divergence-free solutions of the Navier-Stokes equations. Journal of Scientific Computing, 31:61–73, 2007. [25] R. Codina and N. Silva. Stabilized finite element approximation of the stationary magneto-hydrodynamics equations. Computational Mechanics, 38:344–355, 2006. [26] M. Costabel and M. Dauge. Singularities of electromagnetic fields in polyhedral domains. Archive for Rational Mechanics and Analysis, 151: 221–276, 2000. [27] M. Costabel and M. Dauge. Weighted regularization of Maxwell equations in polyhedral domains. Numerische Mathematik, 93:239–277, 2002. [28] T. G. Cowling. Magnetohydrodynamics. Adam Hilger, England, 1976. [29] M. Dauge. Stationary Stokes and Navier-Stokes systems on two- or three-dimensional domains with corners. Part I: Linearized equations. SIAM Journal on Mathematical Analysis, 20:74–97, 1989. [30] P. A. Davidson. An Introduction to Magnetohydrodynamics. Cambridge University Press, 2001. [31] C. Dawson. Foreword for the special issue on discontinuous Galerkin methods. Computer Methods in Applied Mechanics and Engineering, 195:3, 2006. [32] L. Demkowicz and L. Vardapetyan. Modeling of electromagnetic absorption/scattering problems using hp–adaptive finite elements. Computer Methods in Applied Mechanics and Engineering, 152:103–124, 1998. 101 [33] J. W. Demmel. Applied Numerical Linear Algebra. SIAM, 1997. [34] M. Discacciati. Numerical approximation of a steady MHD problem. In U. Langer, M. Discacciati, D. E. Keys, O. B. Widlund, and W. Zulehner, editors, Domain Decomposition Methods in Science and Engineering XVII, volume 60 of Lecture Notes in Computational Science and Engineering, pages 313–320. Springer-Verlag, 2008. [35] H. Elman and D. Silvester. Fast nonsymmetric iterations and preconditioning for Navier-Stokes equations. SIAM Journal on Scientific Computing, 17(1):33–46, 1996. [36] H. Elman, V. E. Howle, J. Shadid, R. Shuttleworth, and R. Tuminaro. Block preconditioners based on approximate commutators. SIAM Journal on Scientific Computing, 27(5):1651–1668, 2006. [37] H. C. Elman, D. J. Silvester, and A. J. Wathen. Finite Elements and Fast Iterative Solvers: with Applications in Incompressible Fluid Dynamics. Numerical Mathematics and Scientific Computation. Oxford University Press, 2006. [38] R. Falgout, A. Cleary, J. Jones, E. Chow, V. Henson, C. Baldwin, P. Brown, P. Vassilevski, and U. M. Yang. Hypre Web page. http://acts.nersc.gov/hypre/, 2010. [39] M. Gee, C. Siefert, J. Hu, R. Tuminaro, and M. Sala. ML 5.0 smoothed aggregation user’s guide. Technical Report SAND2006-2649, Sandia National Laboratories, 2006. [40] J.-F. Gerbeau. Problèmes mathématiques et numériques posés par la modélisation de l’électrolyse de l’aluminium. PhD thesis, École Nationale des Ponts et Chaussées, 1998. [41] J.-F. Gerbeau. A stabilized finite element method for the incompressible magnetohydrodynamic equations. Numerische Mathematik, 87:83–111, 2000. [42] J.-F. Gerbeau, C. L. Bris, and T. Lelièvre. Mathematical Methods for the Magnetohydrodynamics of Liquid Metals. Numerical Mathematics and Scientific Computation. Oxford University Press, New York, 2006. [43] C. Greif and D. Schötzau. Preconditioners for the discretized time-harmonic Maxwell equations in mixed form. Numerical Linear Algebra with Applications, 14(4):281–297, 2007. 102 [44] C. Greif, D. Li, D. Schötzau, and X. Wei. A mixed finite element method with exactly divergence-free velocities for incompressible magnetohydrodynamics. Computer Methods in Applied Mechanics and Engineering, 199:2840–2855, 2010. [45] M. Griebel and P. Oswald. On the abstract theory of additive and multiplicative Schwarz algorithms. Numerische Mathematik, 70(2): 163–180, 1995. [46] W. Gropp, E. Lusk, D. Ashton, P. Balaji, D. Buntinas, R. Butler, A. Chan, J. Krishna, G. Mercier, R. Ross, R. Thakur, and B. Toonen. MPI Web page. http://www-unix.mcs.anl.gov/mpi/, 2007. [47] J.-L. Guermond and P. Minev. Mixed finite element approximation of an MHD problem involving conducting and insulating regions: the 3D case. Numerical Methods for Partial Differential Equations, 19:709–731, 2003. [48] J.-L. Guermond, R. Laguerre, J. Léorat, and C. Nore. Nonlinear magnetohydrodynamics in axisymmetric heterogeneous domains using a Fourier/finite element technique and an interior penalty method. Journal of Computational Physics, 228:2739–2757, 2009. [49] M. Gunzburger, A. Meir, and J. Peterson. On the existence and uniqueness and finite element approximation of solutions of the equations of stationary incompressible magnetohydrodynamics. Mathematics of Computation, 56: 523–563, 1991. [50] U. Hasler, A. Schneebeli, and D. Schötzau. Mixed finite element approximation of incompressible MHD problems based on weighted regularization. Applied Numerical Mathematics, 51:19–45, 2004. [51] V. E. Henson and U. M. Yang. BoomerAMG: a parallel algebraic multigrid solver and preconditioner. Applied Numerical Mathematics, 41:155–177, 2000. [52] M. A. Heroux, R. A. Bartlett, V. E. Howle, R. J. Hoekstra, J. J. Hu, T. G. Kolda, R. B. Lehoucq, K. R. Long, R. P. Pawlowski, E. T. Phipps, A. G. Salinger, H. K. Thornquist, R. S. Tuminaro, J. M. Willenbring, A. Williams, and K. S. Stanley. An overview of the Trilinos project. ACM Transactions on Mathematical Software, 31(3):397–423, 2005. [53] M. R. Hestenes and E. Stiefel. Methods of conjugate gradients for solving linear systems. Journal of Research of the National Bureau of Standards, 49 (6):409–436, 1952. 103 [54] R. Hiptmair. Multigrid method for Maxwell’s equations. SIAM Journal on Numerical Analysis, 36(1):204–225, 1998. [55] R. Hiptmair. Finite elements in computational electromagnetism. Acta Numerica, 11:237–339, 2002. [56] R. Hiptmair and J. Xu. Nodal auxiliary space preconditioning in H(curl) and H(div) spaces. SIAM Journal on Numerical Analysis, 45(6):2483–2509, 2007. [57] P. Houston, I. Perugia, and D. Schötzau. Mixed discontinuous Galerkin approximation of the Maxwell operator: Non-stabilized formulation. Journal of Scientific Computing, 22:315–346, 2005. [58] P. Houston, D. Schötzau, and X. Wei. A mixed DG method for linearized incompressible magnetohydrodynamics. Journal of Scientific Computing, 40:281–314, 2009. [59] J. Jackson. Classical Electrodynamics. Jon Wiley & Sons, Inc., 2nd edition, 1975. [60] J. Jin. The Finite Element Method in Electromagnetics. Jon Wiley & Sons, Inc., 1993. [61] J. Jones and B. Lee. A multigrid method for variable coefficient Maxwell’s equations. SIAM Journal on Scientific Computing, 27(5):1689–1708, 2006. [62] G. Karypis. METIS Web page. http://glaros.dtc.umn.edu/gkhome/views/metis/, 2010. [63] G. Karypis and V. Kumar. Multilevel k-way partitioning scheme for irregular graphs. Journal of Parallel and Distributed Computing, 48(1):96–129, 1998. [64] D. Kay, D. Loghin, and A. Wathen. A preconditioner for the steady-state Navier-Stokes equations. SIAM Journal on Scientific Computing, 24(1): 237–256, 2002. [65] B. S. Kirk, J. W. Peterson, R. H. Stogner, and G. F. Carey. libMesh: a C++ library for parallel adaptive mesh refinement/coarsening simulations. Engineering with Computers, 22(3):237–254, 2006. [66] A. Klawonn and G. Starke. Block triangular preconditioners for nonsymmetric saddle point problems: field-of-values analysis. Numerische Mathematik, 81:577–594, 1999. 104 [67] T. Kolev and P. Vassilevski. Parallel auxiliary space AMG for H(curl) problems. Journal of Computational Mathematics, 27:604–623, 2009. [68] P. LeSaint and P. Raviart. On a finite element method for solving the neutron transport equation. In C. de Boor, editor, Mathematical Aspects of Finite Elements in Partial Differential Equations, pages 89–123. Academic Press, New York, 1974. [69] D. Li. Parallel numerical solution of the time-harmonic Maxwell equations. Lecture Notes in Computer Science on High Performance Computing and Applications, 5938:224–229, 2010. [70] D. Li, C. Greif, and D. Schötzau. Parallel numerical solution of the time-harmonic Maxwell equations in mixed form. Numerical Linear Algebra with Applications, Submitted (19 pages), 2010. [71] X. S. Li and J. W. Demmel. SuperLU DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems. ACM Transactions on Mathematical Software, 29(2):110–140, 2003. [72] J. C. Maxwell. A treatise on electricity and magnetism. Clarendon Press, 1873. [73] P. Monk. Finite Element Methods for Maxwell’s Equations. Numerical Mathematics and Scientific Computation. Oxford University Press, New York, 2003. [74] U. Müller and L. Bühler. Magnetofluiddynamics in Channels and Containers. Springer-Verlag, Berlin, 2001. [75] M. F. Murphy, G. H. Golub, and A. J. Wathen. A note on preconditioning for indefinite linear systems. SIAM Journal on Scientific Computing, 21(6): 1969–1972, 1999. [76] J. C. Nédélec. Mixed finite elements in R3. Numerische Mathematik, 35: 315–341, 1980. [77] A. I. Nesliturk, S. H. Aydin, and M. Tezer-Sezgin. Two-level finite element method with a stabilizing subgrid for the incompressible MHD equations. International Journal or Numerical Methods in Fluids, 58:551–572, 2008. [78] M. A. Olshanskii and Y. V. Vassilevski. Pressure Schur complement preconditioners for the discrete Oseen problem. SIAM Journal on Scientific Computing, 29:2686–2704, 2007. 105 [79] C. Paige and M. Saunders. Solution of sparse indefinite systems of linear equations. SIAM Journal on Numerical Analysis, 12:617–629, 1975. [80] I. Perugia, D. Schötzau, and P. Monk. Stabilized interior penalty methods for the time-harmonic Maxwell equations. Computer Methods in Applied Mechanics and Engineering, 191:4675–4697, 2002. [81] W. P. Petersen and P. Arbenz. Introduction to Parallel Computing: [A Practial Guide with Examples in C]. Oxford University Press, 2004. [82] A. Prohl. Convergent finite element discretizations of the nonstationary incompressible magnetohydrodynamic system. Mathematical Modelling and Numerical Analysis, 42:1065–1087, 2008. [83] S. Reitzinger and J. Schöberl. An algebraic multigrid method for finite element discretizations with edge elements. Numerical Linear Algebra with Applications, 9(3):223–238, 2002. [84] P. H. Roberts. An Introduction to Magnetohydrodynamics. Longmans, London, 1967. [85] Y. Saad. Iterative methods for sparse linear systems. SIAM, 2003. [86] Y. Saad and M. H. Schultz. GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM Journal on Scientific and Statistical Computing, 7(3):856–869, 1986. [87] O. Schenk and K. Gärtner. PARDISO Web page. http://www.pardiso-project.org, 2010. [88] J. Schöberl. NGSolve Web page. http://www.hpfem.jku.at/ngsolve/index.html, 2010. [89] D. Schötzau. Mixed finite element methods for incompressible magneto-hydrodynamics. Numerische Mathematik, 96:771–800, 2004. [90] L. Scott, T. Clark, and B. Bagheri. Scientific Parallel Computing. Princeton University Press, 2005. [91] H. Si. TetGen Web page. http://tetgen.berlios.de/, 2010. [92] D. Silvester, H. Elman, D. Kay, and A. Wathen. Efficient preconditioning of the linearized Navier-Stokes equations for incompressible flow. Journal of Computational and Applied Mathematics, 128(1-2):261–279, 2001. 106 [93] L. N. Trefethen and D. Bau. Numerical Linear Algebra. SIAM, 1997. [94] L. Vardapetyan and L. Demkowicz. hp-adaptive finite elements in electromagnetics. Computer Methods in Applied Mechanics and Engineering, 169:331–344, 2 1999. [95] X. Wei. Finite element methods for incompressible magnetohydrodynamics. PhD thesis, The University of British Columbia. To appear. [96] J. Xu. The auxiliary space method and optimal multigrid preconditioning techniques for unstructured grids. Computing, 56(3):215–235, 1996. 107
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Numerical solution of the time-harmonic Maxwell equations...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Numerical solution of the time-harmonic Maxwell equations and incompressible magnetohydrodynamics problems Li, Dan 2010
pdf
Page Metadata
Item Metadata
Title | Numerical solution of the time-harmonic Maxwell equations and incompressible magnetohydrodynamics problems |
Creator |
Li, Dan |
Publisher | University of British Columbia |
Date Issued | 2010 |
Description | The goal of this thesis is to develop efficient numerical solvers for the time-harmonic Maxwell equations and for incompressible magnetohydrodynamics problems. The thesis consists of three components. In the first part, we present a fully scalable parallel iterative solver for the time-harmonic Maxwell equations in mixed form with small wave numbers. We use the lowest order Nedelec elements of the first kind for the approximation of the vector field and standard nodal elements for the Lagrange multiplier associated with the divergence constraint. The corresponding linear system has a saddle point form, with inner iterations solved by preconditioned conjugate gradients. We demonstrate the performance of our parallel solver on problems with constant and variable coefficients with up to approximately 40 million degrees of freedom. Our numerical results indicate very good scalability with the mesh size, on uniform, unstructured and locally refined meshes. In the second part, we introduce and analyze a mixed finite element method for the numerical discretization of a stationary incompressible magnetohydrodynamics problem, in two and three dimensions. The velocity field is discretized using divergence-conforming Brezzi-Douglas-Marini (BDM) elements and the magnetic field is approximated by curl-conforming Nedelec elements. Key features of the method are that it produces exactly divergence-free velocity approximations, and that it correctly captures the strongest magnetic singularities in non-convex polyhedral domains. We prove that the energy norm of the error is convergent in the mesh size in general Lipschitz polyhedra under minimal regularity assumptions, and derive nearly optimal a-priori error estimates for the two-dimensional case. We present a comprehensive set of numerical experiments, which indicate optimal convergence of the proposed method for two-dimensional as well as three-dimensional problems. Finally, in the third part we investigate preconditioned Krylov iterations for the discretized stationary incompressible magnetohydrodynamics problems. We propose a preconditioner based on efficient preconditioners for the Maxwell and Navier-Stokes sub-systems. We show that many of the eigenvalues of the preconditioned system are tightly clustered, and hence, rapid convergence is accomplished. Our numerical results show that this approach performs quite well. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2010-12-07 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0051989 |
URI | http://hdl.handle.net/2429/30314 |
Degree |
Doctor of Philosophy - PhD |
Program |
Computer Science |
Affiliation |
Science, Faculty of Computer Science, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2011-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |
Download
- Media
- 24-ubc_2011_spring_li_dan.pdf [ 1.91MB ]
- Metadata
- JSON: 24-1.0051989.json
- JSON-LD: 24-1.0051989-ld.json
- RDF/XML (Pretty): 24-1.0051989-rdf.xml
- RDF/JSON: 24-1.0051989-rdf.json
- Turtle: 24-1.0051989-turtle.txt
- N-Triples: 24-1.0051989-rdf-ntriples.txt
- Original Record: 24-1.0051989-source.json
- Full Text
- 24-1.0051989-fulltext.txt
- Citation
- 24-1.0051989.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.24.1-0051989/manifest