@prefix vivo: . @prefix edm: . @prefix ns0: . @prefix dcterms: . @prefix skos: . vivo:departmentOrSchool "Science, Faculty of"@en, "Earth and Ocean Sciences, Department of"@en ; edm:dataProvider "DSpace"@en ; ns0:identifierCitation "Thomas, Stephen J., Hacker, Joshua P., Smolarkiewicz, Piotr K., Stull, Roland B. 2003. Spectral Preconditioners for Nonhydrostatic Atmospheric Models. Monthly Weather Review. 131(10) 2464-2478."@en ; ns0:rightsCopyright "Stull, Roland B."@en ; dcterms:creator "Thomas, Stephen J."@en, "Hacker, Joshua P."@en, "Smolarkiewicz, Piotr K."@en, "Stull, Roland B."@en ; dcterms:issued "2011-04-11T20:03:50Z"@en, "2003-10"@en ; dcterms:description """The elliptic problems in semi-implicit nonhydrostatic atmospheric models are difficult. Typically, they are poorly conditioned, nonseparable, contain cross-derivative terms, and are often nonsymmetric. Here, the resulting linear system is solved using a preconditioned Krylov subspace method—the generalized conjugate residual (GCR) algorithm. A horizontal spectral preconditioner is developed as an alternative to a more standard and much simpler line Jacobi relaxation scheme. However, the efficacy of the spectral preconditioner requires neglecting the cross-derivative terms and homogenization (e.g., averaging) metric coefficients over the computational domain. Because such a compromise causes a substantial departure of the preconditioner from the original elliptic operator, it is not obvious a priori whether it leads to a competitive solver. The robustness of the proposed approach over a broad range of representative meteorological applications is evaluated, in the context of a three-time-level semi-implicit semi-Lagrangian all-scale weather-prediction/research model. Copyright 2003 American Meteorological Society (AMS). Permission to use figures, tables, and brief excerpts from this work in scientific and educational works is hereby granted provided that the source is acknowledged. Any use of material in this work that is determined to be “fair use” under Section 107 of the U.S. Copyright Act or that satisfies the conditions specified in Section 108 of the U.S. Copyright Act (17 USC §108, as revised by P.L. 94-553) does not require the AMS’s permission. Republication, systematic reproduction, posting in electronic form, such as on a web site or in a searchable database, or other uses of this material, except as exempted by the above statement, requires written permission or a license from the AMS. Additional details are provided in the AMS Copyright Policy, available on the AMS Web site located at (http://www.ametsoc.org/) or from the AMS at 617-227-2425 or copyright@ametsoc.org."""@en ; edm:aggregatedCHO "https://circle.library.ubc.ca/rest/handle/2429/33532?expand=metadata"@en ; skos:note "2464 VOLUME 131M O N T H L Y W E A T H E R R E V I E W q 2003 American Meteorological Society Spectral Preconditioners for Nonhydrostatic Atmospheric Models STEPHEN J. THOMAS, JOSHUA P. HACKER, AND PIOTR K. SMOLARKIEWICZ National Center for Atmospheric Research, Boulder, Colorado ROLAND B. STULL Department of Earth and Ocean Sciences, University of British Columbia, Vancouver, British Columbia, Canada (Manuscript received 23 October 2002, in final form 20 March 2003) ABSTRACT The elliptic problems in semi-implicit nonhydrostatic atmospheric models are difficult. Typically, they are poorly conditioned, nonseparable, contain cross-derivative terms, and are often nonsymmetric. Here, the resulting linear system is solved using a preconditioned Krylov subspace method—the generalized conjugate residual (GCR) algorithm. A horizontal spectral preconditioner is developed as an alternative to a more standard and much simpler line Jacobi relaxation scheme. However, the efficacy of the spectral preconditioner requires ne- glecting the cross-derivative terms and homogenization (e.g., averaging) metric coefficients over the compu- tational domain. Because such a compromise causes a substantial departure of the preconditioner from the original elliptic operator, it is not obvious a priori whether it leads to a competitive solver. The robustness of the proposed approach over a broad range of representative meteorological applications is evaluated, in the context of a three-time-level semi-implicit semi-Lagrangian all-scale weather-prediction/research model. 1. Introduction A semi-implicit time-stepping scheme (Robert 1969) is commonly applied to the terms responsible for fast waves in large-scale weather prediction and general cir- culation models. This removes the associated time step restrictions and makes computation practical. More re- cently, semi-implicit integrators have also been applied at mesoscale resolutions (Tapp and White 1976; Cullen 1990; Tanguay et al. 1990; Golding 1992; Yeh et al. 2002; Grabowski and Smolarkiewicz 2002) and con- vective scales (Robert 1993; Grabowski and Smolar- kiewicz 2002). The resulting elliptic partial differential equations are nontrivial. They are poorly conditioned, nonseparable, contain cross-derivative terms, and are typically nonsymmetric. This is due to domain anisot- ropy, effects of planetary rotation, ambient stratification, the use of general curvilinear coordinates in the gov- erning equations [e.g., the Gal-Chen and Somerville (1975) terrain-following transformation], or the impo- sition of partial-slip conditions along an irregular lower boundary. Among the most effective methods reported for solving such problems are the preconditioned non- symmetric conjugate-gradient-type (alias Krylov sub- space, hereafter Krylov) iterative schemes (e.g., Kapitza and Eppel 1992; Skamarock et al. 1997; Thomas et al. Corresponding author address: Dr. Stephen J. Thomas, NCAR/ SCD, P.O. Box 3000, Boulder, CO 80307-3000. E-mail: thomas@ucar.edu 1998, 2000). A number of alternative nonsymmetric Krylov solvers are used in computational research and engineering (Axelsson 1994; Greenbaum 1997). Al- though this paper focuses on a particular scheme, the issues discussed are relevant to other iterative methods as well. Our method of choice is the restarted gener- alized conjugate residual [GCR(k)] algorithm1 akin to the popular generalized minimum residual [GMRES(k)] solver (Eisenstat et al. 1983; Saad and Schultz 1986; Smolarkiewicz and Margolin 1994, 1997; Smolarkiew- icz et al. 1997).2 For the reader’s convenience, GCR(k) is summarized in appendix A, where we also introduce some terminology, notation, and a notion of precondi- tioning (left, as opposed to right) used throughout this paper. A brief discussion of line-relaxation precondi- tioners is included in appendix B. Designing a suitable preconditioner is important, as it can dramatically accelerate solver convergence, there- by reducing the overall computational expense of a model. In principle, the preconditioner P can be any linear operator such that LP21 is definite, where L sym- bolizes the original elliptic operator implied by the phys- ical problem at hand. The role of the preconditioner is 1 Throughout this paper GCR (4) is employed unless stated oth- erwise. 2 Other popular alternatives include the biconjugate gradients sta- bilized (BiCGStab; van der Vorst 1992) and transpose-free quasi- minimal residual (TFQMR; Freund 1993) algorithms. OCTOBER 2003 2465T H O M A S E T A L . to substitute the governing elliptic problem L(C) 2 Q 5 0 with an auxiliary problem P 21[L(C) 2 Q] 5 0 that converges faster (than the original problem) because of a closer clustering of the eigenvalues of the auxiliary elliptic operator P 21L, where C and Q symbolize the dependent variable and the rhs, respectively. For the preconditioner to be useful, the convergence of the aux- iliary problem must be sufficiently rapid to overcome the effort associated with ‘‘inverting’’ the preconditioner itself [i.e., computing P 21(·)]. In general, the closer the preconditioner approximates the original operator, the faster the solver converges but the more difficult it is to compute P 21(·). There is no general method for designing an optimal preconditioner (Axelsson 1994, section 7). Kadogliu and Mudrick (1992) argued that basic point-Jacobi re- laxation provides an effective preconditioner for GMRES(k) applied to atmospheric flow problems. How- ever, their conclusion is derived from solving the di- agnostic quasigeostrophic omega equation in a periodic channel—a fairly simple elliptic problem that does not reflect the full complexity of a multiscale atmosphere. In particular, it does not extrapolate to prognostic prob- lems when the shallowness of the earth’s atmosphere dictates condition numbers k(L) ; O(1010); recall that k(L) can be interpreted as the squared ratio of the lon- gest to the shortest wave present in the system. Because the asymptotic convergence rate of conjugate-gradient- type methods is proportional to k(L)21/2, a direct pre- conditioner in the vertical is the ‘‘categorical impera- tive’’ of an effective iterative solver for all-scale at- mospheric models (e.g., Marshall et al. 1997; Skamar- ock et al. 1997; Thomas et al. 1998; Smolarkiewicz et al. 2001). In this paper we take this preconditioning strategy in the vertical for granted and focus on the optimization of the horizontal component of P. Guided by the development of elliptic solvers for ear- ly anelastic models, we extend a fast direct precondi- tioning strategy to three spatial dimensions via hori- zontal spectral decomposition. This offers an alternative to alternating-direction-implicit (ADI) iterative methods (Skamarock et al. 1997), of which the line Jacobi pre- conditioner is a special case (appendix B). The use of the FFT approach has a long history in the design of elliptic solvers. Because FFTs are effective for separable problems, early elliptic solvers for meteorological mod- els often employed operator decomposition (into a sep- arable part and the residual) and a stationary block- iteration with the nonseparable residual lagging behind (cf. Schumann and Volkert 1987 and Schumann and Sweet 1988, and references therein). Bernardet (1995) investigated an alternative approach (conceptually the opposite) where the full operator is retained in a non- stationary Krylov iteration, but FFTs are used to invert the ‘‘flat’’ preconditioner obtained by neglecting orog- raphy. He used the classical, symmetric, conjugate-gra- dient solver of Hestenes and Stiefel (1952), thereby ne- cessitating careful modifications to the boundary con- ditions (in order to assure a self-adjoint elliptic opera- tor). His development addressed 2D mesoscale problems in the x–z plane. Because of the inherent dif- ficulties with extending this approach to 3D, Lafore et al. (1998) adopted an FFT-preconditioned Richardson iteration. Recently, Elman and O’Leary (1998) analyzed the performance of nonsymmetric Krylov solvers for indefinite 3D Helmholtz equations with Sommerfeld ra- diation boundary conditions and found that FFT meth- ods can result in effective preconditioners even on par- allel architectures. Because of the indefiniteness, they addressed a more difficult problem than those typically encountered in meteorological applications. On the oth- er hand, their Helmholtz operator contained the standard Laplacian on isotropic grids and had constant coeffi- cients. Both the coefficients and solutions were smooth. Although there exists a substantial body of evidence indicating the potential of FFT preconditioners, a sys- tematic study in the context of all-scale meteorological applications is needed. This is because modeling natural flows leads to elliptic problems characterized by highly inhomogeneous anistropic coefficients and full spectra forced on the rhs. We complement earlier work with a comprehensive study based on a fully compressible non- hydrostatic all-scale atmospheric model (Thomas et al. 1998). A semi-implicit time discretization leads to a nonsymmetric, but definite, Helmholtz problem. In or- der to design an effective FFT preconditioner, we derive a separable constant-coefficient approximation to the semi-implicit Helmholtz equation by dropping cross- derivative terms and averaging metric coefficients across the domain. Neumann boundary conditions are specified to close the problem, and thus a discrete cosine transform (DCT) is required to diagonalize the discrete horizontal Laplacian component of the preconditioner. Robustness is vital for the elliptic solver when a nu- merical model admits a broad range of scales of motion. We assess the relative efficiency of line relaxation versus DCT preconditioners in both linear and nonlinear flow regimes (prototypes for, respectively, laminar and tur- bulent flows) throughout the range O(101)–O(107) m of spatial scales. Our study encompasses the simulation of shallow thermal convection, small and large mesoscale mountain flows, and synoptic-scale winter storm pre- diction. We measure the performance of the two pre- conditioners by comparing the amount of work (the it- eration count and total CPU time of the model) required to achieve a specified convergence threshold. Because the residual error of the elliptic problem has a sense of Dt-normalized first-order spatial partial derivatives of velocity, the convergence criterion is based on the mag- nitude of the local Courant number and its variations. This specifies both the absolute admissible error and its magnitude compared to the flow. In effect, both nu- merical and physical error metrics are employed (cf. Smolarkiewicz et al. 1997). The paper is organized as follows: Section 2 sum- marizes the governing equations of the model along with 2466 VOLUME 131M O N T H L Y W E A T H E R R E V I E W their discretization and formulation of the elliptic prob- lem. Section 3 presents the derivation of a spectral pre- conditioner. Section 4 discusses the results of our sim- ulations. Remarks in section 5 conclude the paper. 2. Model description a. Analytic formulation To focus the discussion, consider the standard com- pressible Euler model for a dry, adiabatic, stratified, rotating fluid. Cast in a noninertial frame rotating with angular velocity V, it consists of mass, momentum, and entropy evolution laws, and the equation of state: Dr 5 2r= · v, Dt Dv 1 5 2 =p 2 g 2 2V 3 v, Dt r DQ 5 0, p 5 rRT, (1) Dt where D/Dt 5 ]/]t 1 v · = is the material-derivative operator, v the velocity vector, p thermodynamic pres- sure, r density, and g combines gravitational and cen- trifugal accelerations. Also, Q 5 T exp(2kq) is the potential temperature, where T denotes the temperature, k 5 R/cp (with R and cp being the gas constant and specific heat at constant pressure), and q 5 ln(p/p00), with p00 5 100 kPa.3 In order to isolate the fast acoustic-gravity modes (for the sake of implicit numerical treatment), the prognostic equations in (1) are decomposed by introducing a hor- izontally homogeneous, static reference atmosphere (hereafter denoted with subscript 0). Thermodynamic variables are decomposed into a sum of the base state and perturbations T 5 T 1 T9, q 5 q (z) 1 q9, (2)0 0 where ]q g0 5 2 . (3) ]z RT0 After grouping terms associated with ‘‘fast’’ linear waves on the lhs, while retaining the ‘‘slow’’ residual on the rhs and eliminating r using the gas law, the governing evolution equations take the form 1 Dq9 g 1 = · v 2 w 5 0, 2g Dt c0 3 Using the auxiliary ‘‘pressure’’ variable q, rather than the pressure p itself, facilitates isolating the 1/r=p nonlinearity in the numerical- model equations, analogous to q 5 lnps (surface pressure ps) in hy- drostatic primitive equation models (cf. Machenhauer and Daley 1972; Bourke 1974; Robert et al. (1985; Tanguay et al. 1990). The Exner function P 5 (p/p00)k is an alternative (Skamarock et al. 1997). Dv 1 RT =q9 2 Bk 5 2RT9=q9 2 f k 3 v,0Dt 2DT9 RT Dq9 N RT90 02 1 T w 5 2 = · v, (4)0Dt c Dt g cp y where B 5 gT9/T0 denotes the buoyancy, and f 5 2V sinf is the Coriolis parameter at latitude f. Also, g 5 cp/cy (with cy 5 cp 2 R denoting the specific heat at constant volume), and N0 and c0 are, respectively, the Brunt–Väisälä frequency and speed of sound: 2g 2 2N 5 , c 5 gRT . (5)0 0 0 c Tp 0 Equations (1) and (4), as written, are valid in a ro- tating Cartesian reference frame. To account for the cur- vature of the earth when simulating atmospheric flows and produce results consistent with standard weather maps, Eqs. (4) are written in terms of a Cartesian co- ordinate system projected onto the plane intersecting the sphere at reference latitude f0 (cf. Tanguay et al. 1990). The resulting equations take the form 1 Dq9 ]q g 1 D (U, V ) 1 2 w 5 0,H 2g Dt ]z c0 DU ]q9 ]S ]q9 1 RT 5 f V 2 K 2 RT9 ,0Dt ]X ]X ]X DV ]q9 ]S ]q9 1 RT 5 2 fU 2 K 2 RT9 ,0Dt ]Y ]Y ]Y Dw ]q9 ]q9 1 RT 2 B 5 2RT9 ,0Dt ]z ]z 2DT9 RT Dq9 N RT90 02 1 T w 5 2 D, (6)0Dt c Dt g cp y where the (X, Y) conformal coordinate-based form of D/Dt, horizontal divergence DH(U, V) of the wind image (U, V), pseudokinetic energy K, and metric coefficient S are all detailed in appendix C for the sake of com- pleteness. To account for the irregularity of the lower boundary, the system in (6) is transformed to the terrain- following curvilinear framework of Gal-Chen and Som- merville (1975). Optionally, variable spacing of vertical levels is allowed by composing the Gal-Chen transfor- mation with a smooth stretching (Thomas et al. 1998). Rather than complicating (6) with details of the com- position of these mappings, all metric terms associated with the vertical transformations are absorbed into the definitions of the partial derivatives (appendix C). b. Numerical approximations The semi-Lagrangian approach employed for ap- proximating (6) on a discrete mesh (Robert 1993) aims OCTOBER 2003 2467T H O M A S E T A L . at second-order accuracy in space and time.4 A staggered Arakawa C grid (Arakawa and Lamb 1977) is employed in the horizontal, and a staggered Tokioka B grid (To- kioka 1978) in the vertical. The associated spatial par- tial-derivative discrete operators are detailed in appen- dix D. In general, the governing equations—here, either (1), (4), or (6)—can be viewed symbolically as Dc 5 F , (7)cDt where c stands for any of the kinematic or thermody- namic dependent variables of the model, and Fc for the associated rhs. Then, semi-Lagrangian schemes can be interpreted as approximations to the trajectory integrals of the governing equations: c 5 c 1 F dt , (8)0 E c G where c and c0 denote, respectively, the values of c(X, t) at the two endpoints of the trajectory G connecting [X0(X, t1), t0] and (X, t1). For our study (of spectral preconditioners), we adopt the Mesoscale Compressible Community (MC2) model (Tanguay et al. 1990; Benoit et al. 1997; Laprise et al. 1997; Thomas et al. 1998, and references therein), which implements the semi-im- plicit semi-Lagrangian algorithm of Robert (1981). In the context of the integral representation, this scheme blends midpoint three-time-level and trapezoidal two- time-level quadrature rules. The midpoint and trape- zoidal approximations result, respectively, in explicit (leapfrog) and implicit (Crank–Nicholson) time dis- cretizations of the various forcing terms. In order to concisely present the resulting finite-difference equa- tions, we define the discrete operators n11 n21 n11 n21c 2 c c 1 ci 0 i 0D c 5 , m c 5 , (9)t t2Dt 2 where denotes the unknown (sought) value of cn11c i at the grid point (Xi, tn11), and is the value of cn21c 0 at the foot (X0, tn21) of the trajectory arriving at (Xi, tn11), evaluated using cubic interpolation from the neighboring grid points. With the definitions in (9), the compact discretized form of the equations (6) becomes (after some regrouping) g nD q9 1 gm D (U, V ) 1 g D 2 m m w 5 F ,t t H Z Z t q2[ ]c0 nD U 1 RT m D q9 5 F ,t 0 t X U nD V 1 RT m D q9 5 F ,t 0 t Y V nD w 1 RT m D q9 2 m B 5 F ,t 0 t Z t w 4 Numerical dissipative terms absent from the analytic model (1)— absorbers, filters, subgrid-scale turbulence parameterizations, etc.— are split and integrated to first order. 2RT N0 0 nD T9 2 D q9 1 T m w 5 F , (10)t t 0 t T c gp where spatial partial-derivative and average operators (DX, DY, DZ, and mZ) are defined in appendix D, and the forcings combine the Coriolis and slow-residualnF c terms that are interpolated at the trajectory midpoint, (X, tn), with X [ (Xi 1 X0)/2 [cf. Thomas et al. (1998) for a discussion of the trajectory scheme)]. c. Elliptic problem In order to derive the discrete Helmholtz problem that underlies the semi-implicit scheme in (10), all unknown t n11 terms are grouped on the lhs, and all known tn and t n21 terms are separated on the rhs. Denoting the rhs by Q and dropping all temporal superscripts (as there is no ambiguity at this point), the resulting linear system of algebraic equations (valid at each grid point) is written as follows: g q9 1 DtgD (U, V ) 1 Dtg D 2 m w 5 Q , (11)H Z Z q92[ ]c0 U 1 DtRT D q9 5 Q , (12)0 X U V 1 DtRT D q9 5 Q , (13)0 Y V w 1 DtRT D q9 2 DtB 5 Q , (14)0 Z w 2RT N0 0T9 2 q9 1 DtT w 5 Q . (15)0 T c gp To arrive at the implicit equation for q9, all remaining dependent variables are eliminated from (11)–(15). Combining (14) and (15) eliminates the buoyancy B and results in g 2 2[1 1 Dt N ]w 1 DtRT D 2 m q9 5 Q9 , (16)0 0 Z Z w[ ]c Tp 0 where gQ9 5 Q 1 Dt Q . (17)w w TT0 Next, taking the divergence of (12) and (13) and sub- stituting for DH(U, V) in (11), g 2 2[1 2 Dt c D (D , D )]q9 1 Dtg D 2 m w 5 Q9 ,0 H X Y Z Z q92[ ]c0 (18) where Q9 5 Q 2 DtgD (Q , Q ).q9 q9 H U V (19) Following the original notation of Tanguay et al. (1990), we define the operators, g(1)D 5 D 2 m andZ Z Z[ ]c Tp 0 2468 VOLUME 131M O N T H L Y W E A T H E R R E V I E W g(2)D 5 D 2 m , (20)Z Z Z2[ ]c0 and the coefficient N 5 (1 1 Dt2 ). Then, solving2N 0 directly for w in (16) and substituting the result in (18), we arrive at the discrete Helmholtz equation, 2 2 2 2 (2) 21 (1){[1 2 Dt c D (D , D )] 2 Dt c D N D }q9 5 Q*,0 H X Y 0 Z Z q9 (21) where the modified forcing on the rhs takes the form (2) 21Q* 5 Q9 2 DtD N Q9 .q9 q9 Z w (22) d. Boundary conditions Lateral boundaries are open with inflow/outflow de- termined by the normal components of the velocity. From the perspective of the Helmholtz problem in (21), normal velocities specified at the lateral boundaries im- ply Neumann boundary conditions on the pressure via (12) and (13). The model top and bottom boundaries are impermeable. At the model top ztop, the terrain-fol- lowing coordinate Z becomes flat [cf. (C7) in appendix C], whereupon the contravariant ‘‘vertical’’ velocity be- comes identical to the vertical component of the phys- ical velocity and they both vanish: ˙Z | 5 w | 5 0.z ztop top (23) At the lower boundary, the normal contravariant velocity vanishes (Z˙ | 0 5 0), implying that the physical vertical velocity w follows coordinate surfaces according to 13 23w | 5 GS(G U 1 G V) | .0 0 (24) The boundary conditions on q9 associated with (23) and (24) are derived by substituting the momentum equa- tions (12) and (13) into, respectively, (23) and (24) and requiring that the resulting relations be consistent with (16). Then and can be specified at Z 5 0 and(1)Q9 Dw Z Z 5 ztop.5 3. Spectral preconditioner The discrete Helmholtz equation (21) takes the form Lq9 5 Q*,q9 (25) referred to early in the introduction. The symbol L on the lhs of (25) represents the entire operator appearing between the brackets {·} on the lhs of (21). For the terrain-following transformation and conformal coor- dinates, the generalized Laplacian-like operator DH(DX, DY) embedded in L includes cross-derivative terms and is nonseparable. The discrete horizontal divergence and 3D gradient, (D7) and (D4) respectively, couple the hor- 5 The reader interested in implicit formulations of boundary con- ditions in the context of Krylov solvers is referred to Smolarkiewicz and Margolin (2000) for a discussion. izontal to the vertical operators through a complex blend of X–Z and Y–Z derivatives and averages. The first step in constructing an effective preconditioner P (for at- mospheric models) is to neglect these cross terms—to facilitate a direct inversion of P in the vertical (recall the discussion in the introduction). The resulting sim- plification leads to the preconditioned linear system of equations: Pe 5 r (26) or, equivalently, 2 2 2 2 2 (2) 21 (1){[1 2 Dt c S¹ ] 2 Dt c D N D }e 5 r, (27)0 H 0 Z Z where 5 dXX 1 dYY denotes the standard discrete2¹H horizontal Laplacian; e is the current iterate (of the Kry- lov solver) estimation of the solution error e 5 q9 2 9, with the overbar denoting the sought exact solution;q and r is the current iterate value of the residual error r 5 Lq9 2 (cf. appendix A). Because the operator onQ*q9 the lhs of (27) is a simple sum of the vertical and hor- izontal differential operators, iterative line-relaxation preconditioners (outlined in appendix B) can be easily applied at this point. Here, we advocate an alternate strategy, where a separable (in the horizontal) constant- coefficient approximation of the operator on the lhs of (27) is specified—thereby departing even further with P from the original operator L in (25)—enabling a fast direct inversion in the horizontal. The horizontal component of P in (27) is nonseparable because the metric coefficient S 5 S(X, Y) is a function of both X and Y.6 In general, a separable operator is obtained by a cross-directional homogenization of met- ric coefficients—for example, by averaging S that mul- tiplies the dXX and dYY parts of , respectively, in the2¹H Y and X directions—whereupon the tensor-product method (Lynch et al. 1964) can be applied to construct a direct solver. This requires computing separately the eigenvalues and eigenvectors of the matrices repre- senting the discrete operators Y dXX and X dYY, and di-S S agonalization follows by combining two unitary simi- larity transformations. Insofar as stand-alone solvers for separable problems are concerned, this approach re- quires O(nzn3) operations and is not competitive with ADI schemes [an O(nzn2 log2n) operation count] except when a fast transform [e.g., FFT with O(nzn2 logn) op- erations] is available (Lynch et al. 1964). Here, n 5 nx 5 ny refers to the number of mesh points in either hor- izontal direction, and nz is the number of points in the vertical. Because (27) is merely a preconditioner, S can be globally averaged or extremized (Bernardet 1995; Smolarkiewicz and Margolin 2000), thereby enabling application of the FFT. Here, we replace S in (27) with the global mean; flat topography is specified to assure 6 In models with X–Y cross differentiation [e.g., due to the coor- dinate mapping for the sake of mesh adaptivity (Smolarkiewicz and Prusa 2002)] either ADI or spectral preconditioners would require neglecting these cross terms. OCTOBER 2003 2469T H O M A S E T A L . constant coefficients in the vertical derivatives. The dis- crete operator , with homogeneous Neumann bound-2¹H ary conditions,7 is then readily diagonalized by the real part of the Fourier transform (via the discrete cosine transform; Sweet 1973). This results in nx 3 ny inde- pendent tridiagonal vertical problems: XY 2 2 2 2 (2) 21 (1){[1 2 Dt c S (l 1 l )] 2 Dt c D N D }ê 5 r̂,0 l m 0 Z Z (28) where 4 p(l 2 1) 2l 5 2 sin , l 5 1, . . . , n ,l x2 [ ]DX 2nx 4 p(m 2 1) 2l 5 2 sin , m 5 1, . . . , n ,m y2 [ ]DY 2ny are the eigenvalues of in the X and Y directions, and2¹H denotes Fourier transform.· 4. Numerical experiments A suite of model runs8 is designed to compare the performance of the DCT preconditioner against the line Jacobi preconditioner, proven effective in meteorolog- ical applications (Thomas et al. 1998, 2000). The rel- ative efficiency of the two preconditioners is measured with two Jacobi/DCT ratios [hereafter the performance ratios (PR)]: the ratio of total GCR solver iterations and the ratio of the total model CPU times. Both ratios are informative. For a given convergence criterion, the first one estimates the computational cost of the solver, in- dependent of the implementation and machine, whereas the second assesses the total work required for the ap- plication at hand. Values of PR . 1 show efficiency gains by the DCT preconditioner. Because the residual error of the Helmholtz problem in (21) combines Dt- normalized first-order spatial partial derivatives of the velocity components, it motivates the stopping criterion: \\Lq9 2 Q* \\ # « min(C, L).q9 (29) In (29), C and L represent the Courant \\Dtv/DX\\ and Lipschitz \\Dt(]v/]x)\\ numbers—correspondingly, the measures of the flow magnitude and its variations. Typ- ically, in order to control the stability/accuracy of the Eulerian and semi-Lagrangian computations, C ; L ; O(1), whereupon « specifies both the absolute admis- sible error and its magnitude compared to the flow. The test cases considered here are designed to eval- uate the robustness and efficiency of the DCT precon- ditioner over a wide range of meteorological applica- 7 Nonperiodic boundary conditions are always homogeneous in the preconditioner (Smolarkiewicz and Margolin 2000, section 3). 8 All calculations reported were performed with 64-bit real arith- metic. tions. The first test is very special; it probes the reflex- ivity of our preconditioned Krylov approach. In all prob- lems where the preconditioner is an exact solver, GCR should converge in one iteration to machine precision, regardless of the value of «. In the MC2 model, all small- to mesoscale problems with map scale factor m [ 1 in (C1) and flat lower boundary with h(X, Y) [ const in (C7) belong to this class, which constitutes a broad range of research applications including studies of fully developed turbulence and deep moist convection. For illustration, we consider a full-spectrum turbulent ther- mal convection simulation. The physical scenario is a buoyant thermal rising in a neutrally stratified environment—a 3D extension of the earlier experiment in Smolarkiewicz and Pudykiew- icz (1992). The grid employed consists of 100 3 100 3 150 points with constant interval DX 5 DY 5 DZ 5 10 m; Dt 5 2.5 s. Initially, a spherical temperature anomaly of radius r0 5 250 m and Q9 5 0.5 K is centered in the horizontal, r0 1 DZ above the lower boundary. Figure 1 shows potential temperature (Q) xz and xy cross sections through the center of the convec- tive bubble and at z 5 800 m, respectively; at the end of the integration, t 5 180Dt, by which time the flow becomes turbulent and exhibits a full range of scales (see Fig. 2). The solution in Figs. 1 and 2 is for the DCT precon- ditioner. It is independent of the stopping criterion in (29). This is documented in Fig. 3, which compares the rms divergence of the solution using the DCT and line Jacobi preconditioners for a range of «. Following Ska- marock et al. (1997), this diagnostic is used to measure the solution convergence in elastic models. For the Ja- cobi preconditioner, as « → 0 the curve flattens, indi- cating the critical value « 5 1023, below which the solution can be considered physically converged; that is, further solver iterations do not contribute to the over- all accuracy of the results. Because the DCT precon- ditioner is a direct solver of the elliptic problem at hand, convergence is reached in a single GCR iteration re- gardless of the specified value of «. The latter is not the case for the Jacobi preconditioner, which requires an increasing number of GCR iterations as « → 0. Figure 4 demonstrates this by displaying the PR for iteration count and total CPU time as a function of «. Here, the DCT-preconditioned solver converges in one iteration, and the PR iteration count is equivalent to the average number of GCR iterations (per model time step) using the Jacobi preconditioner. The PR CPU time documents that, even with the overhead associated with computing FFTs, the DCT preconditioner is at least an order of magnitude more efficient for all values of « representing the convergent solutions. The second test simulates a smooth, vertically prop- agating 3D mountain wave, identical to those in Saito et al. (1998) and Thomas et al. (2000). The resulting flow is characterized by small-amplitude gravity wave response at Froude number F [ U/Nh 5 4.4, where the 2470 VOLUME 131M O N T H L Y W E A T H E R R E V I E W FIG. 1. Potential temperature u (a) x–z and (b) x–y cross sections through the center of the convective bubble at t 5 450 s. The xy cross sections are at z 5 800 m, shown by the horizontal dashed line in (a). Contour interval is 0.1 K. FIG. 2. Spanwise (a) x-average and (b) y-average 1D power spectra at three height levels of the solution in Fig. 1. The thin solid line shows the ‘‘25/3’’ slope for reference. linear theory of Smith (1980) is qualitatively valid. To contrast with the buoyant thermal case, the linear flow regime is chosen because it emphasizes a low signal- to-noise ratio. Here, unlike in the previous case, the energy spectrum of a fully developed solution is com- pact, and the grid aspect ratio significantly departs from unity, reflecting the anisotropy of the problem. To quan- tify the effect of domain anisotropy (shallow versus deep flow scenarios) on solver performance, both hy- drostatic and nonhydrostatic variants are addressed [cf. Skamarock et al. (1997) for a discussion]. The param- eters of the experiment are as follows. The bell-shaped mountain, h0h(X, Y ) 5 , 2 2 3/2[1 1 (X /a) 1 (Y /a) ] with the half-width a 5 3DX 5 3DY and the height h0 5 100 m is centered at the (30, 30) grid point of the 61 3 61 horizontal mesh. In the vertical, 31 exponen- tially stretched levels extend to 19 km, resulting in ver- OCTOBER 2003 2471T H O M A S E T A L . FIG. 3. The rms divergence as a function of « in Eq. (29) for the convective bubble case shown in Fig. 1. Results with both the Jacobi (solid line) and the DCT (dashed line) preconditioners are shown. FIG. 4. Performance ratios for CPU and iteration count, as a function of «, for the convective bubble case shown in Fig. 1. tical grid increments ranging from 40 to 1180 m, re- spectively, for the bottom and top thermodynamic lay- ers. The model is integrated over 480 time steps, when the solution is considered to have reached an approxi- mate steady state. A constant inflow velocity of U 5 8 m s21 and an isothermal atmosphere with buoyancy fre- quency N 5 0.018 s21 are assumed. The Coriolis force and the curvature of the earth are neglected. No subgrid- scale parameterizations are employed. The Davies (1976) relaxation scheme is used at the lateral bound- aries to minimize wave reflections. In the hydrostatic case, the grid length DX 5 2000 m and time step Dt 5 60 s result in the advective Cour- ant number C 5 0.24. The horizontal scale of the moun- tain is a 5 6000 m, so aN/U 5 13.5. In the nonhydro- static case, DX 5 400 m and Dt 5 15 s, giving C 5 0.3. The horizontal scale decreases to a 5 1200 m, so that aN/U 5 2.7, whereupon nonhydrostatic effects are significant. For illustration, Fig. 5 displays the con- verged solution. Vertical velocity cross sections are shown through the vertical center plane and on the hor- izontal plane at z 5 2100 m (3/4 of the vertical wave- length) for both cases. The maximum vertical velocities at this level are 0.06 and 0.2 m s21 in the hydrostatic and nonhydrostatic cases, respectively. Note the down- stream tilt of the wave packet (manifested by the down- stream shift of the maxima) in the nonhydrostatic case. The corresponding energy spectra in the mean flow di- rection at z 5 1800 m are shown in Fig. 6. The iteration count of the Krylov solver using the DCT (solid line) and line Jacobi (dashed line) precon- ditioners is compared in Fig. 7 as a function of the convergence threshold « in (29). The rms divergence (thick solid line) shows the critical value of « 5 1025, below which the solution is considered physically con- verged.9 It is apparent that the number of iterations per 9 For the exact steady-state solution, the rms divergence should vanish. time step required for the line Jacobi preconditioner is about 5(10) times that required for the DCT scheme in the hydrostatic (nonhydrostatic) case. However, this does not imply that the efficiency of the DCT precon- ditioner follows the same pattern. The computational cost of a single GCR iteration with the DCT precon- ditioner is much higher than with line Jacobi, whereupon the CPU time performance ratio is about 1.5 for the hydrostatic result and somewhat over 2 for the non- hydrostatic solution (see Fig. 8). The trend of decreasing relative efficacy of the DCT preconditioner with in- creasing hydrostacy of the problem is consistent with that observed for ADI preconditioners (Skamarock et al. 1997). It is likely related to the propagation char- acteristics of gravity waves present in the system. Next, the complex Scandinavian orography is used for the lower boundary to probe the solver performance for natural mesoscale flow scenarios containing many scales of motion. For this case, DX 5 DY 5 10 km, and DZ 5 300 m, giving a grid aspect ratio of 0.01. The total horizontal domain size is 2000 km 3 2000 km (200 3 200 horizontal grid points), with the model lid placed at 15 km. A uniform northwesterly ambient flow with speed of 10 m s21 and buoyancy frequency N 5 0.01 s21 are assumed. Unlike in the previous cases, a b-plane approximation is employed, with the map scale factor m varying accordingly in (C1). The model is integrated over 24 h with 288 5-min time steps. The converged wind solution on the lowest momentum level (approximately 150 m above the ground) is shown in Fig. 9, for illustration. Based on experience gained from the earlier test cases, a safe value « 5 1026 is assumed to assure physical convergence. Performance-ratio val- ues are 4.04 and 5.46 for CPU time and iteration count, respectively, documenting much higher gains for the DCT preconditioner compared to the isolated mountain case discussed above. This is not surprising because compact spectra limit the number of significant eigen- modes present in the solution, thereby accelerating the convergence of Krylov solvers. For the sake of com- 2472 VOLUME 131M O N T H L Y W E A T H E R R E V I E W FIG. 5. Vertical velocity (left) x–z and (right) x–y cross sections through the mountain wave simulations at time step 480: (top) hydrostatic case and (bottom) nonhydrostatic case. The x–y cross sections are at z 5 2100 m. Contour intervals are 0.01 m s21 for the hydrostatic case and 0.04 m s21 for the nonhydrostatic case. pleteness, a 3D ADI preconditioner was also tested in this case. The number of GCR iterations required (for the given convergence threshold) is the same as with the DCT preconditioner, but the computational overhead is substantial, owing to multiple ADI iterations within the preconditioner itself. Finally, we test the performance of the DCT precon- ditioner on a synoptic-scale, weather-prediction prob- lem. An arbitrarily chosen forecast from 0000 UTC 16 September 2002 is run using initial and boundary con- ditions supplied by the National Centers for Environ- mental Prediction (NCEP) Eta Model. The MC2 model is integrated over 48 h with 480 6-min time steps. The computational domain is 200 3 160 3 19, with DX 5 DY 5 50 km on a polar stereographic projection true at 608N, and the vertically stretched grid ranges from DZ 5 72 m near the surface to 3800 m near the top at 23 km. The resulting condition number of the Helmholtz operator is extreme, O(1012), and the energy spectrum full. As in the Scandinavian case, « 5 1026 is assumed. The resulting performance ratios are 1.93 and 4.90 for CPU time and iteration count, respectively. The DCT preconditioner again requires fewer iterations, and the model integration rate is about twice as fast. This dem- onstrates that the DCT is far superior to line Jacobi, despite the large departure of the preconditioner from the full Helmholtz operator emphasized by a synoptic- scale forecast. To assess the relative merits of alternative Krylov methods, we also implemented the right-preconditioned flexible GMRES (FGMRES) algorithm of Saad (1993).10 For comparison, we ran the MC2 model using FGMRES(4), with DCT preconditioner, for the Scan- dinavian topography simulation and winter storm syn- optic forecast. In both cases, we found that FGMRES requires additional iterations because it tends to con- verge more slowly than GCR after a restart. The model run times were nearly identical, with FGMRES giving a slight advantage in the synoptic case. To compare 10 Computational complexity estimates for the various algorithms are provided by Saad (1996). OCTOBER 2003 2473T H O M A S E T A L . FIG. 6. One-dimensional power spectra through the centers of the right-hand panels in Fig. 5. Solid and dashed lines denote the hy- drostatic and nonhydrostatic case, respectively. FIG. 8. The CPU time performance ratio as a function of conver- gence criterion « for both the hydrostatic and nonhydrostatic mountain wave cases. FIG. 7. The rms divergence and average number of GCR iterations per time step for the (a) hydrostatic and (b) nonhydrostatic mountain wave cases. Results with both the Jacobi (JAC) and DCT preconditioner are shown. solutions, we examined the maximum Courant numbers (equivalent to a maximum norm) and residual L2 norm (least squares error) of the elliptic problem over the computational domain for the GCR and FGMRES solv- ers. In both cases, these agree to six digits, and thus we conclude that the solutions are the same. 5. Concluding remarks We have developed a spectral preconditioner for a generalized conjugate residual (GCR) Krylov solver ap- plied to the Helmholtz problem arising in the semi- implicit time discretization of a compressible nonhy- drostatic atmospheric model. Our approach offers an alternative to the more standard and much simpler line- relaxation, stationary block-iteration-type precondition- ers used in meteorology. Despite substantial departures of the spectral preconditioner from the governing Helm- holtz operator, we observed dramatic performance gains over these simpler schemes. The relative improvement in the model integration rate ranges from a factor of 2 to one order of magnitude, depending on the problem at hand. The relative performance gains are, of course, greatest when the DCT preconditioner happens to be a direct solver for the governing Helmholtz problem, that is, zero orography and constant mapscale factor character- istic of small-scale dynamics applications. In a meso- scale flow scenario over Scandinavian topography, with a broad range of scales of motion present, a factor of 4 improvement was achieved. For smooth idealized mountain wave flows, frequently used to benchmark numerical model performance, the relative gains are more modest. But extrapolating these results to natural problems underestimates the potential gains because idealized flows often have compact spectra that favor simpler schemes. For synoptic weather prediction, 2474 VOLUME 131M O N T H L Y W E A T H E R R E V I E W FIG. 9. Wind solution for the first model level (Z 5 150 m) over the Scandinavian topography at 24 h with the DCT preconditioner. Topography is also shown at 500-m intervals. where the shallowness of the atmosphere implies an extremely poorly conditioned problem, the performance improvement is still a factor of 2. Spectral methods rely on global basis functions, and computing their expansion coefficients requires multiple evaluations of global sums. Consequently, spectral transform methods are criticized as being inappropriate for modern, distributed-memory parallel architectures. Nevertheless, in the context of FFT-based precondi- tioners for iterative Krylov methods, Elman and O’Leary (1998) documented overall parallel speedups despite the large data communication overheads. In me- teorology, the vertical direction is physically distinct and typically not partitioned across the processors in numerical models. Thus, a massively parallel imple- mentation of spectral preconditioners can be based on a data-transposition strategy (Thomas et al. 2002) sim- ilar to that used in the European Centre for Medium- Range Weather Forecasts (ECMWF) weather prediction model Integrated Forecast System (IFS; Skålin 1997; Barros and Kauranne 1994). The performance of the next-generation (all-scale, nonhydrostatic semi-implicit, weather and climate pre- diction/research) numerical models in meteorology de- pends critically on the performance of elliptic solvers employed. It is widely recognized that the state-of-the- art methods rely on preconditioned nonsymmetric Kry- lov algorithms. Given the extensive body of mathe- matical literature, still relatively little numerical expe- rience exists with such solvers in advanced meteoro- logical applications (cf. Navon and Cai 1993). We have advocated a spectrally preconditioned GCR solver in the context of a particular model, representative of cur- rent trends in NWP. Further studies with a focus on alternative numerics, mesh refinement, complex ge- ometries, and extreme meteorological scenarios will pose/define extreme challenges/benchmarks to the ad- vocated approach. Acknowledgments. The authors are grateful to Claude Girard and Michel Desgagné of RPN and Environment Canada, for maintaining and improving the MC2 model, and to Xingxiu Deng and Henryk Modzelewski from UBC, for preparing the synoptic case and adept com- puter administration, respectively. This work was sup- ported by the National Science Foundation, the De- partment of Energy Climate Change Prediction Program (CCPP), the Canadian Natural Science and Engineering Council, RPN and Environment Canada, the Canadian Foundation for Innovation, BC Knowledge Develop- ment Fund, the Geophysical Disaster Computational Fluid Dynamics Center, and UBC Research Endow- ments. The comments of two anonymous referees helped to improve the presentation. APPENDIX A Generalized Conjugate-Residual Approach Here we describe the preconditioned GCR(k) algo- rithm used in this study. In general, we consider a linear elliptic problem, 3 3] ]C IJ IL(C) [ C 1 D C 2 AC 5 Q, (A1)O OI J1 2]x ]xI51 J51 with variable coefficients A, CIJ, DI, Q, and either pe- riodic, Dirichlet, or Neumann boundary conditions and adopt the following notation: The discrete representa- tion of a field on the grid is denoted by the subscript i; the discrete representation of the elliptic operator on the lhs of (A1) is denoted by Li(C); and the inner product ^jz& [ Si jiz i. The preconditioner P is a linear operator that approximates L to a greater or lesser degree but is easier to invert. Its role is to substitute (A1) with an auxiliary problem that converges faster (than the orig- inal problem) because of a closer clustering of the ei- genvalues of the auxiliary operator resulting from the superposition of L and P 21. In this paper, we are pri- marily concerned with ‘‘left’’ preconditioning that sub- stitutes (A1) with an auxiliary problem P 21[L(C) 2 Q] 5 0. Its accelerated convergence exploits spectral prop- erties of P 21L. In general, ‘‘right’’ preconditioning is also possible. It augments (A1) with LP 21[P (C)] 5 Q, and its convergence relies on reducing the spectral ra- dius of LP 21. Left preconditioning assumes P constant during the iteration process, whereas right precondi- tioning allows for variable P as, for example, in the FGMRES solver of Saad (1993). OCTOBER 2003 2475T H O M A S E T A L . The GCR(k) method of Eisenstat et al. (1983) may be derived via variational arguments (cf. Smolarkiewicz and Margolin 1994, 2000). In essence, we augment (A1) with a kth-order damped oscillation equation: k k21] P (C) 1 ] P (C) 1 ]P (C) 1 1 · · · 1 k k21]t T (t) ]t T (t) ]tk21 1 5 L(C) 2 Q. (A2) Then we discretize (A2) in a pseudotime t to form the affine discrete equation for the progression of the re- sidual errors r and determine the optimal parameters T1, . . . , Tk21 and integration increment Dt (variable in t) that assure minimization of the residual errors in the norm defined by the inner product ^rr&. This leads to the following algorithm. For any initial guess , set 5 Li(C0) 2 Qi,0 0 0C r pi i i 5 (r0); then iterate:21P i For n 5 1, 2, . . .until convergence do for n 5 0, . . . , k 2 1 do n n^r L(p )& b 5 2 , n n^L(p )L(p )& n11 n nC 5 C 1 bp ,i i i n11 n nr 5 r 1 bL (p ),i i i n11exit if \\r \\ # «, 21 n11e 5 P (r ),i i 3 3] ]e IJ IL (e) 5 C 1 D e 2 Ae ,O Oi I J1 2[ ]]x ]xI51 J51 i l^L(e)L(p )&∀ a 5 2 ,l50,n l l l^L(p )L(p )& n n11 lp 5 e 1 a p ,Oi i l i l50 n n11 lL (p ) 5 L (e) 1 a L (p ),Oi i l i l50 end do, k 0reset [C, r, p, L(p)] to [C, r, p, L(p)] ,i i end do. For the convergence, the GCR(k) algorithm above requires LP 21 to be negative definiteA1 but not neces- sarily self-adjoint.A2 Direct evaluation of the elliptic op- A1 An operator A is said to be definite if ^jA(j )& is either strictly positive (positive definite) or strictly negative (negative definite) for all j. A2 An operator A is said to be self-adjoint if ^jA(z )& 5 ^zA(j )& for all j and z. erator on the grid takes place only once per iteration following the preconditioning e 5 P 21(rn11), which pro- vides an estimate of the solution error en11 5 Cn11 2 Cexact. APPENDIX B Line-Relaxation Preconditioners A distinctive feature of meteorological flows is their anisotropy in the vertical direction—the larger the ratio of the horizontal scale of the flow to the fluid depth, the stiffer the elliptic problem. A class of simple yet effective preconditioners that mitigate this aspect (of elliptic problems in meteorology) derives from an im- plicit Richardson iteration (Smolarkiewicz and Mar- golin 2000), m11 me 2 e h m z m11 n115 P (e ) 1 P (e ) 2 r , (B1) Dt̃ a realization of the preconditioning step e 5 P 21(rn11) of the GCR(k) solver summarized in appendix A. Here, P h and P z are the horizontal and the vertical counterparts of the operator P, respectively; D is a parameter of thet̃ iteration (a pseudo-time step) based on spectral prop- erties of P h [viz., linear stability analysis of (B1)]; m numbers successive Richardson iterations; and n num- bers the outer iterations of the Krylov solver. Equation (B1) leads to a linear problem, z m11 m˜(I 2 Dt̃P )e 5 R , (B2) where R˜ m [ em 1 D [P h(em) 2 rn11], that can be solvedt̃ readily using the well-known tridiagonal algorithm (cf. appendix A in Roache 1972). Alternating implicit dis- cretization between the vertical and horizontal counter- parts of P in fractional steps of leads to the ADIt̃ preconditioners of Skamarock et al. (1997). The implicit Richardson preconditioner in (B1) can be further improved. Consider extending the Richardson diffusion scheme with respect to P h to the diagonally preconditioned Duffort–Frankel type implicit algorithm, m11 mDe 2 De h m m11 m5 P (e ) 2 D(e 2 e ) Dt̃ z m11 n111 P (e ) 2 r , (B3) where (21)D stands for the diagonal coefficient em- bedded within the matrix representing P h on the grid. Note that adding the relaxation term on the rhs of (B1) has the effect of replacing the Dem term with D(em11) in P h(em) without complicating flux boundary condi- tions imposed in constructing P h(e) (cf. Smolarkiewicz and Margolin 2000 for discussions). In the limit D →t̃ `, (B3) is equivalent to the block Jacobi preconditioner proven effective in meteorological applications (Thom- as et al. 1998, 2000). 2476 VOLUME 131M O N T H L Y W E A T H E R R E V I E W APPENDIX C Coordinate Systems For a polar stereographic projection, true at longitude f0, m is the map scale factor and S 5 m2: 1 1 sinf0 m 5 . (C1) 1 1 sinf The components (U, V) of the wind ‘‘image’’ are defined in terms of the scaled and rotated components of the true horizontal velocity (u, y) in local (x, y, z) coordi- nates, U 1 2sinl 2cosl u 5 , (C2)[ ] [ ][ ]V m cosl 2sinl y where l represents longitude. The (X, Y) components of the conformal coordinates are scaled and rotated ac- cording to dX 2sinl 2cosl dx 5 m . (C3)[ ] [ ][ ]dY cosl 2sinl dy In conformal coordinates (X, Y, z), the Euler equations take the following form: 1 Dq ]U ]V ]w 5 2S 1 2 ,1 2g Dt ]X ]Y ]z DU ]S ]q 5 f V 2 K 2 RT , Dt ]X ]X DV ]S ]q 5 2 fU 2 K 2 RT , Dt ]Y ]Y Dw ]q 5 2g 2 RT , Dt ]z DT RT Dq 2 5 0, (C4) Dt c Dtp where K 5 (U 2 1 V 2 )/2 is the pseudohorizontal ki- netic energy per unit mass. The material derivative is written as D ] ] ] ] 5 1 S U 1 V 1 w . (C5)1 2Dt ]t ]X ]Y ]z The divergence of (U, V, w) is given by ]U ]V ]w D 5 S 1 1 , (C6)1 2]X ]Y ]z where the horizontal divergence of (U, V) on the rhs is denoted by DH(U, V) throughout the body of the paper. To account for the orography one can introduce a terrain-following coordinate system, where any mono- tonic function of geometrical height can be used as a vertical coordinate, denoted by Z. Throughout this paper we employ the classical transformation of Gal-Chen and Sommerville (1975), where z 2 h(X, Y ) Z 5 z . (C7)top z 2 h(X, Y )top Because Z in (C7) is assumed to be time independent [cf. Prusa et al. (1996) for time-dependent extensions], the resulting Z-coordinate system represents a nonde- formable system with surfaces of constant Z fixed in physical space—in contrast to various pressure-based vertical coordinates used in hydrostatic weather-predic- tion models. The mapping from conformal (X, Y, z) coordinates to curvilinear (X, Y, Z) coordinates is rep- resented by the metric coefficients GIJ of the transfor- mation: 1 ]z 1 ]z ]z 13 23G 5 , G 5 , G 5 .1 2 1 2G ]X G ]Y ]ZZ Z (C8) Admitting an optional, variable spacing of vertical lev- els modifies the metric coefficients in (C8) via multi- plicative factors (cf. Thomas et al. 1998). The terrain- following system is right-handed, where Z increases from the bottom of the model domain to the top, and the Z surface corresponding to the lower boundary is conformal to the terrain. Thus, by definition, G is always positive. Partial derivatives of any variable c are to be interpreted for terrain-following coordinates as follows: ]c ]c ]c 135 1 G ,1 2 1 2]X ]X ]Z z Z ]c ]c ]c 235 1 G ,1 2 1 2]Y ]Y ]Z z Z ]c 1 ]c 5 . (C9) ]z G ]Z The contravariant ‘‘vertical’’ velocity is related to the physical velocity components by the equation 21 13 23˙W [ Z 5 G w 1 S(G U 1 G V). (C10) APPENDIX D Numerical Operators The centered averaging mz and differencing dz op- erators are defined as 1 Dz Dz m c 5 c z 1 1 c z 2 , (D1)z 5 1 2 1 262 2 2 1 Dz Dz d (c) 5 c z 1 2 c z 2 , (D2)z 5 1 2 1 26Dz 2 2 where c is a dependent model variable that may be OCTOBER 2003 2477T H O M A S E T A L . defined at the center of a grid cell or on the grid cell faces. Multiple averages are denoted by X Y XY Y X m c 5 c 5 (c ) 5 (c ) . (D3)XY Discrete horizontal and vertical derivative operators are defined as follows: 13D 5 d 1 G m d ,X X XZ Z 23D 5 d 1 G m d ,Y Y YZ Z 1 D 5 d . (D4)Z XYZ Z G The discrete divergence operator D, acting on the con- travariant wind components, can be written as follows: 1 YZ XZ D(U, V, W ) 5 {S[d G U 1 d G V ]}XYZ X Y G XY 1 d G W. (D5)Z The operator in (D5) satisfies the Gaussian divergence theorem in discrete form (Thomas et al. 1998). The Jacobian G of the coordinate transformation is located at grid cell vertices and D(U, V, W) at cell centers. The contravariant velocity component W, defined in (C10), is computed from Z XY X Y G W 5 w 1 S[G U 1 G V ],1 2 X Y 13 23G 5 GG , G 5 GG , (D6)1 2 and the formal discrete flow divergence (D5) is then implemented as D(U, V, W) 5 DH(U, V) 1 dzw, where S YZ D (U, V ) 5 Sd U 1 [d G m 1 m d G m ]UH X XYZ X X Z Z 1 X G 1 Sd VY S XY 1 [d G m 1 m d G m ]V. (D7)XYZ Y Y Z Z 2 Y G REFERENCES Arakawa, A., and V. Lamb, 1977: Computational design of the basic dynamical processes of the UCLA general circulation model. Methods in Computational Physics, Vol. 17, J. Chang, Ed., Ac- ademic Press, 174–265. Axelsson, O., 1994: Iterative Solution Methods. Cambridge Univer- sity Press, 654 pp. Barros, S., and T. Kauranne, 1994: On the parallelization of global spectral weather models. Parallel Comput., 20, 1335–1356. Benoit, R., M. Desgagné, P. Pellerin, S. Pellerin, Y. Chartier, and S. Desjardins, 1997: The Canadian MC2: A semi-Lagrangian, semi- implicit wide-band atmospheric model suited for fine-scale pro- cess studies and simulations. Mon. Wea. Rev., 125, 2382–2415. Bernardet, P., 1995: The pressure term in the anelastic model: A symmetric elliptic solver for an Arakawa C grid in generalized coordinates. Mon. Wea. Rev., 123, 2474–2490. Bourke, W., 1974: A multi-level spectral model. I. Formulation and hemispheric integrations. Mon. Wea. Rev., 102, 687–701. Cullen, M., 1990: A test of a semi-implicit integration technique for a fully compressible nonhydrostatic model. Quart. J. Roy. Me- teor. Soc., 116, 1253–1258. Davies, H., 1976: A lateral boundary formulation for multi-level pre- diction models. Quart. J. Roy. Meteor. Soc., 102, 405–418. Eisenstat, S., H. Elman, and M. Schultz, 1983: Variational iterative methods for nonsymmetric systems of linear equations. SIAM J. Numer. Anal., 2, 345–357. Elman, H., and D. O’Leary, 1998: Efficient iterative solution of the three-dimensional Helmholtz equation. J. Comput. Phys., 142, 163–181. Freund, R., 1993: A transpose-free quasi-mimimum residual algo- rithm for non-Hermitian linear systems. SIAM J. Sci. Stat. Com- put., 14, 470–482. Gal-Chen, T., and R. Sommerville, 1975: On the use of a coordinate transformation for the solution of the Navier–Stokes equations. J. Comput. Phys., 17, 209–228. Golding, B., 1992: An efficient nonhydrostatic forecast model. Me- teor. Atmos. Phys., 50, 89–103. Grabowski, W., and P. Smolarkiewicz, 2002: A multiscale model for meteorological research. Mon. Wea. Rev., 130, 939–956. Greenbaum, A., 1997: Iterative Methods for Solving Linear Systems. Society of Industrial and Applied Mathematics, 220 pp. Hestenes, M., and E. Stiefel, 1952: Method of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand. (U.S.), 49, 409–436. Kadogliu, H., and S. Mudrick, 1992: On the implementation of the GMRES(m) method to elliptic equations in meteorology. J. Com- put. Phys., 102, 348–359. Kapitza, H., and D. Eppel, 1992: The nonhydrostatic mesoscale model GESIMA. Part I: Dynamical equations and tests. Beitr. Phys. Atmos., 65, 129–146. Lafore, J., and Coauthors, 1998: The Meso–NH atmospheric mod- eling system. Part I: Adiabatic formulation and control simu- lations. Ann. Geophys., 16, 90–109. Laprise, R., D. Caya, G. Bergeron, and M. Giguère, 1997: The for- mulation of the André Robert MC2 (Mesoscale Compressible Community) model. Atmos.–Ocean, 35, 127–152. Lynch, R., J. Rice, and D. Thomas, 1964: Direct solution of partial difference equations by tensor product methods. Numer. Math., 6, 185–199. Machenhauer, B., and R. Daley, 1972: A baroclinic primitive equation model with a spectral representation in three dimensions. Tech. Rep. 4, Institute of Theoretical Meteorology, University of Co- penhagen, 63 pp. Marshall, J., C. Hill, L. Perelman, and A. Adcroft, 1997: Hydrostatic, quasi-hydrostatic and non-hydrostatic ocean modeling. J. Geo- phys. Res., 102, 5733–5752. Navon, I., and Y. Cai, 1993: Domain decomposition and parallel processing of a finite element model of the shallow water equa- tions. Comput. Methods Appl. Mech. Eng., 106, 179–212. Prusa, J., P. Smolarkiewicz, and R. Garcia, 1996: On the propagation and breaking at high altitudes of gravity waves excited by tro- pospheric forcing. J. Atmos. Sci., 53, 2186–2216. Roache, P., 1972: Computational Fluid Dynamics. Hermosa, 446 pp. Robert, A., 1969: Integration of a spectral model of the atmosphere by the implicit method. Proc. WMO/IUGG Symp. on NWP, Vol. 7, Tokyo, Japan, Japan Meteorological Agency, 19–24. ——, 1981: A stable numerical integration scheme for the primitive meteorological equations. Atmos.–Ocean, 19, 35–46. ——, 1993: Bubble convection experiments with a semi-implicit for- mulation of the Euler equations. J. Atmos. Sci., 50, 1865–1873. ——, T. Yee, and H. Ritchie, 1985: A semi-Lagrangian semi-implicit numerical integration scheme for multilevel atmospheric models. Mon. Wea. Rev., 113, 388–394. Saad, Y., 1993: A flexible inner-outer preconditioned GMRES al- gorithm. SIAM J. Sci. Stat. Comput., 14, 461–469. ——, 1996: Iterative Methods for Sparse Linear Systems. PWS, 447 pp. ——, and M. Schultz, 1986: GMRES: A generalized minimal residual 2478 VOLUME 131M O N T H L Y W E A T H E R R E V I E W algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 7, 856–869. Saito, K., U. Schättler, and U. Steppler, 1998: 3D mountain waves by the Lokal-Modell of DWD and the MRI mesoscale nonhy- drostatic model. Pap. Meteor. Geophys., 49, 7–19. Schumann, U., and H. Volkert, 1987: Three-dimensional mass- and momentum-consistent Helmholtz-equation in terrain-following coordinates. Notes on Numerical Fluid Mechanics, W. Hack- busch, Ed., Vol. 10, Springer-Verlag, 109–131. ——, and R. Sweet, 1988: Fast Fourier transforms for direct solution of Poisson’s equation with staggered boundary conditions. J. Comput. Phys., 75, 123–137. Skålin, R., 1997: Scalability of parallel gridpoint limited-area at- mospheric models. Part II: Semi-implicit time-integration schemes. J. Atmos. Oceanic Technol., 14, 442–455. Skamarock, W., P. Smolarkiewicz, and J. Klemp, 1997: Precondi- tioned conjugate-residual solvers for Helmholtz equations in nonhydrostatic models. Mon. Wea. Rev., 125, 587–599. Smith, R., 1980: Linear theory of stratified hydrostatic flow past an isolated mountain. Tellus, 32, 348–364. Smolarkiewicz, P., and J. Pudykiewicz, 1992: A class of semi-La- grangian approximations for fluids. J. Atmos. Sci., 49, 2082– 2096. ——, and L. Margolin, 1994: Variational solver for elliptic problems in atmospheric flows. Appl. Math. Comput. Sci., 4, 527–551. ——, and ——, 1997: On forward-in-time differencing for fluids: An Eulerian/semi-Lagrangian nonhydrostatic model for stratified flows. Atmos.–Ocean, 35, 127–152. ——, and ——, 2000: Variational methods for elliptic problems in fluid models. Proc. ECMWF Workshop on Developments in Nu- merical Methods for Very High Resolution Global Models, Read- ing, United Kingdom, ECMWF, 137–159. ——, and J. Prusa, 2002: VLES modelling of geophysical fluids with nonoscillatory forward-in-time schemes. Int. J. Numer. Methods Fluids, 39, 779–819. ——, V. Grubisı̌ć, and L. Margolin, 1997: On forward-in-time dif- ferencing for fluids: Stopping criteria for iterative solutions of anelastic pressure equations. Mon. Wea. Rev., 125, 647–654. Sweet, R., 1973: Direct methods for the solution of Poisson’s equation on a staggered grid. J. Comput. Phys., 12, 422–428. Tanguay, M., A. Robert, and R. Laprise, 1990: A semi-implicit semi- Lagrangian fully compressible regional forecast model. Mon. Wea. Rev., 118, 1970–1980. Tapp, M., and P. White, 1976: A nonhydrostatic mesoscale model. Quart. J. Roy. Meteor. Soc., 102, 277–296. Thomas, S., C. Girard, R. Benoit, M. Desgagné, and P. Pellerin, 1998: A new adiabatic kernel for the MC2 model. Atmos.–Ocean, 36, 241–270. ——, ——, G. Doms, and U. Schättler, 2000: Semi-implicit scheme for the DWD Lokal-Modell. Meteor. Atmos. Phys., 73, 105–125. ——, J. Hacker, M. Desgagné, and R. Stull, 2002: An ensemble analysis of forecast errors related to floating point performance. Wea. Forecasting, 17, 898–906. Tokioka, T., 1978: Some considerations on vertical differencing. J. Meteor. Soc. Japan, 56, 98–111. van der Vorst, H., 1992: Bi-CGStab: A fast and smoothly converging variant of Bi-CG for the solution of non-symmetric linear sys- tems. SIAM J. Sci. Stat. Comput., 12, 631–644. Yeh, K.-S., J. Cote, S. Gravel, A. Methot, A. Patoine, M. Roch, and A. Staniforth, 2002: The CMC–MRB Global Environmental Multiscale (GEM) model. Part III: Nonhydrostatic formulation. Mon. Wea. Rev., 130, 339–356."@en ; edm:hasType "Article"@en ; edm:isShownAt "10.14288/1.0041844"@en ; dcterms:language "eng"@en ; ns0:peerReviewStatus "Reviewed"@en ; edm:provider "Vancouver : University of British Columbia Library"@en ; dcterms:publisher "American Meteorological Society"@en ; dcterms:rights "Attribution-NonCommercial-NoDerivatives 4.0 International"@en ; ns0:rightsURI "http://creativecommons.org/licenses/by-nc-nd/4.0/"@en ; ns0:scholarLevel "Faculty"@en ; dcterms:title "Spectral Preconditioners for Nonhydrostatic Atmospheric Models."@en ; dcterms:type "Text"@en ; ns0:identifierURI "http://hdl.handle.net/2429/33532"@en .