UBC Faculty Research and Publications

Spectral Preconditioners for Nonhydrostatic Atmospheric Models. Thomas, Stephen J.; Hacker, Joshua P.; Smolarkiewicz, Piotr K.; Stull, Roland B. Oct 31, 2003

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata


52383-Stull_AMS_2003_MWR2464.pdf [ 379.85kB ]
JSON: 52383-1.0041844.json
JSON-LD: 52383-1.0041844-ld.json
RDF/XML (Pretty): 52383-1.0041844-rdf.xml
RDF/JSON: 52383-1.0041844-rdf.json
Turtle: 52383-1.0041844-turtle.txt
N-Triples: 52383-1.0041844-rdf-ntriples.txt
Original Record: 52383-1.0041844-source.json
Full Text

Full Text

2464  MONTHLY WEATHER REVIEW  VOLUME 131  Spectral Preconditioners for Nonhydrostatic Atmospheric Models STEPHEN J. THOMAS, JOSHUA P. HACKER,  AND  PIOTR K. SMOLARKIEWICZ  National Center for Atmospheric Research, Boulder, Colorado  ROLAND B. STULL Department of Earth and Ocean Sciences, University of British Columbia, Vancouver, British Columbia, Canada (Manuscript received 23 October 2002, in final form 20 March 2003) ABSTRACT The elliptic problems in semi-implicit nonhydrostatic atmospheric models are difficult. Typically, they are poorly conditioned, nonseparable, contain cross-derivative terms, and are often nonsymmetric. Here, the resulting linear system is solved using a preconditioned Krylov subspace method—the generalized conjugate residual (GCR) algorithm. A horizontal spectral preconditioner is developed as an alternative to a more standard and much simpler line Jacobi relaxation scheme. However, the efficacy of the spectral preconditioner requires neglecting the cross-derivative terms and homogenization (e.g., averaging) metric coefficients over the computational domain. Because such a compromise causes a substantial departure of the preconditioner from the original elliptic operator, it is not obvious a priori whether it leads to a competitive solver. The robustness of the proposed approach over a broad range of representative meteorological applications is evaluated, in the context of a three-time-level semi-implicit semi-Lagrangian all-scale weather-prediction/research model.  1. Introduction A semi-implicit time-stepping scheme (Robert 1969) is commonly applied to the terms responsible for fast waves in large-scale weather prediction and general circulation models. This removes the associated time step restrictions and makes computation practical. More recently, semi-implicit integrators have also been applied at mesoscale resolutions (Tapp and White 1976; Cullen 1990; Tanguay et al. 1990; Golding 1992; Yeh et al. 2002; Grabowski and Smolarkiewicz 2002) and convective scales (Robert 1993; Grabowski and Smolarkiewicz 2002). The resulting elliptic partial differential equations are nontrivial. They are poorly conditioned, nonseparable, contain cross-derivative terms, and are typically nonsymmetric. This is due to domain anisotropy, effects of planetary rotation, ambient stratification, the use of general curvilinear coordinates in the governing equations [e.g., the Gal-Chen and Somerville (1975) terrain-following transformation], or the imposition of partial-slip conditions along an irregular lower boundary. Among the most effective methods reported for solving such problems are the preconditioned nonsymmetric conjugate-gradient-type (alias Krylov subspace, hereafter Krylov) iterative schemes (e.g., Kapitza and Eppel 1992; Skamarock et al. 1997; Thomas et al. Corresponding author address: Dr. Stephen J. Thomas, NCAR/ SCD, P.O. Box 3000, Boulder, CO 80307-3000. E-mail: thomas@ucar.edu  ᭧ 2003 American Meteorological Society  1998, 2000). A number of alternative nonsymmetric Krylov solvers are used in computational research and engineering (Axelsson 1994; Greenbaum 1997). Although this paper focuses on a particular scheme, the issues discussed are relevant to other iterative methods as well. Our method of choice is the restarted generalized conjugate residual [GCR(k)] algorithm1 akin to the popular generalized minimum residual [GMRES(k)] solver (Eisenstat et al. 1983; Saad and Schultz 1986; Smolarkiewicz and Margolin 1994, 1997; Smolarkiewicz et al. 1997).2 For the reader’s convenience, GCR(k) is summarized in appendix A, where we also introduce some terminology, notation, and a notion of preconditioning (left, as opposed to right) used throughout this paper. A brief discussion of line-relaxation preconditioners is included in appendix B. Designing a suitable preconditioner is important, as it can dramatically accelerate solver convergence, thereby reducing the overall computational expense of a model. In principle, the preconditioner P can be any linear operator such that LP Ϫ1 is definite, where L symbolizes the original elliptic operator implied by the physical problem at hand. The role of the preconditioner is  1 Throughout this paper GCR (4) is employed unless stated otherwise. 2 Other popular alternatives include the biconjugate gradients stabilized (BiCGStab; van der Vorst 1992) and transpose-free quasiminimal residual (TFQMR; Freund 1993) algorithms.  OCTOBER 2003  THOMAS ET AL.  to substitute the governing elliptic problem L(⌿) Ϫ Q ϭ 0 with an auxiliary problem P Ϫ1 [L(⌿) Ϫ Q] ϭ 0 that converges faster (than the original problem) because of a closer clustering of the eigenvalues of the auxiliary elliptic operator P Ϫ1L, where ⌿ and Q symbolize the dependent variable and the rhs, respectively. For the preconditioner to be useful, the convergence of the auxiliary problem must be sufficiently rapid to overcome the effort associated with ‘‘inverting’’ the preconditioner itself [i.e., computing P Ϫ1 (·)]. In general, the closer the preconditioner approximates the original operator, the faster the solver converges but the more difficult it is to compute P Ϫ1 (·). There is no general method for designing an optimal preconditioner (Axelsson 1994, section 7). Kadogliu and Mudrick (1992) argued that basic point-Jacobi relaxation provides an effective preconditioner for GMRES(k) applied to atmospheric flow problems. However, their conclusion is derived from solving the diagnostic quasigeostrophic omega equation in a periodic channel—a fairly simple elliptic problem that does not reflect the full complexity of a multiscale atmosphere. In particular, it does not extrapolate to prognostic problems when the shallowness of the earth’s atmosphere dictates condition numbers ␬ (L) ϳ O(1010 ); recall that ␬ (L) can be interpreted as the squared ratio of the longest to the shortest wave present in the system. Because the asymptotic convergence rate of conjugate-gradienttype methods is proportional to ␬ (L) Ϫ1/2 , a direct preconditioner in the vertical is the ‘‘categorical imperative’’ of an effective iterative solver for all-scale atmospheric models (e.g., Marshall et al. 1997; Skamarock et al. 1997; Thomas et al. 1998; Smolarkiewicz et al. 2001). In this paper we take this preconditioning strategy in the vertical for granted and focus on the optimization of the horizontal component of P. Guided by the development of elliptic solvers for early anelastic models, we extend a fast direct preconditioning strategy to three spatial dimensions via horizontal spectral decomposition. This offers an alternative to alternating-direction-implicit (ADI) iterative methods (Skamarock et al. 1997), of which the line Jacobi preconditioner is a special case (appendix B). The use of the FFT approach has a long history in the design of elliptic solvers. Because FFTs are effective for separable problems, early elliptic solvers for meteorological models often employed operator decomposition (into a separable part and the residual) and a stationary blockiteration with the nonseparable residual lagging behind (cf. Schumann and Volkert 1987 and Schumann and Sweet 1988, and references therein). Bernardet (1995) investigated an alternative approach (conceptually the opposite) where the full operator is retained in a nonstationary Krylov iteration, but FFTs are used to invert the ‘‘flat’’ preconditioner obtained by neglecting orography. He used the classical, symmetric, conjugate-gradient solver of Hestenes and Stiefel (1952), thereby necessitating careful modifications to the boundary con-  2465  ditions (in order to assure a self-adjoint elliptic operator). His development addressed 2D mesoscale problems in the x–z plane. Because of the inherent difficulties with extending this approach to 3D, Lafore et al. (1998) adopted an FFT-preconditioned Richardson iteration. Recently, Elman and O’Leary (1998) analyzed the performance of nonsymmetric Krylov solvers for indefinite 3D Helmholtz equations with Sommerfeld radiation boundary conditions and found that FFT methods can result in effective preconditioners even on parallel architectures. Because of the indefiniteness, they addressed a more difficult problem than those typically encountered in meteorological applications. On the other hand, their Helmholtz operator contained the standard Laplacian on isotropic grids and had constant coefficients. Both the coefficients and solutions were smooth. Although there exists a substantial body of evidence indicating the potential of FFT preconditioners, a systematic study in the context of all-scale meteorological applications is needed. This is because modeling natural flows leads to elliptic problems characterized by highly inhomogeneous anistropic coefficients and full spectra forced on the rhs. We complement earlier work with a comprehensive study based on a fully compressible nonhydrostatic all-scale atmospheric model (Thomas et al. 1998). A semi-implicit time discretization leads to a nonsymmetric, but definite, Helmholtz problem. In order to design an effective FFT preconditioner, we derive a separable constant-coefficient approximation to the semi-implicit Helmholtz equation by dropping crossderivative terms and averaging metric coefficients across the domain. Neumann boundary conditions are specified to close the problem, and thus a discrete cosine transform (DCT) is required to diagonalize the discrete horizontal Laplacian component of the preconditioner. Robustness is vital for the elliptic solver when a numerical model admits a broad range of scales of motion. We assess the relative efficiency of line relaxation versus DCT preconditioners in both linear and nonlinear flow regimes (prototypes for, respectively, laminar and turbulent flows) throughout the range O(101 )–O(10 7 ) m of spatial scales. Our study encompasses the simulation of shallow thermal convection, small and large mesoscale mountain flows, and synoptic-scale winter storm prediction. We measure the performance of the two preconditioners by comparing the amount of work (the iteration count and total CPU time of the model) required to achieve a specified convergence threshold. Because the residual error of the elliptic problem has a sense of ⌬t-normalized first-order spatial partial derivatives of velocity, the convergence criterion is based on the magnitude of the local Courant number and its variations. This specifies both the absolute admissible error and its magnitude compared to the flow. In effect, both numerical and physical error metrics are employed (cf. Smolarkiewicz et al. 1997). The paper is organized as follows: Section 2 summarizes the governing equations of the model along with  2466  MONTHLY WEATHER REVIEW  their discretization and formulation of the elliptic problem. Section 3 presents the derivation of a spectral preconditioner. Section 4 discusses the results of our simulations. Remarks in section 5 conclude the paper. 2. Model description a. Analytic formulation To focus the discussion, consider the standard compressible Euler model for a dry, adiabatic, stratified, rotating fluid. Cast in a noninertial frame rotating with angular velocity ⍀, it consists of mass, momentum, and entropy evolution laws, and the equation of state: D␳ ϭ Ϫ␳١ · v, Dt Dv 1 ϭ Ϫ ١p Ϫ g Ϫ 2⍀ ϫ v, Dt ␳ D⌰ ϭ 0, Dt  p ϭ ␳RT,  (1)  where D/Dt ϭ ‫ץ‬/‫ץ‬t ϩ v · ١ is the material-derivative operator, v the velocity vector, p thermodynamic pressure, ␳ density, and g combines gravitational and centrifugal accelerations. Also, ⌰ ϭ T exp(Ϫ␬ q) is the potential temperature, where T denotes the temperature, ␬ ϭ R/c p (with R and c p being the gas constant and specific heat at constant pressure), and q ϭ ln( p/p 00 ), with p 00 ϭ 100 kPa.3 In order to isolate the fast acoustic-gravity modes (for the sake of implicit numerical treatment), the prognostic equations in (1) are decomposed by introducing a horizontally homogeneous, static reference atmosphere (hereafter denoted with subscript 0). Thermodynamic variables are decomposed into a sum of the base state and perturbations T ϭ T0 ϩ TЈ,  q ϭ q0 (z) ϩ qЈ,  (2)  where ‫ץ‬q0 g ϭϪ . ‫ץ‬z RT0  (3)  After grouping terms associated with ‘‘fast’’ linear waves on the lhs, while retaining the ‘‘slow’’ residual on the rhs and eliminating ␳ using the gas law, the governing evolution equations take the form 1 DqЈ g ϩ ١ · v Ϫ 2 w ϭ 0, ␥ Dt c0 3 Using the auxiliary ‘‘pressure’’ variable q, rather than the pressure p itself, facilitates isolating the 1/␳١p nonlinearity in the numericalmodel equations, analogous to q ϭ lnp s (surface pressure p s ) in hydrostatic primitive equation models (cf. Machenhauer and Daley 1972; Bourke 1974; Robert et al. (1985; Tanguay et al. 1990). The Exner function ⌸ ϭ (p/p 00 ) ␬ is an alternative (Skamarock et al. 1997).  VOLUME 131  Dv ϩ RT0١qЈ Ϫ Bk ϭ ϪRTЈ١qЈ Ϫ f k ϫ v, Dt DTЈ RT0 DqЈ N2 RTЈ Ϫ ϩ T0 0 w ϭ Ϫ ١ · v, Dt c p Dt g c␷  (4)  where B ϭ gTЈ/T 0 denotes the buoyancy, and f ϭ 2⍀ sin␾ is the Coriolis parameter at latitude ␾. Also, ␥ ϭ c p /c ␷ (with c ␷ ϭ c p Ϫ R denoting the specific heat at constant volume), and N 0 and c 0 are, respectively, the Brunt–Va¨isa¨la¨ frequency and speed of sound: N 20 ϭ  g2 , c p T0  c02 ϭ ␥RT0 .  (5)  Equations (1) and (4), as written, are valid in a rotating Cartesian reference frame. To account for the curvature of the earth when simulating atmospheric flows and produce results consistent with standard weather maps, Eqs. (4) are written in terms of a Cartesian coordinate system projected onto the plane intersecting the sphere at reference latitude ␾ 0 (cf. Tanguay et al. 1990). The resulting equations take the form 1 DqЈ ‫ץ‬q g ϩ D H (U, V ) ϩ Ϫ 2 w ϭ 0, ␥ Dt ‫ץ‬z c0 DU ‫ץ‬qЈ ‫ץ‬S ‫ץ‬qЈ ϩ RT0 ϭ fV Ϫ K Ϫ RTЈ , Dt ‫ץ‬X ‫ץ‬X ‫ץ‬X DV ‫ץ‬qЈ ‫ץ‬S ‫ץ‬qЈ ϩ RT0 ϭ Ϫ fU Ϫ K Ϫ RTЈ , Dt ‫ץ‬Y ‫ץ‬Y ‫ץ‬Y Dw ‫ץ‬qЈ ‫ץ‬qЈ ϩ RT0 Ϫ B ϭ ϪRTЈ , Dt ‫ץ‬z ‫ץ‬z DTЈ RT0 DqЈ N2 RTЈ Ϫ ϩ T0 0 w ϭ Ϫ D, Dt c p Dt g c␷  (6)  where the (X, Y) conformal coordinate-based form of D/Dt, horizontal divergence D H (U, V) of the wind image (U, V), pseudokinetic energy K, and metric coefficient S are all detailed in appendix C for the sake of completeness. To account for the irregularity of the lower boundary, the system in (6) is transformed to the terrainfollowing curvilinear framework of Gal-Chen and Sommerville (1975). Optionally, variable spacing of vertical levels is allowed by composing the Gal-Chen transformation with a smooth stretching (Thomas et al. 1998). Rather than complicating (6) with details of the composition of these mappings, all metric terms associated with the vertical transformations are absorbed into the definitions of the partial derivatives (appendix C). b. Numerical approximations The semi-Lagrangian approach employed for approximating (6) on a discrete mesh (Robert 1993) aims  OCTOBER 2003  2467  THOMAS ET AL.  at second-order accuracy in space and time.4 A staggered Arakawa C grid (Arakawa and Lamb 1977) is employed in the horizontal, and a staggered Tokioka B grid (Tokioka 1978) in the vertical. The associated spatial partial-derivative discrete operators are detailed in appendix D. In general, the governing equations—here, either (1), (4), or (6)—can be viewed symbolically as D␺ ϭ F␺ , Dt  ͵  F␺ d␶ ,  (8)  ⌫  where ␺ and ␺ 0 denote, respectively, the values of ␺ (X, t) at the two endpoints of the trajectory ⌫ connecting [X 0 (X, t1 ), t 0 ] and (X, t1 ). For our study (of spectral preconditioners), we adopt the Mesoscale Compressible Community (MC2) model (Tanguay et al. 1990; Benoit et al. 1997; Laprise et al. 1997; Thomas et al. 1998, and references therein), which implements the semi-implicit semi-Lagrangian algorithm of Robert (1981). In the context of the integral representation, this scheme blends midpoint three-time-level and trapezoidal twotime-level quadrature rules. The midpoint and trapezoidal approximations result, respectively, in explicit (leapfrog) and implicit (Crank–Nicholson) time discretizations of the various forcing terms. In order to concisely present the resulting finite-difference equations, we define the discrete operators  ␺ nϩ1 Ϫ ␺ 0nϪ1 Dt ␺ ϭ i , 2⌬t  ␺ nϩ1 ϩ ␺ 0nϪ1 ␮t ␺ ϭ i , 2  (9)  where ␺ inϩ1 denotes the unknown (sought) value of ␺ at the grid point (X i , t nϩ1 ), and ␺ nϪ1 is the value of ␺ 0 at the foot (X 0 , t nϪ1 ) of the trajectory arriving at (X i , t nϩ1 ), evaluated using cubic interpolation from the neighboring grid points. With the definitions in (9), the compact discretized form of the equations (6) becomes (after some regrouping)  [  D t qЈ ϩ ␥␮ t D H (U, V ) ϩ ␥ D Z Ϫ  RT0 N2 D t qЈ ϩ T0 0 ␮ t w ϭ F Tn , cp g  (10)  where spatial partial-derivative and average operators (D X , D Y , D Z , and ␮ Z ) are defined in appendix D, and the F n␺ forcings combine the Coriolis and slow-residual terms that are interpolated at the trajectory midpoint, (X, t n ), with X ϵ (X i ϩ X 0 )/2 [cf. Thomas et al. (1998) for a discussion of the trajectory scheme)].  (7)  where ␺ stands for any of the kinematic or thermodynamic dependent variables of the model, and F␺ for the associated rhs. Then, semi-Lagrangian schemes can be interpreted as approximations to the trajectory integrals of the governing equations:  ␺ ϭ ␺0 ϩ  D t TЈ Ϫ  ]  g ␮ ␮ w ϭ F nq , c02 Z t  D t U ϩ RT0 ␮ t D X qЈ ϭ F Un , D t V ϩ RT0 ␮ t D Y qЈ ϭ FVn , D t w ϩ RT0 ␮ t D Z qЈ Ϫ ␮ t B ϭ F nw , 4 Numerical dissipative terms absent from the analytic model (1)— absorbers, filters, subgrid-scale turbulence parameterizations, etc.— are split and integrated to first order.  c. Elliptic problem In order to derive the discrete Helmholtz problem that underlies the semi-implicit scheme in (10), all unknown t nϩ1 terms are grouped on the lhs, and all known t n and t nϪ1 terms are separated on the rhs. Denoting the rhs by Q and dropping all temporal superscripts (as there is no ambiguity at this point), the resulting linear system of algebraic equations (valid at each grid point) is written as follows:  [  ]  g ␮ w ϭ Q qЈ , c02 Z  (11)  U ϩ ⌬tRT0 D X qЈ ϭ Q U ,  (12)  V ϩ ⌬tRT0 D Y qЈ ϭ Q V ,  (13)  w ϩ ⌬tRT0 D Z qЈ Ϫ ⌬tB ϭ Q w ,  (14)  RT0 N qЈ ϩ ⌬tT0 w ϭ Q T . cp g  (15)  qЈ ϩ ⌬t␥D H (U, V ) ϩ ⌬t␥ D Z Ϫ  TЈ Ϫ  2 0  To arrive at the implicit equation for qЈ, all remaining dependent variables are eliminated from (11)–(15). Combining (14) and (15) eliminates the buoyancy B and results in  [  [1 ϩ ⌬t 2 N 02 ]w ϩ ⌬tRT0 D Z Ϫ  ]  g ␮ qЈ ϭ QЈw , c p T0 Z  (16)  where QЈw ϭ Q w ϩ ⌬t  g Q . T0 T  (17)  Next, taking the divergence of (12) and (13) and substituting for D H (U, V) in (11),  [  [1 Ϫ ⌬t 2 c02 D H (D X , D Y )]qЈ ϩ ⌬t␥ D Z Ϫ  ]  g ␮ w ϭ QЈqЈ , c02 Z (18)  where QЈqЈ ϭ Q qЈ Ϫ ⌬t␥ D H (Q U , Q V ).  (19)  Following the original notation of Tanguay et al. (1990), we define the operators,  [  D Z(1) ϭ D Z Ϫ  ]  g ␮ c p T0 Z  and  2468  MONTHLY WEATHER REVIEW  [  D Z(2) ϭ D Z Ϫ  ]  g ␮ , c02 Z  (20)  and the coefficient N ϭ (1 ϩ ⌬t 2N 20). Then, solving directly for w in (16) and substituting the result in (18), we arrive at the discrete Helmholtz equation, {[1 Ϫ ⌬t 2 c02 D H (D X , D Y )] Ϫ ⌬t 2 c02 D Z(2)N Ϫ1 D Z(1) }qЈ ϭ Q*qЈ ,  izontal to the vertical operators through a complex blend of X–Z and Y–Z derivatives and averages. The first step in constructing an effective preconditioner P (for atmospheric models) is to neglect these cross terms—to facilitate a direct inversion of P in the vertical (recall the discussion in the introduction). The resulting simplification leads to the preconditioned linear system of equations:  (21) where the modified forcing on the rhs takes the form Q*qЈ ϭ QЈqЈ Ϫ ⌬tD N QЈw . (2) Z  Ϫ1  (22)  d. Boundary conditions Lateral boundaries are open with inflow/outflow determined by the normal components of the velocity. From the perspective of the Helmholtz problem in (21), normal velocities specified at the lateral boundaries imply Neumann boundary conditions on the pressure via (12) and (13). The model top and bottom boundaries are impermeable. At the model top ztop , the terrain-following coordinate Z becomes flat [cf. (C7) in appendix C], whereupon the contravariant ‘‘vertical’’ velocity becomes identical to the vertical component of the physical velocity and they both vanish: Z˙ | z ϭ w | z ϭ 0. (23) top  top  At the lower boundary, the normal contravariant velocity vanishes (Z˙ | 0 ϭ 0), implying that the physical vertical velocity w follows coordinate surfaces according to w | 0 ϭ GS(G13 U ϩ G 23 V) | 0 .  (24)  The boundary conditions on qЈ associated with (23) and (24) are derived by substituting the momentum equations (12) and (13) into, respectively, (23) and (24) and requiring that the resulting relations be consistent with (16). Then QЈw and D Z(1) can be specified at Z ϭ 0 and Z ϭ ztop .5 3. Spectral preconditioner The discrete Helmholtz equation (21) takes the form LqЈ ϭ Q*qЈ ,  (25)  referred to early in the introduction. The symbol L on the lhs of (25) represents the entire operator appearing between the brackets {·} on the lhs of (21). For the terrain-following transformation and conformal coordinates, the generalized Laplacian-like operator D H (D X , D Y ) embedded in L includes cross-derivative terms and is nonseparable. The discrete horizontal divergence and 3D gradient, (D7) and (D4) respectively, couple the horThe reader interested in implicit formulations of boundary conditions in the context of Krylov solvers is referred to Smolarkiewicz and Margolin (2000) for a discussion. 5  VOLUME 131  Pe ϭ r  (26)  or, equivalently, {[1 Ϫ ⌬t 2 c02 Sٌ H2 ] Ϫ ⌬t 2 c02 D Z(2)N Ϫ1 D Z(1) }e ϭ r,  (27)  where ٌ H2 ϭ ␦ XX ϩ ␦ YY denotes the standard discrete horizontal Laplacian; e is the current iterate (of the Krylov solver) estimation of the solution error e ϭ qЈ Ϫ q Ј, with the overbar denoting the sought exact solution; and r is the current iterate value of the residual error r ϭ LqЈ Ϫ Q*qЈ (cf. appendix A). Because the operator on the lhs of (27) is a simple sum of the vertical and horizontal differential operators, iterative line-relaxation preconditioners (outlined in appendix B) can be easily applied at this point. Here, we advocate an alternate strategy, where a separable (in the horizontal) constantcoefficient approximation of the operator on the lhs of (27) is specified—thereby departing even further with P from the original operator L in (25)—enabling a fast direct inversion in the horizontal. The horizontal component of P in (27) is nonseparable because the metric coefficient S ϭ S(X, Y) is a function of both X and Y.6 In general, a separable operator is obtained by a cross-directional homogenization of metric coefficients—for example, by averaging S that multiplies the ␦ XX and ␦ YY parts of ٌ H2 , respectively, in the Y and X directions—whereupon the tensor-product method (Lynch et al. 1964) can be applied to construct a direct solver. This requires computing separately the eigenvalues and eigenvectors of the matrices repreY X senting the discrete operators S ␦ XX and S ␦ YY , and diagonalization follows by combining two unitary similarity transformations. Insofar as stand-alone solvers for separable problems are concerned, this approach requires O(n z n 3 ) operations and is not competitive with ADI schemes [an O(n z n 2 log 2 n) operation count] except when a fast transform [e.g., FFT with O(n z n 2 logn) operations] is available (Lynch et al. 1964). Here, n ϭ n x ϭ n y refers to the number of mesh points in either horizontal direction, and n z is the number of points in the vertical. Because (27) is merely a preconditioner, S can be globally averaged or extremized (Bernardet 1995; Smolarkiewicz and Margolin 2000), thereby enabling application of the FFT. Here, we replace S in (27) with the global mean; flat topography is specified to assure 6 In models with X–Y cross differentiation [e.g., due to the coordinate mapping for the sake of mesh adaptivity (Smolarkiewicz and Prusa 2002)] either ADI or spectral preconditioners would require neglecting these cross terms.  OCTOBER 2003  THOMAS ET AL.  constant coefficients in the vertical derivatives. The discrete operator ٌ H2 , with homogeneous Neumann boundary conditions,7 is then readily diagonalized by the real part of the Fourier transform (via the discrete cosine transform; Sweet 1973). This results in n x ϫ n y independent tridiagonal vertical problems: XY  {[1 Ϫ ⌬t 2 c02 S (␭ l ϩ ␭ m )] Ϫ ⌬t 2 c02 D Z(2)N Ϫ1 D Z(1) }eˆ ϭ rˆ, (28) where  [ ] [ ]  ␭l ϭ Ϫ  4 ␲ (l Ϫ 1) sin 2 , 2 ⌬X 2n x  l ϭ 1, . . . , n x ,  ␭m ϭ Ϫ  4 ␲ (m Ϫ 1) sin 2 , 2 ⌬Y 2n y  m ϭ 1, . . . , n y ,  are the eigenvalues of ٌ H2 in the X and Y directions, and · denotes Fourier transform. 4. Numerical experiments A suite of model runs8 is designed to compare the performance of the DCT preconditioner against the line Jacobi preconditioner, proven effective in meteorological applications (Thomas et al. 1998, 2000). The relative efficiency of the two preconditioners is measured with two Jacobi/DCT ratios [hereafter the performance ratios (PR)]: the ratio of total GCR solver iterations and the ratio of the total model CPU times. Both ratios are informative. For a given convergence criterion, the first one estimates the computational cost of the solver, independent of the implementation and machine, whereas the second assesses the total work required for the application at hand. Values of PR Ͼ 1 show efficiency gains by the DCT preconditioner. Because the residual error of the Helmholtz problem in (21) combines ⌬tnormalized first-order spatial partial derivatives of the velocity components, it motivates the stopping criterion: ࿣LqЈ Ϫ Q*qЈ ࿣ Յ ␧ min(C, L).  (29)  In (29), C and L represent the Courant ࿣⌬tv/⌬X࿣ and Lipschitz ࿣⌬t(‫ץ‬v/‫ץ‬x)࿣ numbers—correspondingly, the measures of the flow magnitude and its variations. Typically, in order to control the stability/accuracy of the Eulerian and semi-Lagrangian computations, C ϳ L ϳ O(1), whereupon ␧ specifies both the absolute admissible error and its magnitude compared to the flow. The test cases considered here are designed to evaluate the robustness and efficiency of the DCT preconditioner over a wide range of meteorological applica7 Nonperiodic boundary conditions are always homogeneous in the preconditioner (Smolarkiewicz and Margolin 2000, section 3). 8 All calculations reported were performed with 64-bit real arithmetic.  2469  tions. The first test is very special; it probes the reflexivity of our preconditioned Krylov approach. In all problems where the preconditioner is an exact solver, GCR should converge in one iteration to machine precision, regardless of the value of ␧. In the MC2 model, all smallto mesoscale problems with map scale factor m ϵ 1 in (C1) and flat lower boundary with h(X, Y) ϵ const in (C7) belong to this class, which constitutes a broad range of research applications including studies of fully developed turbulence and deep moist convection. For illustration, we consider a full-spectrum turbulent thermal convection simulation. The physical scenario is a buoyant thermal rising in a neutrally stratified environment—a 3D extension of the earlier experiment in Smolarkiewicz and Pudykiewicz (1992). The grid employed consists of 100 ϫ 100 ϫ 150 points with constant interval ⌬X ϭ ⌬Y ϭ ⌬Z ϭ 10 m; ⌬t ϭ 2.5 s. Initially, a spherical temperature anomaly of radius r 0 ϭ 250 m and ⌰Ј ϭ 0.5 K is centered in the horizontal, r 0 ϩ ⌬Z above the lower boundary. Figure 1 shows potential temperature (⌰) xz and xy cross sections through the center of the convective bubble and at z ϭ 800 m, respectively; at the end of the integration, t ϭ 180⌬t, by which time the flow becomes turbulent and exhibits a full range of scales (see Fig. 2). The solution in Figs. 1 and 2 is for the DCT preconditioner. It is independent of the stopping criterion in (29). This is documented in Fig. 3, which compares the rms divergence of the solution using the DCT and line Jacobi preconditioners for a range of ␧. Following Skamarock et al. (1997), this diagnostic is used to measure the solution convergence in elastic models. For the Jacobi preconditioner, as ␧ → 0 the curve flattens, indicating the critical value ␧ ϭ 10 Ϫ3 , below which the solution can be considered physically converged; that is, further solver iterations do not contribute to the overall accuracy of the results. Because the DCT preconditioner is a direct solver of the elliptic problem at hand, convergence is reached in a single GCR iteration regardless of the specified value of ␧. The latter is not the case for the Jacobi preconditioner, which requires an increasing number of GCR iterations as ␧ → 0. Figure 4 demonstrates this by displaying the PR for iteration count and total CPU time as a function of ␧. Here, the DCT-preconditioned solver converges in one iteration, and the PR iteration count is equivalent to the average number of GCR iterations (per model time step) using the Jacobi preconditioner. The PR CPU time documents that, even with the overhead associated with computing FFTs, the DCT preconditioner is at least an order of magnitude more efficient for all values of ␧ representing the convergent solutions. The second test simulates a smooth, vertically propagating 3D mountain wave, identical to those in Saito et al. (1998) and Thomas et al. (2000). The resulting flow is characterized by small-amplitude gravity wave response at Froude number F ϵ U/Nh ϭ 4.4, where the  2470  MONTHLY WEATHER REVIEW  VOLUME 131  FIG. 1. Potential temperature ␪ (a) x–z and (b) x–y cross sections through the center of the convective bubble at t ϭ 450 s. The xy cross sections are at z ϭ 800 m, shown by the horizontal dashed line in (a). Contour interval is 0.1 K.  linear theory of Smith (1980) is qualitatively valid. To contrast with the buoyant thermal case, the linear flow regime is chosen because it emphasizes a low signalto-noise ratio. Here, unlike in the previous case, the energy spectrum of a fully developed solution is compact, and the grid aspect ratio significantly departs from unity, reflecting the anisotropy of the problem. To quantify the effect of domain anisotropy (shallow versus deep flow scenarios) on solver performance, both hydrostatic and nonhydrostatic variants are addressed [cf.  Skamarock et al. (1997) for a discussion]. The parameters of the experiment are as follows. The bell-shaped mountain, h0 h(X, Y ) ϭ , 2 [1 ϩ (X/a) ϩ (Y/a) 2 ] 3/ 2 with the half-width a ϭ 3⌬X ϭ 3⌬Y and the height h 0 ϭ 100 m is centered at the (30, 30) grid point of the 61 ϫ 61 horizontal mesh. In the vertical, 31 exponentially stretched levels extend to 19 km, resulting in ver-  FIG. 2. Spanwise (a) x-average and (b) y-average 1D power spectra at three height levels of the solution in Fig. 1. The thin solid line shows the ‘‘Ϫ5/3’’ slope for reference.  OCTOBER 2003  THOMAS ET AL.  FIG. 3. The rms divergence as a function of ␧ in Eq. (29) for the convective bubble case shown in Fig. 1. Results with both the Jacobi (solid line) and the DCT (dashed line) preconditioners are shown.  tical grid increments ranging from 40 to 1180 m, respectively, for the bottom and top thermodynamic layers. The model is integrated over 480 time steps, when the solution is considered to have reached an approximate steady state. A constant inflow velocity of U ϭ 8 m s Ϫ1 and an isothermal atmosphere with buoyancy frequency N ϭ 0.018 s Ϫ1 are assumed. The Coriolis force and the curvature of the earth are neglected. No subgridscale parameterizations are employed. The Davies (1976) relaxation scheme is used at the lateral boundaries to minimize wave reflections. In the hydrostatic case, the grid length ⌬X ϭ 2000 m and time step ⌬t ϭ 60 s result in the advective Courant number C ϭ 0.24. The horizontal scale of the mountain is a ϭ 6000 m, so aN/U ϭ 13.5. In the nonhydrostatic case, ⌬X ϭ 400 m and ⌬t ϭ 15 s, giving C ϭ 0.3. The horizontal scale decreases to a ϭ 1200 m, so that aN/U ϭ 2.7, whereupon nonhydrostatic effects are significant. For illustration, Fig. 5 displays the converged solution. Vertical velocity cross sections are shown through the vertical center plane and on the horizontal plane at z ϭ 2100 m (3/4 of the vertical wavelength) for both cases. The maximum vertical velocities at this level are 0.06 and 0.2 m s Ϫ1 in the hydrostatic and nonhydrostatic cases, respectively. Note the downstream tilt of the wave packet (manifested by the downstream shift of the maxima) in the nonhydrostatic case. The corresponding energy spectra in the mean flow direction at z ϭ 1800 m are shown in Fig. 6. The iteration count of the Krylov solver using the DCT (solid line) and line Jacobi (dashed line) preconditioners is compared in Fig. 7 as a function of the convergence threshold ␧ in (29). The rms divergence (thick solid line) shows the critical value of ␧ ϭ 10 Ϫ5 , below which the solution is considered physically converged.9 It is apparent that the number of iterations per 9 For the exact steady-state solution, the rms divergence should vanish.  2471  FIG. 4. Performance ratios for CPU and iteration count, as a function of ␧, for the convective bubble case shown in Fig. 1.  time step required for the line Jacobi preconditioner is about 5(10) times that required for the DCT scheme in the hydrostatic (nonhydrostatic) case. However, this does not imply that the efficiency of the DCT preconditioner follows the same pattern. The computational cost of a single GCR iteration with the DCT preconditioner is much higher than with line Jacobi, whereupon the CPU time performance ratio is about 1.5 for the hydrostatic result and somewhat over 2 for the nonhydrostatic solution (see Fig. 8). The trend of decreasing relative efficacy of the DCT preconditioner with increasing hydrostacy of the problem is consistent with that observed for ADI preconditioners (Skamarock et al. 1997). It is likely related to the propagation characteristics of gravity waves present in the system. Next, the complex Scandinavian orography is used for the lower boundary to probe the solver performance for natural mesoscale flow scenarios containing many scales of motion. For this case, ⌬X ϭ ⌬Y ϭ 10 km, and ⌬Z ϭ 300 m, giving a grid aspect ratio of 0.01. The total horizontal domain size is 2000 km ϫ 2000 km (200 ϫ 200 horizontal grid points), with the model lid placed at 15 km. A uniform northwesterly ambient flow with speed of 10 m s Ϫ1 and buoyancy frequency N ϭ 0.01 s Ϫ1 are assumed. Unlike in the previous cases, a ␤-plane approximation is employed, with the map scale factor m varying accordingly in (C1). The model is integrated over 24 h with 288 5-min time steps. The converged wind solution on the lowest momentum level (approximately 150 m above the ground) is shown in Fig. 9, for illustration. Based on experience gained from the earlier test cases, a safe value ␧ ϭ 10 Ϫ6 is assumed to assure physical convergence. Performance-ratio values are 4.04 and 5.46 for CPU time and iteration count, respectively, documenting much higher gains for the DCT preconditioner compared to the isolated mountain case discussed above. This is not surprising because compact spectra limit the number of significant eigenmodes present in the solution, thereby accelerating the convergence of Krylov solvers. For the sake of com-  2472  MONTHLY WEATHER REVIEW  VOLUME 131  FIG. 5. Vertical velocity (left) x–z and (right) x–y cross sections through the mountain wave simulations at time step 480: (top) hydrostatic case and (bottom) nonhydrostatic case. The x–y cross sections are at z ϭ 2100 m. Contour intervals are 0.01 m s Ϫ1 for the hydrostatic case and 0.04 m s Ϫ1 for the nonhydrostatic case.  pleteness, a 3D ADI preconditioner was also tested in this case. The number of GCR iterations required (for the given convergence threshold) is the same as with the DCT preconditioner, but the computational overhead is substantial, owing to multiple ADI iterations within the preconditioner itself. Finally, we test the performance of the DCT preconditioner on a synoptic-scale, weather-prediction problem. An arbitrarily chosen forecast from 0000 UTC 16 September 2002 is run using initial and boundary conditions supplied by the National Centers for Environmental Prediction (NCEP) Eta Model. The MC2 model is integrated over 48 h with 480 6-min time steps. The computational domain is 200 ϫ 160 ϫ 19, with ⌬X ϭ ⌬Y ϭ 50 km on a polar stereographic projection true at 60ЊN, and the vertically stretched grid ranges from ⌬Z ϭ 72 m near the surface to 3800 m near the top at 23 km. The resulting condition number of the Helmholtz operator is extreme, O(1012 ), and the energy spectrum full. As in the Scandinavian case, ␧ ϭ 10 Ϫ6 is assumed. The resulting performance ratios are 1.93 and 4.90 for  CPU time and iteration count, respectively. The DCT preconditioner again requires fewer iterations, and the model integration rate is about twice as fast. This demonstrates that the DCT is far superior to line Jacobi, despite the large departure of the preconditioner from the full Helmholtz operator emphasized by a synopticscale forecast. To assess the relative merits of alternative Krylov methods, we also implemented the right-preconditioned flexible GMRES (FGMRES) algorithm of Saad (1993).10 For comparison, we ran the MC2 model using FGMRES(4), with DCT preconditioner, for the Scandinavian topography simulation and winter storm synoptic forecast. In both cases, we found that FGMRES requires additional iterations because it tends to converge more slowly than GCR after a restart. The model run times were nearly identical, with FGMRES giving a slight advantage in the synoptic case. To compare 10 Computational complexity estimates for the various algorithms are provided by Saad (1996).  OCTOBER 2003  THOMAS ET AL.  2473  FIG. 8. The CPU time performance ratio as a function of convergence criterion ␧ for both the hydrostatic and nonhydrostatic mountain wave cases.  FIG. 6. One-dimensional power spectra through the centers of the right-hand panels in Fig. 5. Solid and dashed lines denote the hydrostatic and nonhydrostatic case, respectively.  solutions, we examined the maximum Courant numbers (equivalent to a maximum norm) and residual L 2 norm (least squares error) of the elliptic problem over the computational domain for the GCR and FGMRES solvers. In both cases, these agree to six digits, and thus we conclude that the solutions are the same. 5. Concluding remarks We have developed a spectral preconditioner for a generalized conjugate residual (GCR) Krylov solver applied to the Helmholtz problem arising in the semiimplicit time discretization of a compressible nonhydrostatic atmospheric model. Our approach offers an alternative to the more standard and much simpler line-  relaxation, stationary block-iteration-type preconditioners used in meteorology. Despite substantial departures of the spectral preconditioner from the governing Helmholtz operator, we observed dramatic performance gains over these simpler schemes. The relative improvement in the model integration rate ranges from a factor of 2 to one order of magnitude, depending on the problem at hand. The relative performance gains are, of course, greatest when the DCT preconditioner happens to be a direct solver for the governing Helmholtz problem, that is, zero orography and constant mapscale factor characteristic of small-scale dynamics applications. In a mesoscale flow scenario over Scandinavian topography, with a broad range of scales of motion present, a factor of 4 improvement was achieved. For smooth idealized mountain wave flows, frequently used to benchmark numerical model performance, the relative gains are more modest. But extrapolating these results to natural problems underestimates the potential gains because idealized flows often have compact spectra that favor simpler schemes. For synoptic weather prediction,  FIG. 7. The rms divergence and average number of GCR iterations per time step for the (a) hydrostatic and (b) nonhydrostatic mountain wave cases. Results with both the Jacobi (JAC) and DCT preconditioner are shown.  2474  MONTHLY WEATHER REVIEW  VOLUME 131  logical applications (cf. Navon and Cai 1993). We have advocated a spectrally preconditioned GCR solver in the context of a particular model, representative of current trends in NWP. Further studies with a focus on alternative numerics, mesh refinement, complex geometries, and extreme meteorological scenarios will pose/define extreme challenges/benchmarks to the advocated approach.  FIG. 9. Wind solution for the first model level (Z ϭ 150 m) over the Scandinavian topography at 24 h with the DCT preconditioner. Topography is also shown at 500-m intervals.  Acknowledgments. The authors are grateful to Claude Girard and Michel Desgagne´ of RPN and Environment Canada, for maintaining and improving the MC2 model, and to Xingxiu Deng and Henryk Modzelewski from UBC, for preparing the synoptic case and adept computer administration, respectively. This work was supported by the National Science Foundation, the Department of Energy Climate Change Prediction Program (CCPP), the Canadian Natural Science and Engineering Council, RPN and Environment Canada, the Canadian Foundation for Innovation, BC Knowledge Development Fund, the Geophysical Disaster Computational Fluid Dynamics Center, and UBC Research Endowments. The comments of two anonymous referees helped to improve the presentation. APPENDIX A Generalized Conjugate-Residual Approach  where the shallowness of the atmosphere implies an extremely poorly conditioned problem, the performance improvement is still a factor of 2. Spectral methods rely on global basis functions, and computing their expansion coefficients requires multiple evaluations of global sums. Consequently, spectral transform methods are criticized as being inappropriate for modern, distributed-memory parallel architectures. Nevertheless, in the context of FFT-based preconditioners for iterative Krylov methods, Elman and O’Leary (1998) documented overall parallel speedups despite the large data communication overheads. In meteorology, the vertical direction is physically distinct and typically not partitioned across the processors in numerical models. Thus, a massively parallel implementation of spectral preconditioners can be based on a data-transposition strategy (Thomas et al. 2002) similar to that used in the European Centre for MediumRange Weather Forecasts (ECMWF) weather prediction model Integrated Forecast System (IFS; Ska˚lin 1997; Barros and Kauranne 1994). The performance of the next-generation (all-scale, nonhydrostatic semi-implicit, weather and climate prediction/research) numerical models in meteorology depends critically on the performance of elliptic solvers employed. It is widely recognized that the state-of-theart methods rely on preconditioned nonsymmetric Krylov algorithms. Given the extensive body of mathematical literature, still relatively little numerical experience exists with such solvers in advanced meteoro-  Here we describe the preconditioned GCR(k) algorithm used in this study. In general, we consider a linear elliptic problem, ϩ D ⌿΃ Ϫ A⌿ ϭ Q, ͸ ‫ץ‬x‫΂ ץ‬͸ C ‫ץ‬⌿ ‫ץ‬x 3  L(⌿) ϵ  3  IJ  Iϭ1  I  Jϭ1  I  J  (A1)  with variable coefficients A, C IJ , D I , Q, and either periodic, Dirichlet, or Neumann boundary conditions and adopt the following notation: The discrete representation of a field on the grid is denoted by the subscript i; the discrete representation of the elliptic operator on the lhs of (A1) is denoted by L i (⌿); and the inner product ͗␰␨ ͘ ϵ ⌺ i ␰ i ␨ i . The preconditioner P is a linear operator that approximates L to a greater or lesser degree but is easier to invert. Its role is to substitute (A1) with an auxiliary problem that converges faster (than the original problem) because of a closer clustering of the eigenvalues of the auxiliary operator resulting from the superposition of L and P Ϫ1 . In this paper, we are primarily concerned with ‘‘left’’ preconditioning that substitutes (A1) with an auxiliary problem P Ϫ1 [L(⌿) Ϫ Q] ϭ 0. Its accelerated convergence exploits spectral properties of P Ϫ1L. In general, ‘‘right’’ preconditioning is also possible. It augments (A1) with LP Ϫ1 [P (⌿)] ϭ Q, and its convergence relies on reducing the spectral radius of LP Ϫ1 . Left preconditioning assumes P constant during the iteration process, whereas right preconditioning allows for variable P as, for example, in the FGMRES solver of Saad (1993).  OCTOBER 2003  The GCR(k) method of Eisenstat et al. (1983) may be derived via variational arguments (cf. Smolarkiewicz and Margolin 1994, 2000). In essence, we augment (A1) with a kth-order damped oscillation equation: ‫ ץ‬kP (⌿) 1 ‫ ץ‬kϪ1P (⌿) 1 ‫ץ‬P (⌿) ϩ ϩ ··· ϩ k ‫␶ץ‬ TkϪ1 (␶ ) ‫ ␶ץ‬kϪ1 T1 (␶ ) ‫␶ץ‬ ϭ L(⌿) Ϫ Q.  (A2)  For n ϭ 1, 2, . . .until convergence do for ␯ ϭ 0, . . . , k Ϫ 1 do  ␤ϭϪ  ␯  ⌿␯i ϩ1 ϭ ⌿␯i ϩ ␤p␯i , r␯i ϩ1 ϭ r␯i ϩ ␤L i ( p␯ ), exit if ࿣r␯ϩ1࿣ Յ ␧, e i ϭ P iϪ1 (r␯ϩ1 ),  [͸ 3  Iϭ1  ‫ץ‬ ‫ץ‬x I  ΂͸ C 3  IJ  Jϭ1  ∀ lϭ0,␯ ␣ l ϭ Ϫ  ]  ΃  ‫ץ‬e ϩ D I e Ϫ Ae , ‫ץ‬x J i  ͗L(e)L( p l )͘ , ͗L( p l )L( p l )͘  ͸␣p, ) ϭ L (e) ϩ ͸ ␣ L ( p ),  p␯i ϩ1 ϭ e i ϩ  ␯  l  l i  lϭ0  ␯  L i ( p␯ϩ1  l  i  l  Line-Relaxation Preconditioners A distinctive feature of meteorological flows is their anisotropy in the vertical direction—the larger the ratio of the horizontal scale of the flow to the fluid depth, the stiffer the elliptic problem. A class of simple yet effective preconditioners that mitigate this aspect (of elliptic problems in meteorology) derives from an implicit Richardson iteration (Smolarkiewicz and Margolin 2000), e ␮ϩ1 Ϫ e ␮ ϭ P h (e ␮ ) ϩ P z (e ␮ϩ1 ) Ϫ r␯ϩ1 , ⌬␶˜  (B1)  a realization of the preconditioning step e ϭ P Ϫ1 (r ␯ϩ1 ) of the GCR(k) solver summarized in appendix A. Here, P h and P z are the horizontal and the vertical counterparts of the operator P, respectively; ⌬␶˜ is a parameter of the iteration (a pseudo-time step) based on spectral properties of P h [viz., linear stability analysis of (B1)]; ␮ numbers successive Richardson iterations; and ␯ numbers the outer iterations of the Krylov solver. Equation (B1) leads to a linear problem,  ͗r L( p )͘ , ͗L( p␯ )L( p␯ )͘ ␯  erator on the grid takes place only once per iteration following the preconditioning e ϭ P Ϫ1 (r ␯ϩ1 ), which provides an estimate of the solution error e ␯ϩ1 ϭ ⌿ ␯ϩ1 Ϫ ⌿exact . APPENDIX B  Then we discretize (A2) in a pseudotime ␶ to form the affine discrete equation for the progression of the residual errors r and determine the optimal parameters T1 , . . . , T kϪ1 and integration increment ⌬ ␶ (variable in ␶) that assure minimization of the residual errors in the norm defined by the inner product ͗rr͘. This leads to the following algorithm. For any initial guess ⌿ i0, set r i0 ϭ L i (⌿ 0 ) Ϫ Q i , p i0 0 ϭ P Ϫ1 i (r ); then iterate:  L i (e) ϭ  2475  THOMAS ET AL.  i  (I Ϫ ⌬ ␶˜ P z )e ␮ϩ1 ϭ R˜ ␮ ,  (B2)  where R˜ ␮ ϵ e ␮ ϩ ⌬␶˜ [P h (e ␮ ) Ϫ r ␯ϩ1 ], that can be solved readily using the well-known tridiagonal algorithm (cf. appendix A in Roache 1972). Alternating implicit discretization between the vertical and horizontal counterparts of P in fractional steps of ␶˜ leads to the ADI preconditioners of Skamarock et al. (1997). The implicit Richardson preconditioner in (B1) can be further improved. Consider extending the Richardson diffusion scheme with respect to P h to the diagonally preconditioned Duffort–Frankel type implicit algorithm,  lϭ0  end do, reset [⌿, r, p, L( p)] ik to [⌿, r, p, L( p)] i0 , end do. For the convergence, the GCR(k) algorithm above requires LP Ϫ1 to be negative definiteA1 but not necessarily self-adjoint.A2 Direct evaluation of the elliptic opA1 An operator A is said to be definite if ͗␰ A(␰ )͘ is either strictly positive (positive definite) or strictly negative (negative definite) for all ␰. A2 An operator A is said to be self-adjoint if ͗␰ A(␨ )͘ ϭ ͗␨ A(␰ )͘ for all ␰ and ␨.  De ␮ϩ1 Ϫ De ␮ ϭ P h (e ␮ ) Ϫ D(e ␮ϩ1 Ϫ e ␮ ) ⌬␶˜ ϩ P z (e ␮ϩ1 ) Ϫ r␯ϩ1 ,  (B3)  where (Ϫ1)D stands for the diagonal coefficient embedded within the matrix representing P h on the grid. Note that adding the relaxation term on the rhs of (B1) has the effect of replacing the De ␮ term with D(e ␮ϩ1 ) in P h (e ␮ ) without complicating flux boundary conditions imposed in constructing P h (e) (cf. Smolarkiewicz and Margolin 2000 for discussions). In the limit ⌬␶˜ → ϱ, (B3) is equivalent to the block Jacobi preconditioner proven effective in meteorological applications (Thomas et al. 1998, 2000).  2476  MONTHLY WEATHER REVIEW  APPENDIX C  VOLUME 131  we employ the classical transformation of Gal-Chen and Sommerville (1975), where  Coordinate Systems For a polar stereographic projection, true at longitude ␾ 0 , m is the map scale factor and S ϭ m 2 : 1 ϩ sin␾0 mϭ . 1 ϩ sin␾  (C1)  The components (U, V) of the wind ‘‘image’’ are defined in terms of the scaled and rotated components of the true horizontal velocity (u, ␷ ) in local (x, y, z) coordinates,  [] [  U 1 Ϫsin␭ ϭ V m cos␭  ][ ]  Ϫcos␭ u , Ϫsin␭ ␷  (C2)  where ␭ represents longitude. The (X, Y) components of the conformal coordinates are scaled and rotated according to  [] [  dX Ϫsin␭ ϭm dY cos␭  ][ ]  Ϫcos␭ dx . Ϫsin␭ dy  (C3)  In conformal coordinates (X, Y, z), the Euler equations take the following form:  ΂  ΃  1 Dq ‫ץ‬U ‫ץ‬V ‫ץ‬w ϭ ϪS ϩ Ϫ , ␥ Dt ‫ץ‬X ‫ץ‬Y ‫ץ‬z DU ‫ץ‬S ‫ץ‬q ϭ fV Ϫ K Ϫ RT , Dt ‫ץ‬X ‫ץ‬X  Z ϭ ztop  G13 ϭ  ΂ ΃  1 ‫ץ‬z , G ‫ץ‬X Z  ‫␺ץ‬  ΃  z  ΂‫ץ‬X ϩ ‫ץ‬Y΃ ϩ ‫ץ‬z , ‫ץ‬U  ‫ץ‬V  ‫ץ‬w  Gϭ  ϩ G13  ‫␺ץ‬ , ‫ץ‬Z  ϩ G 23  ‫␺ץ‬ , ‫ץ‬Z  Z  Z  ‫␺ץ‬ 1 ‫␺ץ‬ ϭ . ‫ץ‬z G ‫ץ‬Z  ‫ץ‬z . ‫ץ‬Z (C8)  (C9)  The contravariant ‘‘vertical’’ velocity is related to the physical velocity components by the equation W ϵ Z˙ ϭ G Ϫ1 w ϩ S(G13 U ϩ G 23 V). (C10) APPENDIX D  (C5)  Numerical Operators  The divergence of (U, V, w) is given by DϭS  ‫␺ץ‬  z  (C4)  ΂  ΂ ΃  1 ‫ץ‬z , G ‫ץ‬Y Z  Admitting an optional, variable spacing of vertical levels modifies the metric coefficients in (C8) via multiplicative factors (cf. Thomas et al. 1998). The terrainfollowing system is right-handed, where Z increases from the bottom of the model domain to the top, and the Z surface corresponding to the lower boundary is conformal to the terrain. Thus, by definition, G is always positive. Partial derivatives of any variable ␺ are to be interpreted for terrain-following coordinates as follows:  Dw ‫ץ‬q ϭ Ϫg Ϫ RT , Dt ‫ץ‬z  D ‫ץ‬ ‫ץ‬ ‫ץ‬ ‫ץ‬ ϭ ϩS U ϩV ϩw . Dt ‫ץ‬t ‫ץ‬X ‫ץ‬Y ‫ץ‬z  G 23 ϭ  ΂‫ץ‬X΃ ϭ ΂‫ץ‬X΃ ‫␺ץ‬ ‫␺ץ‬ ΂‫ץ‬Y΃ ϭ ΂‫ץ‬Y΃  where K ϭ (U 2 ϩ V 2 )/2 is the pseudohorizontal kinetic energy per unit mass. The material derivative is written as  (C7)  Because Z in (C7) is assumed to be time independent [cf. Prusa et al. (1996) for time-dependent extensions], the resulting Z-coordinate system represents a nondeformable system with surfaces of constant Z fixed in physical space—in contrast to various pressure-based vertical coordinates used in hydrostatic weather-prediction models. The mapping from conformal (X, Y, z) coordinates to curvilinear (X, Y, Z) coordinates is represented by the metric coefficients G IJ of the transformation:  DV ‫ץ‬S ‫ץ‬q ϭ Ϫ fU Ϫ K Ϫ RT , Dt ‫ץ‬Y ‫ץ‬Y  DT RT Dq Ϫ ϭ 0, Dt c p Dt  z Ϫ h(X, Y ) . ztop Ϫ h(X, Y )  The centered averaging ␮ ␨ and differencing ␦ ␨ operators are defined as (C6)  where the horizontal divergence of (U, V) on the rhs is denoted by D H (U, V) throughout the body of the paper. To account for the orography one can introduce a terrain-following coordinate system, where any monotonic function of geometrical height can be used as a vertical coordinate, denoted by Z. Throughout this paper  Ά΂  ΃ ΂  ΃· ,  Ά΂  ΃ ΂  ΃· ,  ␮␨ ␺ ϭ  1 ⌬␨ ⌬␨ ␺ ␨ϩ ϩ␺ ␨Ϫ 2 2 2  ␦␨ (␺) ϭ  1 ⌬␨ ⌬␨ ␺ ␨ϩ Ϫ␺ ␨Ϫ ⌬␨ 2 2  (D1)  (D2)  where ␺ is a dependent model variable that may be  OCTOBER 2003  THOMAS ET AL.  defined at the center of a grid cell or on the grid cell faces. Multiple averages are denoted by  ␮ XY ␺ ϭ ␺  XY  X  Y  Y  X  ϭ (␺ ) ϭ (␺ ) .  (D3)  Discrete horizontal and vertical derivative operators are defined as follows: D X ϭ ␦ X ϩ G13 ␮ XZ ␦ Z , D Y ϭ ␦ Y ϩ G 23 ␮ YZ ␦ Z , 1  DZ ϭ  G  XYZ  ␦Z.  (D4)  The discrete divergence operator D, acting on the contravariant wind components, can be written as follows: D(U, V, W ) ϭ  1 G  YZ  XYZ  XZ  {S[␦ X G U ϩ ␦ Y G V ]} XY  ϩ ␦ Z G W.  (D5)  The operator in (D5) satisfies the Gaussian divergence theorem in discrete form (Thomas et al. 1998). The Jacobian G of the coordinate transformation is located at grid cell vertices and D(U, V, W) at cell centers. The contravariant velocity component W, defined in (C10), is computed from XY  X  G W ϭ w ϩ S[G1 U ϩ G2 V X  Y  Z  ], Y  G1 ϭ GG13 ,  G2 ϭ GG 23 ,  (D6)  and the formal discrete flow divergence (D5) is then implemented as D(U, V, W) ϭ D H (U, V) ϩ ␦ z w, where D H (U, V ) ϭ S␦ X U ϩ  S G  YZ  XYZ  [␦ X G ␮ X ϩ ␮ Z ␦ Z G1 ␮ X ]U  ϩ S␦ Y V ϩ  S G  XYZ  XY  [␦ Y G ␮ Y ϩ ␮ Z ␦ Z G2 ␮ Y ]V.  (D7)  REFERENCES Arakawa, A., and V. Lamb, 1977: Computational design of the basic dynamical processes of the UCLA general circulation model. Methods in Computational Physics, Vol. 17, J. Chang, Ed., Academic Press, 174–265. Axelsson, O., 1994: Iterative Solution Methods. Cambridge University Press, 654 pp. Barros, S., and T. Kauranne, 1994: On the parallelization of global spectral weather models. Parallel Comput., 20, 1335–1356. Benoit, R., M. Desgagne´, P. Pellerin, S. Pellerin, Y. Chartier, and S. Desjardins, 1997: The Canadian MC2: A semi-Lagrangian, semiimplicit wide-band atmospheric model suited for fine-scale process studies and simulations. Mon. Wea. Rev., 125, 2382–2415. Bernardet, P., 1995: The pressure term in the anelastic model: A symmetric elliptic solver for an Arakawa C grid in generalized coordinates. Mon. Wea. Rev., 123, 2474–2490. Bourke, W., 1974: A multi-level spectral model. I. Formulation and hemispheric integrations. Mon. Wea. Rev., 102, 687–701.  2477  Cullen, M., 1990: A test of a semi-implicit integration technique for a fully compressible nonhydrostatic model. Quart. J. Roy. Meteor. Soc., 116, 1253–1258. Davies, H., 1976: A lateral boundary formulation for multi-level prediction models. Quart. J. Roy. Meteor. Soc., 102, 405–418. Eisenstat, S., H. Elman, and M. Schultz, 1983: Variational iterative methods for nonsymmetric systems of linear equations. SIAM J. Numer. Anal., 2, 345–357. Elman, H., and D. O’Leary, 1998: Efficient iterative solution of the three-dimensional Helmholtz equation. J. Comput. Phys., 142, 163–181. Freund, R., 1993: A transpose-free quasi-mimimum residual algorithm for non-Hermitian linear systems. SIAM J. Sci. Stat. Comput., 14, 470–482. Gal-Chen, T., and R. Sommerville, 1975: On the use of a coordinate transformation for the solution of the Navier–Stokes equations. J. Comput. Phys., 17, 209–228. Golding, B., 1992: An efficient nonhydrostatic forecast model. Meteor. Atmos. Phys., 50, 89–103. Grabowski, W., and P. Smolarkiewicz, 2002: A multiscale model for meteorological research. Mon. Wea. Rev., 130, 939–956. Greenbaum, A., 1997: Iterative Methods for Solving Linear Systems. Society of Industrial and Applied Mathematics, 220 pp. Hestenes, M., and E. Stiefel, 1952: Method of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand. (U.S.), 49, 409–436. Kadogliu, H., and S. Mudrick, 1992: On the implementation of the GMRES(m) method to elliptic equations in meteorology. J. Comput. Phys., 102, 348–359. Kapitza, H., and D. Eppel, 1992: The nonhydrostatic mesoscale model GESIMA. Part I: Dynamical equations and tests. Beitr. Phys. Atmos., 65, 129–146. Lafore, J., and Coauthors, 1998: The Meso–NH atmospheric modeling system. Part I: Adiabatic formulation and control simulations. Ann. Geophys., 16, 90–109. Laprise, R., D. Caya, G. Bergeron, and M. Gigue`re, 1997: The formulation of the Andre´ Robert MC2 (Mesoscale Compressible Community) model. Atmos.–Ocean, 35, 127–152. Lynch, R., J. Rice, and D. Thomas, 1964: Direct solution of partial difference equations by tensor product methods. Numer. Math., 6, 185–199. Machenhauer, B., and R. Daley, 1972: A baroclinic primitive equation model with a spectral representation in three dimensions. Tech. Rep. 4, Institute of Theoretical Meteorology, University of Copenhagen, 63 pp. Marshall, J., C. Hill, L. Perelman, and A. Adcroft, 1997: Hydrostatic, quasi-hydrostatic and non-hydrostatic ocean modeling. J. Geophys. Res., 102, 5733–5752. Navon, I., and Y. Cai, 1993: Domain decomposition and parallel processing of a finite element model of the shallow water equations. Comput. Methods Appl. Mech. Eng., 106, 179–212. Prusa, J., P. Smolarkiewicz, and R. Garcia, 1996: On the propagation and breaking at high altitudes of gravity waves excited by tropospheric forcing. J. Atmos. Sci., 53, 2186–2216. Roache, P., 1972: Computational Fluid Dynamics. Hermosa, 446 pp. Robert, A., 1969: Integration of a spectral model of the atmosphere by the implicit method. Proc. WMO/IUGG Symp. on NWP, Vol. 7, Tokyo, Japan, Japan Meteorological Agency, 19–24. ——, 1981: A stable numerical integration scheme for the primitive meteorological equations. Atmos.–Ocean, 19, 35–46. ——, 1993: Bubble convection experiments with a semi-implicit formulation of the Euler equations. J. Atmos. Sci., 50, 1865–1873. ——, T. Yee, and H. Ritchie, 1985: A semi-Lagrangian semi-implicit numerical integration scheme for multilevel atmospheric models. Mon. Wea. Rev., 113, 388–394. Saad, Y., 1993: A flexible inner-outer preconditioned GMRES algorithm. SIAM J. Sci. Stat. Comput., 14, 461–469. ——, 1996: Iterative Methods for Sparse Linear Systems. PWS, 447 pp. ——, and M. Schultz, 1986: GMRES: A generalized minimal residual  2478  MONTHLY WEATHER REVIEW  algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 7, 856–869. Saito, K., U. Scha¨ttler, and U. Steppler, 1998: 3D mountain waves by the Lokal-Modell of DWD and the MRI mesoscale nonhydrostatic model. Pap. Meteor. Geophys., 49, 7–19. Schumann, U., and H. Volkert, 1987: Three-dimensional mass- and momentum-consistent Helmholtz-equation in terrain-following coordinates. Notes on Numerical Fluid Mechanics, W. Hackbusch, Ed., Vol. 10, Springer-Verlag, 109–131. ——, and R. Sweet, 1988: Fast Fourier transforms for direct solution of Poisson’s equation with staggered boundary conditions. J. Comput. Phys., 75, 123–137. Ska˚lin, R., 1997: Scalability of parallel gridpoint limited-area atmospheric models. Part II: Semi-implicit time-integration schemes. J. Atmos. Oceanic Technol., 14, 442–455. Skamarock, W., P. Smolarkiewicz, and J. Klemp, 1997: Preconditioned conjugate-residual solvers for Helmholtz equations in nonhydrostatic models. Mon. Wea. Rev., 125, 587–599. Smith, R., 1980: Linear theory of stratified hydrostatic flow past an isolated mountain. Tellus, 32, 348–364. Smolarkiewicz, P., and J. Pudykiewicz, 1992: A class of semi-Lagrangian approximations for fluids. J. Atmos. Sci., 49, 2082– 2096. ——, and L. Margolin, 1994: Variational solver for elliptic problems in atmospheric flows. Appl. Math. Comput. Sci., 4, 527–551. ——, and ——, 1997: On forward-in-time differencing for fluids: An Eulerian/semi-Lagrangian nonhydrostatic model for stratified flows. Atmos.–Ocean, 35, 127–152. ——, and ——, 2000: Variational methods for elliptic problems in fluid models. Proc. ECMWF Workshop on Developments in Nu-  VOLUME 131  merical Methods for Very High Resolution Global Models, Reading, United Kingdom, ECMWF, 137–159. ——, and J. Prusa, 2002: VLES modelling of geophysical fluids with nonoscillatory forward-in-time schemes. Int. J. Numer. Methods Fluids, 39, 779–819. ——, V. Grubisıˇc´, and L. Margolin, 1997: On forward-in-time differencing for fluids: Stopping criteria for iterative solutions of anelastic pressure equations. Mon. Wea. Rev., 125, 647–654. Sweet, R., 1973: Direct methods for the solution of Poisson’s equation on a staggered grid. J. Comput. Phys., 12, 422–428. Tanguay, M., A. Robert, and R. Laprise, 1990: A semi-implicit semiLagrangian fully compressible regional forecast model. Mon. Wea. Rev., 118, 1970–1980. Tapp, M., and P. White, 1976: A nonhydrostatic mesoscale model. Quart. J. Roy. Meteor. Soc., 102, 277–296. Thomas, S., C. Girard, R. Benoit, M. Desgagne´, and P. Pellerin, 1998: A new adiabatic kernel for the MC2 model. Atmos.–Ocean, 36, 241–270. ——, ——, G. Doms, and U. Scha¨ttler, 2000: Semi-implicit scheme for the DWD Lokal-Modell. Meteor. Atmos. Phys., 73, 105–125. ——, J. Hacker, M. Desgagne´, and R. Stull, 2002: An ensemble analysis of forecast errors related to floating point performance. Wea. Forecasting, 17, 898–906. Tokioka, T., 1978: Some considerations on vertical differencing. J. Meteor. Soc. Japan, 56, 98–111. van der Vorst, H., 1992: Bi-CGStab: A fast and smoothly converging variant of Bi-CG for the solution of non-symmetric linear systems. SIAM J. Sci. Stat. Comput., 12, 631–644. Yeh, K.-S., J. Cote, S. Gravel, A. Methot, A. Patoine, M. Roch, and A. Staniforth, 2002: The CMC–MRB Global Environmental Multiscale (GEM) model. Part III: Nonhydrostatic formulation. Mon. Wea. Rev., 130, 339–356.  


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items