UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

A higher-order unstructured finite volume solver for three-dimensional compressible flows Hoshyari, Shayan 2017

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata


24-ubc_2017_november_hoshyari_shayan.pdf [ 3.2MB ]
JSON: 24-1.0354983.json
JSON-LD: 24-1.0354983-ld.json
RDF/XML (Pretty): 24-1.0354983-rdf.xml
RDF/JSON: 24-1.0354983-rdf.json
Turtle: 24-1.0354983-turtle.txt
N-Triples: 24-1.0354983-rdf-ntriples.txt
Original Record: 24-1.0354983-source.json
Full Text

Full Text

A Higher-Order Unstructured Finite Volume Solver forThree-Dimensional Compressible FlowsbyShayan HoshyariB.Sc., Sharif University of Technology, 2015a thesis submitted in partial fulfillmentof the requirements for the degree ofMASTER OF APPLIED SCIENCEinthe faculty of graduate and postdoctoral studies(Mechanical Engineering)The University of British Columbia(Vancouver)August 2017© Shayan Hoshyari, 2017AbstractHigh-order accurate numerical discretization methods are attractive for their potential to significantlyreduce the computational costs compared to the traditional second-order methods. Among the variousunstructured higher-order discretization schemes, the k-exact reconstruction finite volume method is ofinterest for its straightforward mathematical formulation, and its compatibility with the current lower-order industrial solvers. However, current three-dimensional finite volume solvers are limited to thesolution of inviscid and laminar viscous flow problems. Since three-dimensional turbulent flows appearin many industrial applications, the current thesis takes the first step towards the development of a three-dimensional higher-order finite volume solver for the solution of both inviscid and viscous turbulentsteady-state flow problems.The k-exact finite volume formulation of the governing equations is rederived in a dimension-independentmanner, where the negative Spalart-Allmaras turbulence model is employed. This one-equation modelis reasonably accurate for many flow conditions, and its simplicity makes it a good starting point for thedevelopment of numerical algorithms. Then, the three-dimensional mesh preprocessing steps for a finitevolume simulation are presented, including higher-order accurate numerical quadrature, and capturingthe boundary curvature in highly anisotropic meshes. Also, the issues of k-exact reconstruction inhandling highly anisotropic meshes are reviewed and addressed.Since three-dimensional problems can require much more memory than their two-dimensional counter-parts, solution methods that work in two dimensions might not be feasible in three dimensions anymore.As an attempt to overcome this issue, a practical and parallel scalable method for the solution of thediscretized system of nonlinear equations is presented.Finally, the solution of four three-dimensional test problems are studied: Poisson’s equation in a cubicdomain, inviscid flow over a sphere, turbulent flow over a flat plate, and turbulent flow over an extrudedNACA0012 airfoil. The solution is verified, and the resource consumption of the flow solver is measured.The results demonstrate the benefit and practicality of using higher-order methods for obtaining a certainlevel of accuracy.iiLay SummaryThe finite volume method is a popular numerical scheme for the solution of aerodynamic flow problems.Although conventional finite volume schemes aremostly second-order accurate, there has been a growinginterest in higher-order accurate numerical methods, since they can be considerably more efficient interms of computational resources for achieving a certain level of accuracy.Higher-order finite volumemethods have been successfully developed for awide range of two-dimensionalflow problems. Nevertheless, numerical methods that work in two dimensions might not be feasible inthree dimensions anymore, since three-dimensional problems can require much more memory than theirtwo-dimensional counterparts. The current thesis aims at identifying, and resolving such shortcomingsto construct a working three-dimensional finite volume solver with an emphasis on turbulent flows. Sub-sequently, three-dimensional benchmark flow problems are solved to verify and assess the performanceof the developed numerical method.iiiPrefaceAll the work presented in this thesis is an intellectual product of a working relationship between ShayanHoshyari and Dr. Carl Ollivier-Gooch. The implementation of the methods, the data analysis, and themanuscript preparations were done by Shayan Hoshyari with invaluable guidance from Carl Ollivier-Gooch throughout the process.ivTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiLay Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiiList of Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xAcknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.2 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31.3 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52.1 The Finite Volume Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52.2 K-exact Reconstruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72.3 Studied Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92.3.1 Poisson’s Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92.3.2 Navier-Stokes Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102.3.3 Extension to Turbulent Flows . . . . . . . . . . . . . . . . . . . . . . . . . . 112.4 Numerical Flux Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132.4.1 Inviscid Flux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132.4.2 Viscous Flux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 Three-Dimensional Mesh Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17v3.1 Element Mapping and Quadrature . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173.2 Creating Curved Anisotropic Meshes . . . . . . . . . . . . . . . . . . . . . . . . . . . 203.3 Modified Basis Functions for Highly Anisotropic Meshes . . . . . . . . . . . . . . . . 244 Solving the Discretized System of Equations . . . . . . . . . . . . . . . . . . . . . . . . . 284.1 Pseudo Transient Continuation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284.2 Linear Solvers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304.3 Preconditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304.3.1 Point Gauss-Seidel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314.3.2 Block Jacobi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334.3.3 ILU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334.4 Improved Preconditioning Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . 344.4.1 Inner GMRES Iterations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344.4.2 Lines of Strong Coupling Between Unknowns . . . . . . . . . . . . . . . . . . 354.5 Numerical Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365 Three-Dimensional Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415.1 Poisson’s Equation in a Cubic Domain . . . . . . . . . . . . . . . . . . . . . . . . . . 415.2 Inviscid Flow Around a Sphere . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425.3 Turbulent Flow Over a Flat Plate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485.4 Turbulent Flow Over an Extruded Airfoil . . . . . . . . . . . . . . . . . . . . . . . . . 516 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60A ANSLib Command-Line Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66A.1 Two-Dimensional Turbulent Flow over a NACA 0012 Airfoil from Section 4.5 . . . . . 66A.2 Poisson’s Equation from Section 5.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 69A.3 Inviscid Flow Over a Sphere from Section 5.2 . . . . . . . . . . . . . . . . . . . . . . 69A.4 Turbulent Flow Over a Flat Plate from Section 5.3 . . . . . . . . . . . . . . . . . . . . 70A.5 Turbulent Flow Over an Extruded NACA 0012 Airfoil from Section 5.4 . . . . . . . . 71B Sample Script for Running a Parallel Job on Grex . . . . . . . . . . . . . . . . . . . . . 73viList of TablesTable 2.1 Dimensionless empirical parameters used in the negative S-A model . . . . . . . . 13Table 3.1 Performance of the linear elasticity solver . . . . . . . . . . . . . . . . . . . . . . . 24Table 4.1 Preconditioning methods considered . . . . . . . . . . . . . . . . . . . . . . . . . 38Table 4.2 Performance comparison for different preconditioning schemes . . . . . . . . . . . 40Table 4.3 Computed drag and lift coefficients for the NACA 0012 airfoil . . . . . . . . . . . . 40Table 5.1 Number of iterations and the resource consumption of the flow solver for the sphereproblem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47Table 5.2 Computed value and convergence order of the drag coefficient and the skin frictioncoefficient at the point x = (0.97, 0, 0.5) for the flat plate problem . . . . . . . . . . 51Table 5.3 Number of iterations and the resource consumption of the flow solver for the flat plateproblem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52Table 5.4 Computed drag and lift coefficients for the extruded NACA 0012 airfoil problem . . 54Table 5.5 Resource consumption of the flow solver for the extruded NACA 0012 problem . . . 56viiList of FiguresFigure 2.1 Illustration of terminology in finite volume discretization . . . . . . . . . . . . . . 6Figure 2.2 Reconstruction stencils for a control volume . . . . . . . . . . . . . . . . . . . . . 9Figure 3.1 Control volumes for cell-centered and cell-vertex methods . . . . . . . . . . . . . 18Figure 3.2 First order reference Lagrange elements . . . . . . . . . . . . . . . . . . . . . . . 19Figure 3.3 Geometry of the curved mesh generation test case . . . . . . . . . . . . . . . . . . 23Figure 3.4 Curved mesh displacement magnitude . . . . . . . . . . . . . . . . . . . . . . . . 24Figure 3.5 Residual per iteration for the linear elasticity solver . . . . . . . . . . . . . . . . . 25Figure 3.6 Construction of approximate wall coordinates for a two-dimensional control volume 26Figure 4.1 Schematic of Block Jacobi decomposition . . . . . . . . . . . . . . . . . . . . . . 33Figure 4.2 Mesh and lines of strong unknown coupling for the geometry of a NACA 0012 airfoil 38Figure 4.3 Comparison of residual histories for the NACA 0012 problem obtained using differ-ent preconditioning algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39Figure 4.4 Residual history for the NACA 0012 problem on the finest mesh of 400K controlvolumes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39Figure 5.1 Exact solution of Poisson’s problem . . . . . . . . . . . . . . . . . . . . . . . . . 42Figure 5.2 Different meshes for the solution of Poisson’s problem . . . . . . . . . . . . . . . 43Figure 5.3 Discretization error versus mesh size for Poisson’s problem . . . . . . . . . . . . . 44Figure 5.4 Symmetric cut of the coarsest mesh near the sphere surface . . . . . . . . . . . . . 44Figure 5.5 Relative entropy norm versus mesh size for the sphere problem . . . . . . . . . . . 45Figure 5.6 Computed Mach contours on the x3 = 0 symmetry plane for the sphere problem . . 46Figure 5.7 Norm of the residual vector per PTC iteration for the sphere problem . . . . . . . . 47Figure 5.8 The parallel speedup of the solver for the sphere problem . . . . . . . . . . . . . . 48Figure 5.9 Coarsest mesh for the flat plate problem. . . . . . . . . . . . . . . . . . . . . . . . 49Figure 5.10 Distribution of the turbulence working variable on the plane x3 = 0.5 for the flatplate problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49Figure 5.11 Distribution of the nondimensional eddy viscosity on the line (x1 = 0.97)∧(x3 = 0.5)for the flat plate problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50Figure 5.12 Norm of the residual vector per PTC iteration for the flat plate problem . . . . . . 51viiiFigure 5.13 The parallel speedup of the solver for the flat plate problem . . . . . . . . . . . . . 52Figure 5.14 Meshes for the extruded NACA 0012 problem . . . . . . . . . . . . . . . . . . . . 53Figure 5.15 Distribution of the turbulenceworking variable for the extrudedNACA0012 problemon the x3 = 0 plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54Figure 5.16 Distribution of surface pressure coefficient on the intersection of the extruded NACA0012 airfoil and the x3 = 0.5 plane. . . . . . . . . . . . . . . . . . . . . . . . . . 55Figure 5.17 Norm of the residual vector per PTC iteration for the extruded airfoil problem . . . 55ixList of SymbolsOrdersk Order of reconstructionl Order of Lagrangian polynomialsp Order of finite element basis functionsq Order of accuracy for a quadrature ruleNumber of EntitiesNunk Number of unknownsNDOF Number of degrees of freedomNdim Number of dimensionsNrec Number of reconstruction basis functionsNlag Number of Lagrangian basis functionsNqua Number of quadrature pointsNCV Number of control volumesNumerical Discretizationx Cartesian coordinatesL Differential operatorR Set of rational numbersu Solution vectorxh Mesh representative length scaleTh Set of all control volumesuh Discrete solution vectorUh Vector of degrees of freedomeh Discretization errorR Residual vectorΩ Domain∂Ω Boundary of domainn Surface unit normalτ A control volume∂τ Boundary of a control volumeΩτ Volume or area of a control volumeψ A Finite element or Lagrangian basis functionφ A Cartesian coordinates monomialϕ∗ A Principal coordinates monomialϕ+ A Curvilinear coordinates monomialxτ Reference location of a control volumewτ Principal coordinates for a control volumeξτ Curvilinear coordinates for a control volumetτ Tangent to wall coordinate for a control volumedτ Normal to wall coordinate for a control volumeF Inviscid flux matrixQ Viscous flux matrixS Source term vectorF Numerical inviscid flux functionQ Numerical viscous flux functionxiB Boundary condition constraint functionB Boundary conditionsMinNeigh Minimum number of reconstruction stencil membersBdryQuad Set of all boundary quadrature points for a control volumeStencil Reconstruction stencil for a control volumeSolution of System of Equationsp ILU fill levelx Vector of unknownsb Right hand side vector for a linear systemr Residual vector for a linear systemA Left hand side matrix for a linear systemA∗ The matrix used for constructing the preconditionerP Preconditioner matrixL˜, U˜ Lower and upper matrices resulting from ILU factorizationCFL The Courant-Friedrichs-Lewy numberFlow Related VariablesRe Reynolds numberPr Prandtl numberPrT Turbulent Prandtl numberMa Mach numberDiff Diffusion term in the Spalart-Allmaras turbulence modelProd Production term in the Spalart-Allmaras turbulence modelDest Destruction term in the Spalart-Allmaras turbulence modelTrip Trip term in the Spalart-Allmaras turbulence modelτ Viscous stress tensorxiiv Velocity vectorρ DensityE Total energyP PressureT Temperatureµ ViscosityµT Turbulent viscosityγ Specific heat ratioν˜ Spalart-Allmaras working variablet Timed Distance from wallOverscripts˜(.) Roe average stateˆ(.) Reference element statexiiiAcknowledgmentsFirst and foremost, I would like to thank my supervisor, Dr. Carl Ollivier-Gooch, for his professionalmentorship, continuous support, and endless patience. It has been a pleasure for me to work under hissupervision for the past two years.I am thankful for the financial support of the Natural Sciences and Engineering Research Council ofCanada and the University of British Columbia through the Canada Graduate Scholarships-Master’s(CGS-M) program and the Gartshore Scholarship, respectively.I would like to express my gratitude to my friends and fellow labmates for their valuable help andsuggestions. Special thanks to Alireza for patiently helping me with my initial transition into theresearch environment at ANSLab, and Gary for proofreading this thesis.I wish to thank my dear friend Mohammad. I will never forget your help and advice both inside andoutside of work. Also, thank you for your technical suggestions and for proofreading a major portion ofthis thesis.Last but not least, I would like to thank my parents, Iraj and Mahboubeh, and my sister, Leiana, for theirunconditional love, support, and caring.xivChapter 1Introduction1.1 MotivationThe advent of computational fluid dynamics (CFD) has considerably improved the design and manufac-turing processes in many industries such as aerospace, turbomachinery, oil and gas, and bioengineering.CFD finds solutions to engineering problems by numerically solving the governing and/or modelingpartial differential equations (PDEs). CFD simulations offer a cheap alternative to experimental setupsin many, though not all, situations while not suffering from the limitation of analytical methods inhandling complex geometries. Nevertheless, CFD methods are not always considered a rival to the othermethods of analysis. Computational methods are sometimes used for tuning parameters or selectingmethods of measurement in experimental setups. For example, Yoo et al. [80] employ CFD simulationsto estimate the right scale for building a nuclear fuel cask experimental model. Conversely, many PDEsthat are solved in CFD are derived from experimental data, such as the turbulence model of Spalart andAllmaras [68].A majority of CFD methods are categorized as mesh-based discretization schemes. A mesh-baseddiscretization scheme seeks to approximate the solution to a PDELu(x) = 0 inside a domainΩ ⊂ RNdim ,where L is a differential operator, u(x) ∈ RNunk is the unknown exact solution, and no time dependencehas been assumed for simplicity. In the first step, the domain is subdivided (meshed) into a set ofnon-overlapping sub-volumes Th, followed by defining a discrete function in the form of uh(x;Uh).Here, the term discrete means that this function will be uniquely defined if a finite number of degreesof freedom, Uh ∈ RNDOF , are all specified. Furthermore, the subscript h, which is the representativelength scale of the subdivision, emphasizes the dependency of the solution on the mesh. Finally, thegoal is to find Uh, such that uh closely approximates the exact solution u as the mesh gets smaller. Therigorous definition of “to closely approximate” is that the discretization error, eh = uh − u, must havean asymptotic behavior such that: ‖eh ‖ = O(hp). The discretization scheme is then said to be pth-orderaccurate.1Although conventional discretization schemes for aerodynamical flows are only second-order accurate,there has been a growing interest in higher-order accurate methods. Higher-order methods can beconsiderably more efficient compared to second-order methods in achieving accurate solutions, sincethe former results in more accurate results on coarser meshes [21, 48]. Even small improvements inaccuracy can be of considerable importance. For example, Vassberg et al. [74] show that for a long-rangejet-aircraft that delivers a payload between distant city pairs, a 1% error in predicting the drag can reducethe carried payload by 7%. With the airlines operating on profit margins of only a few percent, such aloss can make the service completely unprofitable.Regarding the adoption of higher-order methods, there has always been the legitimate concern ofmodeling errors, i.e., the discrepancies between the exact solution of the governing equations and thetrue physical quantities. For a problem where the modeling errors are dominant, only a limited level ofreduction in the discretization errors is of interest. Even in such a case, preliminary results have shownthat an hp-adaptive strategy can achieve a desired level of accuracy faster than conventional second-order methods [33, 38]. An hp-adaptive method increases the local order of accuracy p and decreasesthe mesh size h at only certain regions of the domain. In the context of design and optimization,thousands of simulations might be required for optimizing a subject geometry [42]. Thus, even asmall runtime improvement for a single simulation can result in a considerable reduction of the overallresource consumption. Moreover, modeling errors are not always the dominant mode. In recent work byMavriplis [45], where he studied the numerical solution of the flow around a wing-body configurationfrom the AIAA drag prediction workshop (DPW), the dominant errors were found to be those ofdiscretization.For structured meshes, highly-accurate finite difference methods have long been developed [39, 75], andare known to have superior properties in terms of computational cost and efficiency [19]. Nonetheless,generation of structured meshes for complex geometries is a challenging task, and requires extensiveamounts of human input. Therefore, grids for complex geometries are often obtained by the fairlyautomatic and convenient alternative of unstructured mesh generation techniques.For unstructuredmeshes, various approaches have been devised for achieving higher-order accuracy, mostnotably: continuous [5] and discontinuous [29]Galerkin finite elementmethods, the correction procedurevia reconstruction formulations of the discontinuous Galerkin and spectral volume methods [31], andfinite volume schemes [32]. (See the work of Andren et al. [8] for a detailed comparison of numericalresults). The use of finite volume methods is partly motivated by their straightforward mathematicalformulation compared to the complex structure of the other mentioned methods. Furthermore, most ofthe current industrial CFD codes are based on the finite volume method. While implementing otherapproaches would require the development of completely new commercial codes, higher-order finitevolume methods can be integrated into the current industrial solvers. Finite volume methods are alsoattractive because of the smaller number of degrees of freedom that they require for the same meshcompared to finite element methods.2In two dimensions, unstructured high-order finite volume methods have been successfully applied to arange of aerodynamic problems: the Euler equations [27, 46], laminar Navier-Stokes equations [33, 40],and turbulent Reynolds Averaged Navier-Stokes (RANS) equations [32]. Although taking the effectsof turbulence into consideration is necessary for correctly capturing many aerodynamic flows, three-dimensional results are scarce and limited to the solution of Euler and laminar Navier-Stokes equations[25, 40]. In the long run, the ANSLab team at UBC is interested in developing a three-dimensionalhigher-order finite volume solver for the solution of both inviscid and viscous turbulent steady-state flowproblems, while providing approximate error bounds on target unknowns such as lift and drag. The firststep towards this goal will be taken in this thesis: solution of well-known three-dimensional benchmarkflow problems.1.2 ObjectivesThe ultimate goal of this thesis is the generalization of our current 2-D flow solver, ANSLib, for thesolution of 3-D benchmark problems involving inviscid and viscous turbulent flows. The pursuit of thisgoal is divided into the following steps:• Derive the finite volume formulation of the governing equations in a dimension-independentmanner. Of course, the choice of the turbulence model in this part considerably affects thesolution. In this thesis, the objective is to employ the negative Spalart-Allmaras turbulence model[6]. This one-equation model is reasonably accurate for many flow conditions, and its simplicitymakes it a good starting point for the development of numerical algorithms. The negative versionof this model is chosen because it permits negative values for the model working variable, whichcan be present when higher-order methods are employed.• Identify and undertake the preprocessing steps involved in handling three-dimensional grids re-quired for turbulent simulations.• Design a practical and parallel scalable method for the solution of the discretized system ofnonlinear equations. Since three-dimensional problems can require much more memory thantheir two-dimensional counterparts, solution methods that work in two dimensions might not befeasible in three dimensions anymore.• Verify the performance of the solver, and the accuracy of the solutions obtained.1.3 Thesis OutlineThis thesis is organized in the following manner:Chapter 2 provides an overview of our in-house flow solver ANSLib, in which the algorithms of thisthesis have been implemented. The key concepts of the solver, i.e., k-exact reconstruction and finite3volume discretization are discussed in detail. Also, governing and modeling equations of interestare introduced. Finally, the flux functions and the boundary conditions are revisited in a dimension-independent notation.Chapter 3 discusses the preprocessing steps of three-dimensional meshes for a finite volume simulation.First, numerical integration is addressed, where Gauss quadrature rules are employed in conjunctionwith reference element mappings to construct quadrature information for the control volumes and theirfaces. Then, the capturing of boundary curvature in highly anisotropic meshes is addressed, which isa necessity for higher-order solution methods. Finally, the issues of k-exact reconstruction in handlinghighly anisotropic meshes are reviewed and addressed.In Chapter 4, the solution of the discretized system of nonlinear equations is discussed. First, the pseudotransient continuation method is revisited, which transforms the solution of the nonlinear system ofequations into the solution of a series of linear systems. Then, a memory lean, yet effective, method forthe solution of the corresponding linear systems is proposed, and compared to the previously availablelinear solution methods in ANSLib.Chapter 5 presents the solution of four three-dimensional test problems. To test the correct imple-mentation of the mesh preprocessing algorithms, Poisson’s equation is solved in a simple geometry,where the discretization error is explicitly evaluated and its asymptotic behavior is verified. Then, thesubsonic inviscid flow around a sphere is studied, where the solution accuracy is verified by measuringthe deviation of the entropy from the inflow conditions throughout the domain. Finally, the problemsof viscous turbulent flow over a flat plate and an extruded NACA 0012 airfoil are solved, and the solu-tions are verified against the reference data provided in the NASA turbulence modeling (TMR) website[60].In the end, Chapter 6 summarizes the research of the thesis, provides conclusions, and proposes possiblefuture work.In addition, the command-line options that were provided to the ANSLib executable for running eachtest case are provided in Appendix A. A sample script for running a parallel job on the WestGrid Grexcluster [3] is also given in Appendix B, as the three-dimensional cases of this thesis were all run on thiscluster.4Chapter 2BackgroundThis chapter presents the fundamentals upon which this research has been founded. As the algorithmsinvolved in this thesis have been implemented as parts of ANSLib, it is essential to present the workingmechanism of this solver. Thus, this chapter starts with a description of the finite volume method andk-exact reconstruction, which are the numerical solution schemes in ANSLib, and then moves on to themodel equations of interest.2.1 The Finite Volume MethodThe finite volumemethod is applied to equations which can be expressed in the conservative form:∂u∂t+ ∇ · (F(u) − Q(u,∇u)) = S(u,∇u), (2.1)where F ∈ RNunk×Ndim is the inviscid fluxmatrix, Q ∈ RNunk×Ndim is the viscous fluxmatrix, and S ∈ RNunkis the source term vector. In this method, the degrees of freedom are the same as the average value of thediscrete solution inside every control volume. This fundamental constraint is known as the conservationof the mean, and can be written as:1Ωτ∫τuh(x)dΩ = Uh,τ τ ∈ Th, (2.2)where Ωτ and Uh,τ ∈ RNunk represent the volume and local DOF vector for control volume τ, respec-tively. Establishing a relation between the discrete solution uh, and the degrees of freedom Uh, whensatisfying the conservation of the mean constraint, is called reconstruction in the finite volume frame-work. To accomplish this, ANSLib uses the k-exact reconstruction scheme, which will be introduced inSection 2.2.The finite volume method uses the divergence theorem to discretize Equation (2.1). Consider a control5n(uh+,∇uh+)(uh-,∇uh-)(uh,∇uh)∂τ\∂Ω∂τ∩∂ΩFigure 2.1: Illustration of terminology in finite volume discretizationvolume τ ∈ Th as depicted in Figure 2.1. Integrating Equation (2.1) inside the control volume, and usingthe divergence theorem gives:dUh,τdt+1Ωτ∫∂τ\∂Ω(FI (u+h, u−h) −QI (u+h,∇u+h, u−h,∇u−h)) dS+1Ωτ∫∂τ∩∂Ω(FB(uh,B) −QB(uh,∇uh,B)) dS − 1Ωτ∫τS(uh,∇uh)dΩ = 0, (2.3)where B represents the boundary conditions, in the case where the control volume has faces lying on theboundary. Although the discrete solution uh and its gradient ∇uh are continuous inside every controlvolume, they can be discontinuous on the control volume boundaries ∂τ. These discontinuous values areshown using the (.)+ and (.)− notations. FI andQI represent the interior numerical flux functions, whileFB and QB represent their boundary counterparts. Numerical flux functions are designed to mimic theproduct of the original flux matrices and the normal vector while taking into account the discontinuityof the discrete solution and the boundary conditions. The flux functions used in this thesis will beintroduced in Section 2.4.Looking back at Equation (2.3), the following system of ODEs for control volume averages can bederived:dUhdt+ R(Uh) = 0, (2.4)which can be solved with a variety of time advance schemes when a time-accurate solution is of interest.Butcher [15] presents a detailed study of these methods. Although only the steady state solution is ofinterest in this thesis, the unsteady terms can be used to improve the robustness of the solver with respectto the initial solution guess. This method, known as pseudo transient continuation (PTC) [35], will beintroduced in Chapter 4.62.2 K-exact ReconstructionThe design of reconstruction schemes in the finite volume framework started with Van Leer’s MUSCLscheme [72, 73]. His scheme took advantage of the uniform pattern of control volumes in structuredmeshes andwas not directly applicable to unstructured grids. Later on, the k-exact reconstruction [12] andthe WENO/ENO [66] family of methods were designed, and did not suffer from MUSCL’s shortcomingin handling unstructured meshes. The latter, however, has poor steady state convergence properties [59].Targeted to solve aerodynamic steady state problems, ANSLib uses the k-exact reconstruction methodin its finite volume formulation.Suppose a polynomial function of order smaller or equal to k, v(x), is integrated in every control volumeto find the control volume averagesVh. If these average values are fed to a k-exact reconstruction schemeto construct the function vh(x), the identity vh(x) = v(x) must hold, which is where the name k-exactcomes from. A k-exact scheme is also called (k + 1)th-order accurate, since it results in a nominaldiscretization error of order O(h(k+1)) [13]. For simplicity, let us consider this method when there isonly one unknown variable, i.e., the vector u reduces to a scalar u. Generalization to multiple unknownvariables will then be straightforward. In this case, the solution in every control volume is defined as thesuperposition of a set of basis functions in the form:uh(x;Uh,B)|x∈τ = uh,τ(x;Uh,B) =Nrec∑i=1aiτ(Uh,B)φiτ(x) τ ∈ Th . (2.5)Where φiτ(x) and aiτ represent the ith basis function and reconstruction coefficient for control volume τ,respectively. Most commonly, the basis functions will be chosen as monomials in Cartesian coordinateswith origin at the reference point of each control volume:{φiτ(x) i = 1 . . . Nrec } = { 1a!b!c! (x1 − xτ1)a(x2 − xτ2)b(x3 − xτ3)c  a + b + c ≤ k } , (2.6)where xτ1, xτ2, and xτ3 are the coordinates of the control volume’s reference location, which is usuallychosen as the centroid of volume. This particular choice of basis functions has the advantage that thediscrete solution uh will resemble the Taylor expansion of the exact solution u, which is particularlyuseful in theoretical analysis [54]. However, there are situations, e.g., highly anisotropic meshes, inwhich it would be more beneficial to use other basis functions [32]. Such a case will be introduced inChapter 3.The discrete solution must satisfy the conservation of the mean constraint, given in Equation (2.2).Moreover, for every control volume τ, a specific set of its neighbors are chosen as its reconstructionstencil Stencil(τ). The k-exact reconstruction requires uh,τ to predict the average values of the membersof Stencil(τ) closely. Furthermore, if the control volume is located on the boundary (∂τ ∩ ∂Ω , ∅),enforcing uh,τ or ∇uh,τ ·n to have certain values at boundary quadrature points, can improve convergencein the presence of certain boundary conditions [54]. Thus, the reconstruction coefficients can be found7by solving the following constrained minimization problem:minimizea1τ ...aNrecτ∑σ∈Stencil(τ)(1Ωσ∫σuh,τ(x)dΩ −Uh,σ)2+∑q∈BdryQuad(τ)(B(uh,τ(q),∇uh,τ(q),B))2subject to1Ωτ∫τuh(x)dΩ = Uh,τ,(2.7)where BdryQuad(τ) is the set of boundary quadrature points for control volume τ, and B is the boundarycondition constraint function. The number of control volumes in Stencil(τ) must be greater thanthe number of reconstruction coefficients Nrec, so that the minimization problem does not becomeundetermined. To construct Stencil(τ), all the neighbors at a given topological distance from τ areadded to the stencil until the number of stencil members gets bigger than a given value MinNeigh(k).For a well behaved problem on a mesh with high regularity, choosing MinNeigh(k) = Nrec can resultin an acceptable solution. However, a value of MinNeigh(k) = 1.5Nrec is chosen to cope with meshirregularities and oscillatory solution behavior. Figure 2.2 shows a control volume and its reconstructionstencil for different k values.By introducing the integral of each basis function of every control volume inside itself and its stencilmembers:I iτσ =∫σφiτ(x)dΩ σ ∈ Stencil(τ) ∪ { τ } , (2.8)the constrainedminimization problem in Equations (2.7), can bewritten in a compact matrix form:I1ττ . . . INrecττI1τσ1 . . . INrecτσ1.... . ....I1τσNS(τ) . . . INrecτσNS(τ)a1τ...aNrecτ =Uh,τUh,σ1...Uh,σNS(τ), (2.9)where the boundary constraints have been neglected for the sake of simplicity. The symbols σ1, σ2,. . . , σNS(τ) represent the members of Stencil(τ). The first row of the matrix is a constraint and mustbe exactly satisfied, while the other rows correspond to equations that have to be minimized in a leastsquares fashion. As the left hand side matrix in Equation (2.9) is only dependent on geometric terms,and does not include the average solution values Uh, the solution can be written as:a1τ...aNrecτ = A†τUh,τUh,σ1...Uh,σNS(τ), (2.10)where the matrix A†τ is the pseudo-inverse of the left hand side matrix. A change of variables inspired bythe Gaussian Elimination method is used to convert Equation (2.9) into an unconstrained optimizationproblem [54]. Then the unconstrained problem is solved by singular value decomposition (SVD) [24]8τFigure 2.2: Reconstruction stencils for a control volume τ: A set of values MinNeigh(1) = 3,MinNeigh(2) = 9, and MinNeigh(3) = 18 have been used, which have resulted in reconstruc-tion stencils { blue }, { blue, magenta }, and { blue, magenta, cyan } for the 1-exact, 2-exact,and 3-exact reconstruction schemes, respectively.to yield the pseudo-inverse matrix. This process has to be evaluated for every control volume, only as apreprocessing step, which prevents it from becoming a bottleneck in our computations.In the case of equations with discontinuous solutions, shock capturing numerical schemes such as slopelimiters [76] or artificial diffusion [69] are required in conjunction with the k-exact reconstructionmethod to ensure convergence to the correct solution. For more recent implementation of these methodsin conjunction with higher-order schemes see [11, 46, 53]. In this thesis, however, the emphasis is onmodel problems without discontinuities. Thus, no shock capturing methods have been used.2.3 Studied EquationsThis section introduces the equations of interest. Namely, Poisson’s, Euler, laminar and Reynoldsaveraged Navier-Stokes, and the Spalart-Allmaras turbulence closure equations.2.3.1 Poisson’s EquationThe Poisson’s equation is a simple yet powerful tool to verify the correct implementation of manyalgorithms in ANSLib. This equation has a scalar unknown u and a solution-independent source term fin the form of:− ∇ · (∇u) = f (x). (2.11)9To simplify the notation, we consider the gradient operator on a scalar function as a row vector. Forexample, ∇u = [ ∂u∂x1 , ∂u∂x2 , ∂u∂x3 ]. Thus the Poisson equation can be recovered from the steady state formof the conservative Equation (2.1) by replacing F = ∇u, Q = 0, and S = f (x).2.3.2 Navier-Stokes EquationsThe Navier-Stokes and the continuity equations are widely used for the simulation of fluid flow. In thecompressible form of these equations, the vector of unknowns is u = [ρ, ρvT , E]T , where ρ is the fluiddensity, v = [v1, v2, v3]T is the velocity vector, and E is the total energy. When combined with theideal gas internal energy and state equations, the nondimensionalized Navier-Stokes equations can beidentified by a zero source term vector and flux matrices:F =ρvTρvvT + PI(E + P)vT Q =0MaRe τ(E + P)τv + 1γ−1( µPr) ∇T , (2.12)where Ma, Re, Pr, and γ represent the Mach number, the Reynolds number, the Prandtl number, and thespecific heat ratio, respectively. T is the temperature, (.)T denotes matrix transpose, P is the pressure,τ is the viscous stress matrix, µ is the dimensionless fluid viscosity, and I is the identity matrix. Sincethe emphasis is on air as the working fluid, the values γ = 1.4, Pr = 0.72 are used, and µ is found fromSutherland’s law. The pressure is related to the dependent variables via the formula:P = (γ − 1)(E − 12ρ(v · v)). (2.13)Similarly, temperature is related to pressure and density in the form:T =γPρ. (2.14)For Newtonian compressible fluids, the viscous stress tensor is related to the velocity as:τ = 2µ(12(∇v + (∇v)T)− 13trace(∇v)I), (2.15)where ∇v represents the gradient of the velocity vector, such that (∇v)i j = ∂vi∂x j .If the viscous terms are neglected, i.e., either Q, or equivalently µ is set to zero, the Euler equationswould be recovered. The Euler equations are used in this work to asses the capability of the solver inhandling reasonably big three-dimensional problems.102.3.3 Extension to Turbulent FlowsIn this thesis, the Reynolds averaged Navier Stokes (RANS) equations are used for modeling turbulentflows, and are coupled with the negative Spalart-Allmaras (negative S-A) turbulence model [6]. Thismodel includes a number of dimensionless empirical constants, listed in Table 2.1, and a few empiricalfunctions, shown by the symbol f and an appropriate subscript, which will be introduced shortly. Thenondimensionalized flux matrices for the RANS + negative S-A equations are defined as:F =ρvTρvvT + PI(E + P)vTν˜ρvTQ =0MaRe τ(E + P)τv + 1γ−1(µPr +µTPrT)∇T− MaReσ (µ + µT )∇ν˜, (2.16)where ν˜ is the negative S-A working variable, µT is the turbulent viscosity, and PrT is the turbulentPrandtl number, which has a value of 0.9 for air. The source term of the nondimensionalized RANS +negative S-A is defined as:S =000Diff +ρ(Prod −Dest+Trip), (2.17)whereDiff , Prod, Dest, and Trip represent diffusion, production, destruction, and trip terms, respectively.The viscous stress tensor can be found via the modified equation:τ = 2(µ + µT )(12(∇v + (∇v)T)− 13trace(∇v)I). (2.18)The turbulent viscosity is expressed as:µT =µ′ fv1ρν˜ ν˜ ≥ 00 ν˜ < 0, (2.19)where µ′ is the reference value that is used to nondimensionalize the turbulence working variable. In thiswork, we have used µ′ = 1000 to make ν˜ comparable in size to the other nondimensionalized variables,which enhances the performance of the numerical solver [17]. The production term in Equation (2.17)is defined as:Prod =cb1(1 − ft2)S˜ν˜ ν˜ ≥ 0cb1(1 − ct3)S ν˜ < 0. (2.20)11The destruction term is found from the equation:Dest =µ′MaRe(cw1 fw − cb1κ2 ft2) (ν˜d)2ν˜ ≥ 0−µ′MaRe cw1(ν˜d)2ν˜ < 0, (2.21)where d is the minimum distance to wall boundaries. The diffusion term is given as:Diff =MaReσ(µ′cb2ρ∇ν˜ · ∇ν˜ − µρ(1 + χ fn) ∇ρ · ∇ν˜), (2.22)where χ is:χ =µ′ρν˜µ. (2.23)The trip term inEquation (2.17)models flows that include the laminar to turbulent transition phenomenon.Since the emphasis in this thesis is on fully turbulent flows, the trip term is neglected and set to zero.The vorticity S, and its modified forms, S˜, S¯, are found from the equations:S =√√12∑i∑j(∂vi∂xj− ∂vj∂xi)2S¯ =µ′MaReν˜κ2d2fv2S˜ =S + S¯ S¯ ≥ −cv2SS +c2v2S+cv3S¯(cv3−2cv2)S−S¯ S¯ < −cv2S.(2.24)The empirical function fn ensures the positivity of µT , and is defined as:fn =1 ν˜ ≥ 0cn1+χ3cn1−χ3 ν˜ < 0. (2.25)The functions fv1, fv2, fv3 are given as:fv1 =χ3χ3 + c3v1fv2 = 1 − χ1 + χ fv1 ft2 = ct3 exp(−ct4χ2), (2.26)and the function fw is given as:r =µ′MaReν˜S˜κ2d2g = r + cw2(r6 − r) fw = g(1 + c6w3g6 + c6w3). (2.27)Having defined the equations of interest, let us now introduce the numerical flux functions.12Table 2.1: Dimensionless empirical parameters used in the negative S-A modelName Value Name Value Name Valuecb1 0.1355 cb2 0.622 cw1 cb1κ2 +1+cb2σcw2 0.3 cw3 2.0 ct3 1.2ct4 0.5 cn1 16 cv1 7.1cv2 0.7 cv3 0.9 κ 0.41σ 0.662.4 Numerical Flux FunctionsAs discussed earlier, the finite volume method relies on flux functions that not only take into accountthe discontinuity of the solution along internal faces, but also correctly capture the influence of theboundary conditions. In this section, we will introduce the numerical flux functions used in ANSLibfor the solution of the more general RANS + negative S-A equations. Flux functions for the simplerEuler and Poisson equations can simply be derived by omitting the relevant terms from the more generalcase.2.4.1 Inviscid FluxThe inviscid fluxes represent propagation of information via finite speed waves, and are convective in na-ture. Most numerical flux functions seek a solution to the approximate one-dimensional equation:∂u∂t+∂F(u)n∂s= 0, (2.28)and then use this solution to find the numerical flux. In Equation (2.28), F is the inviscid flux matrix,n is the face normal vector, and s is the unit of length. The artificial viscosity method [79] adds extranonlinear dissipative terms to Equation (2.28) to give it an elliptic nature. On the other hand, theGodunov method [23] is based on the exact solution of Equation (2.28), with piecewise constant initialconditions corresponding to the left and right states (the Riemann Problem). Due to the expensive cost ofexactly solving the Riemann problem, many researchers have developed approximate solutions such asthe Rusanov [61], HLL family [28, 70], and Roe [58] methods. In this thesis, a computationally efficientformulation [57] of the approximate Riemann solver of Roe has been used due to its effectiveness andsimplicity. This formulation has the form ofFI (u+h, u−h) =12(F(u+h)n + F(u−h)n − D(u+h, u−h)), (2.29)13where D represents the diffusion vector given by the formula:D(u+h, u−h) =|λ˜2 |∆ρ + δ1|λ˜2 |∆(ρv) + δ1v˜ + δ2n|λ˜2 |∆E + δ1H˜ + δ2v˜ · n|λ˜2 |∆(ρν˜) + δ1 ˜˜ν. (2.30)Here, ∆(.) = (.)+h− (.)−h, and the variables λ1, . . . , λ6 represent the six eigenvalues of the Jacobian matrix∂F(u)n∂u , expressed as:λ1 = v · n − c λ2 = . . . = λ5 = v · n λ6 = v · n + c. (2.31)The variable c is the speed of sound, defined as:c =√γPρ. (2.32)In Equation (2.30) the symbol ˜(.) represents the Roe average state, evaluated from the equation:˜(.) =√(.)−h(.)+h(.) = ρ√ρ−h(.)−h+√ρ+h(.)+h√ρ−h+√ρ+h(.) = v, ν˜,H, (2.33)where H = P+Eρ is the enthalpy. Note that v˜ and ˜˜ν are the velocity and the S-A working variable at theRoe average state, respectively. The variables δ1 and δ2 in Equation (2.30) are given as:δ1 =∆Pc˜2(−|λ˜2 | + |λ˜1 | + |λ˜6 |2)+ρ˜2c˜2∆(v · n) ( |λ˜6 | − |λ˜1 |)δ2 =∆P2c˜2( |λ˜6 | − |λ˜1 |) + ρ˜∆(v · n) (−|λ˜2 | + |λ˜1 | + |λ˜6 |2 ) . (2.34)ANSLib evaluates the the inviscid boundary fluxes using the method of characteristics [30]. For walland symmetry boundary conditions, the mass flux must be zero. Thus, the inviscid flux is evaluatedas:FB(uh,B) = [0, PhnT , 0]T B is symmetry or wall. (2.35)On farfield boundaries, however, the boundary flux formulation changes based on the sign of the normalvelocity, v · n. In this work, we consider subsonic inflow and outflow boundary conditions which areidentified by the inequalities 0 ≤ v · n < c, and −c < v · n ≤ 0, respectively. In either case, the boundaryflux is evaluated in terms of an intermediate state, u∗h, defined as a function of both the boundaryconditions and the discrete solution:FB(uh,B) = F(u∗h)n u∗h = (uh,B) B is inflow or outflow. (2.36)14For subsonic inflow, five values of farfield turbulence working variable ν˜far, total temperature Tt , totalpressure Pt , side slip angle ψ, and angle of attack α must be specified. Subsequently, the intermediatestate u∗hcan be found as:P∗h = Ph T∗h = Tt(P∗hPt) γ−1γρ∗h = γP∗hT∗hν˜∗h = ν˜far‖v∗h ‖2 =√2γ − 1(TtT∗h− 1)v∗h1 = ‖v∗h ‖2 cosα cosψ v∗h2 = ‖v∗h ‖2 sinα cosψ v∗h3 = ‖v∗h ‖2 sinψ.(2.37)For subsonic outflow, only the back-pressure value Pb must be specified. The intermediate state willthen be defined as:ρ∗h = ρh v∗h = vh P∗h = Pb ν˜∗h = ν˜h . (2.38)In this thesis, the same values are chosen for ν˜far, Tt , Pt , and Pb as the work of Jalali and Ollivier-Gooch[32]:ν˜far =3µ′Tt = 1 +γ − 12Ma2 Pt =1γTγγ−1t Pb =1γ. (2.39)2.4.2 Viscous FluxThe viscous flux functions are evaluated in a different manner compared to their inviscid counterparts.When evaluating internal flux values, an intermediate state u∗his used in the form:QI (u+h,∇u+h, u−h,∇u−h) = Q(u∗h,∇u∗h)n. (2.40)Finding the intermediate solution as the numerical average of the left and right states,u∗h =12(u+h + u−h), (2.41)results in a sufficiently accurate solution [51]. When evaluating the intermediate solution gradient,however, simple averaging may lead to spurious solutions and instabilities. Nishikawa [50] suggestedthe following modified formula, in line with the interior penalty formulation [9] used in the DG frame-work:∇u∗h =12(∇u+h + ∇u−h ) + η ( u+h − u−h‖xτ+ − xτ−‖2)n, (2.42)where xτ− and xτ+ are the reference locations of the adjacent control volumes, respectively, and η is aheuristic damping factor, known as the jump term. Jalali et al. [34] have numerically tested differentvalues of η in high-order finite volume simulations, and have recommended a value of η = 1, which isalso used in this thesis.15The boundary viscous fluxes are evaluated in the same manner as reference [32], by simply replacingthe interior state into the viscous flux matrix,QB(uh,∇uh,B) = Q(uh,∇uh)n. The boundary conditionsare then enforced through a combination of soft and hard constraints. For adiabatic walls, the velocity,heat flux, and turbulence working variable have to be zero, which are expressed as:Hard constraints: vh = 0 ν˜h = 0Soft constraint: ∇Th · n = 0B is adiabatic wall, (2.43)where the hard constraints are applied to the k-exact reconstruction process through the boundarycondition constraint function B. The soft constraint, however, is applied by simply replacing the valueof ∇Th · n with zero when evaluating the boundary flux. On symmetry boundaries, the heat flux, normalderivative of the turbulence working variable, and the tangential viscous force have to be zero, which areenforced via the following soft constraints:Soft constraints: ∇ν˜h · n = 0 ∇Th · n = 0((nTτhn)I − τh)n = 0 B is symmetry. (2.44)Finally, no constraints are applied when evaluating the viscous flux function on the farfield bound-aries.16Chapter 3Three-Dimensional Mesh ProcessingThe finite volume method requires the subdivision of the domain into a set of control volumes, whichare obtained from a mesh. There are two different approaches for constructing the control volumes:cell-centered, and vertex-centered (cell-vertex). In the cell-centered approach, every cell of the meshis considered as a control volume, as shown in Figure 3.1a. Conversely, the vertex-centered approachassociates a control volume to each vertex of the mesh, and is dependent on the definition of the controlvolume faces. Figure 3.1b shows an example of vertex-centered control volumes, which are created byconnecting the barycenter of each triangle to the midpoints of its edges. This method for constructingthe vertex-centered control volumes is known as the median-dual approach.The only data that the solver requires about the geometry of the control volumes is their adjacencyand quadrature information. Thus, relevant data structures and algorithms have to be implementedin a numerical solver package to offer access to such information, with reasonable time and memorycost. Although the cell-centered and cell-vertex methods are both straightforward to implement in twodimensions, three-dimensional implementation of the latter is more difficult to code, and requires morequadrature points for integration at a given order of accuracy. As a result, the cell-centered method hasbeen chosen for the purpose of this thesis, and is implemented in ANSLib. This chapter discusses thepreprocessing steps required for a three-dimensional cell-centered finite volume solver.3.1 Element Mapping and QuadratureIn terms of solution accuracy, hexahedral and quadrilateral meshes are known to have superior propertiesin three and two dimensions, respectively, compared to tetrahedral and triangular meshes. Althoughstructured mesh generation techniques can create such grids, they can be expensive in terms of timeresources, since applying these algorithms to complex geometries requires a considerable amount ofhuman input. Conversely, unstructured mesh generation techniques are rather automatic, but are notable to handle complex geometries without resorting to robust algorithms that only create simplex cells,17(a) (b)Figure 3.1: Schematic of control volumes: (a) cell-centered, (b) median-dual cell-vertex.such as the Delaunay mesh generation methods [18]. Therefore, simulation grids are usually composedof a mixture of different cell types. The purpose of this section is to introduce the connectivity andquadrature formulas for the control volumes and their faces in a computational mesh. Hereinafter, boththe control volumes and their faces will be referred to as elements.The process starts with mapping each element E to its reference version Eˆ , which resides in a nondimen-sional space. The mapping is a polynomial of degree l, that must guarantee geometric continuity acrossneighboring elements. Thus, it is usually constructed from Lagrangian basis functions (polynomials).Using the ˆ(.) notation for the reference space, the mapping can be written as:x(xˆ) =Nlag∑i=1yiψˆi(xˆ) xˆ ∈ Eˆ, (3.1)where ψˆi and yi are the Lagrange basis functions and nodal locations, respectively, and Nlag is the numberof Lagrange points. Hexahedra, prisms and tetrahedra are the most commonly used three-dimensionalelements in both finite element and finite volume simulations. Pyramids, on the other hand, are notvery popular, and are used for making transitions between different element types in a mixed mesh.Quadrilateral and triangular elements are also used to represent the control volume faces. Figure 3.2shows the first order (l = 1) versions of these elements in the reference space. A well defined mesh mustprovide the location of every node. Furthermore, it must store the local to global node numbering forevery element, along with its type and interpolation order.For a reference element, a qth-order accurate quadrature rule, consists of Nqua point locations zˆi, andweights wˆi, such that for every polynomial Pˆ(xˆ) of degree less than or equal to q − 1:∫EˆPˆ(xˆ)dEˆ =Nqua∑i=1Pˆ(zˆi)wˆi . (3.2)These quadrature rules are tabulated for different orders and types of elements in the literature, such asthe work of Solin et al. [67]. By a change of variables, quadrature rules can be constructed for elements18xˆ1xˆ2xˆ312 3456 78 yˆ1 = (−1, −1, −1) ψˆ1 = (1 − xˆ1)(1 − xˆ2)(1 − xˆ3)/8yˆ2 = (+1, −1, −1) ψˆ2 = (1 + xˆ1)(1 − xˆ2)(1 − xˆ3)/8yˆ3 = (+1, +1, −1) ψˆ3 = (1 + xˆ1)(1 + xˆ2)(1 − xˆ3)/8yˆ4 = (−1, +1, −1) ψˆ4 = (1 − xˆ1)(1 + xˆ2)(1 − xˆ3)/8yˆ5 = (−1, −1, +1) ψˆ5 = (1 − xˆ1)(1 − xˆ2)(1 + xˆ3)/8yˆ6 = (+1, −1, +1) ψˆ6 = (1 + xˆ1)(1 − xˆ2)(1 + xˆ3)/8yˆ7 = (+1, +1, +1) ψˆ7 = (1 + xˆ1)(1 + xˆ2)(1 + xˆ3)/8yˆ8 = (−1, +1, +1) ψˆ8 = (1 − xˆ1)(1 + xˆ2)(1 + xˆ3)/8123456yˆ1 = (0, 0, −1) ψˆ1 = (1 − xˆ3)(1 − xˆ2 − xˆ1)/2yˆ2 = (1, 0, −1) ψˆ2 = (1 − xˆ3)xˆ1/2yˆ3 = (0, 1, −1) ψˆ3 = (1 − xˆ3)xˆ2/2yˆ4 = (0, 0, +1) ψˆ4 = (1 + xˆ3)(1 − xˆ2 − xˆ1)/2yˆ5 = (1, 0, +1) ψˆ5 = (1 + xˆ3)xˆ1/2yˆ6 = (0, 1, +1) ψˆ6 = (1 + xˆ3)xˆ2/212 345yˆ1 = (−1, −1, 0) ψˆ1 = (xˆ3 + xˆ1 − 1)(xˆ3 + xˆ2 − 1)/4((1 − xˆ3) +  )yˆ2 = (+1, −1, 0) ψˆ2 = (xˆ3 − xˆ1 − 1)(xˆ3 + xˆ2 − 1)/4((1 − xˆ3) +  )yˆ3 = (+1, −1, 0) ψˆ3 = (xˆ3 − xˆ1 − 1)(xˆ3 − xˆ2 − 1)/4((1 − xˆ3) +  )yˆ4 = (+1, −1, 0) ψˆ4 = (xˆ3 + xˆ1 − 1)(xˆ3 − xˆ2 − 1)/4((1 − xˆ3) +  )yˆ5 = (0, 0, 1) ψˆ5 = xˆ3  = 10−201234yˆ1 = (0, 0, 0) ψˆ1 = 1 − xˆ1 − xˆ2 − xˆ3yˆ2 = (1, 0, 0) ψˆ2 = xˆ1yˆ3 = (0, 1, 0) ψˆ3 = xˆ2yˆ4 = (0, 0, 1) ψˆ4 = xˆ3xˆ1xˆ2xˆ312 34 yˆ1 = (−1, −1, 0) ψˆ1 = (1 − xˆ1)(1 − xˆ2)/4yˆ2 = (+1, −1, 0) ψˆ2 = (1 + xˆ1)(1 − xˆ2)/4yˆ3 = (+1, +1, 0) ψˆ3 = (1 + xˆ1)(1 + xˆ2)/4yˆ4 = (−1, +1, 0) ψˆ4 = (1 − xˆ1)(1 + xˆ2)/4123 yˆ1 = (0, 0, 0) ψˆ1 = 1 − xˆ1 − xˆ2yˆ2 = (1, 0, 0) ψˆ2 = xˆ1yˆ3 = (0, 1, 0) ψˆ3 = xˆ2Figure 3.2: First order reference Lagrange elements and polynomials. From top to bottom: hexa-hedron, prism, pyramid, tetrahedron, quadrilateral, and triangle.19in the physical space. For a three-dimensional element, we can write:∫Ef (x)dE =∫Eˆf (x(xˆ)) det(J(xˆ))dEˆ ≈Nqua∑i=1f (x(zˆi)) det(J(zˆi))wˆi . (3.3)Here, J = ∂x∂xˆ represents the Jacobian of the transformation given in Equation (3.1). A similar statementcan be made for two-dimensional elements, giving way to the following procedure for finding quadraturerules in the physical space:2D Element: wi = ‖∂xˆ1x × ∂xˆ2x‖2wˆi zi = x(zˆi)3D Element: wi = det(J(zˆi))wˆi zi = x(zˆi). (3.4)It is noteworthy to mention that the higher the interpolation order of an element, l, the higher the valuesof q are required to evaluate an integral up to a certain order of accuracy. This is due to the presenceof the term det(J) in Equation (3.3) that prevents a quadrature rule to maintain its nominal order ofaccuracy in the physical space. Furthermore, for integrals on the faces that involve the normal vector,the following formula can be used:n = ∂xˆ1x × ∂xˆ2x‖∂xˆ1x × ∂xˆ2x‖2. (3.5)In this thesis we have used the Lagrange polynomials and reference element quadrature rules that areprovided by the libMesh finite element library [36].3.2 Creating Curved Anisotropic MeshesHigh-order numerical discretization schemes must correctly account for the curvature of domain bound-aries to maintain their nominal order of accuracy [14]. As conventional mesh generators create onlymeshes with planar faces, modification schemes are required before such meshes can be used for ahigh-order numerical simulation. This modification can be as simple as just replacing the boundaryfaces with higher-order representations, as in isotropic meshes. Successful resolution of quantities ofinterest in turbulent flows, however, requires highly anisotropic meshes that have big spacing in the walltangent direction compared to small spacing in the wall normal direction. Replacing only the boundaryfaces in this case would create self intersecting invalid domain subdivisions. Thus, the curvature of theboundary faces must somehow be propagated throughout the domain to prevent mesh tangling.For structuredmeshes, a simple remedy is to agglomerate a finemesh and use the extra points to define thehigher order representation of the cells and their faces. Although this method is effective, its limitationin only working with structured meshes confines its use to verification and benchmark problems. A moregeneral approach is to use a solid mechanics analogy, and model the faceted mesh as an elastic solid.Then, the boundary of the solid can be deformed to match its curved representation, leading to an internaldeformation which is consequently used to define the curvature of the inner faces and cells. Hartmann20and Leicht [29] gave a detailed summary of this family of methods. They also introduced modificationsthat greatly reduce the computational cost, while increasing the quality of the final mesh produced.Another approach was presented by Toulorge et al. [71], which is based on minimizing a novel nonlinearenergy function. Although this approach is more complicated than using a linear elasticity analogy, theauthors argue that it has better properties in terms of robustness and efficiency. Jalali and Ollivier-Gooch[32] have developed a simplified two-dimensional linear elasticity solver, which is currently used inANSLib to generate high-order curved meshes for two-dimensional turbulent flow simulations. In thissection, their approach will be extended to work with three-dimensional unstructured meshes.A linear elastic solid is governed by the Navier equations, which are derived from the conservation oflinear momentum, and expressed as [37]:− ∇ · σ = 0, (3.6)where σ is the stress tensor, and is defined as:σ =µ2(∇u + ∇uT)− λ3trace(∇u)I, (3.7)where u = [u1, u2, u3]T is the displacement vector, and the variables µ and λ are the Lame’s coefficients.The elastic solid properties are usually reported in terms of the Young’s modulus, E , and Poisson’s ratio,ν, which are related to the Lame’s coefficients as:µ =E2(1 + ν) λ =νE(1 + ν)(1 − 2ν) . (3.8)On the boundaries, either the displacement, the normal force, or a linear relation between them has tobe specified, which results in Dirichlet, Neumann, or Robin boundary conditions, respectively. Theseconditions can all be expressed through a single equation:Q(x)u + σn = f(x) x ∈ ∂Ω, (3.9)where f is the normal force, and Q is a generalized spring constant. In the curved mesh generationprocess, only the Dirichlet boundary conditions are of interest, which can be recovered by settingQ = 1 I and f(x) = 1 ub(x) in Equation (3.9). Here, ub(x) is the prescribed boundary displacement, and is a small number such as 10−10.The solution method for the Navier equations and their boundary conditions will now be presented. Toease the notation, the Einstein summation convention is used for indices i, j, k, and l, which are reservedonly for the geometric dimensions. In the first solution step, Equations (3.6) and (3.7) are combined toyield:− ∂∂xk(Ki jkl∂u j∂xl)= 0 i = 1 · · · Ndim. (3.10)21where Ki jkl are the components of the fourth-order stiffness tensor:Ki jkl = λδikδjl + µ(δi jδkl + δilδjk). (3.11)The next step involves the use of the continuous Galerkin finite element method to solve the governingequations. The numerical solution uh(x) will be defined as the linear superposition of a set of basisfunctions,uh(x) =NDOF/Ndim∑r=1Uh,rψr (x), (3.12)where ψr is the rth finite element basis function. Note that the finite element basis functions arenot necessarily the same as Lagrange polynomials that are used to represent the curved geometry inEquation (3.1). Moreover, Uh,r is the rth sub-component of the vector of degrees of freedom, Uh, andis expressed as:Uh,r = [(Uh,r )1, (Uh,r )2, (Uh,r )3]T . (3.13)Subsequently, u is replaced with uh in Equation (3.10), and the result is multiplied by every basis functionψr , which when integrated over the whole domain gives:−∫Ω∂∂xk(Ki jkl∂uh, j∂xl)ψrdΩ = 0 i = 1 · · · Ndim r = 1 · · · NDOF.Then, using the integration by parts theorem and Equation (3.7), we arrive at the discretized weakform:NDOF∑s=1(∫Ω∂ψr∂xkKi jkl∂ψs∂xldΩ +∫∂ΩψrQi jψsdS)(Uh,s)j =∫∂ΩfiψrdS, (3.14)for every i ≤ Ndim and r ≤ NDOF, which is a linear system of equations for finding Uh.The described solution scheme is implemented using the libMesh finite element library [36] in thisthesis. The hierarchical polynomials [22] of order p are chosen as the basis functions, where p is aninput parameter. LibMesh supports quadrature rules of arbitrary orders of accuracy for assembling thediscretized system of equations. Therefore, a quadrature scheme accurate up to order (2p + 1) is used,which ensures exact evaluation of all integrals, so that we do not have to worry about the effect ofquadrature error on the final solution.To illustrate the performance of the implemented method, a model problem is presented, with a simplegeometry resembling a three-dimensional airfoil. This geometry along with its anisotropic mesh areshown in Figure 3.3. The mesh is constituted of 1288 hexahedra. The curved part of the boundary isanalytically parameterized as:222.51.200 00.83XYZFigure 3.3: Geometry of the curved mesh generation test casey(u, v) =[u + (R0 − bu) sin(v), cu, (R0 − bu) cos(v) − H0(1 − buR0)]T0 < u < 1.25 − pi6< v <pi6R0 = 1 H0 = cos(pi/6) b = 2 c = 4,(3.15)where u and v are parameterization variables, and y is a point on the surface. To correctly capture thecurvature of the boundary, we need to project every point x on the boundary of the faceted mesh to thesurface of the original geometry. Then, the boundary condition simply becomes that of Dirichlet withub(x) = Proj(x) − x. The projection operator would be identity on the planar parts of the boundary. Onthe curved part, however, we seek to find the point Proj(x) = y(u, v), such that the line xy is perpendicularto the surface, i.e., we seek the unknowns [u, v, d]T that satisfy:x − y(u, v) − dn(u, v) = 0, (3.16)where d is the distance between the points x and y, and n is the surface normal vector,n(u, v) = ∂uy × ∂vy‖∂uy × ∂vy‖2 . (3.17)The projection problem can be solved using the Newton’s method, globalized by line search, which fullydefines the boundary conditions. Finally, the values E = 1 and ν = 0.3 have been used in the Navierequations. Note that due to the use of only Dirichlet boundary conditions, the value of E cancels out andis insignificant as long as it is uniform throughout the domain. However, changing ν in the admissiblerange (0, 0.5) will result in slightly different displacement fields.To solve the presented model problem, we have used basis functions of orders p = 2, 3, and a Conjugate23Table 3.1: Performance of the linear elasticity solverp NDOF Assembly Time(s) Linear Solve Time(s) Total Time(s)2 36, 801 4.1 7.8 12.53 117, 390 52.4 98.0 153.6(a) (b)Figure 3.4: Curved mesh displacement magnitude: (a) z = 0 view, (b) y = 0.25 cross section.Gradient (CG) solver, to set up and solve Equation (3.14), respectively. The problem size and solutiontime are listed in Table 3.1. Due to the nature of higher-order finite element methods, it is not surprisingthat the number of DOF and computational time have increased substantially for the p = 3 scheme.Figure 3.4 shows the magnitude of the displacement field obtained using quadratic hierarchical basisfunctions. As expected, the displacement field has the largestmagnitude near the curvedwall, yet vanishesas one moves towards the other planar parts of the boundary. Moreover, there is no displacement at thevertices of the mesh, as they are already on the true curved boundary. On the other hand, there is abig displacement in the middle of the high aspect ratio cells that are near the curved wall, i.e., wherethe difference between the true boundary and its planar approximation is the greatest. The residualof the linear system (3.14) per CG iteration, is shown in Figure 3.5. Convergence is rather smooth,and the residual drops below 10−11 in 18 and 25 iterations for the quadratic and cubic basis functions,respectively. The increase in number of linear iterations for the p = 3 scheme is due to its stiffer linearsystem and higher number of degrees of freedom.3.3 Modified Basis Functions for Highly Anisotropic MeshesIf Cartesian coordinate monomials, φi(x), are used as basis functions for the k-exact reconstructionscheme, numerical difficulties arise when dealing with anisotropic meshes. The problem shows itself intwo situations. First, the condition number of the left hand side matrix of Equation (2.9) soars drastically,even up to O( 1 ), where  is the machine zero. This can in turn introduce an error of order O(1) inthe reconstruction coefficients, and deteriorate the accuracy of uh. Secondly, the k-exactness of the240 5 10 15 20 25 301e-121e-101e-81e-61e-41e-2CG iterationResidualp=2p=3Figure 3.5: Residual per iteration for the linear elasticity solverreconstruction scheme is lost, i.e., for a function v and its projection on the discrete solution space vh,the equality ‖v − vh ‖ = O(hk+1)might not hold anymore. In this section, we will introduce the remediesproposed by Jalali and Ollivier-Gooch [32] to mitigate these difficulties, along with their generalizationto three dimensions.The key is to use two new sets of reconstruction basis functions that have better conditioning andinterpolation properties for anisotropic meshes. The first group is the set of principal coordinate basisfunctions ϕ+i(x), which are defined as:{ϕ+iτ (x) i = 1 . . . Nrec } = { 1a!b!c! (wτ1(x))a(wτ2(x))b(wτ3(x))c  a + b + c ≤ k } . (3.18)In this equation, wτ(x) is the principal coordinate of control volume τ evaluated at point x. To find thiscoordinate transformation, the moment of inertia tensor Iτ must be found,(Iτ)i j =∫τ(xi − xτi)(xj − xτ j)dΩ. (3.19)As this tensor is symmetric, it can be diagonalized, Iτ = QτΛτQTτ , where Q and Λ are its matrix ofright eigenvectors, and diagonal matrix of eigenvalues, respectively. Then, the principal coordinates areevaluated from: wτ(x) = Qτ(x − xτ). Although these basis functions span the same polynomial spaceas the Cartesian monomials, φi(x), they greatly reduce the condition number of the left hand side matrixin Equation (2.9). Our heuristic approach is to use these basis functions for control volumes which havea high aspect ratio, yet their faces are almost planar.For control volumes that are close to wall boundaries and have high distortion, a more complicatedapproach must be adopted. As a starting point, let us assume that for every control volume τ with suchproperty, we somehow can find a vector function (dτ, tτ1, tτ2), such that at every point x in the vicinityof the control volume, ∇dτ(x) is perpendicular to the closest wall boundary. Furthermore, ∇tτ1(x) and25.xτ. x∇dτ(xτ)t˜τ(x)dτ(x)Figure 3.6: Construction of approximate wall coordinates for a two-dimensional control volume τ∇tτ2(x)must be perpendicular to ∇dτ(x), and form an orthogonal basis for R3. The vector (dτ, tτ1, tτ1) isthen denoted as the wall coordinates of control volume τ. In two dimensions, the wall coordinate vectorshrinks to (dτ, tτ), so a rather simple strategy can be used to find it. The normal coordinate, dτ(x), is setto the distance to wall of point x minus that of the reference point of τ. For the tangential coordinate tτ ,however, a compact analytic relation does not exist. Thus, the approximate value t˜τ is used,t˜τ(x) = ‖(x − xτ) − ((x − xτ) · ∇dτ(xτ))∇dτ(xτ)‖2, (3.20)which is the length of the projection of x − xτ onto the perpendicular plane of ∇dτ(xτ). Figure 3.6schematically shows the construction of the approximate wall coordinates for a sample control volume.Although it is possible to find these coordinates for every quadrature point in the vicinity of the con-trol volume of interest, finding their derivatives can be complicated, or even impossible. Thus, yetanother coordinate transformation is introduced, namely the curvilinear coordinates ξτ , which closelyapproximates (dτ, t˜τ), while having a nice polynomial relationship in terms of x,ξτ =Nrec∑i=1biτϕ+iτ (x), (3.21)where ϕ+iτ are the same principal coordinate monomials introduced in Equation (3.18). The coefficientsbτ are subsequently found by requiring ξτ to be as close as possible to (dτ, t˜τ), at the reference pointof all the control volumes in Stencil(τ). Mathematically, the following minimization problem must besolved:minimizeb1τ ...bNrecτ∑σ∈Stencil(τ)∪{ τ }(h(xσ) − ξτ1(xσ))2 + (t˜(xσ) − ξτ2(xσ))2subject to ξτ(xτ) = 0,(3.22)which has an identical solution process to that of Equation (2.9). Note that it is theoretically possibleto use the Cartesian monomials φiτ(x) in the definition of the curvilinear coordinates in Equation (3.21),but this results in a poor-conditioned least square problem in finding the biτ coefficients, and thus isavoided. In this thesis, the emphasis is on anisotropic three-dimensional meshes that are symmetricin the x3 direction. Taking advantage of the symmetry, we are able to define ξτ1 and ξτ2 using the26two-dimensional method, and simply set ξτ3 = x3 − xτ3.Now that the curvilinear coordinates are introduced, their corresponding basis functions, ϕ∗iτ (x), can bedefined as:{ϕ∗iτ (x) i = 1 . . . Nrec } = { 1a!b!c! (ξτ1(x))a(ξτ2(x))b(ξτ3(x))c  a + b + c ≤ k } . (3.23)Numerical experiments have shown that using these basis functions for near-wall high aspect ratioelements restores the k-exactness of the reconstruction scheme [32].27Chapter 4Solving the Discretized System ofEquationsThis chapter begins by introducing the Pseudo Transient Continuation (PTC) method that finds thesolution of Equation (2.4) via solving a series of systems of linear equations. Available methods forthe solution of these linear systems are introduced, and their shortcoming in handling three-dimensionalfinite volume problems is addressed. Finally, a new solution scheme is proposed, and its performance iscompared to other previously available methods in ANSLib.4.1 Pseudo Transient ContinuationThe PTCmethod [17, 32, 35, 47] starts from an initial guess for the steady-state solution of Equation (2.4).The initial guess is usually taken from the free-stream conditions or the converged solution of a lower-order accurate scheme. Then, the solution is updated iteratively until the norm of the residual vector,‖R(Uh)‖2, falls below a desired threshold. Consider the backward Euler time advance scheme:U+h − Uh∆t+ R(Uh) = 0, (4.1)where ∆t is the time step size, and U+h is the the DOF vector at the next time level. LinearizingEquation (4.1) gives: (I∆t+∂R∂Uh)δUh = −R(Uh), (4.2)where δUh is the change in the DOF vector between the current and next time levels. Furthermore,∂R∂Uh is the residual Jacobian matrix, and is evaluated using the algorithm proposed by Michalak andOllivier-Gooch [47].The PTC method is derived from the backward Euler scheme by making a few modifications. First, a28nonuniform time step is used for every control volume. Thus, Equation (4.2) is changed to:(VCFL+∂R∂Uh)δUh = −R(Uh), (4.3)where CFL is the Courant-Friedrichs-Lewy number, and the diagonal matrix V is the time step scalingmatrix. The entries of V corresponding to control volume τ are denoted as Vτ , and found from theequation:Vτ =λmax,τhτ, (4.4)where hτ is the hydraulic diameter of the control volume, and λmax,τ is the maximum eigenvalue ofthe inviscid flux Jacobian over all the control volume surface quadrature points. Secondly, after δUh issolved for, the solution is updated according to the line search algorithm:Uh ← Uh + ωδUh, (4.5)where ω ∈ (0 1] is the line search parameter, which must satisfy:ωVδUhCFL + R(Uh + ωδUh)2 ≤ κ‖R(Uh)‖2. (4.6)Here, κ controls the strictness of the line search algorithm, and κ = 1.2 performs well for both viscousand inviscid flows [17]. In addition, the vector entries corresponding to the turbulence working variableare omitted in the evaluation of the norms in Equation (4.6) to enhance convergence [77]. Finally, theCFL number is updated at each iteration according to the value of ω:CFL←1.5 CFL ω = 1CFL 0.1 < ω < 10.1 CFL ω < 0.1. (4.7)At the beginning of the solution process, the approximate solution is away from its steady-state value,and a small CFL number prevents divergence by making the approximate solution follow a physicaltransient path. On the other hand, when the approximate solution gets close to its steady-state value,Equation (4.7) will increase the CFL number. Therefore, the effect of the term VCFL in Equation (4.3)will be reduced, and Newton iterations would be recovered. Thus, as better solution approximations areobtained, the convergence rate will get closer to the optimum behavior of Newton’s iterations.294.2 Linear SolversEvery PTC iteration requires the solution of the linear system Ax = b, where:A =(VCFL+∂R∂Uh)x = δUh b = −R(Uh). (4.8)The exact solution of the linear system is carried out by combinatorial algorithms that typically calculatea lower-upper (LU) factorization of the LHS matrix. Although there has been great progress in thedevelopment of such methods, e.g., MUMPS linear solver library [7], they still suffer from hugeconsumption of memory and time resources, when applied to sufficiently large problems. Furthermore,these methods do not take advantage of any initial guess for the solution, nor the fact that only anapproximate solution is of interest. As a result, scientists in the CFD community have resorted toiterative methods for the solution of huge linear systems.Iterative methods start from an initial guess x(0), and then generate a sequence of improving approximatesolutions. Iterative methods are either categorized as stationary, or a member of the Krylov Subspace(KSP) family. In a stationary method the approximate solution at step k + 1, x(k+1), is only dependenton the residual vector of the previous iteration, r(k), which is defined as:r(k) = b − Ax(k). (4.9)Conversely, a KSPmethod updates the approximate solution based on all or some of the previous residualvectors. ANSLib uses a KSP method, GMRES [63], as its linear solver because of its successful historyin solving aerodynamic problems [17, 41, 49, 56, 78], and the fact that it is readily available as a blackbox solver in many numerical computation libraries, such as PETSc [10].The GMRES method iteratively constructs an orthogonal basis for the search space spanned by thevectors r(0), Ar(0), . . . , Ak−1r(0). Then, it seeks the next approximate solution vector, x(k), as the memberof the search space that minimizes ‖r(k)‖2. Furthermore, the search space is usually restarted aftera certain number of iterations to prevent it from becoming too large. The behavior of the GMRESmethod is strongly dependent on the eigenvalue structure of matrix A. The more compact the eigenvaluespectrum, the fewer iterations are required for convergence. As the LHS matrices arising from a k-exactfinite volume discretization usually lack a nice eigenvalue distribution, the performance of GMRESmustbe improved by means of preconditioning, which is introduced in the following section.4.3 PreconditioningPreconditioning is the transformation of the original linear system into one which has the same solution,but is easier to solve using iterative methods. Preconditioning is usually done by first finding a matrixP, such that the condition number of the product matrix AP is smaller than that of A. Then, the original30equation is changed to:APy = b x = Py. (4.10)The preconditioning matrix can be constructed either directly from A, or the LHS matrix resulting froma lower-order discretization scheme. The matrix from which the preconditioner is constructed will bedenoted as A∗.The main steps of the preconditioned GMRES algorithm are shown in Algorithm 1 [63], where no restartstrategy and a zero initial guess have been assumed for simplicity. In this algorithm, the columns ofthe Vm ∈ RNunk×m matrix form an orthonormal basis for the search space spanned by the vectors b,(AP)b, . . ., (AP)m−1b. At each iterations m, the Hm and Vm matrices are used to find the solution guess,x(m) = APVmαm, and the norm of its corresponding residual, ‖r(m)‖2 = γm. Also, n is the maximumnumber of iterations while rtol is the tolerance in the reduction of the residual norm. A smaller tolerancevalue will result in a more accurate solution, but requires more GMRES iterations. Finally, note that theGMRES solver only requires the matrix-by-vector product of the P matrix, and does not need its explicitform.In this section, the preconditioners available in ANSLib, namely Point Gauss-Seidel, Block Jacobi,and ILU will be introduced, and their shortcoming in solving large three-dimensional problems will bediscussed. More complicated preconditioners, such as multigrid and domain decomposition have notbeen considered in this thesis because of the inconsistent results available in the literature regardingtheir performance. For example, Shahbazi et al. [65] reported that the linear multigrid method can beten times faster than conventional single grid algorithms. On the other hand, Diosady and Darmofal[20], and Wallraff et al. [77] reported cases for which they were not able to achieve a significant speedupfactor.4.3.1 Point Gauss-SeidelThe Point Gauss-Seidel (PGS) method is a stationary iterative solver that can also be used as a precon-ditioner for the GMRES method. First, A∗ is decomposed as:A∗ = L + D + U, (4.11)31Algorithm 1 Preconditioned GMRES with no restart and a zero initial guess.function GMRES(A, P, b, n, rtol)β = ‖b‖2v1 = b/βV1 = [v1], H0 = []m = 0repeatm = m + 1zm = Pvmw = Azmhm, vm+1 = Arnoldi(Vm,w)Vm+1 = [Vm, vm+1], Hm = [Hm−1, hm]αm, γm =DenseSolve(β,Hm)until m > n or γm/β < rtolReturn x(m) = PVmαmfunction Arnoldi(Vm,w)Returns the vector vm+1, which is the unit normal component of w to the space spanned by theprevious vi (i ≤ m) vectors.Returns the column vector hm, where (hm)i is the length of the projection of w over the vi vectorif i ≤ m + 1, and is zero otherwise.function DenseSolve(β,Hm)Returns αm = argminα ‖b − APVmα‖ and γm = minα ‖b − APVmα‖.The objective of this function is achieved by solving the equivalent problemargminα ‖βe1 − Hmα‖2 using a rotational change of variables. Solving this problem includesthe solution of a dense m×m linear system, and yields both γm and αm. Also, e1 = [1, 0, . . . , 0]T .where D is the control volume diagonal part of A, L is the strict lower part, and U is the strict upperpart. Then, the matrix-by-vector product Pv of the PGS preconditioner is defined as:z(0) = 0z(1) = (D + L)−1(v − A∗z(0))...z(n) = (D + L)−1(v − A∗z(n−1))Pv = z(n),(4.12)where n is the number of PGS iterations. Although PGS is a strong preconditioner for solving linearsystems arising from discretization on isotropicmeshes, it performs poorly when dealingwith anisotropicmeshes [44]. Therefore, PGS is only used for the solution of Poisson’s and Euler equations in this thesis,for which an isotropic mesh correctly captures the solution.32UL DD(a) (b)Figure 4.1: Schematic of Block Jacobi decomposition: (a) The mesh, and the partition belongingto each processor (b) The lower, upper, and block diagonal matrices.4.3.2 Block JacobiThe Block Jacobi (BJ) preconditioner also uses a decomposition of matrix A∗, with the difference thatthe diagonal blocks of matrix D must be composed of all the degrees of freedom belonging to the sameprocessor. Figure 4.1a shows a partitioned mesh, and Figure 4.1b shows the corresponding structureand decomposition of the LHS matrix. The matrix-by-vector product for this preconditioner is denotedas:z(0) = 0Dz(1) = v − A∗z(0)...Dz(n) = v − A∗z(n−1)Pv = z(n),(4.13)where n is the number of BJ iterations. Note that the block diagonal matrix D cannot be invertedexactly anymore, and each intermediate vector z(k) must be solved for by using an inner iterative solver.Nevertheless, since every block of D only belongs to a single processor, the inner iterative solver canbe serial. Thus, BJ is a means of parallelizing serial preconditioners, and is used for this purpose inANSLib.4.3.3 ILUThe incomplete lower-upper factorization with fill level p (ILUp) produces an approximation of the LUfactorization of A∗, such that the approximate matrices have a much smaller nonzero structure than theexact LU factorization. The factored lower and upper triangular matrices L˜ and U˜ are computed byperforming Gaussian elimination on A∗, but ignoring certain matrix entries. Then, the preconditionermatrix is set to P = (L˜U˜)−1. A larger fill level results in L˜ and U˜ matrices with bigger nonzero structure,33but is likely to construct a more effective preconditioner [63]. Nevertheless, the ILUmethod is inherentlya serial algorithm, so it is implemented in conjunction with the BJ method for parallel simulations.Since the ILU preconditioner is based on matrix factoring, its performance is dramatically affected bythe ordering of DOFs in the matrix A∗. Quotient minimum degree (QMD) and reverse Cuthill-McKee(RCM) are among the reordering algorithms that are offered in PETSc [10], and are used in ANSLib. Theformer reduces the fill of the factored matrices, while the latter reduces the fill of matrix A∗ itself.In the course of development of ANSLib, Nejat and Ollivier-Gooch [49] used an ILU preconditionerfactored from the LHS matrix of the 0-exact scheme to solve the Euler equations. They showedthe good performance of their preconditioning method for 1- and 2-exact finite volume schemes, butconcluded that the 3-exact scheme requires a more effective preconditioner. Later, Michalak andOllivier-Gooch [47] demonstrated that incomplete factorization of the higher-order LHS matrix resultsin faster convergence compared to factoring the 0-exact LHS matrix for inviscid problems. They furtherconjectured that factoring the LHS matrix to the full order using a fill level of 3 has a feasible memoryconsumption, and can be a practical preconditioner for three-dimensional flow problems. Hereinafter,the ILU preconditioner of fill level p factored from the 0-exact and full-order LHS matrices will bedenoted as LO-ILUp and HO-ILUp, respectively.Viscous flow problems are more challenging than their inviscid counterparts. Jalali and Ollivier-Gooch[32] observed that a fill level of three or larger is required forHO-ILUpreconditioning of viscous turbulentflow problems. The HO-ILU3 method is a practical preconditioner for two-dimensional problems, butits memory cost soars drastically in three-dimensions, hindering its implementation for such problems.To mitigate this issue, a new ILU based algorithm will be introduced in the next section that not only hasa less memory consumption than HO-ILU3, but also does not suffer from poor performance of LO-ILUwhen used with the 3-exact reconstruction scheme.4.4 Improved Preconditioning AlgorithmsIn this section, an effective preconditioning algorithm is proposed based on inner GMRES iterations. Fur-thermore, the lines of strong unknown coupling are presented for reordering of the LHS matrix. As willbe shown in Section 4.5, this reordering method can improve the speed of the solver considerably.4.4.1 Inner GMRES IterationsSome researchers have observed that the lower-order LHS matrix can construct a more effective precon-ditioner compared to its higher-order counterpart because the structure of the lower-order LHS matrixonly includes the immediate neighbors of every control volume [41, 56, 78]. Nevertheless, as will beshown in Section 4.5, the LO-ILU method can behave poorly for high-order reconstruction schemesapplied to viscous flow problems. Our conjecture is that using the exact inverse of the lower-order LHS34matrix as the preconditioner: P = (A∗)−1, rather than the product of the ILU matrices: P = (L˜U˜)−1, canmitigate the mentioned issue. In this case, the matrix-by-vector product z = Pv can be found by solvingthe system:A∗z = v. (4.14)Nevertheless, the linear system of Equation (4.14) can be as large as the original linear system Ax = b,and its solution cannot be carried out exactly. Instead of seeking the exact solution, the proposal is to usefurther inner ILU preconditioned GMRES iterations to find an approximate solution for Equation (4.14).The resulting preconditioner will be referred to as GMRES-LO-ILU in this thesis, and its performancewill be examined in Section 4.5.When an inner GMRES solver is used as the preconditioner, P will not be a linear operator anymore, andits effect changes from iteration to iteration. Therefore, the equation x(m) = PVmαm cannot be used anylonger. The solution to this issue is using a Flexible GMRES (FGMRES) outer solver [62]. The resultingcombination, i.e., a GMRES-LO-ILU preconditioned FGMRES solver, is shown in Algorithm 2. Unlikenormal GMRES, the z vectors are found by applying the inner GMRES solver to the v vectors, and areexplicitly kept track of as the columns of the Z matrix. Moreover, the final approximate solution is writtenas x(m) = Zmαm while the ILU factored matrices are still supplied to the algorithm for preconditioningthe inner GMRES solver. Also, tolerance values and limits for the maximum number of iterations haveto be specified for both the inner and outer solvers which are identified by the subscripts (.)i and (.)o,respectively.Algorithm 2 GMRES-LO-ILU preconditioned FGMRES with no restart and a zero initial guess.function FGMRES(A, A∗, L˜, U˜, b, no, rtolo, ni, rtoli)β = ‖b‖2v1 = b/βV1 = [v1], H0 = [], Z0 = [],m = 0repeatm = m + 1zm =GMRES(A∗, (L˜U˜)−1, vm, ni, rtoli)w = Azmhm, vm+1 = Arnoldi(Vm,w)Vm+1 = [Vm, vm+1], Hm = [Hm−1, hm], Zm = [Zm−1, zm]αm, γm =DenseSolve(β,Hm)until m > n or γm/β < rtolReturn x(m) = Zmαm4.4.2 Lines of Strong Coupling Between UnknownsForming non-overlapping lines that contain control volumes with strongly coupled degrees of freedomis beneficial in constructing effective preconditioners. Mavriplis [43] introduced the concept of lines inanisotropic unstructured meshes to enhance the performance of multigrid solvers via implicit smoothing.35His approximate algorithm for finding such lineswas purely geometric-based, and only considered controlvolume aspect ratios. Thus, his method only captured coupling through diffusion. Okusanya et al. [52]later developed an algorithm that considered both the advection and diffusion phenomena. Fidkowskiet al. [21] proved that such an algorithm results in a set of unique lines, and used the lines for nonlinear p-multigrid solution of Euler equations. Diosady and Darmofal [20] showed that reordering the unknowns,such that members of a line have consecutive numbers, is an effective reordering strategy for the ILUpreconditioner. In this thesis, the line reordering algorithm of [20] is implemented to improve the speedof the linear solver.In the first step of the line creation algorithm, a weight is assigned to every face inside the mesh.This weight is derived from the Jacobian of the 0-exact discretization of the advection-diffusion equa-tion,∇ · (vu − µL∇u) = 0, (4.15)where u is the unknown variable, v is the velocity taken from the initial conditions, and µL is an inputparameter that controls the sensitivity of the lines to mesh anisotropy. The weight of a face f is thendefined as:W( f ) = max(∂Rσ∂uτ,∂Rτ∂uσ), (4.16)where W is the weight of the face, σ and τ are the adjacent control volumes of the face, and R theresidual vector. If a face is located on the boundary, Equation (4.16) will not directly be applicable to it.Thus, a mirrored ghost control volume is placed adjacent to every boundary face. After finding the faceweights, calling Algorithm 3 will construct the lines, where F(τ) is the set of faces of control volumeτ, and σ = C(τ, f ) is the control volume which has the face f in common with control volume τ. Thisalgorithm ensures that two adjacent control volumes are connected to each other if and only if they havetheir first or second highly weighted face lying between them [21].To illustrate the performance of the algorithm, an anisotropic mesh for the geometry of a NACA 0012airfoil is considered, as shown in Figure 4.2. The velocity has been taken from the 2-exact solution ofthe flow resulting from a Mach number of 0.15, a Reynolds number of 6 × 106, and an angle of attackof 10◦. The parameter µL is set to the heuristic value of 10−5. As desired, the lines follow the directionof mesh anisotropy near wall boundaries, while their pattern changes, and follows the flow direction inother parts of the geometry.4.5 Numerical ComparisonsIn this section, the performance of the proposed preconditioning algorithm is studied by considering theturbulent flow around a NACA 0012 airfoil, with α = 10◦, Ma = 0.15, and Re = 6 × 106. Three nestedmeshes with approximately 25K, 100K, and 400K control volumes (NCV = 25K,100K, and 400K) areconsidered, where the coarsest mesh and its corresponding lines of strong unknown coupling are shownin Figure 4.2. The farfield boundaries are located almost 500 chords away from the airfoil, and the chord36Algorithm 3 Line creationfunction CreateLines . Calling this function creates all lines.Unmark all control volumes.while there a exists an unmarked control volume τ doMakePath(τ) . Forward path creationMakePath(τ) . Backward path creationfunctionMakePath(τ) . Helper function.repeatFor control volume τ, pick the face f ∈ F(τ)with the highest weight, such that control volumeσ = C(τ, f ) is not part of the current line.if f is a boundary face thenTerminateelse if σ is marked thenTerminateelse if f is not the first or second ranked face of σ in weight thenTerminateMark σ.Add σ to the current line.τ ← σuntil Terminationlength has a nondimensional value of one. This problem is taken from the NASA Turbulence ModelingResource (TMR) website [60], and has been previously studied by Jalali and Ollivier-Gooch [32], wherethey used a HO-ILU3 preconditioner with QMD reordering. Only the highest-order discretizationscheme of ANSLib (3-exact) is considered here, since it results in the stiffest linear systems that are mostdifficult to solve. Also, the initial conditions are taken from the steady-state solution of the lower-order2-exact scheme.The ILU based preconditioning methods tested are shown in Table 4.1. In all cases, the outer GMRESsolver stops when the linear residual of Equation (4.3) is reduced by a factor of 10−3, or a maximumnumber of 500 iterations is performed. The maximum Krylov subspace size is 100 for the outer GMRESsolver, and the initial CFL number is set to 10−2. Furthermore, the simulation ends when the norm ofthe residual vector ‖R(Uh)‖2 is reduced by a factor of 10−8. The command-line options supplied to theANSLib executable for each case are listed in Appendix A. Cases A-E are compared to each other onthe 25K and 100Kmeshes, while case E∗ is used to solve the problem on the finest mesh. Note that othercombinations of preconditioner and reordering schemes that are not considered in Table 4.1, did notresult in convergence even for the coarse mesh. Moreover, an extensive search for finding the optimumnumber of inner iterations ni has not been carried out, and the large number of inner iterations for caseE∗ is chosen as a safe measure to ensure convergence on the finest mesh.The residual history of the solver for each preconditioning algorithm is shown in Figure 4.3, where thewalltime is obtained from a single core of an Intel i7-4790 (3.60 GHz) CPU. Our proposed preconditioning37Block ColorFigure 4.2: Mesh and lines of strong unknown coupling for the geometry of a NACA 0012 airfoilTable 4.1: Preconditioning methods consideredCase Preconditioning Reordering rtoli, niname method algorithmA HO-ILU3 QMD —B LO-ILU0 RCM —C LO-ILU0 lines —D GMRES-LO-ILU0 RCM 10−3, 10E GMRES-LO-ILU0 lines 10−3, 10E∗ GMRES-LO-ILU0 lines 10−3, 60380.0 0.2 0.4 0.6 0.8 1.0Wall time (s)|R(Uh)|20 200 400 600 80010−610−410−2100102104NCV=25K0 1000 2000 3000 4000 5000NCV=100KCase ACase BCase CCase DCase EFigure 4.3: Comparison of residual histories for the NACA 0012 problem obtained using differentpreconditioning algorithms05×1041×105Wall time (s)10-610-410-2100102‖R(Uh)‖2case E∗Figure 4.4: Residual history for the NACA 0012 problem on the finest mesh of 400K controlvolumesalgorithm (case E) outperforms all the other methods on both mesh sizes, and takes only half the timeof the HO-ILU3 algorithm (case A) to find the steady-state solution. Furthermore, the line reorderingalgorithm shows its prominence on the 100K control volumemesh, where the RCM reordering algorithmfails. The effectiveness of the new preconditioning algorithm is further demonstrated by solving theproblem on a 400K control volume mesh. The residual history for this case is shown in Figure 4.4.Table 4.2 shows the number of PTC iterations (N-PTC), the number of outer GMRES iterations (N-GMRES), totalmemory consumption, CPU time spent on the linear solver (LST), and total computationaltime (TST). The proposed preconditioning scheme (case E) speeds up the linear solve process by a factorof three, and results in a twofold reduction in memory consumption, compared to the HO-ILU3 method.The shortcomings of the LO-ILU methods (cases B and C) are evident because of their large number ofPTC iterations and failure in convergence, on the coarse and medium meshes, respectively. Also, the39Table 4.2: Performance comparison for different preconditioning schemes. Cases with entriesmarked by “−” did not converge to the steady state solution.Preconditioner N-PTC N-GMRES Memory(GB) LST(s) TST(s)NCV = 25KA 37 956 1.4 416 635B 44 10, 893 0.7 264 572C 51 13, 692 0.7 328 705D 34 1, 856 0.7 134 379E 34 1, 787 0.7 118 361NCV = 100KA 39 4, 640 5.9 3, 122 4, 068B − − 2.8 − −C − − 2.8 − −D − − 3.2 − −E 36 4, 396 3.2 1, 308 2, 348NCV = 400KE∗ 38 6, 869 13.1 87, 312 91, 502Table 4.3: Computed drag and lift coefficients for the NACA 0012 airfoilNCV CDp CDv CL25K 0.00545 0.00583 1.0951100K 0.00615 0.00623 1.0905400K 0.00609 0.00619 1.0922NASA TMR 0.00607 0.00621 1.0910memory consumption scales linearly with respect to the problem size for the proposed preconditioningscheme, whereas the linear solve time seems to be increasing at a much more rapid rate. Although theproposed preconditioning scheme has a huge linear solve time for NCV = 400K, it still outperforms theHO-ILU3 method, which cannot handle this mesh size at all.The viscous drag (CDv), the pressure drag (CDp), and the lift (CL) coefficients computed from thenumerical solution along with their reference values are listed in Table 4.3. The reference values aretaken from the NASA TMR website, and are obtained by the FUN3D [2] solver running on a veryfine mesh (approximately 2M quadrilaterals). As expected, the change in the coefficients between thefine and medium meshes is much smaller than that of the medium and coarse meshes. Furthermore,the gap between the coefficients and their reference values gets smaller with mesh refinement, with theexception ofCL . Solution singularities, different boundary condition implementations between ANSLiband FUN3D, or inexact evaluation of the distance from wall function may partly be accountable for thenon-ideal behavior of the lift coefficient. Nevertheless, the purpose of this section was the solution ofthe discretized system of equations, which according to Figure 4.4 has been performed correctly.40Chapter 5Three-Dimensional ResultsThis chapter presents the k-exact finite volume solution of four three-dimensional problems. Theproblems are studied in order of difficulty, and the solutions are verified to assess the accuracy andperformance of the solver. First, Poisson’s equation is solved in a cubic domain, where the source termis obtained from a manufactured solution [64]. Then, the Euler equations are solved to simulate thesubsonic inviscid flow around a unit sphere. Subsequently, the solution of Navier-Stokes + negative S-Aequations are presented to simulate the subsonic viscous turbulent flow over a flat plate and the extrudedNACA 0012 airfoil. All the command-line options provided to the solver for running each case areprovided in Appendix A.All the flow simulations were performed using the Grex cluster from the WestGrid computing facilities[3]. Each node of the cluster consists of two 6-core Intel Xeon X5650 2.66GHz processors, and has aminimum of 48GB memory. All compute nodes are connected by a non-blocking Infiniband 4X QDRnetwork. The wall time and memory cost of all the flow problems are reported using 8 processors. Inaddition, a strong parallel scaling test is conducted for the flat plate and the sphere test cases, i.e., thesame problem is solved by using different numbers of processors ranging from 1 to 10 while the resultingspeedup is recorded.5.1 Poisson’s Equation in a Cubic DomainFor this test case, Poisson’s equation is considered inside the domain Ω = [0, 1]3. The manufacturedexact solutionu = sinh(sin(x1)) sinh(sin(x2)) sinh(sin(x3)) (5.1)is used to find the source term, while homogeneous Dirichlet conditions are imposed over all theboundaries. Slice plots of the exact solution are shown in Figure 5.1. The solution has a maximum valueof (sinh(1))3 at the point (12, 12, 12 ), and decays to zero on the boundaries.41Figure 5.1: Exact solution of Poisson’s problemA study of the solution accuracy is performed on four families of unstructured grids. The first gridfamily is only composed of hexahedra while the other families are composed of only prisms, mixtureof pyramids and tetrahedra, and only tetrahedra, respectively. The hexahedral grids are constructed byperturbing the vertices of a uniform N × N × N structured mesh. Other grid families are subsequentlycreated by decomposing each hexahedron into the desired element types. Figure 5.2 shows all the gridsfor N = 5.The problem is solved on grid sizes: N = 10, 20, 40, and reconstruction orders: k = 1, 2, 3. Figure 5.3shows the L2 norm of the discretization error ‖eh ‖2 as a function of the mesh length scale h = 1N andthe number of degrees of freedom. Optimal accuracy order of k + 1 is attained for the k = 1 and 3reconstruction schemes as expected, whereas the k = 2 scheme only achieves a second-order accuratesolution. Nevertheless, the non-optimal behavior of k = 2 is expected for Poisson’s problem, and waspreviously addressed and theoretically explained by Ollivier-Gooch and Van Altena [54].5.2 Inviscid Flow Around a SphereThe second test problem is the inviscid flow around a sphere with unit radius, and centered at the point(0, 0, 0). The Mach number is equal to Ma = 0.38, and the free-stream flow is in the x1 direction. Thisproblem is chosen as a three-dimensional extension of the inviscid flow around a circle, the higher-ordersolution of which was studied by Bassi and Rebay [14]. Three prismatic grids (64K, 322K, and 1Mcontrol volumes) are used, where the farfield is located at a distance of 100 dimensionless units from thesphere surface. A symmetric cut of the coarsest mesh near the sphere surface is shown in Figure 5.4.The faces and control volumes adjacent to the wall boundaries are curved using only second-orderLagrange polynomials, as quadratic functions are sufficient to represent a spherical surface. To ensurethat numerical quadrature does not introduce additional error in the solution, a quadrature scheme42(a) (b)(c) (d)Figure 5.2: Different meshes for the solution of Poisson’s problem: (a) hexahedra, (b) prisms, (c)mixture of pyramids and tetrahedra, (d) tetrahedra.accurate up to order k + 2 is employed.In all cases, the initial solution is the uniform free-stream condition everywhere, and the initial CFLnumber is set to 10. The GMRES linear solver stops when the linear residual of Equation (4.3) is 10−3times its initial value, or a maximum of 500 iterations is performed. Furthermore, the GMRES solverhas a maximum Krylov subspace size of 100, and is preconditioned using the BJ preconditioner withno = 4 iterations. The inner solver for the BJ preconditioner is PGS with ni = 4 iterations. Althoughthe LHS matrix of the linear system is calculated to full-order, its preconditioner is constructed from theLHS matrix of the 0-exact reconstruction scheme.As the flow does not contain any discontinuity, the entropy must be constant throughout the domain.Therefore, the L2 norm of the entropy relative to the free-stream conditions, ‖S − S∞‖2, has an exactvalue of zero, and is used to study the solution accuracy. The convergence of ‖S − S∞‖2 with meshrefinement is shown in Figure 5.5, where the mesh length scale is defined as h = (NCV)−1/3. The k = 1and 3 reconstruction schemes converge even faster than their expected ratios of 2 and 4, respectively.4312 3 4h0/h10−610−510−410−310−210−1|eh|2k=112 3 4h0/hk=212 3 4h0/hk=3Hex Prism Pyramid+Tet Tet O(hk+1)(a)103104105106NDOF10−610−510−410−310−210−1|eh|2k=1103104105106NDOFk=2103104105106NDOFk=3(b)Figure 5.3: Discretization error versus mesh size for Poisson’s problem: (a) Error versus meshlength scale, (b) Error versus number of degrees of freedom.Figure 5.4: Symmetric cut of the coarsest mesh near the sphere surface441 2 3 4 5 6h0/h10-710-610-510-410-3‖Sh−S∞‖2ave. slope=2.2ave. slope=2.4ave. slope=4.8k=1k=2k=3Figure 5.5: Relative entropy norm versus mesh size for the sphere problemThe convergence ratio for the k = 2 reconstruction scheme, however, is half an order smaller than itsnominal value. The slight deviation of the convergence orders from their theoretical values can partlybe attributed to the fact that the meshes employed are not nested.The effect of the reconstruction order k on solution accuracy is also evident from the qualitativeMach contours on the x3 = 0 symmetry plane, shown in Figure 5.6. Not surprisingly, only the k = 3reconstruction scheme has been able to capture the symmetry of the flow on the finestmesh. Furthermore,the artificialwake behind the sphere seems smaller for the k = 2 solution compared to that of k = 1.The norm of the residual vector per PTC iteration is shown in Figure 5.7. Convergence is independent ofthe reconstruction order k, but the number of iterations increases as the mesh is refined. It is noteworthyto mention that the increase in the initial norm corresponding to mesh refinement is due to the definitionof the residual vector R(Uh) in Equations (2.3) and (2.4). Each entry of the residual vector is divided bythe size of its corresponding control volume. Therefore, when the mesh is refined, the control volumesizes become smaller, and the residual vector entries will increase.The number of iterations and the resource consumption of the flow solver are listed in Table 5.1. Thetime spent on linear solves makes up less than twenty percent of the computational time, suggesting thatthe Jacobian evaluation is the major computational bottleneck for this problem. Either increasing thereconstruction order, or the number of control volumes results in stiffer linear systems, which requiremore GMRES iterations to be solved. Nevertheless, both the solution time and memory consumptionincrease linearly with the number of control volumes for all k. Also, the advantage of using a high-orderreconstruction scheme can be clearly seen: According to Figure 5.5, the k = 1 scheme needs at leastone level of uniform mesh refinement on the finest grid to achieve an entropy norm of 10−7. While sucha mesh refinement would increase the resource consumption by a minimum factor of eight, the k = 3scheme achieves the same level of accuracy by consuming only two times the same resources.To test the parallel scalability of the flow solver, the sphere problem was solved on different numbers of45NCV = 64K NCV = 1Mk=1k=2k=3Figure 5.6: Computed Mach contours on the x3 = 0 symmetry plane for the sphere problem460.0 0.2 0.4 0.6 0.8 1.0PTC iteration0. 10 2010-1210-1010-81-610-410-2100102‖R(Uh)‖2k=10 10 20k=20 10 2k=3NCV = 64KNCV = 322KNCV = 1MFigure 5.7: Norm of the residual vector per PTC iteration for the sphere problemTable 5.1: Number of iterations and the resource consumption of the flow solver for the sphereproblemk N-PTC N-GMRES Memory(GB) LST(s) TST(s)NCV = 64K1 15 182 2.94 24 1072 15 181 3.91 20 1123 15 255 6.00 31 320NCV = 322K1 16 207 13.24 134 5792 16 212 14.05 132 6203 16 300 28.39 271 1, 879NCV = 1M1 17 275 39.48 486 1, 9692 17 277 45.52 536 2, 1503 17 385 85.78 793 5, 904processors ranging from 1 to 10 with NCV = 322K and k = 3. The parallel speedup of the total solutionprocess, the Jacobian evaluation algorithm, and the linear solver scheme were computed separately, andare shown in Figure 5.8. Not surprisingly, the linear solver algorithm is less scalable than that of theJacobian integrator mainly because the performance of the BJ preconditioner decreases with an increasein the number of processors. The sudden dip of the parallel speedup for 9 processors may be due toa less favorable mesh partitioning, or communication problems on the cluster. Regardless, the overallsolution algorithm scales very well in parallel, as the linear solver only constitutes a small part of theoverall process.471 2 3 4 5 6 7 8 9 10# of processors12345678910Speed-upIdealTotalJac. int.Lin. solveFigure 5.8: The parallel speedup of the solver for the sphere problem: NCV = 322K and k = 3.5.3 Turbulent Flow Over a Flat PlateFor this test case, we consider a three-dimensional extension of the flat plate verification case fromthe NASA TMR website. The computational domain is Ω = [−0.33, 2] × [0, 1] × [0, 1], and thenondimensional parameters have the values: Re = 5 × 106, Ma = 0.2, α = 0, and ψ = 0. The flat plateis located on the (0 ≤ x1 ≤ 2) ∧ (x2 = 0) boundary, where adiabatic solid wall conditions are imposed.Symmetry boundary conditions are imposed on the (x3 = 0), (x3 = 1), and (−0.33 ≤ x1 ≤ 0) ∧ (x2 = 0)boundaries, while other boundaries are considered as farfield. As symmetry boundary conditions areimposed on all the boundaries normal to the x3-axis, the exact solution of this problem must be constantin the x3 direction. Therefore, the solutions obtained in this section can be compared to the two-dimensional results from the NASA TMR website, which are obtained by the CFL3D solver [1] on a544 × 384 grid.The two-dimensional meshes provided by the NASA TMR website are extruded in the x3 direction toconstruct a series of nested three-dimensional grids with dimensions: 60 × 34 × 7, 120 × 68 × 14, and240 × 136 × 28. The coarsest mesh is depicted in Figure 5.9. An anisotropic mesh is necessary tocorrectly capture the flow pattern in the boundary layer and at the plate leading edge. On each mesh, theproblem is solved for k = 1, then for k = 2, and finally for k = 3. The free-stream conditions are usedas the initial conditions for k = 1, while the initial solution for each k ≥ 2 is taken from the convergedsolution of the (k − 1)-exact scheme. The GMRES-LO-ILU2 method is used as the preconditioner,with 10 inner GMRES iterations, and the RCM reordering algorithm. Moreover, the ILU factorizationis performed only on the diagonal block of each processor. Other solver parameters are the same asSection 5.2.To schematically demonstrate the correctness of the solution, the distribution of the turbulence workingvariable obtained from k = 3 on the finest mesh is plotted on the x3 = 0.5 plane, and shown inFigure 5.10a. The results for a 544×384 grid provided by theNASATMRare also shown in Figure 5.10b.48XYZFigure 5.9: Coarsest mesh for the flat plate problem.(a) 3-exact: min : 7 × 10−3, max : 379 (b) NASA TMR: min : 0, max : 378Figure 5.10: Distribution of the turbulence working variable on the plane x3 = 0.5 for the flat plateproblemThere is a close agreement between the reference solution of NASA TMR and the solution of this thesis,although a few undershoots of small magnitude are present in the latter. The next quantity of interestis the distribution of the nondimensional eddy viscosity µTµ on the line (x1 = 0.97) ∧ (x3 = 0.5),which is depicted in Figure 5.11. Note how the solution discontinuity at control volume boundaries isquite evident for k = 1, while higher-order reconstruction schemes result in much smoother solutions.The reference solution from the NASA TMR website is also shown in Figure 5.11. An acceptableagreement is observable between our computed results and the reference values. Also, it is evident thata higher-order reconstruction scheme leads to a more accurate estimate of the eddy viscosity.To further verify the numerical solution, the convergence of the drag coefficient CD and the skinfriction coefficient Cf at the point x = (0.97, 0, 0.5) are studied with mesh refinement. Table 5.2 showsthe computed values on different meshes. The computed values converge faster for a higher-order490.0 0.2 0.4 0.6 0.8×34×7NASA TMRk=1k=2k=×68×140 50 100 150 200 250µT/µ0.000.010 020.03240×136×28Figure 5.11: Distribution of the nondimensional eddy viscosity on the line (x1 = 0.97)∧(x3 = 0.5)for the flat plate problemreconstruction scheme, such that only k = 3 offers an accurate solution on the medium mesh. Thereference values from the NASA TMR library are also shown in the same table. As desired, there is agood agreement between the computed values using the method of this thesis and the reference valuesof the NASA TMR website. The convergence orders of CD and Cf are computed using the procedure ofCelik et al. [16], and are also shown in Table 5.2. All the reconstruction schemes have attained optimalconvergence rates. Nevertheless, care must be taken when interpreting the estimated convergence ratesbecause the procedure of Celik et al. is most reliable when the error has already achieved an asymptoticbehavior on the coarsest mesh, which might not necessarily be the case here.The norm of the residual vector per PTC iteration is shown in Figure 5.12. Similar to Section 5.2,convergence is not affected by the reconstruction order k, but is slightly degraded as the number ofdegrees of freedom increases. The number of iterations and the resource consumption of the flow solverare listed in Table 5.3. The majority of the total time is spent on linear solves, showing that the linearsystems arising from viscous turbulent flows are much stiffer compared to their inviscid counterparts.50Table 5.2: Computed value and convergence order of the drag coefficient and the skin frictioncoefficient at the point x = (0.97, 0, 0.5) for the flat plate problemCD CfNASA TMR 0.00286 0.00271Meshk 1 2 3 1 2 360 × 34 × 7 0.00396 0.00233 0.00233 0.00350 0.00228 0.00222120 × 68 × 14 0.00301 0.00281 0.00285 0.00283 0.00268 0.00271240 × 136 × 28 0.00287 0.00286 0.00286 0.00274 0.00273 0.00273Convergence order 2.8 3.3 5.4 3 3 5.10 10 20 30 40 50 60 70 80 90PTC iteration10-910-710-510-310-1101103105‖R(Uh)‖2k=1 k=2 k=360×34×7120×68×14240×136×28Figure 5.12: Norm of the residual vector per PTC iteration for the flat plate problemAs desired, memory consumption increases linearly with mesh refinement, whereas the linear solve timedoes not scale as nicely as for the inviscid sphere problem. Most importantly, the benefit of higher-ordermethods can once again be observed: While the k = 3 solution on the medium mesh offers the samelevel of accuracy as the k = 1 solution on the fine mesh, the computational time of the former is smallerthan that of the latter by a factor of four.A strong parallel scaling test is conducted for the k = 3 scheme on the 120×68×14 mesh. The resultingspeedup for different parts of the solver is shown in Figure 5.13, which is quite similar to the resultsof the sphere problem, and suggests that the preconditioning algorithm is performing well for viscousflow problems. Nevertheless, the net scaling of the solver degrades marginally compared to the sphereproblem, as linear solves take up a bigger portion of the total solution time.5.4 Turbulent Flow Over an Extruded AirfoilFor the last test case, we consider a three-dimensional extension of the flow around the NACA 0012airfoil. The computational domain of Section 4.5 is extruded in the x3 direction with an extrusion length51Table 5.3: Number of iterations and the resource consumption of the flow solver for the flat plateproblemk N-PTC N-GMRES Memory(GB) LST(s) TST(s)60 × 34 × 7 mesh1 26 844 0.42 31 552 26 1, 009 1.35 42 1243 26 1, 071 2.10 57 202120 × 68 × 14 mesh1 28 1, 436 5.24 510 7132 29 1, 864 8.30 742 1, 4893 29 2, 041 14.63 825 2, 124240 × 136 × 28 mesh1 29 2, 492 38.77 6, 222 7, 8842 27 3, 305 60.70 12, 353 18, 1373 27 2, 906 121.81 23, 141 35, 4121 2 3 4 5 6 7 8 9 10# of processors12345678910Speed-upIdealTotalJac. int.Lin. solveFigure 5.13: The parallel speedup of the solver for the flat plate problem: 120 × 68 × 14 mesh andk = 3.of one nondimensional unit. Symmetry boundary conditions are imposed on the two boundaries normalto the x3-axis, while other boundaries retain their conditions from Section 4.5. The nondimensionalparameters are: Re = 6 × 106, Ma = 0.15, α = 10, and ψ = 0. As the exact solution must be constant inthe x3 direction, the solutions are compared to the reference values provided by NASA TMR, which areobtained by the FUN3D [2] solver on a 7169 × 2049 grid.Two meshes are employed: a hexahedral mesh with NCV = 100K, and a mixed prismatic-hexahedralmesh with NCV = 176K. In both cases, quadrature points are obtained from the tensor product of one-dimensional quadrature formulas and the quadrature points of the curved two-dimensional meshes. Aportion of the meshes is depicted in Figure 5.14. In both cases, there are 7 layers of extruded controlvolumes in the x3 direction. A value of MinNeigh(3) = 30 performed acceptably for the hexahedralmesh, whereas the same value resulted in an ill-conditioned version of Equation (2.7) for the mixed52XYZ(a) Hexahedral meshXYZ(b) Mixed prismatic-hexahedral meshFigure 5.14: Meshes for the extruded NACA 0012 problemmesh. Therefore, MinNeigh(3) = 60 was used in the latter case as a safe measure to ensure a well-posed minimization problem for Equation (2.7). Other solver parameters, including the order rampingprocedure for solution initialization, are the same as Section 5.3.The computed contours of turbulence working variable on the x3 = 0 plane are shown in Figure 5.15for the k = 3 solutions. The mixed-mesh resolves the wake region more accurately, while providing asmoother distribution of the turbulence working variable. The superior performance of the mixed meshcan be explained by its bigger number of degrees of freedom, and the presence of extra flow alignedfaces that are missing in the hexahedral mesh.The distribution of the surface pressure coefficient on the intersection of the airfoil and the x3 = 0.5plane is computed using the k = 1 and 3 solutions, and depicted in Figure 5.16. Generally, the computeddistributions are consistent with the reference values obtained by FUN3D. Closer views of the plots areshown for four distinct points to better compare the results. As expected, the k = 3 scheme predicts themost accurate values, particularly on the mixed mesh. For example, the FUN3D results show that thepressure coefficient on the lower surface becomes bigger than that of the upper surface, near the trailingedge of the airfoil. This phenomenon is captured by the k = 3 solutions, but not observed by those ofk = 1.53(a) Hexahedral mesh(b) Mixed prismatic-hexahedral meshFigure 5.15: Distribution of the turbulence working variable for the extruded NACA 0012 problemon the x3 = 0 planeTable 5.4: Computed drag and lift coefficients for the extruded NACA 0012 airfoil problemk CDp CDv CLNASA TMR− 0.00607 0.00621 1.0910Hex mesh1 0.01703 0.00582 1.06192 0.01702 0.00497 1.05073 0.00301 0.00472 1.0417Mixed mesh1 0.01129 0.00574 1.07352 0.00365 0.00565 1.07763 0.00550 0.00536 1.0869The computed and the reference lift and drag coefficients are shown in Table 5.4. It seems that thehexahedral mesh is too coarse, as none of the schemes can obtain accurate enough solutions. For themixed mesh, the k = 3 scheme gives satisfactory results with less than 10% error for all the coefficients.The solution of the other schemes, however, is quite off, particularly for the pressure drag coefficientCDp. Note that the reference results are obtained using a 7169 × 2049 mesh, which is more than athousand times finer than the meshes employed in this thesis.The norm of the residual vector per PTC iteration is shown in Figure 5.17. Convergence is marginallyaffected by either the mesh type or the reconstruction order k. The latter is a desirable result of usingthe steady-state solution of a k-exact scheme as the initial guess for the (k + 1)-exact solution.540.0 0.2 0.4 0.6 0.8 1.0x−6−5−4−3−2−1012Cp(a)(b)(c)(d)FUN3Dk=3, hexk=3, mixedk=1, hexk=1, mixed(a) (b)(c) (d)Figure 5.16: Distribution of surface pressure coefficient on the intersection of the extruded NACA0012 airfoil and the x3 = 0.5 plane.0 20 40 60 80 100PTC iteration10-710-510-310-1101103105‖R(Uh)‖2k=1 k=2 k=3HexMixedFigure 5.17: Norm of the residual vector per PTC iteration for the extruded airfoil problem55Table 5.5: Resource consumption of the flow solver for the extruded NACA 0012 problemk N-PTC N-GMRES Memory(GB) LST(s) TST(s)Hex mesh, NCV = 100K1 33 1, 154 4.77 317 7442 31 1, 788 6.82 730 2, 0973 31 2, 415 12.23 1, 057 3, 215Mixed mesh, NCV = 176K1 34 1, 132 8.70 458 1, 1642 32 1, 769 10.87 800 2, 4273 31 2, 185 26.47 1, 311 4, 666Finally, parameters related to the iterative convergence of the solver are listed in Table 5.5. The linearsolve time makes up around 30% of the total computation time. The total computation time is stronglydependent on k, but does not differ considerably between the two meshes for the same reconstructionorder. Moreover, the memory consumption seems to be linearly increasing with both k and NCV.56Chapter 6ConclusionsA higher-order solution framework was presented for steady-state three-dimensional compressible flows.The highlights of this solution framework include: a k-exact finite volume discretization, a PTC solutionalgorithm, a memory lean preconditioner based on inner GMRES iterations, and an ILU reorderingalgorithm based on lines of strong coupling between them.Higher-order accuracy was achieved by constructing a piecewise continuous representation of the controlvolume average values. This piecewise continuous representation, also referred to as the discrete solution,was defined as the superposition of a set of basis functions. Monomials of Cartesian, principal, orcurvilinear coordinates were used as the basis functions while the coefficient of each monomial in thesuperposition was found by solving a constrained minimization problem. In the finite volume method,the convective fluxes were evaluated by the Roe flux function. The viscous fluxes were obtained byadding a stabilizing damping term to the averaged gradients of the two adjacent states. Specifically, theflux functions were presented for the RANS equations fully coupled with the negative S-A turbulencemodel.The curvature of the domain boundaries must be correctly accounted for in high-order numericaldiscretization schemes, yet simply replacing the boundary faces of highly anisotropic meshes withhigher-order representations can result in invalid intersecting elements. Thus, a three-dimensional finiteelement elasticity solver was developed to propagate the curvature of the boundary throughout thedomain, and to prevent mesh tangling. The developed solver was tested by solving a model problem thatimitated the geometry of a three-dimensional airfoil. Also, the principal coordinates and the curvilinearcoordinates basis functions were presented to prevent the k-exact reconstruction scheme from behavingpoorly on curved anisotropic meshes.The PTCmethodwas revisited for the solution of the discretized nonlinear equations, where each iterationrequired the solution of a linear system. The challenges involved in the solution of these linear systemswere addressed for three-dimensional problems, and a novel method was introduced to mitigate theseissues. In this method, the FGMRES linear solver was preconditioned using inner GMRES iterations,57which were subsequently preconditioned by the ILU method. The L˜ and U˜ matrices were factoredfrom the LHS matrix of the 0-exact discretization scheme, and the lines of strong unknown couplingwere employed to reorder the unknowns. By considering the 2-D fully turbulent flow around the NACA0012 airfoil, it was shown that the proposed preconditioning algorithm is more efficient both in termsof solution time and memory consumption, compared to the other ILU based preconditioning methodsconsidered.The correct implementation of element quadrature and connectivity information in the solver was verifiedby solving Poisson’s equation inside a cubic domain. Subsequently, inviscid and viscous turbulent testproblems were considered, where the solution was verified against reference values either obtainedanalytically or provided by the NASA TMR website. In all cases, a satisfactory agreement was observedbetween the solutions of this thesis and the reference values. Moreover, accuracy tests were performedwith mesh refinement, and optimal convergence order was attained for reconstruction orders k = 1 and 3.The k = 2 scheme, however, was slightly inferior in performance compared to its theoretical convergenceorder.Timing results demonstrated the benefit and practicality of using higher-order methods for obtaininga certain level of accuracy. A second-order discretization scheme required a finer grid and morecomputational time to attain the same level of accuracy compared to a higher-order discretizationscheme. Furthermore, strong parallel scaling tests were performed, and excellent scaling was observedfor the Jacobian integration algorithm. The linear solver algorithm also demonstrated satisfactory parallelscaling results, although the performance in this area can still be improved. It was also observed thatthe required number of PTC iterations for convergence is independent of the reconstruction order whileonly slightly affected by the grid size.Much work remains to be done to extend the solution method of this thesis to solve industrially practicalproblems:1. A theoretical study can be conducted to investigate the inferior performance of the k = 2 dis-cretization scheme. The theoretical approach for unstructured mesh-based discretization methodsis usually through the functional analysis framework. A possible starting point can be the workof Barth and Larson [13] where they prove theoretical convergence orders for the k-exact finitevolume method.2. The boundary condition implementations of this work, though functional, might not be the bestchoice. An investigation of other possible boundary condition implementations has the potentialto improve convergence properties and solution quality [29].3. The proposed preconditioning algorithm does not require the explicit form of the full-order LHSmatrix. Thus, a matrix-free method can be implemented, where the matrix-by-vector productof the full-order LHS matrix is evaluated by finite difference Fréchet derivatives while only the0-exact LHS matrix is assembled. This strategy can reduce the memory cost considerably [55],58while possibly increasing the computation time. Sacrificing computation time for less memoryusage might be unavoidable for very large problems.4. In this thesis, it was possible to directly use the 2-D curvilinear coordinate basis functions for 3-Dproblems, as the solutions of the test problems considered were constant in the x3 direction. Apractical problem, on the other hand, requires the definition of truly 3-D curvilinear coordinatebasis functions. It should be possible to generalize the method of Section 3.3 by providing adefinition for the third wall coordinate. Alternatively, a big hexahedral meta-element can beconstructed by extruding a portion of the nearest wall surface to include all the members of atarget reconstruction stencil. The reference coordinates of the constructed meta-element can thenbe used as the curvilinear coordinates.5. The original 0-exact discretization scheme does not take the gradient dependent source terms intoaccount. Modifying this scheme to somehow capture the influence of the gradient dependentsource terms can greatly enhance the performance of the preconditioning algorithm [78].6. Employing a line Gauss-Seidel preconditioner instead of the ILU method, in conjunction with theproposed inner GMRES preconditioner has the potential to improve the parallel scalability of thelinear solver [4].7. As already mentioned, real world problems contain solution discontinuities, and require shockcapturing methods to be used in conjunction with the k-exact reconstruction scheme. Shockcapturing is even more important for higher-order methods, as they are more prone to developingunstable and spurious solutions.8. Haider et al. [26] recently introduced a multi-level reconstruction scheme that uses compactreconstruction stencils, and eliminates the need to directly work with large stencils. Their high-order finite volume method is therefore better suited for parallel simulations. Although their workis limited to the linear advection equation, its application in the context of turbulent compressibleflows is worth investigating.9. An hp-adaptive strategy can be much more efficient in achieving a prescribed level of accuracy,compared to uniform mesh refinement, or using high reconstruction orders for all the controlvolumes [33, 38]. Desirably, hanging nodes resulting from adaptively refined meshes fit naturallyinto the k-exact finite volume framework. Nevertheless, an effective error estimator must bedesigned that selects the candidate regions for refinement. In the case of anisotropic meshrefinement, the error estimator must also provide an estimate of the direction in which the errorchanges most abruptly.59Bibliography[1] Computational Fluids Laboratory 3-D, 2011. URL https://cfl3d.larc.nasa.gov/.[2] Fully Unstructured Navier–Stokes in 3-D, 2011. URL http://fun3d.larc.nasa.gov/.[3] WestGrid computing resources, 2017. URL https://www.westgrid.ca/.[4] B. R. Ahrabi and D. J. Mavriplis. Scalable solution strategies for stabilized finite-element flowsolvers on unstructured meshes. In Proceedings of the Fifty-Fifth AIAA Aerospace SciencesMeeting, page 517, 2017.[5] B. R. Ahrabi, W. K. Anderson, and J. C. Newman III. An adjoint-based hp-adaptive stabilizedfinite-element method with shock capturing for turbulent flows. Computer Methods in AppliedMechanics and Engineering, 318:1030–1065, 2017.[6] S. R. Allmaras, F. T. Johnson, and P. Spalart. Modifications and clarifications for theimplementation of the Spalart-Allmaras turbulence model. In Proceedings of the SeventhInternational Conference on CFD, 2012.[7] P. R. Amestoy, I. S. Duff, and J.-Y. L’Excellent. MUMPS multifrontal massively parallel solverversion 2.0. Technical report, CERFACS: Centre EuropÃľen de Recherche et de FormationAvancÃľe en Calcul Scientifique, 1998. URL http://mumps.enseeiht.fr/.[8] J. Andren, H. Gao, M. Yano, D. Darmofal, C. Ollivier Gooch, and Z. Wang. A comparison ofhigher-order methods on a set of canonical aerodynamics applications. In Proceedings of theTwentieth AIAA Computational Fluid Dynamics Conference, page 3230, 2011.[9] D. N. Arnold. An interior penalty finite element method with discontinuous elements. SIAMJournal on Numerical Analysis, 19(4):742–760, 1982.[10] S. Balay, S. Abhyankar, M. F. Adams, J. Brown, P. Brune, K. Buschelman, L. Dalcin, V. Eijkhout,W. D. Gropp, D. Kaushik, M. G. Knepley, L. C. McInnes, K. Rupp, B. F. Smith, S. Zampini, andH. Zhang. PETSc users manual. Technical Report ANL-95/11 - Revision 3.6, Argonne NationalLaboratory, 2015. URL http://www.mcs.anl.gov/petsc.[11] G. E. Barter and D. L. Darmofal. Shock capturing with PDE-based artificial viscosity forDGFEM: Part I. Formulation. Journal of Computational Physics, 229(5):1810–1827, 2010.[12] T. J. Barth and P. O. Frederickson. Higher order solution of the Euler equations on unstructuredgrids using quadratic reconstruction. In Twenty-eighth AIAA Aerospace Sciences Meeting, Jan.1990. AIAA paper 90-0013.60[13] T. J. Barth and M. G. Larson. A posteriori error estimates for higher order Godunov finite volumemethods on unstructured meshes. Finite Volumes for Complex Applications III, London, 2002.[14] F. Bassi and S. Rebay. High-order accurate discontinuous finite element solution of the 2D Eulerequations. Journal of Computational Physics, 138(2):251–285, 1997.[15] J. C. Butcher. Numerical Methods for Ordinary Differential Equations. John Wiley & Sons, 2016.[16] I. B. Celik, U. Ghia, and P. J. Roache. Procedure for estimation and reporting of uncertainty dueto discretization in CFD applications. Journal of Fluids Engineering-Transactions of the ASME,130(7), 2008.[17] M. Ceze and K. J. Fidkowski. Constrained pseudo-transient continuation. International Journalfor Numerical Methods in Engineering, 102(11):1683–1703, 2015.[18] S.-W. Cheng, T. K. Dey, and J. Shewchuk. Delaunay Mesh Generation. CRC Press, 2012.[19] X. Deng, M. Mao, G. Tu, H. Zhang, and Y. Zhang. High-order and high accurate CFD methodsand their applications for complex grid problems. Communications in Computational Physics, 11(04):1081–1102, 2012.[20] L. T. Diosady and D. L. Darmofal. Preconditioning methods for discontinuous Galerkin solutionsof the Navier–Stokes equations. Journal of Computational Physics, 228(11):3917–3935, 2009.[21] K. J. Fidkowski, T. A. Oliver, J. Lu, and D. L. Darmofal. p-Multigrid solution of high-orderdiscontinuous Galerkin discretizations of the compressible Navier–Stokes equations. Journal ofComputational Physics, 207(1):92–113, 2005.[22] J. E. Flaherty. Course notes on finite element analysis. URL http://www.cs.rpi.edu/~flaherje/pdf.Accessed 05/12/17.[23] S. K. Godunov. A finite difference method for the numerical computation of discontinuoussolutions of the equations of fluid dynamics. Matematicheskie Sborniki, 47:357–393, 1959.[24] G. H. Golub and C. F. V. Loan. Matrix Computations. The Johns Hopkins University Press,Baltimore, 1983.[25] F. Haider, J.-P. Croisille, and B. Courbet. Stability analysis of the cell centered finite-volumeMuscl method on unstructured grids. Numerische Mathematik, 113(4):555–600, 2009.[26] F. Haider, P. Brenner, B. Courbet, and J.-P. Croisille. Efficient implementation of high orderreconstruction in finite volume methods. In Finite Volumes for Complex Applications VIProblems & Perspectives, pages 553–560. Springer, 2011.[27] F. Haider, P. Brenner, B. Courbet, and J.-P. Croisille. Parallel implementation of k-exact finitevolume reconstruction on unstructured grids. In High-Order Nonlinear Numerical Schemes forEvolutionary PDEs, pages 59–75. Springer, 2014.[28] A. Harten, P. D. Lax, and B. Van Leer. On upstream differencing and Godunov-type schemes forhyperbolic conservation laws. In Upwind and High-Resolution Schemes, pages 53–79. Springer,1997.61[29] R. Hartmann and T. Leicht. Generation of unstructured curvilinear grids and high-orderdiscontinuous Galerkin discretization applied to a 3D high-lift configuration. InternationalJournal for Numerical Methods in Fluids, 2016.[30] C. Hirsch. Numerical Computation of Internal and External Flows: Computational Methods forInviscid and Viscous Flows. Wiley, 1990.[31] H. Huynh, Z. J. Wang, and P. E. Vincent. High-order methods for computational fluid dynamics:a brief review of compact differential formulations on unstructured grids. Computers and Fluids,98:209–220, 2014.[32] A. Jalali and C. Ollivier-Gooch. Higher-order unstructured finite volume RANS solution ofturbulent compressible flows. Computers and Fluids, 143:32–47, 2017.[33] A. Jalali and C. F. Ollivier Gooch. An hp-adaptive ustructured finite volume solver forcompressible aerodynamic flows. In Proceedings of the Fifty-Fifth AIAA Aerospace SciencesMeeting, page 0082, 2017.[34] A. Jalali, M. Sharbatdar, and C. Ollivier-Gooch. Accuracy assessment of finite volumediscretization schemes on unstructured meshes. Computers and Fluids, 101:220–232, 2014.doi:10.1016/j.compfluid.2014.06.008.[35] C. T. Kelley and D. E. Keyes. Convergence analysis of pseudo-transient continuation. SIAMJournal on Numerical Analysis, 35(2):508–523, 1998.[36] B. S. Kirk, J. W. Peterson, R. H. Stogner, and G. F. Carey. libMesh: a C++ library for paralleladaptive mesh refinement/coarsening simulations. Engineering with Computers, 22(3-4):237–254, 2006.[37] W. M. Lai, D. H. Rubin, D. Rubin, and E. Krempl. Introduction to Continuum Mechanics.Butterworth-Heinemann, 2009.[38] T. Leicht and R. Hartmann. Anisotropic mesh refinement for discontinuous Galerkin methods intwo-dimensional aerodynamic flow simulations. International Journal for Numerical Methods inFluids, 56(11):2111–2138, 2008.[39] S. K. Lele. Compact finite difference schemes with spectral-like resolution. Journal ofComputational Physics, 103(1):16–42, 1992.[40] W. Li. Efficient Implementation of High-Order Accurate Numerical Methods on UnstructuredGrids. Springer, 2014.[41] H. Luo, J. D. Baum, and R. Löhner. A fast, matrix-free implicit method for compressible flows onunstructured grids. In Sixteenth International Conference on Numerical Methods in FluidDynamics, pages 73–78. Springer, 1998.[42] D. Mavriplis, D. Darmofal, D. Keyes, and M. Turner. Petaflops opportunities for the NASAfundamental aeronautics program. In Proceedings of the Eighteenth AIAA Computational FluidDynamics Conference, page 4084, 2007.[43] D. J. Mavriplis. Multigrid strategies for viscous flow solvers on anisotropic unstructured meshes.Journal of Computational Physics, 145(1):141–165, 1998.62[44] D. J. Mavriplis. Directional agglomeration multigrid techniques for high-Reynolds-numberviscous flows. AIAA Journal, 37(10):1222–1230, 1999.[45] D. J. Mavriplis. Grid resolution study of a Drag Prediction Workshop configuration using theNSU3D unstructured mesh solver. In Proceedings of the Twenty-Third AIAA AppliedAerodynamics Conference, June 2005. AIAA paper 2005-4729.[46] C. Michalak and C. Ollivier-Gooch. Accuracy preserving limiter for the high-order accuratesolution of the Euler equations. Journal of Computational Physics, 228(23):8693–8711, 2009.doi:10.1016/j.jcp.2009.08.021.[47] C. Michalak and C. Ollivier-Gooch. Globalized matrix-explicit Newton-GMRES for thehigh-order accurate solution of the Euler equations. Computers and Fluids, 39:1156–1167, 2010.doi:10.1016/j.compfluid.2010.02.008.[48] C. R. Nastase and D. J. Mavriplis. High-order discontinuous Galerkin methods using anhp-multigrid approach. Journal of Computational Physics, 213(1):330–357, 2006.[49] A. Nejat and C. Ollivier-Gooch. A high-order accurate unstructured finite volume Newton-Krylovalgorithm for inviscid compressible flows. Journal of Computational Physics, 227(4):2582–2609,2008.[50] H. Nishikawa. Beyond interface gradient: A general principle for constructing diffusion scheme.In Proceedings of the Fortieth AIAA Fluid Dynamics Conference, 2010. AIAA paper 2010-5093.[51] H. Nishikawa. Two ways to extend diffusion schemes to Navier-Stokes schemes: Gradientformula or upwind flux. In Proceedings of the Twentieth AIAA Computational Fluid DynamicsConference, page 3044, 2011.[52] T. Okusanya, D. Darmofal, and J. Peraire. Algebraic multigrid for stabilized finite elementdiscretizations of the Navier–Stokes equations. Computer Methods in Applied Mechanics andEngineering, 193(33):3667–3686, 2004.[53] C. F. Ollivier-Gooch. Quasi-ENO schemes for unstructured meshes based on unlimiteddata-dependent least-squares reconstruction. Journal of Computational Physics, 133(1):6–17,1997.[54] C. F. Ollivier-Gooch and M. Van Altena. A high-order accurate unstructured mesh finite-volumescheme for the advection-diffusion equation. Journal of Computational Physics, 181(2):729–752,sep 2002. doi:10.1006/jcph.2002.7159.[55] P.-O. Persson and J. Peraire. Newton-GMRES preconditioning for discontinuous Galerkindiscretizations of the Navier–Stokes equations. SIAM Journal on Scientific Computing, 30(6):2709–2733, 2008.[56] A. Pueyo and D. W. Zingg. Efficient Newton-Krylov solver for aerodynamic computations. AIAAJournal, 36(11):1991–1997, 1998.[57] P. Roe and J. Pike. Efficient construction and utilisation of approximate Riemann solutions. InProceedings of the Sixth International Symposium on Computing Methods in Applied Sciencesand Engineering, VI, pages 499–518. North-Holland Publishing Co., 1985.63[58] P. L. Roe. Approximate Riemann solvers, parameter vectors, and difference schemes. Journal ofComputational Physics, 43:357–372, 1981. doi:10.1016/0021-9991(81)90128-5. URLhttp://www.sciencedirect.com/science/article/pii/0021999181901285.[59] A. Rogerson and E. Meiburg. A numerical study of the convergence properties of ENO schemes.Journal of Scientific Computing, 5(2):151–167, 1990.[60] C. Rumsey. Turbulence Modeling Resource, 2014. URL http://turbmodels.larc.nasa.gov/.[61] V. Rusanov. Calculation of Interaction of Non-steady Shock Waves with Obstacles. Technicaltranslation. National Research Council of Canada, 1962.[62] Y. Saad. A flexible inner-outer preconditioned GMRES algorithm. SIAM Journal on ScientificComputing, 14(2):461–469, 1993.[63] Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, 2003.[64] K. Salari and P. Knupp. Code verification by the method of manufactured solutions. Technicalreport, Sandia National Labs., Albuquerque, NM (US); Sandia National Labs., Livermore, CA(US), 2000.[65] K. Shahbazi, D. J. Mavriplis, and N. K. Burgess. Multigrid algorithms for high-orderdiscontinuous Galerkin discretizations of the compressible Navier–Stokes equations. Journal ofComputational Physics, 228(21):7917–7940, 2009.[66] C.-W. Shu. High order ENO and WENO schemes for computational fluid dynamics. InHigh-order Methods for Computational Physics, pages 439–582. Springer, 1999.[67] P. Solin, K. Segeth, and I. Dolezel. Higher-Order Finite Element Methods. CRC Press, 2003.[68] P. Spalart and S. R. Allmaras. A one-equation turbulence model for aerodynamic flows. InThirtieth AIAA Aerospace Sciences Meeting, Jan. 1992. AIAA paper 92-0439.[69] P. K. Sweby. High resolution schemes using flux limiters for hyperbolic conservation laws. SIAMJournal on Numerical Analysis, 21(5):995–1011, Oct. 1984.doi:http://dx.doi.org/10.1137/0721062.[70] E. Toro and A. Chakraborty. The development of a Riemann solver for the steady supersonicEuler equations. The Aeronautical Journal, 98(979):325–339, 1994.[71] T. Toulorge, C. Geuzaine, J.-F. Remacle, and J. Lambrechts. Robust untangling of curvilinearmeshes. Journal of Computational Physics, 254:8–26, 2013.[72] B. Van Leer. Towards the ultimate conservative difference scheme. IV. A new approach tonumerical convection. Journal of Computational Physics, 23(3):276–299, 1977.[73] B. Van Leer. Towards the ultimate conservative difference scheme. V. A second-order sequel toGodunov’s method. Journal of Computational Physics, 135(2):229–248, 1997.[74] J. Vassberg, M. DeHaan, and T. Sclafani. Grid generation requirements for accurate dragpredictions based on OVERFLOW calculations. In Proceedings of the Sixteenth AIAAComputational Fluid Dynamics Conference, page 4124, 2003.64[75] M. R. Visbal and D. V. Gaitonde. On the use of higher-order finite-difference schemes oncurvilinear and deforming meshes. Journal of Computational Physics, 181(1):155–185, 2002.[76] J. Von Neumann and R. D. Richtmyer. A method for the numerical calculation of hydrodynamicshocks. Journal of Applied Physics, 21(3):232–237, 1950.[77] M. Wallraff, R. Hartmann, T. Leicht, N. Kroll, C. Hirsch, F. Bassi, C. Johnston, and K. Hillewaert.Multigrid solver algorithms for DG methods and applications to aerodynamic flows. IDIHOM:Industrialization of High-Order Methods-A Top-Down Approach, 128:153–178, 2015.[78] P. Wong and D. W. Zingg. Three-dimensional aerodynamic computations on unstructured gridsusing a Newton–Krylov approach. Computers and Fluids, 37(2):107–120, 2008.[79] P. Woodward and P. Colella. The numerical simulation of two-dimensional fluid flow with strongshocks. Journal of Computational Physics, 54(1):115–173, 1984.[80] S. H. Yoo, H. C. No, H. M. Kim, and E. H. Lee. CFD-assisted scaling methodology andthermal-hydraulic experiment for a single spent fuel assembly. Nuclear Engineering and Design,240(12):4008–4020, 2010.65Appendix AANSLib Command-Line OptionsThe command line interface provided by the ANSLib library consists of an executable file Solver, whichcan be executed through the Linux shell, with the input options provided in a separate text file:$ ./Solver -opt < address of options file >ANSLib achieves parallelism using the Message Passing Interface (MPI). Thus, parallel jobs can beexecuted via the MPI launcher :$ mpiexec -np < # of processors > ./Solver -opt < address of options file >In the remainder, the input options are presented for the test problems considered in this thesis. Notethat comments are distinguished by the character #, in the options file.A.1 Two-Dimensional Turbulent Flow over a NACA 0012 Airfoil fromSection 4.5For this problem, several cases of preconditioners were considered. The following options are sharedbetween all the cases:Common Options# --------------------------- MESH-d 2 # Dimensions-mesh_type c # Cell-centered mesh-f < Location of .mesh file without extension >-B < Location of .bdry file without extension >-fec < Location of .fec file without extension ># --------------------------- HISTORY AND SOLUTION FILE NAME-sol < Location of initial solution input file >-sol_out < Location of final solution output file >66-ita_history_name < Location of residual history output file ># --------------------------- ACCURACY-a < Order of accuracy of the scheme: a=k+1 ># --------------------------- PHYSICS-physics RoeTurbSA2D # 2-D RANS + negative S-A-reynolds 6e6 # Reynolds number-mach 0.15 # Mach number-angle 10 # Angle of attack# --------------------------- NON-LINEAR SOLVER OPTIONS-exnut 1 # Ignore \tilde{nu} entries in the line-search-C 0.01 # Initial CFL number-no_distance_weight # No distance weights in the k-exact# reconstruction least squares problem-pseudolts_fixed # No additional CFL ramping-max_iter 80 # Maximum number of PTC iterations-jacobian_type recanalytic # Evaluate the Jacobian exactly-ita_target_residual 1e-5 # Target norm for the non-linear residualTo implement different preconditioning strategies, distinct sets of options have to be provided to the flowsolver in each case:Extra Options for Case A# --------------------------- LINEAR SOLVER OPTIONS# --------------------------- KSP-ksp_max_it 500 # Maximum # of GMRES iterations-ksp_rtol 1e-3 # rtol for the outer KSP-ksp_type gmres # KSP solver type# --------------------------- PC-pc_type ilu # Preconditioner type: ILU-pc_factor_levels 3 # ILU fill level: 3-pc_factor_mat_ordering_type qmd # ILU reordering type: QMDExtra Options for Case B# --------------------------- LINEAR SOLVER OPTIONS# --------------------------- KSP-ksp_max_it 500 # Maximum # of GMRES iterations-ksp_rtol 1e-3 # rtol for the outer KSP-ksp_type gmres # KSP solver type# --------------------------- PC-pre_order 1 # Order of scheme for A*-pc_type ilu # Preconditioner type: ILU-pc_factor_levels 0 # ILU fill level: 0-pc_factor_mat_ordering_type rcm # ILU reordering type: RCM67Extra Options for Case C# --------------------------- LINEAR SOLVER OPTIONS# --------------------------- KSP-ksp_max_it 500 # Maximum # of GMRES iterations-ksp_rtol 1e-3 # rtol for the outer KSP-ksp_type gmres # KSP solver type# --------------------------- PC-mu 1e-5 # \mu_L for line construction-pre_order 1 # Order of scheme for A*-pc_type ilu # Preconditioner type: ILU-pc_factor_levels 0 # ILU fill level: 0-pc_factor_mat_ordering_type lines # ILU reordering type: LINESExtra Options for Case D# --------------------------- LINEAR SOLVER OPTIONS# --------------------------- KSP-ksp_max_it 500 # Maximum # of GMRES iterations-ksp_rtol 1e-3 # rtol for the outer KSP-ksp_type fgmres # KSP solver type: FGMRES# --------------------------- PC-pre_order 1 # Order of scheme for A*-pc_type ksp # Preconditioner type: inner-KSP-ksp_ksp_type gmres # inner-KSP type: GMRES-ksp_ksp_max_it 10 # inner-KSP n_i: 10-ksp_pc_type ilu # inner-KSP preconditioner: ILU-ksp_pc_factor_levels 0 # ILU fill level: 0-ksp_pc_factor_mat_ordering_type rcm # ILU reordering: RCMExtra Options for Case E# --------------------------- LINEAR SOLVER OPTIONS# --------------------------- KSP-ksp_max_it 500 # Maximum # of GMRES iterations-ksp_rtol 1e-3 # rtol for the outer KSP-ksp_type fgmres # KSP solver type: FGMRES# --------------------------- PC-mu 1e-5 # \mu_L-pre_order 1 # Order of scheme for A*-pc_type ksp # Preconditioner type: inner-KSP-ksp_ksp_type gmres # inner-KSP type: GMRES-ksp_ksp_max_it 10 # inner-KSP n_i: 10-ksp_pc_type ilu # inner-KSP preconditioner: ILU-ksp_pc_factor_levels 0 # ILU fill level: 0-ksp_pc_factor_mat_ordering_type lines # ILU reordering: LINES68A.2 Poisson’s Equation from Section 5.1# --------------------------- MESH-d 3 # Dimensions-mesh_type c # Cell-centered mesh-f < Location of .vmesh file without extension ># --------------------------- HISTORY AND SOLUTION FILE NAME-sol_out < Location of final solution output file >-ita_history_name < Location of residual history output file >#---------------------------- PHYSICS-physics Poisson # Solve Poisson’s equation-poisson_problem_data sinhsin # Name of the manufactured solution# --------------------------- ACCURACY-a < Order of accuracy of the scheme: a=k+1 ># --------------------------- NON-LINEAR SOLVER OPTIONS-C 1e10 # Initial CFL number-jacobian_type recanalytic # Evaluate the Jacobian exactly-no_distance_weight # No distance weights in the k-exact# reconstruction least squares problem-max_iter 3 # Maximum number of PTC iterations-ita_target_residual 1e-10 # Target norm for the residual vector R# --------------------------- LINEAR SOLVER OPTIONS-ksp_type gmres # KSP type: GMRES-ksp_rtol 1e-12 # GMRES rtol-pre_order 1 # Order of scheme for A*-pc_type sor # Preconditioner type: PGS-pc_sor_its 10 # Number of PGS iterations: 10A.3 Inviscid Flow Over a Sphere from Section 5.2# --------------------------- MESH-d 3 # Dimensions-mesh_type c # Cell-centered mesh-f < Location of .vmesh file without extension ># --------------------------- HISTORY AND SOLUTION FILE NAME-sol_out < Location of final solution output file >-ita_history_name < Location of residual history output file >----------------------------- PHYSICS-physics Euler3D # Solve 3-D Euler equations-mach 0.38 # Mach number-angle 0 # Angle of attack# --------------------------- ACCURACY-a < Order of accuracy of the scheme: a=k+1 >69-mcell3d_sphere_hack 1 # Curve the boundary of the sphere-mcell3d_extraq_face 1 # Use quadrature rule of order k+1 for faces-mcell3d_extraq_cell 1 # Use quadrature rule of order k+1 for cells# --------------------------- NON-LINEAR SOLVER OPTIONS-C 10 # Initial CFL number-jacobian_type recanalytic # Evaluate the Jacobian exactly-pseudolts_fixed # No additional CFL ramping-max_iter 25 # Maximum number of PTC iterations-ita_target_residual 1e-6 # Target norm for the residual vector R# --------------------------- LINEAR SOLVER OPTIONS-ksp_type fgmres # KSP type: FGMRES-ksp_rtol 1e-3 # GMRES rtol-pre_order 1 # Order of scheme for A*-pc_type sor # Preconditioner type: PGS parallelized by BJ-pc_sor_its 4 # Number of local PGS iterations-pc_sor_lits 4 # Number of BJ iterationsA.4 Turbulent Flow Over a Flat Plate from Section 5.3# --------------------------- MESH-d 3 # Dimensions-mesh_type c # Cell-centered mesh-f < Location of .vmesh file without extension ># --------------------------- HISTORY AND SOLUTION FILE NAME-sol < Location of initial solution input file >-sol_out < Location of final solution output file >-ita_history_name < Location of residual history output file >----------------------------- PHYSICS-physics TurbSA3D # Solve 3-D RANS + negative SA-mach 0.2 # Mach number-reynolds 5e6 # Reynolds number-angle 0 # Angle of attack-turbsa3d_problem_data flat_plate # Specify post-processing operations# --------------------------- ACCURACY-a < Order of accuracy of the scheme: a=k+1 ># --------------------------- NON-LINEAR SOLVER OPTIONS-C 0.1 # Initial CFL number-jacobian_type recanalytic # Evaluate the Jacobian exactly-pseudolts_fixed # No additional CFL ramping-max_iter 25 # Maximum number of PTC iterations-ita_target_residual 1e-8 # Target norm for the residual vector R-no_distance_weight # No weight for the k-exact# reconstruction least squares problem# --------------------------- LINEAR SOLVER OPTIONS-ksp_type fgmres # KSP type: FGMRES70-ksp_rtol 1e-3 # GMRES rtol-pre_order 1 # Order of scheme for A*-pc_type ksp # Preconditioner type: inner-KSP-ksp_ksp_max_it 10 # Inner-KSP n_i: 10-ksp_ksp_type gmres # Inner-KSP type: GMRES-ksp_pc_type bjacobi # Inner-KSP preconditioner: BJ# BJ number of iterations defaults to 1-ksp_sub_pc_type ilu # BJ inner solver: ILU-ksp_sub_pc_factor_levels 2 # ILU fill level: 2-ksp_sub_pc_factor_mat_ordering_type rcm # ILU reordering type: RCMA.5 Turbulent Flow Over an Extruded NACA 0012 Airfoil fromSection 5.4# --------------------------- MESH-d 3 # Dimensions-mesh_type e # Cell-centered extruded mesh-mext_nlayer 7 # Number of extruded layers-mext_btag 2 # Boundary tag for the extruded walls (symmetry)-mext_length 1 # Extrusion length-f < Location of .mesh file without extension >-B < Location of .bdry file without extension >-fec < Location of .fec file without extension ># --------------------------- HISTORY AND SOLUTION FILE NAME-sol < Location of initial solution input file >-sol_out < Location of final solution output file >-ita_history_name < Location of residual history output file >----------------------------- PHYSICS-physics TurbSA3D # Solve 3-D RANS + negative SA-mach 0.15 # Mach number-reynolds 6e6 # Reynolds number-angle 10 # Angle of attack-turbsa3d_problem_data naca0012 # Specify post-processing operations# --------------------------- ACCURACY-a < Order of accuracy of the scheme: a=k+1 ># --------------------------- NON-LINEAR SOLVER OPTIONS-C 0.1 # Initial CFL number-jacobian_type recanalytic # Evaluate the Jacobian exactly-pseudolts_fixed # No additional CFL ramping-max_iter 100 # Maximum number of PTC iterations-ita_target_residual 1e-5 # Target norm for the residual vector R-no_distance_weight # No weight for the k-exact# reconstruction least squares problem# --------------------------- LINEAR SOLVER OPTIONS-ksp_type fgmres # KSP type: FGMRES71-ksp_rtol 1e-3 # GMRES rtol-pre_order 1 # Order of scheme for A*-pc_type ksp # Preconditioner type: inner-KSP-ksp_ksp_max_it 10 # Inner-KSP n_i: 10-ksp_ksp_type gmres # Inner-KSP type: GMRES-ksp_pc_type bjacobi # Inner-KSP preconditioner: BJ# BJ number of iterations defaults to 1-ksp_sub_pc_type ilu # BJ inner solver: ILU-ksp_sub_pc_factor_levels 2 # ILU fill level: 2-ksp_sub_pc_factor_mat_ordering_type rcm # ILU reordering type: RCM72Appendix BSample Script for Running a Parallel Jobon GrexThe computational work on WestGrid systems is carried out through a non-interactive batch job system.Rather than directly executing the solver in the interactive mode, a request has to be made in the form ofa job script, which specifies the commands to be executed as well as the required memory, computationtime, number of nodes, and number of processors. The following shows a sample job script that executesthe solver using 8 processors, and redirects the output to a text file:Sample Job Script#!/bin/bash#PBS -S /bin/bash ## This is a request file for job submission#PBS -l nodes=8:ppn=1 ## Number of nodes and processors per node#PBS -l walltime=00:40:00 ## Maximum requested wall time#PBS -l mem=10gb ## Maximum requested total memory#PBS -M <your-email> ## Email the owner after the job is finished#PBS -m abe ## Send the email at any occasion## Run an external script that correctly sets the environment## variables and loads Grex’s modules.## For more information on modules see Grex’s website.. ~/scripts/setenv.sh## On many WestGrid systems a variable PBS_NP is automatically## assigned the number of cores requested of the batch system.## On systems where $PBS_NP is not available , one could use:CORES=$(/bin/awk ’END {print NR}’ $PBS_NODEFILE)echo "Running on $CORES cores"## Announce start timeecho "Starting run at: $(date)"73## Recommended MPI options from Petsc for running a parallel job.NPMAX=${CORES}MPI_BINDING=’--bysocket --bind-to-socket --report-bindings’MPIEXEC=mpiexec## Call the ANSLib executable , provide the correct option file,## and finally redirect the output to a file, so that it can## be inspected later.${MPIEXEC} ${MPI_BINDING} -np ${CORES} \~/code/ANSLib/hooshi/apps/solver/Solver \-opt options/h2_p2_facet.opt \|& tee pp_data/h2_p2_facet.out## Announce that the job is finished.echo "pbs/h2_p2_facet.pbs finished at: $(date)"After a script has been created, it should be submitted for execution:$ qsub <address of the script>The batch job systemwill place the script in the execution queue, where the execution priority is evaluatedbased on the the requested amount of resources.74


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items