Open Collections

UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

A framework for geophysical inversions with application to vadose zone parameter estimation Cockett, Archa Rowan B. 2017

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata


24-ubc_2018_february_cockett_archarowan.pdf [ 9MB ]
JSON: 24-1.0362383.json
JSON-LD: 24-1.0362383-ld.json
RDF/XML (Pretty): 24-1.0362383-rdf.xml
RDF/JSON: 24-1.0362383-rdf.json
Turtle: 24-1.0362383-turtle.txt
N-Triples: 24-1.0362383-rdf-ntriples.txt
Original Record: 24-1.0362383-source.json
Full Text

Full Text

A framework for geophysical inversionswith application to vadose zone parameter estimationbyArcha Rowan B. CockettB.Sc. Applied and Environmental Geology, University of Calgary, 2010A THESIS SUBMITTED IN PARTIAL FULFILLMENTOF THE REQUIREMENTS FOR THE DEGREE OFDoctor of PhilosophyinTHE FACULTY OF GRADUATE AND POSTDOCTORALSTUDIES(Geophysics)The University of British Columbia(Vancouver)December 2017c© Archa Rowan B. Cockett, 2017AbstractInverse modeling is a powerful tool for extracting information about the subsurfacefrom geophysical and hydrologic data. Geophysical inverse problems are inherentlymultidisciplinary, requiring elements from the relevant physics, numerical simulation,and optimization, as well as knowledge of the geologic setting, hydrologic processes,and a comprehension of the interplay between all of these elements. Increasingly geo-scientists are tackling complex problems that require integration of multiple types ofinformation in order to better characterize the subsurface. However, many of the sub-fields of geophysics are developing simulation and inversion approaches, algorithms,and supporting software in isolation. This isolation is a barrier to quantitative inte-gration and leads to inefficiencies in advancing interdisciplinary research. Greaterefficiencies, and higher quality outcomes, could be achieved if (hydro)geophysicistshad a common framework to accelerate an integrated approach. The main goal of mythesis is to organize the components of (hydro)geophysical simulations and inverseproblems, and synthesize these into a comprehensive, modular framework.The development of a geophysical framework requires considering a number ofdisciplines and geophysical problems (e.g. electromagnetics and potential fields) to en-sure generality as well as extensibility. However, the goal is also to have the frameworkwork outside of geophysics and most notably in hydrogeology; vadose zone fluid flowiiis used as a model problem. Fluid flow in the vadose zone is governed by the Richardsequation; it is parameterized by hydraulic conductivity, which is a nonlinear functionof pressure head. The computational scalability of the Richards equation inversion isa significant challenge for three dimensional inversions in hydrogeophysics. Existingwork explicitly calculates the sensitivity matrix using finite difference or automaticdifferentiation, however, for large-scale problems these methods are constrained bycomputation and memory. This dissertation provides an implicit sensitivity algorithmthat enables large-scale inversion problems for distributed parameters in the Richardsequation to become tractable on modest computational resources.iiiLay SummaryGeophysical methods gather data remotely to enable insights into subsurface struc-ture and processes (e.g. locating economic resources or monitoring environmentalchanges). The information derived from geophysical methods is of crucial impor-tance in resource exploration, environmental remediation, and the study of deep-earthprocesses. Interpretation of geophysical data requires a combination of numerical sim-ulation and inversion. Inversion is a procedure for using data to estimate an image ormodel of the earth (this is similar to medical imaging). Increasingly, geoscientists aretackling complex problems that require integration of multiple types of informationin order to better characterize the subsurface. In hydrogeology and geophysics, thisquantitative integration requires advances in both disciplines, as well as a frameworkfor this collaboration. The objective of this dissertation is to identify and refine a com-putational framework that enables and encourages sustained cross-disciplinary com-munication, which is a necessary step in integrated geophysical simulation research.ivPrefaceThe research for this dissertation was completed while studying at the University ofBritish Columbia. This research has resulted in three peer reviewed publications, threeexpanded conference abstracts, and several auxiliary works. The main focus of my the-sis is on a framework for geophysical simulations and inversions that increases quan-titative geoscience communication. In 2016, Dr. Oldenburg, Dr. Pidlisecky, LindseyHeagy and I organized an international conference around this work that was spon-sored by the Banff International Research Station; excerpts from the introduction ofmy thesis were used in the conference proposal.Chapter 2 presents a framework for simulation and parameter estimation for geo-physical applications. An earlier version of which was published in Cockett et al.(2015c), and ideas from this chapter also have been presented at several internationalconferences (cf. Cockett et al. (2014b, 2015b,a)).Chapter 3 presents a computationally scalable algorithm for solving inverse prob-lems for hydraulic parameters in vadose zone flow using the Richards equation. Thiswork has been submitted for peer review and the preprint is available on arXiv (Cockettet al., 2017); preliminary versions of this research were presented at two conferences(Cockett and Haber, 2013b,a).Chapter 4 involves several numerical examples, which were inspired by work fromvmy undergraduate thesis, of which two papers were published during the course of mygraduate research (Pidlisecky et al., 2013; Cockett and Pidlisecky, 2014). The forwardsimulation framework for multi-parameter simulations and inversions in time-domainphysical problems used in this chapter was derived from collaborative work betweenelectromagnetics and vadose zone flow (Heagy et al., 2016). One of the numericalexamples in Chapter 4 has previously been published in (Cockett et al., 2017).Two of the appendices contain supporting materials on finite volume and severalnumerical examples and case studies. Appendix B on finite volume contains work andfigures that have been published in a computational tutorial on finite volume (Cockettet al., 2016a). Additionally, much of this work is supported by course material andinstruction from Dr. Eldad Haber, Dr. Uri Ascher, and Dr. Chen Grief (Haber, 2015;Ascher and Greif, 2011). Appendix C presents an adaptation of the forward simulationframework published in Heagy et al. (2016) for the Richards equation. This appendixalso summarizes conclusions and insights from three extended conference abstracts onelectromagnetics and a publication on parametric geologic modelling (Heagy et al.,2014; Kang et al., 2015a; Heagy et al., 2015c; Cockett et al., 2016b).Throughout the course of my graduate research, I have started and contributedto several open source software projects to support, test, and validate the geophys-ical simulation and inversion framework that is the main focus of my thesis. Mymain focus with this software was on inheritance, composition, terminology, and theinterfaces between simulation and inversion components – the elements that definethe framework. This is demonstrated by my personal contribution of 267,614 linesof code over the last five years, which have been reduced over 4.5 fold to 59,111lines of code while increasing possibilities and geophysical applications. For an up-to-date, detailed analysis on code contribution and attribution over time, please see:vi This is perhaps the most salient dis-tinguishment between the focus on framework development as opposed to a scriptor executable that is aimed at a specific type of geophysical inversion. The majorsoftware packages that have been created are: (a) SimPEG, a framework for simula-tion and parameter estimation in geophysics (; (b)discretize, a finite volume package for simulation in the context of inverse prob-lems (; and (c) pymatsolver, a common inter-face to several matrix solvers and packages ( projects have seen significant investment from my colleagues in testing, ap-plying, and expanding the capabilities of the framework to other geophysical appli-cations. This open, collaborative work has involved colleagues across industry, gov-ernment, and six universities. Currently SIMPEG includes methods for: vadose zoneflow (Cockett et al., 2017); direct current resistivity and induced polarization (Kangand Oldenburg, 2016); time-domain and frequency-domain electromagnetics (Heagyet al., 2016, 2017); magnetotellurics (Rosenkjaer et al., 2016); magnetics and grav-ity (Miller et al., 2017); and several examples of other inverse problems (Kang et al.,2017b). All software has been released under the permissive MIT license, to encouragereuse, adaptation, and sustained contribution to these ideas.viiTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiLay Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiiList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xivList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvList of Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiiAcknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiiiDedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxv1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Research context . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1.1 Geophysical inverse problems . . . . . . . . . . . . . . . . . 21.1.2 Hydrogeophysics in the vadose zone . . . . . . . . . . . . . . 4viii1.1.3 Research motivation . . . . . . . . . . . . . . . . . . . . . . 71.2 A framework for geophysical inversions . . . . . . . . . . . . . . . . 91.2.1 Take home points . . . . . . . . . . . . . . . . . . . . . . . . 121.3 Application to vadose zone parameter estimation . . . . . . . . . . . 131.3.1 Take home points . . . . . . . . . . . . . . . . . . . . . . . . 161.4 Thesis outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 Simulation and inversion framework . . . . . . . . . . . . . . . . . . . . 212.1 Introduction and motivation . . . . . . . . . . . . . . . . . . . . . . . 212.1.1 Attribution and dissemination . . . . . . . . . . . . . . . . . 232.2 Inversion methodology . . . . . . . . . . . . . . . . . . . . . . . . . 252.2.1 Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262.2.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . 292.2.3 Evaluation/interpretation . . . . . . . . . . . . . . . . . . . . 402.3 Modular implementation . . . . . . . . . . . . . . . . . . . . . . . . 412.3.1 Implementation choices . . . . . . . . . . . . . . . . . . . . . 422.3.2 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422.3.3 Motivating example . . . . . . . . . . . . . . . . . . . . . . . 442.3.4 Mesh . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452.3.5 Forward simulation . . . . . . . . . . . . . . . . . . . . . . . 492.3.6 DC resistivity forward simulation . . . . . . . . . . . . . . . 512.3.7 Sensitivities . . . . . . . . . . . . . . . . . . . . . . . . . . . 522.3.8 Inversion elements . . . . . . . . . . . . . . . . . . . . . . . 542.3.9 Inverse problem and optimization . . . . . . . . . . . . . . . 552.3.10 Inversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56ix2.3.11 DC resistivity inversion . . . . . . . . . . . . . . . . . . . . . 562.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593 Richards equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 603.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 603.1.1 Attribution and dissemination . . . . . . . . . . . . . . . . . 643.2 Forward problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 653.2.1 Richards equation . . . . . . . . . . . . . . . . . . . . . . . . 653.2.2 Discretization . . . . . . . . . . . . . . . . . . . . . . . . . . 673.2.3 Solving the nonlinear equations . . . . . . . . . . . . . . . . 713.3 Inverse Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 743.3.1 Implicit sensitivity calculation . . . . . . . . . . . . . . . . . 763.4 Numerical results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 813.4.1 Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 813.4.2 Comparison to literature . . . . . . . . . . . . . . . . . . . . 843.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 864 Vadose zone inversions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 884.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 884.1.1 Attribution and dissemination . . . . . . . . . . . . . . . . . 914.2 Empirical relationships . . . . . . . . . . . . . . . . . . . . . . . . . 924.2.1 Objective functions . . . . . . . . . . . . . . . . . . . . . . . 974.3 Layered soil profile . . . . . . . . . . . . . . . . . . . . . . . . . . . 1004.4 Unconstrained joint inversion . . . . . . . . . . . . . . . . . . . . . . 1014.4.1 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1044.4.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109x4.5 Three dimensional inversion . . . . . . . . . . . . . . . . . . . . . . 1114.5.1 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1124.5.2 Scalability of the implicit sensitivity . . . . . . . . . . . . . . 1174.5.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1194.6 Integrations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1204.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1215 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1235.1 Contributions and dissemination . . . . . . . . . . . . . . . . . . . . 1255.1.1 Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1265.2 Outlook and continuing work . . . . . . . . . . . . . . . . . . . . . . 127Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130A Frameworks and ontologies . . . . . . . . . . . . . . . . . . . . . . . . . 140B Finite volume techniques . . . . . . . . . . . . . . . . . . . . . . . . . . 143B.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143B.1.1 Attribution and dissemination . . . . . . . . . . . . . . . . . 144B.2 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145B.2.1 Mesh types . . . . . . . . . . . . . . . . . . . . . . . . . . . 145B.2.2 Cell anatomy . . . . . . . . . . . . . . . . . . . . . . . . . . 148B.2.3 Numbering . . . . . . . . . . . . . . . . . . . . . . . . . . . 150B.2.4 DC resistivity equations . . . . . . . . . . . . . . . . . . . . . 152B.2.5 Weak formulation . . . . . . . . . . . . . . . . . . . . . . . . 154B.3 Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156B.3.1 Divergence . . . . . . . . . . . . . . . . . . . . . . . . . . . 157xiB.3.2 Curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159B.3.3 Gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161B.4 Inner products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163B.4.1 Face inner product . . . . . . . . . . . . . . . . . . . . . . . 163B.4.2 Edge inner product . . . . . . . . . . . . . . . . . . . . . . . 166B.4.3 Tensor product mesh . . . . . . . . . . . . . . . . . . . . . . 167B.4.4 Anisotropy . . . . . . . . . . . . . . . . . . . . . . . . . . . 168B.4.5 Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . 168B.5 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171B.5.1 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . 172B.5.2 User interface . . . . . . . . . . . . . . . . . . . . . . . . . . 173B.6 Numerical examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 176B.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177B.7.1 Continuing work . . . . . . . . . . . . . . . . . . . . . . . . 178C Interfaces and extensions . . . . . . . . . . . . . . . . . . . . . . . . . . 182C.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182C.1.1 What is your model? . . . . . . . . . . . . . . . . . . . . . . 183C.2 Forward simulation framework . . . . . . . . . . . . . . . . . . . . . 184C.2.1 Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185C.2.2 Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . 187C.2.3 Properties and Mappings . . . . . . . . . . . . . . . . . . . . 190C.3 Parameterizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194C.3.1 Expected distributions . . . . . . . . . . . . . . . . . . . . . 195C.3.2 Survey design . . . . . . . . . . . . . . . . . . . . . . . . . . 196xiiC.3.3 Geologic modeling . . . . . . . . . . . . . . . . . . . . . . . 198C.3.4 Dimensionality and controlled variables . . . . . . . . . . . . 198C.3.5 Nesting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200C.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201xiiiList of TablesTable 2.1 Selected Mesh class properties with explanations. . . . . . . . . . 47Table 2.2 Base Problem class properties with explanations. . . . . . . . . . 50Table 2.3 Selected Survey class properties with explanations. . . . . . . . . 51Table 2.4 Common functions for the Regularization, and DataMisfitclasses. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55Table 3.1 Fictitious source test for Richards equation in 1D using the mixed-form Newton iteration. . . . . . . . . . . . . . . . . . . . . . . . . 83Table 4.1 Canonical soil parameters for the water retention and hydraulic con-ductivity curves (Van Genuchten et al., 1991) . . . . . . . . . . . . 94Table 4.2 Comparison of the memory necessary for storing the dense explicitsensitivity matrix compared to the peak memory used for the im-plicit sensitivity calculation excluding the matrix solve. The cal-culations are completed on a variety of mesh sizes for a single dis-tributed parameter (Ks) as well as for five distributed van Genuchtenmodel parameters (Ks,α,n,θr, and θs). Values are reported in giga-bytes (GB). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118xivList of FiguresFigure 1.1 Outline of the thesis chapters: (2) the simulation and inversionframework; (3) the sensitivity calculation in Richards equation; (4)applications and exploration into vadose zone inversions; (A1) fi-nite volume techniques on a variety of meshes; and (A2) interfacesto geologic knowledge through parameterizations and a forwardsimulation framework and extensions to the work presented. . . . 19Figure 2.1 Inversion methodology. Including inputs, implementation, evalua-tion and interpretation. . . . . . . . . . . . . . . . . . . . . . . . 27Figure 2.2 SIMPEG framework indicating the flow of information. In theimplementation, each of these modules is a base class. . . . . . . 44Figure 2.3 Solving the DC resistivity problem for a dipole and using the meshesvisualization routine for the potential, φ , for three different meshtypes: (a) TensorMesh, (b) TreeMesh, and (c) CurvilinearMesh.The potential has been interpolated onto the tensor mesh for visu-alization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49Figure 2.4 Illustration of mapping in DC inversion. (a) 1D log conductivitymodel. (b) 3D conductivity model. . . . . . . . . . . . . . . . . . 58xvFigure 2.5 (a) Observed (black line) and predicted (red line) apparent resis-tivity values. (b) True and recovered 1D conductivity model. . . . 58Figure 3.1 The water retention curve and the hydraulic conductivity functionfor four canonical soil types of sand, loam, sandy clay, and clay. . 68Figure 3.2 Discretization of unknowns in 1D, 2D and 3D space. Red circlesare the locations of the discrete hydraulic conductivity K and thepressure head ψ . The arrows are the locations of the discretizedflux ~f on each cell face. . . . . . . . . . . . . . . . . . . . . . . . 69Figure 3.3 Fictitious source test in 1D showing the analytic function Ψtrue attimes 0.0 and 0.5 and the numerical solution Ψ(x,0.5) using themixed-form Newton iteration. . . . . . . . . . . . . . . . . . . . 83Figure 3.4 Comparison of results to Celia et al. (1990) showing the differ-ences in the (a) head-based and (b) mixed formulations for t=360s. 86Figure 4.1 The water retention curve and the hydraulic conductivity functionfor four canonical soil types of sand, loam, sandy clay, and clay. . 94Figure 4.2 The hydraulic conductivity function showing bounds of the var-ious parameters Ks ∈ [1× 10−7,1× 10−4], α ∈ [2.5,13.5], andn ∈ [1.1,1.6] for four canonical soil types of (a) sand, (b) loam,(c) sandy clay, and (d) clay. . . . . . . . . . . . . . . . . . . . . 96Figure 4.3 The water retention function showing bounds of the various pa-rameters θs ∈ [0.3,0.5], θr ∈ [0.01,0.1], α ∈ [2.5,13.5], and n ∈[1.1,1.6] for four canonical soil types of (a) sand, (b) loam, (c)sandy clay, and (d) clay. . . . . . . . . . . . . . . . . . . . . . . 97xviFigure 4.4 Objective function cross sections plotted for all ten cross sectionsthrough the five dimensional space of Ks,θr,θs,α and n. Eachcross section was simulated with 40× 40 simulations and com-pared using an l2 data objective function. The results are shown inlog10-scale. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99Figure 4.5 Fields from the numerical simulation of a layered one dimensionalsoil profile, showing (a) pressure head and (b) water content overthe full time period. The soil types are shown as annotations oneach figure, the spatial location of water content measurements areshown adjacent to the water content fields. . . . . . . . . . . . . 102Figure 4.6 Observed and predicted water content data from the one dimen-sional infiltration experiment showing water content over time. . 104Figure 4.7 Showing the water retention and hydraulic conductivity curves forthe true and predicted models for the three soil layers. The his-togram in each plow shows the distribution of true pressure headvalues in each layer. . . . . . . . . . . . . . . . . . . . . . . . . 106Figure 4.8 The true and recovered soil parameters as a function of depth,showing (a) hydraulic conductivity; (b) residual and saturated wa-ter content; (c) the empirical parameter n; and (d) the empiricalparameter α . Each plot also shows the full inversion history of thepredicted model as a transparent black line. . . . . . . . . . . . . 108xviiFigure 4.9 Recovered simulation fields from the unconstrained joint inversionfor van Genuchten parameters. Showing (a) the pressure head and(b) the water content over the full time period of the one dimen-sional recovered soil profile. The soil types are shown as annota-tions on each figure, the spatial location of water content measure-ments are shown adjacent to the water content fields. . . . . . . . 110Figure 4.10 Soil structure in three dimensions showing the boundary betweentwo soil types of sand and loamy sand. . . . . . . . . . . . . . . 113Figure 4.11 Vertical cross sections through the pressure head and saturationfields from the numerical simulation at two times: (a) pressurehead field at t = 5.2 hours and (b) t = 10.3 hours; and (c) satura-tion field at t = 5.2 hours and (d) t = 10.3 hours. The saturationfield plots also show measurement locations and green highlightedregions that are shown in Figure 4.12. The true location of the twosoils used are shown with a dashed outline. . . . . . . . . . . . . 114Figure 4.12 Observed and predicted data for five measurements locations atdepths from 10cm to 150cm from the center of the model domain. 115Figure 4.13 The 3D distributed saturated hydraulic conductivity model recov-ered from the inversion compared to the (a) synthetic model mapview section, using (b) the same map view section, (c) an X-Zcross section and (d) a Y-Z cross-section. The synthetic model isshow as an outline on all sections, and tie lines are show on allsections as solid and dashed lines, all location measurements arein centimeters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117xviiiFigure B.1 Three mesh types in two dimensions on the domain of a unit square:(a) a tensor product mesh, (b) a quadtree mesh, and (c) a curviliearmesh. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148Figure B.2 Names of a finite volume cell on a tensor mesh in (a) one dimen-sion, (b) two dimensions, and (c) three dimensions. . . . . . . . . 149Figure B.3 Names of a finite volume cell on a curvilinear mesh in (a) twodimensions, and (b) three dimensions. Note that the cell faces andedges are no longer orthogonal. . . . . . . . . . . . . . . . . . . . 150Figure B.4 Names of a finite volume cell on a tree mesh in (a) two dimensions,and (b) three dimensions. Note the location of hanging x-facesfrom the refined cells; hanging edges are not shown. . . . . . . . . 153Figure B.5 Derivation of the direct current resistivity equations. . . . . . . . . 154Figure B.6 Volume calculation using five tetrahedra. . . . . . . . . . . . . . . 156Figure B.7 Visual connection between the continuous and discrete representa-tions of the divergence. . . . . . . . . . . . . . . . . . . . . . . . 159Figure B.8 Simple quadtree mesh showing (a) the mesh structure, cell num-bering, and face numbering in both the x and y directions; and (b)the structure of the face divergence matrix that has eliminated thehanging faces. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160Figure B.9 Edge path integration or definition of the curl operator. . . . . . . 161Figure B.10 Matrix structure of a face inner product of a cell centered physicalproperty on a tensor mesh. . . . . . . . . . . . . . . . . . . . . . 169Figure B.11 The computational ontology developed for the discretize pack-age showing inheritance and commonalities between the four meshtypes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179xixFigure B.12 Regular mesh and mesh aligned to layer for a simple conductivitymodel at 14×14×14. . . . . . . . . . . . . . . . . . . . . . . . 180Figure B.13 Comparison of norm data error for the regular mesh and the meshaligned to the interface. . . . . . . . . . . . . . . . . . . . . . . . 181Figure C.1 The forward simulation framework that is used for Richards equa-tion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186Figure C.2 The components required in calculating the derivatives of the for-ward simulation, showing (a) the modular nature of each deriva-tive; (b) the process of multiplying each derivative in the forwardsense with Jv; and (c) in the adjoint sense with J>v. . . . . . . . 189Figure C.3 Mapping an inversion model, a 1D layered, log conductivity modeldefined below the surface, to electrical conductivity defined in thefull simulation domain. . . . . . . . . . . . . . . . . . . . . . . 192Figure C.4 (a) Traditional approach to inversion, where the model space, elec-trical conductivity, is mapped to data space, the electromagneticresponse, through a forward model. The inversion then providesa method by which we estimate a model that is consistent withthe observed data. The recovered conductivity model is then usedto infer information about the reservoir properties of interest, inthis case, the distribution of proppant. (b) Parametrized inversion,where we parametrize the model space, electrical conductivity, interms of the property of interest, the distribution of proppant. Bydefining such a parametrization, the inversion can provide a meansof estimating the properties of interest directly from the data. . . 196xxFigure C.5 Setup of a parametric models for a steel cased well and a reservoirtarget. The calculation of sensitivity for using a primary secondaryapproach is shown using the forward simulation framework. . . . 197Figure C.6 Parameterized geologic models using a series of analytic functions.Models were created using Visible Geology ( . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199Figure C.7 Conceptual diagram of moving between 1D, 2D, and 3D models. . 200Figure C.8 Diagram showing the entire setup and organization of (a) the fre-quency domain simulation; (b) the time domain simulation; and(c) the common inversion framework used for each example. Themuted text shows the programmatic inputs to each class instance. 201xxiList of Programs2.1 Creation of a 2D tensor product mesh using the discretize pack-age discussed in Appendix B. . . . . . . . . . . . . . . . . . . . . . . 462.2 Creation of the matrix A(σ) for the direct current resistivity problem.See Appendix B for details on finite volume. . . . . . . . . . . . . . . 472.3 Solving and plotting the fields (φ ) for direct current resistivity usingpymatsolver and visualization utilities in SIMPEG. . . . . . . . . 482.4 Pairing the Problem and Survey objects to create predicted data,dpred. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 522.5 Sensitivity times a vector method for the DCProblem. . . . . . . . . 532.6 Creation and chaining together of multiple mapping properties for amodel of σ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572.7 Instantiation of the direct current resistivity problem with a mappingfor the σ property. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572.8 Creating a boiler plate inversion at a low level. . . . . . . . . . . . . . 58C.1 Definition of a hydraulic conductivity model with multiple invertibleproperties that is declaratively attached to the Richards problem class. 191C.2 Demonstration of the ability to choose arbitrary parameters to includein a model, and use the chain rule to compose parameterizations. . . . 193xxiiAcknowledgmentsThe interdisciplinary work that I have started at the University of British Columbiawould not of been possible without the enthusiastic support from a tremendous anddiverse group of individuals. First and foremost, Dr. Eldad Haber, who is a trailblazerin the computational geosciences and whose guidance and expertise has not only beeninvaluable, but is foundational to almost every paragraph of this thesis. Thank youfor all your investment in me. Next, Dr. Doug Oldenburg, whose infectious enthusi-asm and curiosity is able to both motivate and mobilize a team of passionate studentsaround him that are able to tackle grand goals that we wholeheartedly believe willchange our field of study. Next, Dr. Adam Pidlisecky, who introduced me to scien-tific programming and hydrogeophysics and who is able to weave scientific narrativesthat inspire and broker innovation. Next, my parents for, among so many other things,perspective in this process. Next, my committee and examiners who improved thequality and focus of this dissertation. Thank you all for your unwavering support andencouragement.The funding for this interdisciplinary work was provided from several sources. Aspecial thanks to the University of British Columbia, the Natural Sciences and Engi-neering Research Council of Canada, the Vanier Graduate Canada Scholarship pro-gram, the Gilles Brassard Doctoral Prize for Interdisciplinary Research, the KillamxxiiiScholarship program, the University of British Columbia Library, and the Banff Inter-national Research Station for their generous support.A special thanks to my colleagues, friends and family who are always willing totalk through ideas and have supported me throughout this process. To name a few ofthese wonderful people: Dom Fournier, Gudni Rosenkjaer, Michael Mitchell, ThibautAstic, Brendan Smithyman, Dave Marchant, Julie Nutini, Michael Wathen, FranklinKoch, Teddi Herring, Michael Firmin, Lars Ruthotto, Jenn Fohring, Luz AngelicaCaudillo-Mata, Klara Steklova, Pieter Aukes, Kristyn Adams, Leonardo Uieda, MattHall, and many others. Also, a special thanks to the current and future users and con-tributors of SIMPEG, and to those who are pioneering connected and open geosciencecommunities.Finally, this dissertation would not have been possible without the continued emo-tional and educational support and encouragement from Lindsey Heagy and SeogiKang. You have been partners in thought and have been integral in nurturing many ofthe ideas that fill the following pages.xxivTo giants and their shoulders.xxvIf we then ask ourselves where that intelligence is embodied, we are forced toconcede that it is elusively distributed throughout a hierarchy of functionalprocesses – a hierarchy whose foundation extends down into natural processesbelow the depth of our comprehension. If there is any one thing upon which this‘intelligence depends’ it would seem to be organization.— Douglas C. Engelbart (1962)Augmenting human intellect: a conceptual frameworkxxviChapter 1Introduction1.1 Research contextOne of the goals of the applied geosciences is to gain an understanding of subsurfacestructures and processes. These understandings are often used to make predictionsand decisions associated with commercial and environmental challenges, includingcontaminant delineation, resource exploration, reservoir optimization, and watershedcharacterization. The accuracy of these predictions can have far-reaching economicand environmental implications. There are many disciplines and skills that are involvedin providing predictions, and increasingly these disciplines must collaborate and inte-grate their domain-specific knowledge. In a managed aquifer recharge project, forexample, the goal is to infiltrate water into the subsurface for storage and subsequentrecovery. Throughout the lifetime of the project, monitoring and management of theinfiltration site is necessary (e.g. Racz et al. (2011); Daily et al. (1992); Park (1998)).Such projects require input from geology, hydrology, and geophysics in order to mapthe hydrostratigraphy, collect and interpret time-lapse geophysical measurements, and1integrate all results to make predictions and decisions about fluid movement at the site.The quantitative integration of the geosciences is far from trivial as each discipline hasdiffering descriptive terminology, as well as software tools that are domain specificwith limited interoperability.In the following two subsections, I will independently introduce two disciplineswithin applied geoscience: (a) geophysical inversions, and (b) hydrogeophysics in thevadose zone. The current state of these disciplines gives context to the work that fol-lows and motivates research into a computational framework that improves the quanti-tative communication between methodologies and researchers. The subsequent sectionexpands on these ideas and identifies a significant computational challenge of hydro-geophysics inversions in the vadose zone.1.1.1 Geophysical inverse problemsGeophysical methods involve making measurements at or above the earth’s surface,or in boreholes. The data acquired with these methods are then used to create mod-els of the subsurface; this is similar to non-invasive medical imaging, but the spatialand temporal scales are typically much larger. The models, which can be 1D, 2D, or3D distributions of various physical properties, are used for monitoring and extractinginformation about fluid flow and subsurface structures. The physical properties arelinked to the data through various partial differential equations. The task of generat-ing a quantitative understanding of the data requires the ability to carry out forwardsimulations of these equations and, in many situations, inverting the data to estimatea static or time-lapse model of the subsurface. Forward simulations use the physicsof the underlying measurement approach to simulate the response of a given distribu-tion of physical properties. Inversion is a mathematical, algorithmic, and occasionally2heuristic process that constructs a model consistent with the field measurements and apriori geologic, geophysical, and hydrologic information.Many of these geophysical methods (e.g. electromagnetics, magnetotellurics, grav-ity, direct current resistivity) have mature solutions for both simulation and inversionin three dimensions and through time (Oldenburg, 2016). There is, of course, con-tinued research into improving computational efficiency for large-scale geophysicalsurveys (cf. Haber and Schwarzbach (2014); Yang et al. (2014); Haber and Heldmann(2007)). In parallel to this effort, there is ongoing work to integrate these geophysicalmethodologies to create more informed interpretations of the subsurface from multi-ple data types and surveys (e.g. Devriese et al. (2017); Kang et al. (2017a); Fournieret al. (2017)). This research trend is true in exploration geophysics as well as in cross-disciplinary fields such as hydrogeophysics where hydrologic simulations and geo-physical simulations can be combined to better inform predictions about groundwaterflow.The development of new methodologies to address the evolving challenges inquantitative geoscience integration will build upon and extend standard practices. Theseextensions and integrations will require experimentation with, and recombination of,existing techniques. This presupposes that researchers have access to consistent, well-tested tools that can be extended, adapted, and combined. One of the main goals ofmy thesis is to organize the components of geophysical simulations and inverse prob-lems into a comprehensive, modular framework in order to support this combinatorialexperimentation and exploration.31.1.2 Hydrogeophysics in the vadose zoneThe majority of groundwater recharge is derived through water that percolates throughthe vadose zone, the region between the earth’s surface and the fully saturated zone.As such, studying the processes that occur in the vadose zone is of critical importancefor understanding our groundwater resources. Much attention has been given to mon-itoring, describing and predicting processes that occur in this region of the earth. Tra-ditionally, monitoring has been conducted by taking point-measurements of saturationor pressure, or laboratory measurements of soil hydraulic properties. More recently,geophysical methods are being used in conjunction with hydrologic data to create moreinformed models of and predictions about the subsurface (Linde and Doetsch, 2016).The advantages of employing geophysics to hydrogeology problems are numerous;geophysical methods allow data to be gathered remotely, and the data can then be usedto create an image of a distributed physical property of interest (e.g. electrical conduc-tivity) in the subsurface. However, the geophysical problem is inherently non-uniqueand when viewed independently, images produced by a geophysical inversion oftenlack the detail necessary to make informed hydrogeologic predictions. Some level ofprediction can be offered by hydrogeologic simulations within the structural geologiccontext; however, these simulations are difficult to verify due to lack of constraininghydrologic data. Taken separately, each methodology involved in this monitoring chal-lenge yields distinct interpretations and predictions that are often dissonant or actuallyconflicting.Fluid flow in the vadose zone is described by the Richards equation and is pa-rameterized by hydraulic conductivity, which is a nonlinear function of pressure head.Hydraulic conductivity defines how fluids move in the subsurface, and is an impor-4tant physical property to estimate for accurate predictions (Pollock and Cirpka, 2012;Sˇimunek et al., 2012). It is not possible to directly image hydraulic conductivity withgeophysical data, however, geophysical electromagnetic methods are sensitive to bulkelectrical conductivity, which changes significantly depending on the saturating fluid(e.g. gas or water) (Archie, 1942; Liang et al., 2012; Mendelson and Cohen, 1982).Changes in saturation over time, as fluids move, can be related to changes in electricalproperties, and can be observed by electromagnetic geophysical methods. Knowingwhere and how the fluids move can subsequently be related to hydraulic conductiv-ity (or other hydraulic properties). This technique has been used to estimate hydraulicproperties directly from geophysical data. For example, in Binley et al. (2002), a cross-well tomography experiment was conducted using radar and direct current resistivitymethods. The movement of a vadose zone tracer was tracked and a single parameterwas estimated using the Richards equation, through trial and error, for homogeneoushydraulic conductivity. Both, the quality, and the spatial and temporal density of geo-physical data available for monitoring vadose zone processes will continue to prolifer-ate (e.g. Pidlisecky et al. (2013)). The increased data density and quality opens up thepossibility to estimate many more distributed hydraulic parameters.Time-lapse estimation problem presents a significant conceptual and computa-tional challenge (Pollock and Cirpka, 2012; Haber and Gazit, 2013; Towara et al.,2015; Linde and Doetsch, 2016). It requires large-scale, time-lapse hydrogeologicsimulations that must be efficiently solved and then integrated with geophysical meth-ods. This multiphysics integration of geophysical and hydrologic simulation can becompleted in a variety of ways. For example, this integration can be through directcoupling of the simulations or through qualitative observations and uncoupled work-flows. Hinnell et al. (2010) presents uncoupled integrations as: (a) using the geophys-5ical data to estimate a physical property, such as electrical conductivity; (b) using anempirical relation, such as Archie’s equation (eq. 4.1), to transform the geophysicalestimate into a hydrological parameter, such as water content; and (c) using hydrolog-ical estimates to help inform or test a hydrogeologic simulation. A coupled inversionformulates the entire process as a single forward model and uses stochastic or de-terministic parameter estimation to directly update the hydrogeologic and empiricalparameters (Finsterle and Kowalsky, 2008; Ferre´ et al., 2009). Increasingly, there areinstances of these sorts of collaborations and studies in near surface hydrogeophysics(cf. Linde and Doetsch (2016) and references within). The integration of geophys-ical and hydrologic data increases the scale of simulations and inversions that mustbe considered – hundreds of thousands to millions of hydrologic parameters must beestimated. Currently, this is not computationally feasible for large 3D inversions ofvadose zone parameters using the Richards equation. For example, the relatively fewparameters that can be estimated by stochastic inversions may not be sufficient for3D inversions (Linde and Doetsch, 2016). Alternatively, deterministic inversions canbe used, but will need to draw on improvements across the field of geophysical in-verse problems. For example, regularization techniques developed in other areas ofexploration geophysics (e.g. Paasche and Tronicke (2007); Sun et al. (2012)), have po-tential to be helpful in introducing known parameter distributions into a vadose zoneinversion. In Hinnell et al. (2010), the authors conclude that, “the coupled approach[for hydrogeophysics] requires that the hydrologic and geophysical models be merged,[which] forces the hydrologist and the geophysicist to formulate a consistent frame-work.” This consistent framework was identified as “the primary limit to the routineimplementation of coupled inversion[s]. The formulation of common solution grids,time steps, and simulation accuracies requires an uncommon level of collaboration6during scientific analysis.”1.1.3 Research motivationThe quantitative integration of hydrology, geophysics, and geology remains an openproblem (Liu and Gupta, 2007; Ferre´ et al., 2009; Pollock and Cirpka, 2012; Knightet al., 2013; Linde and Doetsch, 2016). This task is being worked on from manydifferent perspectives in various research communities, and much progress has beenmade in case studies, new algorithms, and novel integrations. The complexity of thisintegration “intertwines various disciplines/subjects including geophysics, hydrology,petrophysics, geostatistics, [and] inverse theory” (Knight et al., 2013). Although eachsubdiscipline (e.g. flow modelling, electromagnetic simulation) invokes many of thesame concepts and numerical pieces for solving simulation and inversion problems, theapproaches developed and applied are not easily shared between subdisciplines. Thisis due to differing terminology, organization of methodologies, differing data densi-ties and sensitivities, model conceptualizations, as well as software implementations.For example, in geophysics a model is often taken to be a volumetric distribution ofphysical properties (e.g. Oldenburg and Li (2005)); in geology a model is often morequalitative, represented by a sketch, description, 3D surfaces, or a cross section thatopaquely embeds knowledge about geologic processes (e.g. Harder et al. (2009); Por-wal and Kreuzer (2010)); in hydrology a model often refers to the representation of aphysical simulation or empirical equation, frequently containing simplifying assump-tions of homogeneity or dimensionality (cf. Devi et al. (2015)). As another example,in hydrogeology data is often collected as high precision point measurements with lowspatial and/or temporal coverage; in electromagnetic geophysics, however, physicalproperty recoveries are often less precise and are averaged over a larger spatial scale.7The inclusion of relevant information from one subdiscipline into another is diffi-cult due to these differences in terminology, knowledge representation (e.g. quantita-tive or qualitative), knowledge mapping (e.g. through empirical or structural relations),model conceptualization (e.g. volumetric or surfaces or parametric), data sensitivi-ties (e.g. point or bulk measurements), and simplifying or implicit assumptions ( dimensional or homogeneous). These disconnects are exceptionally apparent insoftware implementations, even though software is precisely where quantitative inte-gration must occur! Software is often developed ad hoc for specific outcomes, andthe algorithmic components, which are conceptually generic and could be shared withothers, are deeply embedded and not easily transferred to other applications. Withina given subdiscipline this can create challenges, as the system under considerationcan potentially embed hard-coded, tacit assumptions. Furthermore, this lack of trans-portability and interoperability severely hinders the advancement of novel geophysicalapplications since geoscientists in different subgroups often find themselves having todevelop a complete software solution from scratch prior to investigating their scientificquestions of interest. Overcoming these bottlenecks, and establishing a simulation andinversion framework that works across many subdisciplines of (hydro)geophysics, isthe overarching goal of this thesis.Based on the current state of the geoscience inversion and hydrogeophysics com-munity and the observations outlined above, I have arranged this thesis to address tworesearch topics:1. the development of an extensible framework for geophysical inversions, and2. formulation of the three dimensional Richards equation inversion for computa-tional scalability.8The overarching goal is to promote both quantitative integration and collaboration be-tween geoscience disciplines and communities. Interdisciplinary integration requiresdissemination, reproducibility, accessibility, and collaboration; as such these are cru-cial to my work and demonstrated throughout the following thesis.1.2 A framework for geophysical inversionsGeophysical inversions are the mathematical process of creating subsurface models tofit measured data and geophysical simulations. The language, workflows, and resultingsoftware implementations of geoscience inversions vary across disciplines. These in-consistencies are among the large barriers to sustained cross-disciplinary integrations.One research approach to addressing these interdisciplinary barriers is the develop-ment of a framework that organizes, synthesizes and abstracts diverse methodologies.A framework should (a) serve as a means of organizing an approach to simulation andinverse problems, (b) facilitate quantitative communication between researchers andgeophysics methodologies, and (c) act as a blueprint for both ideation and softwareimplementations. The disciplines and methodologies that have been used to informthe research of this framework include: vadose zone flow using the Richards equation,direct current resistivity, time and frequency domain electromagnetics, magnetotel-lurics, potential fields including gravity and magnetics, and using geologic parame-terizations to inform model conceptualization. Oldenburg (2016) noted that many ofthe geophysical inversion techniques can now be completed in three dimensions us-ing computationally efficient inversion algorithms. This is significant as geologicalstructures and processes such as electromagnetics and fluid flow often require treat-ment in three dimensions. In both geophysics and hydrogeology the data is a field or aflux, sampled at various locations, times, or frequencies. Additionally, point samples9of physical properties or (hydro)stratigraphy can be inferred or tested from boreholecores and geologically interpolated between wells and surface observations. These canbe included into the inverse problem formulation either implicitly through weightingsand reference models (cf. Williams (2008)) or more explicitly by forcings of geologicpriors (cf. Linde et al. (2015)). The geologic observations can also be modelled, forexample, using radial basis functions (RBFs), to implicitly reproduce the geologic con-tacts and drillhole data; this results in geologically interpreted interpolations dividingthe subsurface into lithological units (Hillier et al., 2014).Each geophysical technique is sensitive to different physical properties and/or dif-ferent spatial scales. The differing sensitivities of these techniques motivates combi-nation of methodologies to better understand and image the subsurface and time-lapseprocesses. This is an active field of study, for example, (a) investigating cooperativeelectromagnetics inversions in realistic settings by externally combining existing toolsthrough custom workflows (McMillan and Oldenburg, 2014), (b) joint inversion andmodel fusion algorithms for direct current resistivity and borehole tomography (Haberand Gazit, 2013), (c) integrating multiple types of airborne geophysical data into aconsistent geologic model for mineral exploration (Kang et al., 2017a; Fournier et al.,2017; Devriese et al., 2017), and (d) combining one dimensional vadose zone flow anddirect current resistivity to invert for hydrological parameters (Hinnell et al., 2010).Many of these studies rely on externally integrating existing software tools throughpurpose-built scripts and workflows; limiting the transferability to other disciplines.However, recent work has seen an increased focus by the geophysical community ona framework approach that targets multiple geophysical and hydrogeologic method-ologies (e.g. JInv (Ruthotto et al., 2016) and PyGIMLi (Ru¨cker et al., 2017)). Inmany electromagnetic geophysical applications, for example, a common model for10electrical conductivity can be produced through cycling a common model through therelevant problems until a sufficient misfit is achieved. In hydrogeophysics, however,the model from a geophysical simulation is the data for a hydrogeologic simulation.As such, for a deterministic large-scale inversion the sensitivity from one problem isempirically coupled to another problem and must be efficiently calculated. In hydro-geophysics, coupled hydrologic and geophysical interpretations are moving into threedimensions, and standard probabilistic and finite difference techniques are becoming“computationally infeasible” (Linde and Doetsch, 2016). The coupling of these meth-ods into a computationally efficient inversion requires attention to the scalability ofall individual approaches as well as exposing the geophysical inversions effectivelyfor hydrogeologic parameter estimation. In order to support the custom parameteri-zations, couplings, and integrations that are necessary for a new application, a gen-eral framework must provide combinatorial building blocks that are independentlyaccessible and extensible, while maintaining computational efficiencies. The PESTframework for model independent parameter estimation and uncertainty analysis is aconcrete example of where parts of this have been done with success (Doherty, 2004).The software is widely cited in academia (> 2K citations) especially in hydrology andhydrogeophysics, and is heavily used in industry. The advantage of being model in-dependent has given this technique wide application due to the flexibility to adapt tonew scientific questions. However, this also comes at quite a cost because the structureof the simulation and modelling cannot be used to the advantage of the algorithm. Aswith vadose zone flow or electromagnetics, when moving to three dimensions theremay be hundreds of thousands to millions of parameters to estimate. Not taking thestructure of the problem into account severely limits types and size of problems thatcan be considered. In the context of geophysical simulations and inversions there are11two significant challenges/opportunities for such a framework:1. to organize the components of geophysical simulations and inversions to andexpose explicit interfaces to components to interdisciplinary manipulation in acombinatorial manner; and2. maintaining computational scalability, especially with respect to efficient calcu-lation of sensitivities.Adapting interdisciplinary methodologies to formalize geophysical simulations andinversions inherently requires that a diverse suite of methods and applications be con-sidered across geophysics, hydrogeology, and geology. This process will take the formof deriving, from the existing body of literature, a consistent conceptual and compu-tational framework, which supports reproducible inversion workflows. By formalize,I do not mean mathematically, rather taking practices of ontology and computationalframework development in biology and other more mature interdisciplinary fields andapplying them to geophysical inversions. The ontology literature provides context, al-beit abstract, for the approach that I have used to formalize the research around thisinterdisciplinary problem and is briefly detailed in Appendix A.1.2.1 Take home pointsSustained, reproducible integration of geophysical simulations and inversions requiresthat methodologies be accessible, consistent, numerically documented, interoperable,and extensible. This can be enabled by a comprehensive framework that is validatedand rigorously tested against reality and leading edge research. To do this, research isrequired to:12• identify the composable components of geophysical inversions and simulations,as well as the interfaces between the components;• abstract commonalities between methodologies to a consistent, supporting sub-set; and• capture geoscience inversion heuristics in a reproducible manner.The output of this research will be a computational framework that is numericallytested and demonstrates the capability to support existing and future research direc-tions. Ideally this framework will catalyze and accelerate interdisciplinary collabora-tions.1.3 Application to vadose zone parameter estimationThe development of a geophysical framework requires considering a number of dis-ciplines and geophysical problems to ensure generality as well as extensibility. I amworking with collaborators in many of these geophysical methods (electromagnetics,direct current resistivity, magnetotellurics, magnetics, gravity) and am ensuring thatthe framework that I am researching supports these applications. However, the goal isalso to have the framework work outside of geophysics and most notably in hydroge-ology, as such, I have chosen vadose zone fluid flow as a model problem.Fluid flow in the vadose zone is described by the Richards equation (eq 3.1) andparameterized by hydraulic conductivity, which is a nonlinear function of pressurehead (Richards, 1931; Celia et al., 1990). Investigations in the vadose zone typicallyrequire identification of distributed hydraulic properties. This is increasingly beingdone through an inversion approach, which is also known as data assimilation, modelcalibration, or history matching (Liu and Gupta, 2007). These methods use changes13in water content or pressure head to infer hydraulic parameters (Binley et al., 2002;Deiana et al., 2007; Hinnell et al., 2010). Hydrogeophysics allows many more proxymeasurements, such as direct current resistivity data, to be taken to help characterizethese sites spatially, as well as through time. As such, the number of distributed hy-draulic parameters to be estimated in a Richards equation inversion will continue togrow. Conceptually this integration is framed as taking the output of a (time-lapse)direct current resistivity inversion (cf. Pidlisecky et al. (2013)), and using this estimateof electrical conductivity as a proxy for water content data in hydraulic parameterestimation. The proxy data can be directly incorporated through an empirical relation(e.g. Archie (1942)) or time-lapse estimations can be structurally incorporated throughsome sort of regularization technique (Haber and Gazit, 2013; Haber and Oldenburg,1997; Hinnell et al., 2010). Previous studies have either estimated homogeneous soilprofiles estimating less than five parameters (e.g. Binley et al. (2002); Deiana et al.(2007)) or heterogeneous soil profiles, estimating less than thousands of parameters(cf. Irving and Singha (2010); Jardani et al. (2013); Orgogozo et al. (2014)). Parame-ter estimation is currently completed by trial and error or using stochastic techniques,neither of which scale to the scenario that requires estimation of hundreds of thousandsto millions of parameters (Linde and Doetsch, 2016). This limit in scalability, espe-cially in the context of hydrogeophysics has been explicitly noted in the literature (cf.Binley et al. (2002); Deiana et al. (2007); Towara et al. (2015)). To our knowledge,a large scale inversion in three dimensions for distributed hydraulic parameters usingthe Richards equation has not yet been completed in the literature due to these issueswith computational scalability.There has been much research into the scalability of the inverse problem in geo-physical applications (e.g. electromagnetics) that allow the calculation of an optimiza-14tion step in the inversion without explicitly calculating or storing the sensitivity matrix(cf. Haber et al. (2000)). This is extremely important in large problems as the com-puter memory available to store this large, dense matrix can often be a limitation. Forexample, although there have been significant advances for massively parallel forwardsimulations of the Richards equation (cf. Orgogozo et al. (2014)), the computational“memory may simply not be large enough” to run the inverse problem using standardautomatic differentiation (Towara et al., 2015). Previous work uses either automaticdifferentiation or finite difference in order to explicitly compute the sensitivity matrix(e.g. PEST) (Finsterle and Zhang, 2011; Bitterlich and Knabner, 2002; Towara et al.,2015). Finite difference is computationally slower and can generate inaccuracies inthe sensitivity computation and tarry convergence of the optimization algorithm. Withregard to implicit sensitivity calculations, there is an opportunity to apply some ofthe learnings from the geophysical inversion literature to this hydrogeologic problem.Note that the implicit sensitivity calculation is necessary in any gradient based tech-nique as well as modern stochastic methods (Bui-Thanh and Ghattas, 2015).The application of the implicit sensitivity calculation to the Richards equation,however, is not straightforward. Hydraulic conductivity is the function we are invert-ing for - it is empirically determined and depends on pressure head; pressure headis the field that is calculated using the Richards equation. This nonlinear couplingrequires iterative optimization methods in the forward simulation between each timestep (e.g. Picard or Newton). This makes the implicit calculation of the effect of thesensitivity on a vector rather involved and intricate. Furthermore, the nonlinear re-lationship of hydraulic conductivity is empirically determined and depending on therelation used could involve the estimation of up to ten spatially-distributed parametersfrom a finite dataset. The implicit use of the sensitivity matrix should have the ability15to calculate the sensitivity to any of these model parameters; however, any inversionalgorithm must be tested as to the limits of estimating all distributed parameters atonce. One goal of the work in this thesis is to tackle the sensitivity calculation implic-itly. This would further allow for exploration into different inversion methodologiesand parameterizations of the empirical relationships. By not storing the sensitivity, andinstead computing its effect on a vector, the size of the problem that we can invert willbecome much larger. This will allow large 3D hydraulic parameter inversions usingthe Richards equation to be run on modest computational resources. However, directlyjumping into a 3D inversion for heterogeneous hydraulic properties, with many param-eters per cell in the isotropic case, is highly non-unique. I will explore some inversionschemes, model conceptualizations, and ways to explore interfacing to a priori infor-mation. Unsaturated hydraulic conductivity as well as the water retention curve areboth empirically described functions. The parameterization and estimation of thesefunctions in the context of collecting proxy saturation data will be explored in a num-ber of numerical experiments.1.3.1 Take home pointsWith advances in the spatial and temporal density of geophysical data collection, time-lapse water content estimates can be made with increasing accuracy. These proxy time-lapse measurements can be used to estimate hydraulic properties from non-invasivegeophysical methods. This is increasingly being completed in field studies, however,conceptual and computational simplifications are consistently made. Part of the bot-tleneck is the scalability of current inverse methods applied to the Richards equation.Research is required to:• reframe the Richards equation inversion for computational scalability when mov-16ing to large 3D distributed parameter estimation,• ensure that any parameter in the empirically determined hydraulic conductivityfunction and water retention curve can be estimated, and• investigate and explore the effectiveness of distributed joint inversions for hy-draulic parameters from a water content dataset.This research will inform the conceptual framework and contribute an implicit sen-sitivity calculation for the Richards equation inverse problem that can be coupled toother geophysical methods.1.4 Thesis outlineThe content of this thesis is divided into three chapters and three appendices; these areshown visually in Figure 1.1. Each chapter and appendix provides a stand alone intro-duction and conclusion to the specific topic under consideration. Chapter 2 presents acomprehensive framework for geophysical simulations and inversions. This includesan overview of current research and outlines a modular, object-oriented approach forstructuring deterministic, large-scale inversion methodology in geophysics that hasapplication to hydrogeology and can incorporate and interface to geologic informa-tion. A direct current resistivity forward simulation and inversion are used through-out this chapter as an example. A brief comment on the approach used to researchthis computational framework is included in Appendix A. An overview of finite vol-ume discretization techniques, which are heavily used throughout this thesis, has beenincluded as Appendix B. In this appendix, I examine and comment on the organiza-tion, structure, and formulation of three different classes of mesh: (a) tensor productorthogonal mesh, including formulation in cylindrical coordinates; (b) quadtree and17octree meshes; and (c) logically rectangular, non-orthogonal meshes. These meshesin 1D, 2D, 3D, and 4D are used throughout the thesis as well as extensions to mywork. The software used to inform my work and refine my interdisciplinary approachto simulation and parameter estimation in geophysics is open source and available un-der the permissive MIT license ( This repositoryincludes forward and inverse software by me and my colleagues of the frameworkfor vadose zone flow, time and frequency domain electromagnetics, direct currentresistivity, induced polarization, magnetotellurics, magnetics, gravity, and a numberof example linear inverse problems; these are described in the online documentation( 3 focuses on the Richards equation, which is the partial differential equa-tion that describes vadose zone flow. Using the previously developed framework andfinite volume tools tailored specifically for inverse problems (Appendix B), I havereframed the Richards equation to be scalable with respect to large-scale inverse prob-lems. The majority of this work is focused on enabling access to the sensitivity im-plicitly, through multiplication with a vector. The validity of this technique as well ascomments on numerical performance are provided. The scalability of the algorithmdeveloped is shown with comparison to other techniques. This work has built upon aswell as informed the research into the organization of the framework. The Richardsequation is more complex than many other geophysical methods analyzed because ofthe nonlinear, time dependent forward problem and multiple empirical relationshipsthat may or may not require estimation.Chapter 4 explores the estimation of hydraulic parameters from water content data.This is motivated by the hydrogeophysical problem where there is an availability ofproxy water content data. This chapter explores a set of empirical relations that in-18Figure 1.1: Outline of the thesis chapters: (2) the simulation and inversion frame-work; (3) the sensitivity calculation in Richards equation; (4) applicationsand exploration into vadose zone inversions; (A1) finite volume techniqueson a variety of meshes; and (A2) interfaces to geologic knowledge throughparameterizations and a forward simulation framework and extensions tothe work presented.form both the hydraulic conductivity function and the water retention curve. A jointinversion is formulated to estimate for five spatially distributed hydraulic parameters.The number of spatially distributed unknowns are experimented with, and the responseof the inversion algorithm is tested under various levels of a priori knowledge. Theefficacy of these approaches is commented upon, which may help inform laboratoryor field based experiments of this kind. Finally, a three dimensional inversion is com-pleted using the Richards equation. Due to computational scalability issues detailedin Chapter 3, an inversion for distributed hydraulic parameters at this scale is com-putationally infeasible with standard techniques (Linde and Doetsch, 2016). Theselimitations are overcome by both the framework introduced in Chapter 2 and the im-plicit sensitivity calculation that allows large-scale parameter estimation problems tobe tackled (Chapter 3). Other examples, extensions, and applications of the frameworkincluding other geophysical methodologies, case studies, and geoscience integrationsare included in Appendix C; much of this work is collaborative in nature and focuses19on the parameterization of the forward problem. As the focus of this thesis is on thegeophysical inversion framework, I have distilled my observations from these casestudies and provided these learnings in a general form.Finally, Chapter 5 provides a brief discussion on the thesis contributions and sum-marizes some opportunities for future research and collaborations. These researchareas may seem disparate, but collectively they are united by a common theme ofaddressing the complexities of bringing together the disciplines of geophysics, hydrol-ogy, geology, and inverse theory in a computationally tractable manner that is accessi-ble and extensible by other researchers.20Chapter 2Simulation and inversion framework2.1 Introduction and motivationGeophysical surveys can be used to obtain information about the subsurface, as themeasured responses depend on the physical properties and contrasts in the earth. In-versions provide a mathematical framework for constructing physical property modelsconsistent with the data collected by these surveys. The data collected are finite innumber, while the physical property distribution of the earth is continuous. Thus, in-verting for a physical property model from geophysical data is an ill-posed problembecause no unique solution explains the data. Furthermore, the data may be contam-inated with noise. Therefore, the goal of a deterministic inversion is not only to finda model consistent with the data, but also to find the ‘best’ model that is consistentwith the data1. The definition of ‘best’ requires the incorporation of assumptions anda priori information, often in the form of an understanding of the particular geologic1Alternatively, the inverse problem can be formulated in a probabilistic framework, for example:(Tarantola, 2005; Tarantola and Valette, 1982). In this thesis, we will focus our attention on the deter-ministic approach.21setting or structures (Constable et al., 1987; Oldenburg and Li, 2005; Lelie`vre et al.,2009). Solving the inverse problem involves many moving pieces that must work to-gether, including physical simulations, optimization, linear algebra, and incorporationof geology. Deterministic geophysical inversions have been extensively studied andmany components and methodologies have become standard practice. With increasesin computational power and instrumentation quality, there is a greater drive to extractmore information from geophysical data. Additionally, geophysical surveys are beingapplied in progressively more challenging environments. As a result, the geosciencesare moving towards the integration of geological, geophysical, and hydrological in-formation to better characterize the subsurface (e.g. Haber and Oldenburg (1997);Doetsch et al. (2010); Gao et al. (2012)). This integration is a scientifically and prac-tically challenging task (Li and Oldenburg, 2000a; Lelie`vre et al., 2009). These chal-lenges, compounded with inconsistencies between different data sets, often make theintegration and implementation complicated and/or non-reproducible. The develop-ment of new methodologies to address these challenges will build upon, as well asaugment, standard practices; this presupposes that researchers have access to consis-tent, well-tested tools that can be extended, adapted, and combined.There are many proprietary codes available that focus on efficient algorithms andare optimized for a specific geophysical application (e.g. Kelbert et al. (2014); Li andKey (2007); Li and Oldenburg (1996, 1998)). These packages are effective for theirintended application; for example, in a domain-specific, large-scale geophysical inver-sion or a tailored industry workflow. However, many of these packages are ‘black-box’algorithms; that is, they cannot easily be interrogated or extended. As researchers,we require the ability to interrogate and extend ideas; this must be afforded by thetools that we use. Accessibility and extensibility are the primary motivators for this22work. Other disciplines have approached the development of these tools through opensource initiatives, using interpreted languages such as Python (for example, Astropyin astronomy (Astropy Collaboration et al., 2013) and SciPy in numerical computing(Jones et al., 2001)). Interpreted languages facilitate interactive development usingscripting, visualization, testing, and interoperability with code in compiled languagesand existing libraries. Furthermore, many open source initiatives have led to com-munities with hundreds of researchers contributing and collaborating by using socialcoding platforms, such as GitHub ( Initiatives also exist in the geo-physical forward and inverse modeling community, which target specific geophysicalapplications (cf. Hansen et al. (2013); Hewett et al. (2013); Uieda et al. (2014); Kel-bert et al. (2014); Harbaugh (2005)). Recent work has seen an increased focus bythe geophysical community on a framework approach that targets multiple geophys-ical/hydrogeologic methods (e.g. JInv (Ruthotto et al., 2016) and PyGIMLi (Ru¨ckeret al., 2017)). We are interested in creating a community around geophysical sim-ulations and gradient-based inversions. To create a foundation on which to build acommunity, we require a comprehensive framework that is applicable across domainsand upon which researchers can readily develop their own tools and methodologies.To support these goals, this framework must be modular and its implementation mustbe easily extensible by researchers.2.1.1 Attribution and disseminationThe goal of this chapter is to present a comprehensive framework for simulation andgradient-based parameter estimation in geophysics. Core ideas from a variety of geo-physical inverse problems have been distilled to create this framework. We also pro-vide an open source library, written in Python, called SIMPEG (Simulation and Pa-23rameter Estimation in Geophysics, Our implemen-tation has core dependencies on SciPy, NumPy, and Matplotlib, which are standardscientific computing packages in Python (Jones et al., 2001; Van Rossum and DrakeJr, 1995; Oliphant, 2007; Hunter, 2007). SIMPEG includes staggered grid, mimeticfinite volume discretizations on structured and semi-structured meshes. It interfaces tostandard numerical solver packages, convex optimization algorithms, model parame-terizations, and visualization routines. We use Python’s object-oriented paradigm tocreate modular code that is extensible through inheritance and subtype polymorphism.SIMPEG follows a fully open source development paradigm (Feller and Fitzgerald,2000) and uses the permissive MIT license. Throughout its development, we havefocused on modularity, usability, documentation, and extensive unit-testing (Wilsonet al., 2014). Please see the website ( for up-to-date code, examples,and documentation of this package. In addition, there are many published use casesacross a variety of geophysical applications (Kang et al., 2014, 2015b,a; Kang andOldenburg, 2015; Heagy et al., 2014, 2015d). We hope that the organization, modu-larity, minimal dependencies, documentation, and testing in SIMPEG will encouragereproducible research, cooperation, and communication to help tackle some of the in-herently multidisciplinary geophysical problems.To guide the discussion, we start this chapter by outlining gradient-based inver-sion methodology in Section 2.2. The inversion methodology directly motivates theconstruction of the SIMPEG framework, terminology, and software implementation,which we discuss in Section 2.3. We weave an example of Direct Current (DC) resis-tivity throughout the discussion of the SIMPEG framework to provide context for thechoices made and highlight many of the features of SIMPEG.242.2 Inversion methodologyGeophysical inverse problems are motivated by the desire to extract information aboutthe earth from measured data. A typical geophysical datum can be written as:Fi[m]+ εi = di, (2.1)where F is a forward simulation operator that incorporates details of the relevant phys-ical equations, sources, and survey design, m is a generic symbol for the inversionmodel, εi is the noise that is often assumed to have known statistics, and di is theobserved datum. In a typical geophysical survey, we are provided with the data,di, i= 1...N, and some estimate of their uncertainties. The goal is to recover the model,m, which is often a physical property. The data provide only a finite number of inac-curate constraints upon the sought model. Finding a model from the data alone isan ill-posed problem since no unique model exists that explains the data. Additionalinformation must be included using prior information and assumptions (for example,downhole property logs, structural orientation information, or known interfaces (Fulla-gar et al., 2008; Li and Oldenburg, 2000a; Lelie`vre et al., 2009)). This prior knowledgeis crucial if we are to obtain an appropriate representation of the earth and will be dis-cussed in more detail in Section 2.2.1.Defining and solving a well-posed inverse problem is a complex task that requiresmany interacting components. It helps to view this task as a workflow in which vari-ous elements are explicitly identified and integrated. Figure 2.1 outlines the inversionmethodology that consists of inputs, implementation, and evaluation. The inputs arecomposed of: the geophysical data; the equations, which are a mathematical descrip-tion of the governing physics; and, prior knowledge or assumptions about the setting.25The implementation consists of two broad categories: the forward simulation and theinversion. The forward simulation is the means by which we solve the governing equa-tions, given a model, and the inversion components evaluate and update this model.We are considering a gradient-based approach, which updates the model through anoptimization routine. The output of this implementation is a model, which, prior tointerpretation, must be evaluated. This requires considering, and often re-assessing,the choices and assumptions made in both the input and the implementation stages.In this chapter, our primary concern is the implementation component; that is, howthe forward simulation and inversion are carried out numerically. As a prelude to dis-cussing how the SIMPEG software is implemented, we step through the elements inFigure 2.1, considering a Tikhonov-style inversion.2.2.1 InputsThree sources of input are required prior to performing an inversion: (1) the geophysi-cal data and uncertainty estimates; (2) the governing equations that connect the soughtmodel with the observations; and (3) prior knowledge about the model and the contextof the problem.Data and uncertainty estimatesAt the heart of the inversion are the geophysical data that consist of measurementsover the earth. These data depend on the type of survey, the physical property distri-bution, and the type and location of the measurements. The details about the surveymust also be known, such as the location, orientation, and waveform of a source andwhich component of a particular wavefield is measured at a receiver. The data arecontaminated with additive noise, which can sometimes be estimated by taking mul-26Figure 2.1: Inversion methodology. Including inputs, implementation, evalua-tion and interpretation.tiple realizations of the data. However, standard deviations of those realizations onlyprovide a lower bound for the noise. For the inverse problem, the uncertainty in thedata must include not only this additive noise, but also any discrepancy between thetrue earth experiment and our mathematical representation of the data. Including theseaspects requires accounting for mis-location of receivers and sources, poor control ofthe transmitter waveform, electronic gains or filtering applied to signals entering thereceivers, incorrect dimensionality in our mathematical model (e.g. working in 2Dinstead of 3D), neglect of physics in our mathematical equations by introducing as-sumptions (e.g. using a straight ray tomography vs. a full waveform simulation in27seismic), and discretization errors of our mathematical equations.Governing equationsThe governing equations provide the connection between the physical properties ofthe subsurface and the data we observe. Most frequently, these are sets of partialdifferential equations with specific boundary conditions. The governing equations,with specified source terms, can be solved through numerical discretization using finitevolume, finite element, or integral equation techniques. Alternatively, they may alsobe solved through evaluations of analytic functions. Whichever approach is taken, it iscrucial that there exists some way to simulate the data response given a model.Prior knowledgeIf one model acceptably fits the data, then infinitely many such models exist. Ad-ditional information is therefore required to reduce non-uniqueness. This additionalinformation can be geologic information, petrophysical knowledge about the variousrock types, borehole logs, additional geophysical data sets, or inversion results. Thisprior information can be used to construct reference models for the inversion and alsocharacterize features of the model, such as whether it is best described by a smoothfunction or if it is discontinuous across interfaces. Physical property measurementscan be used to assign upper and lower bounds for a physical property model at pointsin a volume or in various regions within our 3D volume. The various types of infor-mation relevant to the geologic and geophysical questions that we must address mustbe combined and translated into useful information for the inversion (Lelie`vre et al.,2009; Li et al., 2010).282.2.2 ImplementationIn this section, we outline the components necessary to formulate a well-posed inverseproblem and solve it numerically. Two major abilities are critical to running the inver-sion: (1) the ability to simulate data, and (2) the ability to assess and update the model(Figure 2.1).Forward simulationThe ability to carry out an inversion presupposes the ability to run a forward simulationand create predicted data, given a physical property model. In forward simulation, wewish to compute F [m] = dpred. The operator, F , simulates the specific measurementstaken in a geophysical survey, using the governing equations. The survey refers to alldetails regarding the field experiment that we need to simulate the data. The forwardsimulation of DC resistivity data requires knowledge of the topography, the resistivityof the earth, and the survey details, including locations of the current and potentialelectrodes, the source waveform, the units of the observations, and the polarity ofdata (since interchanging negative and positive electrodes may sometimes occur in thefield). To complete the simulation, we need to solve our governing equations usingthe physical property model, m, that is provided. In the DC resistivity experiment,our partial differential equation, with supplied boundary conditions, is solved with anappropriate numerical method; for example, finite volumes, finite elements, integralequations, or semi-analytic methods for 1D problems. In any case, we must discretizethe earth onto an appropriate numerical forward simulation mesh, (meshF ). The sizeof the cells will depend upon the structure of the physical property model, topography,and the distance between sources and receivers. Cells in meshF must be small enough,and the domain large enough, to achieve sufficient numerical accuracy. Proper mesh29design is crucial so that numerical modeling errors are below a prescribed thresholdvalue (cf. Haber (2015)).In general, we can write our governing equations in the form of:C(m,u) = 0, (2.2)where m is the modeled physical property and u are the fields and/or fluxes. C is oftengiven by a partial differential equation or a set of partial differential equations. Infor-mation about the sources and appropriate boundary conditions are included in C. Thissystem is solved for u and the predicted data are extracted from u via a projection (orfunctional), dpred = P[u]. The ability to simulate the geophysical problem and generatepredicted data is a crucial building block. Accuracy and efficiency are essential, sincemany forward problems must be evaluated when carrying out any inversion.Inversion elementsIn the inverse problem, we must first specify how we parameterize the earth model.Finding a distributed physical property can be done by discretizing the 3D earth intovoxels, each of which has a constant, but unknown, value. It is convenient to referto the domain on which this model is discretized as the inversion mesh, meshI . Thechoice of discretization involves an assessment of the expected dimensionality of theearth model. If the physical property varies only with depth, then the cells in meshIcan be layers and a 1D inverse problem can be formulated. A more complex earthmay require 2D or 3D discretizations. The choice of discretization depends on boththe spatial distribution and resolution of the data and the expected complexity of thegeologic setting. We note that the inversion mesh has different design criteria and con-30straints than the forward simulation mesh. For convenience, many inverse problemshave historically been solved with meshI = meshF so that only one discretization isneeded for the inversion. There is a growing body of work that investigates combina-tions of inversion discretizations and forward modeling meshes that are geared towardsproblem-specific formulations as well as efficiency in large-scale problems (Haber andSchwarzbach, 2014; Yang et al., 2014; Haber and Heldmann, 2007). In any formula-tion, there exists a mapping between meshI and meshF such that the parameterizationchosen can be used to simulate data in a forward simulation.To formulate a mathematical statement of the inverse problem, there are two es-sential elements:1. data misfit: a metric that measures the misfit between the observed and predicteddata; and2. regularization: a metric that is constructed to evaluate the model’s agreementwith assumptions and prior knowledge.The data misfit requires an assessment of the error in each datum. These errorsresult from anything that yields a discrepancy between the mathematical modeling andthe true value. It includes additive noise, errors in the description of survey parameters(e.g. receiver locations, transmitter waveforms, etc.), incorrect choice of governingequations, and numerical errors arising from the simulation technique. Clearly, quan-tifying the noise for each datum poses a challenge.The data misfit is a measure of how well the data predicted by a given modelreproduce the observed data. To assess goodness of fit, we select a norm that evaluatesthe ‘size’ of the misfit. This metric must include an uncertainty estimate for each31datum. Often, we assume that the data errors are Gaussian and uncorrelated and thenestimate the standard deviation for each datum. The most common norm, and one thatis compatible with Gaussian statistics, has the form:φd(m) =12‖Wd(F [m]−dobs)‖22. (2.3)Here, F [m] is a forward modeling that produces predicted data, dpred, as in equation:2.1. Wd is a diagonal matrix whose elements are equal to Wdii = 1/εi, where εi is anestimated standard deviation of the ith datum. It is important to think carefully whenassigning these estimates. A good option is to assign a εi = f loor+%|di|. Percentagesare generally required when there is a large dynamic range of the data. A percentagealone can cause great difficulty for the inversion if a particular datum acquires a valueclose to zero; therefore, we include a floor.In addition to a metric that evaluates the size of the misfit, we also require a tol-erance, φ∗d . We consider that models satisfying φd(m) ≤ φ∗d adequately fit the data(Parker, 1994). If the data errors are Gaussian and we have assigned the correct stan-dard deviations, then the expected value of φ∗d ≈ N, where N is the number of data.Finding a model that has a misfit substantially lower than this will result in a solutionthat has excessive and erroneous structure; that is, we are fitting the noise. Finding amodel that has a misfit substantially larger than this will yield a model that is missingstructure that could have been extracted from the data (see Oldenburg and Li (2005)for a tutorial).The choice of misfit in equation 2.3 is not the only possibility for a misfit measure.If data errors are correlated, then Wd is the square root of the data covariance matrixand it will have off-diagonal terms. Often useful in practice is recognizing if the noise32statistics are non-Gaussian. Incorporating robust statistical measures, like lp normswith p≈ 1, are useful for handling outliers (Ekblom, 1973; Farquharson, 1998).The second essential inversion element is defining the regularization functional.If there is one model that has a misfit equal to the desired tolerance, then there areinfinitely many other models which can fit to the same degree. The challenge is tofind the model that has both the desired characteristics and is compatible with a prioriinformation. A single model can be selected from an infinite ensemble by measuringthe length, or norm, of each model. Then the smallest, or sometimes largest, membercan be isolated. Our goal is to design a norm that embodies our prior knowledge and,when minimized, yields a realistic candidate for the solution of our problem. Thenorm can penalize variation from a reference model, spatial derivatives of the model,or some combination of these. We denote this norm by φm and write it in a matrixform, for example,φm(m) =12‖Wm(m−mref)‖22. (2.4)Wm is a matrix and mref is a reference model (which could be zero). The matrix Wmcan be a stacked combination of matrices weighted by α∗:Wm = [αsI, αxW>x , αyW>y , αzW>z ]>. (2.5)Here, Wm is a combination of smallness and first-order smoothness in the x, y, and zdirections. Each of the W matrices is, in fact, a discrete representation of an integral33(cf. Oldenburg and Li (2005)).∫Ω(ws(m−mre f ))2 dV (smallness),∫Ω(wxdmdx)2dV (x-smoothness),∫Ω(wydmdy)2dV (y-smoothness),∫Ω(wzdmdz)2dV (z-smoothness),(2.6)The final regularization, Wm, can be a weighted sum of these, with α∗ being appliedas scalars or diagonal matrices, with varying weights for each cell or cell face (cf. Old-enburg and Li (2005); Haber (2015)). Additional weightings can also be incorporatedthrough Wm, such as depth weighting, which is important in potential field inversions(such as magnetics and gravity), or sensitivity weightings to prevent model structurefrom concentrating close to sources or receivers (Li and Oldenburg, 1996, 2000b). Theregularization functionals addressed provide constraints on the model in a weak form;that is, a single number is used to characterize the entire model. Strong constraints thatwork within each cell can often be provided in the form of upper and lower bounds;these bounds will be incorporated in the following section. The l2 norms referred toabove are appropriate for many problems, however models norms, such as lp-norms,total variation, minimum support stabilizing functionals, or rotated smoothness oper-ators that favor different character and / or include additional information can also bedesigned (cf. Oldenburg (1984); Oldenburg and Li (2005); Claerbout and Muir (1973);Strong and Chan (2003); Zhdanov (2002); Li and Oldenburg (2000a)). For example:∫|wp(m−mre f )|pdV (weighted-lp norm) (2.7)34The potential to have different norms tailored to a specific problem, with the additionalfunctionality of localized weightings and reference models, provides the user withthe capability of designing a regularization that favors a solution that is compatiblewith existing knowledge about the model. This task is not trivial, requires carefulthought, and must often be re-evaluated and adjusted as the geophysicist iterates overthe inversion procedure (Figure 2.1).Statement of the inverse problemThe purpose of this section is to pose our inverse problem in a mathematically preciseway and to provide a methodology by which a solution can be achieved. In the workthat follows, we outline a specific methodology that we will later demonstrate. We for-mulate the inverse problem as a problem in optimization, where we define an objectivefunction, based on the data misfit and model regularization, and aim to find a modelwhich sufficiently minimizes it. Many variants of this formulation are possible.At this stage of the workflow, we have on hand all of the necessary components forformulating the inverse problem as an optimization problem. We have the capabilityto forward model and generate predicted data, assess the data misfit using φd , and atolerance on the data misfit has already been specified. A regularization functional,φm, and additional strong constraints on the model have been identified, such as upperand lower bounds: mLi ≤mi ≤mHi . The sought model is one that minimizes φm andalso reduces the data misfit to some tolerance, φ∗d . However, a reduction in data misfitrequires that the model have increased structure, which typically is at odds with theassumptions we impose in the construction of φm, meaning that the φd and φm areantagonistic. To address this and still pose the inversion as an optimization problem,35we design a composite objective function:φ(m) = φd(m)+βφm(m), (2.8)where β is a positive constant. It is often referred to as the trade-off parameter, re-gression parameter, regularization parameter, or Tikhonov parameter (Tikhonov andArsenin, 1977). When β is very large, the minimization of φ(m) produces a modelthat minimizes the regularization term and yields a large φd(m). Alternatively, when βis very small, minimization of φ(m) produces a model that fits the data very well butis contaminated with excessive structure so that φm(m) is large. The inverse problemis posed as:minimizemφ(m) = φd(m)+βφm(m)s.t. φd ≤ φ∗d , mLi ≤mi ≤mHi .(2.9)Since the value of β is not known a priori, the above optimization problem can besolved at many values of β to produce a trade-off, or Tikhonov, curve (cf. Parker(1994); Hansen (1998)). An optimum value, β ∗, can be found so that minimizingequation 2.8 with β ∗ produces a model with misfit φ∗d . The value of β∗ can be foundvia cooling techniques where the β is progressively reduced from some high valueand the process stopped when the tolerance is reached or by using two-stage methods(cf. Parker (1977)). There are other strategies for selecting the trade-off parameterincluding the L-curve technique (Hansen, 1992), which attempts to find the point ofgreatest curvature in the Tikhonov curve and Generalized Cross Validation (Wahba,1990; Golub et al., 1979; Haber and Oldenburg, 2000; Oldenburg and Li, 2005; Far-quharson and Oldenburg, 2004).36The optimization posed in equation 2.9 is almost always non-linear. It is linearonly in a special case, where the forward mapping is a linear functional of the model,φm and φd are written as l2 norms, β is known, and there are no imposed bound con-straints. This rarely happens in practice, requiring that iterative optimization methodsbe employed to find a solution. Gradient-based methods are commonly used and werefer the reader to Nocedal and Wright (1999) for background and introductions to therelevant literature. For geophysical problems, Gauss-Newton techniques have provento be valuable and practical. For l2 norms, we write the objective function as:φ(m) =12||Wd(F [m]−dobs)||22+12β ||Wm(m−mref)||22. (2.10)The gradient is given by:g(m) = J[m]>W>d Wd(F [m]−dobs)+βW>mWm(m−mref), (2.11)where J[m] is the sensitivity or Jacobian. The components, J[m]i j, specify how the ithdatum changes with respect to the jth model parameter; these changes will be discussedin more detail in the next section. At the kth iteration, beginning with a model, mk, wesearch for a perturbation, δm, which reduces the objective function. Linearizing theforward simulation by:F [mk +δm]≈ F [mk]+ J[mk]δm (2.12)and setting the gradient equal to zero yields the standard Gauss-Newton equations to37be solved for the perturbation δm:(J[m]>W>d WdJ[m]+βW>mWm)δm =−g(m). (2.13)The updated model is given by:mk+1 = mk + γδm, (2.14)where γ ∈ (0,1] is a coefficient that can be found by a line search. Setting γ = 1 is thedefault and a line search is necessary if φ(mk+1)≥ φ(mk).The iterative optimization process is continued until a suitable stopping criterionis reached. Completion of this iterative process yields a minimization for particularvalue of the trade-off parameter, β . If we are invoking a cooling schedule, and if thedesired misfit tolerance is not yet achieved, β is reduced and the iterative numericaloptimization procedure is repeated.SensitivitiesA central element in the above approach is the computation of the sensitivities. Thesensitivity functional is defined by:J[m] =∂F [m]∂m= P(∂u∂m)(2.15)where P is a linear projection and d· indicates total difference. There are numerousapproaches to computing the sensitivity, but the chosen methodologies are dictated bythe size of the problem. The discrete sensitivity matrix, J, is a dense N×M matrix,where N is the number of data and M is the number of model parameters. For some38problems, J can be computed directly and stored. Ultimately, this computation andstorage demands the solution of numerous forward problems (cf. Haber (2015)). Inanother approach, we can factor J[m] in symbolic form. In the general case, we solvefor the sensitivity implicitly by taking the derivative of C(m,u) = 0 (equation 2.2) toyield:∇mC(m,u)dm+∇uC(m,u)du = 0, (2.16)where ∇· indicates partial difference and both ∇mC(m,u) and ∇uC(m,u) are ma-trices. For a given model, ∇uC(m,u) corresponds to the forward simulation opera-tor. If the forward problem is well-posed, then the matrix is invertible (Haber, 2015).Equation 2.16 can be rearranged to:du =−(∇uC(m,u))−1∇mC(m,u)dm, (2.17)and combined with equation 2.15 to obtain a formula for the sensitivity matrix. Wenote that this matrix is dense, often large, and need not actually be formed and storedand the inverse of ∇uC(m,u) need not be formed explicitly.Inversion as optimizationOnce the inverse problem has been stated in an optimization framework (equation2.9), an appropriate optimization routine can be selected. For example, if bound con-straints are incorporated, we can use a projected Gauss-Newton algorithm. In large-scale inversions, special attention may have to be given to ensuring a memory efficientoptimization algorithm. However, the underlying mechanics of the algorithms oftenremain unchanged. In a geophysical inversion, we require a model that is consistent39with a priori information and known, or assumed, statistical distributions (e.g. thediscrepancy principle). As such, the stopping criteria of the inversion are often imple-mented differently than traditional optimization algorithms or a series of incompleteoptimization algorithms are invoked while changing the objective function (Oldenburgand Li, 2005; Haber, 2015; Haber et al., 2000).The optimization of the stated inverse problem provides the machinery to obtain amathematical solution. However, before the model is accepted as a viable candidate,there are numerous questions that should be investigated. For example, some questionsto address might include: (a) How well does the recovered model fit the observeddata? (b) Is there bias in the misfits between the observed and predicted data? (c)What was the path for the convergence? (d) Is there too much or too little structure?(e) Does the model fit with prior knowledge and other data sets? The final resultsand details about how the inversion algorithm has performed all provide clues as towhether the constructed model can be accepted or if elements in our procedure or itsnumerical implementation need to be altered and the inversion rerun. These mightinclude: adjusting the assigned uncertainties in the misfit function; altering the modelregularization; or, changing aspects of the numerical computations.2.2.3 Evaluation/interpretationIn this section, we return to the initial question posed, which the inversion was de-signed to help answer. Questions of interest might include: (a) Are the interestingfeatures supported by the data or are they artifacts?; (b) Does the result make sensegeologically and geophysically?; and (c) Are there interesting features that should beinvestigated further? Addressing these questions usually involves repeating the inver-sion process with appropriate modifications (cf. Oldenburg and Li (2005); Pidlisecky40et al. (2011); Lines et al. (1988)). As such, we require an implementation that is inher-ently and unequivocally modular, with all pieces available for manipulation. Black-box software, where the implementations are hidden, obfuscated, or difficult to ma-nipulate, does not promote experimentation and investigation. Exposing the detailsof the implementation to the geophysicist in a manner that promotes productivity andquestion-based interrogation is the goal of SIMPEG and is the topic of the next section.2.3 Modular implementationAn overwhelming amount of choices must be made while working through the forwardmodeling and inversion process (Figure 2.1). As a result, software implementations ofthis workflow often become complex and highly interdependent, making it difficultto interact with other scientists or to ask them to pick up and change the work. Ourapproach to handling this complexity is to propose a framework, Figure 2.2, whichcompartmentalizes the implementation of inversions into various units. We presentthe framework in this specific modular style, as each unit contains a targeted subsetof choices crucial to the inversion process. The aim of the SIMPEG framework, andimplementation, is to allow users to move between terminology, math, documentation,and code with ease, such that there is potential for development in a scalable way.The SIMPEG implementation provides a library that mimics the framework shown inFigure 2.2, with each unit representing a base class. These base classes can be inheritedin specific geophysical problems to accelerate development as well as to create codethat is consistent between geophysical applications.412.3.1 Implementation choicesWe chose Python (Van Rossum and Drake Jr, 1995) for the implementation of SIM-PEG. Python supports object-oriented practices and interactive coding, has extensivesupport for documentation, and has a large and growing open source scientific com-munity (Lin, 2012). As an interpreted language, however, there are occasionally bot-tlenecks on speed or memory. These inefficiencies may mean that the code will not beable to scale to a production quality code. However, these computational bottleneckscan often be identified through profiling and can be written in a lower-level languageand wrapped in Python. Additionally, these problems are admissible when the goalof the software is clear: we require an interactive research tool where geophysicalproblems from many disciplines live in one place to enhance experimentation and dis-semination of new ideas. To enhance the dissemination of our work, we have releasedour work under the permissive MIT license for open source software. The MIT licenseneither forces packages that use SIMPEG to be open source, nor does it restrict com-mercial use. We have also ensured that we have followed best practices, with regard toversion control, code-testing, and documentation (Wilson et al., 2014).2.3.2 OverviewAs discussed in the previous section, the process of obtaining an acceptable modelfrom an inversion generally requires the geophysicist to perform several iterations ofthe inversion workflow while rethinking and redesigning each piece of the frameworkto ensure it is appropriate in the current context. Inversions are experimental and em-pirical by nature and our software package is designed to facilitate this iterative pro-cess. To accomplish this iterative process, we have divided the inversion methodologyinto eight major components (Figure 2.2). The Mesh class handles the discretization42of the earth and also provides numerical operators. The forward simulation is splitinto two classes: the Survey and the Problem. The Survey class handles thegeometry of a geophysical problem as well as sources. The Problem class handlesthe simulation of the physics for the geophysical problem of interest. Although cre-ated independently, these two classes must be paired to form all of the componentsnecessary for a geophysical forward simulation and calculation of the sensitivity. TheProblem creates geophysical fields, given a source from the Survey. The Surveyinterpolates these fields to the receiver locations and converts them to the appropri-ate data type (for example, by selecting only the measured components of the field).Each of these operations may have associated derivatives, with respect to the modeland the computed field; these associated derivatives are included in the calculationof the sensitivity. For the inversion, a DataMisfit is chosen to capture the good-ness of fit of the predicted data and a Regularization is chosen to handle non-uniqueness. These inversion elements and an Optimization routine are combinedinto an inverse problem class (InvProblem). InvProblem is the mathematicalstatement (i.e. similar to equation 2.9) that will be numerically solved by running anInversion. The Inversion class handles organization and dispatch of directivesbetween all of the various pieces of the framework.The arrows in Figure 2.2 indicate what each class takes as a primary argument. Forexample, both the Problem and Regularization classes take a Mesh class asan argument. The diagram does not show class inheritance, as each of the base classesoutlined have many subtypes that can be interchanged. The Mesh class, for example,could be a regular Cartesian mesh or a cylindrical coordinate mesh, which have manycommon properties. We can exploit these common features, such as both meshes beingcreated from tensor products, through inheritance of base classes; differences can be43Figure 2.2: SIMPEG framework indicating the flow of information. In the im-plementation, each of these modules is a base class.expressed through subtype polymorphism. We refer the reader to the online, up-to-date documentation ( to observe the class inheritance structurein depth.2.3.3 Motivating exampleWe will use the DC resistivity problem from geophysics to motivate and explain thevarious components of the SIMPEG framework. This example will be referred tothroughout this section. We will introduce the example briefly here and refer the readerto Appendix B.2.4 for a more in-depth discussion. The governing equations for DC44resistivity are:∇ ·~j = I(δ (~r−~rs+)−δ (~r−~rs−)) = q1σ~j =−∇φ(2.18)where σ is the electrical conductivity, φ is the electric potential, and I is the input cur-rent at the positive and negative dipole locations~rs± , captured as Dirac delta functions.In DC resistivity surveys, differences in the potential field, φ , are sampled using dipolereceivers to collect observed data. To simulate this partial differential equation (PDE)(or set of PDEs, if there are multiple current injection locations), we must discretizethe equation onto a computational mesh.2.3.4 MeshAny numerical implementation requires the discretization of continuous functions intodiscrete approximations. These approximations are typically organized in a mesh,which defines boundaries, locations, and connectivity. In geophysical simulations, werequire the definitions of averaging, interpolation, and differential operators for anymesh. Throughout our work, we have implemented discretization techniques, using astaggered mimetic finite volume approach (Hyman and Shashkov, 1999; Hyman et al.,2002). For an in-depth discussion of the finite volume techniques employed in thisthesis, we refer the reader to Appendix B. This work has resulted in an open sourcepackage called discretize, which provides finite volume techniques abstractedacross four mesh types: (1) tensor product mesh; (2) cylindrically symmetric mesh; (3)logically rectangular, non-orthogonal mesh; and (4) octree and quadtree meshes. Thetechniques and interface to the methodologies are specifically tailored for efficiencyand accessibility for geophysical inverse problems. To create a new Mesh instance, aTensorMesh class can be selected from the discretize module and instantiated45with a list of vectors: Here, we import the discretize library as well as NumPyimport discretize # See Appendix A and Cockett et al. 2016import numpy as npimport scipy.sparse as sphx = np.ones(30)hy = np.ones(30)mesh = discretize.TensorMesh([hx, hy])Program 2.1: Creation of a 2D tensor product mesh using the discretizepackage discussed in Appendix B.(np) and SciPy’s sparse matrix package (sp) (Oliphant, 2007; Jones et al., 2001). Thevectors hx and hy describe the cell size in each mesh dimension. The dimension ofthe mesh is defined by the length of the list, requiring very little change to switch meshdimensions or type. Once an instance of a mesh is created, access to the propertiesand methods, shown in Table 2.1, is possible. Additional methods and visualizationroutines are also included in the Mesh classes. Of note in Table 2.1 are organizationalproperties (such as counting and geometric properties), locations of mesh variables asCartesian grids, differential and averaging operators, and interpolation matrices. Wecan readily extend the mesh implementation to other types of finite volume meshes (forexample, octree (Haber and Heldmann, 2007), logically rectangular non-orthogonalmeshes (Hyman et al., 2002), and unstructured meshes (Ollivier-Gooch and Van Al-tena, 2002)). Additionally, this piece of the framework may be replaced by othermethodologies such as finite elements.With the differential operators readily accessible across multiple mesh types, sim-ulation of a cell-centered discretization for conductivity, σ , in the DC resistivity prob-lem is straightforward. The discretized system of equations, B.4, can be written as:A(σ)u = D(Mf1/σ )−1D>u =−q, (2.19)46Table 2.1: Selected Mesh class properties with explanations.Property or Function Explanationdim Dimension of the meshx0 Location of the originnC, nN, nF, nE The number of cells, nodes, faces, or edges. (e.g. nC isthe total number of cells)vol, area, edge Geometric measurements for the meshgridN, gridCC, etc. Array of grid locationsnodalGrad Gradient of a nodal variable→ edge variablefaceDiv Divergence of a face variable→ cell-centered variableedgeCurl Curl of a edge variable→ face variablecellGrad Gradient of a cell-centered variable→ face variableaveF2CC, aveN2CC,etc.Averaging operators (e.g. F→CC, takes values on facesand averages them to cell-centers)getInterpolationMat(loc) Interpolation matrix for xyz locationswhere D and D> are the divergence and ‘gradient’ operators, respectively. This equa-tion is assuming Dirichlet boundary conditions and a weak formulation of the DC re-sistivity equations, as in Section B.2.5. The conductivity, σ , is harmonically averagedfrom cell-centers to cell-faces to create the matrix (Mf1/σ )−1 (Pidlisecky et al., 2007).Note that the matrix (Mf1/σ )−1 is diagonal when the physical property is isotropic orhas coordinate anisotropy on a tensor product mesh, so the inverse is trivial. Using ourdiscretize package, this equation is written as:D = mesh.faceDivMsig = mesh.getFaceInnerProduct(sigma, invProp=True, invMat=True)A = D*Msig*D.TProgram 2.2: Creation of the matrix A(σ) for the direct current resistivity prob-lem. See Appendix B for details on finite volume.The code is easy to read, looks similar to the math, can be built interactively us-ing tools such as IPython (Pe´rez and Granger, 2007), and is not dependent on thedimension of mesh used. Additionally, it is decoupled from the mesh type. For ex-47ample, Figure 2.3 is generated by solving a DCProblem for three different meshtypes: TensorMesh; TreeMesh; and, CurvilinearMesh. Other than the spe-cific mesh generation code, no other modifications to the DC problem were neces-sary (see the online examples provided in SIMPEG). Given the electrode locations,a q can be constructed on each mesh and the system, A(σ)u = −q, can be solved.There are many excellent packages available to solve matrix equations and we havecreated a library to interface many of these direct and iterative solvers. The package,pymatsolver, comes with a few different types of Solver objects that provide asimple and common interface to Super-LU, Paradiso, and Mumps as well as includinga few simple preconditioners for iterative solvers (Li, 2005; Schenk and Gartner, 2004;Duff et al., 1986; Balay et al., 2012). The potential field can be projected onto the re-from pymatsolver import PardisoSolver # Solver wrapping utilitiesAinv = PardisoSolver(A) # Create a solver objectu = Ainv * (- q)mesh.plotImage(u)Program 2.3: Solving and plotting the fields (φ ) for direct current resistivity us-ing pymatsolver and visualization utilities in SIMPEG.ceiver electrode locations through interpolation matrices, which are constructed by theMesh class. Additionally, there are multiple visualization routines that have been in-cluded in the Mesh class for rapid visualization and interrogation of geophysical fieldsand physical properties (Figure 2.3). We note that these code snippets can be easily becombined in a script, highlighting the versatility and accessibility of the Mesh classesin discretize.This script will be expanded upon and segmented into the various pieces of theframework in the following sections. We find that the development of geophysicalcodes is often iterative and requires ‘scripting’ of equations. Only after these equations48Figure 2.3: Solving the DC resistivity problem for a dipole and using the meshesvisualization routine for the potential, φ , for three different mesh types: (a)TensorMesh, (b) TreeMesh, and (c) CurvilinearMesh. The potential hasbeen interpolated onto the tensor mesh for visualization.are correct, as demonstrated by an appropriate test (e.g. Tests.checkDerivative),do we formalize and segment our script to enable a geophysical inversion to be run.The toolbox that SIMPEG provides promotes this interactive and iterative style of de-velopment.2.3.5 Forward simulationThe forward simulation in SIMPEG is broken up into a Survey class and a Problemclass. The Problem class contains the information and code that capture both thephysics used to describe the connection between a physical property distribution andthe fields/fluxes that are measured in a geophysical survey. The Survey class con-tains information about the observed data and the geometry of how to collect the data(e.g. locations and types of receivers and sources) given a Problem that simulatesfields. The Problem and the Survey must be paired together to simulate predicteddata. We decided on this separation of the code because it is possible to have multiplemathematical descriptions, of varying complexities, which explain the same observeddata. For example, a seismic simulation could have multiple approximations to the49physics, which increase in complexity and accuracy, from straight-ray tomography orEikonal tomography to full waveform simulation. Additionally, there are often mul-tiple types of geophysical surveys that could be simulated from the same Problemclass.Table 2.2: Base Problem class properties with explanations.Property or Function Explanationfields(m) Calculation of the fields given a modelJvec(m, v) Sensitivity times a vectorJtvec(m, v) Adjoint sensitivity times a vectorJfull(m) Full sensitivity matrixmapping Maps the model to a physical propertyThe crucial aspects of the Problem class are shown in Table 2.2 and the proper-ties and methods of the Survey class are shown in Table 2.3. We note that each ofthe sub-classes of Problem will implement fields and sensitivities in a different way,likely with additional methods and properties. Furthermore, the choice of terminologybecomes clearer when these classes are inherited and used in a specific geophysicalmethod (e.g. a DCProblem or EMProblem). For the DCProblem, the fieldscan be created by constructing A(m) and solving with the source terms, Q, whichwill be provided by the DCSurvey’s source list (srcList). Each source has atleast one receiver associated with it and the receivers can create a matrix, P, whichproject the fields, u, onto the data-space. For example, in the DC problem, a dipolereceiver samples the potential at each electrode location and computes the differenceto give a datum. We note that the process of computing a datum may be more in-volved and have derivatives with respect the computed fields and, possibly, the model.We solve much of the organizational bottlenecks through general receiver and sourceclasses, which can be inherited and tailored to the specific application. The mapping50in the Problem provides a transformation from an arbitrary model to a discretizedgrid function of physical properties. For example, log-conductivity is often used inthe inverse problem for DC resistivity, rather than parameterizing directly in terms ofconductivity. If this choice is made for the model, an appropriate map (i.e. the expo-nential) must be provided to transform from the model space to the physical propertyspace (cf. Heagy et al. (2014)).Table 2.3: Selected Survey class properties with explanations.Property or Function Explanationdobs, nD dobs, number of datastd Estimated standard deviationssrcList List of sources with associated receiversdpred(m) Predicted data given a model, dpred(m)projectFields(m, u) Projects the fields, P(m,u)projectFieldsDeriv(m,u)Derivative of the projection, dP(m,u)dmresidual(m) dpred(m)−dobs2.3.6 DC resistivity forward simulationWe present a simple DC-resistivity survey to demonstrate some of the componentsof SIMPEG in action. We use a set of Schlumberger arrays to complete a verticalsounding. In this example, we have taken our scripts from the previous section de-scribing the forward simulation and combined them in a package called SimPEG.DC( We use the 3D tensor mesh to run the forward simulation for thedata of this problem. Here the srcList is a list of dipole sources (DC.SrcDipole),each of which contains a single receiver, (DC.RxDipole). Similar to the illustrationin Figure 2.2, the Problem and the Survey must be paired for either to be used tosimulate fields and/or data. These elements represent the major pieces of any forward51from SimPEG.EM.Static import DCsurvey = DC.SurveyDC(srcList)problem = DC.ProblemDC(mesh)problem.pair(survey)data = survey.dpred(sigma)Program 2.4: Pairing the Problem and Survey objects to create predicteddata, dpred.simulation in geophysics; they are crucial and must be well-tested for accuracy andefficiency before any attempt is made at setting up the inverse problem.2.3.7 SensitivitiesThe sensitivity and adjoint will be used in the optimization routine of the inversion.Inefficient or inaccurate calculation of the sensitivities can lead to an extremely slowinversion. This is critical in large-scale inversions, where the dense sensitivity ma-trix may be too large to hold in memory directly. As discussed in the methodologysection, the sensitivity matrix need not be explicitly created when using an iterativeoptimization algorithm, such as Gauss-Newton (2.13), solved with a conjugate gradi-ent approach. The calculation of vector products with the sensitivity matrices is animportant aspect of SIMPEG, which has many tools to make construction and testingof these matrices modular and simple. For the DC resistivity example, the discretizedgoverning equations are written as: C(m,u) = A(m)u−q = 0. We can implement thesensitivity equations 2.15 and 2.17 to yield:J =−P(A(m)−1∇mC(m,u)), (2.20)where ∇mC(m,u) is a known sparse matrix, A(m) is the forward operator and isequivalent to ∇uC(m,u), and P is a projection matrix (cf. Pidlisecky et al. (2007)).The sensitivity matrix is dense and holding it in memory may not be possible. If an52iterative solver is used in the optimization, only matrix vector products are necessaryand the sensitivity need not be explicitly calculated or stored. Program 2.5 outlinesthe calculation of Jvec, given a model, m, the fields, u, and a vector to multiply, v.In Program 2.5, we draw the distinction between the model, m, and the conductivity,sig, which are connected through a mapping, σ =M (m), and associated derivatives.The matrix, ∇mC(m,u), is denoted dCdm and formed by looping over each source inthe DC resistivity survey.1 def Jvec(self, m, v, u=None):2 # Set current model; clear dependent property A(m)3 self.curModel = m4 sigma = self.curModel.transform # σ =M (m)5 if u is None:6 # Run forward simulation if u not provided7 u = self.fields(self.curModel)8 else:9 shp = (self.mesh.nC, self.survey.nTx)10 u = u.reshape(shp, order=’F’)1112 D = self.mesh.faceDiv13 G = self.mesh.cellGrad14 # Derivative of model transform, ∂σ∂m15 dsigdm_x_v = self.curModel.transformDeriv * v1617 # Take derivative of C(m,u) w.r.t. m18 dCdm_x_v = np.empty_like(u)19 # loop over fields for each transmitter20 for i in range(self.survey.nTx):21 # Derivative of inner product,(M f1/σ)−122 dAdsig = D * self.dMdsig( G * u[:,i] )23 dCdm_x_v[:, i] = dAdsig * dsigdm_x_v2425 # Take derivative of C(m,u) w.r.t. u26 dCdu = self.A27 # Solve for ∂u∂m28 dCdu_inv = self.Solver(dCdu, **self.solverOpts)29 P = self.survey.getP(self.mesh)30 J_x_v = - P * mkvc( dCdu_inv * dCdm_x_v )31 return J_x_vProgram 2.5: Sensitivity times a vector method for the DCProblem.532.3.8 Inversion elementsAs indicated in the methodology section, there are two key elements needed for ageophysical inversion: DataMisfit and Regularization. The DataMisfitmust have a way to calculate predicted data and, as such, it takes a paired survey as aninitial argument, which allows forward simulations to be completed. DataMisfitand Regularization have similar interfaces, which are shown in Table 2.4. TheDataMisfit class also has a property, targetMisfit, for the target misfit, whichcan be checked by an InversionDirective and used as a stopping criteria. Asdiscussed in the methodology section, the Regularization is defined indepen-dently from the forward simulation. The regularization is with respect to the model,which may or may not be on the same mesh as the forward simulation (i.e. meshI 6=meshF ). In this case, a mapping of a model to a physical property on the forward sim-ulation mesh is necessary for the Problem. The Regularization class also has amapping property, which allows a wide variety of regularizations to be implemented(e.g. an active cell map used to ignore air cells). As such, the Regularizationmapping is often independent from the mapping in the Problem class, which out-puts a physical property. Included in the SIMPEG package are basic Tikhonov regular-ization routines and simple l2 norms for both Regularization and DataMisfitclasses. Each of these classes has properties for the appropriate model and data weight-ings, as discussed in the previous section (e.g. Wm and Wd). These classes are readilyextensible, such that they can be customized to specific problems and applications (forexample, considering l1 or lp norms or customized regularizations).54Table 2.4: Common functions for the Regularization, and DataMisfitclasses.Function Explanationobj(m) Evaluate the functional given a model when the class iscalled directly.obj.deriv(m) First derivative returns a vector.obj.deriv2(m, v) Second derivative as an implicit operator.2.3.9 Inverse problem and optimizationThe InvProblem combines the DataMisfit and Regularization classes byintroducing a trade-off parameter, β . In addition to the trade-off parameter, there aremethods that evaluate the objective function and its derivatives (Table 2.4). Additionalmethods can save fields so that information is not lost between evaluation of the ob-jective function and the derivatives. The InvProblem may also include bounds onthe model properties so that they can be used in the optimization routine. If we con-sider a joint or integrated inversion, multiple data misfit functions, employing differentphysics, and that multiple types of regularization functionals may be summed together,possibly with relative weightings, we can define the InvProblem (cf. Lines et al.(1988); Holtham and Oldenburg (2010); Heagy et al. (2014)). Once the InvProblemcan be evaluated to a scalar with associated derivatives, an Optimization can ei-ther be chosen among the ones included in SIMPEG or provided by an external pack-age. Optimization routines in SIMPEG include steepest descent, L-BFGS, and InexactGauss-Newton (cf. Nocedal and Wright (1999)). The components are relatively sim-ple to hook up to external optimization packages (for example, with the optimizationpackage in SciPy (Jones et al., 2001)).552.3.10 InversionThe Inversion conducts all communication between the various components of theframework and is instantiated with an InvProblem object. The Inversion hasvery few external methods but contains the list of directives that are executed through-out the inversion. Each InversionDirective has access to the components of theinversion framework and can thus access and change any of these components whilethe inversion is running. A simple directive may print optimization progress or savemodels to a database. More complicated directives may change or compute parame-ters such as, β , reference models, data weights, or model weights. These directives areoften guided by heuristics, but versions can often be formalized (see, for example, theiterative Tikhonov style inversion (Tikhonov and Arsenin, 1977; Parker, 1994; Olden-burg and Li, 2005)). There are many computational shortcuts that may be investigated,such as how many inner and outer CG iterations to complete in the inexact Gauss-Newton optimization and whether the number of iterations should change as the al-gorithm converges to the optimal model. The directiveList in the Inversionencourages heuristics, which geophysicists often complete ‘by hand’, to be codified,combined, and shared via a plug-in style framework.2.3.11 DC resistivity inversionWe will build on the example presented in Section 2.3.6, which has a survey setupthat only provides enough information for a vertical sounding. As such, we willdecouple our 3D forward mesh and 1D inversion mesh and connect them through amapping (cf. Kang et al. (2015b)). Additionally, since electrical conductivity is alog-varying parameter, we will also construct a model space that is optimized in logspace. Both of these model transformations will be handled with a single map, M ,56where σ =M (m). We have provided a number of common mapping transformationsfrom SimPEG import Mapsmapping = Maps.ExpMap(mesh) * Maps.SurjectVertical1D(mesh)sigma = mapping * modelProgram 2.6: Creation and chaining together of multiple mapping properties fora model of σ .in the SimPEG.Maps package and these can be easily combined with a multiplicationsymbol. Additionally, when using these maps, we calculate the derivatives using thechain rule, allowing them to be easily included in the sensitivity calculation (cf. Pro-gram 2.5, line 15). Figure 2.5 demonstrates this mapping visually. The 1D model is inlog(σ), shown in Figure 2.4(a) as a black solid line, and the transformation producesa 3D sigma vector, which we plotted in Figure 2.4(b). We can now use the same sim-ulation machinery as discussed in Section 2.3.6, with a single change: Synthetic data,from SimPEG.EM.Static import DCproblem = DC.ProblemDC(mesh, sigmaMap=mapping)Program 2.7: Instantiation of the direct current resistivity problem with a map-ping for the σ property.dobs, are created using the 1D log-conductivity model and adding 1% Gaussian noise.When creating the regularization inversion element, we note again that the mappingparameter can be used to regularize in the space that makes the most sense. In thiscase, we will regularize on a 1D mesh in log-conductivity space; as such, we will sup-ply only a 1D tensor mesh to the regularization. An inversion is run by combining thetools described above. Figure 2.2 illustrates how the components are put together. Wenote that there are many options and inputs that can enhance the inversion; refer to theonline up-to-date documentation ( The result of this inversioncan be seen in Figure 2.5(a) and (b) for the predicted data and model, respectively.57mesh1D = discretize.TensorMesh([mesh.hz])dmis = DataMisfit.l2_DataMisfit(survey)reg = Regularization.Tikhonov(mesh1D)opt = Optimization.InexactGaussNewton()invProb = InvProblem.BaseInvProblem(dmis, reg, opt)inv = Inversion.BaseInversion(invProb)mopt = 2.8: Creating a boiler plate inversion at a low level.Figure 2.4: Illustration of mapping in DC inversion. (a) 1D log conductivitymodel. (b) 3D conductivity model.Figure 2.5: (a) Observed (black line) and predicted (red line) apparent resistivityvalues. (b) True and recovered 1D conductivity model.582.4 ConclusionsProducing an interpretation from geophysical data through an inversion is an iterativeprocess with many moving pieces. A number of inversion components, techniques, andmethodologies have become standard practice. The development of new methodolo-gies to address the evolving challenges in the geosciences will build upon and extendthese standard practices, requiring experimentation with, and recombination of, exist-ing techniques. To facilitate this combinatorial experimentation, we have organized thecomponents of geophysical inverse problems in a comprehensive, modular framework.Our implementation of this framework, SIMPEG (, providesan extensible, well-tested toolbox and infrastructure that supports problems, includingelectromagnetics, fluid flow, seismic, and potential fields. As SIMPEG is formulatedwith the inverse problem as its core focus, many design choices have been made to en-sure that sensitivities are efficient to compute and are readily available; these choiceshave shown to be advantageous for integrated geophysical inversions. The modularframework that we suggest splits the code into components, which are motivated di-rectly by geophysical methodology and terminology. Splitting the code allows eachpiece to be improved by specialists, while promoting quantitative communication be-tween researchers.To accelerate the dissemination and adoption of SIMPEG in the wider community,we have made the entire project open source under the permissive MIT License. Theusability of this framework has been a focus of SIMPEG and we strive to use best prac-tices of continuous integration, documentation (, unit-testing,and version-control. These practices are key to have in place as more modules andpackages are created by the community.59Chapter 3Richards equation3.1 IntroductionStudying the processes that occur in the vadose zone, the region between the earth’ssurface and the fully saturated zone, is of critical importance for understanding ourgroundwater resources. Fluid flow in the vadose zone is described by the Richardsequation and parameterized by hydraulic conductivity, which is a nonlinear functionof pressure head (Richards, 1931; Celia et al., 1990). Typically, hydraulic conduc-tivity is heterogeneous and can have a large dynamic range. In any site characteri-zation, the spatial estimation of the hydraulic conductivity function is an importantstep. Achieving this, however, requires the ability to efficiently solve and optimize thenonlinear, time-domain Richards equation. Rather than working with a full, implicit,3D time-domain system of equations, simplifications are consistently used to avert theconceptual, practical, and computational difficulties inherent in the parameterizationand inversion of the Richards equation. These simplifications typically parameterizethe conductivity and assume that it is a simple function in space, often adopting a ho-60mogeneous or one dimensional layered soil profile (cf. (Binley et al., 2002; Deianaet al., 2007; Hinnell et al., 2010; Liang and Uchida, 2014)). Due to the lack of con-straining hydrologic data, such assumptions are often satisfactory for fitting observedmeasurements, especially in two and three dimensions as well as in time. However,as more data become available, through spatially extensive surveys and time-lapseproxy measurements (e.g. direct current resistivity surveys and distributed tempera-ture sensing), extracting more information about subsurface hydrogeologic parame-ters becomes a possibility. The proxy data can be directly incorporated through anempirical relation (e.g. (Archie, 1942)) or time-lapse estimations can be structurallyincorporated through some sort of regularization technique (Haber and Gazit, 2013;Haber and Oldenburg, 1997; Hinnell et al., 2010). Recent advances have been madefor the forward simulation of the Richards equation in a computationally-scalable man-ner (Orgogozo et al., 2014). However, the inverse problem is non-trivial, especially inthree-dimensions (Towara et al., 2015), and must be considered using modern numer-ical techniques that allow for spatial estimation of hydraulic parameters. However,this is especially intricate to both derive and implement due to the nonlinear, time-dependent forward simulation and potential model dependence in many aspects of theRichards equation (e.g. multiple empirical relations, boundary/initial conditions). Toour knowledge, there has been no large-scale inversion for distributed hydraulic pa-rameters in three dimensions using the Richards equation as the forward simulation.Inverse problems in space and time are often referred to as history matching prob-lems (see Dean and Chen (2011); Dean et al. (2008); Sarma et al. (2007); Oliver andReynolds (2001); Sˇimunek et al. (2012) and reference within). Inversions use a flowsimulation model, combined with some a-priori information, in order to estimate a spa-tially variable hydraulic conductivity function that approximately yields the observed61data. The literature shows a variety of approaches for this inverse problem, includ-ing trial-and-error, stochastic methods, and various gradient based methods (Bitterlichet al., 2004; Binley et al., 2002; Carrick et al., 2010; Durner, 1994; Finsterle and Zhang,2011; Mualem, 1976; Sˇimunek and van Genuchten, 1996). The way in which the com-putational complexity of the inverse method scales becomes important as problem sizeincreases (Towara et al., 2015). Computational memory and time often become a bot-tleneck for solving the inverse problem, both when the problem is solved in 2D and,particularly, when it is solved in 3D (Haber et al., 2000). To solve the inverse prob-lem, stochastic methods are often employed, which have an advantage in that they canexamine the full parameter space and give insights into non-uniqueness (Finsterle andKowalsky, 2011). However, as the number of parameters we seek to recover in an in-version increases, these stochastic methods require that the forward problem be solvedmany times, which often makes these methods impractical. This scalability, especiallyin the context of hydrogeophysics has been explicitly noted in the literature (cf. Binleyet al. (2002); Deiana et al. (2007); Towara et al. (2015); Linde and Doetsch (2016)).Derivative-based optimization techniques become a practical alternative when theforward problem is computationally expensive or when there are many parameters toestimate (i.e. thousands to millions). Inverse problems are ill-posed and thus to pose asolvable optimization problem, an appropriate regularization is combined with a mea-sure of the data misfit to state a deterministic optimization problem (Tikhonov andArsenin, 1977). Alternatively, if prior information can be formulated using a statis-tical framework, we can use Bayesian techniques to obtain an estimator through theMaximum A Posteriori model (MAP) (Kaipio and Somersalo, 2004). In the context ofBayesian estimation, gradient based methods are also important, as they can be usedto efficiently sample the posterior (Bui-Thanh and Ghattas, 2015).62A number of authors have sought solutions for the inverse problem, where the for-ward problem is the Richards equation (cf. (Bitterlich and Knabner, 2002; Iden andDurner, 2007; Sˇimunek et al., 2012) and references within). The discretization of theRichards equation is commonly completed by an implicit method in time and a finitevolume or finite element method in space. Most work uses a Newton-like methodfor the resulting nonlinear system, which arises from the discretization of the forwardproblem. For the deterministic inverse problem using the Richards equation, previouswork uses some version of a Gauss-Newton method (e.g. Levenberg-Marquardt), witha direct calculation of the sensitivity matrix (Finsterle and Kowalsky, 2011; Sˇimunekand van Genuchten, 1996; Bitterlich and Knabner, 2002). However, while these ap-proaches allow for inversions of moderate scale, they have one major drawback: thesensitivity matrix is large and dense; its computation requires dense linear algebra anda non-trivial amount of memory (cf. (Towara et al., 2015)). Previous work used eitherexternal numerical differentiation (e.g. PEST) or automatic differentiation in orderto directly compute the sensitivity matrix (Finsterle and Zhang, 2011; Bitterlich andKnabner, 2002; Doherty, 2004; Towara et al., 2015). Finite difference can generateinaccuracies in the sensitivity matrix and, consequently, tarry the convergence of theoptimization algorithm. Furthermore, external numerical differentiation is computa-tionally intensive and limits the number of model parameters that can be estimated.The goal of this chapter is to suggest a modern numerical formulation that allowsthe inverse problem to be solved without explicit computation of the sensitivity matrixby using exact derivatives of the discrete formulation (Haber et al., 2000). Our tech-nique is based on the discretize-then-optimize approach, which discretizes the forwardproblem first and then uses a deterministic optimization algorithm to solve the inverseproblem (Gunzburger, 2003). To this end, we require the discretization of the forward63problem. Similar to the work of (Celia et al., 1990), we use an implicit Euler methodin time and finite volume in space. Given the discrete form, we show that we can an-alytically compute the derivatives of the forward problem with respect to distributedhydraulic parameters and, as a result, obtain an implicit formula for the sensitivity. Theformula involves the solution of a linear time-dependent problem; we avoid computingand storing the sensitivity matrix directly and, rather, suggest a method to efficientlycompute the product of the sensitivity matrix and its adjoint times a vector. Equippedwith this formulation, we can use a standard inexact Gauss-Newton method to solvethe inverse problem for distributed hydraulic parameters in 3D. This large-scale dis-tributed parameter estimation becomes computationally tractable with the techniquepresented in this chapter and can be employed with any iterative Gauss-Newton-likeoptimization technique.This chapter is structured as follows: in Section 3.2, we discuss the discretizationof the forward problem on a staggered mesh in space and backward Euler in time; inSection 3.3, we formulate the inverse problem and construct the implicit functions usedfor computations of the Jacobian-vector product. In Section 3.4.1, we demonstrate thevalidity of the implementation of the forward problem and sensitivity calculation. InSection 3.4, we validate the numerical implementation and compare to the literature.Chapter 4 will expand upon the techniques introduced in this chapter to show theeffectiveness of the implicit sensitivity algorithm in comparison to existing numericaltechniques.3.1.1 Attribution and disseminationTo accelerate both the development and dissemination of this approach, we have builtthese tools on top of an open source framework for organizing simulation and inverse64problems in geophysics (Cockett et al., 2015c). We have released our numerical im-plementation under the permissive MIT license. Our implementation of the implicitsensitivity calculation for the Richards equation and associated inversion implementa-tion is provided and tested to support 1D, 2D, and 3D forward and inverse simulationswith respect to custom empirical relations and sensitivity to parameters within thesefunctions. The source code can be found at and maybe a helpful resource for researchers looking to use or extend our implementation. Ihave presented early versions of this work at two international conferences (Cockettand Haber, 2013a,b) and have submitted a version of this manuscript for peer review(Cockett et al., 2017).3.2 Forward problemIn this section, we describe the Richards equations and its discretization (Richards,1931). The Richards equation is a nonlinear parabolic partial differential equation(PDE) and we follow the so-called mixed formulation presented in (Celia et al., 1990)with some modifications. In the derivation of the discretization, we give special atten-tion to the details used to efficiently calculate the effect of the sensitivity on a vector,which is needed in any derivative based optimization algorithm.3.2.1 Richards equationThe parameters that control groundwater flow depend on the effective saturation ofthe media, which leads to a nonlinear problem. The groundwater flow equation hasa diffusion term and an advection term which is related to gravity and only acts inthe z-direction. There are two different forms of the Richards equation; they differ inhow they deal with the nonlinearity in the time-stepping term. Here, we use the most65fundamental form, referred to as the ‘mixed’-form of the Richards equation (Celiaet al., 1990):∂θ(ψ)∂ t−∇ · k(ψ)∇ψ− ∂k(ψ)∂ z= 0 ψ ∈Ω (3.1)where ψ is pressure head, θ(ψ) is volumetric water content, and k(ψ) is hydraulicconductivity. This formulation of the Richards equation is called the ‘mixed’-formbecause the equation is parameterized in ψ but the time-stepping is in terms of θ . Thehydraulic conductivity, k(ψ), is a heterogeneous and potentially anisotropic functionthat is assumed to be known when solving the forward problem. In this chapter, weassume that k is isotropic, but the extension to anisotropy is straightforward (Cockettet al., 2015c, 2016a). The equation is solved in a domain, Ω, equipped with boundaryconditions on ∂Ω and initial conditions, which are problem-dependent.An important aspect of unsaturated flow is noticing that both water content, θ ,and hydraulic conductivity, k, are functions of pressure head, ψ . There are manyempirical relations used to relate these parameters, including the Brooks-Corey model(Brooks and Corey, 1964) and the van Genuchten-Mualem model (Mualem, 1976; vanGenuchten, 1980). The van Genuchten model is written as:θ(ψ) =θr +θs−θr(1+ |αψ|n)m ψ < 0θs ψ ≥ 0(3.2a)k(ψ) =Ksθe(ψ)l(1− (1−θe(ψ)−m)m)2 ψ < 0Ks ψ ≥ 0(3.2b)66whereθe(ψ) =θ(ψ)−θrθs−θr , m = 1−1n, n > 1 (3.3)Here, θr and θs are the residual and saturated water contents, Ks is the saturated hy-draulic conductivity, α and n are fitting parameters, and, θe(ψ) ∈ [0,1] is the effectivesaturation. The pore connectivity parameter, l, is often taken to be 12 , as determined byMualem (1976). Figure 4.1 shows the functions over a range of negative pressure headvalues for four soil types (sand, loam, sandy clay, and clay). The pressure head variesover the domain ψ ∈ (−∞,0). When the value is close to zero (the left hand side),the soil behaves most like a saturated soil where θ = θs and k = Ks. As the pressurehead becomes more negative, the soil begins to dry, which the water retention curveshows as the function moving towards the residual water content (θr). Small changesin pressure head can change the hydraulic conductivity by several orders of magnitude;as such, k(ψ) is a highly nonlinear function, making the Richards equation a nonlinearPDE.3.2.2 DiscretizationThe Richards equation is parameterized in terms of pressure head, ψ . Here, we de-scribe simulating the Richards equation in one, two, and three dimensions. We startby discretizing in space and then we discretize in time. This process yields a discrete,nonlinear system of equations; for its solution, we discuss a variation of Newton’smethod.Spatial DiscretizationIn order to conservatively discretize the Richards equation, we introduce the flux ~f andrewrite the equation as a first order system of the form:67Figure 3.1: The water retention curve and the hydraulic conductivity function forfour canonical soil types of sand, loam, sandy clay, and clay.∂θ(ψ)∂ t−∇ ·~f − ∂k(ψ)∂ z= 0 (3.4a)k(ψ)−1~f = ∇ψ (3.4b)We then discretize the system using a standard staggered finite volume discretiza-tion (cf. Ascher (2008); Haber (2015); Cockett et al. (2016a), and Appendix B). Thisdiscretization is a natural extension of mass-conservation in a volume where the bal-ance of fluxes into and out of a volume are conserved (Lipnikov and Misitas, 2013).Here, it is natural to assign the entire cell one hydraulic conductivity value, k, which islocated at the cell center. Such assigning leads to a piecewise constant approximationfor the hydraulic conductivity and allows for discontinuities between adjacent cells.From a geologic perspective, discontinuities are prevalent, as it is possible to have68large differences in hydraulic properties between geologic layers in the ground. Thepressure head, ψ , is also located at the cell centers and the fluxes are located on cellfaces, which lead to the usual staggered mesh or Marker and Cell (MAC) discretiza-tion in space (Fletcher, 1988). We demonstrate the discretization in 1D, 2D and 3D onthe tensor mesh in Figure 3.2. We discretize the function, ψ , on a cell-centered grid,which results in a grid function, ψ . We use bold letters to indicate other grid functions.Figure 3.2: Discretization of unknowns in 1D, 2D and 3D space. Red circles arethe locations of the discrete hydraulic conductivity K and the pressure headψ . The arrows are the locations of the discretized flux ~f on each cell face.The discretization of a diffusion-like equation on an orthogonal mesh is well-known (see (Haber and Ascher, 2001; Fletcher, 1988; Haber et al., 2007; Ascher andGreif, 2011) and reference within). We discretize the differential operators by usingthe usual mass balance consideration and the elimination of the flux, f 1. This spatialdiscretization leads to the following discrete nonlinear system of ordinary differential1Here we assume an isotropic conductivity that leads to a diagonal mass matrix and this yields easyelimination of the fluxes.69equations (assuming homogeneous Dirichlet boundary conditions):dθ(ψ)dt−D diag(kAv(ψn+1))Gψ−Gz (kAv(ψn+1))= 0 (3.5)Here, D is the discrete divergence operator and G is the discrete gradient operator. Thediscrete derivative in the z-direction is written as Gz. The values of ψ and k(ψ) areknown on the cell-centers and must be averaged to the cell-faces, which we completethrough harmonic averaging (Haber and Ascher, 2001).kAv(ψ) =1Av 1k(ψ)(3.6)where Av is a matrix that averages from cell-centers to faces and the division of thevector is done pointwise; that is, we use the vector notation, (1/v)i = 1/vi. We incor-porate boundary conditions using a ghost-point outside of the mesh (Trottenberg et al.,2001).Time discretization and steppingThe Richards equation is often used to simulate water infiltrating an initially dry soil.At early times in an infiltration experiment, the pressure head, ψ , can be close to dis-continuous. These large changes in ψ are also reflected in the nonlinear terms k(ψ)and θ(ψ); as such, the initial conditions imposed require that an appropriate timediscretization be chosen. Hydrogeologists are often interested in the complete evolu-tionary process, until steady-state is achieved, which may take many time-steps. Here,we describe the implementation of a fully-implicit backward Euler numerical scheme.Higher-order implicit methods are not considered here because the uncertainty associ-ated with boundary conditions and the fitting parameters in the Van Genuchten models70(eq. 4.2) have much more effect than the order of the numerical method used.The discretized approximation to the mixed-form of the Richards equation, usingfully-implicit backward Euler, reads:F(ψn+1,ψn)=θ(ψn+1)−θ(ψn)∆t−D diag(kAv(ψn+1))Gψn+1−Gz (kAv(ψn+1))= 0(3.7)This is a nonlinear system of equations forψn+1 that needs to be solved numerically bysome iterative process. Either a Picard iteration (as in Celia et al. (1990)) or a Newtonroot-finding iteration with a step length control can be used to solve the system. Notethat to deal with dependence of θ with respect to ψ in Newton’s method, we requirethe computation of dθdψ . We can complete this computation by using the analytic formof the hydraulic conductivity and water content functions (e.g. derivatives of eq. 4.2).We note that a similar approach can be used for any smooth curve, even when theconnection between θ and ψ are determined empirically (for example, when θ(ψ) isgiven by a spline interpolation of field data).3.2.3 Solving the nonlinear equationsRegardless of the empirical relation chosen, we must solve 3.7 using an iterative root-finding technique. Newton’s method iterates over m = 1,2, . . . until a satisfactory esti-mation of ψn+1 is obtained. Given ψn+1,m, we approximate F(ψn+1,ψn) as:F(ψn+1,ψn)≈ F(ψn,m,ψn)+Jψn+1,mδψ (3.8)71where the Jacobian for iteration, m, is:Jψn+1,m =∂F(ψ,ψn)∂ψ∣∣∣∣ψn+1,m(3.9)The Jacobian is a large dense matrix, and its computation necessitates the computationof the derivatives of F(ψn+1,m,ψn). We can use numerical differentiation in order toevaluate the Jacobian (or its product with a vector). However, in the context of theinverse problem, an exact expression is preferred. Given the discrete forward problem,we obtain that:Jψn+1,m =1∆tdθ(ψn+1,m)dψn+1,m− ddψn+1,m(Ddiag(kAv(ψn+1,m))Gψn+1,m)−Gz dkAv(ψn+1,m)dψn+1,m(3.10)Here, recall that kAv is harmonically averaged and its derivative can be obtained by thechain rule:dkAv(ψ)dψ= diag((Avk−1(ψ))−2)Av diag(k−2(ψ)) dk(ψ)dψ(3.11)Similarly, for the second term in (3.10) we obtain:∂∂ψ(D diag(kAv(ψ))Gψ) = D diag(kAv(ψ))G+D diag(Gψ)∂kAv(ψ)∂ψ(3.12)Here the notation n+1,m has been dropped for brevity. For the computations above,we need the derivatives of functions k(ψ) and θ(ψ); note that, since the relations areassumed local (point wise in space) given the vector, ψ , these derivatives are diagonal72matrices. For Newton’s method, we solve the linear system:Jψn+1,m δψ = −F(ψn+1,m,ψn) (3.13)For small-scale problems, we can solve the linear system using direct methods;however, for large-scale problems, iterative methods are more commonly used. Theexistence of an advection term in the PDE results in a non-symmetric linear systemin the Newton solve. Thus, when using iterative techniques to solve this system, anappropriate iterative method, such as BICGSTAB or GMRES (Saad, 1996; Barrett et al.,1994), must be used. For a discussion on solver choices in the context of the Richardsequation please see Orgogozo et al. (2014).At this point, it is interesting to note the difference between the Newton iterationand the Picard iteration suggested in (Celia et al., 1990). We can verify that the Pi-card iteration uses an approximation to the Jacobian Jψn+1,m δψ given by dropping thesecond term from (3.12). This term can have negative eigenvalues and dropping it istypically done when considering the lagged diffusivity method (Vogel, 2001). How-ever, as discussed in (Vogel, 2001), ignoring this term can slow convergence.Finally, a new iterate is computed by adding the Newton update to the last iterate:ψn+1,m+1 = ψn+1,m+αδψwhere α is a parameter that guarantees that‖F(ψn,m+1,ψn)‖< ‖F(ψn,m,ψn)‖To obtain α , we perform an Armijo line search (Nocedal and Wright, 1999). In our73numerical experiments, we have found that this method can fail when the hydraulicconductivity is strongly discontinuous and changes rapidly. In such cases, Newton’smethod yields a poor descent direction. Therefore, if the Newton iteration fails to con-verge to a solution, the update is performed with the mixed-form Picard iteration. Notethat Picard iteration can be used, even when Newtons method fails, because Picard it-eration always yields a descent direction (Vogel, 2001).At this point, we have discretized the Richards equation in both time and spacewhile devoting special attention to the derivatives necessary in Newton’s method andthe Picard iteration as described in (Celia et al., 1990). The exact derivatives of thediscrete problem will be used in the following two sections, which outline the implicitformula for the sensitivity and its incorporation into a general inversion algorithm. Theimplementation is provided as a part of the open source SimPEG project (Cockett et al.(2015c), Inverse ProblemThe location and spatial variability of, for example, an infiltration front over time isinherently dependent on the hydraulic properties of the soil column. As such, director proxy measurements of the water content or pressure head at various depths alonga soil profile contain information about the soil properties. We pose the inverse prob-lem, which is the estimation of distributed hydraulic parameters, given either watercontent or pressure data. We frame this problem under the assumption that we wishto estimate hundreds of thousands to millions of distributed model parameters. Due tothe large number of model parameters that we aim to estimate in this inverse problem,Bayesian techniques or external numerical differentiation, such as the popular PESTtoolbox (Doherty, 2004), are not computationally feasible. Instead, we will employ a74direct method by calculating the exact derivatives of the discrete the Richards equa-tion and solving the sensitivity implicitly. For brevity, we show the derivation of thesensitivity for an inversion model of only saturated hydraulic conductivity, Ks, frompressure head data, dobs. This derivation can be readily extended to include the use ofwater content data and inverted for other distributed parameters in the heterogeneoushydraulic conductivity function. We will demonstrate the sensitivity calculation formultiple distributed parameters in the numerical examples (Section 4.5).The Richards equation simulation produces a pressure head field at all points inspace as well as through time. Data can be predicted, dpred, from these fields and com-pared to observed data, dobs. To be more specific, we let Ψ= [(ψ1)>, . . . ,(ψnt )>]> bethe (discrete) pressure field for all space and nt time steps. When measuring pressurehead recorded only in specific locations and times, we define the predicted data, dpred,as dpred = PΨ(m). Here, the vector m is the vector containing all of the parameterswhich we are inverting for (e.g. Ks,α,n,θr, or θs when using the van Genuchten em-pirical relation). The matrix, P, interpolates the pressure head field, Ψ, to the locationsand times of the measurements. Since we are using a simple finite volume approachand backward Euler in time, we use linear interpolation in both space and time to com-pute dpred from Ψ. Thus, the entries of the matrix P contain the interpolation weights.For linear interpolation in 3D, P is a sparse matrix that contains up to eight non-zeroentries in each row. Note that the time and location of the data measurement is inde-pendent and decoupled from the numerical discretization used in the forward problem.A water retention curve, such as the van Genuchten model, can be used for computa-tion of predicted water content data, which requires another nonlinear transformation,dθpred = Pθ(Ψ(m),m). Note here that the transformation to water content data, in gen-eral, depends on the model to be estimated in the inversion, which will be addressed75in the numerical examples. For brevity in the derivation that follows, we will maketwo simplifying assumptions: (1) that the data are pressure head measurements, whichrequires a linear interpolation that is not dependent on the model; and, (2) that themodel vector, m, describes only distributed saturated hydraulic conductivity. Our soft-ware implementation does not make these assumptions; our numerical examples willuse water content data, a variety of empirical relations, and calculate the sensitivity tomultiple heterogeneous empirical parameters.We can now formulate the discrete inverse problem to estimate saturated hydraulicconductivity, m, from the observed pressure head data, dobs. We frame the inversionas an optimization problem, which minimizes a data misfit and a regularization term.Chapter 2 showed an approach for geophysical inversions where hundreds of thou-sands to millions of distributed parameters are commonly estimated in a determinis-tic inversion (Tikhonov and Arsenin, 1977; Oldenburg and Li, 2005; Constable et al.,1987; Haber, 2015). Please refer to the previous chapter for the details of this inversionmethodology. The hydrogeologic literature also shows the use of these techniques;however, there is also a large community advancing stochastic inversion techniquesand geologic realism (cf. Linde et al. (2015)). Regardless of the inversion algorithmused, an efficient method to calculate the sensitivity is crucial; this method is the focusof our work.3.3.1 Implicit sensitivity calculationThe optimization problem requires the derivative of the pressure head with respect tothe model parameters, ∂Ψ∂m . We can obtain an approximation of the sensitivity ma-trix through a finite difference method on the forward problem (Sˇimunek and vanGenuchten, 1996; Finsterle and Kowalsky, 2011; Finsterle and Zhang, 2011). One76forward problem, or two, when using central differences, must be completed for eachcolumn in the Jacobian at every iteration of the optimization algorithm. This style ofdifferentiation proves advantageous in that it can be applied to any forward problem;however, it is highly inefficient and introduces errors into the inversion that may slowthe convergence of the scheme (Doherty, 2004). Automatic differentiation (AD) canalso be used (Nocedal and Wright, 1999). However, AD does not take the structure ofthe problem into consideration and often requires that the dense Jacobian be explicitlyformed. Bitterlich and Knabner (2002) presents three algorithms (finite difference, ad-joint, and direct) to directly compute the elements of the dense sensitivity matrix forthe Richards equation. As problem size increases, the memory required to store thisdense matrix often becomes a practical computational limitation (Haber et al., 2004;Towara et al., 2015). As we show next, it is possible to explicitly write the derivativesof the Jacobian and evaluate their products with vectors using only sparse matrix op-erations. This algorithm is much more efficient than finite differencing, especially forlarge-scale simulations, since it does not require explicitly forming and storing a largedense matrix. Rather, the algorithm efficiently computes matrix-vector and adjointmatrix-vector products with sensitivity. We can use these products for the solution ofthe Gauss-Newton system when using the conjugate gradient method, which bypassesthe need for the direct calculation of the sensitivity matrix and makes solving large-scale inverse problems possible. Other geophysical inverse problems have used thisidea extensively, especially in large-scale electromagnetics (cf. Haber et al. (2000)).The challenge in both the derivation and implementation for the Richards equationlies in differentiating the nonlinear time-dependent forward simulation with respect tomultiple distributed hydraulic parameters.The approach to implicitly constructing the derivative of the Richards equation in77time involves writing the whole time-stepping process as a block bi-diagonal matrixsystem. The discrete Richards equation can be written as a function of the model. Fora single time-step, the equation is written:F(ψn+1(m),ψn(m),m) =θ n+1(ψn+1)−θ n(ψn)∆t− D diag(kAv(ψn+1,m))Gψn+1−GzkAv(ψn+1,m) = 0 (3.14)In this case, m is a vector that contains all the parameters of interest. Note that ψn+1and ψn are also functions of m. In general, θ n+1 and θ n are also dependent on themodel; however, for brevity, we will omit these derivatives. The derivatives of F to thechange in the parameters m can be written as:∇mF(ψn,ψn+1,m) =∂F∂kAv∂kAv∂m+∂F∂ψn∂ψn∂m+∂F∂ψn+1∂ψn+1∂m= 0 (3.15)or, in more detail:1∆t(∂θ n+1∂ψn+1∂ψn+1∂m− ∂θn∂ψn∂ψn∂m)−D diag(Gψn+1)(∂kAv∂m+∂kAv∂ψn+1∂ψn+1∂m)− D diag(kAv(ψn+1))G∂ψn+1∂m −Gz(∂kAv∂m+∂kAv∂ψn+1∂ψn+1∂m)= 0(3.16)The above equation is a linear system of equations and, to solve for dΨdm , we rearrangethe block-matrix equation:78A0(ψn+1)︷ ︸︸ ︷[1∆t∂θ n+1∂ψn+1−D diag(Gψn+1) ∂kAv∂ψn+1−D diag(kAv(ψn+1,m))G−Gz ∂kAv∂ψn+1]∂ψn+1∂m+[− 1∆t∂θ n∂ψn]︸ ︷︷ ︸A−1(ψn)∂ψn∂m=[−D diag(Gψn+1) ∂kAv∂m−Gz∂kAv∂m]︸ ︷︷ ︸B(ψn+1)(3.17)Here, we use the subscript notation of A0(ψn+1) and A−1(ψn) to represent two block-diagonals of the large sparse matrix A(Ψ,m). Note that all of the terms in these ma-trices are already evaluated when computing the Jacobian of the Richards equations inSection 3.2 and that they contain only basic sparse linear algebra manipulations with-out the inversion of any matrix. If ψ0 does not depend on the model, meaning theinitial conditions are independent, then we can formulate the block system as:A(Ψ,m)︷ ︸︸ ︷A0(ψ1)A−1(ψ1) A0(ψ2)A−1(ψ2) A0(ψ3). . . . . .A−1(ψnt−1) A0(ψnt )∂Ψ∂m︷ ︸︸ ︷∂ψ1∂m∂ψ2∂m...∂ψnt−1∂m∂ψnt∂m=B(Ψ,m)︷ ︸︸ ︷B1(ψ1)B2(ψ2)...Bn−1(ψnt−1)Bn(ψnt )(3.18)This is a block matrix equation; both the storage and solve will be expensive if it isexplicitly computed. Indeed, its direct computation is equivalent to the adjoint method(Bitterlich and Knabner, 2002; Dean and Chen, 2011).79Since only matrix vector products are needed for the inexact Gauss-Newton opti-mization method, the matrix J is never needed explicitly and only the products of theform Jv and J>z are needed for arbitrary vectors v and z. Projecting the full sensitivitymatrix onto the data-space using P results in the following equations for the Jacobian:J = PA(Ψ,m)−1B(Ψ,m) (3.19a)J> = B(Ψ,m)>A(Ψ,m)−>P> (3.19b)In these equations, we are careful to not write dΨdm , as it is a large dense matrix whichwe do not want to explicitly compute or store. Additionally, the matrices A(Ψ,m) andB(Ψ,m) do not even need to be explicitly formed because the matrix A(Ψ,m) is atriangular block-system, which we can solve using forward or backward substitutionwith only one block-row being solved at a time (this is equivalent to a single time step).To compute the matrix vector product, Jv, we use a simple algorithm:1. Given the vector v calculate y = Bv2. Solve the linear system Aw = y for the vector w3. Set Jv = PwHere, we note that we complete steps (1) and (2) using a for-loop with only oneblock-row being computed and stored at a time. As such, only the full solution, Ψ, isstored and all other block-entries may be computed as needed. There is a complicationhere if data is in terms of water content or effective saturation, as the data projection isno longer linear and may have model dependence. These complications can be dealt80with using the chain rule during step (3). Similarly, to compute the adjoint J>z involvesthe intermediate solve for y in A>y = P>z and then computation of B>y. Again, wesolve the block-matrix via backward substitution with all block matrix entries beingcomputed as needed. Note that the backward substitution algorithm can be viewed astime stepping, which means that it moves from the final time back to the initial time.This time stepping is equivalent to the adjoint method that is discussed in Dean andChen (2011) and references within. The main difference between our approach andthe classical adjoint-based method is that our approach yields the exact gradient of thediscrete system; no such guarantee is given for adjoint-based methods.The above algorithm and the computations of all of the different derivatives sum-marizes the technical details of the computations of the sensitivities. Equipped withthis “machinery”, we now demonstrate that validity of our implementation.3.4 Numerical resultsThe focus of this section is to validate and compare our algorithm and implementationto the literature. The following chapter will focus on applications of this work as wellas demonstrate computational scalability of the algorithm for realistic field examples(Chapter 4).3.4.1 ValidationForward problemThe Richards equation has no analytic solution, which makes testing the code moreinvolved. Here we have chosen to use a fictitious source experiment to rigorously testthe code. In this experiment, we approximate an infiltration front by an arctangent81function in one dimension, which is centered over the highly nonlinear part of the vanGenuchten curves, with ψ ∈ [−60,−20] centimeters. The arctangent curve advectsinto the soil column with time. The advantage of using an analytic function is that thederivative is known explicitly and can be calculated at all times. However, it shouldbe noted that the Richards equation does not satisfy the analytic solution exactly, butdiffers by a function, S(x, t). We refer to this function as the fictitious source term.The analytic function that we used has similar boundary conditions and shape to anexample in Celia et al. (1990) and is considered over the domain x ∈ [0,1].Ψtrue(x, t) =−20arctan(20((x−0.25)− t))−40 (3.20)This analytic function is shown at times 0 and 0.5 in Figure 3.3 and has a pressurehead range of ψ ∈ [−60,−20]. We can compare these values to the van Genuchtencurves in Figure 4.1. We can then put the known pressure head into the Richardsequation (3.1) and calculate the analytic derivatives and equate them to a source term,S(x, t). Knowing this source term and the analytic boundary conditions, we can solvediscretized form of the Richards equation, which should reproduce the analytic func-tion in Equation 3.20. Table 3.1 shows the results of the fictitious source test when thenumber of mesh-cells is doubled and the time-discretization is both fixed and equiv-alent to the mesh size (i.e. k = h). In this case, we expect that the backward-Eulertime discretization, which is O(δ ), will limit the order of accuracy. The final columnof Table 3.1 indeed shows that the order of accuracy is O(δ ). The higher errors in thecoarse discretization are due to high discontinuities and changes in the source term,which the coarse discretization does not resolve. We can complete a similar procedurein two and three dimensions and these tests show similar results of convergence. The82rigorous testing of the code presented provides confidence in the forward simulationthat is used throughout the following sections of this chapter.Figure 3.3: Fictitious source test in 1D showing the analytic function Ψtrue attimes 0.0 and 0.5 and the numerical solution Ψ(x,0.5) using the mixed-form Newton iteration.Table 3.1: Fictitious source test for Richards equation in 1D using the mixed-form Newton iteration.Mesh Size (n) ||Ψ(x,0.5)−Ψtrue(x,0.5)||∞ Order Decrease, O(δ )64 5.485569e+00128 2.952912e+00 0.894256 1.556827e+00 0.924512 8.035072e-01 0.9541024 4.086729e-01 0.9752048 2.060448e-01 0.9884096 1.034566e-01 0.9948192 5.184507e-02 0.99783Inverse problemIn order to test the implicit sensitivity calculation, we employ derivative and adjointtests as described in Haber (2015). Given that the Taylor expansion of a functionf (m+h∆m) isf (m+h∆m) = f (m)+hJ∆m+O(h2), (3.21)for any of the model parameters considered, we see that our approximation of f (m+h∆m) by f (m)+ hJ∆m should converge as O(h2) as h is reduced. This allows us toverify our calculation of Jv. To verify the adjoint, J>v, we check thatw>Jv = v>J>w (3.22)for any two random vectors, w and v. These tests are run for all of the parametersconsidered in an inversion of the Richards equation. Within our implementation, boththe derivative and adjoint tests are included as unit tests which are run on any updatesto the implementation ( Comparison to literatureCode-to-code comparisons have been completed for comparison to Celia et al. (1990),which can be found in Cockett (2017). The following results are a direct comparisonto the results produced by Celia et al. (1990) for the Picard iteration only; Celia et al.(1990) did not implement the Newton iteration. The direct comparison is completed:(a) to give confidence to the numerical results; (b) to compare the Newton iterationto the Picard iteration of the mixed formulation for a well-known example; and (c)demonstrate the use of multiple empirical models. Here, we use the Haverkamp model(Haverkamp et al., 1977) (rather than the classically used van Genuchten model) for84the water retention and hydraulic conductivity functions.θ(ψ) =α(θs−θr)α+ |ψ|β +θrK(ψ) = KsAA+ |ψ|γ(3.23)We used parameters of α = 1.611× 106, θs = 0.287, θr = 0.075, β = 3.96, A =1.175× 106, γ = 4.74, and Ks = 9.44× 10−5 m/s, which are the same as in Celiaet al. (1990). The 40 cm high 1D soil column has initially dry conditions with a pres-sure head ψ0(x,0) = −61.5cm. The boundary conditions applied are inhomogeneousDirichlet with the top of the soil column, ψ(40cm, t) = −20.7cm, and the bottom ofthe soil column, ψ(0cm, t) =−61.5cm. The initial conditions are not consistent withthe boundary condition at the top of the soil profile. This inconsistency leads to aboundary layer and a steep gradient in the pressure head at early times; as such, weanticipate that the Newton iteration will converge slowly at these times. The spatialgrid is regular and has a spacing of 1.0cm, while the time-stepping, ∆t, is manipulated.The three iterative methods described in Section 3.2 are implemented and compared at360s: (1) head-based form Picard; (2) mixed-form Picard; and, (3) mixed-form New-ton. Figure 3.4 shows the solution obtained with the three iterative methods. Com-paring the head-based formulation to the mixed-formulation for a large time-step (e.g.∆t = 120s) shows the degradation of the head-based method. Not only is the infiltrationfront smoothed, there is underestimate of the front location (Figure 3.4a). The under-estimate of the infiltration front location is due to a loss of mass, which can be tracedback the initial formulation of the head-based method (Celia et al., 1990). The mixed-formulation, solved with either a Picard iteration or a Newton method, conserves massand correctly identifies the spatial location of the infiltration front (Figure 3.4b). The85results obtained here show excellent agreement with Celia et al. (1990).Figure 3.4: Comparison of results to Celia et al. (1990) showing the differencesin the (a) head-based and (b) mixed formulations for t=360s.3.5 ConclusionsThe number of parameters that are estimated in the Richards equation inversions hasgrown and will continue to grow as time-lapse data and geophysical data integrationbecome standard in site characterizations. The increase in data quantity and qualityprovides the opportunity to estimate spatially distributed hydraulic parameters fromthe Richards equation; doing so requires efficient simulation and inversion strategies.In this chapter, we have shown a derivative-based optimization algorithm that does notstore the Jacobian, but rather computes its effect on a vector (i.e. Jv or J>z). By notstoring the Jacobian, the size of the problem that we can invert becomes much larger.We have presented efficient methods to compute the Jacobian that can be used for allempirical hydraulic parameters, even if the functional relationship between parametersis obtained from laboratory experiments.Our technique allows a deterministic inversion, which includes regularization, tobe formulated and solved for any of the empirical parameters within the Richards equa-tion. For a full 3D simulation, as many as ten spatially distributed parameters may be86needed, resulting in a highly non-unique problem; as such, we may not be able toreasonably estimate all hydraulic parameters. Depending on the setting, amount ofa-priori knowledge, quality and quantity of data, the selection of which parameters toinvert for may vary. Our methodology enables practitioners to experiment in 1D, 2Dand 3D with full simulations and inversions, in order to explore the parameters that areimportant to a particular dataset. Our numerical implementation is provided in an opensource repository ( and is integrated into the framework presented inChapter 2. The goal of Chapter 4 is to work with these techniques in an experimentthat represents a field infiltration experiment. The following chapter will also furtherdocument the scalability of this approach when moving to 3D inversions.87Chapter 4Vadose zone inversions4.1 IntroductionCharacterizing hydraulic parameters can be completed using direct methods, whichrequire laboratory samples to be taken. However, because much of the vadose zoneconsists of unconsolidated materials, these invasive measurement techniques may dis-turb the soil properties, especially porosity and water content (Deiana et al., 2007).Furthermore, these point measurements may not represent the entire hydrologic settingand cannot observe the vadose zone processes in situ. Geophysical methods, however,can be used to generate spatially extensive estimates of physical properties, which canbe related, through empirical relations, to hydrogeologic properties, such as hydraulicconductivity. We can use geophysical inversions, such as direct current resistivity, toimage the electrical conductivity. The electrical conductivity is related to the soil mois-ture content and the fluid electrical conductivity. This relation is empirically described88through Archie’s equation (Archie, 1942):σ = a−1σ fφmθ ne (4.1)where σ is the bulk electrical conductivity of the fluid filled soil or rock, σ f isthe electrical conductivity of the fluid, φ is the porosity, and θe is the effective satura-tion. The exponents, m and n, and scaling factor, a, are empirical fitting parameters,which are occasionally referred to as the cementation exponent, the saturation expo-nent, and the tortuosity factor, respectively. Archie’s equation, which must be furthermodified in the presence of clays, has three empirical parameters that are either un-known or poorly constrained. If the imaging for electrical conductivity is completedin a time-lapse experiment, we can capture changes of moisture content over timedue to fluid movement. In the vadose zone, the moisture content is the time-steppingterm in the Richards equation, which is related to the pressure head through anotherempirical relation (e.g. the van Genuchten-Mualem empirical relations). Given esti-mates of saturation, an inversion can be formulated for hydraulic parameters using theRichards equation (Chapter 3). The van Genuchten empirical model has five parame-ters (Ks,θr,θs,α,, and n) that are commonly estimated in laboratory experiments. Wecan use these hydraulic parameters in groundwater models to make predictions anddecisions about groundwater processes.In summary, we can use geophysical measurements to make time-lapse images ofwater content, which, in turn, can be used in conjunction with a groundwater sim-ulation to invert for hydraulic properties (cf. Hinnell et al. (2010)). Throughout thisprocess, we require multiple empirical parameterizations as well as extensive hydroge-ologic knowledge, including knowledge of boundary and initial conditions. In Binley89et al. (2002), a cross-well tomography experiment was conducted using radar and DCresistivity over the course of a vadose zone tracer test. The center of mass of the tracerwas found and tracked over time using the geophysical imaging and subsequent empir-ical interpretation as water content changes, ∆θ . A separate groundwater simulationwas completed using the Richards equation to estimate the saturated hydraulic con-ductivity, which was assumed to be homogeneous and isotropic and was determinedthrough “trial and error” because “no automatic data matching was possible” due tothe size of the numerical simulation. Similarly, in Deiana et al. (2007), “only the sat-urated hydraulic conductivity Ks was modified by trial and error to match field data.”In this case, anisotropy was introduced to further fit the results obtained by match-ing the infiltration experiment. Similarly, when investigating coupled hydrogeologicand geophysical simulation, Hinnell et al. (2010) notes that the parameterization was“simpler than reality to make numerical inversion tractable” and that the computationalpower necessary for a stochastic “approach limits widespread application and use ofthe hydrogeophysic[s].” The lack of algorithms that formulate and solve the inverseproblem for the Richards equation is a hurdle in the analysis and joint quantitativeanalysis of hydrologic and geophysical data. In Chapter 3, we presented a formulationthat reduced a number of these barriers to efficient large-scale parameter estimation.However, the numerical ability to invert for five distributed parameters at once is notimmediately practical.In this chapter, we will use the algorithm and formulation of the inverse problemdeveloped in Chapter 3 to investigate a distributed multi-parameter inversion in bothone and three dimensions using water content data. The goal of this work is: (a) toexplore the nonlinearities and couplings in the van Genuchten functions; and, (b) todemonstrate that it is now possible to complete an inversion for distributed hydraulic90parameters in a large-scale 3D simulation. We also refer the reader to Appendix C,where we expose the assumptions in a forward simulation framework to manipulationand estimation. This appendix also demonstrates multiple geologic- and physics-basedparameterizations that can be used to embed knowledge between diverse disciplines.4.1.1 Attribution and disseminationThis work extends the previous chapter in terms of applications, but it also extendsthe forward simulation framework and the inversion framework that is necessary toflexibly invert for multiple distributed parameters at once. We abstracted the necessaryadvancements in the framework from the organization and implementation of elec-tromagnetics inversions in both time domain and frequency domain. The forwardsimulation framework used for all numerical examples is derived from this cross-disciplinary, collaborative work in the Richards equation and electromagnetics. Thiswork is presented in Heagy et al. (2016) for eight formulations of Maxwell’s equationsfor geophysical simulations and inverse problems. Appendix C shows an adaptionof the forward simulation framework for the Richards equation, along with a num-ber of other collaborative case-studies that build upon or extend this work. The li-brary for implementing the numerical methods and inverse formulation is contained inSimPEG.FLOW.Richards ( The code to repro-duce the the majority of the results and figures in this chapter is available on FigShare(Cockett and Haber, 2015); the examples are available within the online documen-tation ( The numerical examples are inspired by work frommy undergraduate thesis, of which two papers were published during the course ofmy PhD (Pidlisecky et al., 2013; Cockett and Pidlisecky, 2014). These examples in-volved a field site in California designed for managed aquifer recharge, which col-91lected dense geophysical data to help inform management practices. The two papers’results showed a qualitative comparison to water content in one dimension (Pidliseckyet al., 2013) and a numerical rock physics approach that analyzed soils as they clogged(Cockett and Pidlisecky, 2014). These two studies informed the numerical setup of thefollowing synthetic experiments.4.2 Empirical relationshipsThe Richards equation relies upon the correct parameterization of both the water re-tention curve and the hydraulic conductivity function. A number of empirical modelshave been proposed to describe these functions, including Brooks and Corey (1964),Haverkamp et al. (1977), van Genuchten (1980), and Mualem (1976). All of theseempirical models are loosely based on the physical interpretation of fitting parameters;however, this basis can be misleading and it has been shown that many parametershave no physical meaning and should be considered empirical shape factors (Schaapand Leij, 2000). These functions have also been interpreted as splines, which can behelpful in the inverse formulation (Bitterlich and Knabner, 2002). The van Genuchtenmodel, introduced in Chapter 3, is written as:θ(ψ) =θr +θs−θr(1+ |αψ|n)m ψ < 0θs ψ ≥ 0(4.2a)k(ψ) =Ksθe(ψ)l(1− (1−θe(ψ)−m)m)2 ψ < 0Ks ψ ≥ 0(4.2b)92whereθe(ψ) =θ(ψ)−θrθs−θr , m = 1−1n, n > 1 (4.3)Here, θr and θs are the residual and saturated moisture contents, Ks is the saturatedhydraulic conductivity, α and n are fitting parameters, and θe(ψ)∈ [0,1] is the effectivesaturation. The pore connectivity parameter, l, is often taken to be 12 , as determined byMualem (1976).These curves are unknown at every point in space in the inverse problem. We willuse a number of canonical parameters for the van Genuchten empirical relation to lookat the water retention curve and the hydraulic conductivity function; Table 4.1 showsthe values for these parameters. The soil-naming scheme refers to the proportionsof sand, silt, and clay. Figure 4.1 shows the functions over a range of negative soilwater potentials for four soil types (sand, loam, sandy clay, and clay). The soil waterpotential varies over the domain ψ ∈ (−∞,0). When the value is close to zero (theleft hand side), the soil behaves most like a saturated soil where θ = θs and K = Ks.As the water potential becomes more negative, the soil begins to dry, which the waterretention curve shows as the function moving towards the residual water content (θr).The parameters α and n determine the slope and shape of this transition. In Figure 4.1,we see that the water retention curve for sand has a high saturated water content butrapidly changes to a low residual water content. For clay, this transition, with respect tothe soil water potential, is much more gradual. This difference has to do with the smallsize of the pores in clay and their ability to retain water, even at high suctions. Thisdifference is reflected in the hydraulic conductivity function: when close to saturatedconditions, sand has the highest hydraulic conductivity, however, as the soil dries, thehydraulic conductivity of sand decreases rapidly and becomes relatively lower than a93loam or clay with the same water potential.Table 4.1: Canonical soil parameters for the water retention and hydraulic con-ductivity curves (Van Genuchten et al., 1991)Soil Type θr θs α (1/m) n Ks (m/s)Sand 0.020 0.417 13.8 1.592 5.8e-05Loamy sand 0.035 0.401 11.5 1.474 1.7e-05Sandy loam 0.041 0.412 6.8 1.322 7.2e-06Loam 0.027 0.434 9.0 1.220 1.9e-06Silt loam 0.015 0.486 4.8 1.211 3.7e-06Sandy clay loam 0.068 0.330 3.6 1.250 1.2e-06Clay loam 0.075 0.390 3.9 1.194 6.4e-07Silty clay loam 0.040 0.432 3.1 1.151 4.2e-07Sandy clay 0.109 0.321 3.4 1.168 3.3e-07Silty clay 0.056 0.423 2.9 1.127 2.5e-07Clay 0.090 0.385 2.7 1.131 1.7e-07Figure 4.1: The water retention curve and the hydraulic conductivity function forfour canonical soil types of sand, loam, sandy clay, and clay.In this chapter, we are interested in the inverse problem for the soil parameters,which controls the shape and intercepts of these two functions. To solve this problem94systematically, we will first observe how the shape of these curves change as the empir-ical parameters are modified over the full empirical range. In Figure 4.2, we completethis process for the hydraulic conductivity function for the four soil types (sand, loam,sandy clay, and clay). The bounds show the change from that curve by varying a singleparameter and holding the rest constant. The saturated hydraulic conductivity, Ks, canmove the entire curve up and down. The parameters α and n control the shape of thecurve as the negative soil water potential increases. Figure 4.3 also shows the foursoil types for the water retention curve. As expected in this case, θs controls the y-intercept and θr controls the minimum value of the function. Again, the parameters αand n control the shape of the curve between the bounds of θs and θr. In this case, theparameter n has much the same effect as θr; Sˇimunek et al. (1998) notes this high cor-relation. Note that the parameters α and n are involved in both the calculation of k andθ , which links these two curves in a physically reasonable way (Bitterlich et al., 2004).Additionally, the low parameterization of these curves, when using the van Genuchtenempirical relationship, means that changing one parameter has a global effect over thedomain of the soil water potential, ψ ∈ (−∞,0). When framing the inverse problem,a spline parameterization for each curve can alternatively be used. Although fram-ing in this way decouples the two curves, we need to take care when extrapolatingthe results to moisture contents not covered by the bounds of the experiment and toensure that the obtained curves are physically reasonable (Bitterlich et al., 2004). Toavoid these potential pitfalls of an inversion, we will use the van Genuchten empiricalparameterization for the remainder of the experimentation.95Figure 4.2: The hydraulic conductivity function showing bounds of the variousparameters Ks ∈ [1×10−7,1×10−4], α ∈ [2.5,13.5], and n ∈ [1.1,1.6] forfour canonical soil types of (a) sand, (b) loam, (c) sandy clay, and (d) clay.96Figure 4.3: The water retention function showing bounds of the various param-eters θs ∈ [0.3,0.5], θr ∈ [0.01,0.1], α ∈ [2.5,13.5], and n ∈ [1.1,1.6] forfour canonical soil types of (a) sand, (b) loam, (c) sandy clay, and (d) clay.4.2.1 Objective functionsTo further explore the van Genuchten parameterizations, we will use a homogeneoussandy clay loam (Ks: 1.2-06 m/s, α: 3.6 1/m, n: 1.25, θr: 0.068, θs: 0.33) in a small in-filtration experiment. The boundary conditions on the top of a 40 cm deep soil profile97are ψ = −5cm at z = 0cm and ψ = −41.5cm at z = −40cm. The Richards equationsimulation is run for a time period of 22 hours with an exponentially increasing timestep from 5 s to 60 s. Saturation data is collected at nine equally spaced locationsfrom -2 cm to -34 cm. The water content data at each of the nine locations is sampled23 times over the length of the experiment. This sampling differs from traditionallycollected laboratory experiment data, which records water outflow, distributed pres-sure measurements (using tensiometers), or bulk soil moisture. The analysis here iscompleted with the assumption that distributed estimates of soil moisture are availablefrom a geophysical method.In the following sections, we will be estimating the van Genuchten parameters(Ks,θr,θs,α , and n) from this water content data. In this section, we will visualize theobjective function, l2 norm of the data difference, around the true model to get a senseof the optimization problem for a homogeneous soil with this data. Even with a homo-geneous soil, the objective function is in five-dimensional space, so both visualizationand identification of local minima is difficult. Figure 4.4 shows ten profiles throughthe objective function. These profiles are created by varying two of the parameterswhile keeping all other parameters constant at the true values. A 40× 40 grid, overreasonable bounds of each parameter, requires 16,000 simulations and produces tencross-sections of the five-dimensional space. The cross-sections of θs−Ks, n−Ks,n− θr, and n−α show well-defined objective functions that are convex along theseplanes. The structure of the objective functions for α and n, with respect to θs, showsa more elongated objective function with contours nearly perpendicular to the θs axis,which means that, while keeping θs constant in Figures 4.4h and 4.4i, both α and n canvary over their respective ranges at similar objective function values. In Figure 4.4c, αcan also change while keeping Ks constant at similar objective function values. Mawer98et al. (2013) also notes low sensitivity to the α parameter when inverting saturationestimates for homogeneous layers. In this case, we also see the contours of the θr−θsobjective function as perpendicular to the θs axis, which indicates low sensitivity to θrin this plane of the objective function (Figure 4.4e). In all cases, when θs is involved,except for the cross-section of θs−Ks, the objective function is elongated perpendic-ular to the θs axis. This perpendicular elongation means that there is high sensitivityto θs, which is not surprising given that our data is water content. Additionally, theboundary conditions of the experiment yield pressure head values in the entire domainbetween 10−2m and 100m, which is in the domain where θs has the greatest influenceover the shape of the water content response to pressure head (Figure 4.3).Figure 4.4: Objective function cross sections plotted for all ten cross sectionsthrough the five dimensional space of Ks,θr,θs,α and n. Each cross sectionwas simulated with 40× 40 simulations and compared using an l2 dataobjective function. The results are shown in log10-scale.This analysis of the objective function shows that all of the cross-sections aroundthe global minima are convex (albeit highly elongated in some axes). If local minima99exist, these are not located on cross sections through the true soil parameters. However,as seen in Figure 4.2 and 4.3, at a single soil water potential, a change in any parametercan raise or lower the functions for hydraulic conductivity and water content. If only asmall portion of each curve is examined by the experimental setup (i.e. boundary andinitial conditions and measurement locations), then the recovery of these parameterswill be non-unique. In the following sections, we will release the assumptions of ahomogeneous soil and investigate the recovery of distributed soil parameters in a one-dimensional soil profile.4.3 Layered soil profileIn this section, we will investigate our ability to recover the van Genuchten parametersof a layered soil. The general setup of this experiment continues to expand on the 40cm deep soil profile examined in the previous section. In this case, however, we breakup the soil profile into three layers: (1) silt loam from 0 cm to -15 cm; (2) loam from-15 cm to -25 cm; and, (3) sandy clay loam from -25 cm to -40 cm. Table 4.1 showsthe values for the van Genuchten curves. Figure 4.5 shows the pressure head and watercontent fields as an image of the 1D soil profile over the length of the experiment. Theboundary conditions are set to inhomogeneous Dirichlet with values of -5 cm at the topof the model and -41.5 cm at the bottom of the model. The infiltration front is observedas a pressure increase moving down through the soil profile. The pressure head fieldis continuous across layer boundaries, as is expected. We can translate the pressurehead everywhere in time and space to a saturation profile over time using the knownspatially heterogeneous van Genuchten parameters (Figure 4.5b). The most notabledifference is the discontinuity between layers, which is mostly due to the differencesin θs, although the parameters α and n and, to a lesser extent, θr, also influence these100discontinuities. This figure shows the infiltration front as an increase in soil watercontent, which corresponds to the increase in pressure head. We end the experimentwhen the infiltration front reaches the bottom of the soil profile at 22 hours. The watercontent figure shows the measurement locations as equally spaced samplings from -2cm to -34 cm. We took the measurements over the entire time-lapse experiment, asseen in Figure 4.6. There were 225 measurements for the entire experiment. We added1% noise to the synthetic data, which the figure shows as dobs. This noise is far belowwhat can be expected in any geophysical recovery of the water content; however, byadding more noise, we must conversely reduce our expectations of recovering the truedistributions. In this experiment, we are interested in what is possible to recover underthe best of circumstances.4.4 Unconstrained joint inversionTo recover the van Genuchten parameters of Ks,θr,θs,α , and n, we will frame theproblem as an unconstrained joint inversion for all parameters at once. This framingrequires that the model, m, contains all five parameters for every cell in the modeland, in this case, has a length of 200. To get the model for hydraulic conductivity,for example, we use a 40×200 projection matrix to select the appropriate rows of themodel vector.mKs = PKsmWe can also complete other model parameterizations at this stage to embed otherknowledge (Appendix C). For example, we expect the hydraulic conductivity to belogarithmically varying, so a model of log(Ks) can be created. This mapping, as well101Figure 4.5: Fields from the numerical simulation of a layered one dimensionalsoil profile, showing (a) pressure head and (b) water content over the fulltime period. The soil types are shown as annotations on each figure, thespatial location of water content measurements are shown adjacent to thewater content the projection, must be taken care of in both the translation to the physical propertyvalues and the derivative. The entire mapping can be represented as p=M (m), wherep is the parameter that is mapped from the model, m. In the previous chapter, we onlyaddressed the model derivative, with respect to saturated hydraulic conductivity. Inthis chapter, we also require the derivative of the water content curve as well as thederivatives with respect to each parameter in the van Genuchten relationships. Thederivative for the water content curve requires attention in both the time stepping termsand in the conversion of the pressure head field to water content for inclusion in themeasurement locations.102The inversion for all van Genuchten parameters at once can be written as a sum ofweighted objective functions:argminmΦ(m) =12∥∥Wd(dpred(m)−dobs)∥∥22+β2 ∑{·}=Ks, α, n, θs, θr(α{·}∥∥∥Wm{·}(M{·}(m)−mref{·})∥∥∥22) (4.4)Here, Wm refers to both model smoothness and smallness for each mapped param-eter. The mref can be chosen independently for each parameter. The α{·} weightingshave significant influence on the inversion; a poor choice of weighting can cause theoptimization of the objective function to not converge. We chose the weightings for theinversion results presented by looking at the relative magnitudes of the model resolu-tion matrix (J>J) around the starting model, (m0). The objective function weights thatwe used in this inversion were approximately the average sensitivity over the entiresoil profile: Ks=1e-3; θr=1; θs=1; α=1e2; and, n=5e3. The β parameter was chosenby relatively weighting the data misfit and model regularization terms at 1:100 basedon a coarse estimate of the major eigenvalue of each inversion component. We useda cooling schedule for β that reduced β by a factor of five every three iterations. Thestarting model used was a soil with the parameter values of the middle layer of themodel: a loam, with the exception of the α parameter, that was set to an initial valueof six, which was the midpoint of the values presented in Table 4.1. We optimized theobjective function with an inexact Gauss-Newton algorithm that used five conjugategradient iterations for each step in the inversion. The data misfit function started witha value of 2.8e4, which was reduced two orders of magnitude to the target misfit of113. The inversion took 117 iterations; however, the majority of the decrease in theobjective function occurred in the first 25 iterations of the inversion, which decreased103the misfit to below ResultsFigure 4.6 shows the predicted data for this unconstrained joint inversion for all spa-tially heterogeneous parameters, which has a good visual fit to the the observed data.This figure shows all of the locations and times in a single figure. There are three sat-uration measurements in the top and bottom layers and two in the middle layer. Overthe course of the infiltration experiment, we can differentiate these saturation profilesby the time at which the infiltration front causes an increase in the water content of thesoil.Figure 4.6: Observed and predicted water content data from the one dimensionalinfiltration experiment showing water content over time.Figure 4.7 shows the hydraulic conductivity and water retention curves. We al-lowed each parameter in the van Genuchten curves to vary spatially as shown in Fig-104ure 4.8. However, for visualization purposes, Figure 4.7 uses the average value for eachparameter type in each layer to calculate the hydraulic conductivity and water retentioncurves. The amount of the curve that was interrogated by the infiltration experimentsis important to consider. As such, the figure also shows the true pressure head valuesin the entire layer as a normalized histogram. Information about the curves, outsidethe bounds of the experiment, will likely be poorly estimated. The curves for hydraulicconductivity in the first two layers were well-recovered for the entire domain of pres-sure head values. Similarly, the water retention curves were well-recovered over thedomain of the experiment, as seen in Figures 4.7d and 4.7e. However, outside the pres-sure head bounds imposed by the experimental setup, which are shown as histograms,the water retention curves were not fit well for pressure head values less than -1.0m. The recovered curve over-estimated the water content in the top layer and under-estimated it in the middle layer. The third layer was the last to see the effect of the in-filtration event and most of the experiment exposed this layer to approximately -0.5 mof pressure head. The recovered curves for the hydraulic conductivity curve are shownin Figure 4.7c, which overestimates the hydraulic conductivity by nearly an order ofmagnitude at −ψ = 10−2m. Figure 4.7f shows the predicted curve for water reten-tion. Here, the water content is over-estimated at−ψ = 10−2m and under-estimated at−ψ = 102m. The location where the layer was sampled was well-recovered and this isseen as the intercept between the true and the predicted models at −ψ = 0.5m, whichis the peak of the histogram. In summary, the curves were relatively well-recovered inthe first two layers, where the pressure head changed by an order of magnitude over thecourse of the infiltration experiment. Outside these bounds, the water retention curveswere less well-recovered, but the hydraulic conductivity showed a good match. Thethird layer, however, was not matched well and, outside the bounds of the experiment,105the curves for both hydraulic conductivity and water retention were poorly-recovered.Figure 4.7: Showing the water retention and hydraulic conductivity curves forthe true and predicted models for the three soil layers. The histogram ineach plow shows the distribution of true pressure head values in each layer.Figure 4.8 shows the spatially varying van Genuchten parameters through depth.Blue marks the true model parameters and green marks the recovered model parame-ters at the target misfit. The figure also shows all of the model iterations as thin greylines of increasing opacity. In Figure 4.8b, the recovered values for θs appear to fit inthe first few iterations of the inversion. The saturated water content in the top layer isrecovered to within 0.012 of the true value; the second layer was also well-recovered,with a maximum difference from the true value of 0.016 for the values inside the layer.The saturated water content in the third layer over-estimated the true value by 0.046.The residual water content for all three layers did not move far from the initial esti-106mate and did not recover the difference in the final layer, but underestimated the thirdlayer by 0.040. The majority of the experiment exposed the soil profile to pressuresthat were between −ψ = 10−2m and −ψ = 100m, which is outside the domain whereθr influences the water retention curve, as seen in Figure 4.3. In Figure 4.8d, the αparameter did not change from the initial chosen value of 6.0, except in the loam layerwhere there is a slight increase towards the known value of 9.0. This can be expectedfrom the analysis of the objective functions in Figure 4.4, where α could change overa large range without changing the objective function. Figure 4.8a shows the saturatedhydraulic conductivity estimate for the soil profile. The Ks estimate shows the layer-ing of the soil profile, which can lead to other parameterization techniques, as we willsee in the next section. The top layer is overestimated by up to a 0.34 in log10-space,the second layer is underestimated by 0.47 in log10-space, and the third layer overes-timates Ks by up to an order of magnitude, 0.98 in log10-space. The n parameter isunderestimated in the top layer by a maximum of 0.030, in the second layer n is over-estimated by a maximum of 0.072, and is slightly overestimated by 0.019 in the thirdlayer.The parameters estimated by this method demonstrate that the data can be fit by alocal minima; that is, multiple values of Ks−n−α can sufficiently minimize the watercontent data provided. Figure 4.9 further highlights this through the plotting of thefields from the recovered model. As expected, the water content field matches the truefields in Figure 4.5, which contains the data that was recorded and fit. There are someoscillations in the top layer due to the estimate of θs; these were added in the finaliterations of the optimization and could be considered over-fitting of the inversion. Weconstructed the data from linear interpolation of the closest two cells; the sensitivityto model parameters outside of this interpolation is up to three orders of magnitude107Figure 4.8: The true and recovered soil parameters as a function of depth, show-ing (a) hydraulic conductivity; (b) residual and saturated water content; (c)the empirical parameter n; and (d) the empirical parameter α . Each plotalso shows the full inversion history of the predicted model as a transparentblack line.108lower. When considering a geophysical estimation of water content, the ‘footprint’of this estimate should be experimented with because the geophysical estimate is anintegration over a certain volume of the soil rather than a point estimate, as is assumedhere. Comparing to the true fields for the third layer, the water content profile is well-estimated; however, the pressure head recovery has a completely different character.The middle of the bottom layer reaches a maximum pressure head of -16 cm rather than-27 cm in the true model. Considering that the pressure head range of the experimentwas 36.5 cm, this estimate was off by 30.1%; this further shows that only collectingsaturation data, especially over a small range of pressure heads, can lead to inaccurateresults in both the van Genuchten parameters recovered and the pressure head field.4.4.2 DiscussionThe recovery of θs is the most robust in the infiltration experiment considered becausethe majority of the data was collected when the soil was close to saturation. At thesepressure head values, θs has the greatest control over the van Genuchten water retentioncurve. We can recover the layering in the system from the saturation data, whichcan lead to other parameterizations of the model space and exploration of other apriori data to be included. The hydraulic conductivity curves for the first two layerswere well-recovered within half an order of magnitude. However, there is a trade-offbetween Ks and n, which could not be isolated over the small pressure ranges that weused for this simulation. We found low sensitivity to α over the range of pressure headsinvestigated, as Mawer et al. (2013) has previously observed. We found a local minimaacross the Ks, n, and α parameters. The coverage of a large range (several ordersof magnitude) of pressure head values is important for extrapolating the hydraulicconductivity curve and, especially, the water retention curve. Most field studies that109Figure 4.9: Recovered simulation fields from the unconstrained joint inversionfor van Genuchten parameters. Showing (a) the pressure head and (b) thewater content over the full time period of the one dimensional recoveredsoil profile. The soil types are shown as annotations on each figure, thespatial location of water content measurements are shown adjacent to thewater content fields.use geophysical data to estimate water content will likely see changes of only a feworders of magnitude. However, if only a small portion of the curve is interrogated in theexperiment, data can be fit with incorrect van Genuchten parameters. The domain ofthe curves that is interrogated should be reported with any estimate of van Genuchtenparameters. Furthermore, the addition of pressure head data will be advantageous tothe joint inversion presented, as it will reduce non-uniqueness in the water retentionmodel for estimating the pressure head field.1104.5 Three dimensional inversionIn this section, we turn our attention to recovering a three-dimensional soil structuregiven water content data. The example, motivated by a field experiment introduced in(Pidlisecky et al., 2013), shows a time-lapse electrical resistivity tomography surveycompleted in the base of a managed aquifer recharge pond. The goal of this man-agement practice is to infiltrate water into the subsurface for storage and subsequentrecovery. Such projects require input from geology, hydrology, and geophysics to mapthe hydrostratigraphy, to collect and interpret time-lapse geophysical measurements,and to integrate all results to make predictions and decisions about fluid movement atthe site. As such, the hydraulic properties of the aquifer are important to character-ize, and information from hydrogeophysical investigations has been demonstrated toinform management practices (Pidlisecky et al., 2013). We use this context to motivateboth the model domain setup of the following synthetic experiment and the subsequentinversion.For the inverse problem solved here, we assume that time-lapse water-content in-formation is available at many locations in the subsurface. In reality, water contentinformation may only be available through proxy techniques, such as electrical resis-tivity methods. These proxy data can be related to hydrogeologic parameters usinginversion techniques based solely on the geophysical inputs (cf. Mawer et al. (2013)).For the following numerical experiments, we do not address complications in empiri-cal transformations, such as Archie’s equation (Archie, 1942). The synthetic numericalmodel has a domain with dimensions 2.0 m × 2.0 m × 1.7 m for the x, y, and z di-mensions, respectively. The finest discretization used is 4 cm in each direction. Weuse padding cells to extend the domain of the model (to reduce the effect of boundary111conditions in the modelling results). These padding cells extend at a factor of 1.1 inthe negative z direction. We use an exponentially expanding time discretization with40 time steps and a total time of 12.3 hours. This choice in discretization leads to amesh with 1.125×105 cells in space (50× 50× 45). To create a three-dimensionallyvarying soil structure, we construct a model for this domain using a three-dimensional,uniformly random field, ∈ [0,2], that is convolved with an anisotropic smoothing ker-nel for a number of iterations. We create a binary distribution from this random fieldby splitting the values above and below unity. Figure 4.10 shows the resulting model,which reveals potential flow paths. We then map van Genuchten parameters to thissynthetic model as either a sand or a loamy-sand. The van Genuchten parameters forsand are: Ks: 5.83e-05m/s, α: 13.8, θr: 0.02, θs: 0.417, and n: 1.592; and for loamy-sand are: Ks: 1.69e-05m/s, α: 11.5, θr: 0.035, θs: 0.401, and n: 1.474. For thisinversion, we are interested in characterizing the soil in three dimensions.4.5.1 ResultsFor calculation of synthetic data, the initial conditions are a dry soil with a homo-geneous pressure head (ψ = −30cm). The boundary conditions applied simulate aninfiltration front applied at the top of the model, ψ =−10cm ∈ δΩtop. Neumann (no-flux) boundary conditions are used on the sides of the model domain. Figure 4.13shows the pressure head and water content fields from the forward simulation. Fig-ure 4.11a and 4.11b show two cross-sections at time 5.2 hours and 10.3 hours of thepressure head field. These figures show true soil type model as an outline, where theinclusions are the less hydraulically-conductive loamy sand. The pressure head fieldis continuous across soil type boundaries and shows the infiltration moving verticallydown in the soil column. We can compute the water content field from the pressure112Figure 4.10: Soil structure in three dimensions showing the boundary betweentwo soil types of sand and loamy sand.head field using the nonlinear van Genuchten model chosen; Figure 4.11c and 4.11dshow this computation at the same times. The loamy sand has a higher relative watercontent for the same pressure head and the water content field is discontinuous acrosssoil type boundaries, as expected.The observed data, which will be used for the inversion, is collected from the watercontent field at the points indicated in both Figure 4.11c and 4.11d. The sampling113Figure 4.11: Vertical cross sections through the pressure head and saturationfields from the numerical simulation at two times: (a) pressure head fieldat t = 5.2 hours and (b) t = 10.3 hours; and (c) saturation field at t = 5.2hours and (d) t = 10.3 hours. The saturation field plots also show mea-surement locations and green highlighted regions that are shown in Fig-ure 4.12. The true location of the two soils used are shown with a dashedoutline.location and density of this three-dimensional grid within the model domain is similarto the resolution of a 3D electrical resistivity survey. Our implementation supportsdata as either water content or pressure head; however, proxy water content data ismore realistic in this context. Similar to the field example in (Pidlisecky et al., 2013),we collect water content data every 18 minutes over the entire simulation, leading toa total of 5000 spatially and temporally extensive measurements. The observed watercontent data for a single infiltration curve is plotted through depth in Figure 4.12. Thegreen circles in Figure 4.11 show the locations of these water content measurements.114The depth of the observation is colour-coded by depth, with the shallow measurementsbeing first to increase in water content over the course of the infiltration experiment. Tocreate the observed dataset, dobs, from the synthetic water content field, 1% Gaussiannoise is added to the true water content field. This noise is below what can currently beexpected from a proxy geophysical measurement of the water content. However, withthe addition of more noise, we must reduce our expectations of our ability to recoverthe true parameter distributions from the data. In this experiment, we are interested inexamining what is possible to recover under the best of circumstances, and thereforehave selected a low noise level.Figure 4.12: Observed and predicted data for five measurements locations atdepths from 10cm to 150cm from the center of the model domain.115For the inverse problem, we are interested in the distribution of soil types that fitsthe measured data. We parameterize these soil types using the van Genuchten em-pirical model (4.2) with as least five spatially distributed properties. Inverting for all5.625×105 parameters in this simulation with only 5000 data points is a highly under-determined problem, and thus there are many possible models that may fit those data.In this 3D example, we will invert solely for saturated hydraulic conductivity and as-sume that all other van Genuchten parameters are equivalent to the sand; that is, theyare parameterized to the incorrect values in the loamy sand. Note that this assumption,while reasonable in practice, will handicap the results, as the van Genuchten curvesbetween these two soils differ. Better results can, of course, be obtained if we assumethat the van Genuchten parameters are known; this assumption is unrealistic in prac-tice, which means that we will not be able to recreate the data exactly. However, thedistribution of saturated hydraulic conductivity may lead to insights about soil distri-butions in the subsurface. Figure 4.13 shows the results of the inversion for saturatedhydraulic conductivity as a map view slice at 66 cm depth and two vertical sectionsthrough the center of the model domain. The recovered model shows good correla-tion to the true distribution of hydraulic properties, which is superimposed as a dashedoutline. Figure 4.12 shows the predicted data overlaid on the true data for five watercontent measurement points through time; these data are from the center of the modeldomain and are colour-coded by depth. As seen in Figure 4.12, we do a good job offitting the majority of the data. However, there is a tendency for the predicted infiltra-tion front to arrive before the observed data, which is especially noticeable at deepersampling locations. The assumptions put on all other van Genuchten parameters to actas sand, rather than loamy sand, lead to this result.116Figure 4.13: The 3D distributed saturated hydraulic conductivity model recov-ered from the inversion compared to the (a) synthetic model map viewsection, using (b) the same map view section, (c) an X-Z cross sectionand (d) a Y-Z cross-section. The synthetic model is show as an outlineon all sections, and tie lines are show on all sections as solid and dashedlines, all location measurements are in centimeters.4.5.2 Scalability of the implicit sensitivityFor the forward simulation presented, the Newton root-finding algorithm took 4-12iterations to converge to a tolerance of 1× 10−4m on the pressure head. The inverseproblem took 20 iterations of inexact Gauss-Newton with five internal CG iterationsused at each iteration. The inversion fell below the target misfit of 5000 at iteration 20with φd = 4.893×103; this led to a total of 222 calls to functions to solve the productsJv and J>z. Here, we again note that the Jacobian is neither computed nor storeddirectly, which makes it possible to run this code on modest computational resources;this is not possible if numerical differentiation or direct computation of the Jacobian is117used. For these experiments, we used a single Linux Debian Node on Google ComputeEngine (Intel Sandy Bridge, 16 vCPU, 14.4 GB memory) to run the simulations andinversion. The forward problem takes approximately 40 minutes to solve. In thissimulation, the dense Jacobian matrix would have 562.5 million elements. If we useda finite difference algorithm to explicitly calculate each of the 1.125e5 columns ofthe Jacobian with a simple forward difference, we would require a calculation for eachmodel parameter – or approximately 8.5 years of computational time. Furthermore, wewould need to recompute the Jacobian at each iteration of the optimization algorithm.In contrast, if we use the implicit sensitivity algorithm presented in Chapter 3, we cansolve the entire inverse problem in 34.5 hours.Table 4.2: Comparison of the memory necessary for storing the dense explicitsensitivity matrix compared to the peak memory used for the implicit sensi-tivity calculation excluding the matrix solve. The calculations are completedon a variety of mesh sizes for a single distributed parameter (Ks) as well asfor five distributed van Genuchten model parameters (Ks,α,n,θr, and θs).Values are reported in gigabytes (GB).Explicit Sensitivity Implicit SensitivityMesh Size 1 parameter 5 parameters 1 parameter 5 parameters32×32×32 1.31 6.55 0.136 0.17164×64×64 10.5 52.4 0.522 0.772128×128×128 83.9 419 3.54 4.09Table 4.2 shows the memory required to store the explicit sensitivity matrix fora number of mesh sizes and contrasts them to the memory required to multiply theimplicit sensitivity by a vector. These calculations are modifications on the examplepresented above and use 5000 data points. The memory requirements are calculatedfor a single distributed parameter (Ks) as well as five spatially distributed parameters(Ks,α,n,θr, or θs). Neither calculation includes the memory required to solve thematrix system, as such, the reported numbers underestimate the actual memory re-118quirements for solving the inverse problem. The aim of this comparison is to demon-strate how the memory requirements scale, an appropriate solver must also be chosenfor either method to solve the forward problem. When using an explicitly calculatedsensitivity matrix to invert for additional physical properties, the memory footprint in-creases proportionally to the number of distributed physical properties; this is not thecase for the implicit sensitivity calculation. For example, on a 128×128×128 mesh,the explicit formation of the sensitivity requires 419 GB for five spatially distributedmodel parameters, which is five times the requirement for a single distributed modelparameter (83.9 GB). For the implicit sensitivity on the same mesh, only 4.09 GB ofmemory is required, which is 1.2 times the requirement for a single distributed modelparameter (3.54 GB). For this mesh, inverting for five spatially distributed parametersrequires over 100 times less memory when using the implicit sensitivity algorithm,allowing these calculations to be run on modest computational resources.4.5.3 DiscussionIn the context of managed aquifer recharge, small variations in soil types can causedifferences in infiltration rates. Geophysical data that is both spatially and temporallydense can act as a proxy for water content (cf. Pidlisecky et al. (2013)). If hydraulicproperties are identified or even spatially delineated, this knowledge can inform andinfluence management decisions (e.g. where and when to till or dredge a pond to in-crease infiltration rates). The hydraulic properties determining these infiltration ratesare distributed in three dimensions. The inversion presented above demonstrated alarge-scale inversion for a 3D distribution of saturated hydraulic conductivity. Wefixed the other van Genuchten parameters at values for sand and, as such, the inversionhad difficulty fitting late time data in the deeper profile. The inversion qualitatively dis-119criminated between the two geologic units. In this experimentation, we assumed thatthe initial conditions and boundary conditions were known. This knowledge may bepossible in a heavily monitored situation, such as managed aquifer recharge where thepond water height and underlying water table are measured. Further investigation intothe conceptualization of the groundwater simulation is important to both the hydro-geologic modelling and any coupling to geophysical methods (Hinnell et al., 2010).The inversion at this scale is computationally possible due to the formulation of theRichards equation presented in Chapter 3 and the implementation both extended andtested the geophysical inversion framework presented in Chapter 2.4.6 IntegrationsThe combination of the Richards equation with geophysical responses has been laidout in Hinnell et al. (2010); this includes both directly coupled and uncoupled forms ofintegrating information. As presented in Hinnell et al. (2010), uncoupled integrationsare completed by: (a) using the geophysical data to estimate a physical property, suchas electrical conductivity; (b) using an empirical relation, such as Archie’s equation,to transform the geophysical estimate into a hydrological parameter, such as watercontent; and, (c) using hydrological estimates to help inform or test a hydrogeologicsimulation. In contrast, a coupled inversion uses geophysical data to directly informthe hydrogeologic simulation through stochastic or deterministic parameter estimation(Finsterle and Kowalsky, 2008; Ferre´ et al., 2009). We can find increasing instancesof these sorts of collaborations and studies in near surface hydrogeophysics (cf. Lindeand Doetsch (2016) and references within). As data sizes and numerical domainscontinue to grow, the relatively few parameters that can be estimated by stochastic in-versions may not be sufficient. Alternatively, deterministic inversions can be used, but120will need to draw on improvements across the field of geophysical inverse problems.For example, regularization techniques using fuzzy c-means clustering, developed inother areas of geophysics, have potential to be helpful in introducing known parameterdistributions into the Richards equation inversion (Paasche and Tronicke, 2007; Sunet al., 2012). In Hinnell et al. (2010), the authors conclude that, “the coupled approach[for hydrogeophysics] requires that the hydrologic and geophysical models be merged,[which] forces the hydrologist and the geophysicist to formulate a consistent frame-work,” which would require “an uncommon level of collaboration during scientificanalysis”.A consistent framework was proposed in Chapter 2; however, to both advanceleading research and disseminate leading practice, the integration and arbitrary combi-nation of physical simulations must also be possible while simultaneously calculatingderivatives efficiently. Appendix C presents a brief introduction to this collaborativework, as well as several case studies that are striving towards a general formulation.The framework presented in Chapter 2 has been extended to: (a) explicitly exposeassumptions in the forward simulation framework to interrogation and inversion; (b)compose custom objective functions, including regularization and multi-physics dataobjective functions; and, (c) provide extensible parameterizations that are flexible forcustom inclusion of a priori knowledge. This is ongoing, collaborative work.4.7 ConclusionsThe joint inversion of various hydraulic parameters was explored on a layered 1D soilprofile. The water content data was well fit and the soil layers were delineated. Ad-ditionally, the van Genuchten curves that were used as the empirical relations werealso well recovered over the range of pressure head values that each layer of the soil121profile was exposed to over the experiment. Outside of these ranges, the curves werenot reliably recovered. As such, the numeric values of the van Genuchten parameterswere occasionally unreliable, especially if pressure head was relatively constant, aswas the case in the third layer. In this experiment, the water content data was con-sidered as a point measurement, and the sensitivity of the recovered model varied byorders of magnitude between voxels that were included in that measurement and im-mediate voxel neighbours. This lead to some artificial layering in the recovered modelwhich was not compensated for by the regularization. The footprint of the water con-tent measurement is likely an integration over a number of voxels, especially if comingfrom a geophysical estimate, and should be considered and experimented with.As geophysics is more regularly included in hydrogeology parameter estimationthe number of distributed parameters that will be necessary to estimate will increase byseveral orders of magnitude. The final example in this chapter showed a 3D recoveryof a distributed hydraulic parameter (Ks). This inversion would not have been possiblethrough standard finite difference techniques and was significantly more memory effi-cient than an automatic differentiation algorithm that explicitly forms the Jacobian (upto two orders of magnitude in some cases). These improvements allowed the Richardsequation inversion to be run on modest computational resources. The coupling, nestingor otherwise integrating of various geophysical methods with the Richards equation isan obvious extension to this work. This has been extensively explored by several au-thors, albeit at a smaller scale (cf. Linde and Doetsch (2016) and references within).Appendix C briefly explores various types of integrations between various geophysicalmethodologies as well as custom model conceptualizations that will be necessary forthese integrations.122Chapter 5ConclusionsForward and inverse modelling in geophysics requires solving and optimizing large-scale partial differential equations. Thus, many components including linear algebra,optimization routines, discretizations, and model conceptualizations, are required tointeract. Advances in instrumentation, the monitoring of time-lapse processes, and ac-quisition of multiple geophysical and hydrologic data types are driving a need for amore integrated geoscience approach. This includes integrating information throughmultiphysics simulations and coupled geophysical inversions. In addition to recover-ing 3D models, geophysics is increasingly being used in time-lapse imaging of fluidflow processes, requiring both computational scalability to 4D inverse problems andinterdisciplinary collaborations. There have been significant advances in the past threedecades in computational geophysics; researchers are now able to both simulate andinvert almost any geophysical method in three dimensions (cf. Oldenburg (2016)). Oneof the major challenges ahead of geophysics as a discipline is how to systematicallyimprove the quantitative interfaces and integrations between hydrological, geophysi-cal, and geological information and processes. The research necessary to address these123challenges will require the interdisciplinary community to build upon, as well as aug-ment, standard practices; this presupposes that researchers have access to consistentmethodologies that can be extended, adapted, and combined.Adapting interdisciplinary methodologies to geophysical simulations and inver-sions inherently requires that a diverse suite of methods and applications be consideredacross hydrogeology, geophysics, and geology. Throughout this work, my colleaguesand I have tried to summarize and/or reproduce many methodologies in the geophysi-cal inversion literature. Other communities, such as astrophysics and machine learning(Astropy Collaboration et al., 2013; Pedregosa et al., 2011) have organized these ef-forts and research communities around open, accessible, and actionable ideas. Adapt-ing these learnings to geoscience, I strive to complete all of my research such that it isimmediately reproducible and openly available.Much of my dissertation required the development of software, which is implemen-tation and engineering by its nature. However, the software, although extremely useful,is not the aim of my research. If the goal is a framework for quantitative geoscienceintegration, a simplistic, high level conceptualization is easy to present. However, apicture or a paragraph describing a framework cannot be tested, interrogated, nor usedbeyond its static form. Software is the means by which I test, extend, organize, andabstract inherently computational ideas in a rigorous, scientific way. The aim of myresearch is to identify, explore, and formalize a framework for simulation and param-eter estimation in geophysics. I applied this framework to vadose zone flow and, withthe help of collaborators, to several other geophysical methodologies.1245.1 Contributions and disseminationThe conclusions from each component of my thesis are contained within each chap-ter and each appendix. However, the central aim of my dissertation was to develop aframework for geophysical inversions: this has been disseminated through three publi-cations (Cockett et al., 2017; Heagy et al., 2016; Cockett et al., 2015c), four extendedabstracts (Heagy et al., 2014; Kang et al., 2015a; Heagy et al., 2015c, 2017), overtwenty conference presentations, and a dedicated international workshop1. Further-more, this framework has demonstrated value in several new research areas, method-ologies, and case studies (cf. Kang et al. (2017a); Miller et al. (2017); Kang andOldenburg (2016); Rosenkjaer et al. (2016)). Overall, the contributions of my thesisare twofold:1. a conceptual organization and synthesis of geophysical simulations and inver-sions into a framework that has been rigorously, numerically tested; and2. an algorithm for large-scale vadose zone parameter estimation for any distributedhydraulic parameter, regardless of the empirical relationship used.One outcome of a framework approach is the accelerated transfer of ideas from onediscipline to another. For example, the implicit sensitivity calculation for the Richardsequation (Cockett et al., 2017) was heavily inspired by research completed in time-domain electromagnetics simulations and inversions (Heagy et al., 2016). The refine-ment and application of this algorithm to hydrology significantly improved numericalscalability for the 3D inverse problem. Chapter 4 showed significant improvements inmemory over explicitly forming the sensitivity matrix by over two orders of magnitude1At the Banff International Research Station, see the example shown, bringing this inversion into the range of possibility on mod-est computational resources. Additionally, the complexities of the Richards equationwere generalized and synthesized to improve other geophysical methods in the frame-work. These improvements were especially demonstrated with regard to dealing withmultiple physical properties that may or may not be estimated and occur in distributedempirical relations throughout the forward simulation framework.The framework presented is designed to decouple concerns and expose well-definedinterfaces between the many components necessary in geophysical simulations and in-versions. Chapter 4 and Appendix C showed many demonstrations of these ideas, forexample: (a) exposing model conceptualization that are decoupled from the sensitiv-ity calculation allows custom, parameterized inversions to be completed by combiningvarious, predefined mappings; (b) the declarative interface of differential operatorsand derivatives is decoupled from the structure and type of mesh, which allows thephysics to be written once and used across many types of meshes; or (c) the decou-pling of the physics from the definition of the geophysical survey, which allows manytypes of geophysical surveys to be combined with a single physical problem (e.g. inelectromagnetics) or conversely different approximations of a physical problem (e.g.dimensionality) to be combined with a single survey. Given the number of choices ingeophysical simulations and inversions this type of combinatorial, decoupled approachcould provide significant acceleration to the unique geoscience integration problemsof the future.5.1.1 SoftwareThe conceptual organization developed could not have been created without the aidof software. Furthermore, even if it were, it would be of limited utility, difficult, or126impossible to validate, and would not make significant progress towards sustained,reproducible, quantitative geoscience integrations. It is not until a framework is im-plemented and tested from a number of non-overlapping geoscience perspectives thatthe assumptions, inconsistencies, or redundancies come to light and are available tointerrogation. The main software package that was, and continues to be, developed isSIMPEG (, which defines the framework and hostsa collection of other geophysical methods written by many collaborators across sixuniversities. Currently there are methods for: vadose zone flow (Cockett et al., 2017),direct current resistivity and induced polarization (Kang and Oldenburg, 2016), time-domain and frequency-domain electromagnetics (Heagy et al., 2016), magnetotellurics(Rosenkjaer et al., 2016), magnetics and gravity (Fournier et al., 2016; Miller et al.,2017), and a number of example linear inverse problems. Many of these geophysicalmethods also have different formulations (e.g. integral equation, differential equation,etc.), dimensionalities (e.g. 1D, 2.5D, 3D), and survey components (e.g. sources andreceivers). All software is disseminated with the MIT license to encourage permis-sionless innovation.5.2 Outlook and continuing workThis thesis constructed a preliminary organization and synthesis of simulations anddeterministic inversions in a few subdisciplines of geophysics and hydrogeology. Thisconceptual framework and computational implementation has demonstrated utility inadvancing current and future research, however, caveats and qualifiers abound.Some of the more intricate parallel algorithms (e.g. stochastic optimization) wouldrequire updates to the implementation and framework; a lack of these scalable parallelalgorithms is limiting when tackling problems with many sources (e.g. airborne elec-127tromagnetics). There is still significant work to do in coupling and nesting various geo-physical problems in a robust way. Due to the current focus on enabling researchers,the framework offers little in high-level interfaces to inversions that are typical of blackbox industry use (i.e. data in, model out). Regardless of these shortcomings, many ofmy colleagues rely upon and extend this framework and implementation in their dayto day research. My goal was to accelerate their work and connect their research to acommunity of collaborators who are explicitly working towards common goals.This thesis is positioned from a perspective of looking out to a future of multidis-ciplinary, multi-data-type, quantitatively integrated geoscience simulations and inver-sions. A future where joint, cooperative, coupled, parameterized, multiphysics inver-sions and simulations are the norm rather than the exception. Where multiple existing,robust, and computationally-efficient methodologies are combined to extract all possi-ble information from disparate geoscience datasets. To realize this sort of ubiquitous,quantitative communication between disciplines and methodologies requires an orga-nized and integrated community that can effectively work together. My research isaimed here. This future will not be realized by one person nor by one research group.In the field of machine learning, Olah and Carter (2017) note that “[t]he maintain-able size of the field is controlled by how its members trade off the energy betweencommunicating and understanding.” The curation of ideas is just as important as theircreation. There is a research debt created by an exclusive focus on research novelty;future advances also require distillation, synthesis, and explicit communication. Thereis a significant amount of effort ahead of us to achieve effective communication andcollaboration with our geoscience peers in geology and hydrogeology. This commu-nication and quantitative integration is the webbing on which the future of our geo-science field depends. My approach, therefore, has been to research and disseminate128a numerical framework that attempts to support and enable a number of these diverseinterdisciplinary collaborations. I hope that a lasting contribution of my work is theopen, modular approach that I have curated and the community that I have helped toseed around these ideas.129BibliographyAlexe, M., Sandu, A., 2014. Space-time adaptive solution of inverse problems with the discrete adjoint method. Journal ofComputational Physics 270, 21–39. → pages 171Archie, G., 1942. The Electrical Resistivity Log as an Aid in Determining Some Reservoir Characteristics. Transactions of theAIME 146 (1), 54–62. → pages 5, 14, 61, 89, 111Armstrong, R., Gannon, D., Geist, A., Keahey, K., Kohn, S., McInnes, L., Parker, S., 1999. Toward a Common ComponentArchitecture for High Performance Scientific Computing. Proc. 8th IEEE Symp. on High Performance DistributedComputing, 115–124. → pages 141Ascher, U. M., 2008. Numerical methods for evolutionary differential equations. Society for Industrial and AppliedMathematics. → pages 68Ascher, U. M., Greif, C., jun 2011. A First Course in Numerical Methods. Society for Industrial and Applied Mathematics,Philadelphia, PA. → pages vi, 69, 144Astropy Collaboration, Robitaille, T., Tollerud, E., Greenfield, P., Droettboom, M., Bray, E., Aldcroft, T., Davis, M., Ginsburg,A., Price-Whelan, A., Kerzendorf, W., Conley, A., Crighton, N., Barbary, K., Muna, D., Ferguson, H., Grollier, F., Parikh,M., Nair, P., Unther, H., Deil, C., Woillez, J., Conseil, S., Kramer, R., Turner, J., Singer, L., Fox, R., Weaver, B., Zabalza, V.,Edwards, Z., Azalee Bostroem, K., Burke, D., Casey, A., Crawford, S., Dencheva, N., Ely, J., Jenness, T., Labrie, K., Lim, P.,Pierfederici, F., Pontzen, A., Ptak, A., Refsdal, B., Servillat, M., Streicher, O., oct 2013. Astropy: A community Pythonpackage for astronomy. AAP 558, A33. → pages 23, 124, 142Balay, S., Brown, J., Buschelman, K., Eijkhout, V., Gropp, W., Kaushik, D., Knepley, M., McInnes, L. C., Smith, B., Zhang, H.,2012. PETSc users manual revision 3.3. Computer Science Division, Argonne National Laboratory, Argonne, IL. → pages 48Bard, J. B. L., Rhee, S. Y., 2004. Ontologies in biology: design, applications and future challenges. Nature reviews. Genetics5 (3), 213–222. → pages 141Barrett, R., Berry, M., Chan, T. F., Demmel, J., and J. Donato, Dongarra, J., Eijkhout, V., Pozo, R., Romine, C., der Vorst, H. V.,1994. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. SIAM, Philadelphia. → pages 73Binley, A., Cassiani, G., Middleton, R., Winship, P., 2002. Vadose zone flow model parameterisation using cross-borehole radarand resistivity imaging. Journal of Hydrology 267 (3), 147–159. → pages 5, 14, 61, 62, 89Bitterlich, S., Durner, W., Iden, S. C., Knabner, P., aug 2004. Inverse Estimation of the Unsaturated Soil Hydraulic Propertiesfrom Column Outflow Experiments Using Free-Form Parameterizations. Vadose Zone Journal 3 (3), 971–981. → pages 62,95Bitterlich, S., Knabner, P., 2002. An efficient method for solving an inverse problem for the Richards equation. Journal ofComputational and Applied Mathematics 147, 153–173. → pages 15, 63, 77, 79, 92Brooks, R. H., Corey, A. T., 1964. Hydraulic properties of porous media and their relation to drainage design. Trans. ASAE 7.126 (28). → pages 66, 92Bui-Thanh, T., Ghattas, O., jan 2015. A scalable algorithm for MAP estimators in Bayesian inverse problems with Besov priors.Inverse Problems and Imaging 9 (1), 27–53. → pages 15, 62130Burstedde, C., Wilcox, L. C., Ghattas, O., 2011. p4est: Scalable Algorithms For Parallel Adaptive Mesh Refinement On ForestsOf Octrees. SIAM J. on Sci. Comp. 33 (3), 1103–1133. → pages 144, 146Calhoun, D. A., Helzel, C., Leveque, R. J., 2008. Logically Rectangular Grids and Finite Volume Methods for PDEs in Circularand Spherical Domains. SIAM Review 50 (4), 723–752. → pages 147Carrick, S., Almond, P., Buchan, G., Smith, N., dec 2010. In situ characterization of hydraulic conductivities of individual soilprofile layers during infiltration over long time periods. European Journal of Soil Science 61 (6), 1056–1069. → pages 62Celia, M. A., Bouloutas, E. T., Zarba, R. L., 1990. A general mass-conservative numerical solution for the unsaturated flowequation. Water Resources Research 26 (7), 1483–1496. → pages xvi, 13, 60, 64, 65, 66, 71, 73, 74, 82, 84, 85, 86Claerbout, J., Muir, F., 1973. Robust modeling with erratic data. Geophysics 38, 826–844. → pages 34Cockett, R., 2012. Visible Geology - Interactive online geologic block modelling. In: AGU Fall Meeting. → pages 198Cockett, R., 2013. A Interactive Online Geologic Block Modeling. In: AAPG Hedberg Research Conference - 3D StructuralGeologic Interpretation: Earth, Mind, and Machine. AAPG, Reno. → pages 198Cockett, R., 2015. Developing, deploying and reflecting on a web-based geologic simulation tool. In: AGU Fall Meeting. →pages 198Cockett, R., 2017. Richards equation comparison to Celia 1990. FigShare →pages 84Cockett, R., Funning, G. J., Pratt-Sitaula, B., 2014a. Visible Earthquakes : An open visualization and modeling platform forInSAR earthquake data. In: Southern California Earthquake Center - Annual Meeting. → pages 198Cockett, R., Haber, E., 2013a. A Numerical Method for Large Scale Estimation of Distributed Hydraulic Conductivity fromRichards Equation. In: AGU Fall Meeting. → pages v, 65, 174Cockett, R., Haber, E., 2013b. Inversion for hydraulic conductivity using the unsaturated flow equations. In: SIAM Conferenceon Mathematical & Computational Issues in the Geosciences. → pages v, 65Cockett, R., Haber, E., 2015. Richards Equation Inversion. FigShare → pages 91Cockett, R., Heagy, L. J., Haber, E., 2017. A numerical method for efficient 3D inversions using Richards equation. arXiv. →pages v, vi, vii, 65, 125, 127Cockett, R., Heagy, L. J., Kang, S., Oldenburg, D. W., 2015a. Coupling geophysical terminology and package development. In:SciPy. → pages vCockett, R., Heagy, L. J., Oldenburg, D. W., 2016a. Pixels and their neighbors: Finite volume. The Leading Edge 35 (8),703–706. → pages vi, 66, 68, 144, 174Cockett, R., Heagy, L. J., Rosenkjaer, G., Kang, S., 2015b. Development practices and lessons learned in developing SimPEG.In: AGU Fall Meeting. → pages vCockett, R., Kang, S., Heagy, L. J., Pidlisecky, A., Haber, E., Oldenburg, D. W., 2014b. SimPEG: A framework for simulationand parameter estimation in geophysics. In: SciPy. Austin. → pages v, 174Cockett, R., Kang, S., Heagy, L. J., Pidlisecky, A., Oldenburg, D. W., 2015c. SimPEG: An open source framework for simulationand gradient based parameter estimation in geophysical applications. Computers & Geosciences 85, 142–154. → pages v,65, 66, 74, 125, 144, 145, 147, 185, 190Cockett, R., Moran, T., Pidlisecky, A., 2016b. Visible Geology: Creative online tools for teaching, learning, and communicatinggeologic concepts. In: Krantz, R., Ormand, C. J., Freeman, B. (Eds.), Earth, Mind, and Machine: 3D StructuralInterpretation, American Association of Petroleum Geologists Hedberg Series. No. 6. → pages vi, 198Cockett, R., Pidlisecky, A., mar 2014. Simulated electrical conductivity response of clogging mechanisms for managed aquiferrecharge. Geophysics 79 (2), D81–D89. → pages vi, 91, 92131Constable, S. C., Parker, R. L., Constable, C. G., 1987. Occam’s inversion: A practical algorithm for generating smooth modelsfrom electromagnetic sounding data. Geophysics 52 (3), 289–300. → pages 22, 76Daily, W., Ramirez, A., LaBrecque, D., Nitao, J., 1992. Electrical resistivity tomography of vadose water movement. WaterResources Research 28 (5), 1429–1442. → pages 1Dean, O. S., Chen, Y., 2011. Recent progress on reservoir history matching: a review. Computational Geosciences 15, 185–221.→ pages 61, 79, 81Dean, O. S., Reynolds, A. C., Liu, N., 2008. Inverse Theory for Petroleum Reservoir Characterization and History Matching.Cambridge University Press, Cambridge. → pages 61Deiana, R., Cassiani, G., Kemna, A., Villa, A., Bruno, V., Bagliani, A., 2007. An experiment of non-invasive characterization ofthe vadose zone via water injection and cross-hole time-lapse geophysical monitoring. Near Surface Geophysics, 183–194.→ pages 14, 61, 62, 88, 90Devi, G. K., Ganasri, B. P., Dwarakish, G. S., 2015. A Review on Hydrological Models. International Conference On WaterResources, Coastal and Ocean Engineering (ICWRCOE’15) 4 (Icwrcoe), 1001–1007. → pages 7Devriese, S. G. R., Davis, K., Oldenburg, D. W., aug 2017. Inversion of airborne geophysics over the DO-27/DO-18 kimberlites- Part 1: Potential fields. Interpretation 5 (3), T299–T311. → pages 3, 10Doetsch, J., Linde, N., Coscia, I., Greenhalgh, S. A., Green, A. G., 2010. Zonation for 3D aquifer characterization based on jointinversions of multimethod crosshole geophysical data 75 (6). → pages 22Doherty, J., 2004. PEST: Model independent parameter estimation. Fifth edition of user manual. Watermark NumericalComputing. → pages 11, 63, 74, 77, 143, 184Duff, I. S., Erisman, A. M., Reid, J. K., 1986. Direct methods for sparse matrices. Clarendon press, Oxford. → pages 48Durner, W., 1994. Hydraulic conductivity estimation for soils with heterogeneous pore structure. Water Resources Research30 (2), 211. → pages 62Egbert, G. D., Kelbert, A., apr 2012. Computational recipes for electromagnetic inverse problems. Geophysical JournalInternational 189 (1), 251–267. → pages 147Ekblom, H., 1973. Calculation of linear best Lp-approximations. BIT Numerical Mathematics 13 (3), 292–300. → pages 33Farquharson, C., 1998. Nonlinear inversion using general measures of data misfit and model structure. Geophysical Journal 134,213–227. → pages 33Farquharson, C. G., Oldenburg, D. W., mar 2004. A comparison of automatic techniques for estimating the regularizationparameter in non-linear inverse problems. Geophysical Journal International 156 (3), 411–425. → pages 36Feller, J., Fitzgerald, B., 2000. A Framework Analysis of the Open Source Software Development Paradigm. In: Proceedings ofthe Twenty First International Conference on Information Systems. ICIS ’00. Association for Information Systems, Atlanta,GA, USA, pp. 58–69. → pages 24Fensel, D., Motta, E., Decker, S., Zdrahal, Z., 1997. Using ontologies for defining tasks, problem-solving methods and theirmappings. Lecture Notes in Computer Science 1319, 113–128. → pages 141Ferre´, T., Bentley, L., Binley, A., Linde, N., Kemna, A., Singha, K., Holuger, K., Huisman, J. A., Minsley, B., 2009. Criticalsteps for the continuing advancement of hydrogeophysics. Eos 90 (23), 200. → pages 6, 7, 120Finsterle, S., Kowalsky, M. B., 2008. Joint HydrologicalGeophysical Inversion for Soil Structure Identification. Vadose ZoneJournal 7 (1), 287. → pages 6, 120Finsterle, S., Kowalsky, M. B., jun 2011. A truncated LevenbergMarquardt algorithm for the calibration of highly parameterizednonlinear models. Computers & Geosciences 37 (6), 731–738. → pages 62, 63, 76Finsterle, S., Zhang, Y., jul 2011. Solving iTOUGH2 simulation and optimization problems using the PEST protocol.Environmental Modelling & Software 26 (7), 959–968. → pages 15, 62, 63, 76132Fletcher, C. A. J., 1988. Computational Techniques for Fluid Dynamics. Vol. II. Springer-Verlag. → pages 69Fournier, D., Kang, S., McMillan, M. S., Oldenburg, D. W., aug 2017. Inversion of airborne geophysics over the DO-27/DO-18kimberlites - Part 2: Electromagnetics. Interpretation 5 (3), T397–T409. → pages 3, 10Fournier, D., Oldenburg, D., Davis, K., sep 2016. Robust and flexible mixed-norm inversion. In: SEG Technical ProgramExpanded Abstracts 2016. No. 5. Society of Exploration Geophysicists, pp. 1542–1547. → pages 127Fullagar, P. K., Pears, G. A., McMonnies, B., 2008. Constrained inversion of geologic surfaces pushing the boundaries. TheLeading Edge 27 (1), 98–105. → pages 25Funning, G. J., Cockett, R., 2012. Visible Earthquakes: a web-based tool for visualizing and modeling InSAR earthquake data.In: AGU Fall Meeting. American Geophysical Union, Fall Meeting 2012, abstract #ED43E-06. → pages 198Gao, G., Abubakar, A., Habashy, T., 2012. Joint petrophysical inversion of electromagnetic and full-waveform seismic data.GEOPHYSICS 77 (3), WA3–WA18. → pages 22Golub, G., Heath, M., Wahba, G., 1979. Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter.Technometrics 21, 215–223. → pages 36Gruber, T. R., 1993. A Translation Approach to Portable Ontology Specifications. Knowledge acquisition 5 (2), 199–220. →pages 140Gunzburger, M. D., 2003. Prespectives in flow control and optimization. SIAM. → pages 63Guyer, J. E., Wheeler, D., Warren, J. A., 2009. FiPy : Partial Differential (September 2014). → pages 171Haber, E., 2015. Computational Methods in Geophysical Electromagnetics. Mathematics in industry. → pages vi, 30, 34, 39, 40,68, 76, 84, 144, 154, 172, 188Haber, E., Ascher, U., Oldenburg, D., 2000. On Optimization Techniques for Solving Nonlinear Inverse Problems. Inverseproblems 16, 1263–1280. → pages 15, 40, 62, 63, 77Haber, E., Ascher, U., Oldenburg, D., 2004. Inversion of 3D electromagnetic data in frequency and time domain using an inexactall-at-once approach. Geophysics 69, 1216–1228. → pages 77Haber, E., Ascher, U. M., jan 2001. Fast Finite Volume Simulation of 3D Electromagnetic Problems with Highly DiscontinuousCoefficients. SIAM Journal on Scientific Computing 22 (6), 1943–1961. → pages 69, 70, 147Haber, E., Gazit, M. H., 2013. Model Fusion and Joint Inversion. Reviews in Geophysics To Appear, 35–54. → pages 5, 10, 14,61Haber, E., Heldmann, S., 2007. An OcTree Multigrid Method for Quasi-Static Maxwell’s Equations with Highly DiscontinuousCoefficients. Journal of Computational Physics 65, 324–337. → pages 3, 31, 46, 146Haber, E., Heldmann, S., Ascher, U., 2007. Adaptive finite volume method for the solution of discontinuous parameterestimation problems. Inverse Problems. → pages 69Haber, E., Oldenburg, D., 1997. Joint Inversion A Structural Approach. Inverse Problems 13, 63–67. → pages 14, 22, 61Haber, E., Oldenburg, D. W., 2000. A GCV based method for nonlinear ill-posed problems. Computational Geosciences 4 (1),41–63. → pages 36Haber, E., Schwarzbach, C., 2014. Parallel inversion of large-scale airborne time-domain electromagnetic data with multipleOcTree meshes. Inverse Problems 30 (5), 55011. → pages 3, 31Hansen, P. C., 1992. Analysis of discrete ill-posed problems by means of the L-curve. SIAM review 34 (4), 561–580. → pages36Hansen, P. C., 1998. Rank-deficient and discrete ill-posed problems: numerical aspects of linear inversion. Vol. 4. Siam. →pages 36133Hansen, T. M., Cordua, K. S., Looms, M. C., Mosegaard, K., mar 2013. SIPPI: A Matlab toolbox for sampling the solution toinverse problems with complex prior information. Computers & Geosciences 52, 481–492. → pages 23Harbaugh, A. W., 2005. The U.S. Geological Survey modular ground-water model the Ground-Water Flow Process. U.S.Geological Survey Techniques and Methods (6), A16. → pages 23Harder, M., Scott Smith, B. H., Hetman, C. M., Pell, J., 2009. The evolution of geological models for the DO-27 kimberlite,NWT, Canada: Implications for evaluation. Lithos 112, 61–72. → pages 7Haverkamp, R., M. Vauclin, J. Touma, P.J. Wierenga, Vachaud, G., 1977. A Comparison of Numerical Simulation Models ForOne-Dimensional Infiltration. Soil Sci. Soc. Am. J. 41, 285–294. → pages 84, 92Heagy, L. J., Cockett, A. R., Oldenburg, D. W., aug 2014. Parametrized Inversion Framework for Proppant Volume in aHydraulically Fractured Reservoir. In: SEG Technical Program Expanded Abstracts 2014. Society of ExplorationGeophysicists, pp. 865–869. → pages vi, 24, 51, 55, 125, 144, 174, 195Heagy, L. J., Cockett, R., Kang, S., Rosenkjaer, G., Oldenburg, D. W., 2015a. simpegEM: An open-source resource forsimulation and parameter estimation problems in electromagnetic geophysics. In: AGU Fall Meeting. → pages 174Heagy, L. J., Cockett, R., Kang, S., Rosenkjaer, G. K., Oldenburg, D. W., 2016. A framework for simulation and inversion inelectromagnetics. Computers and Geosciences. → pages vi, vii, 91, 125, 127, 147, 174, 185, 187, 188, 190, 196, 199, 200,201Heagy, L. J., Cockett, R., Oldenburg, D. W., 2015b. Using Python to span the gap between education, research, and industryapplications in geophysics. In: SciPy. → pages 145, 174Heagy, L. J., Cockett, R., Oldenburg, D. W., 2017. Modular electromagnetic simulations with applications to steel cased wells.In: Proceedings of the Sixth International Symposium in Three-Dimensional Electromagnetics. pp. 125–129. → pages vii,125Heagy, L. J., Cockett, R., Oldenburg, D. W., Wilt, M., aug 2015c. Modelling electromagnetic problems in the presence of casedwells. In: SEG Technical Program Expanded Abstracts 2015. Society of Exploration Geophysicists, pp. 699–703. → pagesvi, 125, 144, 174, 188Heagy, L. J., Cockett, R., Oldenburg, D. W., Wilt, M., 2015d. Modelling electromagnetic problems in the presence of casedwells, 2–6. → pages 24Heagy, L. J., Oldenburg, D. W., Geophysical, U. B. C., Facility, I., Columbia, B., 2013. Investigating the potential of usingconductive or permeable proppant particles for hydraulic fracture characterization. SEG Houston 2013 Annual Meeting i,576–580. → pages 195Hewett, R. J., Demanet, L., the PySIT Team, 2013. PySIT: Python Seismic Imaging Toolbox. → pages 23Hillier, M. J., Schetselaar, E. M., de Kemp, E. A., Perron, G., 2014. Three-Dimensional Modelling of Geological Surfaces UsingGeneralized Interpolation with Radial Basis Functions. Mathematical Geosciences 46 (8), 931–953. → pages 10, 198Hindmarsh, A. C., Brown, P. N., Grant, K. E., Lee, S. L., Serban, R., Shumaker, D. E., Woodward, C. S., sep 2005. SUNDIALS:Suite of Nonlinear and Differential/Algebraic Equation Solvers. ACM Transactions on Mathematical Software 31 (3),363–396. → pages 172Hinnell, A. C., Ferre´, T. P. a., Vrugt, J. a., Huisman, J. a., Moysey, S., Rings, J., Kowalsky, M. B., apr 2010. Improved extractionof hydrologic information from geophysical data through coupled hydrogeophysical inversion. Water Resources Research46 (4), W00D40. → pages 5, 6, 10, 14, 61, 89, 90, 120, 121Holtham, E., Oldenburg, D. W., 2010. Three-dimensional inversion of MT and ZTEM data SEG Denver 2010 Annual MeetingSEG Denver 2010 Annual Meeting (2), 655–659. → pages 55Hunter, J. D., 2007. Matplotlib: A 2D graphics environment. Computing In Science & Engineering 9 (3), 90–95. → pages 24Hyman, J., Morel, J., Shashkov, M., Steinberg, S., 2002. Mimetic finite difference methods for diffusion equations .Computational Geosciences, 333–352. → pages 45, 46, 145134Hyman, J. M., Shashkov, M., dec 1997. Adjoint operators for the natural discretizations of the divergence, gradient and curl onlogically rectangular grids. Applied Numerical Mathematics 25 (4), 413–442. → pages 144Hyman, J. M., Shashkov, M., 1999. Mimetic Discretizations for Maxwell’s Equations 909, 881–909. → pages 45, 144, 145Iden, S. C., Durner, W., jul 2007. Free-form estimation of the unsaturated soil hydraulic properties by inverse modeling usingglobal optimization. Water Resources Research 43 (7), W07451. → pages 63Irving, J., Singha, K., nov 2010. Stochastic inversion of tracer test and electrical geophysical data to estimate hydraulicconductivities. Water Resources Research 46 (11), W11514. → pages 14Jahandari, H., Ansari, S., Farquharson, C. G., jan 2017. Comparison between staggered grid finitevolume and edgebasedfiniteelement modelling of geophysical electromagnetic data on unstructured grids. Journal of Applied Geophysics. → pages147, 178Jardani, A., Revil, A., Dupont, J., feb 2013. Stochastic joint inversion of hydrogeophysical data for salt tracer test monitoringand hydraulic conductivity imaging. Advances in Water Resources 52, 62–77. → pages 14Jasak, H., Jemcov, A., Tukovic, Z., 2007. OpenFOAM : A C ++ Library for Complex Physics Simulations. InternationalWorkshop on Coupled Methods in Numerical Dynamics m, 1–20. → pages 172Jones, E., Oliphant, T., Peterson, P., Others, 2001. SciPy: Open source scientific tools for Python. → pages 23, 24, 46, 55, 142Kaipio, J., Somersalo, E., 2004. Statistical and Computational Inverse Problems. Springer Verlag. → pages 62Kang, S., Cockett, R., Heagy, L. J., 2014. Moving between dimensions in electromagnetic inversions with a consistentframework. In: AGU Fall Meeting. → pages 24, 145, 174, 199Kang, S., Cockett, R., Heagy, L. J., Oldenburg, D. W., aug 2015a. Moving between dimensions in electromagnetic inversions. In:SEG Technical Program Expanded Abstracts 2015. No. 2. Society of Exploration Geophysicists, pp. 5000–5004. → pagesvi, 24, 125, 144, 174, 182, 190, 198Kang, S., Cockett, R., Heagy, L. J., Oldenburg, D. W., 2015b. Moving between dimensions in electromagnetic inversions (2). →pages 24, 56Kang, S., Fournier, D., Oldenburg, D. W., aug 2017a. Inversion of airborne geophysics over the DO-27/DO-18 kimberlites - Part3: Induced polarization. Interpretation 5 (3), T345–T358. → pages 3, 10, 125Kang, S., Heagy, L. J., Cockett, R., Oldenburg, D. W., 2017b. Exploring nonlinear inversions : A 1D magnetotelluric example.The Leading Edge (August), 696–699. → pages viiKang, S., Oldenburg, D. W., jan 2015. Recovering IP information in airborne-time domain electromagnetic data. ASEGExtended Abstracts 2015 (1), 1–4. → pages 24Kang, S., Oldenburg, D. W., 2016. On recovering distributed IP information from inductive source time domain electromagneticdata (In Review). Geophysical Journal International. → pages vii, 125, 127, 145Kelbert, A., Meqbel, N., Egbert, G. D., Tandon, K., feb 2014. ModEM: A Modular System for Inversion of ElectromagneticGeophysical Data. Computers & Geosciences. → pages 22, 23, 147Ketcheson, D. I., Mandli, K., Ahmadia, A. J., Alghamdi, A., de Luna, M. Q., Parsani, M., Knepley, M. G., Emmett, M., jan2012. PyClaw: Accessible, Extensible, Scalable Tools for Wave Propagation Problems. SIAM Journal on ScientificComputing 34 (4), C210–C231. → pages 171Knight, R., Irving, J., van der Kruk, J., jan 2013. Studying Hydrological Properties and Processes at Scales From Centimeters toWatersheds. Eos, Transactions American Geophysical Union 94 (2), 21–21. → pages 7Lelie`vre, P. G., Oldenburg, D. W., Williams, N. C., 2009. Integrating geological and geophysical data through advancedconstrained inversions. Exploration Geophysics 40 (4), 334–341. → pages 22, 25, 28LeVeque, R. J., 1997. Wave Propagation Algorithms for Multidimensional Hyperbolic Systems. Journal of ComputationalPhysics 131 (2), 327–353. → pages 171135Li, M., Abubakar, A., Habashy, T. M., Zhang, Y., 2010. Inversion of controlled-source electromagnetic data using a model-basedapproach. Geophysical Prospecting 58 (3), 455–467. → pages 28, 182Li, X. S., sep 2005. An overview of SuperLU. ACM Transactions on Mathematical Software 31 (3), 302–325. → pages 48Li, Y., Key, K., 2007. 2D marine controlled-source electromagnetic modeling: Part 1 An adaptive finite-element algorithm.GEOPHYSICS 72 (2), WA51–WA62. → pages 22Li, Y., Oldenburg, D., 2000a. Incorporating geological dip information into geophysical inversions. GEOPHYSICS 65 (1),148–157. → pages 22, 25, 34Li, Y., Oldenburg, D. W., 1996. 3D Inversion of magnetic data. Geophysics 61, 394–408. → pages 22, 34, 147, 182Li, Y., Oldenburg, D. W., jan 1998. 3-D inversion of gravity data. Geophysics 63 (1), 109–119. → pages 22, 182Li, Y., Oldenburg, D. W., 2000b. Joint inversion of surface and three-component borehole magnetic data. Geophysics 65 (2),540–552. → pages 34Liang, L., Abubakar, A., Habashy, T., 2012. Joint inversion of controlled-source electromagnetic and production data forreservoir monitoring. GEOPHYSICS 77 (5), ID9–ID22. → pages 5Liang, W.-L., Uchida, T., mar 2014. Effects of topography and soil depth on saturated-zone dynamics in steep hillslopesexplored using the three-dimensional Richards’ equation. Journal of Hydrology 510, 124–136. → pages 61Lin, J. W.-B., dec 2012. Why Python Is the Next Wave in Earth Sciences Computing. Bulletin of the American MeteorologicalSociety 93 (12), 1823–1824. → pages 42Linde, N., Doetsch, J., apr 2016. Joint Inversion in Hydrogeophysics and Near-Surface Geophysics. pp. 117–135. → pages 4, 5,6, 7, 11, 14, 19, 62, 120, 122Linde, N., Renard, P., Mukerji, T., Caers, J., 2015. Geological realism in hydrogeological and geophysical inverse modeling: Areview. Advances in Water Resources 86, 86–101. → pages 10, 76Lines, L. R., Schultz, A. K., Treitel, S., 1988. Cooperative inversion of geophysical data. Geophysics 53 (1), 8–20. → pages 41,55Lipnikov, C., Misitas, O., 2013. Second-order accurate monotone finite volume scheme for Richards’ equation. J. Comp. Phys239, 123–137. → pages 68Liu, Y., Gupta, H. V., 2007. Uncertainty in hydrologic modeling: Toward an integrated data assimilation framework. WaterResources Research 43 (7), 1–18. → pages 7, 13Ma, X., 2011. Ontology Spectrum for Geological Data Interoperability. Ph.D. thesis. → pages 140Mawer, C., Kitanidis, P., Pidlisecky, A., Knight, R., 2013. Electrical Resisivity for Characterization and Infiltration Monitoringbeneath a Managed Aquifer Recharge Pond. Vadose Zone Journal 12 (1), 1–20. → pages 98, 109, 111McDonald, M. G., Harbaugh, A. W., mar 2003. The History of MODFLOW. Ground Water 41 (2), 280–283. → pages 147McMillan, M. S., Oldenburg, D. W., 2014. Cooperative constrained inversion of multiple electromagnetic data sets. Geophysics79 (4), B173–B185. → pages 10McMillan, M. S., Schwarzbach, C., Haber, E., Oldenburg, D. W., 2015. 3D parametric hybrid inversion of time-domain airborneelectromagnetic data. Geophysics 80 (6), K25–K36. → pages 182, 183Mendelson, K. S., Cohen, M. H., feb 1982. The effect of grain anisotropy on the electrical properties of sedimentary rocks.GEOPHYSICS 47 (2), 257–263. → pages 5Miller, C. A., Williams-Jones, G., Fournier, D., Witter, J., 2017. 3D gravity inversion and thermodynamic modelling revealproperties of shallow silicic magma reservoir beneath Laguna del Maule, Chile. Earth and Planetary Science Letters 459,14–27. → pages vii, 125, 127136Mualem, Y., 1976. A new model for predicting the hydraulic conductivity of unsaturated porous media. Water ResourcesResearch 12 (3), 513. → pages 62, 66, 67, 92, 93Nocedal, J., Wright, S. J., 1999. Numerical Optimization. Springer, New York, NY. → pages 37, 55, 73, 77Noy, N. F., McGuinness, D. L., 2002. Ontology Development 101: A Guide to Creating Your First Ontology. In: StanfordMedical Informatics Report. Stanford University, Stanford, CA. → pages 140Olah, C., Carter, S., 2017. Research Debt. Distill 1 (1), 1–5. → pages 128Oldenburg, D. W., 1984. An introduction to linear inverse theory. Geoscience and Remote Sensing, IEEE Transactions on (6),665–674. → pages 34Oldenburg, D. W., 2016. Looking back and moving forward. Banff International Research Station 16 (2695). → pages 3, 9, 123Oldenburg, D. W., Li, Y., 2005. 5. Inversion for Applied Geophysics: A Tutorial. Ch. 5, pp. 89–150. → pages 7, 22, 32, 34, 36,40, 56, 76, 182Oliphant, T. E., 2007. Python for Scientific Computing. Computing in Science & Engineering 9 (3). → pages 24, 46Oliver, Y. A. D., Reynolds, A., 2001. Efficient History-Matching Using Subspace Vectors. Computational Geosciences 5,151–172. → pages 61Ollivier-Gooch, C., Van Altena, M., sep 2002. A High-Order-Accurate Unstructured Mesh Finite-Volume Scheme for theAdvectionDiffusion Equation. Journal of Computational Physics 181 (2), 729–752. → pages 46, 147Orgogozo, L., Renon, N., Soulaine, C., He´non, F., Tomer, S., Labat, D., Pokrovsky, O., Sekhar, M., Ababou, R., Quintard, M.,2014. An open source massively parallel solver for Richards equation: Mechanistic modelling of water fluxes at thewatershed scale. Computer Physics Communications 185 (12), 3358–3371. → pages 14, 15, 61, 73Paasche, H., Tronicke, J., 2007. Cooperative inversion of 2D geophysical data sets: A zonal approach based on fuzzy c-meanscluster analysis. Geophysics 72 (3), A35. → pages 6, 121Park, S., 1998. Fluid migration in the vadose zone from 3-D inversion of resistivity monitoring data. Geophysics 63 (1), 41–51.→ pages 1Parker, R., 1994. Geophysical Inverse Theory. Princeton University Press, Princeton NJ. → pages 32, 36, 56Parker, R. L., may 1977. Understanding Inverse Theory. Annual Review of Earth and Planetary Sciences 5 (1), 35–64. → pages36Peckham, S. D., Hutton, E. W., Norris, B., apr 2013. A component-based approach to integrated modeling in the geosciences:The design of CSDMS. Computers & Geosciences 53, 3–12. → pages 141Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R.,Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E., 2011. Scikit-learn:Machine Learning in Python. Journal of Machine Learning Research 12, 2825–2830. → pages 124Pe´rez, F., Granger, B. E., may 2007. IPython: a System for Interactive Scientific Computing. Computing in Science andEngineering 9 (3), 21–29. → pages 47Pidlisecky, A., Cockett, a. R., Knight, R., 2013. Electrical Conductivity Probes for Studying Vadose Zone Processes: Advancesin Data Acquisition and Analysis. Vadose Zone Journal 12 (1), 1–12. → pages vi, 5, 14, 91, 92, 111, 114, 119, 147, 199Pidlisecky, A., Haber, E., Knight, R., 2007. RESINVM3D : A 3D resistivity inversion package. Geophysics 72 (2), H1—-H10.→ pages 47, 52, 152, 182Pidlisecky, A., Singha, K., Day-Lewis, F. D., 2011. A distribution-based parametrization for improved tomographic imaging ofsolute plumes. Geophysical Journal International 187, 214–224. → pages 40, 182, 183Pollock, D., Cirpka, O. a., jan 2012. Fully coupled hydrogeophysical inversion of a laboratory salt tracer experiment monitoredby electrical resistivity tomography. Water Resources Research 48 (1), W01505. → pages 5, 7137Porwal, A. K., Kreuzer, O. P., 2010. Introduction to the Special Issue: Mineral prospectivity analysis and quantitative resourceestimation. Ore Geology Reviews 38 (3), 121–127. → pages 7Racz, A. J., Fisher, A. T., Schmidt, C. M., Lockwood, B. S., Huertos, M. L., nov 2011. Spatial and Temporal InfiltrationDynamics During Managed Aquifer Recharge. Ground water, 1–9. → pages 1Richards, L. A., 1931. Capillary conduction of liquids through porous mediums. Journal of Applied Physics 1 (5), 318–333. →pages 13, 60, 65Rosenkjaer, G., Cumming, W., Christopherson, K., 2016. The CSIMT method combining natural MT signals and cultural noiseas sources. SEG Technical Program Expanded Abstracts 2016, 889–894. → pages vii, 125, 127, 145, 174, 200Rosenkjaer, G., Heagy, L. J., Cockett, R., 2015a. Practical integration of processing, inversion and visualization ofmagnetotelluric geophysical data. In: SciPy. → pages 174Rosenkjaer, G., Heagy, L. J., Cockett, R., Kang, S., Oldenburg, D., 2015b. Using simpegMT to demonstrate a modularmodelling and inversion framework applied in a workflow for a hydrothermal system. In: AGU Fall Meeting. → pages 145Ru¨cker, C., Gu¨nther, T., Wagner, F. M., 2017. pyGIMLi: An open-source library for modelling and Inversion in geophysics.Computers & Geosciences. → pages 10, 23Ruthotto, L., Treister, E., Haber, E., 2016. jInv – a flexible Julia package for PDE parameter estimation. arXiv, 1–19. → pages10, 23, 172Saad, Y., 1996. Iterartive Methods for Sparse Linear Systems. PWS Publishing Company. → pages 73Sarma, P., Durlofsky, L. J., Aziz, K., Chen, W., 2007. A New Approach to Automatic History Matching using Kernel PCA. SPEReservoir Simulation Symposium, Houston, Texas. → pages 61Schaap, M. G., Leij, F. J., 2000. Improved Prediction of Unsaturated Hydraulic Conductivity with the Mualem-van GenuchtenModel. Soil Science Society of America Journal 64 (3), 843–851. → pages 92Schenk, O., Gartner, K., 2004. Solving unsymmetric sparse systems of linear equations with PARDISO. Future GenerationComputer Systems 20 (3), 475–487. → pages 48Sharman, R., Kishore, R., Rames, R., 2004. Computational Ontologies and Information Systems: II. Formal Specification.Communications of AIS 2004 (14), 184–205. → pages 140, 141, 142Sˇimunek, J., van Genuchten, M. T., 1996. Estimating Unsaturated Soil Hydraulic Properties from Tension Disc InfiltrometerData by Numerical Inversion. Water Resources Research 32 (9), 2683. → pages 62, 63, 76Sˇimunek, J., van Genuchten, M. T., Senja, M., 2012. HYDRUS: MODEL USE, CALIBRATION, AND VALIDATION.American Society of Agricultural and Biological Engineers 55 (4), 1261–1274. → pages 5, 61, 63Sˇimunek, J., Wendroth, O., van Genuchten, M. T., 1998. Parameter Estimation Analysis of the Evaporation Method forDetermining Soil Hydraulic Properties. Soil Science Society of America Journal 62, 894–905. → pages 95Stein, L. D., 2008. Towards a cyberinfrastructure for the biological sciences: progress, visions and challenges. Nature reviews.Genetics 9 (9), 678–688. → pages 141Strong, D., Chan, T., 2003. Edge-preserving and scale-dependent properties of total variation regularization. Inverse problems19 (6), S165. → pages 34Sun, J., Li, Y., Studies, M., 2012. Joint inversion of multiple geophysical data : A petrophysical approach using guided fuzzyc-means clustering SEG Las Vegas 2012 Annual Meeting SEG Las Vegas 2012 Annual Meeting (4), 1–5. → pages 6, 121Tarantola, A., 2005. Inverse Problem Theory and Methods for Model Parameter Estimation. Society for Industrial and AppliedMathematics. → pages 21Tarantola, A., Valette, B., 1982. Generalized nonlinear inverse problems solved using the least squares criterion. Reviews ofGeophysics 20 (2), 219–232. → pages 21138Tikhonov, A., Arsenin, V., 1977. Solutions of Ill-Posed Problems. W.H. Winston and Sons. → pages 36, 56, 62, 76Towara, M., Schanen, M., Naumann, U., 2015. MPI-Parallel Discrete Adjoint OpenFOAM. Procedia Computer Science 51 (1),19–28. → pages 5, 14, 15, 61, 62, 63, 77Trottenberg, U., Oosterlee, C., Schuller, A., 2001. Multigrid. Academic Press. → pages 70Uieda, L., Jr, V. C. O., Ferreira, A., dos Santos, H. B., Jr., J. F. C., 2014. Fatiando a Terra: a Python package for modeling andinversion in geophysics. → pages 23van Genuchten, M. T., 1980. A Closed-form Equation for Predicting the Hydraulic Conductivity of Unsaturated Soils. SoilScience Society of America Journal 44 (5), 892. → pages 66, 92Van Genuchten, M. T., Leij, F. J., Yates, S. R., Others, 1991. The RETC code for quantifying the hydraulic functions ofunsaturated soils. → pages xiv, 94Van Rossum, G., Drake Jr, F. L., 1995. Python reference manual. Centrum voor Wiskunde en Informatica Amsterdam. → pages24, 42Vogel, C., 2001. Computational methods for inverse problem. SIAM, Philadelphia. → pages 73, 74Wahba, G., 1990. Spline Models for Observational Data. SIAM, Philadelphia. → pages 36Williams, N. C., 2008. Geologically-constrained UBCGIF Gravity And Magnetic Inversions With Examples From TheAgnew-Wiluna Greenstone Belt, Western Australia. Phd, University of British Columbia. → pages 10, 182Wilson, G., Aruliah, D. a., Brown, C. T., Chue Hong, N. P., Davis, M., Guy, R. T., Haddock, S. H. D., Huff, K. D., Mitchell,I. M., Plumbley, M. D., Waugh, B., White, E. P., Wilson, P., jan 2014. Best practices for scientific computing. PLoS biology12 (1), e1001745. → pages 24, 42Yang, D., Oldenburg, D. W., Haber, E., mar 2014. 3-D inversion of airborne electromagnetic data parallelized and accelerated bylocal mesh and adaptive soundings. Geophysical Journal International 196 (3), 1492–1507. → pages 3, 31Yee, K. S., 1966. Numerical solution of initial boundary value problems involving Maxwell’s equations in isotropic media. IEEETrans. on antennas and propagation 14, 302–307. → pages 144, 145Zhdanov, M. S., 2002. Preface. In: Zhdanov, M. S. (Ed.), Geophysical Inverse Theory and Regularization Problems. Vol. 36 ofMethods in Geochemistry and Geophysics. Elsevier, pp. XIX – XXIII. → pages 34139Appendix AFrameworks and ontologiesA computational science framework provides a set of standards such that individualscientists can contribute software components, in this case components used in simu-lation or inversion routines, with the confidence that those components will work withother components in that framework. As such, the standards of the framework definethe responsibilities of each component (or class of component) and the required in-terfaces of all components. The term framework is commonly used in the softwarecommunity, however, the formal term for the component organization and their prop-erties is an ontology. The use of ontologies in the sciences to formally describe domainknowledge has exploded in recent decades, especially in domains of artificial intelli-gence, chemistry, and biology, but also more recently in the geosciences (Sharmanet al., 2004; Ma, 2011). A computational ontology (rather than the underlying dis-cipline of philosophical ontology) is a “formal explicit specification of a shared con-ceptualization” (Gruber, 1993). Noy and McGuinness (2002) summarizes the purposeof computational ontologies as enabling a shared understanding of the structure of in-formation and systematically enabling knowledge and information reuse. Practically140ontologies can (a) provide access and discoverability to heterogeneous information;(b) act as a common language to lower the barrier to transfer of ideas; and (c) act asa specification for interoperability, for example, as a communication protocol or ap-plication programming interface (Sharman et al., 2004). The techniques for buildingontologies amount to capturing, synthesizing, organizing, and digitizing the relation-ships between concepts, conceptual inheritance patterns, and behaviour. Ontologiesare most commonly used for storing and organizing data, for example, connecting ge-netic data with phenotypic data in bioinformatics. However, ontologies are also usedin defining tasks, workflows, and problem solving methods (Fensel et al., 1997; Bardand Rhee, 2004). In more mature interdisciplinary fields, this research is becomingcore to scientists’ day to day research; for example, Stein (2008) notes that “all cur-rent biomedical cyberinfrastructure efforts use ontologies.” As a result of successesin other fields, geoscience integration is currently the target of major funding initia-tives across the world (e.g. EarthCube - 11 year NSF project, $35M in 2015; CIMICFootprints Project - NSERC Project, 24 Universities, 30 Industry, $13M). Many of thecurrent efforts are focused on computational science frameworks, formally describinggeoscientific data (using ontologies), and formally describing methods of integratingdisciplines. For example, a Common Component Architecture for high performancescientific computing (Armstrong et al., 1999) has been used as the basis for coupledforward integration of a number of geoscience simulation tools written by differentauthors (Peckham et al., 2013). The research into these domain specific standards forinteroperability is critical for sustainable interdisciplinary research.The growth in complexity of geophysical data and analysis and the necessity forcross-disciplinary integrations is also coincident with the revolution of open sourcesoftware communities, largely enabled through web-based interactions. Other re-141search communities, for example Astropy in astronomy and SciPy in numericalcomputing, have embraced the open source approach for collaboration and research(Astropy Collaboration et al., 2013; Jones et al., 2001). These pioneering efforts arenow complemented by easy-to-use, ubiquitous web-based repositories and version-control systems (e.g. GitHub), that have removed many of the barriers associatedwith management and collaboration. The growth of such systems, coupled with thematurity of individual geophysical subdisciplines (e.g. potential fields, electromag-netics), presents an opportunity to develop a computational framework and associatedontology for geophysical simulation and inversion. An ontology is an embodimentof concepts, relationships, and behaviours in a specific scientific domain and can be(a) captured in special purpose languages (e.g. Web Ontology Language, ResourceDescription Framework), or (b) captured in general purpose computer programminglanguages (e.g. Python, Java, C++) (Sharman et al., 2004). To research a geophysicalsimulation and inversion framework, I have chosen the latter approach for the pur-poses of utility, testing, and creating a framework/ontology that can be openly usedand evolved by the geoscience community.142Appendix BFinite volume techniquesB.1 IntroductionInverse problems are common across the geosciences: for example, in geophysicalimaging, history matching, and parameter estimation. Many of these inverse problemsrequire constrained optimization using partial differential equations (PDEs), which re-quires derivatives with respect to mesh variables in addition to simulation of the PDE.Finite difference, finite element, and finite volume techniques allow subdivision ofcontinuous differential equations into discrete domains. Knowledge and the appro-priate application of these methods is fundamental to simulating physical processes.Many inverse problems in the geosciences are solved using stochastic techniques orexternal finite difference-based tools (e.g. Doherty (2004)), which are robust to localminima and the programmatic implementation, respectively. However, these meth-ods do not scale to situations where millions of parameters are to be estimated. Thissort of scale is necessary for solving many of the inverse problems in geophysics and,increasingly, hydrogeology (e.g. electromagnetics, gravity, and fluid flow problems).143In the context of the inverse problem, when the physical properties, the domain,and the boundary conditions are not necessarily known, the simplicity and efficiencyin mesh generation are important criteria. Complex mesh geometries, such as bodyfitted grids, commonly used when the domain is explicitly given, are less appropriate.Additionally, when considering the inverse problem, it is important that operators andtheir derivatives are accessible for interrogation and extension. The goal of this workis to provide a high-level background to finite volume techniques, which are abstractedacross four mesh types: (1) tensor product mesh; (2) cylindrically symmetric mesh; (3)logically rectangular, non-orthogonal mesh; and (4) octree and quadtree meshes. Thiswork contributes an overview of finite volume techniques, in the context of geoscienceinverse problems, which are treated in a consistent way across various mesh types inorder to highlight similarities and differences.B.1.1 Attribution and disseminationThe numerical implementations underlying this work have been created throughout myPhD in multiple programming languages (i.e. Matlab, Julia, and Python) and have beeninfluenced by course material and instruction from Dr. Eldad Haber, Dr. Uri Ascherand Dr. Chen Grief (Haber, 2015; Ascher and Greif, 2011). Further references can befound throughout the scientific literature (cf. Yee (1966); Hyman and Shashkov (1999,1997)). I have published aspects of this work in Cockett et al. (2015c), in many Soci-ety of Exploration Geophysics abstracts (Heagy et al., 2014; Kang et al., 2015a; Heagyet al., 2015c) and in a tutorial paper in The Leading Edge (Cockett et al., 2016a). DaveMarchant and Lindsey Heagy influenced the development of the cylindrical mesh, andthe octree storage algorithm was loosely based on the implementation by Bursteddeet al. (2011). These techniques and implementations proved successful in the Sim-144PEG project (Cockett et al., 2015c) and are used in applications of frequency and timedomain electromagnetics, direct current resistivity, gravity, magnetics, fluid flow, andseismic across industry, academia, and education (Rosenkjaer et al., 2016; Kang andOldenburg, 2016; Heagy et al., 2015b; Rosenkjaer et al., 2015b; Cockett et al., 2015c;Kang et al., 2014). The techniques discussed below, as well as a number of accom-panying utilities for mesh generation, import, export, visualization, documentation,and testing are provided in an open source package for Python, called: discretize( The discretize package is released usingthe permissive MIT license to encourage reuse and future improvement of this work.My collaborators and I have also generalized across these types of meshes, in order tohave both a high-level and a standard programmatic interface, differing only in meshinstantiation. The generalization of meshes allows both ourselves and others to buildupon this work as we continue to improve and expand its capabilities.B.2 TerminologyTo simulate differential equations in any computational domain, we must approximatethe continuous equations through discretization onto a mesh. The mesh defines bound-aries, locations of variables, and connectivity between cells. In this section we will dis-cuss a staggered mimetic finite volume approach (Yee, 1966; Hyman and Shashkov,1999; Hyman et al., 2002).B.2.1 Mesh typesThe topographic interface, the location of boundary conditions, and the location ofsources/receivers are often the only non-numerical constraints on mesh generationin geophysics. These constraints require meshes that that can pad efficiently to suf-145ficiently distant boundaries (as in electromagnetics), align or refine to topographicfeatures, and refine around the locations of sources. Numerical efficiency generallytranslates to minimizing the number of cells used in any computational domain. Here,we will consider four mesh types: (1) tensor product mesh; (2) logically rectangu-lar; non-orthogonal mesh (curvilinear mesh); (3) octree and quadtree meshes; and (4)cylindrically symmetric mesh. We use different techniques for dealing with padding,alignment, and refinement for each mesh. Orthogonal vectors of spacings define atensor product mesh (tensor mesh). In the 2D example in Figure B.1a, the mesh is cre-ated from two vectors, hx and hy, which define constant spacing, orthogonal to eachdirection. As the spacing is fixed, any refinement in one dimension means that thatrefinement is completed everywhere in the domain. These refinement constraints of-ten lead to meshes with many cells and unnecessary resolution far from the domain ofinterest. Tree meshes are built through successively dividing mesh cells into four oreight cells in 2D quadtree meshes and 3D octree meshes, respectively (Figure B.1b).Octree meshes are used extensively in electromagnetic geophysical inversions (Haberand Heldmann, 2007). These meshes can also be built on a variable tensor spaced grid,but have the advantage of not refining in locations far from areas of interest, resultingin meshes with fewer cells. Unlike the tensor mesh, tree meshes are not logically rect-angular; that is, each cell does not necessarily have two neighbours in each dimension(x+, x−, y+, y−, etc.). A quadtree cell may have additional neighbours if it is coarserthan its direct neighbours. Using a mesh leveling algorithm, the level of refinementcan be enforced to be a maximum of one level change between cells (Burstedde et al.,2011). The tensor mesh is a logically rectangular orthogonal mesh, where ‘orthogonal’means that the tensors are orthogonal and define a local Cartesian coordinate system.Another mesh type under consideration is a logically rectangular non-orthogonal mesh,146where each cell still has two neighbours in each dimension but the cells are neither re-quired to be axes aligned nor to have orthogonal faces. Here, we will refer to logicallyrectangular non-orthogonal meshes as curvilinear meshes, as seen in Figure B.1c. Asthese meshes are no longer constrained to have orthogonal cells, topographic layerscan be better approximated, without the staircase effect that is present on both tensorand tree meshes. Additionally, curvilinear meshes allow different ways of padding to‘infinity’, and can be used to approximate spherical domains (Calhoun et al., 2008).Finally, we will also consider a cylindrically symmetric tensor mesh. The cylindricallysymmetric tensor mesh is defined in a cylindrical coordinate system where the radialr, azimuthal θ , and vertical z dimensions are in the following domains:r ∈ [0,∞), θ ∈ [0,2pi), z ∈ (−∞,∞) (B.1)Cylindrical symmetry is enforced through a single cell in θ . With the exception of cal-culations for boundary conditions, volume and area are formulated similarly to tensormeshes. Cylindrical meshes are often used for electromagnetics problems for layeredsystems or cylindrically symmetric problems, such as geophysics or fluid flow arounda borehole (Pidlisecky et al., 2013; Heagy et al., 2016). Fully unstructured (tetrahe-dral) meshes will not be considered here, but are commonly used in geophysics andhydrogeology (e.g. Ollivier-Gooch and Van Altena (2002); Jahandari et al. (2017)).We chose the meshes used in this appendix for their common use in electromagneticgeophysics and fluid flow (Haber and Ascher, 2001; Li and Oldenburg, 1996; Egbertand Kelbert, 2012; McDonald and Harbaugh, 2003; Kelbert et al., 2014; Cockett et al.,2015c). All meshes are easy to parameterize, which is an advantage when relativelylittle is known about the simulation domain, as is the case in the context in geophysical147inverse problems.Figure B.1: Three mesh types in two dimensions on the domain of a unit square:(a) a tensor product mesh, (b) a quadtree mesh, and (c) a curviliear mesh.B.2.2 Cell anatomyThis approach requires defining variables at either cell centers, nodes, faces, or edges,as described in Figure B.2. The finite volume technique is derived geometrically fromstudying the control volume of a mesh ‘cell’. The cell center is often used for scalarvariables or anisotropic tensors that represent physical properties. This shows that asingle value fills the entire cell, allowing discontinuities between adjacent cells. Froma geologic perspective, discontinuities are prevalent, as large differences in physicalproperties may exist between geologic layers. Cell nodes, alternatively, are often usedfor variables that are continuously varying in space; that is, internal to a cell valuesbetween nodes can be found through bi/tri-linear interpolation. Vector quantities areheld on the faces or edges.A cell face variable represents a vector that is a flux into or out of that face; thevector is pointed in the face normal direction,~n. As seen in the curvilinear cell in Fig-ure B.3, the face normal directions may not be orthogonal, nor parallel to the Cartesian148Figure B.2: Names of a finite volume cell on a tensor mesh in (a) one dimension,(b) two dimensions, and (c) three dimensions.axes. However, as the direction of the face normal is a property of the mesh, the facevariables only store the magnitude of the vector. A face variable on a single rectilinearcell is a length four array in 2D and a length six array in 3D. There are twelve edgesin 3D, four in 2D, and one in 1D for each cell, all holding vector quantities that pointin the tangent directions,~t. While cell faces represent fluxes, edges represent vectorfields, as is the case in electromagnetics.Figure B.4 displays a tree mesh cell, which shows the location of hanging faces andnodes when two cells of different refinement levels share an interface. These hangingnodes, faces, and edges become important when computing the differential operatorsand inner products. A cylindrically symmetric mesh has the same structure as a tensorproduct mesh cell, except that all cells must be in the radial domain r ∈ [0,∞). It istempting to conceptually locate the cell center of the first radial cell at r = 0, as thiswould be at the center of the cylinder. However, locating the cell center here violatesour staggered grid in cylindrical coordinates and operators, such as the divergence, donot converge with second order accuracy. For consistency throughout the following149Figure B.3: Names of a finite volume cell on a curvilinear mesh in (a) two di-mensions, and (b) three dimensions. Note that the cell faces and edges areno longer orthogonal.sections, we will use terminology derived from a 3D cell. A cell’s volume will referto: volume in 3D; area in 2D; and length in 1D. Face areas will refer to the areaperpendicular to a 3D face, which is a length in 2D and unity in 1D. Edge lengthswill refer to lengths in 2D and will be in the same spatial locations as the cell faces(although with a different numbering and vector direction). In 1D, edge lengths will bethe cell ‘volumes’ (lengths) and will be located at the cell centers. As such, the lengthof the ‘volume’ array will always be equal to the number of cells in the mesh.B.2.3 NumberingThe numbering of any mesh must be explicit in order to define arrays of properties,fields, and fluxes. The numbering of the mesh is arbitrary but has a number of conse-quences for the resulting differential operator matrices in terms of their structure and150construction. In all logically rectangular meshes under consideration, we count firstin the x, then y, then z dimensions. Counting in this way results in column vector-ization and allows the use of Kronecker products for many of the matrix equations,specifically, the vectorization identity:vec(AB) = (Im⊗A)vec(B) (B.2)where B is the discretized grid function, which is useful in building differential op-erators recursively from 1D operators. For non-logically rectangular meshes, suchas quadtree and octree meshes, we sort the numbering first by distance along thex-axis, then y- and then z-. In the case of faces and edges where there are x-, y-,and z-components, we order these components separately and then concatenate them.The numbering is shown in Figure B.8a for a quadtree mesh for the cell centers, x-faces, and y-faces. Although Kronecker products can be used for logically rectangularmeshes, this is not possible for tree-based meshes and the indexes must be kept trackof ‘by hand’.It is important to know the number of variables for the discretization of grid vari-ables. For a 3D logically rectangular mesh, the number of cells, nodes, faces, and151edges are:nc = ncx×ncy×ncz cellsnn = (ncx +1)× (ncy +1)× (ncz +1) nodesn fx = (ncx +1)×ncy×ncz x-facesn fy = ncx× (ncy +1)×ncz y-facesn fz = ncx×ncy× (ncz +1) z-facesnex = ncx× (ncy +1)× (ncz +1) x-edgesney = (ncx +1)×ncy× (ncz +1) y-edgesnez = (ncx +1)× (ncy +1)×ncz z-edges(B.3)When comparing this to a cylindrically symmetric mesh, it is interesting to note thatneither nodes nor θ faces exist, and edges only exist in the θ direction. A tree meshhas an added complication, which occurs when two adjacent cells have different re-finement levels, leading to hanging nodes, edges, and faces. Figure B.4 schematicallyshows the locations of the hanging faces and nodes. When not dealt with, these com-plications cause numerical inaccuracies, which we will discuss further in Section B.3on differential operators and Section B.4 on inner products.B.2.4 DC resistivity equationsWe will use the direct current (DC) resistivity problem from geophysics to motivatediscretization of a parabolic partial differential equation and explain the various op-erators and operations necessary to consider for the finite volume technique. Theequations for DC resistivity are derived in Figure B.5 and are further discussed in(Pidlisecky et al., 2007). Conservation of charge (which can be derived by taking thedivergence of Amperes law at steady state) connects the divergence of the current den-152Figure B.4: Names of a finite volume cell on a tree mesh in (a) two dimensions,and (b) three dimensions. Note the location of hanging x-faces from therefined cells; hanging edges are not shown.sity everywhere in space to the source term, which consists of two point sources: onepositive and one negative. The flow of current sets up electric fields according to Ohmslaw, which relates current density to electric fields through the electrical conductivity,σ . From Faradays law for steady state fields, we can describe the electric field in termsof a scalar potential, φ , which in a DC resistivity experiment is sampled at potentialelectrodes to obtain data in the form of potential differences. The first order form ofthe governing equations for DC resistivity are:∇ ·~j = I(δ (~r−~rs+)−δ (~r−~rs−)) = q1σ~j =−∇φ(B.4)where I is the input current at the positive and negative dipole locations,~rs± , capturedas Dirac delta functions. To motivate the discretization of the DC resistivity equations,we will write the equations in weak form in the following section.153Figure B.5: Derivation of the direct current resistivity equations.B.2.5 Weak formulationThe weak formulation integrates the DC resistivity equations with a test function, ~f ,which reduces the requirement of differentiability (more details are available in Haber(2015)). To keep the notation clean, we also introduce notation (·, ·), which we referto as an inner product.(a,b) =∫Ωa(~x) ·b(~x) ∂v (B.5)where the vectors~a and~b are arbitrary. The vector part of the DC resistivity equations(written in first order form) can be written in weak form as:(1σ~j, ~f)=(−∇φ , ~f)(B.6)154where ~f is the test function. We can now employ a vector identity:∇ ·(a ~f)= a(∇ ·~f)+(~∇ a)·~f (B.7)to integrate the right-hand side by parts. This integration results in the discretizationof the DC resistivity equations entirely in terms of the divergence operator.(1σ~j, ~f)=∫Ωφ(∇ ·~f)−∇ ·(φ~f)dv (B.8)Here, if we assume Dirichlet boundary conditions for φ |∂Ω= 0, that is, the potentialsare zero far away from the domain of interest, we can use the divergence theorem toeliminate the second term on the right-hand side of the equation. This results in thefollowing equation for DC resistivity with Dirichlet boundary conditions on φ :(1σ~j, ~f)=∫Ωφ(∇ ·~f)dv (B.9)We use Dirichlet for simplicity in this example. In practice, Neumann conditions areoften used because ‘infinity’ needs to be further away, if applying Dirichlet boundaryconditions, since potential falls off as 1/r2 and current density as 1/r3. Similar tech-niques for the weak formulation of Maxwell equations can be derived by applying theappropriate vector identities and boundary conditions. In the following two sections,we will discuss the differential operators and the discretization of inner products thatare prevalent in the weak formulation.155B.3 OperatorsWith the terminology and structure of the meshes well-defined, we can now create op-erators for the meshes. Although operators for averaging and interpolation are criticalto any implementation, in this section, we will focus on the differential operators forthe divergence, curl, and gradient. These operators take the form of sparse matrices,which are properties of each mesh. Although nuances exist in creating the operatorsfor each mesh type, the basic building blocks come from the geometric concepts ofindividual cells, specifically the cell volume, face areas, and edge lengths. Given thecell spacings of tensor meshes, the computation of these properties is straightforward.For the cylindrical mesh, these values must be calculated in cylindrical coordinates.The volume and area calculations on the curvilinear mesh are straightforward in twodimensions. However, in three dimensions, the faces of the cell may not lie on a planeand, as a result, both the volume and face areas may not be well-defined. For face area,we use the average of the four parallelograms, which are calculated at each node of theface. As seen in Figure B.6, the cell volume is calculated by dividing the cell into fivetetrahedrons and calculating the volume of each.Figure B.6: Volume calculation using five tetrahedra.156B.3.1 DivergenceThe divergence is the integral of a flux through a closed surface as that enclosed volumeshrinks to a point.∇ ·~f (p) def= limv→{p}∫∫S(v)~f ·~n|v| dS (B.10)Since we have discretized and no longer have continuous functions, we cannot takethe limit fully to a point. Instead, we approximate the limit around a finite volume: thecell. The flux out of the surface (~f ·~n) is exactly how we discretized ~f onto our mesh(i.e. f), except that the face normal points out of the cell (rather than in the coordinatedirections). We can readily calculate the surface areas and normals of all cells in themesh. The flux values on each cell face are discretized and represented by a scalarvalue pointing in the direction of the face normal. As such, we need only multiplythese values by the face area and multiply by ±1 to ensure an outward facing normalwith respect to the cell under consideration. To construct the divergence operator, D1,in one-dimension, the following discretization is used:D1 f = V−1D±S f = diag (v)−1−1 1. . . . . .−1 1︸ ︷︷ ︸D±diag (s)f1f2...fnfn+1(B.11)where V is a sparse matrix with the cell volumes on the diagonal, D± is a ‘stencil’matrix that ensures outward pointing normals, and S is a diagonal matrix includingthe surface areas of each face. To move to higher dimensions, we exploit the logicallyrectangular structure and the column-ordered vectorization of the face variable. Kro-157necker products are used to place the difference matrix, D±, on the correct cells andfaces. Not only is this conceptually efficient, when using an interpreted programminglanguage, this also allows use of lower-level functions in compiled languages. Forexample, the divergence matrix in three-dimensions may be formed by:Dx = I3⊗ I2⊗D1 (B.12a)Dy = I3⊗D2⊗ I1 (B.12b)Dz = D3⊗ I2⊗ I1 (B.12c)where Di is the difference matrix in one-dimension for the ith dimension. The Ii isthe identity matrix that has the length of the cells in the nth dimension. Here the fulldivergence operator can be formed by:D = V−1 [Dx Dy Dz] S (B.13)The diagonal matrix, S, contains the surface areas for each cell in the x, y, and zdirections, concatenated on the matrix diagonal. As the divergence only takes accountof fluxes into and out of a cell in the direction of the face normal, this concatenationworks for any logically rectangular mesh, regardless of orthogonality. For a cylindricalmesh, we need to give attention to the middle cylindrical cell where the flux at r = 0 isknown to be zero and, as such, this column can be removed from the Dr matrix.For a tree mesh, we need to pay special attention to the hanging faces to achievesecond-order convergence for the divergence operator. Although the divergence can-not be constructed through Kronecker product operations, the initial steps are exactlythe same for calculating the stencil, volumes, and areas. These steps yield a divergence158Figure B.7: Visual connection between the continuous and discrete representa-tions of the divergence.defined for every cell in the mesh using all faces. However, redundant information ex-ists when including hanging faces. As seen in Figure B.8, the x-face between the cells{1,3} and cell 4 has three locations for an x-face variable, but there is conceptuallyonly a single flux at that location: x-face 4. As such, we can construct a matrix thatidentifies these hanging faces and assigns them to the same face variable. This matrixincludes only ones and can be multiplied on the right-hand side of the unreduced di-vergence. This ensures that the flux into the negative x-face of cell 4 in Figure B.8 hasa single numeric value.B.3.2 CurlSimilar to the divergence operator, we rely on the geometric interpretation of the curl:( ∇×~e ) · nˆ def= lims→0(1|s|∮c~e ·dr)(B.14)where, ~n is the outward facing normal, s is the area of the face, and the line inte-gral direction c is oriented positively with respect to the normal (i.e. right-hand rule).Figure B.9 shows the integrating directions for a unit cube in each unit direction. Todiscretize the curl of an edge variable, we must integrate along each edge in the appro-159Figure B.8: Simple quadtree mesh showing (a) the mesh structure, cell number-ing, and face numbering in both the x and y directions; and (b) the structureof the face divergence matrix that has eliminated the hanging faces.priate direction (i.e. multiply by ±1) and divide by the face area:C = diag(s)−1 C±diag(l) (B.15)where l is an edge length vector, and s is the face area vector. The numeric curlon edges yields a vector variable on cell faces. Similar to the divergence operator,160this definition can exploit the logically rectangular nature of the mesh and create thedifference matrix, C±, using Kronecker products. For the octree mesh, we need toFigure B.9: Edge path integration or definition of the curl operator.treat both the hanging edges and faces to remove redundant information. Hangingedges are treated differently than faces. We can average the resulting flux from the curloperation through a face, from the five estimates of the flux to a single value over thelarger face. We multiply this averaging matrix on the left-hand side of the unreducedcurl matrix. The edges, however, need to be eliminated through linear interpolationof the coarse edges to the six refined edges in each direction on each coarse face. Wemultiply this interpolation on the right-hand side of the unreduced curl matrix. Thesetwo operations result in a curl that has eliminated both hanging edges and hangingfaces through interpolation and averaging, respectively.B.3.3 GradientThe gradient of a scalar function, f (x), is a vector field whose dot product with anarbitrary vector, v ∈ Rd , yields the directional derivative in that direction.(∇ f (x)) ·v = Dv f (x) (B.16)161In Cartesian coordinates (d = 3), this has the much more familiar form:∇ f =∂ f∂xi+∂ f∂yj+∂ f∂ zk (B.17)To discretize the gradient of a nodal variable, we can do a forward difference along alledges of the mesh and divide by the length of the edge tangentsG = diag(l)−1 G± (B.18)where G± is the gradient stencil identifying the correct nodes, with ±1, and l is thelengths of the edges. Multiplying the gradient by some nodal variable, Gn, results ina vector quantity that is the directional derivative along the edge tangent, which is acentral difference and results in a variable located on the edges on the mesh. Alter-natively, we can formulate the gradient for cell-centered, rather than nodal, variables.This formulation explicitly considers neighbouring cells, rather than just a single fi-nite volume, and is a finite difference method. Although possible, this formulation ismore cumbersome to represent on both tree meshes and curvilinear meshes or whenanisotropy is considered. When a gradient is necessary in a differential equation, andthe variable is located on cell centers instead of nodes, it is usually possible to rearrangethe equations, using vector identities, to use the negative transpose of the divergence,as previously discussed.For both the quadtree and octree meshes, we must once again deal with the hangingnodes and hanging edges. The treatment here is similar to that in the curl matrix. Weinterpolate hanging nodes from their neighbours (two in 2D, four in 3D) and the resultof the gradient, an edge vector on all edges, is averaged to exclude the hanging edges.162Since we are using a staggered grid with centered differences, the discretizationof the differential operators is second-order. That is, as we refine the mesh, our ap-proximation of the divergence should improve by a factor of two. We can verify thisnumerical convergence using simple test functions with analytic expressions, well-chosen boundary conditions, and known derivatives. The assertion of this expectationof order is a critical piece of any numerical implementation.B.4 Inner productsEvaluating volume integrals over the cell becomes important when formulating equa-tions in weak form, as in Section B.2.5. We can numerically evaluate these volumeintegrals in many different ways, however, the method must take into account the lo-cation and number of approximations of the integrand variables that are available. Fortwo cell-centered variables, for example, the evaluation of the integral is simple andcan be calculated by the midpoint approximation. The result is an inner product thatincludes a volume term, v, for each cell:(a,b) =∫Ωa(~x) ·b(~x) ∂v≈ a>diag(v)b(B.19)Complications arise when we approximate the inner product but the variables do notlive in the same location; that is, on the edges, faces, and cell centers.B.4.1 Face inner productWe use the face inner product when there is a multiplication between a cell-centeredvariable and a vector variable located on the faces. In the following example, we willuse a cell-centered variable, σ , which is fully anisotropic, meaning that, in 3D, each163cell is represented by a 3× 3 tensor. We multiply the tensor by ~j and take the innerproduct with a face variable ~f . When discretizing this integral, recall that the dis-cretization has approximations, j and f, on each face of the cell. In 2D, that means twoapproximations of jx and two approximations of jy. In 3D, we also have two approx-imations of jz. Additionally, in the general case, these vectors may not be orthogonalnor axis aligned and so we must first transform them into their Cartesian componentsbefore multiplying by the discretized tensor, Σ. Regardless of how we choose to ap-proximate the vectors, jcart and fcart, we can represent this in vector form for every cell.(σ−1 ~j, ~f)=∫Ωσ−1~j ·~f ∂v≈ j>cart(√vcell Σ−1√vcell)fcart(B.20)We multiply by square-root of volume on each side of the tensor, Σ to keep symmetryin the system. Here, jcart is an approximation of the Cartesian ~j that must be cal-culated from known locations on the mesh, jmesh. There are many different ways toevaluate the inner product(σ−1 ~j, ~f): we could approximate the integral using trape-zoidal, midpoint, or higher order approximations, for example. A simple, second-ordermethod is to break the integral into a sum of 2d sections and apply the midpoint rule,where d is the dimension of the mesh (i.e. two in 1D, four in 2D, and eight in 3D). Foreach of these sections, the midpoint rule uses the closest components of j to composea Cartesian vector. We use a Qi matrix of size 2×2d consisting of 0s and 1s, to selectthe appropriate faces to compose the corresponding vector jcart.jicart = N−1i Qi jmesh (B.21)164Here, the i index refers to the section where we choose to approximate this integral. Ina curvilinear mesh, the fluxes chosen by Qi are not necessarily axis-aligned and alsomay not be mutually orthogonal. As such, we must use a normalization to get backto Cartesian components, where multiplication with the Σ is defined. These projectionmatrices, Ni, are completed 2d times for every cell of the mesh and, in 2D, have theform:Q(i)jmesh︷ ︸︸ ︷[j(1)j(3)]=N(i)︷ ︸︸ ︷[n(1)x n(1)yn(3)x n(3)y] jcart︷ ︸︸ ︷[jxjy](1),[j(1)j(4)]=[−n>(1)−−n>(4)−][jxjy](2),[0 1 0 00 0 1 0]jmesh =[−n>(2)−−n>(3)−]j(3)cart,Q(4) jmesh = N(4) j(4)cart(B.22)where n(i) is the normal to face i. Solving for the Cartesian flux, j(i)cart, requires invertinga small matrix (2× 2 or 3× 3) for each section. We now have eight evaluations in3D of the midpoint rule, using various approximations for jcart. We can sum theseapproximations together to define the face inner product matrix, M fΣ−1 , for a logicallyrectangular mesh of any dimension, d.M fΣ−1 =12d(2d∑i=1Q>i N−>i√vcell Σ−1√vcell N−1i Qi)=2d∑i=1P>i Σ−1Pi, wherePi =√12dId⊗diag(v)N−1i Qi(B.23)Here, each P ∈ R(d∗nc, n f ) is a combination of the projection, volume, and any nor-malization to Cartesian coordinates. In a numerical implementation, it is often more165efficient to complete this operation for each cell in the mesh at the same time, andreturn the sparse matrices for caching as this is a common operation in any inverseproblem that estimates a cell-centered variable. For the tree mesh, we complete theface inner product identically as above, except for a final step to exclude the redundanthanging faces in the same way as the divergence matrix.B.4.2 Edge inner productThe edge inner product with a vector that is defined on the edges of a cell (rather thanon the faces, as above) has a similar derivation. The difference comes in 3D, whenselecting the edges around each node, as there are twelve edges instead of only sixfaces; these edges are selected by the Qˆi matrix. Similarly, in the normalization toCartesian coordinates, we use the edge tangent directions instead of using the facenormals. The Cartesian edge values, ec(i), are projected using the tangents:e(i)cart = T−1(i) Qˆ(i) emesh (B.24)where Qˆ(i) is the projection matrix that selects the edges closest to each midpointapproximation. Once again, we compute the inner product by the mass matrix actingon the edges:Meρ =12d(2d∑i=1Qˆ>i T−>i√vcell ρ√vcell T−1i Qˆi)(B.25)For the tree mesh, we complete the edge inner product as above, except with an addedstep to deal with the hanging edges. This step excludes the hanging edges throughlinear interpolation in the same way as the curl matrix.166B.4.3 Tensor product meshThe generality of this equation can be reduced when dealing with an axis-aligned ten-sor mesh with an isotropic physical property, σ . In this case, we do not need to takeinto account the normalization from N−1i . For a tensor mesh, the face inner productcan be calculated as follows and can be interpreted as averaging the physical propertybetween neighboring cells:M fσ−1 = diag(A>v (vσ−1))(B.26)where  is a Hadamard product for point-wise multiplication and Av ∈ R(n f , nc) is anaveraging matrix from faces to cell centers. In one dimension this matrix has the formA(1)v = 12 12. . . . . .1212 (B.27)The matrix has slight differences on a cylindrical tensor mesh to exclude the r= 0 face.The ‘averaging’ matrix can be made for higher dimensions using Kronecker productsand horizontal concatenation:Av =[I3⊗ I2⊗A(1)v , I3⊗A(2)v ⊗ I1, A(3)v ⊗ I2⊗ I1](B.28)Note that this matrix is often referred to as the face-to-cell-center averaging matrix and,as such, is often divided by the dimension of the mesh to be a true average (i.e. thereare three face averages above). If this is the case, care needs to be taken to restore thisconstant when calculating the inner product. Although general anisotropy is not easilyadded in this form, coordinate anisotropy can be added by composing this matrix as167block diagonals instead of horizontal concatenation. Similarly, on a cylindrical mesh,if anisotropy is necessary, care should be taken that the anisotropy uses Cartesian,rather than cylindrical, coordinates, unless intended.B.4.4 AnisotropyFor defining isotropic, coordinate anisotropic, and fully anisotropic parameters, thefollowing conventions are used in 3D:~σ =σ1 0 00 σ1 00 0 σ1 ~σ =σ1 0 00 σ2 00 0 σ3 ~σ =σ1 σ4 σ5σ4 σ2 σ6σ5 σ6 σ3 (B.29)In 2D, these conventions are similarly defined as:~σ =σ1 00 σ1 ~σ =σ1 00 σ2 ~σ =σ1 σ3σ3 σ2 (B.30)Both the isotropic and coordinate anisotropic material properties result in a diagonalmass matrix on a tensor mesh. This is easy to invert if necessary. However, in the fullyanisotropic case, or for any curvilinear mesh, the inner product matrix is not diagonal,as can be seen for a 3D mesh in the figure below.B.4.5 DerivativesIn the context of parameter estimation, we are often interested in the parameters insideinner products and require efficient matrix-free derivatives for these elements. Thederivative of these inner products is actually a tensor. However, the derivative is amatrix if we only require the computation of this product with a vector (as is usually168Figure B.10: Matrix structure of a face inner product of a cell centered physicalproperty on a tensor mesh.the case). To show the computation of the inner product derivatives, we will consider afully anisotropic tensor for a 3D logically rectangular mesh. Let us start with one partof the sum which makes up M fΣ and take the derivative when this matrix is multipliedby some vector, w:P>i ΣPiw (B.31)Here, we will let Piw = y and y will have the form:y = Piw =y1y2y3 (B.32)169This matrix can subsequently be multiplied by P>i Σ. When multiplying these matricesby hand, we can see that they have the form:P>i Σy = P>iσ1 σ4 σ5σ4 σ2 σ6σ5 σ6 σ3y1y2y3= P>iσ1y1+σ4y2+σ5y3σ4y1+σ2y2+σ6y3σ5y1+σ6y2+σ3y3 (B.33)where  is the Hadamard product, and represents element-wise multiplication. Wecan now take the derivative with respect to any one of the σ parameters, for example,∂∂σ1∂∂σ1(P>i Σy)= P>idiag(y1)00 (B.34)Meanwhile, the derivative for ∂∂σ4 is:∂∂σ4(P>i Σy)= P>idiag(y2)diag(y1)0 (B.35)These derivatives are calculated for each of the eight components of the discretizedinner product (four in 2D).For a tensor product mesh with an isotropic physical property, this is again not themost efficient method of calculation. Instead, we can recall an alternative formula forcalculating the inner product and take the derivative of that formula. We still must170multiply by an arbitrary vector, w:ddσ(M fσw)=ddσ(diag(A>v (vσ))w)= diag(w)A>v diag(v)(B.36)Here, we calculate the derivative by exploiting the fact that a diagonal matrix and avector can be multiplied in either order; that is, they are commutative.ab = ba= diag(a)b= diag(b)a(B.37)The derivative, with respect to physical properties, becomes critical for the inverseproblem, and is often not provided when implementations are solely focused on theforward simulation. In the implementation presented, we have given thought to ac-cessing the derivatives of isotropic and anisotropic physical properties in an efficientway for all mesh types under consideration.B.5 ImplementationThe discretize package ( allows for the explicit ac-cess to derivatives with respect to the inner product operations. This is due to our fo-cus on the inverse problem and the construction of sensitivities and adjoint sensitivitiesimplicitly, through a matrix vector multiplication. Many other packages for finite vol-ume or finite element simulation either focus entirely on the forward simulation (e.g.Guyer et al. (2009)) or explicitly create the sensitivity matrices with automatic differ-entiation and/or finite differencing (Ketcheson et al., 2012; LeVeque, 1997; Alexe and171Sandu, 2014; Hindmarsh et al., 2005; Jasak et al., 2007). Haber has developed imple-mentations in Ruthotto et al. (2016) and Haber (2015) which are written in Julia andMatlab, respectively. We chose Python for the implementation because: (a) Pythonhas a large and growing scientific community, including existing and maintained toolsfor matrix solvers and sparse matrix operations (eg. SciPy, pymatsolver); (b) it is anobject-oriented language (unlike Julia), which allows the relationships between themesh types to be expressed through base class inheritance and subtype polymorphism,where appropriate; and (c) it is a high-level language that has the ability to interfacewith low-level codes in Fortran and C, which allows for efficient creation of declarativescripts while leveraging existing work in lower-level languages. If our primary interestwas in finite volume techniques, it may have been more sensible to choose a languagesuch as Julia (or Matlab, which is proprietary), which has better built-in capabilitiesto represent sparse matrix operators and defaults to matrix multiplication, rather thanarray multiplication. However, finite volume techniques are not the endgame of thiswork. Rather, we wish to use these techniques in conjunction with geophysical in-verse problems, which consists of much more than the numerical implementation andrequires many additional utilities for critical tasks, such as scripting, interactive pro-gramming, reading files, 3D visualization, and communicating results.B.5.1 OrganizationThe object-oriented programming model in Python allows organization of the finitevolume methods for the various meshes into a class inheritance structure. This or-ganization has highlighted the similarities between techniques through elimination ofredundant code between shared concepts and methods. Such elimination leads to aconcise description of the differences between the methods, which was outlined in the172previous section. We show our chosen organization in its entirety in Figure B.11. Thisfigure shows the class inheritance and class properties of the discretize packageand is best viewed in digital form.There is a BaseMesh that has properties such as number of cells, nodes, faces,and edges nC, nN, nF and nE, respectively. We have decided to separate the dif-ferential, averaging and interpolation operators into their own class such that they canbe included as a mixin (rather than inherited) into the final mesh classes. Methods forthe inner products, IO to common formats, and visualization are also separated outand included as mixins. We have separated the concepts of being logically rectangu-lar and being a tensor product mesh because these are different concepts. The basiccounting of cells is on the rectangular mesh and the concept of a ‘cell-centered vectorin the x dimension’ is only for tensor product meshes. This inheritance demonstratesthe concepts that: (a) the curvilinear mesh is rectangular, but not a tensor mesh; (b)the tree mesh is a tensor mesh, but not logically rectangular; and (c) the cylindricalmesh is both logically rectangular and a tensor mesh, however, through subtype poly-morphism the cylindrical mesh overwrites the concepts for geometric calculations ofvolume, surface area, etc. as well as the counting for nodes, faces, and edges.B.5.2 User interfaceOur goal for the implementation is to create a common programmatic terminologyfor working with finite volume techniques. By sharing the numerical implementation,not only is this ‘language’ precise, but it can also be tested for accuracy. A briefdescription of the implementation and use in practice, as well as a look at some ofthe major properties on the abstract mesh types in Table 2.1, was previously givenin Section 2.3.4. As the implementation is openly available, we will point the reader173to both the online, up-to-date documentation of each specific property and method( and to the 400+ unittest results of numerical convergence( As a brief overview, we use the convention ofC, N, F, E to refer to cells, nodes, faces and edges, respectively. For example, fora tensor mesh the number of cells in the x-dimension would be called nCx, the nodaltensor of x-locations as a vector is called vectorNx, and the location of all face vari-ables with an x-component is called gridFx. We named differential operators by thevariable location that they act upon; for example: faceDiv and edgeCurl. Thesenames allow us to define multiple operators. For example, the gradient can either op-erate on cell centers or nodes, cellGrad and nodalGrad, respectively. This lan-guage is common across all meshes considered and extends to other mesh types not yetimplemented. As such, geophysical simulation codes can be built on top of this workto write PDEs in a declarative way, which is agnostic to the mesh implementation ac-tually used. In collaboration with Heagy, Kang, Rosenkjaer and Mitchell simulationshave been completed for time, frequency, and static implementations of electromag-netics using 1D, 2D, and 3D versions of the tensor, tree, curvilinear and cylindricallysymmetric meshes (Rosenkjaer et al., 2016; Kang et al., 2015a; Heagy et al., 2015c,2014; Cockett et al., 2014b; Kang et al., 2014; Heagy et al., 2015a; Cockett and Haber,2013a; Cockett et al., 2016a; Heagy et al., 2016; Rosenkjaer et al., 2015a; Heagy et al.,2015b). The terminology developed defines a clear interface, which allows for im-provements in speed and functionality of the discretize package to transparentlyimprove geophysical simulations.The discretize interface allows for lazy loading of properties; that is, all prop-erties of the mesh are created on demand and then stored for later use. This caching ofproperties is important, as not all operators are useful in all problems and, as such, are174not created. The implementation here is different from some other finite volume imple-mentations, as the operators are held in memory as matrices and are readily availablefor interrogation. We find this feature beneficial for educational and research purposes,as the discretization remains visually very close to the math, and the matrices can bemanipulated, visually inspected, and readily combined.The major difference between the mesh types is the instantiation of each type. Aregular tensor mesh that is defined on the unit cube can be instantiated with integers.For example, TensorMesh([4, 8])will create a 2D tensor mesh with equal spac-ings (a regular mesh) with nCx = 4 and nCy = 8 that has a domain over the unitsquare. Similarly, TensorMesh([hx, hy, hz]), where hx is a vector of variablecell spacings in the x dimension, will create an irregularly spaced tensor mesh in 3D.To produce a cylindrically symmetric mesh (i.e. with a single cell in the azimuthal, θ ,dimension), we can mix these notations. For example, CylMesh([hr, 1, hz]).For the definition of a CurvilinearMesh, all node locations must be specified ineach dimension. These node locations may be provided as a list of either 2D or 3Dmatrices. For a tree mesh, a base mesh of tensor products are provided (which mustbe a power of 2), which represents the location of nodes at the lowest refinement level.A refinement function must also be provided. The refinement function is recursivelypasses cell centers and chooses whether that cell should be refined. This functionalconstruction of the tree meshes encourages composable functions that refine basedon, for example, location from a source and/or distance from a topographic interface.Alternatively, the mesh can be loaded from several formats (e.g. the UBC mesh andmodel files or the Open Mining Format (*.omf)). Once we have constructed the mesh,there is access to the differential operators, averaging and interpolation functions, andutilities for visualization and export.175B.6 Numerical examplesHere we will briefly explore the application of the curvilinear mesh for a DC resis-tivity problem, which was introduced in Section B.2.4. We will explore the equationsfor a tensor mesh and a curvilinear mesh over a unit cube with Neumann boundaryconditions. We tested the forward operators for analytical potential fields with the ap-propriate boundary conditions. A series of electrode arrays (surveys) were written toproduce and collect data from the forward model. The survey used in this paper con-sidered all receiver permutations in a grid on the top surface of the model. We notethat it is not possible to experimentally collect data at the same location as the sourceelectrodes; we discarded these permutations.For the numerical experiments presented here, we use a true model with a geo-logic interface with varying elevation. A cross-section through the 3D model at amid-range discretization is seen in Figure B.12. The layer above the interface has aconductivity of 1 Sm−1 and the layer below the interface has a conductivity of 100Sm−1. We produced data from the forward operator, with the true model discretized,using 45×45×45 cells. We created a series of models that ranged from 5×5×5 to40×40×40, over the same domain. At each discretization level, the true model wasdown-sampled onto a regular mesh as well as a curvilinear mesh that was aligned tothe interface. The survey setup was a grid of 4×4 equally spaced electrodes centeredon the top surface of the model. There were a total of 16 electrodes, 120 source config-urations, and 91 active measurements per source dipole. This gave rise to 10,920 totalmeasurements, half of which are symmetric and likely would not have been collectedin a field experiment, but were collected in this numerical experiment. We compare thedata collected from each of the test models directly to the large model’s data and plot176the norm in Figure B.13. The mesh that is aligned to the layer performs significantlybetter at lower discretizations because it is more accurate at resolving the topographicinterface. For example, at a norm data error of 100, the mesh aligned to the layerneeded 163 cells, versus 273 when we used a rectangular discretization - or nearly fivetimes the number of cells. The changes in error at coarse discretizations is due to thelow accuracy in modeling the location of the sources and receivers.The logically rectangular mesh allows increased degrees of freedom when plac-ing the nodes of the mesh. In simple situations it is possible to significantly improveaccuracy of the numerical model at reduced computational costs. However, the in-creased freedoms in picking node locations forces additional thought in mesh creationand alignment. Issues of mesh creation can be complex and these problems must behandled appropriately. It is suggested that simple meshes (i.e. regular) be used whenpossible and to only use the logically rectangular mesh when known layers, such astopography, are well-defined and known to significantly influence the solution of theproblem and data collected.B.7 ConclusionsDiscretization techniques are necessary in every aspect of computational geophysicsand have been used extensively throughout my thesis and collaborative work. Manyof the components in the discretization require special attention when consideringthe inverse problem; most notably, derivatives to the inner products and choosingwhen to cache operators. In this chapter, I have provided a description of the finitevolume techniques that are necessary to discretize and simulate many of the ellipticand parabolic partial differential equations that are common in electromagnetic geo-physics and hydrogeologic fluid flow. I have provided the derivations in a general177form, such that they apply to four different types of meshes in common use in geo-physical inverse problems. This generality allows for differences between the meshtypes to be highlighted and discussed. I have also provided an open source, per-missively licensed implementation of this work. The Python implementation, calleddiscretize (, is object-oriented and highlights the con-cepts and inheritance structure of the meshes under consideration. The numerical ex-ample highlights these meshes in use for a direct current resistivity problem and brieflydiscusses some of the numerical advantages and disadvantages of each mesh type.B.7.1 Continuing workA number of improvements and extensions that have yet to be tackled at the time ofwriting. Among these are (in no particular order): (a) improved ease of use aroundboundary conditions; (b) more utilities for mesh creation especially for more compli-cated meshes (i.e. curvilinear and tree); (c) an increase in the combinatorial nature ofexisting meshes, for example, cylindrical or curvilinear octrees; (d) extension to un-structured meshes, voronoi meshes, and other coordinate systems (e.g. spherical); (e)a more rigorous comparison to finite element codes (cf. Jahandari et al. (2017)); and(f) integration to existing mesh generation packages. By providing this package in anopen, standalone, tested, documented form we hope that the implementation can beimproved by the growing community of contributors.178discretize.BaseMesh.BaseMeshdimnCnEnExnEynEznFnFxnFynFznNnormalstangentsvnEvnFx0projectEdgeVector()projectFaceVector()discretize.BaseMesh.BaseRectangularMeshnCnCxnCynCznExnEynEznFxnFynFznNnNxnNynNzvnCvnExvnEyvnEzvnFxvnFyvnFzvnNr()discretize.CurvilinearMesh.CurvilinearMeshareaedgegridCCgridExgridEygridEzgridFxgridFygridFzgridNnormalstangentsvol discretize.DiffOperators.DiffOperatorsaveCC2FaveE2CCaveE2CCVaveEx2CCaveEy2CCaveEz2CCaveF2CCaveF2CCVaveFx2CCaveFy2CCaveFz2CCaveN2CCaveN2EaveN2FcellGradcellGradBCcellGradxcellGradycellGradzedgeCurlfaceDivfaceDivxfaceDivyfaceDivznodalGradnodalLaplaciangetBCProjWF()getBCProjWF_simple()setCellGradBC()discretize.InnerProducts.InnerProductsgetEdgeInnerProduct()getEdgeInnerProductDeriv()getFaceInnerProduct()getFaceInnerProductDeriv()discretize.CylMesh.CylMeshareaaveE2CCaveE2CCVaveF2CCaveF2CCVcartesianOrigin : recarraycellGradedgeedgeCurlfaceDivfaceDivxfaceDivyfaceDivzisSymmetricnEznNxnNynodalGradnodalLaplacianvectorCCxvectorCCyvectorNxvectorNyvnEyvnEzvnFxvolgetInterpolationMat()getInterpolationMatCartMesh()discretize.TensorMesh.BaseTensorMeshgridCCgridExgridEygridEzgridFxgridFygridFzgridNhhxhyhzvectorCCxvectorCCyvectorCCzvectorNxvectorNyvectorNzgetInterpolationMat()getTensor()isInside()discretize.MeshIO.TensorMeshIOreadModelUBC()readUBC()readVTK()writeModelUBC()writeUBC()writeVTK()discretize.MeshIO.TreeMeshIOreadModelUBC()readUBC()writeUBC()writeVTK()discretize.TensorMesh.TensorMeshareacellBoundaryIndedgefaceBoundaryIndvol discretize.TreeMesh.TreeMeshareaaveE2CCaveE2CCVaveEx2CCaveEy2CCaveEz2CCaveF2CCaveF2CCVaveFx2CCaveFy2CCaveFz2CCaveN2CCedgeedgeCurlfaceDivfillgridCCgridExgridEygridEzgridFxgridFygridFzgridNlevelsmaxLevelnCnEnExnEynEznFnFxnFynFznNnhEnhExnhEynhEznhFnhFxnhFynhFznhNnodalGradntEntExntEyntEzntFntFxntFyntFzntNpermuteCCpermuteEpermuteFvntEvntFvolbalance()corsen()getInterpolationMat()number()plotGrid()plotImage()plotSlice()point2index()refine()Figure B.11: The computational ontology developed for the discretizepackage showing inheritance and commonalities between the four meshtypes.179Figure B.12: Regular mesh and mesh aligned to layer for a simple conductivitymodel at 14×14×14.180Figure B.13: Comparison of norm data error for the regular mesh and the meshaligned to the interface.181Appendix CInterfaces and extensionsC.1 IntroductionIncorporating and quantitatively capturing a-priori and hydrologic and geologic in-formation is a perennial problem in geophysics. Many methods already exist to in-clude geologic information in geophysical simulations and inversions. Broadly, thesemethods include ways of formulating: (a) the inverse problem: through regularization,objective functions, or constraints (cf. Williams (2008)) and (b) the forward simula-tion: through parameterization of physical properties, model conceptualization, andthe dimensionality or physical equations used in the simulation (e.g. Oldenburg andLi (2005); Li and Oldenburg (1998, 1996); Pidlisecky et al. (2007); Li et al. (2010);Pidlisecky et al. (2011); McMillan et al. (2015); Kang et al. (2015a)). Section 2.2.2gives the inversion elements, inverse problem formulation, and a brief discussion ofchoices in the regularization. Furthermore, the inverse formulation and the addition ofregularization can largely be done in an external way. The intricacies inherent in theforward simulation, however, were largely neglected in the general discussion of the182framework. In this appendix, we will explore a number of case studies that are drivenby the context of the inverse problem. The case studies span vadose zone flow, directcurrent resistivity, time and frequency domain electromagnetics, and simple structuralgeologic modelling.C.1.1 What is your model?In all of these case studies, we focus on the question ‘What is your model?’; that is,how does the framework approach enable custom model conceptualizations? The stan-dard approach in exploration geophysics of a 3D distribution of voxels is both generaland widely applicable. However, this approach falls short when integrating with fieldsof hydrology or geology. For example, Pidlisecky et al. (2011) uses a hydrologic con-ceptualization for a geophysical model and inverts directly for the spatial morphologyof a solute plume through a moment-based description. McMillan et al. (2015) takes asimilar approach to invert time domain electromagnetic data for a thin geologic unit ofvariable dip. In direct coupling of geophysical and hydrologic inverse problems, theparameterization of the model must be flexible and extensible by researchers who areasking new scientific questions.The focus on the forward simulation is to expand on the capabilities of the frame-work developed in Chapter 2. The forward simulation framework must have the flexi-bility to support and interface to arbitrary parameterizations. A general architecturerequires that derivatives be calculated with respect to: (a) multiple physical prop-erties in the physical equations (e.g. electrical conductivity, hydraulic conductivity,and magnetic permeability); (b) the sources (e.g. waveform, transmitter location, andboundary conditions); (c) estimation of additional fields/fluxes from the solution ofthe PDE (e.g. saturation field from pressure head); and, (d) from the measurements183(e.g. location and orientation of data collection). In order to support the custom pa-rameterizations that are necessary for a unique decision or prediction, this architecturemust provide building blocks that are independently extensible. The PEST frameworkfor model independent parameter estimation and uncertainty analysis provides a con-crete example of where this has previously been done with success (Doherty, 2004).The software is widely cited in academia (> 2K citations) especially in hydrologyand hydrogeophysics, and is heavily used in industry. The advantage of being model-independent has given this technique wide application due to the flexibility to adapt tonew scientific questions. However, this flexibility also comes at quite a cost becausethe structure of the simulation and modelling cannot be used to the advantage of thealgorithm. As with the Richards equation or electromagnetics, when moving to threedimensions there may be hundreds of thousands to millions of parameters to estimate.Not taking the structure of the problem into account severely limits the types and sizesof problems that can be considered. In the following sections, we will briefly describesome of the necessary components to simultaneously support a breadth of custom pa-rameterizations. These abstractions have been possible through collaborative workacross multiple fields, including electromagnetics, fluid flow, and parameterized geo-logic modeling.C.2 Forward simulation frameworkThe aim of the forward simulation is to compute predicted data, dpred, provided an in-version model, m, and sources, which may come in the form of boundary conditions,are used. Here, we use the term, inversion model, to describe a parameterized repre-sentation of the earth (voxel-based or some other parametric representation; that is, themodel is an array of numbers). Even if the model is solely used for forward modelling,184its form sets the context for the inverse problem and the parameter-space that is to beexplored. Additionally, it is often an advantage to explore the sensitivity of predicteddata or fields with respect to physical properties, sources, or receivers (for example, insurvey design or uncertainty estimation). The forward simulation framework was leadprimarily from the conceptual pieces necessary from the Richards equation (Chapter 3)as well as time and frequency domain electromagnetics, which are presented in Heagyet al. (2016). Figure C.1 shows the framework for the forward simulation as distilledfrom this collaborative work, which extends the general framework presented in Chap-ter 2 and, similarly, consists of two overarching categories (Cockett et al., 2015c):• the Problem, which is the implementation of the governing equations,• the Survey, which provides the source(s) to excite the system as well as thereceivers to sample the fields and produce predicted data at receiver locations.C.2.1 ConceptsHere, we provide a brief overview of each of the components in the forward sim-ulation. An in-depth discussion about each component in this framework has beenpublished in the context of electromagnetic problems (Heagy et al., 2016). Here,we present a succinct adaptation of this work in the context of the Richards equationby walking through Figure C.1. To compute pressure head responses everywhere inspace and time, the forward simulation requires the definition of two physical propertyfunctions, which describe the water retention curve (θ(ψ)) and hydraulic conductivityfunction (K(ψ)) on the simulation mesh. This method differs from many geophysicalmethods where physical properties are not dependent on the fields/fluxes (for exam-ple, electrical conductivity in direct current resistivity). Analogies exist here between185Figure C.1: The forward simulation framework that is used for Richards equa-tion.induced polarization geophysical methods and the Richards equation, where the mea-sured electrical conductivity depends upon the frequency of the alternating current.Regardless of the physics used, physical properties (or functions) must be definedthroughout the computational domain. The physics defines the equation, in this casethe Richards equation, which can be written as a matrix equation:A(m)u = q(s)where u is the solution of the matrix solve (possibly a field) and q(s) is both the right-hand side and the function of a source. In the time domain case, when backward Euler186is used, A amounts to a block bidiagonal matrix. The boundary conditions can beseen as sources, s, when the discretization is written in weak-form (Section B.2.5).The sources must be included in both the physics and, in the case of electromagnetics,the fields. In the case of the Richards equations, the evaluation of the pressure headfield to the water content field is dependent on the water retention curve, which may,in turn, depend on the model. These fields, defined everywhere in the computationaldomain, can be spatially and temporally interpolated onto the measurement locationsof the receivers to create predicted data. In other geophysical methodologies, the datamay be produced through a spatial or temporal derivative (e.g. gravity gradiometry), apotential difference (e.g. direct current resistivity), or a ratio involving multiple geo-physical fields (e.g. magnetotellurics). Figure C.1 shows the conceptual steps involvedin going from a model vector to predicted data. Heagy et al. (2016) discusses some ad-ditional intricacies; however, this conceptual organization into modular, exchangeable,and testable components is helpful when tackling the implicit derivatives necessary inthe inverse problem.C.2.2 DerivativesThe process we follow to compute matrix-vector products with the sensitivity is shownin Figure C.2, where Jv is built in stages by taking matrix vector products with the rele-vant derivatives in each component. This process is shown schematically in Figure C.2for both Jv and J>v. As an example, let us consider our model to be the van Genuchtenparameter, α , which is defined as a heterogeneous parameter inside the computationaldomain. In this case, m = α and no other parameters are considered to be active. The187sensitivity of the forward simulation to the model takes the form:J[m] =dF [m]dm=dP(f)dfdfdαdαdm=dP(f)df︸ ︷︷ ︸Receivers(∂ f∂uPhysics︷︸︸︷dudα+∂ f∂ sSources︷︸︸︷dsdα+∂ f∂α)︸ ︷︷ ︸Fieldsdαdm︸︷︷︸Properties(C.1)The annotations in Figure C.2 denote which of the elements shown are responsiblefor computing the respective contribution to the sensitivity. If the model provided isinstead in terms of Ks, this property replaces the role of α . The flexibility to invokedistinct properties of interest (e.g. α , Ks, source location, etc.) in the inversion re-quires quite a bit of ‘wiring’ to keep track of which model parameters are associatedwith which properties (physical properties, location properties, boundary conditions,etc.); this ‘wiring’ is achieved through a general Wires class that keeps track of theconnections between properties and the model (Program C.2).Although typically the source terms do not have model dependence and thus theirderivatives are zero, the derivatives of s must be considered in a general implementa-tion (Heagy et al., 2016). For example, if one wishes to use a nested approach, wheresource fields or boundary conditions are constructed by solving a simplified or dif-ferent physics problem, the source terms may have dependance on the model. Thesource terms’ dependance on the model means that their derivatives have a non-zerocontribution to the sensitivity (cf. Haber (2015); Heagy et al. (2015c, 2016)).The derivative of the solution, vector u with respect to the model, is found by188implicitly taking the derivative of the physics with respect to m, giving:dudm= A−1(m)(− ∂A(m)ufix∂m︸ ︷︷ ︸getADeriv+∂q∂ sdsdm+∂q∂m︸ ︷︷ ︸getRHSDeriv)(C.2)The annotations below the equation indicate the methods of the Problem class thatare responsible for calculating the respective derivatives. If multiple physical prop-erties are internal to the A, then this is indicated by an underscore in the methodname (getADeriv sigma). Typically the model dependence of the system ma-trix is through the physical properties. (for example, σ , µ in electromagnetics or Ks,α , and other van Genuchten parameters in the Richards equation). Thus, to computederivatives with respect to m, we first take the derivatives with respect to α and wetreat the dependence of α on m using chain rule.Figure C.2: The components required in calculating the derivatives of the for-ward simulation, showing (a) the modular nature of each derivative; (b)the process of multiplying each derivative in the forward sense with Jv;and (c) in the adjoint sense with J>v.189C.2.3 Properties and MappingsOften, in solving an inverse problem, the model which we choose to invert for (the vec-tor m) is some discrete representation of the earth that is decoupled from the physicalproperty model or perhaps represents multiple physical properties. This decouplingrequires the definition of a Mapping that is capable of translating m to physical prop-erties on the simulation mesh. For instance, if the inversion model is chosen to belog-conductivity, an exponential mapping is required to obtain electrical conductivity(i.e. σ =M (m)). To support this abstraction and integration to model conceptu-alization, the framework defines a number of extensible Mapping classes (Cockettet al., 2015c; Kang et al., 2015a; Heagy et al., 2016). These Mappings must alsobe able to deal with the potential for multiple physical properties, potentially in dif-ferent locations, in the forward simulation framework. For example, in the Richardsequation, the water retention function is required in both the physics and the fieldsto translate pressure head to water content or saturation. As we are in the processof developing a computational ontology that is extensible and applicable across dis-ciplines, our framework is explicit about where the physical properties (or functionsand their properties) are involved in the forward simulation framework. For example,in Program C.1, we can define a base class that represents the hydraulic conductivityfunction. Many variants (subclasses) of this function exist, which define parametersin different ways, including van Genuchten, Brooks-Corey, splines, interpolation, etc.The properties on these subclasses are the parameters that one might be interested ininverting for (in van Genuchten, these would include Ks, α , n, and I). The Problemclass, which contains the physics, can declaratively define that it requires a valid in-stance of a hydraulic conductivity function. This inheritance is similar for the water190retention, which can be defined on both the problem and the fields. The declarativenature of these properties is the backbone of extracting an ontology, which canbe used in a variety of other ways (specification, search, documentation, collaboration,etc.). In defining the properties, we also define a map,M (m), to that property as wellas a derivative with respect to the model.import propertiesfrom SimPEG import Props, Problemclass HydraulicConductivity(Props.HasModel):"""The base hydraulic conductivity function"""class Vangenuchten_k(HydraulicConductivity):Ks, KsMap, KsDeriv = Props.Invertible("Saturated hydraulic conductivity",default=1e3)alpha, alphaMap, alphaDeriv = Props.Invertible("inverse of the air entry suction [L-1]",default=6.0, min=0)class BrooksCorey_k(HydraulicConductivity):"""Or a spline, or something custom"""class RichardsProblem(Problem.BaseTimeProblem):hydraulic_conductivity = properties.Instance(’hydraulic conductivity function’,HydraulicConductivity)Program C.1: Definition of a hydraulic conductivity model with multiple invert-ible properties that is declaratively attached to the Richards problemclass.The mappings from the model space to a physical property on the computationaldomain are a key way to interface to domain knowledge in other disciplines (e.g. struc-tural geology). Additionally, they allow the inversion methodology to turn on or offsensitivity to the model at a property by property basis; this is completed by a Wires191utility, which creates projection matrices that sample the model vector at the appro-priate location(s). Program C.2 demonstrates this utility in Python code and showsthe outcome of assumptions using assert statements. In this case, we define the modelvectors to be Ks and α and use the wires to attach these vectors to the physical prop-erties. In the inversion, however, we may wish to invert in log conductivity space forKs, as this property varies logarithmically rather than linearly. To do this, we use anadditional Mapping to take the exponential of the model before setting the parameterin the hydraulic conductivity function. The length of these transformations is arbitraryand extremely specific to the case study or geoscience hypothesis (e.g. survey design,parametric inversions, or sensitivity analysis). The mappings between model concep-tualization and physical properties can be broken into composable, reusable pieces,which can use the chain rule to evaluate the derivative of the transformation. Fig-ure C.3 shows this evaluation in action, where a logarithmically scaled conductivity ina layered earth is transformed into the physical property on the entire computationaldomain.Figure C.3: Mapping an inversion model, a 1D layered, log conductivity modeldefined below the surface, to electrical conductivity defined in the full sim-ulation domain.192import numpy as npfrom SimPEG import Mesh, Mapsfrom SimPEG.FLOW.Richards import Empiricalmesh = Mesh.TensorMesh([[(1, 40)]], ’N’) # Create mesh with 40 cellsk_fun, theta_fun = Empirical.van_genuchten(mesh) # Create empirical modelsk_sat = 1e-3 # Define homogeneous physical propertiesalpha = 6.0k_sat_model = [np.log(k_sat)]*mesh.nC # Put values into array, log(Ks) is the modelalpha_model = [alpha]*mesh.nCmodel = np.array(k_sat_model + alpha_model)assert len(model) == mesh.nC * 2 # We have 80 model parameterstheta_r = np.array([0.02]*mesh.nC) # Define other properties (not in model)theta_s = np.array([0.30]*mesh.nC)wires = Maps.Wires( # Create wires from model to properties(’Ks’, mesh.nC), # Ks gets 40 parameters(’alpha’, mesh.nC) # α gets 40 parameters)# Use the maps to define Ks and alphak_fun.KsMap = Maps.ExpMap(mesh) * wires.Ks # Add the exponential mappingk_fun.alphaMap = theta_fun.alphaMap = wires.alpha # Note that alpha is in both functionstheta_fun.theta_r = theta_r # Use the properties to define θr & θstheta_fun.theta_s = theta_sk_fun.model = theta_fun.model = model # Set the model for each function# Test the setup of the functionsassert np.isclose(k_fun.alpha[0], alpha) # Check that the mappings are workingassert np.isclose(theta_fun.alpha[0], alpha)assert np.isclose(k_fun.Ks[0], k_sat) # Including the exponentialassert k_fun.KsDeriv.shape == (40, 80) # Check the Ks derivative is the correct shapeassert theta_fun.theta_rMap is None # Check that there are no Maps for θrassert theta_fun.theta_sMap is None # Nor for θsassert theta_fun.theta_rDeriv == 0 # Which means the derivative, ∂θr∂m , is zeroprint(k_fun.summary()) # Print a summary of the function# >> Physical Properties:# >> [*] I: set by default value# >> [*] Ks: set by the ‘KsMap‘: ComboMap[ExpMap(40,40) * Projection(40,80)] * model(80)# >> [*] alpha: set by the ‘alphaMap‘: Projection(40,80) * model(80)# >> [*] n: set by default valueProgram C.2: Demonstration of the ability to choose arbitrary parameters to in-clude in a model, and use the chain rule to compose parameterizations.193C.3 ParameterizationsIn the following case studies, I will touch briefly on some key questions and how theframework developed can answer these classes of questions. Many of the case studieshave been published and represent collaborative work with many colleagues. In thissection, rather than repeating the conclusion of each study, I will highlight what I havelearned and how the framework developed can be used.Some geoscience questions:• What is the distribution of physical properties?– What space?– What about known distributions?• What is the sensitivity to a conceptual model?– For example, in survey design?– How does this extend to structural geologic modelling?• What is the dimensionality (1D, 2D, 3D, or 4D) of the problem?– For example, in electromagnetics?– How do we keep control variables when testing assumptions?• How can we integrate, nest, couple, and join different problems?– For example, in a primary secondary formulation?– In the general case?194C.3.1 Expected distributionsElectrical or hydraulic conductivity are often parameterized logarithmically. This loga-rithmic parameterization changes the space in which the inversion ‘searches’ for an an-swer to the optimization problem. In the Richards equation, for example, by choosingthe parameterization of van Genuchten, we constrain the inversion to pulling only froma set of parameterizations of this function. In Heagy et al. (2014), we investigated twoapproaches for identifying the extent and location of an electrically conductive prop-pant. Hydraulic fracturing uses sand or ceramic beads to prop newly created fracturesopen; as such, the location of the proppant represents the volume of a reservoir thatcan be effectively drained. The standard approach involves using an uncoupled geo-physical inversion to create an image of electrical conductivity and then, subsequently,interpret that distribution either qualitatively or quantitatively (Figure C.4). In Heagyet al. (2013), the authors investigated the geophysical responses of conductive and per-meable proppant particles. Later research by Heagy lead to a parameterization usingeffective medium theory to analytically describe the relationship between conductivityand volume fraction of the proppant. In Heagy et al. (2014), we used this relation di-rectly in a coupled geophysical inversion for the volume of the proppant. Furthermore,by parameterizing the inversion in terms of volume, rather than conductivity, we canuse the known volume of the proppant pumped into the synthetic reservoir as anotherdatum. This coupled methodology was tested on a synthetic example and the jointinversion of geophysical and volume data showed promising results.This coupled geophysical method can be completed through a single, custom madeMapping that codifies the effective medium theory parameterization and the deriva-tive. We can then attach this mapping to a physical property, such as electrical con-195ductivity. By changing the space in which we choose to conceptualize the model, it ispossible to add more a priori information and other datum.Figure C.4: (a) Traditional approach to inversion, where the model space, elec-trical conductivity, is mapped to data space, the electromagnetic response,through a forward model. The inversion then provides a method by whichwe estimate a model that is consistent with the observed data. The re-covered conductivity model is then used to infer information about thereservoir properties of interest, in this case, the distribution of proppant.(b) Parametrized inversion, where we parametrize the model space, elec-trical conductivity, in terms of the property of interest, the distribution ofproppant. By defining such a parametrization, the inversion can provide ameans of estimating the properties of interest directly from the data.C.3.2 Survey designHeagy et al. (2016) presented the forward simulation framework for the context ofelectromagnetic simulations and inversions. One of the examples in this paper dealtwith a steel-cased well, which was used to deliver a galvanic or inductive source toa target reservoir (Figure C.5). Here, we posed the question: How sensitive are thedata to the location, depth, and conductivity of a target in a reservoir? To answer thisquestion, we conceptualized a simple model of location, dimensions, and conductiv-ities of an idealized block in a reservoir layer; all model parameters are visualized inFigure C.5. The challenge here is that the steel-cased well has a thickness on the orderof millimeters, while being kilometers long. Tackling this challenge in a computa-196tionally efficient manner required a primary-secondary approach, where source fieldswere constructed by solving a simplified problem without the target on a cylindricallysymmetric mesh. However, this approach required that the contribution of the modelsensitivity had to be efficiently traced all the way back through the primary fields.Figure C.5 shows the nesting of two forward modelling frameworks. By looking atthe sensitivity to these model parameters, the paper drew conclusions about potentialsurvey designs and to what extent the conceptual model could be resolved by noisydata.Figure C.5: Setup of a parametric models for a steel cased well and a reservoirtarget. The calculation of sensitivity for using a primary secondary ap-proach is shown using the forward simulation framework.This example required multiple formulations of Maxwell’s equations on two dif-ferent mesh types. The model could be conceptualized and potentially mapped toboth electrical conductivity and magnetic permeability. The mappings required forthis model conceptualization are largely reusable for other contexts. The sensitivity tothe model parameters required that two problems be nested, which showed the valueof composable pieces in a geophysical inversion framework.197C.3.3 Geologic modelingWhile completing my studies, I was involved in creating a number of structural geo-logic modelling tools (Cockett et al., 2014a; Cockett, 2012, 2015; Funning and Cock-ett, 2012; Cockett et al., 2016b; Cockett, 2013). Over 300,000 people around the worldhave used these tools, primarily in introductory geoscience education (Cockett et al.,2016b). These tools allow rapid conceptual modelling of geologic scenarios (Fig-ure C.6). Again, this process can be thought of as a mapping that codifies geologicknowledge and provides physical properties as a function of space. Using these map-pings directly in an inversion, however, is difficult as the parameterization is spatiallycoupled; that is, a single parameter, such as rotation of a tilting event or period ofa folding event, can change almost all physical properties at once. This spatial cou-pling is in contrast to a spatially decoupled voxel-based parameterization, where eachcell centered parameter has no effect on its neighbours. Due to the spatial coupling,the explicit geologic parameterizations developed in (Cockett et al., 2016b) creates anon-convex objective function, which is difficult to minimize with deterministic opti-mization. Implicit geologic modelling, in contrast, solves an inverse problem by us-ing radial basis functions to create a spatially extensive geologic parameterization (cf.Hillier et al. (2014)). Including this implicit modelling into the geophysics inversionframework as a mapping could be promising way to include geologic information.C.3.4 Dimensionality and controlled variablesIn Kang et al. (2015a), we explored the advantages to increasing the dimensionalityof the physical problem being solved (in this case, ground loop, time domain electro-magnetics). We explored these advantages over a three-dimensional synthetic modelof seawater intrusion into a confined aquifer (Figure C.7). The experiment allowed198Figure C.6: Parameterized geologic models using a series of analytic functions.Models were created using Visible Geology ( comparisons between the various dimensions because the tools were written in aconsistent framework where only a single variable was changing at a time. The in-creased dimensionality, in this case, led to a better recovery of the seawater intrusioninterface. (Kang et al., 2014) subsequently fit the saltwater intrusion by a parametricsurface. In Pidlisecky et al. (2013), however, we used a decrease in dimensionality(from 2.5D to 1D), which was tested for the ability to similarly fit collected data. InHeagy et al. (2016), data inversions were explored in both time domain and frequencydomain electromagnetics. Figure C.8 shows the similarities between the two forwardsimulation frameworks. Although the internals of the implementation differ slightly,the inversion framework that we used is identical.The flexibility to explore the dimensionality of the problem under consideration isimportant; this can be enabled by a consistent set of tools that allow you to control allvariables except the dimensionality of the computational domain. When comparingbetween formulations of the same physics, the forward problems can inherit from andutilize many of the same components and the inversion machinery is identical. Rather199Figure C.7: Conceptual diagram of moving between 1D, 2D, and 3D models.than comparing inversion results to a completely different black-box implementation,these shared components of the framework promotes controlled variables and morerigorous comparisons of inversion methodologies across physical problems.C.3.5 NestingThere are many future research opportunities for combining, coupling, nesting, or oth-erwise integrating different physical problems. An example of these opportunitieswas completed in Heagy et al. (2016), where we nested a physics problem inside an-other one (Figure C.5). In this case, we nested the same physics problem; however,in Rosenkjaer et al. (2016), the authors used the framework to combine electromag-netic dipole sources with the magnetotelluric forward modelling to investigate coherentsource interference. The growing field of hydrogeophysics will continue to combinegeophysics and fluid flow problems; the use of a consistent and integrated frameworkshould help to further these goals.200Figure C.8: Diagram showing the entire setup and organization of (a) the fre-quency domain simulation; (b) the time domain simulation; and (c) thecommon inversion framework used for each example. The muted textshows the programmatic inputs to each class instance.C.4 ConclusionsGeologic and a priori assumptions are often given through parameterizations. Theparameterizations necessary for a specific case study will often need to be unique toanswer or predict a specific geoscience hypothesis. Through the exploration of nu-merous case studies across the geophysical, hydrologic, and geologic literature, wepresented a forward modelling framework in Heagy et al. (2016). The important as-pects of this framework are: (a) rigorously defining the properties that can be invertedfor; (b) allowing these properties to be defined directly or wiring the property to themodel through a mapping; and, (c) providing a number of extensible and reusablemapping components, which, when chained together, allow for efficient calculation ofthe sensitivity. The forward modelling framework presented maintains the computa-tional scalability that is necessary to invert for large-scale 3D distributions of param-201eters (such as in electromagnetics or vadose zone flow). It is important to note thatthis framework also allows for customization through extensible mappings to custommodel conceptualizations. By defining these components in a common framework,we enable hypothesis testing and exploration of assumptions by changing single vari-ables (e.g. formulation or dimensionality) while keeping other variables controlled.The nesting and coupling of forward simulations, combined with these mappings be-tween model conceptualizations, is important to any framework that aims to enhancequantitative geoscience collaboration.202


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items