FORWARD M O D E L L I N G AND INVERSION OF G E O P H Y S I C A L M A G N E T I C D A T A by PETER GEORGE L E L I E V R E A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES (Department of Earth and Ocean Sciences, Geophysics) We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA March 2003 © Peter George Lelievre, 2003 In p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t of the requirements f o r an advanced degree a t t h e U n i v e r s i t y o f B r i t i s h Columbia, I agree t h a t t h e L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and study. I f u r t h e r agree t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y purposes may be g r a n t e d by t h e head o f my department o r by h i s o r h e r r e p r e s e n t a t i v e s . I t i s u n d e r s t o o d t h a t c o p y i n g o r p u b l i c a t i o n of t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l n o t be a l l o w e d w i t h o u t my w r i t t e n p e r m i s s i o n . Department o f £ a . r l k & O^do^n £c^£,<Ace,S The U n i v e r s i t y o f B r i t i s h Columbia Vancouver, Canada Date i 7 MgrcU lOQ? Abstract The ultimate goal of this research was to invert geophysical magnetic data to recover three dimensional distributions of subsurface magnetic susceptibility of any possible magnitude and geometric complexity. Magnetic data collected over bodies of high susceptibility contain significant self-demagnetization effects. Self-demagnetization causes magnetizations to rotate awayfromthe external inducingfieldand causes the amplitude of the magnetic response to scale nonlinearly with susceptibility. These effects are highly dependent on the shape of the object and they complicate interpretation. Examples where self-demagnetization is important include surveys for detection and discrimination of unexploded military ordnance (UXO) and mineral exploration surveys over highly mineralized banded iron formations and nickel deposits. Current modelling methods that account for self-demagnetization effects are limited to simple bodies, such as ellipsoids, where the geometry of the body is represented by a few parameters. Standard forward modelling methods for general susceptibility distributions (i.e. methods that can deal with complicated bodies) neglect the effects of self-demagnetization and can produce inaccurate results and subsequent deterioration in performance of the inverse solution. Here, a full solution to Maxwell's equations for source-free magnetostatics was developed using a finite volume discretization. The Earth region of interest is discretized into many prismatic cells, each with constant susceptibility, which allows for models of arbitrary geometric complexity. The finite volume forward modelling method is valid for any linear medium and is appropriate for modelling the response of highly magnetic objects. The forward modelling method was refined and the code was tested against analytical solutions for simple bodies and against a slower, more memory intensive full solution for general distributions formulated in the integral equation domain. All tests showed the forward modelling method to be sound within expected error tolerances. The finite volume modelling method formed the foundation for a subsequent inversion algorithm. In the discretization, many more model cells are used than there are data. As such, the inverse problem is underdetermined. The inverse problem was formulated as an unconstrained optimization problem in which an objective function is minimized. The objective function was designed so that the data arefitto an acceptable degree and the recovered model has desired spatial characteristics. The resulting optimization problem was nonlinear and required an iterative solution, for which a Gauss-Newton approach was used. Testing for the inversion code included inversion of synthetic data for simple bodies and inversion of survey data collected over a planted UXO target. All tests showed positive results. ii Table of Contents Abstract ii List of Figures ix Acknowledgements 1 The state of the art and the problem at hand 1.1 1.2 2 3 xvi 1 Introduction 1 1.1.1 Magnetic induction and important magnetic quantities 1 1.1.2 Magnetic survey methods 3 1.1.3 The effect of high susceptibility on induced magnetization . . 4 1.1.4 Motivation for research 5 A standard approximate solution 6 1.2.1 Discretization 6 1.2.2 The standard linear approximation 7 1.2.3 The linear inverse problem 8 1.3 Insufficiency of the linear approach for high susceptibility 1.4 A mineral exploration example 1.5 Research goals and thesis outline 9 17 . . . . 20 The magnetostatic governing equations and boundary conditions 22 2.1 Introduction 22 2.2 Maxwell's Equations for source-free magnetostatics 22 2.3 Physical continuity principles 24 2.3.1 The interface conditions 24 2.3.2 The boundary conditions 25 Discretization of the magnetostatic governing equations 26 3.1 Introduction 26 3.2 The system 26 3.3 The discrete grid and discrete variables 26 iii 4 5 6 3.3.1 Nomenclature for the discrete variables 26 3.3.2 The discrete grid 27 3.3.3 Grid coordinates and lengths 27 3.3.4 Variable numbering 29 3.4 Finite volume discretization 30 3.5 Discretizing the divergence equation 30 3.6 Discretizing the gradient equation 33 3.6.1 Arithmetic and harmonic averaging 33 3.6.2 Arguments for harmonic averaging 36 3.6.3 40 Discretization of ju B = V<f> l 3.7 An alternate formulation 43 3.8 Summary 46 Specifying the boundary conditions 48 4.1 Boundaryfluxapproximation 48 4.2 The congruous sphere method 48 4.3 The discretized divergence equation 50 Secondary flux and field formulations 52 5.1 The total flux formulation 52 5.2 A secondary flux formulation 53 5.2.1 Approach 1: formulatefirst,discretize second 53 5.2.2 Approach 2: discretizefirst,formulate second 55 5.3 Summary of the flux formulations 56 5.4 A secondaryfieldformulation 56 5.5 Generalized notation for forward modelling 57 Solving the discrete forward modelling equations 58 6.1 Introduction 58 6.2 Solving the discrete system for the potentials 58 6.2.1 58 The solver chosen 6.2.2 Preconditioning 62 iv 6.3 6.4 7 8 SSOR preconditioning 62 6.2.4 ILU preconditioning 63 6.2.5 The best preconditioner and the optimal parameter value 64 Accuracy of the flux and field formulations 68 6.3.1 The flux formulation 68 6.3.2 The field formulation 69 6.3.3 Comparison of the flux andfieldformulations 70 Grid design 71 An alternate solution in the integral equation domain 73 7.1 Introduction 73 7.2 A full discrete solution to the magnetic problem in the integral equation domain . . 73 7.3 Practical aspects of the full integral equation domain full solution 77 Testing the forward modelling 78 8.1 Introduction 78 8.2 Spherical and cubic magnetic bodies 78 8.2.1 Profiles above a sphere and a cube 78 8.2.2 Profiles above a cube with refined grids 82 8.2.3 Profiles away from a sphere and cube 85 8.2.4 Summary 86 8.3 9 6.2.3 A prolate spheroid 86 8.3.1 Introduction 86 8.3.2 Profiles above a prolate spheroid 90 8.3.3 Profiles through a prolate spheroid 92 8.3.4 Summary 99 8.4 Consistency of the boundary conditions 8.5 Conclusion 99 103 Inversion 105 9.1 Introduction 105 9.2 An unconstrained optimization approach to the inverse problem 106 v 9.3 9.2.1 Non-uniqueness 106 9.2.2 The discrete Earth model 106 9.2.3 Dealing with non-uniqueness 106 9.2.4 The model objective function 107 9.2.5 The discrete form of the model objective function 108 9.2.6 The data misfit 109 9.2.7 110 The positivity constraint An iterative approach to the optimization problem 112 9.3.1 A Gauss-Newton approach 112 9.3.2 Complications arising from the positivity constraint 117 9.3.3 The Steepest Descent method as backup 119 9.4 A split model formulation 119 9.5 The regularization parameter 121 9.6 The inversion algorithm 122 9.7 Practical aspects of the inversion algorithm 124 9.7.1 Depth weighting 124 9.7.2 Estimating an initial value for /? 124 9.7.3 Choosing and initial model 125 9.7.4 The line-search for a 125 9.7.5 Searching for m that minimizes 0 126 9.7.6 Searching for an appropriate value of 126 9.7.7 An initial guess for the forward solution 127 10 Calculation of the sensitivity matrix 128 10.1 Introduction 128 10.2 Revisiting the forward problem for use in an inversion algorithm 129 10.3 Expressions for Jfromthe flux formulation 130 10.4 Derivation of the required elements of J 133 10.4.1 Derivation of ¥ ( w ) 133 10.4.2 Derivation of dg/dm for a grid-centred congruous sphere vi 135 10.4.3 Options for dealing with the boundary conditions 138 10.5 Computing Jx and J x 139 10.6 Results for the field formulation 141 T 10.6.1 Revisiting the field formulation 141 10.6.2 Derivation of ^(w) for the field formulation 142 10.6.3 Computing Jx and J x for the field formulation 144 T 10.7 Measuring the total flux magnitude 145 10.8 Testing the calculation algorithm 147 11 Testing the inversion algorithm 151 11.1 Introduction 151 11.2 Inversion of synthetic data for an elongated rectangular prism 151 11.3 Inversion of survey data obtained above a planted UXO target 160 11.4 Conclusion 169 12 Discussion and conclusions .170 12.1 Summary 170 12.2 Application of the methods presented 170 12.2.1 Assumptions made in the forward solutions 170 12.2.2 Application of the methods to linear problems 171 12.2.3 Application of the methods to dispersed problems 172 12.2.4 Application of the methods to compact problems 173 12.3 Possible amendments to the methods presented 173 12.3.1 Extension to include anisotropy and remanent magnetization 173 12.3.2 Alteration of the inversion algorithm 173 12.4 Conclusion 175 References 176 Appendix A - nomenclature 179 A. 1 Differential and integral equations 179 A.2 Matrix-vector equations 179 Appendix B - Discretization for the field formulation vii 180 B.l Introduction 180 B.2 The system 180 B.3 The discrete grid and discrete variables 180 B.3.1 Nomenclature for the discrete variables 180 B.3.2 The discrete grid 180 B.3.3 Grid coordinates and lengths 181 B.3.4 Variable numbering 182 B.4 Finite volume discretization 182 B.5 Discretizing the divergence equation 182 B.6 Discretizing the gradient equation 186 B. 7 Derivation of a secondary field formulation B.7.1 The total field formulation 189 B.7.2 Secondary formulation approach 1: formulate first, discretize second 189 B.7.3 Secondary formulation approach 2: discretize first, formulate second 191 B.7.4 Comparison of the two approaches 191 Appendix C - Notable observations from the inversion tests Cl 189 Depth weighting 193 193 C. 2 Practical choice of the initial model 193 C. 3 Regional removal 199 Appendix D - Remanent magnetization and anisotropy D. l Introduction 200 200 D.2 Magnetic anisotropy 200 D.3 Remanent magnetization and hysteresis 201 D.4 Modelling anisotropy 203 D.5 Modelling remanent magnetization 205 viii List of Figures 1.1 A schematic representation of magnetic induction of a magnetic body showing the primary inducing field, Ho, the induced magnetization, M , and the associated secondary field, H 1 1.2 Typical magnetic susceptibilities for some common minerals and rock types 2 1.3 Individual magnetic fields for two magnetic dipoles with equal moments 4 1.4 An orthogonal grid of nc discrete cells, each with constant susceptibility, Xi 7 1.5 A vertical cross-section through a horizontal magnetic spheroid in an inclined s inducing 1.6 field Total flux magnitude data map and profile for a horizontal, prolate, magnetic spheroid of susceptibility ^'=10" in an inducingfieldinclined 70° to the horizontal 5 / 1.7 12 A vertical cross-section through a typical recovered model from MAG3D inversion of the data set in Figure 1.6 1.9 13 A horizontal cross-section through a typical recovered model from MAG3D inversion of the data set in Figure 1.6 1.10 13 Total flux magnitude data map and profile for a horizontal, prolate, magnetic spheroid of susceptibility #=10 1.11 in an inducingfieldinclined 70° to the horizontal 15 A vertical cross-section through a typical recovered model from MAG3D inversion of the data set in Figure 1.10 1.13 16 A horizontal cross-section through a typical recovered model from MAG3D inversion of the data set in Figure 1.10 1.14 14 Total flux magnitude data map and profile for a magnetic dipole in an inducing field inclined 20° to the horizontal 1.12 11 Total flux magnitude data map and profile for a magnetic dipole in an inducing field inclined 70° to the horizontal 1.8 10 16 A plan view of ground-based, total flux magnitude survey data for the Osborne mine region 17 ix 1.15 Total flux magnitude data for a dipole with inclination -53.3° and declination 47.5°E. 18 1.16 Interpolated geological cross-section 11670E through the Osborne mine region 19 1.17 Interpolated geological cross-section 21700N through the Osborne mine region 20 2.1 H and B at an interface where ju is discontinuous 24 3.1 A single discrete grid cell for the flux formulation 3.2 Position and numbering of discrete variables in the flux formulation for a one- 27 dimensional discretization 28 3.3 Grid cell numbering for nx=3, ny=4, nz=2 29 3.4 Homogenization of ju across an interface for a one dimensional problem 35 3.5 A comparison of arithmetic and harmonic averaging 36 3.6 Two analogous electrical circuits 37 3.7 Fluid flow through two connected porous media 38 3.8 A single discrete grid cell for the field formulation 44 3.9 The true and dual grids at a boundary for the field formulation 45 6.1 The Preconditioned Bi-Conjugate Gradient Stabilized Method 60 6.2 A typical BiCGStab convergence curve 61 6.3 Typical curves of CPU solution time for the SSOR and ILU preconditioners for many values of the parameters CO and droptol. The grid had 15 cubic cells of dimension lm 3 and the solution tolerance was 10" 64 5 6.4 Typical curves of CPU time for the ILU decomposition and subsequent BiCGStab solution time for many values of the parameter droptol. The grid had 15 cubic cells 3 of dimension lm and the solution tolerance was 10" 65 5 6.5 Typical curves of CPU solution time for the SSOR and ILU preconditioners for many values of the parameters coand droptol. The grid had 10 cubic cells of dimension lm 3 and the solution tolerance was IO" 65 5 6.6 Typical curves of CPU solution time for the SSOR and ILU preconditioners for many values of the parameters &>and droptol. The grid had 20 cubic cells of dimension lm 3 and the solution tolerance was 10" 66 5 x 6.7 Typical curves of CPU solution time for the SSOR and ILU preconditioners for many values of the parameters 6) and droptol. The grid had 15 cubic cells of dimension 0.1m 3 and the solution tolerance was 10" 66 5 6.8 Typical curves of CPU solution time for the SSOR and ILU preconditioners for many values of the parameters co and droptol. The grid had 15 cubic cells of dimension 10m 3 and the solution tolerance was 10" 67 5 6.9 Typical curves of CPU solution time for the SSOR and ILU preconditioners for many values of the parameters co and droptol. The grid had 15 cubic cells of dimension lm 3 and the solution tolerance was 1(T 6.10 67 Typical curves of CPU solution time for the SSOR and ILU preconditioners for many values of the parameters co and droptol. The grid had 15 cubic cells of dimension lm 3 and the solution tolerance was 10" 68 8 8.1 A schematic diagram of the first test scenario: flux profiles above and away from a susceptible sphere and cube that are concentric and have equal volume 8.2 B and B along a profile running in the x-direction at a height 1 Om above the X Z sphere/cube centre for multiple susceptibilities 8.3 B X and B Z differences (B„, ( M O T C A /- 80 B „ y ) along a profile running in the x-direction at a 0 a/ /c height 10m above the sphere/cube centre for multiple susceptibilities 8.4 X 2 B x and B, differences (BFKD - integral) 83 along a profile running in the x-direction at a height 2m above the cube centre for three grids of different size 8.6 B,, B " Z 1 / 3 and relative differences {B numerical-^ analytic)!^ analytic 85 A schematic diagram of the second test scenario: flux profiles above and through a susceptible spheroid 8.8 84 along a profile running in the z-direction away from the sphere/cube centre for a susceptibility of %= 1 8.7 81 B and B along a profile running in the x-direction at a height 2m above the cube centre for three grids of different size 8.5 79 86 A vector plot for a spheroid in an inclined inducing field. The plot is a vertical crosssection and the spheroid susceptibility is x = the analytic solution. . . 0.01. The fluxes were calculated using '. xi 87 8.9 A vector plot for a spheroid in an inclined inducingfield.The plot is a horizontal crosssection and the spheroid susceptibility is %= 0.01. Thefluxeswere calculated using the analytic solution 8.10 88 A vector plot for a spheroid in an inclined inducingfield.The plot is a vertical crosssection and the spheroid susceptibility is #=100. Thefluxeswere calculated using the analytic solution 8.11 89 A vector plot for a spheroid in an inclined inducingfield.The plot is a horizontal crosssection and the spheroid susceptibility is #=100. Thefluxeswere calculated using the analytic solution 8.12 90 B and B along a profile running in the x-direction at a height of 12m above the spheroid x z centre for multiple susceptibilities 8.13 B and B differences x z (B i mimerica 91 - B fytic) ana along a profile running in the x-direction at a height of 12m above the spheroid centre for multiple susceptibilities 8.14 Flux profiles running in the x-direction and z-direction through the centre of a spheroid with susceptibility #=0.01 8.15 94 Flux profiles running in the x-direction and z-direction through the centre of a spheroid with susceptibility #=100 8.16 92 94 A vector plot for a discretized spheroid in an inclined inducingfield.The plot is a vertical cross-section and the spheroid susceptibility is #= 0.01. Thefluxeswere calculated using the flux formulation 8.17 96 A vector plot for a discretized spheroid in an inclined inducingfield.The plot is a horizontal cross-section and the spheroid susceptibility is #= 0.01. Thefluxeswere calculated using the flux formulation 8.18 97 A vector plot for a discretized spheroid in an inclined inducingfield.The plot is a vertical cross-section and the spheroid susceptibility is #=100. Thefluxeswere calculated using the flux formulation 8.19 98 A vector plot for a discretized spheroid in an inclined inducingfield.The plot is a horizontal cross-section and the spheroid susceptibility is #=100. Thefluxeswere calculated using the flux formulation 99 xii 8.20 Specification of normal and tangential fluxes on and just below the grid boundary for the flux formulation 8.21 100 A symmetric grid containing 51 cells of varying dimensions with a central susceptible 3 region containing 5 cells 101 3 8.22 A schematic diagram of profile positioning for the test of the consistency of the boundary conditions 8.23 102 Single component flux and flux difference (B a/ B^o/e) profiles for a central _ m(mOTC cubic body of susceptibility #= 0.01 103 9.1 The Conjugate Gradient Least Squares method 115 9.2 Typical CGLS convergence curves for the relative residual and the stopping criterion plotted on linear and logarithmic vertical axes 116 9.3 A typical Tikhonov curve. 121 10.1 The finite difference solution for J using a secondary flux formulation with positivity imposed and with boundary conditions calculated using the congruous sphere method 10.2 149 The analytic solution for J using a secondary flux formulation with positivity imposed and with boundary conditions calculated using the congruous sphere method 10.3 159 The difference between thefinitedifference and analytic solutions for J using a secondary flux formulation with positivity imposed and with boundary conditions calculated using the congruous sphere method 150 11.1 The grid, target body and location of data used in the synthetic inversion test 152 11.2 Synthetic observed data map and profile for thefirstinversion test 153 11.3 Predicted data map and profile for the model recovered in the synthetic inversion test 11.4 154 Maps and profiles for various data quantities associated with the synthetic inversion test 155 11.5 A vertical cross-section through the model recovered in the synthetic inversion test.. 156 11.6 A horizontal cross-section through the model recovered in the synthetic inversion test 157 xiii 11.7 Values of the total objective function, gradient norm, misfit, model objective function, and the slope in the search direction at each model iteration for the first synthetic inversion test 11.8 158 Plots of the relationships between the misfit, model objective function and (3 for the first synthetic inversion test 11.9 159 The buried 105mm projective viewed in a vertical cross-section along a 45°E declination 161 11.10 Map and profile of the observed survey data collected above the planted UXO 162 11.11 Analytic data map and profile for a prolate spheroid with minor axis 0.105m, an eccentricity of 4.07 and a susceptibility of 1000 163 11.12 A map of the difference between the observed UXO data and the analytic data for the approximate prolate spheroid 164 11.13 Predicted data map and profile for the model recovered in the UXO inversion test. . . 165 11.14 A map of (d - 4r ) for the UXO inversion test and a comparison of the profiles obs ed for these two data sets 166 11.15 A vertical cross-section through the model recovered in the UXO inversion test 167 11.16 A horizontal cross-section through the model recovered in the UXO inversion test. ..168 B.l A single discrete grid cell for the field formulation B. 2 Position and numbering of discrete variables in the field formulation for a one 181 dimensional discretization 182 C. 1 The square-root function 193 C.2 A horizontal cross-section at z = 0m through the model recovered from inversion of the synthetic test data for an initial model containing a value of 1.0 throughout the active region C.3 195 A horizontal cross-section at z = 8m through the model recovered from inversion of the synthetic test data for an initial model containing a value of 1.0 throughout the active region 196 C.4 Predicted data map and profile for the model shown in Figures C.2 and C.3 197 C.5 Forward modelled data map and profile for the central body shown in Figure C.2. 198 xiv The difference between the data in Figures C.4 and C.5 A typical hysteresis curve for a ferromagnetic substance xv Acknowledgements Many thanks to my supervisor Dr. Doug Oldenburg for his input, guidance, encouragement and endless supply of enthusiasm. Thanks to everyone at UBC-GIF at the time but most importantly Jiuping Chen for his help and the irreplaceable Roman Shekhtman. This research was funded by an NSERC post-graduate scholarship, the Thomas and Marguerite MacKay Memorial Scholarship and the Egil H Lorntzsen Scholarship. xvi Chapter 1 The state of the art and the problem at hand 1.1 Introduction 1.1.1 Magnetic induction and important magnetic quantities In the absence of an external magnetic field, individual magnetic domains within ferromagnetic material, with which this thesis is concerned, will be oriented in random directions and the net magnetic field will be zero. When placed in an external magnetic field, Ho, such as the Earth's geomagneticfield,the magnetic domains will rotate toward the direction of the external field. Their orientation is no longer random and the material is said to be magnetized. The result is an induced net secondaryfield,H . This secondaryfieldis distinct from, but caused by, the (primary) inducingfield.The situation is pictured in Figure 1.1 below. 5 Figure 1.1 A schematic representation of magnetic induction of a magnetic body showing the primary inducing field, Ho, the induced magnetization, M, and the associated secondary field, H , . 1 Magnetization is a vector quantity defined as magnetic dipole moment per unit volume. The magnetization, M , at any location within a magnetic body is related to the magnetic field intensity (or "field"), H , at that location through a quantity called the magnetic susceptibility, %. (1-1) M = #H The more susceptible a material, the more highly magnetized it becomes when placed in an inducing field. Below, Figure 1.2 shows typical susceptibility values for some common minerals and rock types. A p p r o x i m a t e percent of magnetite by volume 0 . 1 % ,. 10 S.I. U n i t s 5 1 Q . 0.5% 1 % io- : 4 Magnetic minerals 5% 10-1 10% 20% \ Magnetite Hematite Igneous rocks Acid Volcanics ..i L. "S" type Basalts "T'type J J. Granites Gabbros Andesites Metamorphic rocks Metasediments Metamorphics Sedimentary rocks Sediments r io- io- 4 ! —r 10-3 IO" 2 S.I. U n i t s 1 Adapted from Clark and Emerson (1991) Figure 1.2 Typical magnetic susceptibilities for some.common minerals and rock types. Another important magnetic quantity is the magnetic flux density (or "flux"), B , which is related to the field, H , through the magnetic permeability, fi: (1-2) B = //H Permeability is related to susceptibility as 2 (1-3) // = #>(! +X) where /Jo is the permeability of free space, which has the value JUQ = 4KX10- (1-4) 7 in the SI system (in which this thesis will be working). H and M have units of amperes per meter (A/m) and # is dimensionless. B has units of tesla (T), or webers per square meter (IT = LWb/m ), where a weber is defined as one volt-second, //has units of henries per meter (H/m), where a henry is defined as one weber per ampere. 1.1.2 Magnetic survey methods Geophysical magnetic survey measurements include the geomagnetic field superimposed on any anomalous secondary fields arising from subsurface susceptible material (refer to Figure 1.1). A single component or several components of the vectorfieldquantities can be measured. Most commonly, the measurements are strength (magnitude) measurements of the total flux, B i: tota ||B || = IIBo + BJI (1-5) Mte/ Here, Bo is the geomagnetic flux, corresponding to the Earth's mainfield,and B^ is the anomalous flux produced by susceptible material in the subsurface. The objective of a geophysical magnetic survey is to render a feasible subsurface distribution of susceptibility that may have given rise to the anomalous fields in the survey data. This is an inverse problem. Inversion of magnetic survey data can provide constraints upon the subsurface susceptibility distribution. This information can then be interpreted to aid understanding of the subsurface geology. Geologically, magnetic susceptibility is related to the proportion of magnetic minerals in the rocks. These include iron oxides such as magnetite and hematite, iron sulphides such as pyrrhotite, and metallic iron, nickel and cobalt. These minerals often accompany economic mineral deposits and as such, geophysical magnetic survey data can be used in mineral exploration to provide information about potential drilling targets. The percentage of magnetic minerals in rocks can also be used to determine rock type (refer to Figure 1.2) and, as such, magnetic surveys aid geological mapping. Magnetic surveys are also used to aid in the detection and discrimination of buried unexploded military ordnance (UXO). In such an application, the subsurface magnetic susceptibility involves highly susceptible steel bodies (i.e. the UXO and metallic scrap such as exploded shrapnel) along with much lower susceptibilities within the surrounding soil. 3 1.1.3 The effect of high susceptibility on induced magnetization Recall that the magnetization of a body is related to the field at its location as M = #H = #(Ho + H,) (1-6) The total field at its location is a superposition of the geomagnetic field, Ho, plus the anomalous field, H , from any magnetized material in the surrounding area. If this secondary field is negligible then the body's magnetization is parallel to the inducing field with magnitude linearly dependent on the susceptibility of the body and the strength of the inducing field. However, if the secondary field is significant then the magnetization is affected in both magnitude and direction. In other words, separate parcels of magnetic material affect each other through their induced secondary fields and, hence, the magnetization at any location is dependent on any magnetized material outside that location. This effect increases with increasing susceptibility. The simple example shown in Figure 1.3 below will be used to demonstrate this concept. s Figure 1.3 Individual magnetic fields for two magnetic dipoles with equal moments. Consider two small magnetic bodies lying on a horizontal plane (across and into this page), such as those in Figure 1.3. The bodies have identical susceptibilities and are placed in an external inducingfieldthat is oriented vertically upward (toward the top of the page). The bodies will obtain identical magnetizations parallel to the inducingfield.Assume the magnetized bodies are small enough that they can be thought of as dipoles. The associated induced secondaryfieldsare then as depicted in Figure 1.3. The secondaryfieldof body 1 at the location of body 2 will be vertically downward and vice versa. Hence, the secondary fields oppose the primary inducingfieldat the dipole locations and the effect of the bodies on one another is to reduce their magnetizations by a certain amount. The 4 amount of reduction will depend on the relative magnitudes of the primary and secondary fields at the dipole locations, the latter depending on the susceptibility and the distance of separation. Such effects will be referred to as "high susceptibility effects" or "demagnetization effects". In Figure 1.3, the dipoles are represented as small susceptible spheres but could also be thought of as small horizontal loops of current. In this case, the concept of demagnetization is identical to the electrical concept of mutual inductance. The term "self-demagnetization" will sometimes be used when dealing with a single, macroscopic, isolated magnetic body. In this case, the individual microscopic magnetic elements in the body affect each other as discussed above. In Figure 1.3, the effect of demagnetization is one of reduction in the magnetization of the bodies involved. However, this is not always the case. For example, consider a situation where the two bodies in Figure 1.3 are magnetized in a horizontal direction (across the page). The effect would then be additive. With a greater number of susceptible bodies in more complicated geometries, the demagnetization effects become more difficult to predict. In the final analyses, the magnetic response of highly susceptible bodies is strongly dependent on their shape and orientation with respect to the inducing field. 1.1.4 Motivation for research Magnetic data collected over bodies of high susceptibility contain significant demagnetization effects, which can greatly complicate interpretation. Examples where demagnetization is important include surveys for detection and discrimination of UXO and mineral exploration surveys over highly mineralized banded iron formations and nickel deposits. Current modelling methods that account for self-demagnetization effects are limited to simple bodies, such as ellipsoids, where the geometry of the body is represented by a few parameters. Standard forward modelling methods for general susceptibility distributions (i.e. methods that can deal with complicated bodies) neglect the effects of demagnetization and can produce inaccurate results and subsequent deterioration in performance of the inverse solution. This is discussed within the remainder of this chapter. The purpose of this research is to improve upon current magnetic modelling techniques to develop forward modelling and inversion methods that can include the effects of demagnetization for distributions of arbitrary geometric complexity. 5 1.2 A standard approximate solution Before introducing more elaborate mathematical relationships, the reader is referred to Appendix A, which contains a discussion of some nomenclature conventions used in the mathematics of this thesis. All units used in this thesis are SI and the coordinate system used is Cartesian. 1.2.1 Discretization Written in integral equation form, the field at a point P due to a distribution of magnetic material within a region R is given by H(P) = H (i>) + H,(/>) 0 (1-7) where Ho is the primary inducing field (e.g. the Earth's geomagnetic field); Hs is the anomalous secondary field; M is the induced magnetization of the magnetic material in R; P is the position of the observer; Q represents the positions of the volume elements dv within R; and r(P,Q) is the distance from P to Q. To evaluate (1-7) numerically for a complicated susceptibility distribution, the three-dimensional Earth region of interest, R, is divided into a large number, nc, of discrete rectangular prismatic cells R within an orthogonal grid system. Each cell R„ has constant susceptibility %u and magnetization M". This discretization into many cells, as pictured in Figure 1.4, allows complicated bodies to be constructed. u 6 Figure 1.4 An orthogonal grid of nc discrete cells, each with constant susceptibility, 1.2.2 The standard linear approximation In specifying M", it is standard to make use of the Born linear approximation that assumes secondary fields, H , are small compared to the primary field, Ho. The assumptions of magnetic isotropy, magnetic linearity and absence of remanent magnetization are made and the induced magnetization of each cell can be written s M"=^H"=^(H;+H;)=^H; (i-8) Under this approximation, the magnetization at any location within the Earth model is parallel to Ho and is linearly proportional to the susceptibility at that location. With (1-8), the secondary field from (1 -7) becomes H < ( P ) = 74TT " J > ( 0 ) o ( f i ) • -7F7^ r(P,Q) H VV dv Jr Generally, one only works with the secondary flux in a single arbitrary direction £, which is given by 7 " (1 9) B = £-B = T (1-10) l-(]M ) S S It is assumed that the primaryfieldis constant over the volume of interest. In addition, the data are usually acquired in free space (i.e. air) so that B I: (1-11) = JUQ£-H S With the discretization, the linear forward problem can then be written in matrix-vector form as d = Gm (1-12) d is a data vector of N unknown secondary flux values measured at specific locations rf. e.g. d = [B (r\), B {r\), B (r\), B (r2), .. .] . m is a model vector of the specified susceptibilities in all nc model cells. G is the linear forward modeling operator, which contains tensor components determined from the physics and geometry of the problem: the constant element G,, describes the T x y z x th th contribution to the / datum of a unit susceptibility in the j cell. The forward problem simply involves multiplication of the TV by nc full matrix G with the model vector m. (1-12) leads to a simple, linear inverse problem valid only when the approximation (1-8) holds (i.e. when H « Ho). s 1.2.3 The linear inverse problem In the magnetic inverse problem, one attempts to determine an appropriate distribution of subsurface susceptibility that could have given rise to magnetic survey data. The principal difficulty in finding a solution to the magnetic inverse problem is its inherent non-uniqueness. There are infinitely many models that could adequately reproduce the data. This stems from nonuniqueness inherent in the magnetic problem and the fact that the data are finite in number and inaccurate. The model region of interest is discretized as discussed in Subsection 1.2.1. The number of model cells, nc, is set much larger than the number of data, N. The number of unknowns Xi is then larger than the number of equations in (1-12) and the inverse problem is underdetermined. Therefore, the system (1-12) has infinitely many solutions for the model, m. Although there are infinitely many models that can adequately reproduce the data, many of these will be unreasonable for a specific application. To reduce the number of acceptable models, the requirements for a feasible model are considered. Thefirstis that the data predicted by the recovered model shouldfitthe observed survey data to within a degree justified by their estimated uncertainties. The second requirement is that the recovered model should be compatible with any a priori knowledge of the subsurface geology or physical property distribution. This often requires that the model be smooth in all spatial directions. 8 As discussed in Li and Oldenburg (1996), the inverse problem can be formulated as an unconstrained optimization in which an objective function is minimized to obtain a feasible susceptibility distribution model. The objective function is designed so that the data are fit to an acceptable degree and the recovered model has desired spatial characteristics. It has the form 0(m)=0 (m) + P0 (m) d (1-13) m where 0 is the total objective function to be minimized; m is the model of subsurface susceptibility; 0d is the data misfit, which measures how well the model fits our data; 0 is the model objective function, which measures how closely the model conforms to the desired spatial characteristics; and P is a tradeoff parameter that controls the relative size of the data misfit and model objective function. m The inverse problem is solved by finding the model that minimizes (1-13). Because the survey data contains some level of uncertainty, a decision must be made on how well the data predicted by the recovered model should fit the observed survey data. Solving the inverse problem requires optimization of (1-13) for several values of ji until a model is found that yields the desired level of misfit. In addition, a third requirement can be incorporated into the inversion. This thesis deals with ferromagnetic material and this requires the susceptibility to be positive. A third term can be added to (1-13) to enforce positivity of the model parameters. This introduces nonlinearity into the problem and is further discussed in Li and Oldenburg (in press). 1.3 Insufficiency of the linear approach for high susceptibility The linear approach works well for many circumstances. However, a major insufficiency of the linear approach is its inability to account for demagnetization effects. This can have serious consequences when dealing with media of high susceptibility. RecallfromSection 1.2 that in specifying the magnetization of each cell, the linear forward problem made use of the Born approximation. This led to a simple, linear problem valid only for low susceptibilities where the resulting secondaryfieldsare small compared to the primary inducingfield,Ho. Under this assumption, susceptible material in one model cell does not significantly affect, nor is significantly affected by, the material in any other cell. Hence, demagnetization effects must be negligible for the linear method to be valid. The linear method assumes that the magnetization is parallel to the inducingfieldthroughout the entire model region. In contrast, when high susceptibilities are involved, the magnetization direction can rotate awayfromthe inducingfieldand become highly dependent on the shape of the body. 9 The inability of the linear method to recover acceptable modelsfromdata containing demagnetization effects can be demonstrated using synthetic data. The linear code used is the UBC-GIF code MAG3D, for which the major theoreticalframeworkis provided in Li and Oldenburg (1996). Here, synthetic data were computed 20m above a horizontal, prolate, spheroidal magnetic body with a minor semi-axis of 3m and eccentricity 5. The spheroid long axis was declined 45°E. The situation is pictured below in Figure 1.5. The data were computed analytically as discussed in Kaufman (1992). The susceptibility values considered were 10" and 10 and the inducingfieldwas chosen to maximize the high susceptibility effects. The inducing field had a strength of 5x 10 nT, a declination of 45°E and was inclined 70° to the horizontal. 5 2 4 30m Figure 1.5 A vertical cross-section through a horizontal magnetic spheroid in an inclined inducing field. Self-demagnetization effects are negligible for the low susceptibility and the spheroid magnetizes in the direction of the inclined inducingfield,regardless of the orientation of the spheroid with respect to the inducingfield.This creates data similar to that of a dipole with a moment inclined 70°. Below, compare the data for the spheroid in Figure 1.6 with that for a dipole in Figure 1.7 with moment inclined 70°. 10 Observed Magnetic Data 1089 data, I = 70.0, D = 45.0 0,003962 -0.000388 Figure 1.6 Total flux magnitude data map and profile for a horizontal, prolate, magnetic spheroid of susceptibility #=10" in an inducing field inclined 70° to the horizontal. 5 11 Observed Magnetic Data 1089 data, I = 70.0, D = 45.0 _ 0.005102 |. 0.004171 . 0.003239 L 0.002308 : . 0.001376 L 0.0004445 -0,000487 0.005102 -0.000487 Figure 1.7 Total flux magnitude data map and profile for a magnetic dipole in an inducing field inclined 70° to the horizontal. Below, Figures 1.8 and 1.9 show vertical and horizontal cross-sections through a typical recovered model from M A G 3 D inversion of the low susceptibility data set in Figure 1.6. The grid used had cells 3.125m square in the central region with 5m dimensions in the outer padding cells. M A G 3 D was able to adequately fit the data and successfully recover an appropriate model regardless of the orientation of the spheroid with respect to the inducing field. 12 Figure 1.8 A vertical cross-section through a typical recovered modelfromMAG3D inversion of the data set in Figure 1.6. The cross-section runs SW to NE (i.e. along the spheroid long axis) and looks toward the NW. Depth = -15.625 -70 -35 0 35 70 South Figure 1.9 A horizontal cross-section through a typical recovered model from MAG3D inversion of the data set in Figure 1.6. I now move on to the high susceptibility example in which self-demagnetization effects are expected. A discussion of self-demagnetization effects in spheroidal bodies is given in Clark and Emerson (1999). The two basic characteristics of these effects are that magnetizations rotate awayfromthe external inducing field directions and the amplitude of the magnetic response 13 scales nonlinearly with susceptibility. These effects can be observed through comparison of Figure 1.6 and Figure 1.10 below. Observed Magnetic Data 1089 data, I = 70.0, D = 45.0 50 R 1 1322 -709.1 Figure 1.10 Total flux magnitude data map and profile for a horizontal, prolate, magnetic spheroid of susceptibility #=10 in an inducingfieldinclined 70° to the horizontal. 2 Self-demagnetization effects cause the amplitude of the magnetic responsefromthe spheroid to change nonlinearly with susceptibility. If the linear formulation was valid then the differences in amplitude between the %= 10" and #=10 data should have been exactly 10 . Figures 1.6 and 1.10 show that the difference is less than 10 . This decrease of the response amplitude below the linear expectation is due to a reduction in the magnetization strength. It is for this reason that such high susceptibility effects are referred to as self-demagnetization effects. 5 2 7 6 14 Self-demagnetization also causes the magnetization direction to rotate toward the spheroid long axis. This manifests itself in Figure 1.10 as a shift of the data peak away from a central location, toward the southwest. The rotation of the magnetization direction results in data similar to that of a dipole with a moment inclined 20° (this inclination value was arrived at though theoretical calculations). Compare the data for such a dipole, in Figure 1.11 below, with the data for the spheroid in Figure 1.10. Figure 1.10 shows a similar signature to that shown in Figure 1.10 where the true inclination of the appliedfieldis 70°. Observed Magnetic Data 1089 data, I = 20.0, D = 45.0 . 455.3 . 259 . 62.72 -133.6 -329.8 -526.1 . -722.4 -17 0 Easting (m) 17 nT 455.3 -722.4 Figure 1.11 Total flux magnitude data map and profile for a magnetic dipole in an inducingfieldinclined 20° to the horizontal. Below, Figures 1.12 and 1.13 show vertical and horizontal cross-sections through a typical recovered model from MAG3D inversion of the high susceptibility data set in Figure 1.10. MAG3D was not able to successfully recover an appropriate model when the long axis of the 15 spheroid was oriented away from the inducingfield.In order to fit the data adequately, MAG3D was forced to place the central body in an incorrect position and place most of the magnetic material in the outer padding cells. Figure 1.12 A vertical cross-section through a typical recovered model from MAG3D inversion of the data set in Figure 1.10. The cross-section runs SW to NE (i.e. along the spheroid long axis) and looks toward the NW. Depth = -15.625 South Figure 1.13 A horizontal cross-section through a typical recovered model from MAG3D inversion of the data set in Figure 1.10. 16 1.4 A mineral exploration example The need for a full solution to the magnetic inverse problem can be further demonstrated using a real-world examplefrommineral exploration. The Osborne mine, owned by Placer Dome, is a gold-copper mine located in Queensland, Australia. The mineralized zones are well known and contain highly magnetic ironstone formations with complicated structure. In Figure 1.14 below, ground-based magnetic survey data are plotted on the local grid coordinates, relative to which the geomagneticfieldwas oriented with inclination -53.3° and declination 47.5°E and had strength 52000nT. For comparison, total flux magnitude data for a dipole with moment oriented in such a direction is shown in Figure 1.15. 11 000 11 Jfii) 12 OOO 12 500 13 000 Figure 1.14 A plan view of ground-based, total flux magnitude survey data for the Osborne mine region. (Figure supplied by Placer Dome.) 17 50. Easting (m) Figure 1.15 Total flux magnitude data for a dipole with inclination -53.3° and declination 47.5°E. Comparison of the anomalies in Figure 1.14 with the dipole field in Figure 1.15 suggests complicated structure and rotation of the induced magnetization away from the direction of the Earth'sfield.Furthermore, the amplitude of the anomalous response is roughly 16000nT, or about 30 percent of the inducing flux value. With such high secondaryfieldsrelative to the primary, self-demagnetization is expected to be important. Below, Figures 1.16 and 1.17 show interpolated geological cross-sections for the region. These further indicate complicated subsurface distributions of highly magnetic material. 18 Figure 1.16 Interpolated geological cross-section 11670E (local grid coordinate) through the Osborne mine region. (Figure supplied by Placer Dome.) 19 Figure 1.17 Interpolated geological cross-section 21700N (local grid coordinate) through the Osborne mine region. (Figure supplied by Placer Dome.) Aeromagnetic survey data gathered over the mine were inverted using the linear code MAG3D. These inversions were only able to adequately fit the data if large negative susceptibilities were allowed. Furthermore, the recovered models containing these non-physical negative susceptibilities did not contain features consistent with the knowledge of the subsurface. These problems stemfromthe fact that the subsurface rocks are highly magnetic and the mineralized zones have complicated structure. Thus, the survey data contain significant demagnetization effects and use of MAG3D to invert this data is inappropriate. A method of inversion that successfully accounts for demagnetization effects is required. 1.5 Research goals and thesis outline Both synthetic and real-world examples have indicated the need for full forward modelling and inversion (accounting for demagnetization effects) of magnetic datafromregions of high magnetic susceptibility. The ability to forward model such data would provide a tool to help better understand how magnetic fields behave in highly magnetic media. The ultimate goal of this research was to invert geophysical magnetic data to recover three-dimensional distributions of subsurface magnetic susceptibility of any magnitude and geometric complexity. 20 The thesis is structured in the following manner. The full Maxwell's equations for the magnetostatic problem of interest are discussed in Chapter 2 and are solved using a finite volume discretization in Chapter 3. Chapter 4 discusses incorporation of boundary conditions and a secondary formulation is given in Chapter 5. Solution of the resulting matrix-vector systems and associated accuracy concerns are discussed in Chapter 6. Chapter 7 presents an alternate solution in the integral equation domain, which is used alongside less versatile solution techniques in Chapter 8 to thoroughly test the refined forward modelling code. The forward modelling code forms the foundation for a subsequent inversion algorithm, discussed in Chapters 9 and 10. Chapter 11 presents testing of the inversion algorithm. The merits of, drawbacks of, and possible amendments to the modelling and inversion methods are discussed in Chapter 12. 21 Chapter 2 The magnetostatic governing equations and boundary conditions 2.1 Introduction The need for full forward modelling of magnetic data was addressed in Chapter 1. In order to accomplish this, Maxwell's equations for magnetostatics were solved using finite volume and integral equation techniques. To begin, this chapter discusses the physical equations and boundary conditions that govern the magnetic problem of interest. The exact method of discretization is then discussed in Chapter 3. 2.2 Maxwell's Equations for source-free magnetostatics Maxwell's equations for stationary media can be written V-B = 0 (2-la) V x H - ^ =J at (2-lb) A V-D = # (2-lc) VxE + —= 0 (2-Id) where B is magnetic flux density (or "flux"), with units of tesla (T = V-s/m ); H is magnetic field intensity (or "field"), with units of amperes per meter (A/m); D is electricfluxdensity, or electric displacement, with units of coulombs per square meter (C/m ); E is electricfieldintensity, with units of volts per meter (V/m); Pfisfreecharge density, with units of coulombs per cubic meter (C/m ); J/-isfreecharge current density, with units of amperes per square meter (A/m ); and t is time, with units of seconds (s). 2 3 The situation of interest is one in which there exist nofreecharges orfreecharge currents (i.e. no source), no electricfieldor electric displacement and no time variance. Maxwell's equations then reduce to the two source-free magnetostatic governing equations VB = 0 VxH = 0 (2-2a) (2-2b) 22 (2-2a) and (2-2b) are valid in any magnetic medium, even if the medium is anisotropic and nonlinear (Lorrain, Corson and Lorrain, 1988). Here, any material is assumed to be linear, isotropic and contain no remanent magnetization. The constitutive relation B = //H (2-3) is then introduced. Recall from Section 1.1 that the magnetic permeability, //, has units of henries per meter (H/m), where a henry is defined as one volt-second per ampere (1H = 1 V-s/A). Permeability can be expressed in terms of the relative permeability, ju , or the magnetic susceptibility, %. r // = //o//,=//o(l+#) (2-4) (2-2b) allows H to be expressed as the gradient of a scalar potential <p: H = V<j> (2-5) (2-2a), (2-3) and (2-5) combine to give the common elliptic div-grad PDE V-//V0=O (2-6) The formulation above is similar to the DC resistivity problem, discussed in Chen (2002) and McGillivray (1992), in which J = oE E =-VT V-0V"K=V-J* (2-7a) (2-7b) (2-7c) Here, J is current density, rris conductivity, Fis voltage (a scalar potential) and is is a source. The only major difference between the two problems is the source term on the right of (2-7c). An analogy can also be made to the hydrological problem of fluid flow through porous media in which q = -KS/h V-KNh = V q, (2-8a) (2-8b) Here, q is specific discharge, K is hydraulic conductivity, h is hydraulic head and q^ is a source. Again, the only major difference between this problem and the source-free magnetostatic problem is the source term on the right of (2-8b). 23 2.3 Physical continuity principles Maxwell's equations for the problem of interest are to be solved numerically on some discrete grid in which the physical properties are constant within each grid cell but vary between cells. There are well-defined continuity principles for magnetic fields and fluxes incident on interfaces at which magnetic physical properties change. These continuity principles will lead to conditions on H and B at the interfaces between the discrete grid cells (referred to as "interface conditions") and on the grid boundary (referred to as "boundary conditions"). 2.3.1 The interface conditions In the following discussion, refer to Figure 2.1 below. Figure 2.1 H and B at an interface where // is discontinuous. The first interface condition is continuity of the tangential component of the field: H, xn = H xii (2-9) 2 where n is a unit normal to the boundary. Because of the constitutive relation (2-3), the tangential component of the flux is then discontinuous for fi^fAi. //, 'B, xn = ju B xii (2-10) l 2 2 The second interface condition is continuity of the normal component of the flux: B. • n = B, • ii (2-11) Using (2-3), the normal component of thefieldis then discontinuous for ii^jii. (2-12) / / , H , • n = JU H • ii 2 2 24 2.3.2 The boundary conditions The continuity conditions above must hold not only between grid cells but also on the grid boundary. As will be discussed in Chapter 3, the geomagnetic inducing field, Ho, is incorporated into the problem through these boundary conditions. One is faced with prescribing the fields on the grid boundary. The model grid can be designed so that any susceptible material is far from the grid boundary such that the associated secondary field is negligible on the boundary. The approximation can then be made that H=Ho on the boundary. However, if the secondary response is too large to neglect at the boundary then some approximate modelling routine is required. The approximate secondary response can then be added to the primaryfieldand prescribed on the boundary, thereby increasing the accuracy of the modelling. Approximation methods for the secondary response are discussed further in Chapter 4. 25 Chapter 3 Discretization of the magnetostatic governing equations 3.1 Introduction Having discussed the governing equations and boundary conditions for the source-free magnetostatic problem in Chapter 2, a numerical solution to these equations is now required. The solution is achieved through a finite volume discretization and the resulting forward modeling method forms the foundation for a subsequent inversion algorithm. 3.2 The system The problem of interest is governed by two of Maxwell's equations and one continuity equation: V-B = 0 B = //V> B, • n = B • n (3-la) (3-lb) (3-lc) 2 The advantage of keeping B in the system instead of discretizing (2-6) directly is that methods can be designed to allow for higher order interpolation functions for B, the quantity of interest. The model region of interest will contain discrete cells, each with a constant permeability, contained within some regular grid system. The continuity equation, (3-lc), must hold between discrete cells. This has consequence on the positioning of the discrete variables (discussed shortly in Section 3.3). (3-lc) must also hold on the boundary of the model grid. Here, assume that the normal flux is known just outside the grid boundary, and is equal to a constant, Bo. The flux just inside the boundary is then prescribed as Bo. Extension of the methods to deal with any prescribed boundary flux, constant or not, will be obvious. 3.3 The discrete grid and discrete variables 3.3.1 Nomenclature for the discrete variables The cell permeabilities and discrete potentials will be denoted by ju and 0, respectively, where the subscript n indicates the cell number. The flux quantities within the discrete grid will be expressed in one of two ways: n 26 1) Here, the subscripts i,j and k are used to indicate the position (x„ y>j, z*) of the quantity. The optional superscript v is used as a reminder of the component of flux for that (scalar) quantity (i.e. ve {X, y, z}). B ijj 2) B V Here, the mandatory superscript v specifies the component of flux as above. The subscript £ numbers that component of flux within the grid using the numbering scheme to be discussed shortly. v ( 3.3.2 The discrete grid The orthogonal grid system pictured in Figure 3.1 was used for the discretization. Figure 3.1 A single discrete grid cell for the flux formulation. The coordinate system used has +x in a northerly direction, +y easterly, and +z vertically downward. The permeability, //, is constant within each cell (i.e. piecewise constant) and the potentials, <p , are placed at cell centres. Thefluxvalues, B], are placed at the centres of the cell n face interfaces: there is one B* or B j' or B\ component per face depending on the face location. This grid system leads to a cell-centred discretization scheme. The location of thefluxesresults from (3-lc) requiring continuity of normal B across cell interfaces. The positioning of the potential results from (3-lb): use of central differences to calculate results in both and B being defined at the same points in space, as required by (3-lb). 3.3.3 Grid coordinates and lengths The three-dimensional volume to be modelled is divided into nc = nx-ny-nz rectangular cells, which are denoted as being in nx rows, ny columns and nz layers. In the following discussion it 27 is useful to consider a one-dimensional problem and refer to the discretized line in Figure 3.2 below. XJ X X ^3/ H2 2 X 3 nx-l Xjyr x 4 3 Hnx-2 a <-Ax,-> Xjw+i rSnx-I < > Figure 3.2 Position and numbering of discrete variables in the flux formulation for a one-dimensional discretization. The node coordinates in each direction are xi :xi,X2, . . . ,x \ (3-2a) yj -yuyi, • •• ,y >+\ Z/c'Z\,Z2, ... , Z +\ (3-2b) (3-2c) nx+ n nz The cell lengths in each direction are denoted hxj: hx\, hx , • •• , hx hyj: hyu hy , ... , hy hz :hz\, hz , ... , hz 2 k ; ; ; nx 2 ny 2 nz hxi = x,+\ — x,hyj = y \ - yj hz = z +\ -z (3-3a) (3-3b) (3-3c) j+ k k k The cell centre coordinates are denoted •X/+1/2 '-X\ + \I2, X2+M2, (3-4a) (3"4b) (3-4c) • • • , XJU+1/2 J;+l/2 • >M + l/2, ^2+1/2, • • • , >V+l/2 The distances from one cell centre to the next are denoted Axi: Ax\, Ax2, ... , Ax, -\ Ayj: Ayi, Ay , ..., Ay -\ Az :Az\, Az , ... , Az .\ w 2 k 2 ny nz ; ; ; Ax,- = X/+3/2 -*/+i/2 = {hx + hx \)/2 Ay; = y y - y ^ m = {hyj + hy )/2 Az* = z -z \ = (/zz* + hz )/2 t j+ 2 k+i/2 28 i+ J+] k+ /2 k+l (3-5a) (3-5b) (3-5c) 3.3.4 Variable numbering The cells are numbered so that the row number changes fastest, then the column number and the layer number changes the slowest (refer to the Figure 3.3 below). '3/6/9/12, 8/11, -y 7 1 4 7 10 13 16 19 22 0 / 22 Figure 3.3 Grid cell numbering for nx=3, ny=4, nz=2. The potential <p is defined at the centre of the n cell. Similarly, //„ is the constant value of magnetic permeability within the n cell. There are nc unknown fa quantities, one at each cell centre. th n th The flux values, B\, are assigned to the cell faces. There are (nx+\)-ny-nz B. values on the xfaces (i.e. faces with normal vectors in the +x or —x direction), nx-{ny+\)-nz B\ values on the j-faces and nx-ny-{nz+\) B values on the z-faces. This gives z k (nx+l)-ny-nz + nx-(ny+\)-nz + nx-ny-(nz+\) flux values in total. However, the flux on the boundary is prescribed as the geomagnetic flux density Bo, or as some other suitable value (to be discussed in Chapter 4), and the total number of unknown B\ values is equal to the number of internal face interfaces, nf — (nx—\)-ny-nz + nx-{ny—\)-nz + nx-ny-{nz—\) = nfx + nfy + nfz The nf unknown flux values are divided into three sets: ' ' B]: Bl Bl- B\,B\,.. ,B \. "' 2 ' (3-6a) (nx—\)-ny-nz D.V nx\ny-\)nz (3-6b) B z ' 5 (3-6c) nx-ny-{nz-\) 29 B is on the +x face of the I cell (i.e. the face with outward normal in the +x-direction). B is x s X 2 on the +x face of the 2 nd cell and B* _ is on the +x face of the (nx-l) cell (refer to Figure 3.2). th a x Because B on the +x face of the nx cell is prescribed, B th x x is on the +x face of the (nx+l) cell th nx (i.e. the first cell in the second column of thefirstlayer) and so on. The last B is B? x n which is on the +x face of the («c-l) cell. Bj and B\ are numbered similarly: B[ is on the +y th face of the 1 cell, B is on the +y face of the 2 st nd 2 cell, and so on. 3.4 Finite volume discretization A finite volume discretization (FVD) of a partial differential equation (PDE) is a discretization of the integral weak form of the PDE (i.e. the integral of the PDE over volume). This weak form implies weaker continuity conditions on the variables involved. The numerical solution of the div-grad PDE (2-6) requires that the potential is twice differentiable. Use of the weak form relaxes the level of differentiability required of the discrete variables. The equations to discretize are J V-Bdv = 0 \vB dv = ly (3-7a) (3-7b) K dv or SvjU' Bdv = ! V0dv l v The volume of integration depends on the discretization and the quantity being integrated. Note that (3-7b) indicates two options for integrating the gradient equation (3-la). Section 3.6 discusses which option is used and why. 3.5 Discretizing the divergence equation To approximate (3-7a) it is integrated over each cell and Gauss's Theorem is applied to obtain \ V-Bdv = \ Bnds = 0 V (3-8) s B is assumed constant over each cell face. An equivalent assumption is that the flux defined at any face centre equals the average flux over that face. Withfluxdefined as positive-out, the integration over a cell with centre (x,+i/2, » + i / 2 , Zk+m) is JsB-n ds s (B +\j+\/2,k+V2 - B*ij+\n,k+\i2)hyjhzk x i + (B \i2j+\,k+\l2 - B 'i+\/2j,k+\/2)hZkhXj y (3-9) y i+ + (B i+\i2j+\n,k+\ - B i+\i2j+\i2,k)hxihyj 2 2 30 =0 where the subscripted indices on the fluxes indicate their x, y and z locations. Dividing by the volume of the cell yields (B* 1 j+1 /2,k+112 - B ij+1 Q,k+112/1hxj + (B i+1 /2j+1 M 112 - B 1 i j,k+1 a)/hyj X i+ y (3-10) y i+ 2 + (B +\/2j+\/2,k+\ ~ B i+\/2j+l/2,kyhZk = 0 Z i Z There are «c such equations, one for each grid cell, with /=1...nx, j=\...ny, and k = l ...nz. The boundary conditions are used to close the discretization. That is, the fluxes on the cell faces that constitute the grid boundary are prescribed quantities. In order to simplify the following discussion, consider the case where the prescribed boundary fluxes are equal to a constant inducing flux Bo = (BQ , Bo , BQ ). A cell on a grid boundary face would produce an equation of the form x Z y (B*i+\j+\/2,k+V2 ~ Bo )/hXi x + {B i+\i2j+\,k+\i2 - B i+mj,k+\i2)lhyj y (3-11 a) y + (B +\/2j+l/2,k+i ~ B i+\i2j+\l2,k)lhZk Z i Z = 0 or B j+\j-ir\l2,k+\l2lhXi X + (B i/2j+\,k+v2 - B \i2j,k+\i2)lhyj y (3-1 lb) y i+ i+ + (B i+\/2j+\/2,k+l B?i+\l j+\l2,k)lhzk = BoJhXi 2 2 Similarly, a cell on a grid boundary edge would produce an equation of the form B i+\j+\/2,k+V2/hx X i + B i+\/ j+\t+\/2/hyj + (B i+\/2j+\/2,k+\ - B i+\i2j+\i2,k)lhzk = BoJhxj + Boy/hyj and a cell on a grid boundary corner would produce an equation of the form y 2 (3-12) 2 Z \j+m,k+m/hxj + + (3-13) B /2j+\,k+\/2/hyj y i+] B /2j \/2,k+\/hz z i+i + k = BoJhxj + Boy/hyj + B /hz 0z k When all such equations (i.e. one for each grid cell) are combined, the matrix-vector equation obtained is DB = q (3-14) 31 Here, the divergence matrix D is nc by nf, the vector B is length nf and holds the unknown flux values; and the vector q is length nc and contains nc-(nx-2)-(ny-2)-(nz-2) non-zero elements arisingfromthe prescribed boundary fluxes. This number corresponds to the number of cells adjacent to the grid boundary. Note I also use 'q' to denote specific discharge in the analogous hydrological problem of fluid flow through porous media. I will ensure that the usage of this variable is clear in any future occurrences. The matrix-vector equation (3-14) can be split into parts: D B =q [D D D ] B x y 2 y (3-15a) =q (3-15b) D B +D B +D B =q x x y y z (3-15c) z The matrices D , D and D are as follows. x y z (3-16a) hx 1 x (3-16b) -i - hx,nx—1 hx, -l /J.V-1 - hx, nx The 0 and -1 diagonals of D are filled. D is nx by (nx-\) and is diagonally tiled ny-nz times to create D . Hence, D is nc by nfx. th st x x x x (3-17a) 32 0 Ay," 1 0 -Ay" hy\ 1 D. (3-17b) -hy'„ hy',, The 0 and -nx diagonals of D are filled. D is nx-ny by nx-(ny-l) and is diagonally tiled nz th th v v times to create D . Hence, D is nc by nfy. y y hz o hz: • hz. D, = (3-18) hz hz~ HZ-l - hz„ The 0 and -nx-ny diagonals of D are filled. D is nc by nfz. th z z 3.6 Discretizing the gradient equation 3.6.1 Arithmetic and harmonic averaging As suggested in (3-7b), the gradient equation can be integrated as either B = //V> (3-19a) or (3-19b) 33 Integration of (3-19a) will yield arithmetic averaging for ju on the cell faces whereas integration of (3-19b) will yield harmonic averaging. To demonstrate this, consider the one-dimensional problem in Figure 3.2. The lengths of integration lie between cell centres. The weak form of the equations in (3-19) for the integration centred at x,- become jB,._,fl&= j ju,_ ^-dx + | ] Mi^dx (3-20a) '.-1/2 J (3-20b) J dx M The integral in (3-20a) has been split up to handle the discontinuity at x,-. B is assumed constant over each integration volume and the gradient of the potential is approximated as d0/dx = A0/Ax. (3-20a) and (3-20b) then become (3-2 la) B /zx hX; ; 2//,._, (3-2 lb) 2/i,. In (3-2la), the assumption is made that (3-22) 2Ax,._, (i.e. a linear interpolation between the two discrete potentials). Dividing by the integration length then yields B (hx^fi^+hx^,.)^,-^) 2Ax,._, hx, , 1 B,._, = 2Ax._, — — (3-23a) Axi-i + hx, ^ (ft-ft-.) Ax (3-23b) — These equations can be expressed as B,_, = /V (ft-*,-,) Ax,.i-i (3-24) 34 where /4#is the effective (i.e. arithmetically or harmonically averaged) permeability across the interface: •uh_ {hx,_ H,_ +hx p, ) 2Axi-l x x i (3-25a) i : hx. , ,, harm r\ \ H eff — — = 2Ax _ M.- hx. + — (3-25b) - i A choice must be made on which form of the gradient equation, (3-19a) or (3-19b), is to be used. Consider the one-dimensional problem of calculating an appropriate midpoint average permeability ju at an interface between two cells with permeability values jU\ and ju (refer to Figure 3.4 below). m 2 K Ax =H A —> 2 r hx >K hx sH Figure 3.4 Homogenization of jl across an interface for a one-dimensional problem. B~ and B* are the fluxes K x 2 immediately to the left and right of the interface respectively. The relative values of the averages for this situation, with ji\=\ and f± =\.. .5, are shown below in Figure 3.5. 2 35 V-2 Ol =1 ) Figure 3.5 A comparison of arithmetic and harmonic averaging. When the permeability does not change significantly across a cell interface, the two effective permeability values in (3-25) are not significantly different. However, the values are significantly different at larger permeability discontinuities. 3.6.2 Arguments for harmonic averaging There are numerous arguments that suggest that an effective permeability is the harmonic average rather than the arithmetic. These arguments come from several different areas. I first consider some physical arguments. As discussed in Section 2.2, the situation in Figure 3.4 is identical to electrical current flow or fluid flow through a ID medium. Here, consider the electrical analogy of current flow across two resistors in series with a voltage difference AV across them. The situation is depicted in Figure 3.6 below. 36 — ® — vV—>—AA K N H g h -L L x >K » L 2 M Figure 3.6 Two analogous electrical circuits. V = voltage (7= conductance R = resistance A = surface area I = current J = current density The upper circuit in Figure 3.6 is the schematic equivalent of the lower circuit. First, consider the upper circuit. When two resistors are combined in series the effective resistance is given by a sum of the two resistance values (Cheng, 1992): (3-26) Reff-Ri + R2 For the lower circuit, the substitution R = L/(Ad) is made in (3-26) to give L Aa Aa. eff (3-27) Ac, Rearranging yields '' L ° eff = L cr, L ^ (3-28) cr j 2 37 Compare (3-28) to (3-25b). The effective conductivity is a harmonic average of the two conductivity values. Ohm's law for the upper circuit gives AV = III/? (3-29) eff For the lower circuit, the substitutions I = JA and R f^L/(Acr ff) rearranging yields an expression for the current density: eJ e are made in (3-29) and AV_ 'eff (3-30) L Compare (3-30) to (3-24). Now, consider one-dimensional fluid flow through two porous media with hydraulic conductivities and K . This is pictured below in Figure 3.7. 2 Figure 3.7 Fluid flow through two connected porous media. The two media are connected in series and there is a head difference Ah within the distance Az across the combination. The flow rate per unit area given by Darcy's law is q=q=K Ah (3-31) Ax 38 (Todd, 1959). Compare (3-31) to (3-24). Intuitively, for K\«K (i.e. medium 1 very resistive to fluid flow compared to medium 2) the flow through the two media should be controlled mainly by the conductivity of the more resistive medium. One therefore expects an averaged conductivity to be more sensitive to (i.e. closer to) the lower value K\. A harmonic average of K\ and K~2 provides this result (refer to Figure 3.5). 2 Another argument for harmonic averaging is purely mathematical. One is faced with the task of defining the integral of some quantity across an interface over which the quantity may not be smoothly varying (e.g. may contain a jump discontinuity). The interface conditions discussed in Section 2.3 state that the normal component of the flux, B, is always continuous across an interface, but not the normal component of the field, H = jtl B - V<p. Therefore, because integration is a smoothing operation, some level of accuracy may be gained by integrating the less smooth functions ju B and V<p of (3-19b). l A One can also argue for harmonic averaging through consideration of (3-lc), which requires continuity of normal B. For the problem of Figure 3.4, B n=B m m n (3-32) or using (3-lb), = Mi Mi v / dx (3-33) \ d x / Following de Marsily (1986), Taylor expansion allows one to write <t>l = 4>» dtp =</>,„ hx.f dtpX 0((x,-x„,) ) 2 + +(*! " * J (3-34) + o(hx{ ) 2 and </>2 hx f dtp ^ 2 = </>,n + o(hx ) (3-35) 2 2 + V J dx Assuming that length scales are small enough that the 0(hx ) terms can be ignored, subtracting (3-34) from (3-35) gives 2 hx. 02 ~<Pl \ hx ( dtp ^ + d x / + 2 (3-36) 2 ^ dx j 39 Substituting (3-33) into (3-36) yields ~1~ (3-37) 2 ju 2 m and rearranging yields _2_ fax ^ fax2 (3-38) (ft " f t ) Mi B„, can now be expressed as (ft - ft) Ax (3-39) where A„ = 2Ax fax A i fax 2 (3-40) ^ 2 is a harmonically averaged permeability value. Thus, in order to satisfy the continuity condition (3-lc) to first order, the effective permeability value at a cell interface must be a harmonically averaged value. Compare (3-40) to (3-28) noting that 2Ax = hx\ + hxj and L=L\+L2. 3.6.3 Discretization of ju B = V0 l From the arguments for harmonic averaging above, (3-19b) should be integrated rather than (319a). To do so, the equation is first split into three parts, each corresponding to a different Cartesian direction: (3-4 la) (3-4 lb) (3-4 lc) Consider (3-4 la). The volume of integration for this equation runs across a cell face with normal in the x-direction so that an unknown flux B. is at the centre of the integration volume. The integral is +1/2 -';+! *-+i z J J J — dz dy dx = J j " Jv</> dz dy dx x *,'-!/2 >'j k 2 40 (3-42) Assuming that B is constant over the integration volume leads to x *-+l • J'+I V •*,•+!/2 z \dy\dz IM'2MV2 B J ^= k '\dy\dz (3-43) j i-\I2 z X Assuming d<p/dx = A0/Ax and dividing by the integration volume yields D > /?X,. • ;,;+i/2,t+i/2 D 2Ax,. flX; -+- tti+\/2J+\/2,k+V2 ^H-l/2,y+l/2,fr+l/2 0/-l/2,y+l/2,*+l/2 (3-44) A -l/2,y+l/2,t+l/2 , or BjJ+\/2,k+l/2 _ Pi+M2J+\l2,k+\l2 < li,j+\l2,k+\l2 0i-\/2,j+\/2.k+[/2 (3-45) AX/-1 r where (3-46) tfi,j+\/2,k+\/2 — 2Ax _ ; A'+l/2,y+l/2,t+l/2 A'-l/2,;+l/2,t+l/2 is a harmonic average of // across the face. Note that (3-45) is exactly the same result one would obtain from (3-4 la) if one used //=/^ from (3-40) and used a central finite difference approximation for d<p/dx. The matrix-vector equation obtained is M 'B = G<|> x x x or B = x M G4 X (3-47) where the diagonal matrix M has non-zero diagonal elements 77,7+1/2^+1/2 given by (3-46). When (3-4lb) and (3-4lc) are discretized in a similar fashion, the following system is obtained: x B = M G 4> B = M G <|) X X y B z (3-48a) (3-48b) (3-48c) X y y = M G <t> z z which can be combined to give 41 0 B , = B z B 0" 0 M 0 0 G M G = 0 y (3-49a) z (3-49b) Note I also use ' M ' to denote magnetization and 'G' to denote the forward modelling operator for the approximate linear problem (refer to Section 1.2). I will ensure that the usage of these variables is clear in any future occurrences. In (3-49b), the gradient matrix G is «/by nc and the matrix M is nf by nf. The vector B is length nf and holds the unknown flux values. The vector ()) is length nc and holds the unknown potentials. The matrices M, G , G and G are as follows: x y z ^l(nx-\)nynz M = (3-50) nx(ny-\)nz ^1 nx-ny { nz-\) where rf is defined at the same position as B], t (3-5 la) - Ax," Ax"' - Ax ~ AxJ 1 1 2 (3-5 lb) Ax , 1 Ax , 1 42 The 0 and 1 diagonals of G are filled. G is (wx-1) by nx and is diagonally tiled ny-nz times to create G . Hence, G is nfx by nc. th st x x x x (3-52a) A^r 1 o 0 Ay; •Ay; G. = Ay," -Ay~ (3-52b) Ay- l 1 2 4y;!_, The 0 and nx diagonals of G are fdled. G is nx-(ny-\) by nx-ny and is diagonally tiled nz th th y y times to create G . Hence, G is nfy by nc. y y - Az,"' 0 0 Az," •Az G, = Az, •Az" nz-\ th (3-53) Az 1 Aznz-\ The 0 and nx-ny diagonals of G are filled. G is nfz by nc. th m 2 z 3.7 A n alternate formulation There are multiple possibilities for placement and use of the discrete magnetic quantities. It is not clearfromthe outset of the problem what the relative merits of the different options are. For example, instead of choosing to work withfluxesin a cell-centred scheme as above (the "flux formulation") one could work with fields in a node-centred scheme (a "field formulation"). Here, such afieldformulation is described. It will be compared against the flux formulation in order to assess the merits of each as solutions to the magnetic forward problem. The governing equations for the field formulation are 43 V(//H) = 0 H= H, xn = H xii (3-54a) (3-54b) (3-54c) 2 Now, the orthogonal grid system used is as pictured in Figure 3.8 below. The permeability, ju, is constant within each cell and the potentials, fa, are placed at cell corners (the "nodes"). The fields, H\, are placed at the centres of the cell edge interfaces. Figure 3.8 A single discrete grid cell for the field formulation. The location of the fields resultsfrom(3-54c) requiring continuity of tangential H and the positioning of the potential resultsfrom(3-54b). Let the number of nodes be nn = (nx+\)-(ny+l)-(nz+l) There are then nnfaquantities, nx-{ny+\)-{nz+X) H. values on the x-edges (i.e. edges parallel to the x-direction), (nx+\)-ny-{nz+\) Hj values on the v-edges and {nx+\)-{ny+\)-nz H\ values on the z-edges. The number of unknown field values is equal to the number of edge interfaces, ne = nx-(ny+l)-(nz+l) + (nx+l)-ny-(nz+l) + (nx+l)-(ny+l)-nz The boundary fields again enter the discretization through the divergence equation. The volumes of integration for the divergence equation are now the dual grid cells defined so that the potentials are at their centres (refer to Figure 3.9 below). As such, the incorporation of the prescribed boundary fields (here, the geomagnetic field, Ho) must occur outside of the defined 44 grid of permeable cells (the "true grid"). This requires the addition of a half layer of padding cells to the true grid, all of which have permeability offreespace, JUQ. H 0 H 0 r hue grid dual grid L H 0 "o Figure 3.9 The true and dual grids at a boundary for the field formulation. A complete derivation for the field formulation is provided in Appendix B. The process follows exactly those steps in Sections 3.2 through 3.6 for the flux formulation. However, in the field formulation, one does not need to decide the form of the gradient equation to discretize as was necessary in Section 3.6 for the flux formulation. For the field formulation, finite volume discretization of (3-54a) and (3-54b) leads to the matrix-vector equations DMH = q H = G(J) (3-55a) (3-55b) The divergence and gradient matrices D and G in (3-55) are similar to those for the flux formulation but are of different size. M now contains non-zero diagonal values that are arithmetic averages of the four permeability values surrounding each discrete field quantity. The 45 vector q has nn-{nx-1 )-(ny-1 )-{nz-1) non-zero elements arising from the prescribed boundary fields. 3.8 Summary To summarize, the governing equations for the flux formulation were V B =0 ju B = V<f> (3-56a) (3-56b) l which yielded the discrete matrix-vector equations DB = q B = MGcJ) (3-57a) (3-57b) The governing equations for the field formulation were V-//H = 0 H = (3-58a) (3-58b) which yielded the discrete matrix-vector equations DMH = q H = G(t» (3-59a) (3-59b) The flux or field on the grid boundary must be prescribed in order to close the discretization of the divergence equations. These boundary flux or field values enter into the vector q in (3-57a) or (3-59a). The following definitions were made: nc = nx-ny-nz = the number of cells nn = (nx+\)-(ny+\)-(nz+\) = the number of nodes nf = (nx-l)-ny-nz + nx-(ny-\)-nz + nx-ny-(nz-\) = the number of internal cell faces ne = nx-(ny+\)-{nz+\) + (nx+l)-ny-(nz+l) ... + (nx+1 H«y+1 )-nz = the number of cell edge elements The various matrix and vector quantities in (3-57) for the flux formulation have the following sizes: 46 B q D M G is is is is is is nf by 1, nc by 1, nc by 1, nc by nf nf by nf nf by nc. The various matrix and vector quantities in (3-59) for the field formulation have the following sizes: H q D M G is is is is is is ne by 1, nn by 1, nn by 1, nn by ne, ne by ne, ne by nn. For any given grid, it will be true that nn > nc and ne > nf and therefore, the field formulation creates computation problems of a slightly larger size than the flux formulation. 47 Chapter 4 Specifying the boundary conditions 4.1 Boundary flux approximation In Chapter 3, the source-free magnetostatic governing equations were discretized to obtain a system of equations to solve for the discrete potentials and fluxes. The flux on the grid boundary must be prescribed in order to close the discretization of the divergence equation. A reasonably accurate method of approximating these boundary fluxes is needed before the fluxes within the grid can be accurately determined. The model grid can be designed so that any susceptible material is far from the grid boundary such that the associated secondary flux is negligible on the boundary. The approximation can then be made that B=B on the boundary. However, if the secondary response is too large to neglect at the boundary then some approximate modelling routine is required. The approximate secondary response can then be added to the primary flux and prescribed on the boundary, thereby increasing the accuracy of the modelling. 0 There are several possibilities for approximating the secondary response on the boundary. The first is to use the finite volume forward solution on a larger, more coarse, homogenized grid with B=Bo on the boundary. This method should be accurate but possibly slow if the resulting problem is large. An alternative is to approximate the secondary response as being due to an appropriate sphere, spheroid, or other simple body for which an analytic solution is available. This approach could be less accurate but much faster. Because I intend to use the forward modelling methods within an inversion algorithm, any method of approximation for the boundary conditions should ideally be fast and simple. Although the accuracy of the method is important, the grid can always be enlarged in order to move the boundary further from any susceptible material and thereby increase the boundary condition accuracy. A fast approximation method is chosen that approximates the susceptibility within the grid as a sphere. 4.2 The congruous sphere method The susceptible material within the grid could be approximated as a single sphere with susceptibility equal to the volume-averaged susceptibility within the grid. I define the following quantities: 48 V= £v, (4-la) (=1,^*0 1 nc ^ =- I ^ v , (4-lb) where v, is the volume of the i cell; Xi is the susceptibility of the i cell; and nc is the total number of cells. th th Vis the total volume of susceptible material and £is a volume-averaged susceptibility value for the grid. In (4-la), the summation for Vis only over those cells with ^ 0 . To calculate the boundary fluxes, the magnetization of the sphere must be adjusted to account for self-demagnetization effects. Clark and Emerson (1999) discusses self-demagnetization within spheres, spheroids and other simple bodies. For a sphere placed in a uniform inducing field Ho, the resulting magnetization within the sphere is uniform and parallel to Ho and has components M " = T O M ' = r a H H ° - < 4 " 2 a ) ° - ( 4 " 2 b ) N , Ny and N are called the "demagnetization factors" for each Cartesian direction. For a sphere, these factors are constant for all directions: x z (4-3) N = N = N = 1/3 x y z For a spheroid, these factors are dependent on the elongation of the rotational symmetry axis. An appropriate magnetization for the sphere is then M =Jk ^ (4-4) x The magnetic flux density outside of a uniformly magnetized sphere is equivalent to that of a dipole located at the sphere's centre with dipole moment equal to the sphere's volume times its magnetization (Blakely, 1996). Therefore,from(4-4), the susceptible material within the grid could be approximated as a single dipole with moment 49 Let in have length m and unit direction m. The total flux, from both the magnetized dipole and the inducingfieldBo, at any point P is given by B(P) = -^-^-[3(m-f)f-m] + B 4TT r , 0 r±0 (4-6) where r is a vector with length r and unit direction f pointing from the dipole to P (Blakely, 1996). The sphere can be placed at the centre of the grid or at the "centre of susceptibility", r = [x , yc, z ], calculated in a similar manner to a centre of mass: c c c nc nc X*/*' x c y c ' - Xi (=1 Y.XiZi ^x,yt ' = nc z c = (4-7) ''TC Xi Xi i=l 1=1 One could extend the sphere method by computing higher spatial susceptibility moments to define the elongation and orientation of a congruous spheroid. However, the averaged susceptibility may not be significantly high and the higher moments may not lead to a significant elongation. The boundary fluxes calculated from such a spheroid would contain little in the way of self-demagnetization effects beyond reduction of the magnetization strength. The fields for a congruous sphere and a congruous spheroid will then not be appreciably different and the sphere becomes the simpler choice. 4.3 The discretized divergence equation Recall from Section 3.5 that finite volume discretization of the divergence equation resulted in the matrix-vector equation DB = q (4-8) where q contains non-zero elements arising from the prescribed boundary fluxes. In Chapter 5 it becomes useful to deal with q as a sum of two quantities: q = f+g (4-9) 50 Here, f contains primary contributions to the prescribed boundary fluxfromthe geomagnetic flux Bo. g contains secondary flux contributionsfromthe susceptible material within the grid. For example, when the boundaryfluxesare calculated with the congruent sphere approach of Section 4.2, the secondaryfluxcontributions given by CP) = —-^[3(m-f)f-m] 4/T r , r*0 (4-10) are incorporated into g. If B=Bo is used on the boundary with no additional secondary flux approximation, or if the grid contains no susceptible material, then g is a zero vector. 51 Chapter 5 Secondary flux and field formulations 5.1 The total flux formulation In Chapter 3, a finite volume discretization was performed on the governing equations VB = 0 B = //v> (5-la) (5-lb) to yield a method of numerical solution. The discrete matrix-vector equations obtained were DB = q B = MG(j) (5-2a) (5-2b) In (5-1), the flux, B, and potential, 0, are representative of the total flux. That is, they contain contributions from both the primary inducing flux and the secondary anomalous response: </>=<k + </>s (5-3a) B = B + B, (5-3b) 0 In order to solve for the anomalous secondary flux using this formulation, (5-2a) and (5-2b) are combined and one first solves for the vector of total potentials, (j), in DMq> = q • (5-4) The total potentials are then used to calculate the total flux through (5-2b) and the secondary flux is obtained after subtraction of the primary flux: B = MG<t> - B s (5-5) 0 Because the secondary flux values may be considerably smaller than the primary flux values, some accuracy may be lost through machine precision and rounding problems. In the mineral exploration example introduced in Section 1.4, the amplitude of the secondary response was approximately 30 percent of the geomagnetic flux of 52000nT. In contrast, many magnetic surveys for mineral exploration have secondary response amplitudes on the order of hundreds of nT (i.e. only a few percent of the geomagnetic flux). To avoid this significant difficulty, the secondary potentials can be solved for directly and the secondary fluxes can be calculated directly from these. 52 5.2 A secondary flux formulation There are two possible approaches to developing a secondary formulation in which the secondary potential is solved for directly. These involve a fundamental decision to be made when setting up a numerical solution to an inverse problem in which a function is to be recovered. The basic question is whether to formulate the secondary flux problem in function space and then discretize, or whether to first discretize the problem and then formulate the calculation of the secondary quantities. Here, both procedures are outlined. 5.2.1 Approach 1: formulate first, discretize second Here, the governing equations (5-1) are first reformulated to yield equations containing the primary and secondary quantities. Consider the primary response to be that with ju=/Jo and B = B (a constant). (5-1) is then 0 V-B = 0 Bo = //bV$) (5-6a) (5-6b) 0 For a combined primary and secondary response, (5-1) can be decomposed to yield V-(Bo + B,) = 0 B + B = juV(fo + <p ) 0 s (5-7a) (5-7b) s Combining (5-6) and (5-7), and using (2-4) to relate ji to x, yields V B, = 0 B, = MoxVfo + JuVfa (5-8a) = ZBo + M*& (5-8b) Combining (5-8a) and (5-8b) leads to the div-grad system for the secondary potential: V-//V^ = -V-CfcBo) (5-9) Note that (5-3b) defines B^ as the difference of two continuous quantities, the total flux B and the primary flux Bo. Therefore, B^ must be continuous everywhere. However,from(5-8b), if the susceptibility (or equivalently, the permeability) is discontinuous across a cell face then B on that face is equal to the sum of two discontinuous quantities. The sum must be continuous so it must be true that the two discontinuities cancel each other. s With B continuous everywhere, one can proceed with the finite volume discretization of (5-8) as in Chapter 3. The divergence equation (5-8a) would yield a similar matrix-vector equation as before: s 53 DB = g (5-10) s (recall the discussion of q as a sum of f and g in Section 4.3). The gradient equation (5-8b) is rearranged into the more rough equation ju fl (5-11) M \ P f Mo M which will yield harmonic averages in ju when discretized. (5-11) is split into three parts, one for each Cartesian component. The integration for the x-direction is */+!/2 -J+1 k+\ z V * i + l / 2 -'O'+l k+l z dzdydx+ 0x l i-l/2 >j j J" jV <pdzdydx x (5-12) Mo M -k Note the subscript V has been dropped and all quantities are secondary unless otherwise indicated (e.g. B ). Assuming that B and B are constant over the integration volume and dividing by the volume gives 0 x D i+\I2 x X "i,j+\!2,k+]/2 Ax,. d x f •> _ " 0.r B ju + Ax,.// J 0 V dx I Av Ax.J J 0 J 1 Td<f> dx //// A Ax, v J dx Xi (5-13) J r Performing the integration yields B /ZX; •+- 2Ax. yMi+l/2J+l/2,k+U2 B O.r Mi-\I2J+M2,k+\I2 j /zx, /zx i-l 2Ax,. ^/+l/2,y+l/2,t+l/2 (5-14) ! Bp.r ^ ^+l/2,y+l/2.t+l/2 Mi-\l2J+\l2,k+\l2 J A) ^/-l/2.y+l/2,t+l/2 A*; or li,j+\/2 r ' ~ Wo t+l/2' -'/,;+!/2,t+l/2 1 \ | Qi+\l2,j+\l2,k+\l2 '/,j+l/2.4+l/2 / 0.v R B 7 AX; 0i-l/2J+\l2,k+U2 (5-15) where //,V+l/2,t+l/2 = 2Ax,. /ZX; Mi+\l2,j+[l2Ji+\l2 -+- /ZX; Mi-\I2J+\I2,,,k+M2 54 (5-16) The matrix-vector equation obtained is M B = (/Vl - M - ' ) B x x x 0 x + (5-17) where M has non-zero diagonal elements Tjij+>/ +v given by (5-16) and I is an identity matrix of appropriate size. X 1>k 2 The discrete equations for all three directions can be combined to give 0 M; 1 0" B 0 or = 0 0 0 f 0 0 0 M;' 0 0 0 x M;' A Oy Oz M - ' B = (A,-'I - M'')Bo + G()) S Ox + G v (5-18a) G. (5-18b) s 5.2.2 Approach 2: discretize first, formulate second Here, a secondary formulation is developed though the opposite procedure: discretize first and then split the matrix quantities into primary and secondary quantities. (5-3) and (4-9) are used to decompose (5-2) into D(B + B ) = f+g M-'(Bo + B ) = G((l)o + (j)) 0 (5-19a) (5-19b) s s s The equations for the primary response are DB = f (5-20a) (5-20b) 0 MQ-'BO = G4> 0 Mo has non-zero diagonal elements rjo=jUo so that M =//ol ( I is an nf'by nf'identity matrix). Substitution and elimination using (5-19) and (5-20) yields 0 DB = g (5-2 la) (5-2 lb) s M- B, = (/A)" I-M- )Bo + G(|), 1 1 1 and hence, the two approaches yield identical results. 55 5.3 Summary of the flux formulations A cell-centred finite volume discretization of the governing equations for magnetostatics resulted in the discrete matrix-vector equations below. For the totalfluxformulation, DB = f+g B = MG(t) (5-22a) (5-22b) and the secondaryfluxis calculated as B = B - Bo (5-23) s The total flux formulation could also be written DB = g (5-24a) s B = MG(t> - B s (5-24b) 0 For the secondaryfluxformulation, the secondary quantities can be computed directly using DB = g B = ( M J - ' M - I)B + M G f (5-25a) (5-25b) S s 0 s 5.4 A secondary field formulation RecallfromSection 3.7 that a node-centred discretization approach dealing with fields instead offluxesresulted in the following discrete matrix-vector equations: DMH = q H = G(|) (5-26a) (5-26b) Following the same procedures of Section 5.2, one can show that either approach at a secondary field formulation results in the following: D M H = -D(M-//oI)H + g = -D(/4)-'M - I)B + g H = G4> s 0 (5-27a) (5-27b) 0 s s The full details of this derivation are provided in Appendix B. 56 5.5 Generalized notation for forward modelling Several systems of matrix-vector equations have been set up to solve for total and secondary potentials, fluxes andfields.To avoid confusion of symbols, to aid notation, and as a prelude to the inverse problem, all forward modelling systems will be written as A(m)u = q(m) (5-28) Here, m is a length nc model vector containing the discrete permeability values within the model grid. For example, in the secondary flux formulation, (5-25a) and (5-25b) combine to give DM(m)G(t) = g(m) - D(/VM(m) - I)B s 0 (5-29) This can be expressed in the form (5-28) with A(m) = DM(m)G u = <j> q(m) - g(m) - D(fJd M(m) - I)Bo (5-30a) (5-30b) (5-30c) s l Take care not to confuse q in (5-28) with the quantity q = f + g introduced in Section 4.3.1 will no longer use q in the latter context. A method of solving a system of equations of the form (5-28) will now be addressed in Chapter 6. 57 Chapter 6 Solving the discrete forward modelling equations 6.1 Introduction In Chapter 3, the source-free magnetostatic governing equations were discretized to obtain systems of equations to solve for the discrete potentials andfluxes(or fields). In Chapter 5, secondary formulations were developed in which the secondary quantities can be solved for directly to avoid losses in accuracy. A method of solving any of these systems of equations for the unknowns will now be addressed. RecallfromSection 5.5 that the forward modelling systems can be written as the linear system A(m)u = q(m) (6-1) where u is a vector of discrete potentials. Once u is determined, the data of interest (i.e. certain flux components at specified locations) are calculated through some nonlinear operation. For example, in the secondary flux formulation, the discretefluxeswithin the grid are calculated as B = (/A)- M(m) - I)B + M(m)G<t) (6-2) 1 s 0 s where (j) = u. The specific flux data are then interpolatedfromthe discretefluxesusing an eightpoint interpolation. Solution of u in (6-1) will now be addressed. s 6.2 Solving the discrete system for the potentials 6.2.1 The solver chosen Here, the goal is to solve the system of equations in (6-1) for u. This requires a solver from linear algebra. For any formulation used (i.e. total or secondary; field or flux), the matrix A is A(m) = DM(m)G (6-3) where D is the divergence matrix, G is the gradient matrix, and M is a matrix of averaged permeability values. D, M and G are large, sparse and contain simple structure. The resulting matrix A is also large and sparse and has a banded structure with seven diagonals of non-zero elements positioned fairly close to the main diagonal (one is on the main diagonal). The discussion of these matrices in Chapter 3 indicates that A is square and non-symmetric. Furthermore, it has been determined empirically that A is not positive definite in general. 58 The solver chosen was the Bi-Conjugate Gradient Stabilized method (BiCGStab). This method has been used successfully in both the dc-resistivity and magnetometric resistivity problems (Chen, 2002), which requires the solution of systems similar to (6-1). The Krylov Space method BiCGStab is established in Van der Vorst (1992) and an excellent explanation of the related Conjugate Gradient (CG) method is given in Shewchuk (1994). The BiCGStab method and many other related non-stationary iterative methods, including those mentioned below, are discussed in Barrett et al. (1994) and Saad (1996). The well known CG method is not suitable for non-symmetric systems (Barrett et al., 1994). The Bi-Conjugate Gradient (BCG) method was developed to solve non-symmetric linear systems. For symmetric positive definite systems it gives the same result as CG but at twice the cost per iteration (Barrett et al., 1994). For non-symmetric matrices, it is observed that the convergence behavior can be quite irregular and the method may break down (Barrett et al., 1994). The Conjugate Gradient Squared (CGS) method was developed as an alternative to BCG and provides faster convergence for roughly the same cost (Saad, 1996). However, the CGS method often shows highly irregular convergence behavior that may lead to substantial build-up of rounding errors (Saad, 1996), and it tends to diverge if the starting guess is close to the solution (Barrett et al., 1994). The BiCGStab method is a variation of CGS that avoids the convergence problems of CGS while maintaining about the same speed of convergence as CGS (Barrett et al., 1994). Indeed, testing of these two solvers on typical problems showed that CGS was sometimes faster (although not significantly) but always required more iterations than BiCGStab. The pseudo-code for the BiCGStab solver, adapted from Barrett et al. (1994), is provided in Figure 6.1 below. 59 Compute r = q - Au (0) Choose r (e.g.? = r <0) for some initial guess u <0) ) for / = 1,2,... = rV'-" Pi- if / = 1 (/-n = r else I = ( A - l / A - 2 )(«/-! M-l) (0 _ P endif r< H >+/t,k solve Pp = p v ( M ) -"MV ( W ) ) (,) o = Ap a, = P, -,/?V'> s =r - ' - a y * (/ if s < tol U ° := ( u (/ u +a.p STOP end solve Ps = s t = As CO,= t s/t t T u r (/) = s -<y,t (/) if T r </> < to/ STOP end end Figure 6.1 The Preconditioned Bi-Conjugate Gradient Stabilized Method. 60 (0) Here, P is a preconditioner, discussed shortly. The major computational requirements involve two solutions of the form Px = b, two matrix-vector products and four vector dot products per iteration. A typical BiCGStab convergence curve for the forward solution is shown below in Figure 6.2. iteration Figure 6.2 A typical BiCGStab convergence curve. Generally, the convergence is smooth until very low relative residual values are reached, at which point machine precision and rounding come into play. The BiCGStab routine is startedfromsome initial guess (i.e. u in Figure 6.1). The initial guesses considered were the zero vector initial guess and the secondary potentials for the congruous sphere of Section 4.2. Experimentation with typical problems showed no obvious difference in solution efficiency for these two initial guesses. Therefore, the zero vector was chosen as a default since the alternative requires non-trivial construction. However, in cases where a solution is known for a similar problem (i.e. a similar discrete model), the known solution can easily be used to speed up the solution time. (0) 61 As a final note, the solution for the potentials is non-unique because VjLN</>=VjuV(<t>+c) (6-4) where c is some constant scalar quantity. The solution to (6-1) is only unique to an additive constant. The BiCGStab solver is able to deal with this non-uniqueness. 6.2.2 Preconditioning Convergence of CG and related methods, such as BiCGStab, are enhanced if the condition number of A is low and the eigenvalues of A are clustered. This can be achieved by the addition of a preconditioner to the system in (6-1). As indicated in Figure 6.1, a preconditioner P can be incorporated and the BiCGStab algorithm then effectively solves the system P Au = P"'q (6-5a) Au = q (6-5b) 1 or By choosing an appropriate preconditioner, one may find a matrix A = P~'A with desired properties such that the BiCGStab solution of (6-5) is more efficient (i.e. faster and more stable) than that of (6-1). An effective preconditioner is one that approximates A so that P" is an 1 approximate inverse of A. A is then an approximate identity matrix. The preconditioners considered were Symmetric Successive Over-Relaxation (SSOR) and Incomplete LU Factorization (ILU). 6.2.3 SSOR preconditioning Let D be a diagonal matrix containing the main diagonal of A; L be the lower triangular part of A excluding the main diagonal; and U be the upper triangular part of A excluding the main diagonal so that A = D+L+U (6-6) Take care not to confuse D used here with the divergence operator. The SSOR preconditioner is P= (D + fflL)D (D + _1 fljU) , co(2 - co) Alternatively, (6-7) can be written P = PiP2 with 62 0<co<2 (6-7) P, = P = 2 ( D + 6JL)DV&>(2 - <y) , (6-8a) U2 1 =D-' 1 Jco(2-co) / 2 (6-8b) ( D + 6JU) where the elements of D" are the inverted square roots of the elements of D . Pi and P2 are lower and upper triangular respectively. When split in this manner, the equations in Figure 6.1 involving P can be solved quickly and easily through Gaussian elimination. One can verify that the matrix A defined by (6-3) has solely negative values on the main diagonal. This introduces purely imaginary values into D " , which convert back to purely real values when squared (as Pi multiplies P2). In practice, the SSOR preconditioning can be used to approximate -A so that all elements in D " remain real. 1 / 2 1/2 The number of iterations required in the BiCGStab solution for a given tolerance (i.e. tol in Figure 6.1) can be reduced by using an optimal value of co. For a few structured problems, the optimum value of cois known. However, in more complicated problems it may be necessary to perform a fairly sophisticated and prohibitively expensive eigenvalue analysis in order to determine the optimum value (Golub and Van Loan, 1990). Here, optimal values are estimated through thorough trial and error analysis. 6.2.4 I L U preconditioning Incomplete factorizations such as ILU are computed in a similar manner as their complete equivalents except that a non-negative scalar drop tolerance, droptol, is used to ignore smaller fill elements (i.e. non-zero entries). Incomplete factorization preconditioning requires a nontrivial construction phase, whereas SSOR preconditioning requires relatively little effort in construction. ILU preconditioning is especially effective if the system is diagonally dominant (Haber, 2000), as is generally the case for practical applications of the forward modelling methods. An ILU factorization of A yields two incomplete lower and upper diagonal matrices L , „ and V such that C inc A = h V inc (6-9) inc An appropriate preconditioner is therefore P = L / B C U/„ (6-10) C The parameter of interest is droptol and choosing its value presents a trade-off between construction time and BiCGStab solution time. Recall that the effectiveness of the 63 preconditioner depends on how well P approximates A. For a higher value of droptol, the construction of L„, and U,„ will be faster. However, (6-9) will be less accurate and, therefore, P in (6-10) will be less effective at improving the BiCGStab convergence. As droptol approaches zero, an exact decomposition of A is obtained and the BiCGStab solution is obtained in one iteration. c c The ILU algorithm used was the MATLAB function LUINC. Details of this function can be found at MathWorks (2002). LUINC performs a sparse incomplete LU factorization and allows for various possible modifications of the procedure. 6.2.5 The best preconditioner and the optimal parameter value Typical curves showing solution times for various parameter values are displayed below in Figures 6.3 through 6.10. For all figures, the model contained susceptibilities chosen randomly from a uniform distribution on [0,10]. In Figure 6.3, the grid had 15 cubic cells of dimension lm and the BiCGStab solution tolerance was 10". 3 5 Figure 6.3 Typical curves of CPU solution time for the SSOR (left) and ILU (right) preconditioners for many values of the parameters co and droptol. The grid had 15 cubic cells of dimension lm and the solution tolerance was 10". 3 5 The SSOR curve dips fairly smoothly to a minimum value between the two limits. For the ILU preconditioner, for small values of the drop tolerance the ILU decomposition runs nearly to completion and requires a long time. The consequent BiCGStab solution then requires far less iterations and the BiCGStab solution time is lowered but not enough to counteract the large preconditioner construction time. For large values of the drop tolerance, the ILU decomposition does very little and takes little time. The consequent BiCGStab solution requires far more iterations and the BiCGStab solution time is increased significantly. These two effects are demonstrated in Figure 6.4 below. 64 droptol Figure 6.4 Typical curves of CPU time for the ILU decomposition (left) and subsequent BiCGStab solution time (right) for many values of the parameter droptol. The grid had 15 cubic cells of dimension lm and the solution tolerance was 10". 3 5 In Figures 6.5 and 6.6, the grids were alteredfromthat in Figure 6.3 to contain 10 and 20 cubic cells respectively. Comparison of Figures 6.3, 6.5 and 6.6 indicate that the ranges of optimal parameter values do not change significantly with grid size. 3 droptol Figure 6.5 Typical curves of CPU solution time for the SSOR (left) and ILU (right) preconditioners for many values of the parameters «and droptol. The grid had 10 cubic cells of dimension lm and the solution tolerance was 10". 3 5 65 3 droptol Figure 6.6 Typical curves of CPU solution time for the SSOR (left) and ILU (right) preconditioners for many values of the parameters co and droptol. The grid had 20 cubic cells of dimension lm and the solution tolerance was 10". 3 5 In Figures 6.7 and 6.8, the grids were alteredfromthat in Figure 6.3 to contain cubic cells of dimension 0.1m and 10m respectively. Comparison of Figures 6.3, 6.7 and 6.8 indicate that the ranges of optimal parameter values do not change significantly with cell dimension. droptol Figure 6.7 Typical curves of CPU solution time for the SSOR (left) and ILU (right) preconditioners for many values of the parameters CO and droptol. The grid had 15 cubic cells of dimension 0.1m and the solution tolerance was 10". 3 5 66 droptol Figure 6.8 Typical curves of CPU solution time for the SSOR (left) and ILU (right) preconditioners for many values of the parameters co and droptol. The grid had 15 cubic cells of dimension 10m and the solution tolerance was 10". 3 5 In Figures 6.9 and 6.10, the same grid in Figure 6.3 was used but the BiCGStab solution tolerance was altered to 10" and 10" respectively. Comparison of Figures 6.3, 6.9 and 6.10 indicate that the ranges of optimal parameter values do not change significantly with the solution tolerance. droptol Figure 6.9 Typical curves of CPU solution time for the SSOR (left) and ILU (right) preconditioners for many values of the parameters CO and droptol. The grid had 15 cubic cells of dimension lm and the solution tolerance was 10". 3 2 67 Figure 6.10 Typical curves of CPU solution time for the SSOR (left) and ILU (right) preconditioners for many values of the parametersft)and droptol. The grid had 15 cubic cells of dimension lm and the solution tolerance was 10". 3 8 The difference in optimal solution times between the two preconditioners is relatively minor. However, the total solution time vs. droptol curves contain a fairly flat portion around the minimum, whereas the total solution time vs:ft)curves contain slightly irregular portions around a relatively sharp minimum. Hence, an optimal value for droptol is easier to predict than an optimal value for co. For this reason, I chose to use the ILU preconditioner with a default value of 10" for droptol. 6.3 Accuracy of the flux and field formulations There are inaccuracies introduced in any approach to a numerical solution of a PDE. Here, the source and magnitude of these errors are discussed with respect to both the flux and field formulations. 6.3.1 The flux formulation To simplify the discussion, consider the case when all model cells are identical cubes with dimension h. First, consider the flux formulation. The following x-directional Taylor expansions across a cell face can be written: h h2 68 Here, the subscript conventions of Section 3.3 are used. The derivatives of 0 with respect to x are denoted using prime notation so that V <p= <f>'x with x a unit vector in the x-direction. Subtracting (6-1 lb) from (6-1 la) and dividing by h gives x 4 + (6-12) h This is the central difference used in the finite volume discretization of Chapter 3. The governing equations for the flux formulation can be written V-B = 0 B = /N # (6-13a) (6-13b) x x By = fffy0 (6-13C) B = fff 0 (6-13d) 2 z which combine to give V(//V$ = 0 (6-14) When discretized using the finite volume discretization of Chapter 3, (6-13b) will yield expressions of the form B* (6-15) Here, 7] is a harmonic average of the two neighbouring permeability values on either side of the particular cell face and is given by (6-16) The "local truncation error" of a discrete method is the local difference between the exact quantity being discretized and the discrete numerical approximation of that quantity. Comparison of (6-15) and (6-12) shows that use of (6-15) introduces a term into the local truncation error that is not only dependent on the squared cell length, h , but also on the value of 7]. Hence, the accuracy of the flux formulation is 0(7]i, h ). arm 6.3.2 The field formulation Now consider thefieldformulation. The following x-directional Taylor expansions along a cell edge can be written: 69 « W = ^y . vi fa k = <W + +\€ -\<l>'~y,iM Y l j, +0(h ) (6-17a) 3 k - ° ( h i ) ( 6 ' 1 7 b Subtracting (6-17b)from(6-17a) and dividing by h gives The governing equations for the field formulation can be written V(//H) = 0 H^V^ H =V </> H =V0 y z (6-19a) (6-19b) (6-19C) (6-19d) y z which again combine to give (6-14). When discretized using the finite volume discretization, introduced in Chapter 3, (6-19b) will yield expressions of the form (6-20) H: Comparison of (6-20) and (6-18) shows that this step introduces a term into the local truncation error that is only dependent on the squared cell length, h . However, when (6-19a) is discretized, this term is multiplied by arithmetic averages, Tjanth, of the four permeability values surrounding a particular cell edge. Hence, the accuracy of the field formulation is 0(7] rithh ). 2 a 6.3.3 Comparison of the flux and field formulations Both the flux andfieldformulations have similar local truncation error terms that are not only dependent on the squared cell length but also on averaged permeability values. Consider the harmonic and arithmetic averages of two permeability values JU\ and fif. ^™=2U"'+/V)" (6-2 la) ^/*=|U+^) (6-2 lb) ! It can be shown that Vharm^Vanth for any jU\>jUo and // >M> (refer to Figure 3.5 in Section 3.6). Therefore, one expects that for a given model, the flux formulation may provide more 2 70 ) accurate solutions to the forward problem than the field formulation when large permeability contrasts are present across cell interfaces. However, this may be too simplistic and needs to be investigated numerically. 6.4 G r i d design The amount of error in the forward modelledfluxeswill depend on various factors: • . • • . the level of discretization; the tolerance for convergence of the BiCGStab solver; machine precision and rounding; the location of the observation points relative to the boundary; and the distance of the observation pointsfromhigh values of magnetic permeability. The majority of the modelling error is expected to comefrominaccuracy inherent in the finite volume discretization. The discretization used has second order accuracy, 0(h ), in B (or H), which resultsfromthe use of central differences. Because of this "discretization error", grids with a finer discretization (i.e. more cells within the same model region) will lead to more accurate solutions. 2 There is an obvious tradeoff between solution accuracy and solution time. A very finely discretized grid (i.e. one with a very large number of cells) will give accurate results but the solution may be too slow for practical purposes. This becomes especially significant when the forward modelling algorithm is used within an inversion algorithm and the forward solution must be computed many times. The accuracy in B is also dependent on the averaged permeability values. Therefore, when observation points are close to high values of magnetic permeability, the local truncation errors may lead to less accuracy at these positions in the grid. This may be too simplistic, however, because the local truncation errors propagate through the grid. A second source of modelling error resultsfrominaccuracy in the prescribed boundary conditions. RecallfromChapter 4 that the boundaryfluxescan be prescribed as the Earth's flux or can be calculated by approximating the susceptible material within the grid as a simple body. The accuracy of any such approximation will depend upon the location of the susceptible material with respect to the grid boundary (when farfromany susceptible material, the secondary fields become less significant). To interpolate at specific observation locations one needs to ensure that the grid is extended beyond those locations. Having to increase the grid size (e.g. above the Earth's surface to extend into the air) will increase the size of the system, and therefore increases the required memory and solution time. However, this will also help deal with boundary condition accuracy. 71 Inaccuracy in the boundaryfluxeswill propagate through the grid leading to increased inaccuracy throughout. The errorsfromthis source are expected to decrease awayfromthe grid boundary (i.e. closer to the grid centre). Therefore, the location of thefluxquantities of interest (i.e. at the survey locations) becomes an issue. Small changes in the external boundary conditions will affect flux values in the outer grid region more than those in the inner grid region and one should design any grid with this in mind. Another important accuracy concern is the accuracy to which the system is solved using the BiCGStab solver. This source of inaccuracy is controllable and therefore is not such a concern. Neither is the unavoidable and perhaps insignificant error due to machine precision and rounding. However, I note that inaccuracy from this latter source may be reduced through use of the secondary formulations discussed in Chapter 5. From the discussion above it is obvious that accuracy concerns should factor highly in the appropriate design of a grid when performing an inversion that utilizes the forward modelling algorithm. In that case, the only flux quantities that one is concerned with are those at the survey locations. Furthermore, the accuracy at these locations need not be much better than the estimated errors in the survey measurements. Although this is difficult to quantify, the discussion of this section has pointed out some important guidelines to consider for a responsible grid design. 72 Chapter 7 An alternate solution in the integral equation domain 7.1 Introduction Now that a numerical solution to the forward problem has been formulated, it needs to be tested. The finite volume discretization (FVD) solutions can be tested against exact analytic solutions for simple bodies such as ellipsoids. A drawback of this approach is that the number of shapes available is limited. Furthermore, because the FVD solutions deal exclusively with prismatic cells, it is impossible to exactly model bodies with curved surfaces. To provide more general tests, a full solution for arbitrary distributions of permeability has been developed in the integral equation domain. This solution uses a grid of prismatic cells and, as such, provides a method of comparison against the FVD solutions for complicated distributions. 7.2 A full discrete solution to the magnetic problem in the integral equation domain Written in integral form, the field at a point P due to a distribution of magnetic material within a region R is given by H(P) = H (P) + H (P) 0 J = H (P) + l f M ( f i ) - W - i - * 0 An ( 7 r(P,Q) iR " ! ) where Ho is the primary inducing field (e.g. the Earth's geomagnetic field); H is the anomalous secondary field; M is the induced magnetization of the magnetic material in R; P = (x , y , z ) is the position of the observer; Q = (x , y , z ) represents the positions of the volume elements dv within R; and r(P,Q) is the distancefromP to Q: s p p p q q q r(P,Q) = r pq =p -x ) 2 p q + (y p - yf q + (z - z ) 2 p q (7-2) Although the inducing field will be dealt with as a constant value, Ho, it may vary with location and extension to this case is simple. The Cartesian directions x, y and z are now expressed as x,, x and x respectively. Consider R to be a single rectangular prism with constant magnetic 2 3 73 susceptibility x and constant magnetization M = (Mi, M2, M ). The z' component of the field at P due to this prism can be written th 3 H?=H ,.+XT,.£M, 1 e {1,2,3} , 0 (7-3) where the superscript p indicates the observation point P, H field and the tensor component T.£ is T ' - ^ s k i i d x ^ 0 i is the /* component of the inducing ( 7 - 4 ) There are only five independent elements in the tensor T due to symmetry and because the trace of T is equal to a known constant: p p T,r = T £ (7-5a) T* + T/ + T 2 3 P 3 =0 for P outside R = -1 for P inside/? (7-5b) Equations for the various T,£ components when R is a rectangular prism are given in Sharma (1966). To evaluate (7-1) numerically for a more complicated susceptibility distribution, the threedimensional Earth region of interest, R, is divided into a large number, nc, of discrete, rectangular prismatic cells R within an orthogonal grid system. Each cell R„ has constant susceptibility Xu and magnetization M". This discretization into many cells allows complicated bodies to be constructed. The validity of the assumption of piecewise magnetization improves with refined discretization. lt With the model containing many rectangular prisms of constant magnetization, the field at a point P now requires a second sum over all nc cells: Hf = H W + £ XT^M; , i s {1,2,3} (7-6) »=i Here, the subscripts / and k are used to indicate the three Cartesian directions, the superscript u represents the prismatic cells and p represents the observation point P. As suggested in Sharma (1966), a full solution in the integral equation domain can be formulated by continuing from (7-6) without making the linear approximation. Assuming that 74 the material is isotropic, linear and contains no remanent magnetization, the induced magnetization at any location within the model can be written (7-7) In Section 1.2 I discussed the simplified case for H « H o , where (7-7) becomes M=£Ho- Here, I do not make this assumption but continue with (7-7) directly. 5 The magnetization is assumed to be constant within each cell. (7-6) and (7-7) allow us to write f Mf = Hf =Z p P X nc f H 0 / +£ I Y| 3 JTJTMJ «=1^*=1 ; le {1,2,3} , {l,2,...,«c} (7-8) J) where T..f' = 1 (7-9) AK dx dx. : PR Here, the observation points represented byp are at the grid cell centres. (7-8) represents 3nc equations in 3nc unknown magnetization values Mf (i.e. the 3 Cartesian components of magnetization for each discrete grid cell). Let k be the known vector of nc model susceptibility values. Let m be the unknown vector of 3nc model magnetization values. Let the nc by nc matrix T,* hold the tensor components in (79). The element in the p row and u column, T . f , describes the contribution to the t field component at the centre of the p cell due to a unit magnetization in the/ direction in the w cell. For example, the element T, describes the contribution to the jc-component (the 1 component) of the field at the centre of the 3 cell due to a magnetization y (one unit strength in the 2 direction) in the 4 cell. th th th th h 3 4 th st 2 rd nd th I now define the following quantities: K (7-10) = J V l 0x (7-11) H = 0 75 ( 'Y X T* xx xy T T yx yy ^ x: T *• (7-12) v; K is a 3«c by 3nc matrix with the susceptibility vector k repeated three times along the main diagonal. H is a length 3nc vector formed by repeating each component of the inducing field nc times and stacking. T is a Inc by 3nc full matrix. (7-8) can now be written in matrix-vector notation as 0 m = K (Ho + Tm) (7-13) which can be rearranged into an equation of the form Am = b (7-14a) A = I - KT b = KH (7-14b) (7-14c) where 0 and I is a Inc by 3nc identity matrix. Direct or iterative methods can be used to solve for the magnetizations m in (7-14a). These can then be used to compute the secondary response using (7-6). The i component of the secondary flux, B,, at point P due to these magnetizations can be calculated using (7-6) to give th =/oX X T ^ M ; ( 3 Bf {1,2,3} »=1 y_ k=\ i (7-15) e This assumes P is infreespace. In matrix-vector notation, the flux component values at several observation points are calculated as B, = #>[T„ T T ]m /2 (7-16) /3 where the matrices T,,, T,- and T,- are JVby nc: the element in thep row and w column, T,f , th 2 describes the contribution to the / fieldcomponent at the p in the/ direction in the uth cell th h th 3 xh w 76 datum due to a unit magnetization 7.3 Practical aspects of the full integral equation domain solution RecallfromSection 7.2 that the magnetization is assumed to be constant within each cell. The frill integral equation domain solution (or "integral solution") is expected to increase in accuracy with refined discretization because the assumption of piecewise continuous magnetization will improve. The advantage of the FVD methods is that only sparse matrix-vector products and vector dot products are required. This makes the FVD methods comparatively less time and memory intensive for large-scale problems (i.e. when many susceptible cells are involved). In contrast, the integral solution makes use of a large, full matrix T , which requires considerable construction time and memory, and solution of the system in (7-14) for the magnetizations requires many operations. Investigation has shown that for typical problems of practical size, the integral solution generally requires one or more orders of magnitude more time than do the FVD solutions (this will be better quantified in the concluding chapter). The integral solution is, therefore, much less suitable for use within an inversion algorithm, in which the forward solution must be computed many times on a grid of many susceptible cells. Hence, the integral solution is only introduced to provide a method of testing the FVD solutions for complicated distributions. This testing is discussed shortly in Chapter 8. In situations where forward modelling is required for compact bodies (such as UXO) that require relatively few cells to be discretized, use of the integral solution may be advantageous. This is especially true if the tensor values in (7-9) can be derived for helpful cell geometries other than simple rectangular prisms. Furthermore, the speed of the integral solution can easily be increased by approximating each prismatic cell as either a sphere or spheroid. Calculation of the tensor components for these simple bodies is much faster than for rectangular prisms, but some level of inaccuracy is introduced. 77 Chapter 8 Testing the forward modelling 8.1 Introduction The numerical finite volume discretization (FVD) solutions to the magnetostatics problem, discussed in previous chapters, were coded in MATLAB. Various tests were performed to assess the competency of the forward modelling code. The secondary flux andfieldformulation solutions were compared to analytical solutions and to the slower, more memory intensive full integral equation domain solution (or "integral solution") of Chapter 7. In Section 8.2, flux profiles are calculated above simple spherical and cubic magnetic bodies. For these bodies, the main effect of self-demagnetization is the nonlinear scaling of the secondary response amplitude with susceptibility. In Section 8.3, a more complicated spheroidal body is considered. For this spheroid, an additional effect of self-demagnetization is observed: rotation of the magnetization direction awayfromthe inducingfielddirection. Section 8.4 deals with issues related to incorporation of the prescribed boundary conditions. Conclusions are drawn in Section 8.5. 8.2 Spherical and cubic magnetic bodies 8.2.1 Profiles above a sphere and a cube In thefirsttest, the numerical solutions for a susceptible cube were tested against the analytical solution for a susceptible sphere. The magnetic responsefromboth bodies should be comparable at large distances. The discrete model involved a single susceptible cube at the centre of a uniform grid of 33 cubic cells of volume Av=lm (each grid dimension equaled 33m). The ideal sphere had identical susceptibility as the cube and unit volume Av so that the volume summed magnetization for each body was equal (i.e. there was the same amount of magnetic material in each). The inducing flux was Bo = (Bo*, Bq>,, Bo ) = (0,0,10000)T and the susceptibility values considered were %= 0.01, 1 and 100. The situation is as depicted in Figure 8.1 below. 3 3 z 78 profile 2 profile 1 data 10m lm Figure 8.1 A schematic diagram of the first test scenario: flux profiles above and away from a susceptible sphere and cube that are concentric and have equal volume. The magnetic flux density outside of a sphere is equivalent to that of a dipole located at the sphere's centre with dipole moment equal to the sphere's volume times its magnetization (Blakely, 1996). This moment needs to be adjusted to account for self-demagnetization effects within the sphere. The adjusted moment used is as derived in Section 4.2: (8-1) J Flux profiles 10m above the centres of the bodies (i.e. profile 1 in Figure 8.1) for each susceptibility are shown in Figure 8.2 below. Figure 8.3 shows the difference between the numerical and analytic solutions for the same profiles in Figure 8.2. In these and subsequent figures, the legends refer to the FVD flux andfieldformulation solutions as "B Form." and "H Form." respectively, and the integral solution is referred to as "Int. Sol'n". Flux units are displayed in tesla and dimension units are displayed in meters. 79 0.02 0.01 ca Oi -0.01 -15 ' -10 • -5 1, • • 1 • 0 x . • 5 • 10 15 -0.01 -15 . . , .1.5 1 1 • -10 • -5 • 0 x • 5 • 10 1 15 0.5 co 0.5 -0.5 _1 I -15 Oi . -10 , , , . 1 -5 0 5 10 15 -0.51 -15 5 1 • • • 1 0 5 10 15 Oi i • -15 -10 0 • -5 10 -5 _ • -10 • -5 * . , 1 0 5 10 15 _1 5 -15 . -10 . -5 i . 0 x 5 , 10 1 15 Figure 8.2 B (left) and B (right) along a profile running in the x-direction at a height 10m above the sphere/cube centre (x = 0) for susceptibilities of #= 0.01 (top), X=\ (middle) and #=100 (bottom). x z 80 x 10 Int. SoPn B Form. H Form. W W - * Figure 8.3 B (left) and B (right) differences ( B , „ / - B „ y ) along a profile running in the x-direction at a height 1 Om above the sphere/cube centre (x = 0) for susceptibilities of j = 0.01 (top),^=l (middle) and j=100 (bottom). x z m me ca a a/ /c Recall from Section 6.3 that the local truncation errors in the FVD methods are 0(f]h ), where h is a grid cell dimension and rf is an averaged permeability value. The inaccuracy in the FVD solutions in this test are severe. Although this is related to the susceptibilities used, the major source of error is the coarse discretization. The discretization is such that the susceptible cubic body is modelled as only a single cell and one can not realistically expect high accuracy. The inaccuracy can be reduced through refinement of the grid, as will be shown in Subsection 8.2.2. In contrast to the FVD solutions, the integral solution does not sufferfromthe same 0{rjh ) discretization error and is very much closer to the analytic solution, regardless of the susceptibility used. 2 2 Figures 8.2 and 8.3 show error trends expectedfromthe 0(rjh ) accuracy. The averaged permeability 7] is a harmonic average in the flux formulation and an arithmetic average in the field formulation. In Section 6.3 it was shown that for a given model, the flux formulation is expected to be more accurate (i.e. have smaller local truncation errors) than the field formulation 81 when large permeability contrasts are present. It is difficult to quantitatively compare the results with theoretical expectations because the local truncation errors propagate though the grid. However, the qualitative expectationsfromSection 6.3 are evident: for increasing susceptibility contrast, the flux formulation performed increasingly better in comparison to the field formulation. 8.2.2 Profiles above a cube with refined grids Here, the models again involved a central susceptible cube but the level of discretization was altered and the FVD solutions were compared to the integral solution alone. Three uniform grids were used containing 11,33 and 55 cells. These contained central, uniform, susceptible cubic bodies l , 3 and 5 cells in volume respectively. All grids had equal outer dimensions of 33m but different cell dimensions and as such, all three central bodies were of identical physical size. 3 3 3 The inducing flux was Bo = (0,0,100)T and the susceptibility of the central body was %=\00. Profiles of the flux 2m above the centres of the cubic bodies are shown in Figures 8.4 and 8.5 below. The three numerical solutions converge with increasingly refined discretization. Similar results were obtained for higher and lower susceptibility values. 82 Figure 8.4 B (left) and B (right) along a profile running in the x-direction at a height 2m above the cube centre (x = 0) for three grids of size 11 (top), 33 (middle) and 55 (bottom). There is one datum per grid cell. x z 3 3 83 3 -6 -4 -2 0 2 4 6 -6 -4 -2 0 2 4 -6 -4 -2 0 x 2 4, 6 -6 -4 -2 0 x 2 4 6 Figure 8.5 B (left) and B (right) differences (B - B i) along a profile running in the x-direction at a height 2m above the cube centre (x = 0) for three grids of size 11 (top), 33 (middle) and 55 (bottom). There is one datum per grid cell. z x 3 FVD 3 inlegra 3 Figures 8.4 and 8.5 show error trends expected from the 0(nh ) accuracy of the FVD methods. Let n equal the number of cells in a grid dimension (i.e. n=l 1, 33 and 55). n is inversely related to h and, thus, the errors in either FVD solution are expected to decrease as roughly n' . It is difficult to quantitatively compare the results with these theoretical expectations because the local truncation errors propagate though the grid. Furthermore, the exact solution for this test is not known. However, recall from Section 7.3 that the integral solution is expected to increase in accuracy with refined discretization because the assumption of piecewise continuous magnetization will improve. Thus, the three numerical solutions are expected to converge with increasingly refined discretization, and this qualitative result is observed. 2 84 8.2.3 Profiles away from a sphere and cube Here, the model involved a finely discretized, central cubic body. The grid was identical to that used in Subsection 8.2.1. In order to increase the accuracy of the solutions, the central, susceptible, cubic body was extended to contain 3 cells. The numerical solutions for this model were again compared to that of a concentric sphere of equal volume to the susceptible cubic body. The responses from the cube and sphere should agree more closely as the distance from these bodies increases. The relative differences between the numerical and analytic solutions in Figure 8.6 below show the expected result. After some expected effects close to the bodies, the relative differences reduce to zero as the distance from the bodies increases toward the grid boundary (i.e. profile 2 in Figure 8.1). Figure 8.6 also shows the expected 1/r dipole relationship between the secondary flux response and distance when far enough away from the cube. 3 3 Figure 8.6 shows the results for a susceptibility of #= 1. Similar results were obtained for higher and lower values of susceptibility. Figure 8.6 (top), B " (middle) and relative differences {B i-B iy )l& i (bottom) along a profile running in the z-direction away from the sphere/cube centre (z = 0) for a susceptibility of x 1 • B Z Z 1 / 3 mmerica = 85 ana tic ana ytic 8.2.4 Summary The tests performed in Section 8.2 have dealt with bodies for which the induced magnetization is always in or near the direction of the inducing field, regardless of the susceptibility of the body. In Section 8.3, a more complicated body is considered in which this is not the case. The tests performed in Section 8.2 have shown the need forfinediscretization in order to provide accurate FVD solutions when high susceptibility values are present. In the following sections, this result guides the design of the discrete model. 8.3 A prolate spheroid 8.3.1 Introduction In the second set of tests, the magnetic response was calculated for a susceptible prolate spheroid with vertical semi-major axis c - 10m and semi-minor axes a = b = 5m (an eccentricity of 2). The susceptibility values considered were ^"=0.01, 1 and 100. The inducingflux,Bo = (80.8, 0, 58.0)T, was chosen to maximize the rotation of the magnetization direction within the body for the high susceptibility. This inducingfluxis oriented 53.9° from the spheroid long axis and should result in a magnetization oriented 36.2° from the long axis (17.7° from the inducing flux) for ^=100. The situation is as depicted in Figure 8.7 below. profile 2 data profile 1 10m Figure 8.7 A schematic diagram of the second test scenario: flux profiles above and through a susceptible spheroid. 86 The spheroid was roughly represented as a collection of cubic cells of dimension lm within a uniform grid containing 55 cells. The FVD solutions were compared to an analytic solution for the spheroid calculated using the theory in Kaufman (1992) and Stratton (1941). 3 Below, Figures 8.8 through 8.11 show vector plots of the secondary flux responsefromthe exact body for both high and low susceptibilities. The vectors are all drawn with equal length and the magnitude of the flux is indicated by their colour. The colour bar units are on a base 10 logarithmic scale. The end of the vectors are indicated by a black dot. The outline of the exact body is indicated by a solid black line. -18.50 -9.25 0.00 x 9.25 18.50 Figure 8.8 A vector plot for a spheroid in an inclined inducing field. The plot is a vertical cross-section and the spheroid susceptibility is %= 0.01. The fluxes were calculated using the analytic solution. 87 Figure 8.9 A vector plot for a spheroid in an inclined inducing field. The plot is a horizontal cross-section and the spheroid susceptibility is %= 0*01. The fluxes were calculated using the analytic solution. 88 1S.5S- 9.25. 1.15 0.707 -lB.SjL -18.50 Figure 8.10 A vector plot for a spheroid in an inclined inducing field. The plot is a vertical cross-section and the spheroid susceptibility is ^=100. The fluxes were calculated using the analytic solution. 89 12.02 1.59 1.16 0.721 9.25 Figure 8.11 A vector plot for a spheroid in an inclined inducingfield.The plot is a horizontal cross-section and the spheroid susceptibility is ^=100. Thefluxeswere calculated using the analytic solution. The magnetization of the body can be inferredfromthefluxesinside the body. Recall that the inducingfluxwas directed in the +x-direction and oriented 53.9° from the spheroid long axis (i.e. 53.9° from the z-axis toward the x-axis). In Figures 8.8 and 8.9, for the low susceptibility value (j^=0.01), the self-demagnetization effects are negligible and the magnetization of the spheroid is parallel to the inducingfield.Recall that the inducingfieldchosen should result in a magnetization oriented 36.2° from the long axis for the high susceptibility (jf= 100). In Figures 8.10 and 8.11, for the high susceptibility value, the self-demagnetization effects are large and the magnetization of the spheroid has rotated awayfromthe inducingfieldtoward the long axis. 8.3.2 Profiles above a prolate spheroid Profiles of the flux above the spheroid (i.e. profile 1 in Figure 8.7) for each susceptibility are shown in Figures 8.12 and 8.13 below. For the FVD solutions, the boundary conditions prescribed were thefluxesfromthe exact spheroidal body. This helped reduce the error in the solutions. 90 —— Spheroid —a— B Form. H Form. Figure 8.12 B (left) and B (right) along a profile running in the x-direction at a height of 12m above the spheroid centre (x = 0) for susceptibilities of x 0.01 (top), X=\ (middle) and J=100 (bottom). X Z = 91 Figure 8.13 (left) and B (right) differences (B i-B i ) along a profile running in the x-direction at a height of 12m above the spheroid centre (x = 0) for susceptibilities of %= 0-01 (top), J=l (middle) and #=100 (bottom). Z numerica ana ytic 8.3.3 Profiles through a prolate spheroid Here, the response of interest is that within the spheroid. Stratton (1941) shows that the induced field inside the magnetic spheroid is uniform and parallel to the inducingfieldHo and is given by JJ_ 5«i Z// 2 ! ^ + 2// 5o£ + Z// 2 where 92 2 (g_2) r J^W,' ds A A 4 = r° ds as r ds " <8 fcyv (8 3a) " 3c) and R = *J(s + a )(s + b )(s + c ) 2 2 (8-4) 2 s In (8-2), jU\ is the spheroid permeability and JU2 is the background permeability. With //1 =/A)( Z) #2=M)> (8-2) becomes 1+ H _ a n d H 0 v , H o.v ! H o - /-g_ -j 5 where the factors multiplying x are the self-demagnetization factors referred to in Clark and Emerson (1999): _ abc N =—A r x abc . > v=-^A N . abc . . N =—A T z 3 / D (8-6) The secondary flux is calculated as B, = B - B o = / / H - B (8-7) 0 and therefore requires knowledge of the model permeability at each observation location. Below, Figures 8.14 and 8.15 show flux profiles through the spheroid for susceptibilities of X= 0.01 and J=100 respectively. These profiles would intersect the exact spheroid boundary at x = ±5 and z = ±10. However, due to the finite cell sizes, the boundaries of the discrete body lie at x = ±4.5 and z = ±9.5. 93 0.8 1 0.6 Figure 8.14 Flux profiles running in the x-direction (left) and z-direction (right) through the centre of a spheroid (x = z=1) with susceptibility ^=0.01. There is one datum per grid cell for the numerical solutions. Figure 8.15 Flux profiles running in the x-direction (left) and z-direction (right) through the centre of a spheroid (x = z=l) with susceptibility ^=100. There is one datum per grid cell for the numerical solutions. 94 As expected from the results of Section 8.2, errors in the FVD solutions increase with susceptibility. From (8-5) and (8-7), the theoretical value for the internal flux for the high susceptibility spheroid is B/„, = (112 0 178)T, which has a magnitude of 288T. The values given by the FVD solutions are reasonably close to these theoretical values. As expected, the consistency between theory and the FVD solutions was increased with refined discretization. The profiles in Figures 8.14 and 8.15 show concordance with the interface conditions introduced in Section 2.3. The normal component of the flux should be continuous across an interface at which the susceptibility is discontinuous. The tangential component of the flux should be discontinuous across such an interface. In Figures 8.14 and 8.15, the top-left and bottom-right profiles show components normal to the interface (i.e. the boundary of the spheroid). The topright and bottom-left profiles show components tangential to the interface. Although the susceptible body in the discrete model is only an approximation of a spheroid, the FVD solutions contain an almost uniform internal flux with the exception of some edge effects. This is evident in Figures 8.16 through 8.19 below, which show identical vector plots as in Figures 8.8 through 8.11 but for the flux formulation solutions. The outline of the discrete body is indicated by a solid black line. 95 -18.50 -9.25 0.00 x 9.25 18.50 Figure 8.16 A vector plot for a discretized spheroid in an inclined inducing field. The plot is a vertical cross-section and the spheroid susceptibility is X= 0.01. The fluxes were calculated using the flux formulation. 96 Figure 8.17 A vector plot for a discretized spheroid in an inclined inducing field. The plot is a horizontal cross-section and the spheroid susceptibility is %= 0.01. Thefluxeswere calculated using thefluxformulation. 97 18.58. -18.50 -9.25 0.00 x 9.25 18.50 Figure 8.18 A vector plot for a discretized spheroid in an inclined inducing field. The plot is a vertical cross-section and the spheroid susceptibility is ^=100. The fluxes were calculated using the flux formulation. 98 Figure 8.19 A vector plot for a discretized spheroid in an inclined inducing field. The plot is a horizontal cross-section and the spheroid susceptibility is ^=100. The fluxes were calculated using the flux formulation. 8.3.4 Summary The tests performed in Section 8.3, along with those in Section 8.2, have demonstrated the competency of the FVD and integral solutions to the forward problem, provided the grid is designed appropriately. However, there remains an issue relating to the incorporation of the prescribed boundary conditions. This will now be discussed. 8 . 4 Consistency of the boundary conditions Recall from Chapter 3 that for the FVD methods the inducing flux enters the problem through the boundary conditions. The normal component of the inducingfluxis prescribed on the grid boundary but no attention is given to the tangential components. It is desired that the fluxes calculated by the FVD methods be consistent with these tangential components. One may consider the approach of specifying the tangentialfluxesjust beyond the boundaries to force this "consistency of the boundary conditions". However, one can show that doing so only 99 reduces the effective problem by one layer of cells. Refer to the two-dimensional problem in Figure 8.20 below. Given the normal and tangential boundary fluxes (solid arrows in Figure 8.20) on and just below the original (larger) grid, the normal fluxes one cell deeper (i.e. dashed arrows in Figure 8.20) can be solved through direct use of the discrete divergence equation without any knowledge of the model susceptibilities. Hence, specifying the normal and tangential boundary fluxes on and just beyond the original grid boundary is practically equivalent to specifying the normal boundary fluxes on the smaller grid boundary (i.e. the shaded region in Figure 8.20). Bo Bo ir BJ- Bo Bo ir "V > Bo B^ ir B B Figure 8.20 Specification of normal and tangential fluxes on and just below the grid boundary for the flux formulation. A grid was designed that would help reduce solution errors associated with discretization and boundary condition accuracy. This symmetric grid contained 51 cells of varying dimensions, as illustrated in Figure 8.21 below. 3 100 > 3.0m cells 2.0m cells } 1.5m cells t. 1.0m cells i 1.5m cells 2.0m cells > 3.0m cells susceptible region data Figure 8.21 A symmetric grid containing 51 cells of varying dimensions with a central susceptible region containing 5 cells. 3 3 A central, susceptible, cubic body was placed in the grid, as indicated in Figure 8.21. The boundary conditions used were those calculated through the congruous sphere method of Section 4.2. These boundary flux values were compared to those calculated using the integral domain solution of Chapter 7. This comparison indicated that the grid was large enough that any modelling inaccuracy due to boundary condition inaccuracy could be neglected. The normal flux was prescribed on the grid boundary and the goal was to investigate the flux tangential and close to the boundary. If the consistency of the boundary conditions is withheld, then the tangential flux should equal thatfromthe congruous sphere (within a small percentage of expected error). The profiles for the numerical solutions were positioned along (i.e. parallel to) the grid boundaries one half cell away, as indicated in Figure 8.21. The central cubic susceptibility considered was ^=0.01 and the inducing flux used was Bo = (0,0,100)T. A low susceptibility 101 value was used to reduce the inaccuracy associated with large susceptibility values. The location of the profiles are as depicted in Figure 8.22 below. Profile 1 (down) Profile 3 Figure 8.22 A schematic diagram of profile positioning for the test of the consistency of the boundary conditions. Profile 1 measures B along a profile in the x-direction along the top (-z) grid boundary. Profile 2 measures B along a profile in the x-direction along the western (-y) grid boundary. Profile 3 measures B along a profile in the z-direction along the western (-y) grid boundary. Below, Figure 8.23 shows these profiles of the flux component tangential to the closest grid boundary for the model in Figure 8.21. x z z 102 x 10" Dipole —*— Int. Form. B Form. H Form. -40 -20 x 10" 5 0 x 20 40 20 40 00 -40 -20 Figure 8.23 Single component flux (left) and flux difference ( B „ „ / - B ^ / ) (right) profiles for a central cubic body of susceptibility x 0-01 • Profile 1 (refer to Figure 8.22) is at top, profile 2 is at middle and profile 3 is at bottom. OTerica 0 e = As desired, the solution for the tangential flux near the grid boundaries is consistent with the known solution. When similar tests were performed with no central susceptible body (i.e. the entire model had permeability of free-space), the resulting difference profiles were zero profiles (plus a negligible level of error due to machine precision and rounding). These results indicate that the method is withholding the consistency of the boundary conditions. 8.5 Conclusion Three methods have been developed for solving the full magnetostatic forward problem: a solution in the integral equation domain and two FVD solutions in the PDE domain (i.e. the flux and field formulations). All three solution methods suffer from inaccuracy due to finite discretization. This source of inaccuracy has less effect in the integral solution. Furthermore, the FVD solutions suffer from inaccuracy dependent on the model permeability values, whereas the 103 integral solution does not. These methods can now be considered for use within an inversion algorithm, to be discussed in Chapter 9. A tradeoff exists between solution accuracy and solution complexity. Complex forward modelling methods that are very accurate yet relatively slow are not optimal for use in an inversion algorithm, in which the forward solution must be calculated many times. What is required is a fast method that is accurate enough to provide reasonable inversion results. Therefore, the relatively fast FVD methods will be used and the advice in Section 6.4 will be followed to increase the accuracy of the forward solutions. Of the FVD methods, the flux formulation is more appropriate than the field formulation for numerical solution of the magnetostatic forward problem. The flux formulation is generally more accurate for the intended applications of modelling the magnetic response for high susceptibilities. This was theorized in Section 6.3 and is evident in the test results in the current chapter. Initially, however, both formulations were considered for use in the inversion algorithm. 104 Chapter 9 Inversion 9.1 Introduction Now that a refined and thoroughly tested solution to the forward problem is available, it can be used within an inversion algorithm. Before proceeding, I return to the summary of the FVD methods for solution of the forward problemfromSection 5.3 and 5.4. For the secondary flux formulation, the discrete equations to solve for the potentials andfluxeswithin the discrete grid are DMG<|>, = -Oub"]Vl-I)Bo + g (9-ia) B^MGkl),+ (#)"'M-I)Bo (9-lb) , For the secondary field formulation, the equations are DMG(t) = -D(M-/A)I)Ho + g H = G^s (9-2a) (9-2b) s s RecallfromSection 5.5 that the magnetostatic forward problem in (9-la) or (9-2a) can be written as the system A(m)u = q(m) (9-3) where m is the model of the spatial distribution of magnetic susceptibility within the Earth and u is a vector of unknown scalar potentials within the discrete grid. Once u is determined, the data of interest (i.e. certainfluxcomponents at specified locations) are calculated through some nonlinear operation, which will be written as A pred = Q[B ]= Q[ C(m)u + b(m) ] (9-4) S Here, df is the data predictedfromthe Earth model. The matrix C and vector b convert the potentials in u into the secondary flux values B throughout the discrete grid. This step corresponds to (9-lb) and (9-2b). The function Q then interpolates for specific flux quantities at specified measurement positions. red s (9-3) and (9-4) can be combined to write d pral = F[m] = Q[ C(m)A(m)-' q(m) + b(m) ] F is the nonlinear forward modelling operator that mapsfrommodel space to data space. 105 (9-5) As discussed in Section 1.1.2, the inverse problem is one of determining a feasible distribution of subsurface susceptibility, m, that could have given rise to magnetic survey data, d . A solution to this inverse problem will now be discussed. obs 9.2 An unconstrained optimization approach to the inverse problem 9.2.1 Non-uniqueness The principal difficulty in finding a solution to the magnetic inverse problem is its inherent nonuniqueness. There are infinitely many models that could adequately reproduce the data and there are a number of different sources of this non-uniqueness. First, a consequence of Gauss's theorem is that if a field distribution is known on the boundary surface of a region, there are infinitely many equivalent source distributions in the region that can produce the known field (Li and Oldenburg, 1996). A second important source is the fact that the number of survey measurements is finite and the data contain some level of inaccuracy. Therefore, the data only needs to be fit to a certain degree, defined by their expected errors. 9.2.2 The discrete Earth model The three-dimension Earth region of interest is divided into a large number of discrete cells, each with constant susceptibility % so that complicated bodies can be constructed. The cells need to be small enough that their size does not significantly affect the final outcome of the inversion. The number of model cells, nc, is set much larger than the number of data, N. The number of unknowns %j is then larger than the number of equations in (9-5) and the inverse problem is underdetermined. Therefore, the system (9-5) has infinitely many solutions for m and, as such, the inherent non-uniqueness is allowed for. h 9.2.3 Dealing with non-uniqueness Although there are infinitely many models that can adequately reproduce the data, many of these will be unreasonable for a specific application. To reduce the number of acceptable models, the requirements for a feasible model are considered. Thefirstis that the data predicted by the recovered model should fit the observed survey data to within a degree justified by their estimated uncertainties. The second requirement is that the recovered model should be compatible with any a priori knowledge of the subsurface geology or physical property distribution. This often requires that the model be smooth in all spatial directions. As discussed in Li and Oldenburg (1996), the inverse problem can be formulated as an unconstrained optimization in which an objective function is minimized to obtain a feasible susceptibility distribution model. The objective function is designed so that the data are fit to an acceptable degree and the recovered model has desired spatial characteristics. It has the form 106 0(m) = 0dm) + /30 (m) (9-6) m where 0 is the total objective function to be minimized; m is the model of subsurface susceptibility; 0d is the data misfit which measures how well the model fits the observed data; 0 is the model objective function which measures how closely the model conforms to the desired spatial characteristics; and P is a regularization parameter. m P takes any positive real value and controls the relative size of the model objective function and the data misfit. It is therefore referred to as the "tradeoff parameter". The inverse problem is solved by finding the model that minimizes (9-6). Because the survey data contains some level of uncertainty, a decision must be made on how well the data predicted by the recovered model should fit the observed survey data. Solving the inverse problem requires optimization of (9-6) for several values of (5 until a model is found that yields the desired level of misfit. 9.2.4 The model objective function An appropriate generic model objective function that incorporates the requirements discussed above is (9-7) dv where R is the region occupied by the model; (Xs, (Xx, Oy and are coefficients specifying the relative importance of the various terms; m(r) is the model function of subsurface susceptibility; m f is a reference function; W , W , W and W are spatially dependent weighting functions; and W is a depth weighting function. re s x v 2 r This scalar-valued model objective function can quantitatively distinguish between different models on the basis of various characteristics. The ability to ensure that the recovered model is compatible with known geological information is achieved through the spatial weighting functions and the reference model. 107 The first term in (9-7) is a "smallness" or "closeness" term. It controls the closeness of the model m to the reference model m f, or for m /= 0 it controls the "size" of the model (i.e. the total amount of susceptibility in the model). The remaining three terms are "smoothness" terms for the x-, y- and z-directions: they control how smooth the model is in the three spatial directions. Traditional £ norms are used to force the susceptibility distribution to vary smoothly in accordance with general expectations of the subsurface geology: unnecessary complexity should be avoided unless there is a priori geological information suggesting otherwise. Other norms can be used if, for a specific application, one wishes to allow non-smooth variation. The use of non-^2 measures in a nonlinear inverse problem is discussed in Farquharson (1998). re re 2 Depth weighting is crucial for magnetic inversion because without it, the resulting susceptibility distribution will be concentrated at the surface. This is a consequence of the non-uniqueness of the problem and of the kernel functions for the magnetic problem, which decay rapidly with distance from the magnetic medium. Any magnetic data collected above the Earth's surface can be fit with either smaller susceptibility values near the surface or with larger susceptibility values at depth. The model objective function has been designed to yield a model with small susceptibility values and if left unaltered, any inversion would prefer the near-surface option. When depth weighting is introduced to counteract the decay of the magnetic kernels, the inversion becomes free to distribute magnetic susceptibility at depth. 9.2.5 The discrete form of the model objective function To generate a numerical solution, the model objective function in (9-7) is discretized using the same grid as used for the forward problem. (9-7) becomes =«,|W W (m-m,. ,)f s ? r + a , | | W W ( m - n ) | | + ... 2 x r V (9-oa) 2 a I W W (m - m )|| + a |\V W (m - m y y r ref z Z r ref )f or =|W (m-m^,)| =(m-m,. .)W W (m-m,. ,) 2 m T e/ m <; (9-8b) where m is the model vector of susceptibility values; mrefis a reference model; W , W , W and W are spatially dependent weighting matrices; W . is a depth weighting matrix; and W is a function of the weighting matrices and the a coefficients. s x y z r m The discrete model objective function can be adjusted to incorporate a priori information by way of the reference model, the alpha parameters and the weighting matrices. This allows different aspects of model space to be investigated. 108 w, is a diagonal matrix with diagonal elements •yjw v i . v, is j the volume of the t h cell and the vv,- weights can be increased in regions where there is more confidence that the model should be close to the reference model. W , W and W are finite difference operators. From (9-7), the smoothness term for the xdirection is of the form x y 2 a. f W. JR dm dx dv (9-9) The discrete form of the derivative can be defined as a "slope" between the susceptibility values in adjacent cells, which will be called cells / and j. Discretization yields an equation of the form f J ^ /7v dx v AAxr ^ y y y V A> j A>\ Ax Ax-AyAz = Xfc-z) ^^ 2 (9-10) Hence, W takes the form of a matrix with elements of the form x ±Jw. Ay • Az (9-H) Ax Similar expressions can be obtained for the elements of W and W . The w weights can be increased to specify regions where there is more confidence that the model should be smooth in the x-direction. Again, this allows a prior information to be incorporated into @ . y z Xii m The square-root in the elements of W , W , W and W results from the fact that these operators are applied twice in the discrete objective function (9-8a) and their elements are effectively squared. (9-8) is then consistent with (9-7). s x y z 9.2.6 The data misfit Because the survey data contains some level of uncertainty, a decision must be made on how well the data predicted by the recovered model, d = F[m], should fit the observed survey data, d . The predicted data should not match the observed data exactly because if either is inaccurate then doing so ensures that the recovered model will be incorrect. The goal of the inversion is to find a model that reproduces the true data, d (i.e. the data that would be measured if measurements were exact). The relationship between the observed, predicted and true data is pred obs tn,e 109 ' ^modelling ' '•measurement = F[m] + e (9-12) Here, e surement are the measurement errors or noise associated with acquisition of the survey data, and e deiimg are the errors associated with the numerical solution to the forward problem. mea mo Let the error in the i' datum have standard deviation oj. The data misfit is defined as h (9-13) V J d is a vector of N observed data values. W is an TV by N diagonal weighting matrix with the t diagonal element equal to 1/cr. If the uncertainty in the f datum is large then it requires a small weight so that df is not fit close to d° . Conversely, if the uncertainty in the t datum is small then it requires a large weight so that df is fit close to df*. The 1/cr term in (9-13) accomplishes these requirements. 0DS d ed bs ed The forward modelling error is more difficult to define than the measurement error and one hopes that it may be neglected when defining the standard deviations o~i. Use of the flux formulation, as opposed to the field formulation, may help with this (recall from Section 8.5 that the flux formulation is more accurate for high susceptibilities). The errors in the survey data are assumed to be Gaussian, uncorrelated and have zero mean. Under these assumptions, @d in (9-13) is a chi-squared variable with expected value N and variance IN. Therefore, a suitable approximate size for the data misfit is N. 9.2.7 The positivity constraint For magnetics there is more a priori information that is available: because we are interested in ferromagnetic material, the model susceptibility values are required to be positive. The problem becomes minimize subject to <P= & + P®m %i>0 , i=\ ...nc (9-14a) (9-14b) d The positivity constraint further reduces the non-uniqueness of the problem and maintains physical reality. There are two options for imposing positivity. Thefirstis to use a logarithmic barrier approach, discussed in Li and Oldenburg (in press). A second regularization parameter X is introduced and the total objective function is modified by adding a logarithmic barrier term that gets large as the x parameters approach zero: 110 nc $ =$ + d B0 -l/£m(z ) m (9-15) l Here, y is a constant scaling factor and the barrier parameter, X, can take any positive constant real value. It now becomes undesirable for the optimization process to choose models with susceptibilities that approach negative values. Minimization begins with A. large. As iterations proceed, A is reduced so that A = 0 upon termination of the optimization procedure. The originally described objective function in (9-14a) is minimized and yet all model quantities Xi are positive. A second option for imposing positivity that avoids the complication introduced by adding a second regularization parameter is to re-parameterize the problem. Now, new parameters on the range (-oo oo) are solved for but susceptibilities derivedfromthese parameters lie in the range [0, °°). Any operator that maps a model quantity m,e (-oo oo) to # 6 [0,°o) is viable. Here, I use the square-root operator: 5 5 1/2 (9-16) (X = m, >0) 2 Instead of dealing with susceptibility values, the model holds values of square-root susceptibility. Square-root values are dealt with exclusively throughout the inversion algorithm. Only at the conclusion of the optimization process is the final recovered model of susceptibility values calculatedfromthe square-root values, thus resulting in an all-positivefinalmodel. Positivity can also be introduced through use of the natural logarithm of susceptibility: mi-ln{Xi), mj = exp(Xi)>0. However, the logarithmic function is much less sensitive at high values of susceptibility and is therefore less appropriate. As discussed in Gill, Murray and Wright (1995), one must take care when eliminating constraints through such a transformation of the problem. Some general difficulties that may arise are an increase in the degree of nonlinearity of the problem, an introduction of additional local minima and stationary points, and the Hessian may become singular or ill-conditioned in a region of interest. These may make the problem more difficult to solve. However, a major advantage provided by this method is the simplification in coding compared to the logarithmic barrier alternative. If is for this reason that positivity was imposed in this thesis though use of the square-root transformation. 111 9.3 A n iterative approach to the optimization problem 9.3.1 A Gauss-Newton approach The resulting optimization problem is nonlinear and requires an iterative approach. A standard Gauss-Newton approach is chosen in which a perturbation direction Sm is calculated at each iteration k and the model is updated as ( * » =<*> + ^ ( _ ) + m m 9 17 Introduction of the positivity constraint through re-parameterization creates some complications in the process. Here, a discussion of the Gauss-Newton method is given without including the positivity constraint. Subsection 9.3.2 discusses the complications introduced by the reparameterization. The total discrete objective function to be minimized is 0 = - J W (F[m] - d° ) f + 1 X bs d ||W (m - m ) f m (9-18) ref where the factors of one-half have been introduced to simplify the derivatives. Using the equalities (9-19a) dm ^ P ) =p dm (9-19b) the gradient function can be written as g(m) = V 0 = J W W (F[m] - d° ) +fiSVlW (m - m ) T m d T (9-20) bs d m ref Here, J is the sensitivity matrix, defined as J = i ^ dm = f S m] dm ( 9 2 ) ) which indicates how the data predicted by the model changes as the model changes. Calculation of the sensitivity matrix is a major aspect of the inverse problem and is covered in Chapter 10. Continuing from (9-21), the Hessian is defined as 112 H = V g = (V J )w W [F[m] - d° ) + J W W J + /W W T m d m bs T T d d (9-22) T T d m m The Gauss-Newton method makes the approximation F[m+Sm] = F[m] + Jdm (9-23) (9-23) is a Taylor expansion truncated to first order. The assumption is also made that J(m +<5m) = J(m ) w (9-24) w (9-24) will be a good approximation provided the model perturbation aSm in (9-17) is not too large. Thus, V J is set to zero and the first term in (9-22) is neglected. This approximate Hessian is T m B = J W W J + BWlW T d =H T d m (9-25) which is symmetric. The Gauss-Newton method combines (9-17) and (9-18) and solves for dm that minimizes 0: 0 = i||w>[m<*>+ < Sn]-d*)f + ||W (m<*> + <*n-m m Jf (9-26) m is the current model iterate. To simplify the discussion, the superscript will be temporarily dropped from m. To find the minimizer (3m, the gradient of 0 in (9-26) with respect to dm is set to zero and (9-23) and (9-24) are used to yield w V ^ 0 = J(m + dm) W W (F[m + Sm] - d° ) + ,9W W (m + dm - m ) T T d bs T d m ref = J(m) W 'W (F[m] + i{m)dm - d° ) + y9W W (m + £m - m,. .) T bs d d T m ?/ = (J W Wd J + fiWl Wm )Sm + J W Wd (F[m] - d° ) /W W (mT T T d d T bs T + m m m ) ref = B^m+g = 0 (9-27) Hence, the model perturbation dm is obtained by solving B(m * )^m = -g(m ) ( ) (9-28) a) In using this approach, one assumes that the computational costs involved in retaining the (V J ) term in (9-22) are not significantly outweighed by the potential gain in convergence of the inverse solution. This will be the case when near the optimal solution. The assumption (923) will also be more appropriate when near the optimal solution and therefore, the inversion T m 113 algorithm was designed to stay close to the optimal solution at all times (as will be discussed in Section 9.7). In practice, the model is updated as m (A+1) = m + aSm (9-29) w where the scalar a is determined by a line-search. The Gauss-Newton method approximates the objective function in the local vicinity of the current model as a quadratic function. The method then calculates the optimal position for this quadratic function and sets the perturbation direction toward that location. Therefore, the best value of a is expected to be close to unity, especially if the objective function is locally quadratic in nature. However, a line-search should be used to ensure that aSm is a descent direction (i.e. the perturbation in (9-29) reduces the objective function). An iterative method is required to solve the over-determined system (9-28). Solving (9-28) at the k iteration is equivalent to solving the least squares problem th Sm(*) ' - W ( F [ m(*) ] - d d -#w (im - m m ofo )^ w (9-30) ref which is of the form Ax = b. The Conjugate Gradient Least Squares (CGLS) algorithm is used to solve (9-30). CGLS is discussed in Paige and Saunders (1982), Hestenes and Stiefel (1952) and Bjorck, Elfving and Strakos (1998). The pseudo-code for the CGLS solver, adapted from CGLS1 in Bjorck, Elfving and Strakos (1998), is provided in Figure 9.1 below: 114 For some initial guess x , compute (0) r (0) _ s (0) = b- Ax =A P " ( 7o = s (0) Tr (0) 2 <0) for k = l,2,.. Ap w (*) q x r (*) 2 = (*) _ r s*> = (*-D Ar ( T (k) /|| - \<tol if s*> ( s STOP endif Yk = ||s <*)| 2 = n/7 - i p<*> = (*> + A p end + 1 k = s ( i ) Figure 9.1 The Conjugate Gradient Least Squares method. The CGLS routine is started from some initial guess (x in Figure 9.1). The initial guess used was a zero vector. The major computational requirements of the method involve two matrixvector products and two vector dot products per iteration. The two matrix-vector products require computation of the products Jx and J x for some vector x. In fact, the matrix J never needs to be constructed, which greatly simplifies the solution. This is discussed further in Section 10.5. (0) T The stopping criterion chosen is not the norm of the residual vector r in Figure 9.1. The residual norm is a measurement of the solution accuracy for (9-30). The stopping criterion chosen provides a measure of the solution accuracy for the equation of interest in (9-28). The residual norm is guaranteed to decrease monotonically with k. However, the stopping criterion can exhibit large oscillations when the condition number of A is large (Bjorck, Elfving and (A) 115 Strakos, 1998). Typical CGLS convergence curves for solution of (9-30) are shown in Figure 9.2. 5 10 15 iteration (k) 5 10 15 iteration (k) 5 10 15 iteration (k) 5 10 15 iteration (k) 20 Figure 9.2 Typical CGLS convergence curves for the relative residual and the stopping criterion plotted on linear (left) and logarithmic (right) vertical axes. As expected, the relative residual norm decreases monotonically in this example. Furthermore, the stopping criterion shows oscillating behavior. This was a common occurrence. Often, the stopping criterion can drop to an acceptable value while the residual norm remains close to its initial value. This makes a stopping tolerance on the residual norm a poor choice. In the inversion code, the CGLS algorithm was continued up to a maximum number of iterations while looking for a stopping criterion below a specified tolerance. Because the stopping criterion oscillates, the best solution (i.e. the solution with the lowest value of the stopping criterion) my not be the final solution. Therefore, the best solution is saved in memory until completion of the procedure. 116 9.3.2 Complications arising from the positivity constraint Recall from Section 9.2 that positivity is imposed through a re-parameterization of the problem. Instead of dealing with susceptibility values, the model holds values of square-root of susceptibility. Consider two model vectors m and k with k = (*,,J ,...,*,J 4xZl m = (V#T> (9-3 la) T 2 (9-3 lb) such that k =m m =k (9-32a) (9-32b) 2 m Here, a superscript on a vector denotes an element-by-element to-the-power-of operation. The objective function should always deal with (e.g. measure differences between) susceptibility values and is therefore * = + P* = I W (F[m] - d° ) f + B\\ W (k - k .) f bs m d m II II re/ II / II ^ ( 9 _ 3 3 ) = JW [F[m] - d ) J + ^||W (m - m ) || obs 2 d m 2 ref However, throughout the inversion algorithm, m is dealt with exclusively. Perturbations are made to the square-root values, which are allowed to be positive or negative. Only at the conclusion of the optimization process is the final recovered model of susceptibility values calculatedfromthe square-root values in m using (9-32a). The squaring operation results in an all-positive recovered susceptibility model. Use of this method results in some complications to the Gauss-Newton approach. I now define two quantities, Ju and J , which are the sensitivity matrices for the models in (9-3la) and (931b) respectively. They are defined by m K - _ ^ - _ ^ f - _ K S am dk am (9 .34) Here, the matrix S is defined as S = dk dm = 2diag(m) (9-35) 117 where the function diag(m) takes the elements of the length nc vector m and places them along the main diagonal of an nc by nc matrix. (9-35) is easily derivedfromconsideration of (9-31) and (9-32) to give 1/2 V d(Zj) 1 fl Let g = V (P and g =V 0 model m in (9-3 lb) is k (9-36) dXi dJXj k m g =V m m m . The gradient of the objective function with respect to the new 0 = SV<Z> = Sg k k = SJl W W [F[m] - d° ) + /?SW W (k - k,. ,) T bs d (9-37) T d m e = J W W (F[m] - d° ) + BSXV W (k - k ) T T m bs d T d m m ref Continuingfrom(9-37), the Hessian becomes ^ H rim = S =^-*=i-[ 3k dm ^ S + 3k 3k S g ]S k 2^-[diag(m)g \S k 3k (9-38) = s(V J ) W W (F[m] - d° )S + SJ^W W J^S + BSW W S + 2diag(g ) J m T k bs d T d d T d m m k = SH S + 2rf/ag(g ) k k where I use a subscript after a bracketed term being differentiated to indicate that the specified quantity is treated as a constant in the differentiation. In (9-38), the following derivation was used to obtain the final term: diag(m)w = (9-39a) i-(,ft«(»)w),=%^ =H. lm, dm (9-39b) A dm. [diag{m)w] = diag(w) (9-39c) w 118 In order to write an equation equivalent to (9-28), the approximate Hessian must now ignore not only the higher derivative term in (9-38) but also the final diagonal term. However, because this final term contains the gradient, it is not expected to be significant when near the optimal solution (where the gradient approaches zero). We then have B =SB S m (9-40) k 9.3.3 The Steepest Descent method as backup Recall from Subsection 9.3.1 that the Gauss-Newton method approximates the objective function in the local vicinity of the current model as a quadratic function. The method then calculates the optimal position for this quadratic function and sets the perturbation direction toward that location. Hence, when seeking a minimum solution to the optimization problem, the Gauss-Newton method requires an objective function of positive curvature (i.e. the gradient decreases as the objective function decreases). This will be true if B is positive definite (i.e. contains entirely positive eigenvalues). As verified by experiment, there are situations in which B is not positive definite and the total objective function has regions of negative curvature (i.e. the gradient increases as the objective function decreases). If situated at such a location, the Gauss-Newton method will fail because the resulting perturbation direction will lead toward the maximum of the approximate quadratic. Taking such a direction would cause the objective function to increase, which is counterproductive to the minimization. This problem is overcome through use of the Steepest Descent method when negative curvature is encountered. The Steepest Descent direction is simply the negative of the gradient. This method is fail-safe but can be much less efficient than the Gauss-Newton method in many circumstances. 9.4 A split model formulation As discussed in Section 6.4, a responsibly designed grid should have free-space padding cells in the outer portion of the grid to increase the accuracy of the approximated boundary conditions. There will also be free space model cells in the region of the grid corresponding to the aboveground air region. The susceptibility of these model cells needs to remain fixed throughout the inversion. Furthermore, we may feel that the susceptibility of certain cells is known well enough to keep them fixed. To accomplish this, the grid is split into two regions. An inactive region, R/, contains the outer free-space padding region and any other cells to remainfixedduring the inversion. An active region, R , contains all the cells with unknown susceptibility values. The model vector is split A 119 into mi and m , which hold the susceptibility values contained in Ri and RA respectively, mi and m are length nc\ and nc respectively so that the total number of model cells is nc = nci + nc . m can be expressed as A A A m — R|iTi[ +R m A A (9-41) A where Ri and R A are sparse matrices of ones that combine mi and m into m. For example, consider a small illustration in which the model contains three susceptible cells and the second is inactive. (9-41) is then A fl m= m = 2 \ m-, ) 3 °1 f 1 {m ) + 0 0 2 l°J m ^ = R,m, + R m A (9-42) A 0 1 R A is formed from an identity matrix by removing the columns that correspond to the inactive cells. Hence, R A contains only zero elements in each row corresponding to an inactive cell. Ri is formed in a similar manner. One can easily verify that m can be decomposed with these same matrices using mi = R] 'm m =R m (9-43a) (9-43b) T A A For the small example in (9-42) we have r R m= A \ 0 0^ (9-44) = R, 0 0 1 v y 3 For the split formulation, the objective function is changed to 2 W>[m]-d )f +||w (m -m\J (9-45) 2 ofa m A W needs to be altered to W to account for the change in size of the model vector on which it now operates. The alteration is performed by simply removing the columns of W that are associated with the inactive cells. The model objective function measurements in (9-45) and (933) are identical provided that m.\,ref mi. m m m = The same procedure of Subsection 9.3.1 can be followed, now taking derivatives with respect to m , to arrive at an equivalent expression to (9-28) to be solved for <5hiA- With 8m\ = 0, the total model vector m is updated as A 120 m R,m, + R ( m A A w + aSm\ )=m + (M Sm w x (9-46) A The appropriate sensitivity matrix is now BF[m] dm dF[m] d m dm A J —(R,m +R I dm A m A ) =J R A (9-47) A and its transpose is JI = ( J R ) A (9-48) T The new sensitivity matrix J A is related to the original sensitivity matrix J through the matrix R . When used as a right multiplier, R simply removes the columns of J that are associated with the inactive cells. Similarly, when used as a left multiplier, R A removes the rows of J that are associated with the inactive cells. A A T 9.5 The regularization parameter Recall from Section 9.2 that a regularization parameter (5 is needed to pose the inverse problem as an unconstrained optimization problem, fi is referred to as the tradeoff parameter: choosing its value represents a tradeoff between minimizing the misfit &d and minimizing the model objective function & . A typical representation of this relationship is shown in Figure 9.3 below. This curve, named a "Tikhonov curve", represents optimized conditions (i.e. ||g(m,/?)|| = 0) for a continuous range of /3 values. m m Figure 9.3 A typical Tikhonov curve. 121 Finding an appropriate value of f3 that results in a solution that meets our requirements is an important aspect of the inverse problem. Because the survey data contains some level of uncertainty, a decision must be made on how well the data predicted by the recovered model should fit the observed survey data. Solving the inverse problem requires optimization of (9-45) for several values of until a model is found that yields the desired level of misfit, <P*. The inversion process will reduce the data misfit until this target misfit is attained within a certain tolerance (this is discussed further in Section 9.7). Hence, the inversion procedure contains two levels of iterations: in the outer iterations, /3 is altered to search for the target misfit; in the inner iterations, the model is perturbed to optimize the objective function for the current value of /3. A discrepancy principle method is chosen for defining the target misfit, 0*. Errors in the data are assumed to be Gaussian and uncorrelated with zero mean. Recall from Section 9.2 that under these assumptions the expected value of the misfit is the number of data, N. Thus, we expect 0* ~N but to allow more flexibility we specify 0* = chifact x TV (9-49) where the parameter chifact can take any non-zero value but is expected to be close to unity (presuming the assumptions on the data errors are accurate). Adjustments of the multiplication factor chifact can be made so that the target misfit is consistent with any knowledge of supplied data errors. If data are noisy or they do not adhere to the assumptions made above then the target misfit may have to be larger than TV in order to obtain an acceptable model. One then sets chifact > 1 so that the data are fit less well. This results in susceptibility models that have smaller amplitude and are generally closer to the surface. If the data errors have been overestimated then one sets chifact < 1. The resultant inversion produces a model thatfitsthe data better and the model will exhibit more structure. 9.6 The inversion algorithm The inversion algorithm is as follows. Given: i) ii) iii) iv) v) a defined grid with active and inactive regions specified, data observations specifying position, flux measurement, flux component and estimated uncertainty, a reference model, the inducing flux, and a target misfit, 0d : 122 1) Construct the following invariants associated with the forward problem: D, G, B These quantities were discussed in Chapter 3. 0 2) Calculate the following invariants associated with the objective function: W , W (consisting of W „ W , W , W and W ) These quantities were discussed in Section 9.2. d m x y z r 3) Chose an initial model m and estimate an initial beta value /? . Subsections 9.7.2 and 9.7.3 discuss how these are performed. 4) Calculate d 5) Calculate a Gauss-Newton model perturbation dm through a CGLS solution to (9-30). 6) Check that dm is a descent direction (i.e. g<5m < 0). If not then revert to a Steepest Descent direction. 7) Perform a line-search for the step length am (9-46) such that 0(a) is reduced a significant amount. The line-search is discussed in Subsection 9.7.4. 8) Update the model as in (9-46) and calculate d model m * . ( 0 ) , & , 0 , 0 and V 0 for the current ft and m. pred d m m T , <P , 0 , 0 and V d> for the updated pred ( 9) (0) d m m +1) If close enough to the optimal solution for the current ft then continue to 10). If not, then return to 5). This step is discussed in Subsection 9.7.5. 10) Check if the misfit is close enough to the target misfit 0 *. If so, quit and return the current model and other pertinent information. If not then continue to 11). d 11) If still at the initial beta value /5 and the misfit is below the target then set P = 100/j and reset the model to m . Otherwise, if 0 > 0 * then reduce f3 by some amount; if 0 < 0 * then perform either a linear or quadratic interpolation between the closest known points on the 0 vs. j3 curve to estimate f3*. Return to 4). (0) (0) (0) d d d d d Steps 5 through 8 correspond to the inner iterative procedure, in which the model is perturbed. Step 11 is the update step for the outer iterative procedure, in which the value of /? is altered to search for the target misfit. 123 9.7 Practical aspects of the inversion algorithm 9.7.1 Depth weighting Recall from Section 9.2 that depth weighting is crucial for magnetic inversion because without it, the resulting susceptibility distribution will be concentrated at the surface. An immediate choice for the depth weighting function W(r) in (9-7) comes from Li and Oldenburg (1996): W(z) = (z + z y (9-50) 3 0 Here, z is depth below the surface and zo is some value that can be assessed from knowledge of the average survey height. This is an empirical estimate for the linear problem discussed in Section 1.2 and Li and Oldenburg (1996). The depth weighting matrix in the discrete objective function is W = diag[(z + z y ] (9-51) V2 r 0 Here, z is a vector containing the z-coordinates of the centres of each active model cell. This depth weighting function will always be applied twice in the model objective function and the effective weighting is then consistent with (9-50). Recall from Section 4.2 that the field of a dipole drops off with distance r as 1/r . Hence, the depth weighting function (9-50) is very good at counteracting the decay of any single isolated cell, regardless of the cell susceptibility or cell shape, especially at a significant distance away. For the linear problem, the depth weighting function (9-50) is satisfactory. The situation becomes more complicated when self-demagnetization effects are included because the cells are no longer in magnetic isolation. It is initially unclear how the depth weighting function needs to be altered for the nonlinear problem but experiment has shown (9-51) to be reasonable (this will be discussed in Chapter 11). 3 9.7.2 Estimating an initial value for /? It is most efficient to start the inversion algorithm with a value of that is large enough to make the first optimization approximately quadratic so that the Tikhonov curve is approached quickly and easily. This can be done by finding a j3 such that the J Wd WdJ term in the approximate Hessian is negligible compared to the /7W W term. This assures that thefirstiteration will involve an approximately quadratic system. T T m Estimates for /? (0) m are calculated as 124 T /? (0) = 100 (9-52) where r is some vector of random values. 9.7.3 Choosing an initial model The most immediate option for an initial model is the reference model, but a more sophisticated approach could be used. However, when an appropriate initial value of /?is chosen such that the first iteration is approximately quadratic, the choice of initial model makes little difference to the efficiency of the inversion. The model obtained after this first iteration will be approximately equal to the reference model, regardless of the initial model used. 9.7.4 The line-search for a After solving for the perturbation direction Sm, a line-search for or in (9-46) must be performed to determine how far to travel in the Sm direction. With 0=0(a), the line-search for a begins with 0(0) and 0'(O) (i.e. 0 and g<5m at the current iteration) and T then calculates 0(1). If 0(0) -0(1) 0'(O) >0.25 and <P(1)«P(0) (9-53) then the value of cc=\ is accepted. Otherwise, a quadratic interpolation is used to estimate the best value of a. The choice of the criterion (9-53) was arrived at through empirical investigation. If 0 is still not reduced below 0(0) after the first quadratic interpolation then further quadratic interpolations are performed. If 0 has yet to be reduced below ^(0) after several quadratic interpolations then it is assumed that the algorithm is close enough to the Tikhonov curve. Recall from Section 9.3 that for the Gauss-Newton direction, the best value of a is expected to be close to unity, especially when the objective function is locally quadratic in nature. However, when a Steepest Descent direction is used there is no expected value of a. Therefore, if the best or value found thus far is greater than or equal to unity, a "doubling" search (i.e. the opposite of a bisection search) can be employed to further constrain the best or value. The current best value is doubled and re-checked until the objective function begins to increase. This more in-depth search procedure ensures that a much larger perturbation can be taken (i.e. a » 1 ) if available. Although one forward solution must be performed for each additional or value, the doubling search may reduce the total number of model perturbations required in the optimization for the current value of R. Reducing the need to calculate a Gauss-Newton 125 perturbation direction eliminates the need for a CGLS solution that may require many forward solutions. There is a tradeoff here that is difficult to quantify without further investigation. 9.7.5 Searching for m that minimizes 0 For the current value of /?, the model perturbations are terminated if < tol (9-54) for some user supplied tolerance tol\. Here, g g <0) (,) is the gradient for the current model iterate and is the gradient for the initial model in the current outer iteration. Other possible termination criteria include a tolerance on g<5m (i.e. the slope in the search direction), a tolerance on how T much 0 is changing with each model iteration, and a maximum number of model perturbations per (5 value. RecallfromSection 9.2 that introduction of the positivity constraint through a reparameterization may make the problem more difficult to solve. For this reason, any inversions performed with the rudimentary code that were not thoroughly monitored made use of a very small tolerance in (9-44) and allowed many iterations per /? value. This may result in unnecessary inner iterations, leading to unnecessarily long inversion times, but the algorithm is made more robust. For example, there is less chance of the algorithm getting "stuck" in a false minimum such as a saddle point or a local low-gradient section. 9.7.6 Searching for an appropriate value of B In step 11) of the inversion algorithm in Section 9.6, /?is reduced as (n+\) step x B (9-55) until the misfit is reduced below the target. The value of step can be specified by the user but the default value of 0.5 generally leads to a procedure that stays quite close to the Tikhonov curve without taking unnecessarily small steps in /?. This is advantageous because the approximations made in (9-23) and (9-25) become more appropriate. When beta has been reduced such that the misfit is below the target, B* can be estimated using either a linear or quadratic interpolation between the closest known values. The interpolations continue until the current value of /? can be accepted. I also allow for a bisection search for B*, which is more robust than the interpolation methods because it does not assume a particular curvature in the &d vs. /? relationship. 126 The current value of beta is accepted as the final fi* if < tol (9-56) 2 for some tolerance tolj. A default value of 0.05 is used but the value can be specified by the user. 9.7.7 An initial guess for the forward solution Recall from Section 9.1 that solution of the forward problem requires solution of a system of equations of the form A(m)u = q(m) (9-57) for the potentials in u. The algorithm used is BiCGStab, which is started from some initial guess u . The closer u is to the final solution, the shorter the BiCGStab solution time. Hence, in cases where a solution is known for a similar problem the known solution can easily be used as an initial guess to reduce the BiCGStab solution time. This approach was used successfully in the CGLS algorithm and in the line-search for a and this significantly increased the efficiency of the inversion algorithm. (0) (0) 127 Chapter 1 0 Calculation of the sensitivity matrix 1 0 . 1 Introduction The sensitivity matrix J was introduced in Section 9.3. J is defined as dd dF[m] pred J= dm (10-1) dm which indicates how the data predicted by the model, d , changes as the Earth model, m, changes. F is the nonlinear forward modelling operator. Calculation of the sensitivity matrix is a major aspect of the inverse problem and will now be addressed. pred Section 10.2 revisits the flux formulation solution to the magnetostatic forward problem for use in an inversion algorithm. Sections 10.3 and 10.4 derive expressions for the sensitivity matrix for the total and secondary flux formulations. Practical aspects of performing calculations with J are discussed in Section 10.5 and the results for the total and secondary field formulations are given in Section 10.6. Difficulties arising from use of total field data (i.e. measurements of total field magnitude) are covered in Section 10.7 and Section 10.8 discusses the testing of the calculation code. RecallfromSection 9.3 that in order to impose positivity, the model is chosen to contain squareroot susceptibility values. Two model vectors were defined, k — (10-2a) (X\' X2 ' • • • s Xnc ) (10-2b) Let _ dF[k] ~ k (10-3) dk and recall 2diag(m) (10-4) such that 128 J„=^=5F[k]*=J S dm dk dm (10-5) k In (10-4), the operator diag(m) takes the vector m and creates a matrix with m on its main diagonal. For simplicity, the model considered in the derivations of this chapter is m = (Xi,X2,)C3, (10-6) •••,Xnc) T If positivity is later applied via a transformation of variables, then the model can be re-defined as in (10-2b) and (10-5) can be used to compute the new sensitivity matrix. 10.2 Revisiting the forward problem for use in an inversion algorithm Recall the summary of the flux formulations in Section 5.3. For the total flux formulation, the discrete equations to solve are DMG(J) = DBo + g B = MGtJ) - Bo (10-7a) (10-7b) s For the secondary flux formulation the equations are = -D(/4)"'M - I)B + g B = MG(|> + (//o' M-I)Bo DMG<1> (10-8a) (10-8a) 0 S l s s Recall from Section 5.5 that any formulation of the magnetostatic forward problem can be written as the system A(m)u = q(m) (10-9) where u is a vector of unknown scalar potentials within the discrete grid. Recall from Section 9.1 that once u is determined, the predicted data of interest are calculated through some nonlinear operation, which can be written as d pred = F[m] = Q[B ] = Q[ C(m)u + b(m) ] (10-10) S The matrix C and vector b convert the potentials in u into the secondary flux values B throughout the discrete grid. The function Q then interpolates for specific flux quantities at specified measurement positions. For the moment, assume d contains single component s pred 129 secondary magnetic flux density values (i.e. B , B or B ). In this case, Q becomes a matrix Q that performs eight-point linear interpolations and (10-10) can be written x d pred y z = F[m]= QB = Q( C(m)u + b(m) ) (10-11) S The case for total flux magnitude data is discussed in Section 10.7. The forward modelling equations for the flux formulations can be written in the general notation of (10-9) and (10-11) as follows. For the total flux formulation, A(m) = DM(m)G u = <)| q(m) = DB + g(m) C(m) = M(m)G b=-B (10-12a) (10-12b) (10-12c) (10-12d) (10-12e) 0 0 and for the secondary flux formulation, A(m) = DM(m)G u = <> | q(m) = - D C ^ - ' M ^ ) - I)B + g(m) C(m) = M(m)G b(m) = (/V'lVKm) - I)B (10-13a) (10-13b) (10-13c) (10-13d) (10-13e) s 0 0 In (10-11), the matrix C = M G acts on the potentials in u to givefluxvalues on the cell faces. The vector b then adds to thesefluxesto create the appropriate secondary values in B . Finally, the interpolation matrix Q calculates the single component secondaryfluxvalues at the measurement locations from the full set of face-centred flux values in B . s s 10.3 Expressions for J from the flux formulation Recallfromthe definitions in Chapter 3 that D and G are solely functions of the grid geometry. Q depends on the survey locations with respect to the grid coordinates. Bo is a vector of inducingfluxvalues. M and g are functions of geometry and of the model susceptibility values in m. Therefore,from(10-12) and (10-13), A, C, q and b will be functions of m (with the exception that b = -Bo for the total formulation is not a function of m). With these in mind, use of (10-11) allows the sensitivity matrix J to be written as 130 dm = Q dm dm (10-14) {c(m)-^[u(m)] + -^[C(m)u] + - f L [ b ( m ) ] l [ am J u am rfm where I use a subscript after a bracketed term being differentiated to indicate that the specified quantity is treated as a constant in the differentiation. J is simply a product of the invariant interpolation matrix Q with the derivative dBJdm, which now becomes the quantity of interest. The derivative with respect to m of (10-9) is -^-[A(m)u(m)]= A(m)-^-[u(m)] + -f-[A(m)u] =-f-[q(m)] dm dm dm dm (10-15) u Rearranging yields an expression for du/dm d [u(m)] = A" (m) j-p- [q(m)] - -f- [A(m)u] } dm [ dm dm J (10-16) 1 u In writing (10-16), A" is assumed to exist. D is nc by nf (where nc < nf) and can be shown to have rank nc. G is nfby nc and also has rank nc. M is an «/by w/full rank diagonal matrix so that A = D M G is nc by nc and has full rank. Thus, A is non-singular and is invertible. 1 (10-16) can be substituted into (10-14) to obtain ^ = C(m) A"' (m){-f- [q(m)] [A(m)u] 1 + [C(m)u] + -f- [b(m)] dm [dm dm J dm dm u u (10-17) (10-17) is now dealt with term by term. For both total and secondary formulations, f [ A ( m ) » l = D-^-[M(m)Gu] dm dm (10-18a) u -^[C(m)u] =-^-[M(m)Gu]„ am dm (10-18b) u The expressions (10-18a) and (10-18b) hold for both formulations. However, u represents a different potential quantity in each. For the total formulation, ~ [q(m)] = - p - [DB + g(m)] = ± - [g(m)j dm dm dm 0 131 (10-19a) A[ am b ( m ) ] A[_ ] = 0 = (10-19b) B o dm and for the secondary formulation, -f- [q(m)] = -f-1-D^-'MCm) - 1 ) B + g(m)] am 0 am (10-20a) = -^-'D-^-[M(m) B ] + -^-[g(m)] am am 0 d — [b(m)] = — [fa-*M(m) - 1 ) B ] = am am 0 — [M(m) B ] dm 0 (10-20b) The following function is defined to simplify the above expressions: Y(w) = -^-[M(m)w] dm (10-21) w where w is some vector. Because the vector w is treated as a constant in the differentiation, the function *F(w) is linear. Consequently, after some trivial algebraic rearranging, the total and secondary formulation results can both be written = M(m)GA (m) J — (g(m)] - D ¥ ( w) I + »F( w) dm [dm J _1 (10-22) where w is a different quantity in each and VIsecondary = G(j) + (io 'Bo s = G(() + Gtyo s = G(4» + ^o) = Gk> (10-23) s = Wtotal as expected. 132 10.4 Derivation of the required elements of J 10.4.1 Derivation of *F(w) The first necessary quantity to derive in order to calculate J is ^(w), defined in (10-21). Recall from Section 3.6 that for the flux formulation, M has non-zero diagonal elements TJ that are harmonically averaged permeability values of the form n (10-24) =2Ax, 0 Mi Mj From (10-24), the derivative of TJ,J with respect to a model element Xk is dt], _ Brio d Mk dX ^Mk d% k k = -2// Ar.. 0 (10-25) M Mj Ml ,^L{s s ) ik+ ^ M Jk I 1 where (2-4) was used to give (10-26) dju = jUodx Simple expressions for the elements of (10-21) can be obtained from (10-25). The product Mw will have the form 0" w, Vu W i2 w Vn Mw = V l 7J W 2 23 2 (10-27) VM i w 0 The element of (10-21) in the z' row and krJth column will be th </(Mw),. d Xk m d(r] w)_ h n T l w , , +d ~ d ~ 2Ax y v* ™> iM Xk k 1 A u+ M i k resulting in a sparse matrix of the form 133 (10-28) It may be more advantageous to work in terms of matrices and vectors instead of single elements to obtain an expression for (10-21). M can be written as M = ju diag((Y(m+iy )l ) ] 0 (10-30) where a superscript on a vector quantity is used to signify an element-by-element to-the-powerof operation. Y is an nf by nc matrix that picks out appropriate values of //"' from //o(m+l)"' to create (reciprocals of) the harmonic averaged permeability values: Ar Ax 1 2 0 l 2 K Y = - Ax 2 Ax 2 3 (10-31) 2 3 (10-21) now becomes ¥(w) = Mo4~ [<fag((Y(m +1)")"') w] 1 dm (10-32) 8(Y(m + l)-') [^g((Y ir')- )w]^ | ( m + < m + '>;' ' ° ' ) d(m+l) ( , + r l dm Consider the first derivative on the right hand side of (10-32): a(Y(m a l)- ) ^ 1 + [ ( Y ( m + 1) " ' l lr °°- )w 33) The quantity being differentiated is a diagonal matrix of the form diag(d' ) multiplied by some vector w. With d = (d\, d , .. .) and w = (wi, w , .. .) , the quantity being differentiated is (d\ w\, d ' w , .. .) . Differentiating the i term of this vector by dj will give ] T T 2 A l 2 T 2 th 2 134 (10-34) ^ W » A which leads to a k ^( d 1 w ] ^ j , ^ ov- , «(w) = d 2 (10-35) dd Evaluation of the second derivative in (10-32) is trivial. The third derivative equals a matrix with elements % ^ = -Cr, + D J (10-36) « , which leads to -i d(m + l) = -diag((m + iy ) (10-37) z r/m (10-32) can now be written Y(w) = // ^ag((Y(m +1)" )" )^flg(w)Y^/ag((m +1)" ) 1 2 2 0 = ju- MMdiag(w)\diag((m l +1)" ) 2 (10-38) which is simple to construct, given the geometry dependent matrix Y. Through careful consideration of (10-38), (10-31) and (10-29) one can verify that the two methods yield identical results. 10.4.2 Derivation of dg/dm for a grid-centred congruous sphere The second necessary quantity to derive in order to calculate J is dg/dm. Here, an expression for this quantity is derived when the boundary conditions are specified through use of the congruous sphere method of Section 4.2 with the sphere placed at the grid centre. Extension to more complicated boundary condition specification methods is discussed in Subsection 10.4.3. Recall from Sections 3.5 and 4.3 that the vector g arises in the discretization of the divergence equation (3-la), g has elements that are secondary flux values divided by cell dimensions. These secondary flux values are the contributions to the boundary fluxesfromthe susceptible material within the grid as calculated for the congruous sphere. Recall from Section 4.2 that a magnetic sphere is equivalent to a magnetic dipole. The congruous dipole has moment i n defined by 135 m = mm= V= B_o 1 & ? (10-39a) ( Mo 1 + (10-39b) X/ v <=i >*,•*<> ^ = TrZ^ ' v (10-39c) where in has length in and unit direction m; v, is the volume of the / cell; Xi is the susceptibility of the i cell; th th and the summation for Vis only over those cells with ^ 0 . £ i s a volume-averaged susceptibility. Take care not to confuse the model vector m and the dipole moment quantities in, in and m. The secondary flux from the dipole at any point P is calculated using B (P) = ^[3{m-r)r-m] , r*0 dip (10-40) where r is a vector with length r and unit direction f pointing from the dipole to P. The moment vector has unit length 111 = ^ (10-41) l o| B and therefore, (10-40) can be written as B*=^T|^|[3(B -f)f-B ] <\Kr B | 0 (10-42) 0 0 To obtain dg/dm, an expression for dUdipldm is required. With the dipole placed at the grid centre, the bracketed term in (10-42) is only dependent on geometry and Bo. Thus, dB^ dXk = dB^dm 3m dx = k [ 3 ( ^ _ dm B(>] ATir |B | ^.dm = dfi 0 k From the definitions in (10-39), 136 ( in dx k 1 ( M 3 ) dm _ dm dB, ^ dx k K\v a Mo ^ f Mo V fi\ d 1+3 nf V -i /=1 s-2 1+^ 3 1+- |Bj din (10-44) Mo dX k where f= 1 + 2- (10-45) is a scalar quantity dependent on m, and is the volume of the k cell. th (10-44) is only valid if dV/dXk = 0. Fis defined as a summation of v, only over cells with Xi* 0Hence, dV/dXk=0 if and only if Xk0- Therefore, the derivation above will not be valid for model cells with permeability of free space. This problem can be overcome simply by using homogeneous boundary conditions such that dg/dm is not required. Furthermore, the problem can be overcome for any choice of boundary conditions through use of the split model formulation of Section 9.4 so that dg/dm is no longer required for model cells with Xk = 0. Combining (10-43) and (10-44) gives dB dip (10-46a) dXk where 7 _K\c_ <r(i+£/3)_ £ m// 0 (10-46b) 3+£ Each element of g corresponds to a particular model cell. Thus, each element of dg/dm can be determinedfrom(10-46) given knowledge of the position and dimensions of the corresponding cell. A single element g, of g will have a value of the form g . iy QJ =^B' + hx " ' hy ' ' c p x d p y | iz c hz B> dip,z (10-47) 137 where the constants c , c and c are either -1, 0 or +1, depending on the location of the z' cell th ix iy within the grid. The B' dip dXj h tyj x iz terms are flux values on cell faces. Use of (10-46) and (10-47) gives hy dXj hz dxj • t **< >+ i *~ >+ t *** > c r v c r v e r - v (10 48) from which an expression for dg/dm is obtained: -^-=rgv (10-49) T am v is a length nc vector of cell volumes. (10-49) yields a non-sparse, nc by nc matrix with nc— (nx-2)(ny-2)(nz-2) full rows corresponding to the number of cells adjacent to the grid boundary. The remaining (nx-2)(ny-2)(nz-2) rows are empty. There is no need to compute the matrix gv because it will only be used to multiply some vector x within the CGLS algorithm discussed in Section 9.3. The most cost-effective way to compute gv x is to first perform the operation v x, which yields a scalar. T T 10.4.3 Options for dealing with the boundary conditions Recall that the fluxfromthe congruous dipole used to approximate the boundary conditions was <*> / ^ i j ^ t o ) • ' ) ' - o] B = (10-50) B Anr B 0 In order to obtain dg/dm, an expression for dBdip/dm was required. When the dipole is placed at the grid centre, the bracketed term in (10-50) is only dependent on grid geometry and the inducing flux and the derivation of dBdip/dm is straightforward. RecallfromSection 4.2 that this approach can be extended to more sophisticated boundary condition approximation techniques. These would calculate spatial susceptibility moments in order to define the shape, orientation and location of the congruous body. The bracketed term in (10-50) would then become a function of both grid geometry and of the model susceptibility values. The derivation for dBdip/dm would become much more complicated and the resulting calculation of dg/dm would be more time consuming. Because the boundary conditions are calculated for each new model iterate in the inversion process, this would ultimately lead to a slower inversion process. 138 There is a simple option for removing the need to derive or calculate an exceedingly complicated expression for dg/dm. This is to use a less complicated boundary condition approximation method when calculating dg/dm but to keep the more complicated method when calculating g. Hence, the less complicated method enters into the calculation of J(m ) on the left-hand-side of (9-30) but the more complicated method remains in the calculation of F\m ] on the right-hand-side. w (k) Order of magnitude estimates and experimentation have indicated that for most applications it may be acceptable to do so. The dg/dm term in (10-22) is comparatively small to begin with. Furthermore, the differences in the various options for boundary condition approximation become minor provided the user has followed the advice of Section 6.4 and kept the active susceptible region separatedfromthe grid boundary by a sufficient number offreespace padding cells. A more quantitative method of determining whether or not the above shortcut is appropriate for a given scenario is to monitor the quantity Wjfdf^ -d pred p (10-51) 2 where d\ is the predicted data when the more complicated method of boundary condition approximation is used and d2 is the predicted data when the less complicated method is used. pre pred 10.5 Computing Jx and J x Recall from Section 9.3 that at each iteration of the inversion algorithm, a model perturbation direction Sm is calculated using (10-52) where nr is the model at the current iterate. The algorithm CGLS is used to solve (10-52) and this requires computation of the products Jx and J x for some vector x. The matrix J never needs to be constructed, which greatly simplifies the solution. T Now that expressions for J have been derived, methods for calculating the products Jx and J x can be established. From Section 10.3, T (10-53) 139 where w depends on the formulation being used: ™total (10-54a) (10-54b) = Wsecondary G§s + \M) Bo = Substitution of (10-38) and (10-49) into (10-53) yields J =Q M G A " 1 { ygv - DMM</iflg(w)Y%((m +1)~)} T 2 (10-55) + QMMdiag{yv)Ydiag((m +1)~ ) 2 The derivation for the required elements of J in Section 10.4 considered a model containing the susceptibility values of every cell. The actual model used differs in two ways. First, it uses square-root susceptibility values. Second, it only holds those values associated with cells in the active portion of the grid. Therefore, (10-55) needs to be modified as indicated in (10-5) and (947) to give J = Q M G A " {ygv -BMMdiag{w)Ydiag((m +1)" )}SR ... 1 T 2 A + QMMdiag{w)Ydiag((m + 1)' ) s R (10 56) 2 A where S and R are simple sparse matrices defined in (10-4) and Section 9.4 respectively. A Computation of Jx is performed as follows: (0 Let x = SR x (ii) Let z = MMdiag(yv)Ydiag((m and b = rgv x - Dz (10-57c) (iii) Compute y = A"'b (10-57d) (iv) Jx = Q(MGy + z) (10-57a) A T +1)" )x 2 (10-57b) (10-57e) Step (iii) represents one forward modelling operation in which Ay = b is solved for y using the ILU-preconditioned BiCGStab approach discussed in Section 6.2. From (10-56), J is written as 140 J = R S{ yyg — diag((m +l) )Y ofm (w)MMD }A G MQ T T -2 T r A T 0 + R Sdiag((m +1)" ) Y diag (w) M M Q J 2 T A = R[Sdiag((m +1)' )Y diag{w)MM(l T 2 + 7R Svg A- G MQ T T A T _T T ... T T - DTA-TGTM)Q T (10-58) ... T where the equalities M = M and S = S have been used (recall that these matrices are diagonal). Computation of J x is performed as follows: T T T Let (0 z =Q x (10-59a) b = G Mz (10-59b) y=A b (10-59c) T T Compute (z7/) T J x = R Sdiag((m +1)" )Y diag{w)MM(z - D y)+ / R S v g y T T 2 t T A (10-59d) T A Step (ii) represents one forward modelling operation in which A y = b is solved for y using the ILU-preconditioned BiCGStab approach discussed in Section 6.2. T Through careful choice of multiplication order, all steps in the computation of Jx or J x can be carried out avoiding any matrix-matrix multiplications. Eleven matrix-vector multiplications and one vector dot product are necessary for each computation (not including any such operations required by the BiCGStab solver). T 10.6 Results for the field formulation 10.6.1 Revisiting the field formulation Following the same procedure of Sections 10.2 through 10.5 for the field formulation of Section 3.7, A(m)u = q(m) d pred (10-60a) =F[m]= QB =Q( C(m)u + b(m) ) S A(m) = DM(m)G C=/A)G (10-60b) (10-60c) (10-60d) The interpolation step (10-60b) for the field formulation is much simpler than for the flux formulation because C is no longer a function of the model. Therefore, the hope was that the objective function optimized in the inversion process would contain less irregularities than for the flux formulation, leading to more efficient inverse solutions. However, because all interpolations occur withinfreespace, (10-60b) is expected to be essentially equivalent for both 141 formulations. From Section 8.5, the flux formulation is more accurate for high susceptibilities and may be more appropriate for use in the inversion algorithm for this reason. Nevertheless, both formulations were considered for use in the inversion algorithm. For the total field formulation, DM(m)Gk|) = DBo + g(m) H = G(])-Ho u = (j) q(m) = DB + g(m) b =-B (10-61 a) (10-61b) (10-61c) (10-6 Id) (10-61e) s 0 0 and for the secondary field formulation DM(m)G(j) = -DOo"'M(m) - I)B + g(m) H = G(j) u = (|) (10-62a) (10-62b) (10-62c) q(m) = - D ( V M ( m ) - I)B + g (10-62d) b =0 (10-62e) s s 0 s s 0 The expression for dB /dm is now s dB dm = C A - (m){ j - [q(m)] - ±- [A(m)«] } ± - [Cu] [dm dm J dm u = jU GA- [ 0 + u + ^ dm (10-63) |g(m)] - D ¥ ( w) J (m)j^- 10.6.2 Derivation of *F(w) for the field formulation Calculation of (10-63) again requires expressions for ^(w) and dg/dm. Expressions for the latter are exactly as in Subsection 10.4.2. However, expressions for 4*(w) differfromthose in Section 10.4.1. RecallfromSection 3.7 that for the field formulation, M has non-zero diagonal elements rj that are arithmetic averaged permeability values. In Appendix B, these are shown to be of the form (hy hz ju +hy hz n + ...\ l ] u x ...hy hz ju 2 l 2i 2 n +hy hz ju 2 4Ay Az 12 2 2 (10-64) 12 Again, the derivations will be made for a model of susceptibility values. From (10-64), 142 d XiJ d d Mu *Ay Az ' a (10-65) *> K 1 %u l2 from which simple expressions for the elements of *F(w) in (10-21) can be obtained. The resulting sparse matrix is of the form hy hz{w hy hz w hy hz w hy hz w Ay Az Ay Az hy hz w Ay Az hy hz w 2 Ay Az hy hz w hy hz w Ay Az Ay Az 23 A^ Az A.y Az 23 x x X2 ¥(w)=A x2 0 4 x 2 x 12 12 2 2 23 2 2 x x2 2 23 x 2 x2 3 23 2 X2 3 x2 2 23 0 x 2 23 3 3 23 2 (10-66) : Again, it may be more advantageous to work in terms of matrices and vectors instead of single elements to obtain an expression for ^ ( w ) . M can be written as M = po diag(\ i(m +1) + Y ) (10-67) 2 Yi is an ne by nc matrix that picks out appropriate values of // from //o(m +1) to create arithmetic averages of permeability. Y is an ne by 1 vector that accounts for the extra layer of //b padding cells that need to be added to the grid for the field formulation. Yi and Y are of the form 2 2 (' • y, i h 0 Y, = hz hy hz Ay Ay hy hz Ay Ay 2 X2 Ay Ay x2 x2 2 23 Ay Av V ( '• \ hy hz 23 x Y, = 4y, Ay o 2 ] X2 3 hy hz Ay Ay hy hz 2 l2 2 0 (10-68a) l2 3 2 23 hy hz Ay Ay 3 23 3 23 23 x (10-68b) 12 »F(w) is 143 Y( w) = f ( M w ) = am ftf[ diag{\ (m +1) + Y ) w] dm 2 3 r ,. / . N -i = M> • ^ l ^ ( ) J = Y d rf(Y,(m w D L ( M + 1 ) Y + + l) + Y ) ^ 2 (10-69) m 2 which is simple to construct, given the geometry dependent matrix Y\. Through careful consideration of (10-69), (10-68a) and (10-66), one can verify that the two methods yield identical results. 10.6.3 Computing Jx and J x for the field formulation T Substituting (10-49) and (10-69) into (10-63) leads to an expression for J for an active-set model of square-root susceptibility values: J = / / Q G A { ygy - DJ/ag(w)Y,} SR 1 (10-70) T 0 A Computation of Jx is performed as follows: (/) Let x = SR x (10-7 la) (ii) Let b = ^gv x-D^'ag(w)Y,x (10-71b) (iii) Compute y = A"'b (10-71c) (iv) Jx = // QGy A T (10-71e) 0 Step (iii) represents one forward modelling operation in which Ay = b is solved for y using the ILU-preconditioned BiCGStab approach discussed in Section 6.2. From (10-70), J is T J T = // R S { / v g — Y, fi?z'ag(w)D }A" G Q T 0 T T T T T T A (10-72) and computation of J x is performed as follows: T (0 Let b=G Q x (10-73a) (ii) Compute y = A" b (10-73b) (iii) J x = // RlS {^vg - Y,V/flg(w)D }y T T T T T T T 0 144 (10-73d) Step (ii) represents one forward modelling operation in which A y = b is solved for y using the ILU-preconditioned BiCGStab approach discussed in Section 6.2. T 10.7 Measuring the totalfluxmagnitude The formation of the sensitivity matrix becomes slightly more complicated when dealing with total flux magnitude data. A total flux magnetometer measures the total magnitude of the flux, |B|. The total anomalous flux is then calculated as (10-74) Btotal = | B | - |Bo| = | B 0 + B , | - | B 0 | where Bo = (BQ , Bo , BQ ) is a three component vector that describes the Earth'sfield,so that |Bo| is a scalar. In some applications, the assumption can be made that the anomalous secondary flux is small compared to the inducingfluxB o so that B i can be approximated as x y Z ma B tai to = B -B S (10-75) 0 This assumption may not be appropriate in applications where data measurements are taken in close proximity to highly susceptible material (e.g. as with bore-hole data). For (10-75), the anomalous total flux magnitude predicted data (i.e. forward modelled data) can be written Ki = K Q ' W*°- -j + Q (10-76) B, = QB, Here, B is the vector of secondaryfluxvalues throughout the grid returned by the forward modelling code. The matrices Q , Q and Q are those that would be required to interpolate for the x-, y- and z-components of the secondary flux at the observation locations. (10-76) describes a linear operation and calculation of the associated sensitivity matrix, making use of the results of Sections 10.3 through 10.6, is straightforward. However, for (10-74) the total flux magnitude predicted data must be written as s x Kf i a = VK + Q>B ) S 2 y + (B z Qy +Q B ) y 2 5 + {B 0z +Q B ) Z S 2 - |B | 0 (10-77) where the squares and square-root are operations on the elements of the associated vector quantities. (10-77) involves nonlinear operations and, therefore, calculation of the associated sensitivity matrix is not as straightforward. (10-77) can be rewritten as d£=te)" -|B | (10-78a) 2 0 with 145 DZ1 = diag{B 0X + Q B )(B x S + QJB.)... 0X + diag(B +Q B )(B +Q B )... 0v + y S 0y y (10-78b) s diag{B +Q B ){B +Q B ) 0z z S 0z z s The sensitivity matrix is now dd """ dm = d^}_ dm = d\B^ dm d^)dD 3D dm = ( 1 0 _ ? 9 ) where the subscript and superscript labels on df^j and D ^ have been removed to simplify the derivation. With D = (D,, D , £ > 3 , ..., D f, 2 ^ - N = 1^'% (10-80) and therefore ^ = ^mg(D- ) = i ^ g ( d |B |)- 1/2 + (10-81) 1 0 In order to calculate the second term in (10-79), dD/dm, the following derivative is required: -j- [diag{B + Q B ) (B + Q B )] 0V V S 0V V _ Bfriagfo, + Q B , ) ( * o v a(5 +Q B ) v 0v v S +Q B,)] V s d{B Q B )dB 3B </m 0v+ v s S for some direction ve {x, y, z). Let x = y=BQ + Q B so that V V 146 S s °°" 82) dm [diag{B +Q B ){B +Q B )J 0v = v t Qv v t [diag(x)x]^-^ dB dm 1 dx s (10-83) ^[dia (x)y\W^ y J = {dia (x) g + J g dB dm = {diag{x) + diag{y)}Q dB^ s 2diag(B +Q B )Q dm = 0v v t r where the following derivation has been used: diag(x)y = xy 2 V 3 dx dx (^( l ) y : 2 (10-84a) J ),=%) = M dx. (10-84b) ) [diag{x)y] = diag(y) (10-84c) Combining (10-79), (10-81) and (10-83) yields ^ a,=diag(d + \B \y{diag{B +Q B )Q +... ol 0 0x x s } x (10-85) dm Hence, to compute the sensitivity matrix when using totalfluxmagnitude data, one only needs to compute dBJdm in (10-22) or (10-63) and then left multiply by Q.ota.Cm) = diag(d + \B \Y {diag{B +Q B )Q +diag(B + Q B ) Q + ... } (10-86) l 0 0x X S X 0y y s y 10.8 Testing the calculation algorithm The calculation code, using the derived expressions for the products Jx and J x in the proceeding sections, was tested against a brute force,finitedifference approach. The test used a small, simple model involving a 3 grid of cubic cells. The cell sizes and susceptibility values T 3 147 were chosen from uniform random distributions and there was one observation location at the centre of the grid. Such tests showed that the two calculation approaches were always consistent in the calculation of the dg/dm and ^ ( w ) terms. However, the results were not always consistent for dB /dm even when consistent for dg/dm and *P(w). This only occurred when cell dimensions were highly varying (i.e. with some cell dimensions being an order of magnitude or more different from others). Such inconsistent results could be pushed toward consistency by altering the BiCGStab solution tolerance or by altering the finite difference step length. Therefore, the inconsistency was assumed to result from some rounding effect in the BiCGStab solution, which is required in the calculation of <iB/<im but not for dg/dm or ^ ( w ) . One would very rarely (if ever) use a grid containing such highly varying cell dimensions. Therefore, the conclusion was made that the calculation algorithm for Jx and J x of this chapter was sound. s s T To test the calculation code with a more practical model, a grid of 33 cubic cells with lm dimensions was created with a central susceptible cell with r=l- Total flux magnitude data are calculated 5m above the centre of this body across a 10m square grid with one datum per grid cell (i.e. the number of data, TV, is 100). The inducing flux used was Bo = (1,1,1). The sensitivity matrix calculated for the single-cell is a column vector of length N. The complete sensitivity matrix can be calculated through a finite difference approach or through the analytic methods of this chapter by calculating Jx for a vector with a single element of 1. In Figures 10.1 and 10.2 below, both solutions are displayed as a data map with each element of J in its corresponding data location. The difference between these two maps is displayed in Figure 10.3, which shows a confirmatory result. 3 / 148 i -5 i -3 i -2 i 0 Easting (m) i 2 i 3 i 5 nT Figure 10.1 The finite difference solution for J using a secondary flux formulation with positivity imposed and with boundary conditions calculated using the congruous sphere method. -5 • -3 i -2 i 0 Easting (m) i 2 i 3 i 5 nT Figure 10.2 The analytic solution for J using a secondary flux formulation with positivity imposed and with boundary conditions calculated using the congruous sphere method. 149 Easting (m) Figure 10.3 The difference between the finite difference and analytic solutions for J using a secondary flux formulation with positivity imposed and with boundary conditions calculated using the congruous sphere method. 150 Chapter 11 Testing the inversion algorithm 11.1 Introduction The inversion code, structured as discussed in Chapters 9 and 10, was coded in MATLAB. Testing for the inversion code included inversion of synthetic data for a simple body and inversion of survey data collected over a planted UXO target. The survey data for mineral exploration from the Osborne mine, introduced in Section 1.4, was not inverted. This problem was too large for the MATLAB inversion algorithm to handle within a practical time scale for computing power available at the time. The synthetic test involved a horizontally oriented, elongated rectangular prism. This simple target body is similar to that of the spheroid in Section 1.3 and the effects of selfdemagnetization are similar. These include reduction in magnitude of the magnetization and rotation of the magnetization direction away from the inducing field. In the synthetic test, data was first forward modelled and then inverted using the same grid and forward modelling methods. This allowed testing of the inversion algorithm regardless of the accuracy and legitimacy of the forward solution. Hence, this test served only as an internal test of the inversion algorithm itself, not of its adequacy for solving real-world inversion problems. Recall from Section 10.6 that the interpolation step in the forward solution is simpler for the field formulation than for the flux formulation. Therefore, the hope was that the objective function optimized in the inversion process would contain less irregularities for the field formulation than for the flux formulation. Use of the field formulation would then lead to more efficient inverse solutions. However, because all interpolations occur within free space, the interpolation step was expected to be practically equivalent for both formulations. This expectation was observed in the inversion tests. Although only the results from the flux formulation are presented here, the results for the field formulation were practically identical and the efficiency of the inversion process (i.e. the time and number of iterations required) for both methods was not significantly different. 11.2 Inversion of synthetic data for an elongated rectangular prism Here, the data for a horizontally oriented, elongated rectangular prism was forward modelled and then inverted. The grid, synthetic model and data positioning are shown in Figure 11.1 below. The grid used contained cells of lm dimension in the inner portion with increased dimensions of 1.5m and 2m in the padding cells. The 10m by 4m by 4m target prism was discretized by 160 cells. The data were calculated at z = -4.5, a height of 2.5m above the top of the target body, 151 across a 26.5m square area. The total number of grid cells was nc = 32 = 32768 and the number of data was N=24 = 576. 3 2 -22. OR Figure 11.1 The grid, target body and location of data used in the synthetic inversion test. The view is an x-z cross-section bisecting the central target body. The inset on the lowerrightshows the central target body as would be viewed in ay-z cross-section. The inducing flux was B = {B , B , B ) = (30000, 0,40000)nT, which has a strength of 50000nT inclined 53.1° to the horizontal and with zero declination (i.e. in the x-direction). This inducing flux was chosen to nearly maximize the rotation of the induced magnetization in the azimuthal direction. 0 0x 0y 0z 152 A small amount of noise, e, was added to the synthetic forward modelled data, d^ , so that the observed data inverted were rf +£ (11-1) The noise was chosen randomly from a normal distribution with zero mean such that the standard deviation for the i * datum, d, , was pbs G = e\max{\& "' \) + e \d{ \ f d i (11-2) wd 2 where maxQdr |) is the maximum absolute value in d***. The first term in (11-2) is a floor term and the second is a relative term. A value of 0.01 was used for both e\ and e%. The synthetic observed total flux magnitude data are displayed in Figure 11.2 below. 13.25. 1.477e+004 8.83. 1,142e+004 4.42. . 8065 0.00. 4711 -4,42. 1356 -8.83. -1998 -13,25. -13.25 -5352 -4.42 0.00 4.42 Easting (m) nT 13.25 14413.0 5258.8 Figure 11.2 Synthetic observed data map and profile for the first inversion test. 153 For the inversion of these data, and for all subsequent inversions discussed in this chapter, the flux formulation was used with positivity imposed as discussed in Section 9.2. The alpha parameter values were 6^ = 0.001 and 1, all spatial weighting functions were constant throughout the grid, and a constant zero-valued reference model was used. The boundary conditions were calculated without any secondary field approximation (i.e. g = 0). Here, the active region used in the inversion is indicated in Figure 11.1 and the initial model was constant over the active region with a square-root susceptibility value of 0.1. The target misfit was TV = 576 and the misfit obtained was 600= 1.047V, corresponding to a Rvalue of 375. The data predicted by the recovered model are displayed below in Figure 11.3. Figure 11.3 Predicted data map and profile for the model recovered in the synthetic inversion test. For a more comprehensive comparison of the observed and predicted data, consider Figure 11.4 below. Figure 11.4 shows the observed data used in the inversion, the predicted data, the difference between these, and the actual errors that were added to the observed data. 154 I I- - 8 ' - -450.4 . . . . . . . . . . . . . . . . . . . . . . , ........ 1 8 . -196.4 Figure 11.4 Maps and profiles for various data quantities associated with the synthetic inversion test. The observed data is shown at top left, the predicted data at topright,the difference between these (d - dP ) at middle left and the errors added to the observed data (e=d - d^' ) at middleright.The profiles for the observed and predicted data are plotted together at bottom with the observed data profile in black and the predicted data profile overlaid in grey. obs red obs 155 d The observed data has been fit less well around the positive and negative peak values. This is as expected from the form of the standard deviations assigned in (11-2). Furthermore, the amplitude of the difference values (d - dP ) is close to (i.e. within an order of magnitude of) the amplitude of the true errors ( d ^ - d ^ . T h i s is a favourable result because we are seeking a model that reproduces dr . d Below, Figures 11.5 and 11.6 show vertical and horizontal cross-sections through the recovered model. The cross-sections would exactly bisect the synthetic target body. Figure 11.5 A vertical cross-section through the model recovered in the synthetic inversion test. The position of the synthetic target prism is indicated by a black and white dotted rectangle. 156 -14.00 -7.00 0.00 y 7.00 14.00 Figure 11.6 A horizontal cross-section through the model recovered in the synthetic inversion test. The position of the synthetic target prism is indicated by a black and white dotted rectangle. Essentially identical results were obtained when other flux components (i.e. B and B ) were used in the inversions. As expected from the form of the objective function being minimized, the recovered model is more smooth than the true synthetic model. Consequently, the susceptibilities in the recovered model are lower than that of the target prism. However, the volume summed susceptibility of the synthetic and recovered models where close (1600 and 1672 respectively), which is a favourable result. x Z There is a certain level of asymmetry in the recovered model that is not characteristic of the synthetic model. This can be explained through the non-uniqueness of the inverse problem. The level of error assigned to the data was large enough to provide the inversion algorithm enough freedom that it was able to minimize the objective function more by moving the central body to an off-centred position. Further investigation showed that when the errors assigned to the data where reduced, the recovered model became more symmetric with the central body closer in spatial alignment to the target prism. Below, Figure 11.7 shows the behavior of various values of interest to the inverse problem for thefirstsynthetic inversion. 157 20 Iteration Figure 11.7 Values of the total objective function (top), gradient norm (secondfromtop), misfit (middle), model objective function (secondfrombottom), and the slope in the search direction (bottom) at each model iteration for the first synthetic inversion test. The circled values are those at the beginning of each outer iteration (i.e. the initial values for each value of /?). The initial value of /?was 1000 and the initial model (i.e. at iteration 0 in Figure 11.7) had high 0d and 0 values. 0 was reduced greatly during thefirstfew iterations at the high initial R, as expectedfromthe discussion in Section 9.7. 0d and 0 eventually settled to limiting values for the initial R as the minimization of 0 continued. Furthermore, both the gradient norm and the slope in the search direction reduced in magnitude smoothly toward zero. At iteration 13, the objective function was changing slowly enough for the algorithm to breakfromthe inner iterative process (i.e. model iterations at a constant ft). At this point, 0d was greater than the target misfit (0d* = 576) so the value of R was reduced to 500 and minimization of the objective function was continued, beginning with the model at iteration 13. The optimizations at each value of R proceeded as for the initial value. After a subsequent minimization with pi = 250, 0d m m 158 had been reduced below the target misfit. A bisection search was then entered in order to narrow in on B* (corresponding to 0d*)- After optimizing at the first bisected value of j3= 375, the misfit was close enough to the target and the inversion algorithm quit. The values of 0 and 0 for the recovered model were 600 and 2.48 respectively. These values for the true synthetic model were 300 and 187 respectively, where this misfit value is the misfit between the noisy data being inverted and the clean forward modelled data. The two misfit values are close, as expected, and the much smaller 0 value for the recovered model indicates a consistent solution. d m m A section of the Tikhonov curve, and other associated curves, calculated around the target misfit (i.e. a greater range than provided in Figure 11.7) is shown below in Figure 11.8. Figure 11.8 Plots of the relationships between the misfit, model objective function and P for the first synthetic inversion test. The plots are centred around the target misfit. In each, the circled datum corresponds to the lowest value of ft. The Tikhonov curves (at bottom) are plotted on both linear-linear (left) and log-log (right) scales. 159 The curve characteristics in Figure 11.8 are as expected. At optimal conditions (i.e. when on the Tikhonov curve), 0d increases with increasing B and 0 decreases with increasing J3. The shape of the Od vs. B relationship is such that the optional linear or quadratic interpolations for /3* (discussed in Section 9.7) would be appropriate. The Tikhonov curve exhibits a typical shape when plotted on linear axes but not when plotted on logarithmic axes. This has consequence on the use of an L-curve method for choosing /?, in which an appropriate value is chosen as one corresponding to the elbow (i.e. the point of maximum curvature) in a log-log plot. The L-curve method is discussed further in Section 12.3. m 11.3 Inversion of survey data above a planted U X O target Here, total flux magnitude survey data collected over a planted UXO body was inverted. This data was collected with the Multi-Sensor Towed Array Detection System, MTADS, developed by the Chemistry Division of the US Naval Research Laboratory. Further information on this data and on MTADS can be found in McDonald and Robertson (1996) and McDonald et al. (1997). Regional fields (i.e. background magnetic gradients) had been removed from this data using a linear (planar) interpolation through background measurements. The data contain both effects of self-demagnetization: reduction in magnitude of the magnetization and rotation of the magnetization direction awayfromthe inducing field. The ordinance was a 105mm projectile with diameter 10.5cm and length 42.7cm. It was buried with its centre of mass at a depth of 50cm and was declined and inclined at 45°. The situation is pictured below in Figure 11.9. 160 data 0.25m z Figure 11.9 The buried 105mm projective viewed in a vertical cross-section along a 45°E declination. The inducing flux in the region had a strength of 53612nT inclined 67.4° to the horizontal and declined -10.7°E: B = (20238, -3831, 49506)nT. Hence, the angle between B and the UXO symmetry axis (i.e. long axis) was 36.3°. The grid used in the inversion contained cubic cells of 0.13m dimension in the inner portion with increased dimensions of 0.26m in the padding cells. The data were calculated at z = -0.25m (i.e. 0.25m above the ground surface at z = 0) across a 4m by 4m area. The total number of grid cells was nc = 46x46x33 = 69828 and the number of data was N= 504. The standard deviations assigned to the data were defined as in (11-2) with e\= 0.025 and ei= 0. The survey data are displayed in Figure 11.10 below. 0 0 161 l -1.99 i -1.33 l l -0.67 l -0.01 Easting (m) l 0.64 l 1.30 1.96 403.2 B -29.1 Figure 11.10 Map and profile of the observed survey data collected above the planted UXO. This data can be fit well by a high susceptibility, prolate, spheroidal body with similar dimensions and identical location and orientation as the UXO target. For example, Figure 11.11 below shows analytic data for such a spheroid with minor axis 0.105m, an eccentricity of 4.07 (i.e. equal to the length-to-diameter ratio of the UXO) and a susceptibility of 1000. Figure 11.12 shows the difference between this analytic data and the observed data in Figure 11.10. 162 i i i i i i -1.99 -1.33 -0,67 -0.01 Easting (m) 0.64 1.30 i nT 1.96 Figure 11.11 Analytic data map and profile for a prolate spheroid with minor axis 0.105m, an eccentricity of 4.07 and a susceptibility of 1000. The observed data profile from Figure 11.10 is also included in the profile, with the observed data profile in black and the analytic data profile overlaid in grey. 163 -1.99 -1.33 -0.67 -0.01 Easting (m) 0.64 1.30 1.96 Figure 11.12 A map of the difference between the observed UXO data and the analytic data for the approximate prolate spheroid. The observed data was inverted and the attained misfit was 6.26 = 0.01247V". The data could be fit very well without introduction of complicated, unnecessary structure. This indicates that the errors assigned to the data were higher than the actual values. Below, Figure 11.13 shows predicted data for the recovered model. The difference between the observed and predicted data is shown in Figure 11.14. Cross-sections through the recovered model are shown in Figures 11.15 and 11.16. 164 400,1 B -29.4 Figure 11.13 Predicted data map and profile for the model recovered in the U X O inversion test. 165 Easting (m) 403.2 B -29,1 Figure 11.14 A map of (d - df™ ) for the UXO inversion test and a comparison of the profiles for these two data sets. The observed data profile is black with the predicted data profile overlaid in grey. obs 1 166 3.51 3.51 Figure 11.15 A vertical cross-section through the model recovered in the UXO inversion test. The cross-section runs SW to NE (i.e. on a 45°E declination). 167 -3.51 -1.76 -0.00 >• 1.75 3.51 Figure 11.16 A horizontal cross-section through the model recovered in the UXO inversion test. The cross section is at depth z = 0.50m. The body recovered bears more resemblance to a low susceptibility, oblate spheroidal body than to the high susceptibility, prolate target UXO. The recovery of a compact, low susceptibility body that achieves rotation of the magnetization awayfromthe inducingfielddirection was unanticipated. The susceptibilities in the recovered model were low enough to perform a linear forward modelling. This yielded data consistent with the predicted data in Figure 11.13 and, therefore, the nonlinear forward modelling is not in question. The unanticipated result can be explained through the non-uniqueness of the inverse problem and the way in which this non-uniqueness was dealt with. It can be shown that there are many compact spheroidal bodies of various shape, size, orientation and susceptibility that can be used to fit the observed data in Figure 11.10. This particular inversion result has recovered a low susceptibility, oblate body. Without adding further a priori information about the target body, this result is inescapable since the inversion seeks a solution with a small volume-summed susceptibility (i.e. the inversion prefers low susceptibility bodies to high susceptibility bodies). 168 1 1 . 4 Conclusion The inversion methods developed were successful in inverting both synthetic and real-world survey data collected above regions containing highly magnetic bodies. The inversion algorithm and code were concluded to be sound. The survey data used was that collected above a single,. isolated UXO. This problem was small enough for the MATLAB inversion algorithm to handle within a practical time scale (i.e. within several days). The survey data for mineral exploration from the Osborne mine, introduced in Section 1.4, was not inverted. This problem was too large for the MATLAB inversion algorithm to handle within a practical time scale for computing power available at the time. Appendix C discusses more notable observationsfromthe inversion tests of this chapter. The topics discussed in Appendix C leave room for further study. I now move on, in Chapter 12, to discuss possible application of, amendment to and alteration of the forward and inverse methods developed in this thesis. The advantages of and drawbacks to the various magnetic modelling methods are discussed and the thesis concludes with a brief overview. 169 Chapter 12 Discussion and conclusions 12.1 Summary Both synthetic and real-world examples have indicated the need for full forward modelling and inversion (accounting for demagnetization effects) of magnetic data from regions of high magnetic susceptibility. The ultimate goal of this research was to invert geophysical magnetic data to recover three-dimensional distributions of subsurface magnetic susceptibility of any magnitude and geometric complexity. A full solution to Maxwell's equations for source-free magnetostatics was developed using a finite volume discretization. The Earth region of interest is discretized into many prismatic cells, each with constant susceptibility, which allows for models of arbitrary geometric complexity. The finite volume forward modelling method is valid for any linear medium and is appropriate for modelling the response of highly magnetic objects. The forward modelling method was refined and the code was tested against analytical solutions for simple bodies and against a slower, more memory intensive full solution for general distributions formulated in the integral equation domain. All tests showed the forward modelling method to be sound within expected error tolerances. The finite volume modelling method formed the foundation for a subsequent inversion algorithm. In the discretization, many more model cells are used than there are data. As such, the inverse problem is underdetermined. The inverse problem was formulated as an unconstrained optimization problem in which an objective function is minimized. The objective function was designed so that the data are fit to an acceptable degree and the recovered model has desired spatial characteristics. The resulting optimization problem was nonlinear and required an iterative solution, for which a Gauss-Newton approach was used. Testing for the inversion code included inversion of synthetic data for simple bodies and inversion of survey data collected over a planted UXO target. All tests showed positive results and the inversion algorithm and code were concluded to be sound. 12.2 Application of the methods presented 12.2.1 Assumptions made in the forward solutions The major physical assumptions made in the numerical solution of the forward problem have significant consequences to application of the methods presented. Recall from Section 2.2 that any magnetic media is assumed to be magnetically isotropic and linear and contain no remanent magnetization. Under some conditions, strong ferromagnetic materials exhibit anisotropy and 170 nonlinearity and, as a consequence, remanent magnetization. Before one uses the methods presented in this thesis for a particular application, one must determine the validity of the assumptions made. In the low field limit, nonlinearity disappears. In practice, one often accepts linearity as valid as long as flux densities do not exceed one Tesla (Bossavit, 1998). For applications in which the geomagnetic field is the inducing field, flux densities should be low enough that the assumption of linearity can be made. Due to the assumption of isotropy, one must be careful in applications where minerals exhibiting anisotropy are present. If one assumes that mineral grains are oriented randomly and, as a result, have no net orientation when averaged on model scales (i.e. model cell size), then the overall effect of the anisotropy may be ignored. However, remanent magnetization may still be important in such a situation. For application to UXO problems, any remanent magnetization present in an unexploded projectile is often removed upon impact (Nelson, 1998). However, shrapnel often contains remanence due to heating during explosion of its source ordinance. Remanent magnetization is also problematic in some mineral exploration applications. The direction and strength of the remanence may be confined from knowledge of the age and composition of the subsurface rocks and this information may be of aid to the problem. The concepts of anisotropy and remanent magnetization are discussed further in Appendix D, along with a brief discussion on possible incorporation of these phenomena into the modelling methods. 12.2.2 Application of the methods to linear problems An FVD solution could be incorporated into a linear inverse solution for materials of low susceptibility where self-demagnetization effects are negligible. The full forward solution in the integral equation domain is much slower than the flux or field formulation FVD solutions (in the differential domain). However, the approximate, linear forward solution in the integral equation domain requires much less time than the full integral solution. The linear integral solution only requires construction of the dense forward modelling matrix G and a multiplication of the model vector m by this matrix. This still requires considerable construction time and memory requirements and one may consider the benefits of using an FVD forward solution in place of the linear integral forward solution. One major advantage of the FVD methods is that the forward solution time is not a function of the number of data. The size of the system of equations to solve for the scalar potentials is only dependent on the size of the model grid. The data are interpolatedfromthese potentials and the time required for this second step is negligible in comparison to that required in the solution for the potentials. In contrast, an integral domain forward solution time is very much dependent on 171 the amount of data. For the linear integral solution, the size of the problem is determined by the size of the matrix G , which is nc (the amount of susceptible model cells) by N (the amount of data). Hence, the more data, the longer the linear integral solution time. To quantify the differences in solution time between the various methods, investigation was performed for problems of reasonable size and scope for mineral exploration applications. The linear integral forward solutions generally took an order of magnitude longer than the FVD forward solutions. Keep in mind that the solution times for the FVD methods depend on the accuracy to which the potentials are solved in (6-1). Irrespective of the relative residual tolerance or amount of data used, the time difference is large enough to conclude that an FVD forward solution is an appropriate replacement for the linear integral forward solution when performing a linear inversion. 12.2.3 Application of the methods to dispersed problems Here, problems are considered in which the region to be modelled contains a dispersed susceptibility distribution and, therefore, requires a large number of cells. The various numerical solution methods for the full magnetostatic problem were compared for use in such a disperse inverse problem. Often, the full integral forward solution could not be completed due to memory issues. Recall that there are 3nc equations for the 3nc unknown magnetization values, where nc is the number of susceptible cells. This requires a system to solve of the form Ax = b where the matrix A is of size 3nc by 3nc. For a problem with a susceptible grid 40 cells by 40 cells by 20 cells this equates to 9.216 billion elements in A, which is clearly a cause for concern. In contrast, a major advantage of the FVD methods is that they deal with sparse matrices with amounts of non-zero elements on the order of the number of model cells. The memory requirements are, therefore, much less problematic. For smaller problems with practical memory requirements, the full integral forward solution took one or two orders of magnitude longer than the FVD solutions. Keep in mind that the solution times for both methods depend on the accuracy to which the potentials or magnetizations are solved in (6-1) and (7-14). Regardless, the time difference is extreme and an FVD solution is much more appropriate than a full integral solution for such dispersed problems. A drawback of the FVD methods is that in order to maintain accuracy, a responsible grid design should finely discretize from the level of the ground surface up to and beyond the survey height. This will significantly increase the problem size when the data are close to the surface compared to the depth dimension of the near-surface susceptible cells. In order to decrease the required level of discretization in this region, the data could be upward-continued to a higher surface. 172 12.2.4 Application of the methods to compact problems Here, problems are considered in which the distribution of susceptibility contains one or several spatially isolated, compact bodies. This is the situation for UXO applications if the ordinance are buried within a non-magnetic soil. In such a situation, it may be more feasible to make use of the full integral forward solution. The FVD methods would require a large grid covering the entire region of interest plus a number of padding cells outside this region. In contrast, the integral solution would only require small grids encompassing each isolated magnetic body. For many such problems, the full integral forward solution would be faster than an appropriate FVD forward solution. However, when inverting for compact bodies of unknown position and shape, the grid would have to be designed as for the FVD methods. Therefore,fromthe discussion in Subsection 12.2.3, the FVD methods would be more appropriate for such an inverse problem. 12.3 Possible amendments to the methods presented 12.3.1 Extension to include anisotropy and remanent magnetization The methods in this thesis do not consider magnetic anisotropy or the presence of remanent magnetization. A discussion of these phenomena and how they may be introduced into the modelling methods is provided in Appendix D. Introduction of these phenomena would yield a much larger and more complicated inverse problem. The question of whether or not to be concerned with these phenomena in the inverse problem depends on the application. If the geology in the region of interest is known to contain significantly anisotropy on scales similar to model cell dimensions then this information needs to be incorporated into the inversion. If there is no anisotropy on any scale then there can be no remanent magnetization. If remanence is possible, one needs to determine whether a significantly large and problematic remanent magnetization in a direction other than the present inducingfieldis probable. 12.3.2 Alteration of the inversion algorithm Afirstpossible alteration to the inversion algorithm would be to use the alternate logarithmic barrier method of imposing the positivity constraint. This method was introduced in Section 9.2 and is discussed in Li and Oldenburg (in press). Although a second regularization parameter appears in the objective function to be minimized, which complicates the inversion algorithm, this method may be an improvement over the re-parameterization approach. This stemsfromthe expectation that in making the re-parameterization of the model to deal with square-root susceptibility values, the objective function contains more irregularities and thus leads to less efficient inverse solutions. A second possible alteration to the inversion algorithm would be to allow more options for choosing an appropriate value of the tradeoff parameter, fi. RecallfromSection 9.5 that there is a tradeoff between the data misfit, & , and the model objective function, & . Finding an d m 173 appropriate value of 8 that results in a solution that meets our requirements is an important aspect of the inverse problem. Three methods of choosing B that have been commonly used in the approximate linear inverse problem are discrepancy principle, L-curve, and cross validation (CV) methods. All three methods assume that the errors in the data are Gaussian and uncorrected with zero mean. The discrepancy principle method, discussed in Section 9.5, requires knowledge of these errors but the L-curve and CV methods do not. The L-curve method is a heuristic approach to choosing B when data errors are unknown. A series of optimizations for a range of B values are performed and the resulting Tikhonov curve (refer to Figure 9.3) is plotted using log-log axes. This plot is sometimes called an L-curve. A reasonable choice for f3 may be the value that corresponds to the point of maximum curvature on the L-curve (i.e. at the "elbow" of the curve). The model corresponding to this particular solution then becomes the inversion result. We do not wish to fit the data too poorly or too well and we do not wish to recover a model with too little or too much structure. Hopefully, the Lcurve model will present an acceptable compromise. Unfortunately, there is no guarantee that this model will be appropriate or that the L-curve will even contain a pronounced elbow (refer to the inversion test result in Figure 11.8). The CV method is a morerigorousand computationally intensive approach to choosing /?when data errors are unknown. Oldenburg, Li and Jones (1999) discuss the basics of the CV method. The basic CV concept is that a good model should predict any new datum and the choice of j3 should therefore be the one that generates a model which best predicts all data points. Each datum is assumed to have the same level of Gaussian, independent error but the individual errors are unknown. For each R value in a chosen range, N inversions are performed, each with a different datum omitted. The cumulative data misfit for a certain value of B (i.e. the sum of the misfits for the TV with that is called a cross The model chosen is the one associated with the value of B with the smallest cross validation and is thus the model least affected by any single datum. In practice, an alternate formulation providing an identical result is used that is far less computationally intensive. This is the Generalized Cross Validation (GCV) method. The specifics of this method are discussed in Golub and von Matt (1997). accurately inversions Rvalue) validation. A third possible alteration to the inversion algorithm would be to use full Newton model perturbations. Recall from Section 9.3 that the Gauss-Newton approach neglects the higher derivative term in the Hessian when calculating the model perturbations. In a full Newton approach, the exact Hessian would be used. Expressions for the exact Hessian can be derived from the results in Chapter 10. The equations to solve for the full Newton perturbation could no longer be written as a least-squares problem and a tradeoff is introduced: more time would be required to calculate the more accurate perturbation directions but taking them may reduce the total amount of perturbations required in the optimization. Without further investigation it is unclear if a full Newton approach would be a significant improvement or not. 174 12.4 Conclusion Both synthetic and real-world examples have indicated the need for full forward modelling and inversion (accounting for demagnetization effects) of magnetic data from regions of high magnetic susceptibility. The ultimate goal of this research was to invert geophysical magnetic data to recover three-dimensional distributions of subsurface magnetic susceptibility of any magnitude and geometric complexity. The work in this thesis was successful in reaching this goal. The inversion methods developed were successful in inverting both synthetic and realworld survey data collected above regions of high susceptibility. The survey data inverted was collected above a single, isolated UXO. This problem was small enough for the MATLAB inversion algorithm to handle within a practical time scale. The survey data for mineral explorationfromthe Osborne mine, introduced in Section 1.4, was not inverted. This problem was too large for the MATLAB inversion algorithm to handle within a practical time scale for computing power available at the time. The methods developed use several assumptions about the physical properties of the magnetic materials involved. Any media is assumed to be magnetically isotropic and linear and contain no remanent magnetization. The assumption of magnetic linearity should be appropriate for any application in which the geomagnetic field serves as the inducing field. Appendix D suggests simple extension of the methods to include anisotropy and presence of remanent magnetization. 175 References Barrett, Richard., et. al, Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. Philadelphia: SIAM, 1994. Bjorck, A, T. Elfving and Z. Strakos, "Stability of Conjugate Gradient and Lanczos Methods for Linear Least Squares Problems", Siam J. Matrix Anal. Appl., 19, 1998, 720-736. Blakely, Richard J., Potential Theory in Gravity and Magnetic Applications, Cambridge: Cambridge University Press, 1996. Bossavit, Alain, Computational Electromagnetism, San Diego: Academic Press, 1998. Butler, Robert F., PALEOMAGNETISM: Magnetic Domains to Geologic Terranes. Oxford: Blackwell Scientific Publications, 1992. Chen, J., E. Haber and D. W. Oldenburg, "Three-dimensional numerical modelling and inversion of magnetometric resistivity data", Geophys. J. Int., 149, 2002, 679-697. Cheng, David K., Field and Wave Electromagnetics, 2 Ed., Reading: Addison-Wesley Publishing Co., 1992. nd Clark, D. A. and D. W. Emerson, "Notes on rock magnetization characteristics in applied geophysics studies", Exploration Geophysics, 22, 1991, 547-555 Clark, D. A., and D. W. Emerson, "Self-Demagnetization", Preview (Australian Soc. Of Exploration Geophysicists), Apr./May, 79, 1999, 22-25. Farquharson, C. G. and D. W. Oldenburg, "Non-linear inversion using general measures of data misfit and model structure", Geophysics, 134, 1998, 213-227. Gill, P. E., W. Murray and M. H. Wright, Practical Optimization. London: Academic Press, 1995. Golub, G. H., and C. F. Van Loan, Matrix Computations. 2 Ed., Baltimore: The John Hopkins University Press, 1990. nd Golub, G.H., and U. von Matt, "Generalized Cross-Validation for Large Scale Problems", J. of Computational and Graphical Statistics, 6, 1, 1997. 176 Haber, Eldad, "Computational methods for Fields and Fluxes - Handout 15: The search for the preconditioner", Diss.: University of British Columbia, Dept. of Earth and Ocean Science and Dept. of Comp. Sci., Mar. 2000. Hestenes, M. R. and E. Stiefel, "Methods of conjugate gradients for solving linear systems", J. of Research of the Nat. Bureau of Standards, 49, 1952, 409-436. Kaufman, Alexander A., Geophysical Field Theory and Method Part A, San Diego: Academic Press, 1992. Li, Y. and D. W. Oldenburg, "3-D inversion of magnetic data", Geophysics, 61, 1996, 394-408. Li, Y. and D. W. Oldenburg, "Separation of regional and residual magnetic field data", Geophysics, 63, 2, 1998, 431-439. Li, Y. and D. W. Oldenburg, "Fast Inversion of Large-Scale Magnetic Data Using Wavelet Transforms and Logarithmic Barrier Method", Geophys. J. Int. (in press). Lorrain, P., D. R. Corson and F. Lorrain, Electromagnetic Fields and Waves. 3 Ed., New York: W. H. Freeman and Company, 1988. rd Marsily, Ghislain de, Quantitative hydro geology. Orlando: Academic Press, 1986. MathWorks, The, MATLAB Functions. 7 Jan. 2002 <http://www.mathworks.com/products/matlab/mnctions/functions.shtml> McDonald, J. R. and R. Robertson, "Sensor Evaluation Study for Use with Towed Arrays for UXO Site Characterization", Symposium on Applications of Geophysics to Engineering and Environmental Problems (SAGEEP) 96, The Environmental & Engineering Geophysical Society, Keystone, CO, 28 April 1996. McDonald, J. R., et al. "Field demonstration of the Multi-Sensor Towed Array Detection System (MTADS)", 1997 UXO Forum, Nashville, USA, May 28, 1997. McGillivray, Paul R., "Forward Modeling of DC Resistivity and MMR Data", Unpublished Ph.D. Dissertation, University of British Columbia, 1992. Nelson, H. H., et al., "Magnetic modeling of UXO and UXO-like targets and comparison with signatures measured by MTADS ', Presentation at the UXO Forum 1998 (US Department of Defense) 1 Oldenburg, D., Y. Li and F. Jones, UBC-GIF Tutorial on Linear Inversion, 15 July 1999. < http://www.geop.ubc.ca/ubcgif/tutorials/magbasics/index.htm> 177 University of British Columbia (Vancouver, BC, Canada), Department of Earth and Ocean Sciences, Geophysical Inversion Facility. O'Reilly, W., Rock and mineral magnetism. Glasgow: Blackie and Son Ltd., 1984. Paige, C. C. and M. A. Saunders, "LSQR: An Algorithm for Sparse Linear Equations and Sparse Least Squares", ACM Trans. Math., 8, 1, (Mar.) 1982, 43-71. Saad, Yousef, Iterative Methods for Sparse Linear Systems. Boston: PWS Publishing Company, 1996. Sharma, P.V, "Rapid Computation of Magnetic Anomalies and Demagnetization Effects Caused by Bodies of Arbitrary Shape", Pure Appl. Geophysics, 64, 1966, 89-109. Shewchuk, Jonathan R., "An Introduction to the Conjugate Gradient Method Without the Agonizing Pain", Diss.: School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, Aug. 1994. Stacey, F. D. and S. K. Banerjee, The Physical Principles of Rock Magnetism, Amsterdam: Elsevier Scientific Publishing Co., 1974. Stratton, Julius A., Electromagnetic Theory. New York: McGraw-Hill, 1941. Tarling, D. H. and F. Hrouda, The Magnetic Anisotropy of Rocks. London: Chapman and Hall, 1993. Todd, David Keith, Ground Water Hydrology. New York: John Wiley and Sons, Inc., 1959. Van der Vorst, H. A., "BI-CGSTAB: A fast and smoothly converging variant of BI-CG for the solution of nonsymmetric linear systems", SIAM J. Sci. Stat. Comput., 13, 2, (Mar.) 1992, 631-644. 178 Appendix A Nomenclature The information in this appendix covers the nomenclature conventions for mathematical equations used throughout this thesis. A . l Differential and integral equations Scalar quantities are symbolized by Greek letters or by italicized Roman letters. Vector quantities are symbolized by bold letters. Hats ( ) are used to denote vectors of unit length (e.g. x, y and z are unit vectors in the three Cartesian directions). Individual directional components of vector quantities are expressed as either italic or bold type letters with italic subscripts to indicate directions: A e.g. B =Bx +By + Bz x y z =B +B + B X y z A.2 Matrix-vector equations Matrices are symbolized by bold capital letters. Vectors are symbolized by bold lowercase letters with the exception of B and H, which contain single component flux and field values respectively. Elements of vectors or matrices are denoted by numerical subscripts: e-g- B A [Bi B B ...] 2 3 A, A,12 A.21 A. L L 2 2 179 Appendix B Discretization for the field formulation B . l Introduction In Chapter 3, a numerical solution to Maxwell's equations for magnetostatics was presented using a cell-centred discretization scheme referred to as the "flux formulation". Section 3.7 summarized the results for an alternate node-centred discretization scheme referred to as the "field formulation". Here, the complete derivation for the field formulation is provided. B.2 The system Instead of choosing to work with fluxes in a cell-centred scheme, one could work with fields in a node-centred scheme. The problem of interest is governed by two of Maxwell's equations and one continuity equation: V(//H) = 0 H = V<p H . xii = F L xii (B-la) (B-lb) (B-lc) The advantage of keeping H in the system instead of discretizing (2-6) directly is that methods can be designed to allow for higher order interpolation functions for H , the quantity of interest. B.3 The discrete grid and discrete variables B.3.1 Nomenclature for the discrete variables The nomenclature for the discrete variables in the field formulation is exactly as described in Subsection 3.3.1 for the flux formulation. B.3.2 The discrete grid The orthogonal grid system pictured in Figure B. 1 is used for the discretization. 180 Figure B.l A single discrete grid cell for the field formulation. The coordinate system used has +x in a northerly direction, +y easterly, and +z vertically downward. The permeability, //, is constant within each cell (i.e. piecewise constant) and the potentials, are placed at cell corners (i.e. the nodes). Thefieldvalues, H], are placed at the centres of the cell edge interfaces: there is one component per edge depending on the edge location. This grid leads to a node-centred discretization scheme. Location of thefieldsresults from (B-1 c) requiring continuity of tangential H across cell interfaces. The positioning of the potential results from (B-lb): use of central differences to calculate results in both and H being defined at the same points in space, as required by (B-lb). B.3.3 Grid coordinates and lengths Definitions for grid coordinates and lengths for the field formulation are exactly as described in Subsection 3.3.3 for the flux formulation. The three dimensional volume tb be modelled is divided into nc = nx-ny-nz rectangular cells, which are denoted as being in nx rows, ny columns and nz layers. In the derivations that follow it is useful to consider a one dimensional problem and refer to the discretized line in Figure B.2 below. 181 V- t X \ W LJX 2 X Hf V i - 3 X u u X *4 - t. Xnx nx ^nx+\ X 1 x Figure B.2 Position and numbering of discrete variables in the field formulation for a one dimensional discretization. B.3.4 Variable numbering Numbering of the variables for the field formulation follows the conventions described in Subsection 3.3.4 for the flux formulation. Let the number of nodes be nn = (nx+\)-(ny+l)-(nz+\) There are then nn unknown fa quantities, nx-(ny+l)-(nz+{) unknown H* values on the x-edges (i.e. edges parallel to the x-direction), (nx+\)-ny-(nz+\) unknown H y values on the v-edges and (nx+l)-(ny+l)-nz unknown H\ values on the z-edges. The number of unknownfieldvalues is equal to the number of edge interfaces, ne = «x-(«y+l)-(«z+l) + (nx+l)-«v-(nz+l) + («x+l)-(ny+l)-nz = nex + ney + nez B.4 Finite volume discretization As in the flux formulation, a finite volume discretization is used to formulate a numerical solution. The equations to discretize are L/V-(//H)tfV = 0 !vYldv = ! V0dv (B-2a) (B-2b) v B.5 Discretizing the divergence equation The volumes of integration for the divergence equation are now the dual grid cells, defined so that the potentials are at their centres (i.e. they are the Voronoi cells defined by the nodal points 182 of the true grid - refer to Figure 3.9). To approximate (B-2a) it is integrated over each dual grid cell. Gauss's Theorem is applied to obtain \v^-fMdv = \slMnds =0 (B-3) H assumed constant over each dual grid cell face. With field defined as positive-out, the integration over a dual cell with centre (x„ y-, zt), divided by dual cell volume, is y V \ fjR-n x s dS = (T] 2j,kH \i2j,k - Vi-i/2j,kH i.y j,kVAx x i+U X i+ 2 iA ij+\l2,k~ lij-\l2,kH ij.\l2,k)IAyj.\ 7 > (B-4) + ( VijM 1 l2.H ;j,k+112 - TJij,k-1 lllfljjt-112)1Azk-1 - 0 Z where the subscripted indices on the fluxes indicate their x, y and z locations. The r\ values are arithmetic averages of the permeability around the field quantities. They are of the form (hy jhz [i j+i/2^+1/2 k = , MI2 hy j-\h kMi+\i2j-\i2,k+\i2 z + + hy • • • hy jhz _ fj. j j_ k l j+l/2 +U2 U2 +••• ^ j_ hz _ pL j_ _ x k x -» MI2 xl2k xl2 -,/ (B-5) 4Ay._,Az _ i 1 Note that we have arrived at the use of arithmetic averages without having to decide the form of the equations to discretize as was necessary in Section 3.6 for the flux formulation. The boundary conditions are used to close the discretization. In order to simplify the following discussion, consider the case where the prescribed boundary fluxes are equal to the constant inducing field Ho = (HQ , HQ , HQ ). AS discussed in Section 3.7, the incorporation of the prescribed boundary fields must occur outside of the defined grid of permeable cells (the true grid). This requires the addition of a half layer of padding cells to the true grid, all of which have permeability offreespace, JUQ. X }! Z A cell on a dual grid boundary face would produce an equation of the form ( Vi+1 /2J,kH*i+1 /2J,k ~ MoHox)/AXj. l + (^+1/2,^(7+1/2^- Tlij-uijfP'ij-mjdlkyj-x + (.TliJ,k+\l2H ij,k+\l2 - VijJiAnH ij,k-\l2)lAZk-\ Z 1 (B-6a) =0 or Vi+1 /2j,kH i+\/2J,k/A*/, i X + ( 1 ry0'ij+ I n,k - fly. i njcfPij. i i2,k)l^yj-1 + ( ViJM 1 l2H ij£+112 ~ Vijjk-1 llff ijjc-112)1 AZk-1 Z (B-6b) = /JQH} /AXj. \ x Similarly, a cell on a dual grid boundary edge would produce an equation of the form 183 Vi+1 /2j,kH i+112jjj AX/, l X + (B-7) 'Hij+i/2,kH 'ij+\/2,k/A.yj.i } and a cell on a grid boundary corner would produce an equation of the form Vi+1 i2j,kH i+1 njjj Axi-1 X (B-8) + 'Hij+\i2,kH 'ij+\i2,klAyj.\ > + riij,k+\i2H ij,k+\i2 ^ k-\ Z l = /JoHo /Axj.\ z x + /UoHoylAy jA + /JoH /Az .i 0z k When all such equations (i.e. one for each dual grid cell) are combined, the matrix-vector equation obtained is DMH = q (B-9) Here, the divergence matrix D is nn by ne; the vector H is length ne and holds the unknown field values; and the vector q is length nn and contains nn-(nx-\)-(ny-\)-(nz-\) non-zero elements arisingfromthe prescribed boundaryfields.This number corresponds to the number of dual cells lying on the boundary of the true grid. The matrix-vector equation (B-9) can be split into parts: D M [M [D D D ] x y z L x 0 0 M 0 H =q 0 1[ H i 0 y 0 H =q y J L M I I y y I I (B-lOc) I The matrices M, D , D and D are as follows. x y (B-lOb) J H D M H +D M H,+D M H =q 1 (B-lOa) z 184 «.V(»I'+1)(HZ+1) M = (B-H) (nx+\)ny(ni+\) (iix+l)iny+l)nz where rjj, is defined at the same position as HI D. D (B-12a) D Ax~ - Ax! -1 Ax^ 1 D (B-12b) - Ax nx~\ Ax -i MX-1 Axl 1 Here, the dual grid cell length Axo enters the discretization because we have added an extra layer of padding cells to the grid. The 0 and -1 diagonals of D are filled. D is («x+l) by nx and is diagonally tiled (ny+\)-(nz+\) times to create D . Hence, D is nn by nex. th st x x x x D. D. = D. (B-13a) D. 185 0 Av" 1 0 -Ayf Av," D. = (B-13b) -Ay Aym - 1 HV-1 Av; The 0 and -(nx+l) diagonals of D are filled. D is («x+l)-(«v+l) by (nx+l)-«y and is th th y y diagonally tiled («z+l) times to create D . Hence, D is nn by ney. y y -i Az, 0 Az -i Az, -Az," D, = (B-14) Az" AzH Z - 1 -Az \th tn The 0 and - ( « x + l ) ( « y + l ) diagonals of D are filled. D is nn by nez. th z z B.6 Discretizing the gradient equation In the field formulation, one does not need to decide the form of the gradient equation to discretize as was necessary in Section 3.6 for the flux formulation. The gradient equation is first split into three parts, each corresponding to a different Cartesian direction: (B-15a) (B-15b) (B-15c) 186 Consider (B-15a). The volume of integration for this equation runs along a cell edge in the xdirection so that an unknown field H* is at the centre of the integration volume. The integral is •Vj+| . ' > l / 2 Z*+l/2 J | JH dzdydx = J J 0dzdydx X j-l/2 - A - l / 2 i x - , v x (B-16) .V/-1/2 /t-l/2 z Assuming that H is constant over the integration volume leads to x •V,-+i Vy+1/2 j J #, i/2,;,* + j-l 12 .V , -> , /+ ]dz= j-£-dx -i-l / 2 V ; + | z,. / 2 j dy './-1/2 + 1 / 2 jdz (B-17) -t-1/2 Assuming d</>/dx = A^/Ax and dividing by the integration volume yields H. (B-18) hx. The matrix-vector equation obtained is H =G 0 x (B-19) x When (B-15b) and (B-15c) are discretized in a similar fashion, the following system is obtained: H = G <» Hy = Gy<|> X (B-20a) (B-20b) (B-20c) X H =G ^ Z Z which can be combined to give X" = H H z (B-21a) y G G,_ = G (B-21b) In (B-2 lb), the gradient matrix G is ne by nn and the vector H is length ne and holds the unknown field values. The vector (j) is length nn and holds the unknown potentials. The matrices G , G and G are as follows: x y z 187 (B-22a) hx hx 1 x 1 x hx 2 G„ = hx 2 (B-22b) /a nx The 0 and 1 diagonals of G are filled. G th st x x is nx by (nx+l) and is diagonally tiled (ny+\)-(nz+\) times to create G . Hence, G is nex by nn. x x G. = (B-23a) •hy; 1 0 0 /rv" -hy\ G. = (B-23b) hyy~,, h -hy'n The 0 and (nx+\) diagonals of G are filled. G is (nx+l)-ny by (nx+\)-(ny+\) and is th th v y diagonally tiled (nz+l) times to create G . Hence, G is ney by nn. y • hz; ] o y o hz; - hz. G, = hz~. — hz. — hz„ The 0 and (nx+\y(ny+\) diagonals of G are filled. G is nez by nn. th (B-24) hz'. th z z 188 hz~ B.7 Derivation of the secondary field formulation B.7.1 The total field formulation Above, a finite discretization was performed on the governing equations V-//H = 0 H = V(j> (B-25a) (B-25b) to yield a method of numerical solution. The discrete matrix-vector equations obtained were DMH = 0 H = G<t) (B-26a) (B-26b) In (B-25), thefield,H, and potential, (j), are representative of the totalfield.That is, they contain contributions from both the primary inducingfieldand the secondary anomalous response: <t>=<t>o + <f> H = H + H, (B-27a) (B-27b) s 0 In order to solve for the anomalous secondaryfieldusing this formulation, (B-26a) and (B-26b) are combined and onefirstsolves for the vector of total potentials, <> | , in DMG(j) = q (B-28) The total potentials are then used to calculate the totalfieldthrough (B-26b) and the secondary field is obtained after subtraction of the primary field: H =G^-H S (B-29) 0 Because the secondary field values may be considerably small compared to the primary field values, some accuracy may be lost through machine precision and rounding problems. To avoid this significant difficulty, the secondary potentials can be solved for directly and the secondary fields can be calculated directly from these. B.7.2 Secondary formulation approach 1: formulate first, discretize second As in the flux formulation, there are two possible approaches to developing a secondary formulation in which the secondary potential is solved for directly. Here, the governing equations (B-25) arefirstreformulated to yield equations containing the primary and secondary quantities. Consider the primary response to be that with jU=jiio and H = Ho (a constant). (B-25) is then 189 V-//oH = 0 H = V#> (B-30a) (B-30b) 0 0 For a combined primary and secondary response, (B-25) can be decomposed to yield V-ju(U + H,) = 0 Ho + Hf = V($> + (j) ) (B-31a) (B-31b) 0 s Combining (B-30) and (B-31), and using (2-4) to relate fl to x> yields V / / H = -M)VCrH0) H,= (B-32a) (B-32b) 5 Combining (B-32a) and (B-32b) leads to the div-grad system for the secondary potential: VjuVfa = -A)V(jH ) (B-33) 0 One can proceed with the finite volume discretization of (B-32) as in Sections B.2 through B.6. The gradient equation (B-32b) would yield a similar matrix-vector equation as before: H = G4» s (B-34) s The divergence equation (B-32a) is integrated over each dual grid cell and Gauss's Theorem is applied to obtain Js//H-n ds = -{Jol xKo-nds (B-35) s Note the subscriptV has been dropped and all quantities are secondary unless otherwise indicated (e.g. Ho). H is assumed constant over each dual grid cell face and field is defined as positive-out. After dividing by the dual cell volume, the left and right integrals in (B-35) over a dual cell with centre (x„ yj, z^ are V \sjuR-ndS l = (r/i+uij^i+injA- "ni-\njjJTi-mjj)IAxi-\ + (Tlij+injclfij+injc ~ Tii .mjfl'ij.mj)lbyj-\ j (B-36a) + (VijMV2H~ij,k+\/2 - VijMnH ij,k-\n)lAzk.\ = 0 2 F'Js^Ho-n dS = (K mj\k- Kj.\i2j,k)Ho IAx i+ x iA + (K , , ,k - Kj-1 i2,k)Ho Jkyj-1 iJ+ 2 y + (*UA+l/2 - K l2)HoJAZk-\ iJM (B-3 6b) =0 The T] and rvalues are arithmetic averages of the permeability and susceptibility respectively around the H quantities and are defined as in (B-5). 190 The secondary boundary conditions are used to close the discretization and the matrix equation obtained is DMH - -//oDKHo + g (B-37) S where M has non-zero diagonal elements rj and K has non-zero diagonal elements K. B.7.3 Secondary formulation approach 2: discretize first, formulate second Here, a secondary formulation is developed though the opposite procedure: discretize first and then split the matrix quantities into primary and secondary quantities. (B-27) and (4-9) are used to decompose (B-26) into DM(H + H ) = f + g H + H = G(^ + k) 0 0 (B-38a) (B-38b) s s 0 The equations for the primary response are DMoHo = f H = G4> 0 (B-39a) (B-39b) 0 Mo has non-zero diagonal elements T]o = /Jo so that M = //oI (I is an ne by ne identity matrix). Substitution and elimination using (B-38) and (B-39) yields 0 DMH = D(-M + A)I)H + g H = G(|) S s (B-40a) (B-40b) 0 s B.7.4 Comparison of the Two Approaches The first approach yielded DMH = -// DKHo + g H = S (B-41 a) (B-4 lb) 0 s and the second approach yielded DMH = -D(M - // I)H + g H = Gk S 0 (B-42a) (B-42b) 0 s (B-41 a) and (B-42a) are only equivalent if //oK = M-/4>I (B-43a) 191 or K=/Ao ri-\ (B-43b) A with /rand 7] defined as in (B-5). Because /rand 7] are arithmetic averages, it is simple to show that (B-43b) is equivalent to J = M>"Vl (B-44) which is just a rearrangement of (2-4) and, therefore, the two approaches are identical. 192 Appendix C Notable observations from the inversion tests This appendix discusses some of the notable observations from the inversion tests of Chapter 11. The topics discussed here leave much room for further study. C l Depth weighting Recall from Section 9.7 that the depth weighting function introduced into the inversion algorithm was a function appropriate for the approximate linear problem. It was initially unclear how the depth weighting function needed to be altered for the nonlinear problem. The experiments of this chapter have shown the linear depth weighting to be reasonable. Hence, the conclusion can be made that this method of depth weighting should be generally acceptable. C.2 Practical choice of the initial model The re-parameterization of the model to hold square-root susceptibility values introduces a practical issue with regard to the choice of the initial model. Consider the square-root function, pictured in Figure C l below. 2 1.5 •G)L 0 , , . J 0.5 1 x; 1.5: 2 Figure C l The square-root function. When situated near the origin (i.e. x«\ in Figure C l ) , large changes in the square-root function, sqrt(x), correspond to relatively small changes in x. Therefore, if the initial model contains small values, the model perturbation calculated using the Gauss-Newton method and 193 subsequent line-search may be such that although it is a large perturbation in model-space (i.e. square-root space) it is a small perturbation in susceptibility space. Neither the susceptibility distribution nor the predicted data would change significantly through such a perturbation and consequently, neither would the objective function. The inversion algorithm would continue to make only minor reductions to the objective function until the model values became large enough that the process accelerated to a practical rate. This supposition was confirmed through experiment. To avoid this problem, one may consider the approach of setting the initial model to large values throughout the active grid. However, this causes difficulties for cells that are close to the boundaries. Any susceptibility placed close to the boundary will have a diminished effect on the solution. This is because the prescribed boundary fluxes are invariant and, therefore, have an increased effect on the flux values close to them. Investigation showed that when large susceptibility values were placed close to the boundary in the initial model, the inversion generally failed to remove this susceptibility in the recovered model. This result indicates that susceptibility near the boundary does indeed have a diminished effect on the forward solution. For example, recall the synthetic inversion test of Section 11.2. There, the initial model contained a square-root susceptibility value of 0.1 (i.e. x 0.01) throughout the active region and the susceptibility in the recovered model was concentrated predominantly in a central location corresponding to the location of the target prism. Here, inversions were performed with an initial value of 1.0 and with all other parameters equal to those in Section 11.2. Below, Figures C.2 and C.3 show two horizontal cross-sections through the recovered model at two different depths: z = Om and z = 8m. Refer to the z-axis in Figure 11.5 for the depth location of these cross-sections with respect to the target prism, z = Om corresponds to the centre of the target prism and z = 8m lies 6m below the bottom of the target prism. As expected, the inversion was not able to completely remove the susceptibilities in the outer grid portion, as evident in Figure C.3. = 194 -14.00 -7.00 0.00 y 7.00 14.00 Figure C.2 A horizontal cross-section at z = Om through the model recovered from inversion of the synthetic test data for an initial model containing a value of 1.0 throughout the active region. The position of the synthetic target prism is indicated by a black and white dotted rectangle. 195 •14.00 -7.00 0.00 y 7.00 14.00 Figure C.3 A horizontal cross-section at z = 8m through the model recovered from inversion of the synthetic test data for an initial model containing a value of 1.0 throughout the active region. The data predicted by the recovered model are displayed in Figure C.4 below. When the susceptible material in the outer region of the recovered model was removed (i.e. set to zero) and the data for the central body was modelled, the data was very similar to the predicted data in Figure C.4. Figure C.5 shows this forward modelled data and Figure C.6 shows the difference between the two data sets (i.e. those in Figures C.4 and C.5). 196 -13.25 -8.83 -4.42 0.00 Easting (m) 4.42 1.83 13.25 nT 13595.7 -4406.8 Figure C.4 Predicted data map and profile for the model shown in Figures C . 2 and C . 3 . 197 14165.8 -4406.8 Figure C.5 Forward modelled data map and profile for the central body shown in Figure C.2. The predicted data profile from Figure C.4 is also included in the profile, with the predicted data profile in black and the forward modelled data profile overlaid in grey. 198 i -13 - i 9 - i 4 i 0 Easting (m) i 4 i 9 i 13 nT Figure C.6 The difference between the data in Figures C.4 and C.5. The maximum and minimum values are 579.205 and -664.825. C.3 Regional removal For the linear inverse problem, if high susceptibility is placed close to the grid boundaries in the recovered model it is an indication that a regional in the data was not correctly removed prior to inversion. This is discussed in Li and Oldenburg (1998). A regional is some component in the data, usually containing long spatial wavelengths, that is due to magnetic material outside of the region included in the model grid. The linear inversion algorithm will place susceptible material as close as possible to these bodies in order to fit the regional field. The result is high susceptibility adjacent to the grid boundary. In such an inversion, the effect on the central model region (i.e. the region away from the boundary) is often minor. In the nonlinear inversion methods, the active model region is often surrounded by an inactive region of free-space padding cells. In this situation, the inversion is not free to place susceptible material close to the boundaries in order to help fit any existing regional in the data. Furthermore, from the discussion in Section C.2, even if the inversion was free to place susceptible material close to the boundaries it may not be able to do so such that the regional is successfully fit. Hence, it is expected that presence of a regional in the data will significantly affect the central model region recovered by a nonlinear inversion. Therefore, regional removal from survey data becomes much more important when it is to be inverted with the nonlinear methods for the full magnetostatic problem. 199 Appendix D Remanent magnetization and anisotropy D . l Introduction When any magnetic object is placed in an external magnetic field it will obtain an induced magnetization. Unlike diamagnetic and paramagnetic materials, ferromagnetic materials can retain a permanent magnetization, or "remanent" magnetization, when the external field is removed (Tarling and Hrouda, 1993). Section D.3 discusses the mechanisms that lead to the acquisition of remanent magnetization. . An important concept related to remanent magnetization is magnetic anisotropy. Without anisotropy, permanent magnetization would not be possible (O'Reilly, 1984). In many magnetic materials, the induced magnetization depends on the orientation of the material with respect to the appliedfield(Tarling and Hrouda, 1993). This phenomenon of anisotropy is a result of energy minimization. It causes certain orientations of magnetization to be preferred. Anisotropy is discussed further in Section D.2. Possible incorporation of the phenomena of anisotropy and remanent magnetization into the modelling methods of this thesis are discussed in Sections D.4 and D.5. D.2 Magnetic anisotropy There are three types of magnetic anisotropy: magnetocrystalline anisotropy is a function of crystal composition, shape anisotropy is dependent on grain shape and size, and strain anisotropy results from crystal deformation. Every ferromagnetic material has certain crystallographic directions (i.e. axes or planes) along which the magnetic moments of the lattice ions prefer to lie (Stacey and Banerjee, 1974). This phenomena, which depends on the crystal structure, is called magnetocrystalline anisotropy. In the absence of an appliedfield,the magnetization of the material will lie along the preferred directions in order to minimize the magnetostatic energy associated with the crystal lattice (Tarling and Hrouda, 1993). Shape anisotropy arises when a magnetic particle lacks spherical symmetry (O'Reilly, 1984). It is a product of the magnetostatic forces generated by interaction of the surface charges on the particle (Butler, 1992). Shape anisotropy is related to the concept of self-demagnetization, introduced in Section 1.1. For example, a prolate, spheroidal magnetic grain will have a shape anisotropy leading to a preferred magnetic alignment along its long axis. 200 Strain anisotropy is intimately related to magnetocrystalline anisotropy and results from deformation of a crystal lattice due to strain. This effect can modify a pre-existing magnetocrystalline anisotropy or can cause an intrinsically isotropic material to become anisotropic (Stacey and Banerjee, 1974). Strain anisotropy becomes important wherever magnetic rocks are in a process of deformation. D.3 Remanent magnetization and hysteresis Ferromagnetic material is characterized by hysteresis, or irreversibility of magnetism. In this phenomenon, the magnetic polarization of a ferromagnetic material depends not only on the field to which it is exposed at a particular instant but also on its magnetic history (Stacey and Banerjee, 1974). Hysteresis curves, such as that in Figure D.l below, demonstrate the dependence of rock magnetization on the applied field. The characteristics of a hysteresis curve for any substance are determined by the balance between the internal lattice forces and the forces exerted by the applied field (Butler, 1992). Adapted from Butler (1992) Figure D . l A typical hysteresis curve for a ferromagnetic substance. The inset figures show spheroidal magnetic grains and their moments within a typical anisotropic material. 201 To understand how a magnetic material can obtain a remanent magnetism, or "remanence", consider a rock sample consisting of randomly oriented anisotropic ferromagnetic mineral grains within a non-ferromagnetic matrix, as depicted in the insets of Figure D.l. Here, the particles are prolate spheroids and their preferred magnetization directions are along their long axes. Initially, there is no externally applied field and the mineral grains are in their minimum energy states with magnetic moments in the preferred directions. Assuming no prior remanence, the combined effect of the randomly oriented particles will be a net magnetization of zero. This corresponds to the origin (point 'O') on the hysteresis curve in Figure D. 1. Anisotropy produces energy barriers that resist the rotation of the particle magnetic moments away from their preferred directions. When an external field, H , is applied, the magnetic moments of the individual grains are forced to rotate awayfromtheir preferred directions and toward the direction of H . This results in a net magnetization roughly in the direction of H and corresponds to point 'A' in Figure D.l. As the magnitude of H is further increased, some grain moments will rotate past the energy barriers associated with their anisotropy (i.e. past the spheroid short axis symmetry plane). At a sufficiently strong applied field, the magnetization of each grain becomes parallel to H and the material is said to be saturated, with saturation magnetization M . This corresponds to point 'B' in Figure D.l. When the external field is removed, the grain moments will rotate back to their nearest preferred directions (i.e. the nearest spheroid long axis). This corresponds to point ' C in Figure D.l. Not all grain moments will be in their initial directions and there will result a remanent magnetization, M , in the direction of the previously applied field. The ratio |1VL|/|]VL| can be used as a measure of efficiency of acquiring remanent magnetization for the specific sample (Butler, 1992). To force the net magnetization back to zero (point 'D' in Figure D.l), an opposing external field must be applied. The magnitude of this field, H , is called the "bulk coercive force" (Butler, 1992). ex ex ex e x ex s r c Remanent magnetization is not possible without anisotropy. Perfectly isotropic materials do not show hysteretic effects and can not retain a remanent magnetization. This is because there are no energy barriers preventing randomization of the grain moment directions after removal of the inducing field. There is a temperature called the Curie point above which ferromagnetic properties disappear. Above this temperature, thermal agitation overcomes the electron coupling interactions that produce ferromagnetic characteristics (i.e. mutual alignment of atomic magnetic moments) and the material then behaves paramagnetically (Stacey and Banerjee, 1974). Rocks are magnetized by the Earth's field as they cool and the so-called "thermoremanent magnetization" (TRM) acquired (as the temperature decreases below the Curie point and ferromagnetic properties are restored) isfrequentlyvery stable (Stacy and Banerjee, 1974). The direction of the remanent magnetization depends on the geomagnetic field at the time of cooling. There are several ways that a prior remanent magnetization can be removedfroma sample of magnetic material. One method is to heat the sample past its Curie temperature and cool. Another, important in UXO applications, is "shock demagnetization", in which a forceful 202 collision applies enough energy to the unexploded item to "knock" the prior magnetization out of the object (Nelson, 1998). In either method, the object will re-magnetize, through induction, in the direction of the appliedfieldpresent at the time. This may lead to a remanence at some point in the future if the object is reoriented with respect to the Earth's field. D.4 Modelling anisotropy The methods in this thesis do not consider magnetic anisotropy. Here, a discussion is provided on extension of the methods to include this phenomenon. The question of whether or not to be concerned with anisotropy in the inverse problem depends on the application. If the geology in the region of interest is known to contain significantly anisotropy on scales similar to model cell dimensions then this information needs to be incorporated into the inversion. In Sections 2.2 and 3.2, Maxwell's equations for source-free magnetostatics led to the following governing equations for the flux formulation: V-B = 0 B = /zv> (D-la) (D-lb) (D-lb) is valid for any isotropic, linear medium. It is possible to introduce anisotropy into the methods presented. In anisotropic materials, the permeability has some directional dependence. The permeability can no longer be written as a scalar and becomes the following second-order tensor: M M, /A, M= M /'„ My. M: M* xx x yx M :X v (D-2) so that (D-lb) becomes M xx B, B z X zx M My M. X = My M My v xz (D-4) r M:y (Tarling and Hrouda, 1993). In (D-2), ////=//,-,• so that there are only six independent components of the permeability tensor. If the non-diagonal elements in (D-2) equal zero then (D-3) can be split into three simple parts, one for each Cartesian direction. The following governing equations are obtained: 203 V-B = 0 (D-4a) (D-4b) (D-4c) (D-4d) B = jU V <p xx x x B = ju V 0 yy y B z y = /4zV 0 z The same finite volume discretization of Chapter 3 could be performed with this new set of equations. The resulting discrete matrix-vector equations would be essentially unchanged: DB = f + g (D-5a) 0 B - y 0 M 0 0 " x 0 y y 0 G M (D-5b) * z . where (D-5b) is of the form B = MG<() with M a diagonal matrix. For the forward problem, the number of unknown potentials remains unchanged but there are now three times the amount of known model parameters (i.e. ju , ju and /4 for each model cell). xx yy Z Use of (D-4) would yield a much larger inverse problem. A suitable model objective function would now have twelve terms instead of the original four. The associated weighting functions would allow incorporation of any a priori information about the geology that could have consequence on anisotropy (e.g. known directions of foliation). If the off-diagonal elements in (D-2) are considered, the forward and inverse problems become even larger. The governing equations become VB =0 B = ju V </> + fjsyVyf + n V <p B = jU V <t) + JUyyVyf + /ly^' (j) (D-6a) (D-6b) (D-6c) B = jLt V (i> + jLkyVyt + ju V (/) (D-6d) x xx } yX z zx X2 x z X Z x zz z and a finite volume discretization would yield the discrete matrix-vector equations DB = f + g B B = M. M x y y z M * y M (D-7a) G x G y (D-7b) H where (D-7b) is of the form B = MGkj) with M no longer a diagonal matrix. The discrete equations and constituent quantities are identical to those derived in Chapter 3 with the exception of the matrix M , which remains sparse but now contains three times as many non-zero elements of homogenized permeability. 204 D.5 Modelling remanent magnetization The methods in this thesis do not consider the presence of remanent magnetization. Here, a discussion is provided on extension of the methods to include this phenomenon. The question of whether or not to be concerned with remanent magnetization in the inverse problem depends on the application. If there is no anisotropy on any scale then there can be no remanent magnetization. If remanence is possible, one needs to determine whether a significantly large and problematic remanent magnetization in a direction other than the present inducing field is probable. If remanent magnetization needs to be modelled, a convenient law for low fields (i.e. less than IT) is M = ^ H +Mr (D-8) where M,- is a vector quantity that is not dependent on time or on the magnetic field, H , and is zero outside the magnetic material (Bossavit, 1998). The total flux is now the sum of the inducingfield,Bo, an induced component, B „ and a remanent component, JS : r (D-9) B = B + B, + B, 0 where B,. = ^ " ' M (D-10) r and M does not exist in free-space. The governing equations become r V - B = -v-^-'rvi,) (D-l la) (D-l lb) Here, M enters the problem as a source term in (D-l la). r The question of where best to define the M,- quantities in the discrete grid is a significant one, especially considering the relationship between remanent magnetization and anisotropy. Depending on the placement of the M,- quantities, the resulting discrete FVD forward modelling equations may change significantly from those in Chapter 3. Furthermore, approximation of the boundary conditions and homogenization of the permeability values may become more complicated. In the integral equation domain, the system of equations to be solved for the magnetizations in each cell would only change slightly from (7-8) to become 205 Mf = X . ' 1 P H o , + — DC f 1 ?=1 k=\ I \ IXfM* f + Mf ; le {1,2,3} , {l,2 nc} (D-12) it Here, Mf is the / component of the remanent magnetization in cell p. The system of equations is now consistent with (D-8). Mf are known quantities in the forward problem and (D-12) is solved for the unknown full magnetizations M f . Introduction of remanence would yield a larger and more complicated inverse problem. Additional model objective function terms may be required for size and smoothness of the remanence, as well as those necessary for dealing with anisotropy. The direction of remanence may be relatively constant and related to some past inducing field direction. Knowledge of this information may simplify the inverse problem. However, the remanence will also be related to the susceptibility distribution, and more importantly, to the degree and direction of anisotropy. An appropriate inversion should take these relationships into account. This would require "cross terms" in the model objective function that measure some correspondence between the anisotropy and the remanence magnitude and direction. It is not immediately clear how these cross terms should be defined. 206
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Forward modelling and inversion of geophysical magnetic...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Forward modelling and inversion of geophysical magnetic data Lelièvre, Peter George 2003
pdf
Page Metadata
Item Metadata
Title | Forward modelling and inversion of geophysical magnetic data |
Creator |
Lelièvre, Peter George |
Date Issued | 2003 |
Description | The ultimate goal of this research was to invert geophysical magnetic data to recover three dimensional distributions of subsurface magnetic susceptibility of any possible magnitude and geometric complexity. Magnetic data collected over bodies of high susceptibility contain significant self-demagnetization effects. Self-demagnetization causes magnetizations to rotate away from the external inducing field and causes the amplitude of the magnetic response to scale nonlinearly with susceptibility. These effects are highly dependent on the shape of the object and they complicate interpretation. Examples where self-demagnetization is important include surveys for detection and discrimination of unexploded military ordnance (UXO) and mineral exploration surveys over highly mineralized banded iron formations and nickel deposits. Current modelling methods that account for self-demagnetization effects are limited to simple bodies, such as ellipsoids, where the geometry of the body is represented by a few parameters. Standard forward modelling methods for general susceptibility distributions (i.e. methods that can deal with complicated bodies) neglect the effects of self-demagnetization and can produce inaccurate results and subsequent deterioration in performance of the inverse solution. Here, a full solution to Maxwell's equations for source-free magnetostatics was developed using a finite volume discretization. The Earth region of interest is discretized into many prismatic cells, each with constant susceptibility, which allows for models of arbitrary geometric complexity. The finite volume forward modelling method is valid for any linear medium and is appropriate for modelling the response of highly magnetic objects. The forward modelling method was refined and the code was tested against analytical solutions for simple bodies and against a slower, more memory intensive full solution for general distributions formulated in the integral equation domain. All tests showed the forward modelling method to be sound within expected error tolerances. The finite volume modelling method formed the foundation for a subsequent inversion algorithm. In the discretization, many more model cells are used than there are data. As such, the inverse problem is underdetermined. The inverse problem was formulated as an unconstrained optimization problem in which an objective function is minimized. The objective function was designed so that the data are fit to an acceptable degree and the recovered model has desired spatial characteristics. The resulting optimization problem was nonlinear and required an iterative solution, for which a Gauss-Newton approach was used. Testing for the inversion code included inversion of synthetic data for simple bodies and inversion of survey data collected over a planted UXO target. All tests showed positive results. |
Extent | 9545240 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
File Format | application/pdf |
Language | eng |
Date Available | 2009-10-17 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0052563 |
URI | http://hdl.handle.net/2429/13931 |
Degree |
Master of Science - MSc |
Program |
Oceanography |
Affiliation |
Science, Faculty of Earth, Ocean and Atmospheric Sciences, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 2003-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- 831-ubc_2003-0148.pdf [ 9.1MB ]
- Metadata
- JSON: 831-1.0052563.json
- JSON-LD: 831-1.0052563-ld.json
- RDF/XML (Pretty): 831-1.0052563-rdf.xml
- RDF/JSON: 831-1.0052563-rdf.json
- Turtle: 831-1.0052563-turtle.txt
- N-Triples: 831-1.0052563-rdf-ntriples.txt
- Original Record: 831-1.0052563-source.json
- Full Text
- 831-1.0052563-fulltext.txt
- Citation
- 831-1.0052563.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0052563/manifest