Open Collections

UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Joint inversion for correlated models in linear inverse problems Liu, Yidong 1994

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata


831-ubc_1995-0055.pdf [ 4.85MB ]
JSON: 831-1.0079679.json
JSON-LD: 831-1.0079679-ld.json
RDF/XML (Pretty): 831-1.0079679-rdf.xml
RDF/JSON: 831-1.0079679-rdf.json
Turtle: 831-1.0079679-turtle.txt
N-Triples: 831-1.0079679-rdf-ntriples.txt
Original Record: 831-1.0079679-source.json
Full Text

Full Text

J O I N T  I N V E R S I O N  F O RC O R R E L A T E D I N V E R S E  M O D E L S  I N  L I N E A R  P R O B L E M S  By Yidong Liu B. Sc. (Mathematics) Shandong University, 1985 M . Sc. (Mathematics) Academia Sinica, 1988  A THESIS SUBMITTED IN PARTIAL F U L F I L L M E N T O F T H E REQUIREMENTS  FOR T H E DEGREE OF  M A S T E R OF SCIENCE  in T H E F A C U L T Y O F G R A D U A T E STUDIES D E P A R T M E N T O F M A T H E M A T I C S & INSTITUTE O F A P P L I E D M A T H E M A T I C S  We accept this thesis as conforming to the required standard  T H E UNIVERSITY O F BRITISH C O L U M B I A November 1994 © Yidong Liu, 1994  In  presenting  degree at the  this  thesis  in  University of  partial  fulfilment  of  this thesis for  department  or  by  his  or  scholarly purposes may be her  representatives.  permission.  Department The University of British Columbia Vancouver, Canada  DE-6 (2/88)  for  an advanced  Library shall make  it  agree that permission for extensive  It  publication of this thesis for financial gain shall not  Date  requirements  British Columbia, I agree that the  freely available for reference and study. I further copying of  the  is  granted  by the  understood  that  head of copying  my or  be allowed without my written  Abstract  Inverse problems arise in many fields. They are usually ill-posed since they often violate one or more of Hadmard's three conditions for well-posedness: existence, uniqueness and stability. In this thesis, we propose a new method for computing approximate solutions in certain linear inverse problems. We consider linear inverse problems based on integral equations of the first kind. Analysis of Picard's condition reveals that such equations may lead to ill-posed problems which may have no solution satisfying the observed data exactly and stably, but may have infinitely many solutions satisfying the data approximately. To get a unique and stable solution to this kind of inverse problem, we use Tikhonov's Regularization Method. To obtain the best possible approximation to the true model, we should use any and all available information regarding the true model, although we can not expect to get sufficient data. For example, it is standard practice to use the positivity of the model in inverting magnetic and IP data, and to use special weighting functions in solving magnetic problems. The key feature of the present work is a method that exploits the correlation between different model parameters in inverting the geophysical data. To keep different parameters in suitable confidence regions, a new methodology, Combined Inversion, is developed. In combined inversion, different kinds of data are inverted simultaneously. The objective functional imposing the correlation requirement may be neither convex nor quadratic, so corresponding algorithm and code are developed. When the objective functional is not quadratic, we use an iterative method to solve it and approximate the functional with its second order Taylor approximation in each iteration. When the objective functional is not convex, it may have more than one local minimum.  n  To get the minimum which well approximates the true model we should begin with a good initial model. In our case we produce the initial model by solving the combined problem with no correlation requirement. We introduce our method in the context of two practical geophysical inverse problems: the magnetic problem and the Induced Polarization (IP) problem. As we expect, regularization smooths the inverted models, so some model characteristics are lost in the recovered models.  Our numerical examples confirm the smoothing effects of the  regularization operators. Since magnetic susceptibility and chargeability are negatively correlated, we introduce a nonquadratic, nonconvex "correlation function", whose sub-level sets define confidence regions for the vector of susceptibility and chargeability. Then we require our recovered models to be in the confidence region. The recovered models from combined inversion method are significantly better than those from independent inversion. This method should be useful in practical prospecting when several kinds of data are available and there is some correlation among the parameters. This is the case in mining industry where several kinds of geophysical data are usually measured at the same time and the different parameters producing the data are known to be correlated. If we approximate the correlation with, a reasonable functional, we may reconstruct models satisfying the corresponding correlation.  in  Table of Contents  Abstract  ii  List of Tables  vii  List of Figures  viii  Acknowledgement  x  1  1  Introduction 1.1 1.2  1.3 2  Linear Inverse Problems In Geophysics  .'•  1  Two Practical Problems  2  1.2.1  Magnetic Problem  .  1.2.2  Induced Polarization Problem  2 5  Combined Inversion  8  Linear Inverse Problems  12  2.1  Ill-posed Inverse Problems: Functional Theory .  12  2.1.1  Theoretical Analysis of an Inverse Problem  13  2.1.2  Generalized Solutions and Regularization of Ill-posed Problems . .  15  2.1.3  Construction of Regularization . . .  19  2.2  . . .  Solving Inverse Problems .  22  2.2.1  Discretization of Inverse Problems  22  2.2.2  Direct Methods . . .  25  Iterative Methods  27  .2.2.3  iv  2.3  3  Linearizing Nonlinear Inverse Problems  Two Practical Problems  34  3.1  Magnetic Methods  34  3.1.1  34  3.1.2  3.2  Physical Principles Mathematical Relations  .  37  3.1.3  Forward Problem  38  3.1.4  Inverse Problem  40  3.1.5  A Numerical Example  44  Induced Polarization ,  50  3.2.1  Physical Principles  50  3.2.2  Mathematical Relationship  53  3.2.3  The Forward Problem  55  3.2 4  The Inverse Problem  57  3.2.5  A Numerical Example  61  ;  4  31  Combined Inversion  65  4.1  Known Approaches  65  4.2  Correlation Between Parameters  66  4.3  Combined Inversion  68  4.3.1  Optimization Problem  68  4.3.2  Statistical Description  72  4.4  Computing Techniques  74  4.4.1  Optimization Problem  74  4.4.2  Approximating the Nonlinear Correlation  75  4.4.3  Producing the Initial Model  76  4.4.4  Choosing the Correlation Weighting Parameter  78  v  4.4.5  Balancing the Regularization Operator . . . . . . . . . . . . . . .  79  4.4.6  Working with SLIM . . . . . /  80  4.5  A Numerical Example  4.6  Conclusion . . . . . ..  Bibliography  . .  81 83  .  92  vi  List of Tables  4.1  Chargeability and susceptibility of some minerals  67  4.2  Parameters of regularization operator.  81  4.3  Output with a = 0  4.4  Output with a ^ 0.  ;  c  c  . . .  •  vii  82 83  List of Figures  1.1  Four-electrode D C model  5  1.2  D C current for IP  6  1.3  IP decay voltage.  3.4  Geomagnetic field.  35  3.5  Rectangle geometry  39  3.6  Base function model  41  3.7  Exact magnetic data from the forward problem  45  3.8  Noisy data simulating magnetic observations..  3.9  Magnetic Model.  :  6  .  45  . . . ,  47  3.10 Predicted data from the output model  48  3.11 Difference between observed data and predicted data.  48  3.12 Four-electrode D C model  51  3.13 IP decay voltage  ;  51  3.14 D C current for IP  54  3.15 IP Model  64  4.16 Reconstructed Models with <f> = 0  85  c  4.17 Regularization term for magnetic part, (f>  86  4.18 Regularization term for IP part, (j>  86  K  v  4.19 Total Misfit, <f>  •  d  87  4.20 Correlation Function, (j> (m)  87  4.21 Total sum of the objective function, (j)  88  c  viii  4.22 Total sum of the objective function, <f>  88  4.23 Correlation Function, <j) (m).  89  4.24 Total misfit, </>  89  c  d  4.25 Reconstructed Models with </> = K T] . 2  c  2  . . .  4.26 Reconstructed Models with <^ =function product c  IX  . . . . . ......  .  90 .91  Acknowledgement  I would like to thank my supervisors, Dr. Philip Loewen and Dr. Doug Oldenburg, for their guidance and support throughout my work. I also like to express my gratitude to the friends at the Department of Mathematics and the Department of Geophysics and Astronomy who provide me friendly and supportive environment. I have no doubt that without the strong and continued support from my family, I would not have been able to complete this work as it stands. Their love outlasted the dark clouds. I owe much to them and so does this thesis. Finally I like to thank my wife Rui Zhang who gives me invaluable encouragement and help in these years.  x  Chapter 1 Introduction  This thesis concerns Linear Inverse Problems and their applications in Geophysics. Its main contribution is a combined inversion method that exploits known correlations between the underlying physical properties in linear inverse problems. After introducing the basic principles of linear inverse problems in part one, we work on two practical geophysical inverse problems in part two: Magnetic susceptibility inversion and Induced Polarization (IP) chargeability inversion. In part three we develop the combined inversion method and demonstrate its usefulness by inverting the susceptibility and chargeability jointly, since there is a correlation between these two parameters.  1 . 1 Linear Inverse Problems In Geophysics Geophysical Inverse Problems are to determine, on the basis of measured data, internal properties of the Earth which cannot be measured directly, such as the mass density or the electrical conductivity. They are usually ill-posed as mathematical problems, since they violate one or more of the three properties of existence, uniqueness and stability. For a linear inverse problem, we have to estimate some physical parameter m(x), i.e., a function of spatial coordinates, using only the measured data d and a known linear operator A. Usually m is in an infinite dimensional Hilbert space, so we can not expect to evaluate it exactly from an inaccurate, finite-dimensional vector of data d. This is an ill-posed problem. In practical work, we discretize the function m first: physical laws then take the form 1  Chapter 1.  Introduction  2  of a system of linear equations: Gm = d.  (1.1)  Usually G is an N x M matrix and N <C M. Since there are measurement errors in the observed data d, we may not find any solution vector m satisfying the system. Even if we can satisfy the observed data exactly, we should not solve the system in that way since that kind of solution is usually unstable with respect to measurement errors. Therefore, the discrete problem is still ill-posed.  The theory of regularization suggests that we  determine an approximate solution by solving the following optimization problem:  Here m  0  minimize  </> (m) :=|| W (m  subject to  <f> (m) :=|| W (Gm  m  m  d  d  — mo) | | , 2  - d ) \\ = <^*. obs  (1.2)  2  is called the base model, <f> is called the misfit criterion, W is called the data d  weighting matrix, W  m  d  is called the model weighting matrix and <f>* is called the target, d  misfit.  1.2  Two Practical Problems  1.2.1  Magnetic Problem  A typical linear inverse problem in geophysics is to calculate the magnetic susceptibility in magnetic prospecting. Magnetic prospecting is one of the oldest methods of geophysical exploration and is used to explore for oil, minerals, and even archaeological artifacts. A magnetizable body placed in an external magnetic field becomes magnetized by induction; the magnetization is due to the reorientation of atoms and molecules so that their spins line up. The alignment of internal dipoles produces a field M , which is added to the magnetizing field H. For low magnetic fields, M is proportional to H and is in the —*  direction of H. The degree to which a body is magnetized is determined by its magnetic  Chapter 1.  susceptibility  Introduction  3  K, which is defined by  M  (1.3)  = KH.  Here M, H and K may all vary with position. Magnetic susceptibility is the significant variable in magnetics. It plays the same role as the density does in the gravitational problem. A volume of magnetic material can be considered as an assortment of magnetic dipoles that results from the magnetic moments of individual atoms and dipoles.  They are  aligned by induction in the presence of the Earth's magnetic field ("the main field") H  e  (Ho\H \ ~ 0.5 x 1 0 e  -4  tesla)to produce a local magnetic anomaly. In magnetic prospecting  we estimate the magnetic susceptibility K by measuring the local magnetic anomaly. Magnetic anomalies are caused by magnetic minerals (mainly magnetite and pyrrhotite) contained in the rocks. In any case, we can regard a magnetic body as a continuous distribution of dipoles resulting in a vector dipole moment per unit volume, M. Each dipole at the origin causes an anomalous field at r according to F(r)  = V{M(f)- V(l/|f|)} = V{KH  £  • V ( l / | r | ) } when M = nH .  (1.4)  e  Magnetic anomalies caused by intrusions, flows, or iron-rich sedimentary horizons are common features in regions favorable for mineral exploration, and there is frequently a contrast between the magnetic mineral content of such features and that of the host rock. Usually magnetic polarization is in the H  e  direction and M = nH . e  To get the  data about our model, we partition the area of interest into a series of cells and assume that the susceptibility in each cell is constant. Thus, we can calculate the projections of the magnetic anomaly field in the direction of Earth's magnetic field. With the magnetic anomaly field F as the measured data and the corresponding  Chapter 1. Introduction  4  physical law (1.4) as the linear relation we get a standard linear inverse problem. It can be solved for the unknown functional K through general methods for inverse problems. In practical magnetic prospecting, data are gathered using magnetometers [4]. One type of magnetometer is a magnetic balance which measures the anomaly field values relative to a reference value. This type of magnetometer measures the magnetic anomaly field F directly. The other type of magnetometer measures absolute values of the magnetic field. Since this type of magnetometers are robust and extensively used, we show how can we compute one component of the anomalous field from the absolute value of the magnetic field. Indeed, if H is the main field and H is the anomalous field, then we can determine AH e  a  experimentally, where AH=  \H +  H \-\H \  e  a  e  = (H + H , H + Ha) /  1 2  e  a  e  — \H \ e  « [{He,He) +-2(H ,H )]  - \H \ ( since \H \ <  1/2  e  a  e  a  \H \) e  n(H ,H )/\H \ e  a  e  That is, AH can be used as the approximation of the projection of anomalous field in the He direction. Now the measured data are actually the magnetic flux density B: we calculate the magnetic field from H = where UQ = 47r X 1 0  -7  B/a , 0  weber/ampere-meter is the permeability of free space.  Chapter 1. Introduction.  5  1" X  Chargeable body  Figure 1.1: Four-electrode D C model. 1.2.2  Induced Polarization P r o b l e m  The other inverse problem we will consider in this thesis is to calculate the chargeability t] in Induced Polarization (IP). Induced Polarization is a relatively new technique in geophysics, and has been employed mainly in base-metal exploration and to a minor extent in ground water search. An illustration of induced polarization can be obtained with a standard four-electrode Direct Current (DC) resistivity spread by abruptly interrupting the current [20] (figure 1.1, 1.2).  The voltage across the potential electrodes generally does not drop to zero  instantaneously, but decays rather slowly, after an initial large decrease from the original steady-state value. This decay time is on the order of seconds. If the current is switched on again, the potential, after a sudden initial increase, builds up over a similar time interval to the original D C amplitude (figure 1.2, 1.3). Measurements of IP data may be made in either the time or the frequency domain. Although the phenomenon is the same, different parameters are used to represent it. We use the chargeability defined by Siegel [21]: . • (1.5)  V =  where (f  a  is the potential in the absence of the polarization effect and (p is the the v  Chapter 1. Introduction  I  t  Figure 1.2: D C current for IP.  * Figure 1.3: IP decay voltage.  Chapter 1. Introduction  7  potential with the polarization effect. To estimate the chargeability 77 in the area of interest, we measure the potentials <p  a  and Lp on the surface first and then compute the data r\ (apparent chargeability) using n  a  the equation Va =  •  (1.6)  For two-dimensional problems if we choose y in the strike direction, z downward and x perpendicular to the strike direction, then the chargeability is represented as  v{x,z)  and we only measure <p and (p along a line in the x direction. a  v  The definition of apparent chargeability (1.6) shows that estimating the chargeability 77 from the data r] is a nonlinear inverse problem. However, 77 values are usually small, a  so we can linearize equation (1.6) to produce a linear inverse problem that approximates the given problem. Linearizing the potential ip about the conductivity model a yields v  = (p(a) - ip'(a)na + 0(77).  (p = ip(a-rja) n  (1.7)  Substituting into equation (1.6) yields _ <f - if v  a  -<p'(a)r]a + 0(77)  ip  (p{o-)-ip'(cr)r)a  v  This can be approximately written as Va ~ -^f^-V  = JV,  (1-9)  for a linear operator J independent of rf. In the discrete version of the problem, J becomes a matrix—see Section 3.2.4. So, instead of solving the nonlinear inverse problem (1.5) we can solve the linear inverse problem JV = Va-  (1.10)  Chapter 1. Introduction  8  with the same methods used for magnetic inverse problem. Our inverse problems are always of the underdetermined type in which there are fewer observations than unknowns. Then the observations are not sufficient to determine the unknown parameters uniquely; additional information is needed. This lack of information cannot be remedied by any mathematical techniques. For geophysical inverse problems the physical parameters to be determined should be consistent with the geological structures and other information when it is available. Therefore, it is natural to construct optimization problems with proper objective functionals based on additional information.  1.3  C o m b i n e d Inversion  In our case, although we can invert the IP data for chargeability and the magnetic data for susceptibility independently, we note that there is some correlation between the chargeability and the susceptibility of different kinds of rocks. Our main contribution is to use prior information about this correlation to invert these two sets of data simultaneously. In fact, the tables in [20] shows that although there is a great variation in the revalues, even for a particular rock type, and a wide overlap between different rock types, sedimentary rocks have the lowest average susceptibility and basic igneous rocks the highest. In every case the susceptibility depends only upon the amount of ferrimagnetic minerals present, mainly magnetite, sometimes ilmenite or pyrrhotite.  The K values  for chalcopyrite and pyrite are typical of many sulphide minerals which are basically nonmagnetic. At the same time, we find that the chargeabilities n of minerals increase with the sulphide concentration. In contrast to their magnetic susceptibilities, chalcopyrite and pyrite are typical minerals with high chargeabilities.  Chapter 1. Introduction  9  Therefore, the magnetic susceptibility and the chargeability of minerals are not independent. Although the correlation between susceptibility and chargeability may depend on the underlying rock type and an exact description of this correlation is not available, we know that the susceptibility and chargeability can not both be large at the same place. In this thesis we use this correlation to invert jointly the magnetic and IP data. Suppose (f> (n, Tf) represents the correlation between K and rj, and the confidence region c  is determined as {(«,»/):&(«, » ? ) < # } .  (1.11)  If the discretized magnetic problem is Git = d?' and its corresponding regularized optimization problem is minimize  <J> (K) :=|| W (K — K )  subject to  <f> ( ) :=|| W (Gn - < ) || =  K  K  || , 2  0  s  d  K  (L12)  2  dl  and the discretized IP problem is Jr] = d°  bs 2  and its corresponding regularized optimization problem is. minimize  (/> (n) :=|| W {rj - 770) | | ,  subject to  h{r}):=\\W {Jrj-df )f=^  2  v  v  (1.13)  s  d2  then we can introduce a new optimization problem: minimize  <f> (m) : = \\ W (m - m ) | | +a <f) (m)  subject to  <t> (m) : = || W (Am  2  m  d  m  d  0  c  c  - d ) || = <^*, obs  2  (1.14)  Chapter 1.  Introduction  10  where m  =(« , )  T  ; 77  5  ={d?\d°f ) , 9  obs  d  w  0  0  W  dl  W,  (  ^ d2  w  0  0  w„  K  T  j  If we treat 77 and K as independent by setting a =0, solving (1.14) is equivalent to c  solving for the two vectors 77 and K independently except for the computing difficulty in keeping the two individual misfits at a reasonable level. But if we insist on a correlation between rj and K by choosing ct > 0, we can keep the parameters 77 and K in a particular c  confidence region by choosing the proper correlation weighting factor a . c  For a suitable Lagrange multiplier parameter /J, optimization problem (1.14) corresponds to the following unconstrained optimization: Minimize <f>  = <f> (m) + u(d>d(m) - (j>* ) + a <f> (m) m  = || W (m m  d  - m ) | | +/--(|| W (Am 2  0  d  c  c  •obs  I) + a <j) (m) (1.15)  - d  c  c  Determining the correct value of u is a separate issue that we discuss below. If (j) (m) c  is a quadratic convex functional, optimization problem (1.15) can be solved as a general optimization problem of regularization for each a . c  In our case, (j> (m) is the functional c  neither convex nor quadratic, so a corresponding algorithm must be developed to solve the optimization problem (1.15).  Chapter 1. Introduction  11  Since (f> (m) is not a quadratic functional of m, we use an iterative method to solve c  the optimization problem. In each step, (f) {m) is approximated with its second-order c  Taylor expansion (f> (m) « ^ ( W ) + g (m  - m<"') + l/2(m - m  T  c  m  ( n )  ) # ( m - m< ), T  n)  (1.16)  where g and H are the gradient and the Hessian matrix of (j> at mS ^ respectively. To n  c  realize this kind of approximation in each step, we modify the subspace method [15] which is designed specifically for linear inverse problems. To effectively reduce the correlation <f) in each iteration, we include the gradient of <j) in the subspace from which a new c  c  perturbation is chosen. Since (f) (rn) is not a convex functional, a good initial approximation must be choc  sen. We use the solution to (1.15) when a = 0 as the initial approximation. To obtain c  successively better reconstructed models, we must carefully choose a  c  > 0. In fact, the  objective functional <f> in (1.15) is not convex, so it may have different local minima. Furthermore, the regularization term <f> and correlation <f> in (1.15) produce contradicm  c  tory effects on our model m, so a must be chosen such that these effects are balanced c  in the iteration proceeds. In our numerical experiments the correlation decreases as the iteration, so a  c  must be increased to keep the correlation term and the regularization  term compatible.  •  In the central numerical example of this thesis, we set up both a magnetic model and an induced polarization model in a common two dimensional domain in which K and r) satisfy a certain correlation <f> . First, we invert K and r\ independently according to the c  methods described above. Then we invert for the two parameters together as in problem (1.14). We compare the results produced by these two different methods and show that the joint inversion gives much better results.  Chapter 2  Linear Inverse Problems  2.1  Ill-posed Inverse Problems: Functional T h e o r y  In 1923, the French mathematician J. Hadamard [7] introduced the concepts of wellposed and ill-posed problems. A mathematically well-posed problem should satisfy the following basic conditions: 1. A solution exists (existence). 2. The solution is uniquely determined by the data (uniqueness). 3. The solution depends continuously on the data (stability). If a problem does not satisfy all three of conditions, it is called ill-posed.  According  to Hadamard's philosophy, ill-posed problems are actually ill posed, in the sense that their basic formulation is wrong. However, since Hadamard's time many applications in industry have led to the formulation of ill-posed problems where the solution has a welldefined physical meaning. Such applications arise quite naturally in science, technology and medicine when one is confronted with the need to interpret measurements.  Such  interpretation is particularly difficult in geophysics, medicine and astrophysics.  The  problem here is to determine, on the basis of the measured data, internal parameters such as the mass density p{x) or the electrical conductivity o~(x), which are difficult or impossible to measure directly. Such problems are briefly called inverse problems. In fact, studying such problems is the only way of completely analyzing experimental results. 12  Chapter 2. Linear Inverse  Problems  13  Most inverse problems do not fulfil Hadamard's postulates of "well-posedness". They might not have a solution in the strict sense, their solutions might not be unique, or they might depend discontinuously on the data.  2.1.1  Theoretical Analysis of an Inverse Problem  Since many linear processes can be modeled by the integral equations of the first kind, it is interesting for us to show that such equations usually lead to ill-posed inverse problems. An integral equation of the first kind has the form (Tx)(s)= where k(s,t)  f  1  Jo  k(s,t)x(t)dt  = b(s),s e [0,1]  (2.17)  £ Z [0,1] x [0,1] is a known kernel, thus T is a bounded linear operator 2  with respect to the usual L  2  topology induced by (x,y)=  \\x\\={x,xyi , 2  ^ x(t)y(t)dt, Jo  (2.18)  x,y£L [0,l}.  (2.19)  2  The adjoint operator T* of T is characterized by (T*y)(t) = I' k{s,t)y(s)ds,  Jo  t G [0,1].  (2.20)  Above all, T is a compact operator (see [2], [8] and [24]). Thus [2] there exist adjoint orthogonal functions {(f>i(s)}, {if>i(t)}, and real scalars A; —> 0 such that f k(s,t)4>i{s)ds = Xi^i(t), Jo 1  t e [0,1]  (2.21)  and f k(s, t)ipi(t)dt = A,-^-(s), s G [0,1]. Jo 1  (2.22)  Thus, if the right-hand side in (2.17) has the expansion oo  6(s) = £ A V ^ ) ,  (2-23)  Chapter 2. Linear Inverse  Problems  14  then the solution is given by oo  *(*) = E ( A / A 0 ^ ( * ) ; i=l however x(t) £ L [0,1] 2  (2-24)  if only if CO  £(ft/A,-) Equation (2.25) is called the Picard  < oo.  2  condition:  (2.25) it determines whether a square-  integrable solution exists to the equation (2.17) with right-hand side b(s). If b(s) of (2.23) does not belong to R(T), we may still approximate b(s) by truncating the expansion (2.23) to a sum of k terms, k b (s) = J2W (s); k  (2.26)  J  j=i  each of these b clearly fulfils the Picard condition, and we have k  T~ h l  k = YXftlWii=i  We conclude that as k —>• oo we have b  k  || T^h  (2-27)  b, but ||-». oo, k — > • oo.  (2.28)  It is for this lack of stability that integral equations of the first kind often produce illposed inverse problems. Unfortunately, in practical situations we will typically only have access to an approximate right-hand side b in equation (2.17), which is contaminated with measurement errors, approximation errors, and rounding errors: 6= b  exact  + e, b  exact  € R(T).  (2.29)  Here, b  denotes the unknown, exact right-hand side and e denotes the perturbation.  Ideally it is  x  exact  =T~ b  exact  1  exact  that we want to compute. Of course, we cannot expect e to  Chapter 2. Linear Inverse Problems  15  satisfy the Picard condition, hence b £" R(T). which tries to compute T b _1  As illustrated above, any naive approach  instead of T~ b 1  exact  will usually diverge or return a useless  result with extremely large norm, no matter how small the norm of the perturbation e is. Furthermore, even if we know the exact values of b(s) at some finite set of points s.i,SN,  from the expansion oo  b{s) = Y;PiMs),  (2-30)  1=1  we still can not determine 6(5)' uniquely. In fact, usually there are infinitely many functions which have same values as b(s) at the points S i , 3 / v .  This is the fundamental  nonuniqueness for our inverse problems since our data are always finite. 2.1.2  Generalized Solutions and Regularization of Ill-posed P r o b l e m s  As we have seen, inverse problems are usually ill-posed problems and may have no solution in the traditional sense, so we should introduce the definition of Generalized  Solutions  for inverse problems [6]. Suppose that Hi and H axe Hilbert spaces over the same field of scalars. We consider 2  the fundamental problem of solving for x in Hi a general linear equation of the type Tx = b  (2.31)  where b £ H2 and T is a bounded linear operator from Hi to H . 2  The most prevalent  example of an equation of type (2.31) is the one which obtains when Hi = i ? , H = n  2  and T is an m by n matrix. If Hi = H =L [0,1], 2  2  R  m  then the integral operator defined in  (2.17) provides another example. In both these examples problem (2.31) may be ill-posed. For the operator T in equation (2.31), the range R(T) and null space N(T) are defined as R(T)  = {y £ H : y = Tx for some x £ #1}, 2  (2.32)  Chapter 2. Linear Inverse  N(T) If we let (xi,x ) 2  16  ={xeH :Tx  = Oe H }.  1  and < yi,y  2  and H  Problems  2  (2.33)  2  > denote the inner products in the Hilbert spaces Hi  respectively, then the adjoint operator T* of T is a linear operator from H  2  to  Hi characterized by the following relation: .  (T*y,x)=<y,Tx>,VxeH ,VyeH . 1  (2.34)  2  If the operator T is invertible then equation (2.31) has the unique solution x = T & , _ 1  and we have a well-posed problem since T~  l  is also a bounded linear operator. But in  general such a linear equation may have no solution at all when b  R(T), and have more  than one solution when N(T) 7^ {0}. When equation (2.31) has no solution in the traditional sense (6 ^ R(T)),  we may  assign a "best possible" solution to the problem. In fact, if we assume R(T) is closed and let P be the projection operator of H onto R(T), then Pb is the vector in R(T) which 2  is closest to b. It is possible to consider any u £ H\ such that Tu = Pb as a generalized solution of equation (2.31). This solution is often called the least squares solution of the equation Tx = b. Usually there is more than one least squares solution for equation (2.31). To get a uniquely determined least squares solution u £ Hi, let us examine the set of least squares solutions.  Groetsch [6] shows the set of least squares solutions of equation (2.31) is a  closed convex set when R(T) is closed, so there is a unique vector x* with minimum norm in the set, that is, || x* ||= inf{|| u ||: Tu = Pb} = inf{|| u ||:|| Tu - b \\<\\ Tx - b ||,x £ H }. x  (2.35)  Actually x* is the solution to the following optimization problem Minimize  cf) : = || x ||  Subject to  <f> :=|| Tx - Pb || = 0  2  x  2  d  (2.36.)  Chapter 2. Linear Inverse Problems  17  Usually we call x* the Generalized solution of equation (2.31). Sometimes it is also called "best approximate solution" of equation (2.31). It is designed to satisfy the right-hand side as closely as possible. :  Now let us consider a finite-dimensional inverse problem with noisy data. Assume Ax = d,  (2.37)  where A is an N X M matrix and d is a vector of N observed data with measurement errors. Usually N <C M, and A is an ill-conditioned matrix when N = M. Since the observed data d — (^)  are contaminated with measurement errors,  T  d ^ R(A) and thus there is no solution to equation (2.37). Furthermore, even if there is a solution x to equation (2.37) which satisfies the observed data d, this solution is usually unstable since the matrix A is ill-conditioned. If we write the observed data as d= d  + e,  exact  with d  exact  '•• (2.38)  being exact data from the true model and e being the measurement errors,  then equation (2.37) becomes Ax = d  + e.  exact  (2.39)  According to the analysis of generalized solutions, we should find the best approximate solution to linear equation system ;  . '  . Ax = d  .  (2.40)  exact  In other words, we should solve the following optimization problem . Minimize  (j> : = || x ||  2  x  Subject to • <j> :=|| Ax- d  exact  d  || = 0. 2  .  (2.41)  Chapter 2. Linear Inverse Problems  However, the vector d  exact  18  is not available in practical computing, so instead of calculating  the generalized solution to equation (2.40) we look instead for the generalized solution to the equation || Ax-d  || = 4>%.  (2.42)  2  where (f>* is our best estimate of the expected error norm E{\\ e || ). This is to solve 2  d  optimization equation Minimize  (j> : = || x ||  2  x  Subject to <f> :=|| Ax-d d  || = <j>*  (2.43)  2  d  Using the Lagrange multiplier method, problem (2.43) can be changed into the optimization problem Minimize (f) = <f> + a((f) - <f)* ) x  d  d  = || x || +a(\\ Ax-d  | | -</>*).'  2  (2.44)  2  Now we show that this approach is consistent with widely accepted Tikhonov regularization method. According to Tikhonov and Arsenin [22] a given algorithm is called a  regularization  method for (2.37) if—in the presence of perturbations e as in (2.39)—the algorithm determines approximations x(e) with the property that for some solution x x(e) -> x  e x a c  \ e -»• 0.  e x a c t  of (2.40), (2.45)  So, to find a generalized solution to equation (2.37) is also a regularization method. Furthermore, in Tikhonov's method the approximate solution x to (2.37) is defined as the unique minimizer of the quadratic functional 4 =|| d- Ax || +a || Bx | | 2  2  (2.46)  Chapter 2. Linear Inverse Problems  in a finite space R . M  19  Here B is the discrete version of a regularization  operator L which  defines a (semi-)norm || L- ||. It is used to measure the "size" of the approximations, a is a regularization  parameter which belongs to a suitable real parameter set. If we assume  T is injective, then the minimizer x  a  is unique.  Obviously, if L = I in (2.46), optimization problem (2.46) is same with optimization problem (2.44). Therefore, the approach of generalized solution is consistent with Tikhonov regularization. In the future we use the phrase "regularization method" for both Tikhonov regularization and for any other regularization methods.  2.1.3  Construction of Regularization  As we have seen, to solve a general ill-posed problem Tx = d,  (2.47)  we usually use some kind of regularization method. In other word, instead of solving equation (2.47) we usually solve the following optimization problem . Minimize^ =|| d-Tx Equivalently, the solution x  a  || +a || Lx || . 2  2  (2.48)  to (2.48) may be introduced as the solution to the  regularized normal equations (T*T + aL*L)x  a  = T*b.  (2.49)  Note that L*L is positive (semi-)definite so that the spectrum of the operator on the left-hand side of (2.49) is shifted in the positive direction; hence the solution of the regularized normal equations should be less susceptible to perturbations in the data d. In fact, L will typically be chosen such that Morozov's complementation condition [14] is fulfilled, i.e. II Tx || + || Lx ||> 7 II a; II  (2.50)  Chapter 2. Linear Inverse Problems  20  for some 7 > 0 and all x in the domain of L. In this case, T*T + aL*L invertible, hence (2.48) has a unique solution x  is continuously  depending continuously on d.  a  Instead of (2.48) one might- consider the slightly more general functional Minimize || d — Tx || +a || L(x — x  )  guess  if one has available a good guess for the exact solution x be reduced to (2.48) by replacing d by d — Tx  exact  ||  (2.51)  2  . However, (2.51) can easily  in (2.48).  guess  The basic idea behind the regularization in (2.48) is to search for some x , a  provides at the same time a small residual || d — Tx  || and a moderate value of the  a  penalty function || Lx  a  that  \\. The keys to succeed are a proper regularization operator L  and a suitable regularization parameter a. Choosing a proper regularization parameter a in problem (2.48) is very important. If the parameter is chosen too small, equation (2.49) is too close to the original ill-posed .problem and we must expect instabilities. In this case what we can expect to get is the best approximate solution of Tx = d. On the other hand, if the parameter is chosen too large, the problem we solve has very little similarity to the original equations since too much information from the data is lost. In this case, the solution we can expect to get is some x <E N(L) or simply 0. Finding the optimal regularization parameter a is important in solving ill-posed problems, but there are no general rule to choose it. One basic principle is to choose the regularization parameter a such that the norm of the misfit, 11 cZ — Tx  a  ||, equals the  norm of the error term [14], i.e. 4> =|| d-Tx d  || = || e || . 2  a  (2.52)  2  For our examples in the following Chapters, || e || has x 2  2  distribution with expected  value W, so we require the misfit (f) to equal its expected value N. Obviously this rule d  Chapter 2. Linear Inverse  Problems  21  depends on the accuracy of the estimate for || e ||. If no good estimate is available, it may be better to use an e-free parameter choice strategy. The role of L should not be underestimated. Numerical computing has shown that we could get completely different results with different choices of L. Varah [25] has shown that a wrong choice of L can lead to completely erroneous results. Although a general rule for choosing L is unknown, it is most common to choose either the identity J or a linear differential operator L ^ I.  This choice of L usually  implies the validity of the complementation condition (2.50). To choose a sensible L from appropriate a priori knowledge we use a example from the following chapters. Suppose we need to determine the unknown function m = m(x, z) using ' Tm = d  (2.53)  in a Cartesian coordinate system consisting of x-axis and z-axis. We can choose.L by combining differential operators and the identity such that for suitable weight functions  (2.54) If we choose ct = a = 0 and w as a constant, then L = I and we have a regularizax  z  s  tion in standard form opposed to a regularization in general form if L ^ I. This choice is simple and satisfies the complementation condition, but it is useful only when we have no idea about what the model should be. This choice leads to a reconstruction called the "smallest model" method since it only concerns the size of the model. If we choose a  x  ^ 0 and/or a ^ 0, we are asking the model m to be smooth in some z  Chapter 2. Linear Inverse  Problems  22  sense. The reason for choosing differential operators L ^ I is that noise components in the data usually lead to rough oscillations in the reconstructed model m which will provoke large X-(semi) norms, || Lm || when L ^ I while keeping standard norm || m \\ to be moderate. This choice is reasonable for geophysical inverse problems since our true models are usually at least piecewise smooth, and not just integrable. If we choose L as a differential operator with a  s  = 0, we can expect a very smooth  reconstructed model. This is also called the "flattest model".  2.2  Solving Inverse Problems  According to Anger [1], to solve an inverse problem we have to study the following issues: 1. The special process—both experimentally and theoretically, 2. Mathematical models of the process, - 3. The direct problem—both theoretical and numerical, 4. The information content of the inverse problem, 5. Algorithms for the numerical solution of the inverse problem. In this section we will work only on point 5 and leave the other four points to be studied in Chapter 3 for two specific practical problems.  2.2.1  Discretization of Inverse Problems  To compute an approximation to x  exact  , we somehow have to discretize the original prob-  lem. One reason is that for inverse problems we usually know only finitely many contaminated data instead of a continuous function. The other reason is that it is difficult to find analytic solutions to practical inverse problems. Although there are various ways to  Chapter 2. Linear Inverse Problems  23  reduce the integral equation (2.17) to a discrete equation, we do not want to elaborate on the different advantages of the respective methods here since the discretization methods are usually determined by the problems. No matter which method we use to discretize an integral equation, one problem we have to face is the choice of the proper mesh. If the discretization is too coarse, then the finite dimensional equation will be fairly well conditioned but its solution will suffer from large discretization errors; this is like using regularization with too large a regularization parameter a. If, on the other hand, the discretization level is too fine, then the influence of the small singular values of T will be too significant, and the solution of the discrete problem will resemble Tikhonov approximation with too small a regularization parameter a. Somewhere in between there is an optimal level of discretization with the resulting approximation being competitive to the one obtained from optimal regularization of the continuous problem. For practical problems the optimal discretization is hardly ever known in advance. Even if we chose the optimal discretization by chance, it would still remain a good idea for the purpose of numerical stability to regularize the finite-dimensional problem so that its condition number is improved. In practice, we usually select a discretization so fine that the corresponding discretized inverse problem is ill-posed. This allows us to use prior knowledge and adjust the misfit by choosing the proper regularization parameter a. There are two approaches to produce the system of linear equations: to discretize the integral equation (2.17) first, or to regularize the integral equation (2.17) first. If we discretize the integral equation first, we obtain a linear system of equations Ax = d, where A is the coefficient matrix and d is the vector of observed data!  (2.55) Usually the  Chapter 2. Linear Inverse  Problems  24  system of linear equations (2.55) is ill-posed since A will not be a square matrix when we discretize the integral equation with too fine a mesh. Even if A is a square matrix for some discretization methods, it may be singular or at least numerically singular. Applying regularization in standard form to the linear system of equations (2.55) amounts to solving the following regularized normal equation: (A A T  + al)x  = A d.  (2.56)  T  On the other hand, if we regularize the integral equation (2.17) first, the resulting discrete problem will involve in minimizing a quadratic functional of the form || Ax - d || +a || Bx | | , 2  (2.57)  2  where B is the discrete version of a differential operator L in the functional (2.46). Equivalently, its regularized normal equation is (A A T  + aB B)x T  = A d,  (2.58)  T  where B is the discrete form of the differential operator L. In the examples in the following chapters, we regularize the integral equation first. To discretize the regularized problem we divide the domain.of interest into M elements (e.g., intervals in 1-D, rectangles in 2-D) and assume the model x is constant in each element. If the number of observed data is N, then A is an N x M matrix. In our cases, N -C M , but both numbers are large: we have M = 560 and  = 310.  After discretizing the inverse problem, we have to find an efficient algorithm to solve it. This process is especially important for practical problems since we often have several thousands of unknown variables.  Here we review several direct methods for discrete  ill-posed problems like equations (2.55), (2.56), (2.58).  Chapter 2. Linear Inverse  2.2.2  Problems  25  Direct Methods  The Direct methods are simple and widely used, but usually they are not efficient for large scale inverse problems. 1. S V D with Truncation.  Assume A is an N x M(N < M) matrix in equation  Ax = d. To solve this system of linear equations, we suppose A has the following Singular Value Decomposition (SVD): A = UDV ,  (2.59)  T  where U is an N x N orthogonal matrix, V is an M x N matrix satisfying V V T  =  I, N  and Z)=diag(<Ji, ...,<r/v)- Here o~i > u ---- > CJV > 0 are the singular values of matrix A 2  Now for the linear system Ax — d, if N = M and  > 0, then the unique solution is  x = VD- U d. x  (2.60)  T  Furthermore, if we assume U =  [iti|u |...|ujv], 2  V=  (2.61) (2.62)  N'  d=  J2fc * i=l  = Pi  u  (- )  u  2  63  then • B=£-».•> i=i * N  x  (2.64)  a  \\* II =E(AM) 2  2  (2-65)  i=l  This is the solution to equation (2.55) when it is a well-posed problem. If N < M and O~N > 0, then we can define an approximate solution to equation (2.55) as x* = VD~ U d 1  T  (2.66)  Chapter 2. Linear Inverse Problems  26  with N  .  x  *  =  (2.67) t=i °*  || ||  2  =E(AM) -. 2  •  (2-68)  This solution to equation (2.55) is its generalized solution or best approximate solution as we introduced in formula (2.35). Sometimes the requirement that || x || in (2.68) is 2  moderate is called discrete Picard If o-i = 0  condition.  0 when i < k < iV, the matrix D is either exactly or numerically  singular. In this case, we introduce the truncated inverse of D with elements defined as ±- • i>k 0 With D  1  .  (2.69)  i < k.  defined in this way, we can obtain the truncated SVD solution to equation  (2.55): x = VD- J d, l  lT  (2.70)  T  wiith s* = E i L i f S x  i | | 2  -EL(AM) , 2  || d - Ax* | | H | d - UDV VD~W d 2  (2-71)  T  T  (2-72) || = Z l 2  k + l  131  (2.73)  Theoretically we can choose proper k to get a reasonable solution for which the misfit || d — Ax* | | , is small and the the solution norm || x || is not too large (see [24]). 2  l  However, for many practical problems we can not expect a particular gap in the spectrum to separate large singular values a from those that may be judged numerically zero (see k  [9]). To choose a proper k is completely subjective or depends on trial and error.  Chapter 2. Linear Inverse  Problems  27  2. S V D with Regularization. Instead of truncating the small singular values of A as in (2.69), we can use a regularization method to truncate the small singular values automatically.  This is to use SVD method to solve the regularized normal equation  (2.56), that is, (A A  + al)x  T  = A d.  (2.74)  T  As above, assume A has the singular value decomposition, A = UDV  T  with [/, V  and d in the form of (2.61), (2.62) and (2.63) respectively. Then we get a unique solution to (2.74), x  a  = V(D  2  + aI)~ DU d, 2  (2.75)  T  with  x = T£ iffi;Vi,  (2.76)  a  =  I K H=  (fe) ,  2  || d - Ax  a  || =|| d - UD(D 2  2  2  + aI)' DU d 2  T  (2-77) \\ = £ ^ 2  ^Bf.  (2.78)  Here we can see that as the regularization parameter a increases, the norm of the solution || x  a  || decreases and the. misfit || d — Ax  a  \\ increases.  This is consistent with our  discussion above about the regularization parameter.  2.2.3  Iterative Methods  As we have seen, direct methods give us an inside look at finite-dimensional ill-posed problems. However, for large-scale applications a striking disadvantage of direct methods becomes obvious: the amount of work required to compute the regularized approximation is of the order 0(N ) 3  where N. is the number of unknown variables.  Furthermore, 2-D and 3-D problems usually give rise to large structures or sparse matrices A and/or B. The direct methods for regularization become impractical because  Chapter 2. Linear Inverse Problems  28  they destroy the structure or sparsity of the matrices. Consequently, it is natural for us to turn to iterative methods that only require matrix-vector multiplications which can be implemented very efficiently and do not alter the matrices A and B. Here we will show how to solve linear inverse problems in the general form (2.58) using iterative methods. To find a minimum to the quadratic functional in (2.57) is to find a minimum to the following quadratic functional, <f> = <f> + a<j> =|| Ax ~d.\\ +a \\ Bx || 2  d  (2.79)  2  m  where (f> =|| Ax — d || is the misfit and <f> =|| Bx || is the size of the model. If we 2  2  d  m  choose a = fi , -1  the optimization problem (2.79) becomes Minimize cj> = [uj) + (f) = fi \\ Ax - d || + || Bx || . 2  d  2  m  (2.80)  Sometimes [i is called a Lagrange multiplier since (2.80) can be obtained from following optimization problem Minimize  <j> =|| Bx ||  Subject to  <f> =|| Ax-d  (2.81)  2  m  d  || = <f)* . 2  d  (2.82)  1. Conjugate Gradient Method. Since <j>(x) in (2.79) is a strictly convex functional, its unique minimum point is characterized by the regularized normal equation Gx = b where G = A A + aB B T  T  and b = A d. T  (2.83)  G is usually a positive definite symmetric matrix,  so it is natural for us to turn to the famous Conjugate Gradient Method [5]. In contrast to the steepest descent method, which chooses the gradient of the objective functional <f>(x) as its search direction in each iteration, the conjugate gradient method  Chapter 2. Linear Inverse  Problems  29  chooses the search direction pk ^ 0 for iteration k to satisfy plApi  = 0, i = l,...,k-  1.  (2.84)  Then the vector Xk = Xk-i + ojfcPfc is chosen to minimize <j>(x) over span{pi, ...,.pk}. Since  pk  is A-conjugate to  pi,...,  Pk-i,  the finite termination of the conjugate gradient  method is guaranteed. In fact, when A is a positive definite symmetric matrix, any set of A-conjugate vectors is linearly independent. Theoretically this procedure solves (2.83) in finitely many steps, but rounding errors lead to a loss of orthogonality among the residuals in the conjugate gradient method. As a consequence, the property of finite termination does not apply in practice. Furthermore, the loss of orthogonality may cause slow convergence in ill-conditioned problems. Preconditioning A may improve the converge rate, but we still have to exert a huge computing effort for each regularization parameter a until we find the proper one. This is the main reason we turn to the subspace iterative method. 2. Subspace Iterative M e t h o d . When we use direct methods to solve the normal equation (2.83), we search for the solution x in R  N  and get its solution in one big step.  However, when we use the conjugate gradient method to solve equation (2.83), in each iteration we search for an improved solution in only one direction. The subspace method of Oldenburg and Li [15] can be viewed as a compromise, in which we search for an improved solution in each iteration in a larger subspace of  R. N  Oldenburg and Li [15] give a detailed discussion of the subspace linear inverse method, so it suffices to review its basic ideas and steps here. Solving the optimization problem (2.80) Minimize <j>(x) = a<f> + (j) = a \\ Ax - d || + || Bx || 2  d  m  2  (2.85)  Chapter 2. Linear Inverse Problems  30  is equivalent to solving the regularized normal equation (uA A T  + B B)x T  = (iA d.  (2.86)  T  Suppose we know an approximate solution x^) to (2.85) and want to compute Sx such that x(  fc+1  ) = x^  + 6x is a better approximate solution of (2.86). We can substitute  a;( ) into (2.85) and take derivatives to get fc+1  (fiA A  + B B)6x  T  T  = fiA dT  (uA A  + B B)x( \  T  T  (2.87)  k  In contrast to direct methods, which solve (2.87) for 6x in the whole space R , N  solve for Sx in the subspace of R  we  spanned by some preassigned vectors { v i , v } , which  N  q  will be specified later. Let V = [ui-|u |...|u ], (3 = ( / ? i , f 3 ) 2  g  and Sx = J2i=i if3i  T  v  q  = VP- Then just as in  (2.86) we can get V {iiA A T  Here V (/J,A A T  T  + B B)V  T  + B B)V/3 T  = [iV A d. T  T  (2.88)  is an q x q matrix and q <C N, so it is easy to solve (2.88)  T  using a direct method of the sort described above. To choose proper basis vectors Vi is an important consideration in the subspace method. Usually the vectors u are chosen as the steepest descent directions of object  tive functionals or the steepest descent direction of specific components of the objective functionals [15]. Here are the basic steps of the subspace method. We will describe each step in detail in our numerical examples. 1. Set up the initial model x^ '; 0  2. Compute the function values cf> (x^), (f>d{x^) and the spanning vectors m  i = 1,q  from x^ \ k  k > 0;  V={vi},  Chapter 2. Linear Inverse Problems  31  3. Linear search for proper n such that (f>d decreases by a user-specified factor while <j) does not increase too much. m  4. Compute Sx=Y^ i =  fliVi  5. Update x^ ^=x^+Sx; k+  6. If x( ^ k+1  by solving the linear system V/3</>=0; '  satisfies the misfit requirement and (f) (x^ ^)  (j) {x^), m  differs very little from  k  m  stop the iteration and print out the results; otherwise, return to 2.  The subspace algorithm is efficient, flexible and robust [15]. It chooses the proper regularization parameter u automatically by controlling the misfit (f>d and searches for a perturbation from a subspace of moderate dimension exceeding one, so it is much more efficient than the conjugate gradient method. With proper transforms, it is possible for users to impose positivity constraints in practical problems. Therefore, we will use this algorithm to solve all our examples in the following chapters.  2.3  Linearizing Nonlinear Inverse Problems  In practical applications, many inverse problems are nonlinear. Although some nonlinear inverse problems can be transformed into linear inverse problems exactly, in many cases we have to make a linear approximation. Although we only use an initial linearization in a numerical example, we review this Newton's method for general linearization in each step since their basic ideas are common. Suppose x £ R  M  is a finite-dimensional unknown which produces data d{ under the  vector-valued nonlinear transform F, that is, (2.89) The inverse problem is to find x, given the data d — (d\,dpi)"  1  transforms Fi, i  1, - N .  and the nonlinear  Usually this kind of inverse problem is ill-posed.  Chapter 2. Linear Inverse Problems  32  For problem (2.89), we usually seek a generalized solution since it may have no solution in the strict sense. To find a generalized solution, we have to solve the following optimal problem Minimize  <f> {x) =\\ Bx | |  Subject to  <j> (x) =|| d-F{x)  (2.90)  2  m  d  || = <f>*  (2.91)  2  d  where B is an M x M matrix. To solve equation (2.91), we use linear approximation and iteration.  If x^ is an  approximate solution to the optimization problem (2.90)—(2.91), then for x near x^ \ k  we have F(x) = F(x^) + J( x — xW) + 0(\x — X ^ | ) . 2  .(2.92)  Here F(x) =  (F (x),,..,F (x))  T  1  N  and J = (Jij) is the N x M sensitivity matrix defined by dF(x^) Jij = — g S  i = 1,  N,j = 1 , M .  (2.93)  OXj  Dropping the quadratic error term in (2.92) and using the resulting approximation in (2.91) produces the following optimization problem: Minimize  <f> =\\ Bx | | , 2  m  Subject to -<f> =|| d  {k)  d  where dW = d- F(x^) + Jx^l  '  - Jx | |  (2.94) (2.95)  2  This is a linear inverse problem and we can solve it as  outlined above to produce an improved approximation solution x^ \ k+1  If we iterate the  above process, recomputing J from (2.93) at each step, we obtain a general Newton's method.  For simple problems, however, a single computing of J and just one linear  Chapter 2. Linear Inverse Problems  33  inversion may provide sufficient accuracy. It is the latter method we have used in our IP inversions. After we linearize the nonlinear inverse problems, we solve a linear inverse problem in each iteration. However, in some cases we can decrease the linear misfit efficiently, without decreasing the nonlinear misfit (2.91) at all. Therefore, when we solve linearized inverse problems in each iteration, it is necessary to choose the proper regularization parameter such that the nonlinear misfit (2.91) decreases in each iteration. Another difficulty in linearizing nonlinear inverse problems is the computing of the sensitivity matrix J . If we cannot find an analytical form for J , we have to compute it numerically. This process requires large scale computing, so an efficient method is needed. To find efficient methods of computing J is quite problematic. We refer to [11] for some practical advice on computing sensitivity matrices using Frechet derivatives and adjoint differential operators. Linearizing nonlinear inverse problems works well when the initial approximation is close to the true model of the original nonlinear problem. If the initial approximation is not close enough and/or the objective is not convex, it is possible for the method to diverge or to converge to a completely incorrect model.  Chapter 3  Two Practical Problems  In this chapter we use the regularization methods of Chapter 2 to solve two kinds of practical inverse problems in Geophysics: the Magnetic prospecting problem and the Induced Polarization (IP) problem.  3.1  Magnetic Methods  Magnetic prospecting is one of the oldest methods of geophysical exploration. It is used to explore for oil, minerals, and even archaeological artifacts. Collecting magnetic field measurements is easy, cheap and simple compared to most geophysical techniques. On the other hand, a precise interpretation of magnetic field data is very difficult. Therefore, it is important to find efficient algorithms to invert magnetic data.  3.1.1  Physical Principles  Before studying the magnetic effects associated with subsurface materials in the Earth we review the elementary physical concepts that are fundamental to magnetic prospecting. Geomagnetic field. If a steel needle, not previously magnetized, could be hung at its center by a thread so that it were free to orient itself in any direction in space, at most points on the earth's surface it would assume a direction neither horizontal nor in line with the geographical meridian. Its orientation would rather be the direction of the Earth's total magnetic field H  e  at this point. The magnitude of this field, H , the e  inclination of the needle from the horizontal, / , and the angle it makes with geographic 34  Chapter 3.  Two Practical Problems  35  north, D, completely define the magnetic field (figure 3.4).  Here the units of H  E  are  Geographic North  Magnetic North  East  z \/_  '___!/'  z I"  Figure 3.4: Geomagnetic field. ampere/meter 47rl0  -7  and the magnitude of HoH is of the order 0.5 x 1 0 e  -4  tesla, where JIQ =  weber•/'ampere-meter is the permeability of free space.  Susceptibility.  In the atoms or molecules of any magnetic material small loopcur-  rents (including rotating charges) occur, all having themselves directions but a random orientation. Each loopcurrent corresponds to a magnetic dipole. When the magnetic material is placed in a magnetic field, these magnetic dipoles tend to line up in the direction, of the field. The magnetization M (also called "Polarization") is defined as the magnetic moment per unit volume of magnetized material at a point. The magnitude of magnetization depends on the strength of the external field through the relation M = KH  (3.96)  Chapter 3. Two Practical  Problems  36  where K, the proportionality constant, is called the susceptibility.  Since M and H are both  measured in ampere/meter, susceptibility is dimensionless in the SI system. Susceptibility is the fundamental parameter in magnetic prospecting, since the magnetic response of rocks and minerals is determined by the amount of magnetic material in them and the latter have K, values much larger than the rocks and minerals themselves. For a vacuum, K = 0. Diamagnetic susceptibilities.  Paramagnetic  substances are weakly magnetic, with negative  materials are weakly magnetic, characterized by small  positive susceptibilities. Ferromagnetic  materials have positive and very high suscepti-  bilities, but rarely occur naturally in the Earth's crust. Ferrimagnetic positive and high susceptibilities.  materials have  Ferrimagnetic materials contained in rocks are the  most important in magnetic prospecting. M a g n e t i c anomaly. The magnetic anomaly is the difference between observed and background earth's magnetic field values. In magnetic prospecting magnetic anomalies are usually caused by the magnetization of ferrimagnetic material under the earth's surface, so they are our measured data for the magnetic inverse problem. M a g n e t i c flux density. If a current is passed through a coil consisting of several turns of wire, a magnetic flux flows through and around the coil annulus which arises from a magnetizing force H. The magnitude of H is proportional to the number of turns in the coil and the strength of the current, so that H is expressed in ampere/meter. The density of the magnetic flux, measured over an area perpendicular to the direction of flow, is known as the magnetic flux density B. B is proportional to H, i.e.  B = fxH,  (3.97)  where the constant of proportionality fi /i = / i o ( l + « ) ,  (3.98)  Chapter 3.  Two Practical  Problems  37  is known as the magnetic permeability. The units of permeability are weber/'amperemeter and in free space a — fi — Air x 1 0  -7  0  (weber/meter )  weber/ampre-meter. The units of B are tesla  and sometime B is also used to express magnetic field. Although we  2  —*  mainly use H to represent the strength of magnetic field, it is easy to transform between H and B using equation (3.97) once [i is known.  3.1.2  Mathematical Relations  The strength of magnetic field F at the origin caused by a magnetic dipole M at point P with position r satisfies [10], [20] 1 M x r  ^lrT  m  = VX  )  1  =  1  ^ ' ?\ V[M V{  '  )]  -"  (3  }  Generally, a volume of magnetic material can be considered as an assortment of magnetic dipoles that results from the magnetic moments of individual atoms and dipoles. They are aligned by induction in the presence of a magnetizing field. In any case, we may regard the body as a continuous distribution of dipoles resulting in a magnetization M, of magnitude M. The strength of the field caused by the whole body measured at a point f  outside the body is  0  ^  = h lv ^ • iw^\) ' v[  v  where f = (a; ,y ,^o) and \r - f \ = {(x - x )  2  0  0  0  0  0  -  ]dv  + (y - y )  2  0  + (z -  (3 100)  z)} . 2  1/2  0  If M is a constant vector with direction a = (/, m, n) and magnitude M then the operator M - V = M — = M ( / — + m — + n—). da ox ay dz Thus, we have: d  r I  dV  'v \ \r — r 0  and we can get the projection of F in any direction  (3.101)  Chapter 3.  3.1.3  Two Practical  Problems  38  Forward Problem  The data for magnetic prospecting are the magnetic field anomalies caused by underground bodies with high magnetic susceptibility. Magnetic anomalies caused by intrusions, flows, or iron-rich sedimentary horizons are common features in regions favorable for mineral exploration, and there is frequently a difference between the magnetic mineral content of such features and that of the host rocks. Such features may often be simulated by a two dimensional dipping dike. Here we set up a 2-dimensional model to produce magnetic data. We assume a dike with dip £ and strike /?; see figure 3.5. If we take the y-axis along the strike direction, the z-axis downward and the x-axis perpendicular to the strike direction, then the magnetic anomalies are independent of y. susceptibility is K = K(X,Z)  For this two-dimensional model we can assume the  and the magnetic polarization is in the direction of the  Earth's field H , that is, M = n(x,z)H . e  e  magnetic field anomaly d(f ) 0  fa) Here r = (x ,0,z ) , T  0  H  e  o  o  Then according to (3.100) the corresponding  satisfies (3.103)  = f=  (x,y,z)  T  and \r - r \ = [(x - x )  2  0  0  + y + (z - 2 r ) ] - Since 2  2  1 / 2  0  is a constant vector, this is an integral equation of first kind when data, d{fo) are  known. When we integrate the right-hand side with respect to y from — oo to oo we reduce (3.103) to a completely 2D problem. To get the data about our two-dimensional model, we partition the area of interest into a series of rectangles. We assume that the susceptibility in each rectangle is constant, and use (3.102) to compute the magnetic field caused by each rectangle. For the geometry illustrated in figure 3.5, we have the following relations [20]: r\  = d + (x + c o t £ ) , 2  2  Chapter 3.  Two Practical Problems  39  Figure 3.5: Rectangle geometry r  = D + (x + £ > c o t £ ) ,  r  2  = d + (x + b + d c o t £ ) ,  \  = D + (x + b + £ > c o t £ ) .  2  r  2  2  2  2  2  2  If we assume the angles between x-axis and r; are <^,-, then fa = tan {d/(x + dcot £)},  (3.104)  1  and we can compute fa, fa and <^4 similarly. Usually we compute the projections of the magnetic anomaly field in ^-direction or/and the Earth's magnetic field direction. Assuming that £ = 9 0 ° , we get the projections of F in H direction and z—direction from (3.103): E  F = E  K\H  P  -4-^{sin(27) sin f3 x l n ( r r / r r ) 2  3  +{fa - fa - fa + ^ )(cos Ism  4  2  4  F,  K\H  E  2TT  •{cos/sin/? x l n ( r r / r 4 r ) — ( ^ 2  3  1  2  1  /3 - sin /)}, 2  ^3 + <^t) sin/}.  (3.105) (3.106)  Chapter 3.  Two Practical  Problems  40  At each observation point for the magnetic problem, the measured anomaly is a vector whose two components are the sums of F and F over all rectangles. e  3.1.4  z  Inverse P r o b l e m  To solve the magnetic inverse problem, we must solve an integral equation of the first kind. In contrast to the standard integral equation of the first kind, we only know d(r ) 0  for a constant zo, that is, along a line, instead of knowing d(fo) for all (XQ,0,Z ) O  in the  domain of definition. In fact, in practical experiments, we only measure the magnetic anomaly on the surface.  For two-dimensional problems we just measure the anomaly  along a line on the surface. From theoretical analysis, we know that the magnetic field satisfies Laplace's equation above the surface, so the values of d(f ) 0  different z  0  are not independent.  along lines with  Therefore, it is not helpful to measure d(f ) 0  along  different lines with different ZQ above the surface. That is, we are required to solve for K(X, Z) in the system of equations d(xi, z ) 0  =  -^-V 47T  P<^- (w^h^\) }' v  '  dV  (3 107)  where d(xi, ZQ) are the data which are measured along a line on the surface, (x;, zo), i = 1,...,W. To find a numerical solution to equation (3.107), we have to discretize (3.10.7) first. We divide the area of interest into a series of M rectangles (see figure 3.6) and assume the susceptibility K(X,Z) is constant, say  in rectangle i,i =  1,...,M.  Choosing basis functions 1,  if (x,z) is in the rectangle i,  0,  otherwise,  <f>i(x,z) =  (3.108)  we get M  K(X, Z) ~ E  Ki<f)i(x, z).  (3.109)  Chapter 3. Two Practical  Problems  J  41  2  3  i  M-l  M  z  Figure 3.6: Base function model. Substituting  K(X,Z) from (3.109) into (3.107) and using formulas (3.105) and/or  (3.106), we arrive at the finite-dimensional linear problem An = d where A is an N x M matrix, d = (d\,d^) 1  observed data and K = ( / C i , K M )  T  (3.110) = (d(x , z ),d(xpj, 1  0  o))  z  T  are the  is the model to be determined. This is the discretized  ill-posed inverse problem to be solved. To use Tikhonov's regularization method for the integral equation (3.107),.we introduce the following regularization operator || L(K — /Co) | | 2 = OL I  W (K — Kf)) dxdz + a 2  S  S  x  Jarea  +a  z  j Jarea  I Jarea  w ( ~ Ydxdz. d{K  Ko)  z  OZ  w(  ) dxdz 2  x  UX  (3.111)  Chapter 3.  Two Practical Problems  42  Here KQ is an initial guess about the model, w (x,z) x  functions, and a , ct and s  ct  x  = 0 and a  z  and w (x,z)  are the weighting  z  are chosen to adjust the final computed model. If we choose  x  = 0, we call the final output "the smallest model"; this is similar to  choosing L = I. If we choose a  s  = 0, we call the final output "the flattest model"; this  is similar to choosing L as a differential operator with a nontrivial null space. Choosing the right parameters a ,-a s  and a  x  z  depends on experience and prior knowledge about  the true model to be determined. For magnetic inverse problems, one important step is to choose proper weighting functions, w (x, z), w (x, z) and w (x, z). It is obvious that the strength of the magnetic s  x  z  field on the surface depends on the depth of the magnetic body underground. Let us find the functional form of this relationship. For two-dimensional problems, if the rectangles are sufficiently small, we can consider each cell as a linear source which consists of dipoles along an infinitely long line. For a dipole with constant moment M = (a,/?,7), the field at the origin is is given by (3.100):  F = ^ V ( M • V ( l / r ) ) = i - V ( o £ + ^ - + 7^).  (3.112)  For a linear source, the total field at the origin is  = J Fdy  F  t  / 1  f°°  _  n V  X  Z  NN ,  1 ^7/20:3: + 27Z  ~ 4^  ^ x + z 2  2  and its projection in the direction n = (QO,/?O,7O) of any unit vector is F  n  =n-F  t  2x(ax + 72)  2a  —cto Air x + z 2  2  O(-rr) if z —* 00.  (x + z ) 2  2  2  +  4^ °  27  '  2z(ax -j- 72)  7  x +z 2  2  ~  (x + z ) 2  2  2  _  Chapter 3. Two Practical Problems  43  So, we choose 1/z for each of the weighting functions w (x,z), w (x,z) and w (x,z) 2  s  x  z  to compensate for the decay of the magnetic field with depth. After choosing the weighting functions, we can substitute n(x,z) from (3.109) into the regularization operator (3.111) and get, after discretization, ^(«)  = (« -  Kof'WTW.+WZW*  +  WJW ){K - «„)  (3.113)  Z  or in general form, fa( ) K  = (K- K ) W?W {K  - K ) =|| W (K - K ) f  t  0  K  k  0  (3.114)  .  0  Combining equations (3.110) and (3.114), the ill-posed magnetic inverse problem is transformed to the following well-posed optimization problem: Minimize  <^ = || W (K - « ) || K  Subject to fa =|| W (GK d  where (j) is our misfit criterion, W d  d  (3.115)  2  0  K  - d ) || = </>*, obs  (3.116)  2  is an N x N data weighting matrix and <f) is the d  target misfit. Here we choose <b according to the Morozov's discrepancy principle [14], d  that is, <b = E ( | | W (d — d° )  || ). If the observed data di, i = 1 , N  hs  d  2  d  normal distributions N(0,o~i), then YA=I{~~ '—) J  has a \  2  2  have independent  distribution with expected  value N. So, we may choose (f>* = N here when W is chosen as a diagonal matrix with d  diagonal elements 1/cr;, i =  d  1,N.  In practical computing we treat the constrained optimization problem (3.115), (3.116) by solving the following unconstrained problem: Minimize <f>( ) = </> (K) + ^~ {4>d{d) x  K  k  4>* ).  (3.117)  d  Here \i is the Lagrange multiplier. Differentiating with respect to the unknown vector K produces the following regularized normal equation (A WjW A T  d  + UW/W )K k  = b  (3.118)  Chapter 3.  Two Practical Problems  where b = fj.W W K T  K  K  0  +  44  A WjW d. T  d  Equation (3.118) is a standard regularized normal equation and we can use the subspace method (SLIM) described in Chapter 2 to solve it.  3.1.5  A Numerical Example  Here we analyze a simple two dimensional magnetic problem. The forward problem is to produce the data using a known model. The inverse problem is to input the data from the forward problem and use them to reconstruct the model. For this magnetic problem, the forward problem is to produce the data from a known magnetic susceptibility distribution function, K = n(x,y,z).  The data here are measurements of the anomaly field along a  line 2 meters above the surface. We use only the projection of the anomaly field onto the main field (geomagnetic field) direction as the data. In this example we consider a rectangular area under ground. We set up a Cartesian coordinate system with y in the horizontal strike direction, z in downward direction and a corresponding x axis. Then the magnetic anomaly field is symmetric in y direction and we get a 2D problem. Here we consider a 240 meter by 56 meter rectangle below the surface (0 < x < 240, 0 < z < 56). We partition the rectangle into 280 cells with equal sizes: 12 meters in the x direction and 4 meters in the z direction. After adding bigger cells around the boundary of the rectangle, we get 560 cells all together.  The magnetic susceptibility  distribution we postulate is illustrated in Figure 3.9: it involves a high susceptibility part (K — 0.5) inside the background part (K = 0.001). Assuming the magnetic susceptibility is constant in each cell, we can compute the magnetic anomaly from each cell using the formula (3.105). The total magnetic anomaly can be produced by taking the vector sum of the magnetic anomalies of all the cells.  In this example we compute the magnetic  Chapter 3.  Two Practical Problems  45  anomaly at 200 equally spaced points along a line from x=0 to x=240.  0.12  I  ,  ,  r-  X  Figure 3.7: Exact magnetic data from the forward problem.  0 .12  X  Figure 3.8: Noisy data simulating magnetic observations.. When we produce the data, we can calculate the coefficient matrix A for the inverse problem at the same time. In fact, from the formula (3.105) about the rectangular cell  Chapter 3.  Two Practical Problems  46  we can produce each row of the matrix A before we calculate the data at each point. After we compute the data (figure 3.7) and the coefficient matrix A, we add normally distributed noise to the data, i.e. d =d i  + N(0,o'i),i  i  = l...N'  (3.119)  where cr; = a equals 1% of the maximum of absolute values of the data. The contaminated data is our observed data d and the elements of the diagonal matrix Wd are the reciprocals of the standard deviations (l/cr=10 here). 3  In the regularization term, || L(K — Ko) \\ —  a  2  / W (K — Ko) dxdz Jarea S  +a  f w (^ J area  z  we choose no = 0, w — w s  = w = 1/z , z  2  T  K  K  2  OX  K  2  OZ  a  = 16, a  s  x  7  || LK | | « (j) = K W^W K  ) dxdz  x  Jarea  H^An^ and WZW  the corresponding matrices,  w(  x  °^) dxdz,  K  z  2  x  + a  2  s  = K WjW K T  S  = 26 and a  z  — 70, and produce  with the relation  + K W?W K T  X  + K WjW K. T  Z  (3120)  With coefficient matrix A , contaminated data and regularization matrices we can set up the optimization problem like (3.117), and use SLIM  (Subspace Linear Inversion  Method) [15] to solve its regularized normal equation. Since we mainly consider ferrimagnetic materials in magnetic prospecting, we require the reconstructed model to be nonnegative in the computing. The number of subspace vectors in this example is 5 (one from <f>d, one from </> and three from the the three K  parts of <j> ) and the SLIM K  converges with about 80 steps. We plot the true model and  numerically reconstructed model in figure (3.9). Our inversion gives us the correct position and shape of the magnetic body, but the susceptibility of our calculated model is a little lower than that of the true model. The  Chapter 3. Two Practical Problems  Reconstructed Magnetic Susceptibility Figure 3.9: Magnetic Mode).  47  Chapter 3.  Two Practical Problems  48  Pred.dat'  -0.04  50  100  150  200  250  Figure 3.10: Predicted data from the output model.  250  Figure 3.11: Difference between observed data and predicted data.  Chapter 3.  Two Practical Problems  49  reason for this is that the solution of the magnetic inverse problem is not unique and the calculated model is smoothed by the regularization operator. So, even though we fit the data very well (Figures 3.7, 3.8, 3.10), our calculated model is still a little different from the true model (figure 3.9).  Here we also notice that the difference between the  observed data and the predicted data is small at each point and is independent of the observed points (figure 3.11), so it is reasonable to consider them as identical independent distribution random variables. Furthermore, our fitted data from the reconstructed model approximates the exact data better than our observed data with noise does (figure 3.7, 3.8, 3.11).  Chapter 3.  3.2  Two Practical Problems  50  Induced Polarization  Induced polarization (IP) is a relatively new technique in geophysics, and has been employed mainly in base-metal exploration and to a minor extent in ground water search. As we have done for the magnetic problem, we treat only two dimensional situations here using a similar Cartesian coordinate system.  3.2.1  Physical Principles  A typical IP experiment involves putting a current into the ground and measuring the resulting potential some distance from the source. The simplest experimental equipment is a standard four-electrode D C (Direct Current) resistivity spread (figure 3.12). When the input current is suddenly interrupted the voltage across the potential electrodes generally does not drop to zero instantaneously, but decays rather slowly, after an initial large decrease from the original steady state value. This decay time is of the order of seconds or even minutes. If the current is switched on again, the potential, after a sudden initial increase, builds up over a similar time interval to the original D C amplitude (figure 3.13). In one type of IP detector the decay voltage is measured as a function of time in various ways; this method is known time-domain  IP. Since the build-up time is also  finite, it is clear that the apparent resistivity (actually a complex impedance) must vary with the frequency, decreasing as the latter increases. Thus the measurement of apparent resistivity, p at two or more A C frequencies, generally below 10 Hz, constitutes another a  method of detection! This is known as frequency-domain  IP. Measurements of IP may be  made either in the time or the frequency domain. The former are known as pulse transient measurements, the latter as frequency variations. Several units of measurement which have been used in the two methods are defined below.  Chapter 3.  Two Practical  Problems  51  X  Figure 3.12: Four-electrode D C model. 4>  a  ^ ^ ^ ^  1  Figure 3.13: IP decay voltage. Millivolts  per volt (IP percent). The simplest way to measure IP effects with time  domain equipment is to compare the residual voltage ip(t) existing at a time t after the current is cut off with the final voltage <p during the current flow interval. Since ip(t) is v  much smaller than (p , the ratio ip(t)/<p is expressed as millivolts/volts, or as a percent, v  v  lOOf (t)/if , where both are in millivolts. The time interval t may vary between 0.1 and v  10 seconds. The difficulty with this method is is not possible to measure potential at the instant of cut-off because of large transients caused by breaking the current circuit. On the other hand ip(t) must be measured before the residual has decayed to noise level.  Chapter 3.  Two Practical Problems  52  Decay-time integral. Commercial IP sets generally measure potential integrated over a definite time interval of transient decay. If the integration time is short and if the decay curve is sampled at several points, the values of the integral are effectively measurements of the potential existing at different times, i.e., <p(ti), ^ 2 ) ,  This is an extension  of the measurement above from which one also obtains the decay curve shape. Combining this method with the following definition of the chargeability we can calculate the integral interval by interval. Chargeability. This is defined as 1 /'*2 M = — / (f(t)dt  (3.121)  and is the most commonly used quantity in time domain IP measurement. When <p(t) and <f have the same units, the chargeability M has the dimensions of time, and is v  usually reported in milliseconds. Frequency effect. In frequency domain IP, one measures the apparent resistivity at two or more frequencies. The frequency effect is usually defined as  Pac where p , pac are apparent resistivities measured at direct current and very high fredc  quency. The per cent frequency effect is given by PFE = 100/e. In theory, since both IP methods measure the same phenomenon, their results ought to be the same; practically, however, the conversion of time domain to frequency domain and vice versa is quite difficult. Siegel [21] defines the chargeability as M=  {\im ip(t)-lim<p(t)}/lim £—•00  t—tO  t—>CO  (p(t)  (3.123)  By Laplace transform theory, it can be shown that lim <p(t) — J pdc n d l i m ^ i ) = Jpoo? a  t—*oo  t—1-0  (3.124)  Chapter 3.  Two Practical Problems  53  where p^ is the apparent resistivity at very high frequency and J is the current density. Consequently, assuming that p  ac  M  Pdc  ~  = p^, we can write for the chargeability  Pco  =  ,  = /Orfc  Pac  -.  =  1  1  1 - -i . ,  Pdc  1 +  /e  /e = 7—T ~ ^ 1 +  e  , (3.125)  fe  when / e C l . In practical situations this simple relation is not valid, partly because an exact theoretical analysis of the IP effect is not available (that is, the basic premises of the two systems of measurement are only approximately valid), and partly because the measurements are not made at D C and V H F (Very High Frequency) in either IP system. This complicates the conclusion from one result to the other.  3.2.2  Mathematical Relationship  According to the macroscopic representation of the physical property governing the IP response that was put forth by Siegel [21], we can use a macroscopic physical parameter called chargeability (a little different from that defined in (3.121)) to represent IP response.  Here our earth model is described by two. quantities: conductivity o~(x,y,z)  and chargeability rj(x,y,z). Both are positive, but while conductivity varies over many orders of magnitude, chargeability is confined to the region [0,1). A typical IP experiment involves putting a current / into the ground at the source position (x , y , z ) and measuring the resulting potential at various points some distance s  s  s  from the source. In a time domain system the current has a duty cycle which alternates the direction of the current and has off-times between the current pulses at which the IP voltages are measured. Let tp denote the potential that is measured in the absence of a  chargeability effects. This is the "instantaneous" potential measured when the current is turned on (see figure 3.14). In mathematical terms <fia=F [a], dc  (3.126)  Chapter 3.  Two Practical Problems  54  I  t  Figure 3.14: D C current for IP. where the forward mapping operator Fd is defined by the D C equation and its boundary c  conditions:  , - V • (a(x^)V^)  =IS{x-x )6{y-y )6(z-z ),  (3.127)  £¥v(x,y,0)  =0,  (3.128)  Y\m ^ Lp (x,y,z)  =0  Ra  where R = \J{xs  x)  2  s  co  a  + (y - y )  a  + (z -  2  s  8  8  '  (3.129)  z). 2  s  If the ground is chargeable then the potential (p , recorded well after the current is v  applied, will differ from tp . According to Siegel's formulation, the effect of the chargea  ability of the ground is modelled by using the D C resistivity forward mapping Fd but c  with the conductivity in (3.127) replaced by <7=<r(l — 77). That is,  V  with operator F  dc  v  = F [a(l dc  - )} V  being defined as above (3.127)—(3.129).  (3.130)  Chapter 3.  3.2.3  Two Practical Problems  55  T h e Forward P r o b l e m  The basic IP datum, which we refer to as apparent chargeability, is defined by (see figure 3.14) r  ] a  :=^ =  (3.131)  or  Equation (3.132) shows that the apparent chargeability can be computed by carrying out two D C resistivity forward modelling steps with the conductivities a and cr(l — rj). Here the equation (3.132) also defines the forward mapping for the IP data. From the definition of chargeability (3.132), we see that here IP data depend on conductivity a and chargeability 77 through differential operator Fd . c  To produce the  data, apparent chargeability r/ , we have to compute ip and <p from known distributions a  a  v  of conductivity cr(x,z) and chargeability r/(x,z). This requires solving the differential equation (3.127) with its boundary conditions two times. This is a forward Direct Current (DC) problem, an important subject in Geophysics. Although we refer to [12] for details regarding forward D C problems, it is possible to summarize the basic steps here. 1. Although a two-dimensional conductivity structure has been assumed in this work, the potential field that results from injecting current into the ground at a single point is still three-dimensional. The effective dimensionality of the problem can be reduced by considering the Fourier Cosine transform of the potential. This is possible since the potential can be thought as an even function of y. The transform is given by roo  tp(x,k ,z) = y  Jo  <p(x,y,z)cos(k y)dy y  (3.133)  where k is a spatial wavenumber in the y direction. The transformed potential is y  Chapter 3. Two Practical Problems  56  then found to satisfy the boundary value problem - -^Hx,  z)^-\  -T^HX, z)^}  + k a(x, z)<p 2  y  —<p(x,k ,0)  = ^6{x - x )S(z s  =0  y  2  s  (3.134)  (3.136)  y  s  s  (3.135)  lim (p{x, k , z) = 0 where r = \J{x — x )  - z)  + (z — z ) . 2  s  2. Before using standard numerical methods (finite difference, finite elements and so on) to solve equation (3.134), we remove its singularity in the right-hand side by means of asymptotic approximation. 3. When discretizing the differential equation (3.134), we need to produce a proper approximation for the boundary condition (3.136) based on wide grids around the boundary. 4. After solving equation (3.134) with its boundary conditions for different wavenumbers k , we have to take the inverse y  2 <p(x,y,z) = —  Fourier Transform f°°  <^(x, k , z) cos(k y)dk y  y  (3.137)  y  7T JO  to recover the potential (p(x,z) in the spatial domain. Theoretically our IP data are not only the apparent chargeability r} but also the a  potentials ip and ip , but in practical field work it is difficult to measure potentials at a  n  the instant the current is turned on or cut off. Instead the measured data are only the final potential <p and the remaining potentials (ip — c/v) after the current is switched v  v  Chapter 3. Two Practical Problems  57  off. We usually approximate the potential (<p — ip ) at the instant of cut-off with the v  a  remaining potential measured as soon as possible after the cut-off. Then the apparent chargeability n can be computed using formula (3.131). a  3.2.4  The Inverse Problem  Linearization The IP inverse problem is nonlinear and there are several ways to solve it [16]. One is to linearize the data equation (3.131) when the chargeability n <C 1. Let the earth model be partitioned into M cells and let rji and cr; denote the chargeability and electrical conductivity of the i  th  cell. Linearizing the operator  about the  conductivity model a yields (3.138) (3.139) Substituting into equation (3.131) yields  "a  (3.140)  =  A further approximation yields  Thus the i  th  datum is  M Vai  =  ^2  JijVji  = 1,...,N,  (3.142)  where (3.143)  <~Pi da  3  is the ij  th  element of the sensitivity matrix.  Chapter 3.  Two Practical Problems  58  So, instead of solving the nonlinear inverse problem (3.132) we can solve the linearized inverse problem Jn = n .  (3.144)  a  C o m p u t i n g T h e Sensitivity M a t r i x To linearize the nonlinear inverse problem (3.132), we have to compute the sensitivity matrix J for equation (3.144). From the measured D C potentials a D C inverse problem is first solved to compute a background a and a sensitivity matrix (difi/daj).  The D C  inverse problem is nonlinear and we refer [12] for details based on the subspace iteration method. Here we briefly review the methods of computing the sensitivity matrix since it is useful in general linearization. The most straightforward way to calculate the differential sensitivities is to approximate them using a one-side finite difference formula, but this method is inefficient since each sensitivity requires solving a completely new boundary-value problem. Efficient methods of computing the sensitivity matrix use either sensitivity equations or Adjoint Equations. 1. Sensitivity Equations. For the steady state diffusion problem •Lip  = - V • (<rV<p) + q(x)ip = f(x) in D  ' (3.145)  with boundary condition a(x)<p + 8(x)^ On  = 0 on <9£>  (3.146)  we can use the sensitivity equation approach of Rodi [17]. Taking o~(x) to be the model, and assuming the parameterization  M  <r(x) = X>^(£)  (3.147)  Chapter 3.  Two Practical Problems  59  where 1  if x in cell i  0  otherwise  Substituting the a from (3.147) into the equation (3.145) we get  M W  = -V • (J2(TiipiVip) + q(x)<p = f(x) in D  (3.148)  i=l  Differentiating with respect to cr,-, and writing  « W  =  (3-149) CCT,-  yields the sensitivity problem L<pi = - V • (aVifi)  + q(x)ipi = V • (V>,-(x)W) in D  a ( x ) ^ + ^ ( x ) ^ = 0 on dD an  (3.150)  -  (3.151)  To compute the sensitivities for model cr(x), the forward problem (3.145), (3.146) is first solved to obtain <p(x) at all points x in D.  For each parameter  the  corresponding sensitivity problem must then be solved to obtain <pi(x) which is then —*  —*  evaluated at each of the observation locations. Since the source terms V - (tpi(x)V<j>) are different for each i , a total of M + 1 forward problems must be solved to obtain all of the sensitivities. Since the M sensitivity problems and the original forward problem differ only in terms of their right hand sides, they can all be solved using the same numerical forward algorithm. Since the sensitivity problems and the forward problem are identical except for their respective source terms, their numerical solution can be done very efficiently if a direct solver is used in the forward code.  Chapter 3.  Two Practical Problems  60  2. Adjoint Equations. We can also calculate the sensitivities based on the adjoint Green's function. This approach has been used by Smith and Vozoff [19]. For the sensitivity problem (3.150)—(3.151) considered in the sensitivity equation approach, we consider the problem L*G* = - V • (a(x)VG*)  + q{x)G* = 6(x - £j) in D  (3.152)  BC*  a(x)G* +  = 0 on dD  where L* is an adjoint operator and G*(x,Xj)  (3.153)  is an adjoint Green's function.  Then we can get the sensitivity for an observation location from <p (xj) = / k  G  *  V  • i^kV<f>)dV.  (3.154)  JD  For a 2D problem with M cells and N observation data we must solve M + 1 forward problems in the sensitivity equation approach and AT + 1 forward problems in the adjoint equation approach. So we can use the sensitivity equation approach for overdetermined problems (M <C N) and the adjoint equation approach for underdetermined problems ( M ^> N).  The IP problem here is underdetermined so we use the adjoint equation  approach.  Solving The IP Inverse Problem After we have linearized the discretized IP inverse problem and obtained Jn = d,  (3.155)  we can use standard regularization methods to solve it. Here we summarize the basic steps in solving IP inverse Problems.  Chapter 3.  Two Practical Problems  61  1. Solve D C inverse problems with a subspace iteration method to get that background conductivity cr.  '  2. Use linearization to produce linear equation system (3.155). 3. Introduce the regularization operator || Lr\ || as in equation (3.111) with constant weighting functions.  4. Use SLIM  3.2.5  to solve the regularized normal equation.  A Numerical Example  In this example, the forward problem is to produce the IP data and the inverse problem is to recover the IP model from the produced data. We work on the same domain introduced in the magnetic computing example since we will later be solving both problems together. The chargeability distribution of the true model is illustrated in Figure 3.15: it consists of high chargeability part (77=0.15) and background chargeability part (77=0.001). The background conductivity is chosen as constant (<T=0.6 every cell). Here we use a Pole-Pole configuration (similar to the four-electrode D C configuration in figure 3.12, but with one current electrode and one potential electrode at "infinity") to introduce the current / and to measure the potential ip. We introduce electric current at 10 points sequentially. For each current source point, we measure the potential at 11 other points. So, we have 110 measurements all together. A little different from the true experimental IP inverse problem, in this example we use linearization to produce the sensitivity matrix J first, then produce the synthetic data d with the formula d = Jn, where 77 is the true model to be reconstructed.  (3.156) To simulate the measured data with  Chapter 3.  Two Practical Problems  62  errors, we add normally distributed noises to the data, i.e., df" = di + ei, i = 1...110  (3.157)  where e- has A^O, <T,-) distribution. Here <r,- = cr is 0.5% of the maximum of the absolute t  values of the exact 110 data. After we get the data, we solve the discrete inverse problem according to the above steps and use the introduced regularization method to invert the data. Here the regularization operator is II Hv - Vo) | | =  <*s /  2  w (v -  n ) dxdz 2  s  Q  J area  +a f  (^Lp^rd dz  x  Wx  where r?o = 0, w  s  = w  x  = w  z  +a  X  Jarea  f  z  OX  = 1, a  s  M ^ ^ f d x d z ,  Jarea  = 1, a  = 5 and a  x  z  OZ  — 10, and its discretization  produces the corresponding matrix II Ln  W rf || .  (3.158)  2  v  According to regularization method, we solve optimization problem Minimize <j> =|| W rj || +u || W (Jr] - d ) | | , 2  v  obs  2  d  (3.159)  where W is a diagonal matrix and its elements are 1/cr;. d  This is a standard optimization problem from linear inverse theory and we can solve it using  SLIM.  We plot the true model and the reconstructed model in figure (3.15) From the calculated output, we can note that the reconstructed model is at the correct position corresponding to the true model and the chargeability value of the calculated model is consistent with the true model (figure 3.15).  However, the inverse method  can not reveal the fact that there is no chargeability in the interior of the true model.  Chapter 3.  Two Practical Problems  63  This is due to the smoothing effect of the regularization operator which we use to solve the ill-posed IP inverse problem. However, it is possible to form IP structures with no chargeability in the interior, so it is important to develop inversion methods which can reveal this kind of IP model structure. This is what we will do in the next chapter..  Chapter 3. Two Practical Problems  Figure 3.15: IP Model.  64  Chapter 4  Combined Inversion  To get reasonable approximate solutions to ill-posed inverse problems, we should use all the available information. This leads us to consider the combined inversion of geophysical data based on the correlation between the different physical parameters.  4.1  Known Approaches  To overcome the nonuniqueness inherent in geophysical inverse problems, we can use the regularization method. However, this mathematical technique does not provide the additional information needed to uniquely determine the true model. Instead, it simply chooses the approximate solution with the minimum (semi-)norm which is determined by its regularization operator. In this section we summarize the possible remedies for this problem. First of all, to get a better approximation to the true model one natural approach is to collect more experimental data. Actually we use this approach for IP inverse problems. In IP field work the current electrode is moved to different points sequentially and the corresponding potentials are measured after the current electrode has been moved to each point. One obvious disadvantage of this approach is that we can not get as many data as we need, especially for practical problems. The second approach is to reduce the set of possible reconstructed models. In this way one tries to overcome the nonuniqueness of ill-posed problems by restricting possible solutions to a smaller set based on reliable information regarding the true models. We use 65  Chapter 4.  Combined Inversion  66  this approach for both of our practical inverse problems. In solving both the magnetic problem and the IP problem, we require the inverted models to be nonnegative. The third approach is to choose a proper regularization operator based on available information about the true models. For example, in solving magnetic inverse problems we choose weighting functions to compensate for the decay effects of the magnetic field. For general geophysical inverse problems, we should choose a regularization operator such that the reconstructed model is consistent with possible geological structures, mining recordings and so on. Although these approaches are useful in solving inverse problems, they are based on one kind of observed data alone.  If we try to combine the observed data of different  kinds, we come to the correlation approach.  4.2  Correlation Between Parameters  To the above approaches for solving inverse problems, we would like to add a new approach—using the correlation between different model parameters. In daily life, if we like to study a new material, it is natural for us to investigate its different characteristics. For example, to distinguish a kind of metal, we consider its color, density, conductivity and other characteristics at the same time. In the mineral industry and practical geophysical prospecting, it is common to use several kinds of prospecting methods in the same area. Therefore, observations of several different types are usually available. Although different prospecting methods usually investigate different parameters of the ores, these parameters are often correlated. One good example regarding this kind of correlation is the relation between seismic wave velocity and density in an underground area. We can use gravity measurements to investigate the density of an underground  Chapter 4.  Combined Inversion  67  area. At the same time, we can use seismic experiments to study the velocity of waves propagating through the region. It is known that the density and the wave velocity in a region are positively correlated. Here we would like to study the correlation between the chargeability n and the susceptibility K of an underground region. From geophysical measurements [20], we know the magnetic susceptibility of an ore depends only upon the amount of ferrimagnetic minerals present, mainly magnetite, sometimes ilmenite or pyrrhotite. Chalcopyrite and pyrite are typical of many sulphide minerals which are basically nonmagnetic. At the same time, the chargeabilities of minerals increase with the increase of the sulphide concentration. Chalcopyrite and pyrite are typical sulphide minerals with high chargeabilities. Here let us list the chargeability and susceptibility of some minerals from [20] in Table 4.2. The unit of chargeability 77. here is msec since it is computed using formula (3.121). name Pyrite Graphite Chalcopyrite Magnetite Hematite  chargeability n 13.4 11.2 9.4 2.2  susceptibility  /cx 1000 1.5 0.1 0.4  0.0  6000 6.5  Table 4.1: Chargeability and susceptibility of some minerals. The chargeability and susceptibility data for various minerals and rocks in the tables of [20] suggest that these two quantities are negatively, correlated. This correlation can also be shown by a mineral example. McMillan [13] shows in the forming process of porphyry deposits that high chargeability material like pyrite gives way outward to high susceptibility metal like magnetite. This characteristic is simulated in our numerical example where there is high chargeability material is at the outside and material with high susceptibility inside. We may not know the exact correlation between the susceptibility and chargeability  Chapter 4.  Combined Inversion  68  since it may depend on the sampling site and other unknown factors, but even an approximate effort to account for this correlation helps considerably to invert the magnetic and IP data.  4.3  Combined Inversion  Although we can invert the IP data for chargeability and the magnetic data for susceptibility independently, we propose to invert these two groups of data simultaneously. This will allow us to account for the correlation between the chargeability and the susceptibility. Since this amounts to adding more physical knowledge to our solution process, we expect it to give more reasonable solutions than the independent inversions do separately.  4.3.1  Optimization Problem  Now let us set up an optimization problem to invert the magnetic and IP data simultaneously.  Under the assumptions for the magnetic problem stated in Chapter 3, the  discrete magnetic inverse problem is G*IK = < S , where K £ R  M  (4.160)  represents the model to be reconstructed. Assuming the discrete form of  the regularization operator || Ln'\\  2  is  fa =|| W (K-KQ) K  || ,  (4.161)  2  we arrive at the following optimization problem: Minimize  cj) =|| W (K — KO) ||  2  K  K  Subject to fa, =|| W (G dl  - d^ ) 8  lK  || = 2  fa  di  (4.162)  Chapter 4. Combined Inversion  69  where <j) is our misfit criterion, W dl  dl  is an vYx x Ni data weighting matrix and <j>*  d  is the  target misfit. Similarly for the IP inverse problem: if its discretized linear inverse problem is (4.163)  Gn = d \ ob  2  and its regularization operator has the form (4.164)  <f>v=\\Wr,(ri-rio)\\ , 2  then we arrive at the following optimization problem: <^ =|l W (rj — no) ||  Minimize  2  v  Subject to <t>d =\\W (G n-dt)\\ =<t>  (4.165)  2  2  where (j) is our misfit criterion, W d2  d2  d2  2  d2  is an N x N data weighting matrix and d>* 2  2  d  is the  target misfit. If the two parameters K and n in each cell were independent, we could solve these two inverse problems independently. However, when there is a correlation between the two parameters, we should keep the solutions in the corresponding Confidence Region. Therefore, it is natural to solve the two inverse problems simultaneously. If (Ki,r]i)  GR  2  in cell i is a random vector and <J> (K,V) represents its probability c  density function, then its confidence region is determined as: {(«,»,)  (4.166)  : ^ ( « , 7 / ) < # } .  To keep the reconstructed model pair (K^, rji) for each cell in its confidence region, we introduce a new optimization problem: Minimize  (f> : = || W (m m  m  — m ) || , 2  0  Subject to <j> :=|| W (Gm - d ) || = <j>% obs  d  and  {(Ki,rji)  2  d  : <j) {^i-,r]i) < c  (4.167)  Chapter 4. Combined Inversion  70  where Ki...K )  K  ,  M  V m Jobs  =  (dl ,df) , bs  T  = Ei=i </>c and G Go  W,  W,di  =  W,d  2  )  w,m i  )  w  V  m2  The constrained optimization problem (4.167) can be transformed into the following unconstrained optimization problem Minimize <f>  := <j> + /i(j) + a (j) m  d  c  c  = || W (m - m ) | | +fi || W {Gm - d ) | | +cx 2  m  obs  0  d  (4.168)  2  c  From (4.168) we see that choosing (j) =0 corresponds to solving for the two parameters c  T) G R  M  and K G R  M  independently; if we enforce a correlation between 77 and K by setting  <f> ^ 0, we may keep the parameters 77 and K in a particular confidence region by choosing c  a suitable correlation weighting factor a . c  Consider the case when (f> is a convex quadratic function of m = (K,T]) G R , T  c  1 = —m Hm + g m + c, 1  T  2M  i.e.,  (4.169)  Chapter 4.  Combined Inversion  71  where H is an 2M x 2M positive definite matrix, g is a vector of 2M elements and c is a constant. Then solving optimization problem (4.168) amounts to solving its normal equation Am = 6,  (4.170)  where A = WlW  + uG WjWjG  + a H,  T  m  (4.171)  c  and b=WlW m m  + aG WjW Gd-a g:  (4.172)  T  Q  d  c  To solve linear equation system (4.170) we can use standard linear algebra techniques except choosing proper parameters u and a . In fact both the direct methods and iterative c  methods reviewed in chapter 2 may be used to solve this equation system. If <f> is not a quadratic function of m = (/c,r?) , we approximate 4> using Taylor's T  c  c  expansion and solve problem (4.168) with iterative methods. In fact, in a neighborhood of m^ \ n  we have  (j> (m) « ^(m'"') + g {m - m ) T  {n)  c  + \{m-  m ) H(m {n)  - m*"')  T  (4.173)  where g and H are the gradient and the Hessian matrix of d> at mS^ respectively. c  Combining (4.168) and (4.173), we obtain the approximate optimization problem for iterative solution m ^  n+1  ':  Minimize (f) '•= <t>m + P><t>d + ®c<f>c «|| W {m m  - m ) || +a \\ W (Gm 2  0  d  (4.174) - d° ) bs  ||  2  + a [ ^ ( m ^ ) + / ( m - m W ) + ^(m-m( )) ff(m-m( ))]. n  c  c  T  n  J  .  To solve this optimization problem iteratively is an important part in the combined inversion, we will return to it later.  Chapter 4.  4.3.2  Combined  Inversion  72  Statistical Description  To solve the optimization problem (4.174), we should know the correlation function <f>c(m). Although it is impossible to determine an exact distribution of (re, 77) € R  2  in each  part, from the lists of magnetic susceptibilities and IP chargeabilities of various minerals and rocks in [20] we can find that.any mineral or rock does not likely have both high chargeability  and high susceptibility.  We should approximate this correlation in our  inversion for magnetic susceptibility and IP chargeability. Here are some possible choices. 1. Lognormal format. This is appropriate when the random vector (re, 77) £ R  2  has a lognormal distribution. Let x = bare;, y = In 77; in each cell, i = 1...M. Under the assumption above, x and y have normal distributions with joint density function  f(x,y) =  -( ' W ' x  y  x  y)T  e  where E is the covariance matrix. Suppose (xi,yi),i  = 1...M are the observed values of (x,y),  theory we know J2iti( i,  J/i)S(x -, y,) is of ^(M)  x  T  t  then from statistical  distribution. Therefore, the confidence  region is  M {(x,y) : X>  2  -x, -  y)X(xi - x,  yi  Vi  - yf  =  2 X  ( M ) < C}.  (4.175)  t=i  If we write in the function ofreand 77, then the confidence region is determined by  <^(re,77) = (Inre— lnre ,ln77 — ln?7o)S(lnre — lnre ,ln77 — ln?7o) . T  c  0  0  (4.176)  Remark: Although the lognormal distribution is convenient theoretically, when we assume (re, 77) are lognormal we mean (re , 77,) are always positive. However, (re;, 77,) may have t  Chapter 4. Combined  Inversion  73  values being almost zero and this may result in computing difficulty when we compute the gradient and Hessian matrix of the function  d> (K,rj). c  2. P r o d u c t format. In practical computing, a function like (4.176) is impractical, so we seek simpler functions that approximate the correlation <^ (/c,r/). c  One possible choice is the product of two probability density functions; one for group with high chargeability but low susceptibility and the other one for the group with low chargeability but high susceptibility. For example,  <f) (K,n) = C  {a (K - K ) + 2a (K - K )(TI - 7] ) + a (v ~ Voi) } 2  U  2  01  01  12  01  22  x { 6 . ( / c - re ) + 26 (re - /c )(r? - r/02) + M ^ - 7 7 0 2 ) } , 2  n  2  02  12  02  where re and 7/ represent the susceptibility and chargeability of each cell, and reni  (4.177) a n  d V01  are the average values of susceptibility and chargeability of groups with high susceptibility and low susceptibility. Similarly reo and 7/02 are the average values of susceptibility and 2  chargeability of groups with low susceptibility and high susceptibility. Usually we choose  «i2 = b  12  = 0.  In this choice, we have a lot of freedom to adjust the shape of the contours determining the boundaries of the confidence regions, but the computing is still somewhat complicated. An even simpler example is } (K,ri) c  = Kn. 2  2  (4.178)  Obviously, if the value of function <^c(AC, 77) is small, it is impossible for the both values of K and n to be large. So, this function really reflects the correlation between re and 77 in each cell. If the relation for re and 77 is symmetric, (4.178) should be a good choice. Although (f> may remain small when one of re and 77 becomes huge, this possibility is c  avoided by the presence of the regularization operator.  Chapter 4. Combined  Inversion  74  Furthermore, if we choose a n = b  22  and n  02  = K  02  = 0, b  = 1, a i = a 2  t2  22  = b  n  = 0, T] = K 01  01  = 0  = 0 in function (4.177), what we get is just function (4.178). So function  (4.178) is a special case of function (4.177).  4.4  Computing Techniques  After we approximate the correlation with the function  ^ (K,T/), c  we can solve the op-  timization problem of combined inversion. This is not trivial'here since the objective function is neither quadratic nor convex.  4.4.1  Optimization Problem  For the optimization problem (4.168)  Minimize <j>  := <f> + n<f> +  a <f)  c W (m - m ) || +fi || cW (Gm - d° ) m  2d  0  m  hs  d  if we introduce the notation  <f>  d  =  \\Am-d°' r o b s 12 tobs 1 I + || G r, - d* •2  2  2  = (f>d! + <t>d  2  and W (m - m ) | | m  2  0  W « ( / c - « o ) | | + || W ^ - i / o ) II 2  2  || + a < £ , 2  c  Chapter 4. Combined Inversion  75  it will become the following unconstrained optimization problem Minimize <j> =  Here we choose a common chosen as  ii\<f>  dl  +  fi (j>d i 2  2  = [uf) + <j> + a <t>c d  P^di  for the misfit  fi  m  + U<f>d +  4.4.2  <t>ri  <f>K +  2  +  « c ^ -  Theoretically the misfit term should be  <j>d-  but this choice will bring more parameters to be decided. In the  numerical examples, the computing output shows we just choose  (4.179)  c  4> \ ~ d  <f>*  dl  and <f>  d2  ~  e  (j>  v  d2  e  n  though  = p — p.. 2  Approximating the Nonlinear Correlation  If we let K{ and rji denote the susceptibility and chargeability of each cell i and function  <^c(«i,  Vi) =  {aii(«» - « o i ) + 2a (Ki - K i){r)i - rj ) + a (rn - ?7oi) } 2  2  12  x{M««  _  0  02) + 26i (/c,- -  K  2  2  22  01  n )(r}i 02  - rj ) + b (r)i - V02) } (4.180) 2  02  22  denote their correlation in the cell, then the total correlation <f) in optimization problem c  (4.179) is M  (4.181) t=i M  52{ n( i a  K  - oi) K  2  + 2a (ni 12  -  K )(rn 01  x{6n(/Cj - / c ) + 2&i (rei - K )(rii 0 2  2  2  02  -  -  7/01)  + a22(»?i - ??oi) }  7/02) + 6 2 2 -  2  T702) }2  Similarly, if we use the correlation function (4.182) instead of function (4.180), we can get corresponding total correlation M  M  (4.183)  Chapter 4.  Combined Inversion  76  For both cases'(4.181) and (4.183), </> is not a quadratical function of c  m =  (4.184)  (K ...K ,m--- lM) , r  T  M  1  so we have to use an iterative method to solve problem (4.179). At each iteration we solve a problem of the form (4.174). This is Newton's method. In each step, we have to compute the gradient vector, matrix, H =  (hij^jL-y  of  cj) , c  g = (gi, ...,g M) , T  2  and Hessian  that is, d(f) (m) c  (4.185)  dm;  and d <j> (m) 2  =  c  dmidm;  1 3  \m=Tn( ) n  (4.186)  i  where m = («i...KM,?/I---?7M)TFor our specific correlation function <j> (4.183), we have c  2KiT}  2  { 2K T]i 2  i <  M  .  i > M  and '  i =  2nf  2K  I = j > M,  AKim  i < M  and j = M +  AKirji  i > M  and j = i'—  2  hn =  I  j<M,  0  i, M,  otherwise.  Similarly, we can find the analytical expressions of g and H for the specific correlation function 4.4.3  <f>  c  of (4.181).  P r o d u c i n g the Initial M o d e l  Neither of correlation functions 4> in (4.181 or (4.183)) is convex. c  Chapter 4.  Combined Inversion  77  In the first case, fa = K r] , 2  (4.187)  2  has Hessian matrix 2rj  2  4KTJ  2K,  2  4Kn  which has negative determinant, so it is nonconvex. In the second case,  <f>c =  (iti - « o i ) + 2  X{6II(KJ with a  K  Q  )  2  2a (Ki  +  2  -  12  K )(r}i  2b {Ki  01  rj  0  1  }+  a (rji 22  ~  ??oi) } 2  K )(fn - n ) + b (r]i - r] ) }, 2  -  X2  -  02  02  22  02  = &12 = 0. To show it is a nonconvex function we can choose  12  m  1  =  (rvni,r?oi)  =  (K ,T]02)  m  2  T  ^  G  t  02  R, 2  i? , 2  then cj) (m )~= 0 and </>(m) = 0. However, when we let 1  c  2  c  m = (m + m )/2, 3  then (f) (m ) > 0 unless an = a 3  c  22  1  (4.188)  2  = &n = &22 = 0 which is a trivial case.  Actually we are using Newton's method when we expand <j> as in equation (4.174) and c  use iterative methods to solve it. To get a reasonable solution to this kind of optimization problem, we must start with a good initial model m^°\ Our approach is to generate an initial model by setting <j) = 0 in the optimization c  problem (4.179) . In other words, we minimize <f>(m)  =u<j> + <f  = n\\Gm-  d  (4.189)  )m  d°  bs  | | + || W (m - m ) | | . 2  2  m  0  This is a standard optimization problem for linear inverse theory and we solve it as in Chapter 3 to produce the initial model m^°\  Chapter 4.  4.4.4  Combined Inversion  78  Choosing the Correlation Weighting Parameter  After we introduce (f> { ) to the optimization problem (4.179), we note the effects of m  c  correlation term <j) and regularization term <f) are in opposition. The regularization term c  m  (f> increases as the correlation term <f> decreases. The reason is that our regularization m  c  operator tends to smooth the reconstructed model while the correlation tends to reveals the nonsmooth characteristic of the true model. Therefore, we should choose a proper  If we choose a  c  too large, the initial model gets a huge perturbation and may not  converge to the ideal model anymore. On the other hand, if we choose a too small, the c  smoothing effect of (j> may swamp the effect of correlation. To balance the effects of the m  correlation term and the regularization term, we should choose a  c  such that these two  terms are compatible. However, the correlation <f> decreases as the iteration proceeds, so a c  c  should be in-  creased continually to maintain the compatibility between cf> and || Lm || . 2  c  Actually, in each big step, we fix the parameter a  and use SLIM to solve the linear  c  system Am - b,, where  A = WlW b = WlW m m  + fxG WjWjG T  m  + uG WjW Gd T  0  d  (4.190)  + a H,  (4.191)  c  - a g + a Hm^. c  c  (4.192)  Recall that (4.190) is the normal equation of the optimization problem (4.174). After SLIM solves problem (4.195), we update the parameter a and run SLIM again. c  For our numerical examples, we usually need several steps to get the final results. How to modify SLIM such that it can choose ct automatically will need further work in the c  future.  Chapter 4.  Combined Inversion  79  Here we update cx by replacing it with ca , where c is an increasing factor (c > 1). c  The initial a  c  c  is chosen such that a. cf) is compatible with <6 . Similarly, in each following c  m  c  step, we choose ct such that cx (j) is compatible with the new value of (f> . c  4.4.5  c  c  m  B a l a n c i n g the Regularization O p e r a t o r  In problem (4.174), the model norm cj) is the regularization operator. Here we have m  <t> = fa + K  (4.193)  m  where  =|| L(K - K )  (f>  K  ||  2  0  = <f>sl + (f>xl + <f>zl = a si /  W (K — Ko) dxdz 2  S  +  J area  j u>*( ^ r °^) dxdz d  ct \ x  Jarea  K  + ai  2  z  OX  i  Jarea  w(^ ~ d  K  z  (4.194)  ^) dxdz.  K  2  Oz  = 1 1 (V ~ Vo) II  <f>ri =  K  2  L  <j>s2 +  —a  s 2  (f>x2 + <j>z2  /  w (rj - r} ) dxdz + 2  s  0  J area  c  x 2  [  w ( ~ ) dxdz d{r}  Vo)  +  2  x  Jarea  a  z  2  OX  I  w^'^fdxdz.  Jarea  (4.195)  Oz  When we solve inverse problems using regularization method, our output is smoothed by the regularization operator. At the same time, the correlation term 4> reduces the c  smoothing effect of the regularization operator. To keep the smoothing effects of (f) and K  cf) to be compatible, we require the values of </> and (b to be similar. So, we should v  K  choose proper ct i, a i and ct i, and a , s  x  z  s2  a  x2  and a  z2  v  such that </> and <b are compatible.  We will provide these values in our numerical examples.  K  v  Chapter 4.  4.4.6  Combined Inversion  80  W o r k i n g with S L I M  SLIM [15] is designed mainly for linear inverse problems. For our combined inversion with correlation term <f) which is neither quadratic nor convex we have to approximate c  it iteratively, so we should modify the original code. After changing the code of SLIM, we can calculate H and g in each iteration. Furthermore, we can compute the gradient of <j> in each step which is useful in producing c  the vectors of the subspace. When we use the subspace method to solve problem (4.179), in each iteration we solve a linear system B6m = b  (4.196)  for a perturbation 6m in the subspace V„ = s p a n { u i , v } instead of the whole solution q  space R , that is, 6m = YX=i fii i- Here 2M  v  B = (WlW  + fiG WjW G  + a H),  T  m  d  c  and b = -WlW {rnW  - m ) - fiG WjW (Gm^  - d° )  T  m  0  bs  d  - ag c  Choosing the right subspace V is an important step in subspace methods. For our case, s  we produce the subspace vectors as follows:  = V^ , v = V<^, "3 = V<£«, v = V</>„, "5 = V (</>* + </>,,),. U6 = V(^ + ^) v = V(^ i + < />), V = V((f) i + <j) )  vi  dl  2  2  4  K  7  s  2s  v = \7(<f> i 9  so q = 10 here.  z  + (t> ) z2  8  v  10  x  s2  = V<f> , c  Chapter 4. Combined Inversion  81  To produce the steepest descent directions, the Conjugate Gradient Method is used to solve the related linear system of equations. We refer to [15] for details.  4.5  A Numerical Example  We have already solved a magnetic inverse problem and an IP inverse problem independently in Chapter 3. Now we can solve them simultaneously with our combined inversion method. According to the above steps, we first solve problem (4.179) with a = 0. To solve c  this problem we choose 1.2 as the decreasing factor of fa in the SLIM. Here we list the parameters for the regularization operator || Lm | | in table (4.5). 2  name <t>K  <f>n  a  a 260. 2.5  s  x  160. 0.5  2000. 5.  Table 4.2: Parameters of regularization operator. Then we use the correlation function fa defined by  M  4c  M  = Y.MWi) = Y,"Wi-  . • • (4.197)  j=l  1=1  We choose the initial correlation coefficient a = l.EA and increase a with the multiplyc  c  ing factor c = 1.5. The outputs are plotted here (see figure 4.16). In a small departure from the theoretical analysis, here I chose 320 as the target misfit for 310 data. When we set a = 0, we get the magnetic inversion model and IP inversion model c  plotted in figure (4.16). The output shows that the smoothing effect of the regularization operator makes the magnetic susceptibility K low and the IP chargeability 77 smooth (compare the output in Table 4.5).  •  After we introduce a ^ 0 at iteration 306, we get significantly improved reconstrucc  tions for both model components (figure 4.25) with fa(m) being denned as in (4.178)  Chapter 4.  Combined Inversion  82  name True Model  4>n Correlation </> 51 0.25 x 10~  <t>K <f>d,V  c  192  28  102  Inverted Model (a = 0)  228  20  92  58  0.62 x l O "  2  Inverted Model (cx ^ 0)  221  63  99  201  0.58 x l O "  4  c  c  5  Table 4.3: Output with ct = 0. c  (see Table 4.5). From the output we find the new calculated magnetic model is a better approximation to the true model and the new calculated IP model reveals the discontinuity of the true model. The iteration process is shown in figures (4.17), (4.18), (4.19) and (4.20). In fact, although the total misfit does not change much after introducing correlation factor  a, c  the correlation decreases slowly. Similarly, if we choose a correlation function <f> (m) defined as c  M  0c =X^^ ( '^') < C /Ci  i=l  M  = YH ^( i a  K  ~  K  o i ) + 2a {Ki - K )(rji 2  12  i=l  .x{6n(«j -  K02)  01  -  7701) + a (r]i 22  + 26 (« - - K )(rn - 7 7 0 2 ) + b (rn -  2  12  8  22  02  ~ Voi) } 2  7/02) }, 2  where a  n  = 64,  = 0.1,  Ct22  = 0.5,  rjoi= 0.01,  = 0,  bi2 =  611 = 64,  b = 0.1,  K02 = 0.01,  V02= 0.15,  22  then the correlation (f> for true model is 0.89. c  (4.179) with a  c  0,  = 0, then the correlation  <f>  c  If we solve the optimization problem  is 18. After we choose a  c  > 0 and use the  Chapter 4.  Combined Inversion  83  combined inversion, we also observe improved reconstruction relative to the independent inversion in Chapter 3. (figure 4.26) (see 4.5). Its correlation curve is plotted in Figure (4.23) and the total objective function (j) curve is plotted in Figure (4.22). We note both the correlation (f> and total objective function (j> decrease with the iteration, after we let c  OL > C  0. name Magnetic IP Correlation  <f>d  (f>m  232.6 87.4  55.9 280.4  (from) 18.0  (to) 0.84  Table 4.4: Output with a ^ 0. c  4.6  Conclusion  To solve ill-posed inverse problems, usually some kind of regularization is used. Regularization methods overcome the nonuniqueness of ill-posed problems by choosing a smooth model with minimum (semi-)norm. However, sometime the true model does not possess a smooth structure, so the regularization method based on smoothing does not work. To reveal this kind of nonsmooth structure, we consider different properties of the true model and find there is a correlation between its different physical parameters.  With  this additional information, it is natural to put it into the objective functional in solving the original inverse problem. In this thesis one more penalty is put into the objective function and corresponding combined inversion is developed. In doing this I found there is difficulty in solving the resulting optimization problem since the objective functional is neither convex nor quadratic. Even in the simple examples of this thesis we tried different computing techniques in solving the optimization problem. This combined inversion does reveal the nonsmooth characteristic of the true  Chapter 4.  Combined Inversion  84  model and the reconstructed models are much better than those from the traditional regularization. In contrast to traditional regularization methods which usually use one kind of data, this combined inversion method based on correlation uses observed data of different kinds. Although in this thesis we use the combined.inversion to invert the susceptibility and the chargeability only, this method may be used to solve similar inverse problems arising from many areas. Generally speaking, if there is a correlation between two different physical parameters and each of these parameters can be inverted from corresponding data independently, we can use combined inversion to invert these two parameters simultaneously by keeping the reconstructed parameters in their confidence regions.  The key part in  using the combined inversion is to find the inherent correlation between the parameters and use proper functions to approximate them. Since combined inversion makes effective use of more information in solving ill-posed inverse problems, more reasonable outputs are obtained. The combined inversion method here is developed to solve linear inverse problems, but it should be possible to develop similar combined inversion in solving nonlinear inverse problems. Based on this combined inversion, similar methods may be developed using the correlation among more than two physical parameters. In developing the combined inversion an efficient software package which can choose several parameters automatically in solving the resulting optimization problems should be developed.  Chapter 4.  85  Combined Inversion  CO  is -j  L'J  • 1  OCO  1—-  L'J 1 1  o PO  CO  CO  t  1  O ro  CO  N  C!l 1 1  o  p  o on  o  o ro  o ro  i-»  o ro  - - 2  CO  o  Reconstructed Magnetic Model  to  a  CO 1 1  Q CO  H»  03  CC •  i fao  to  cr a  1 1  Q ro  —^  —i > fed i i roa  -0 CO  — 1 1  o ro  OB CO M 1 1  ro  a o  p  ro  p  i—.  o  CO  R e c o n s t r u c t e d IP M o d e l Figure 4.16: Reconstructed Models with & • ••  p  Chapter 4.  Combined Inversion  '86  180 'phim_mag'  160 140 <u o g  u  120 100  <D  4J  G  di  a  300  . 400 Iteration  700  Figure 4.17: Regularization term for magnetic part, <j> . K  300  300  7 00  . 400 Iteration  Figure 4.18: Regularization term for IP part, (j> . v  Chapter 4.  Combined Inversion  87  300 400 Iteration  700  Figure 4.19: Total Misfit, <f> . d  Figure 4.20: Correlation Function, <f> (m). c  Chapter 4.  Combined Inversion  88  300 400 Iteration  700  Figure 4.21: Total sum of the objective function, (f>.  ' tot'  100  150  200  250 Iteration  300  350  400  Figure 4.22: Total sum of the objective function, (j>.  450  500  Chapter 4.  °' 0 8  Combined Inversion  50  100  150  89  200  250. Iteration  300  350  400  450  500  Figure 4.23: Correlation Function, <f) (m). c  250. Iteration  500  Figure 4.24: Total misfit, <j> . d  Chapter 4. Combined Inversion  90  X  0  .A  o  o  o  c o  o  rocs  o  en o  N  o O  03 O  01 02 0C  Ot 09  CO CO  W CO  M 1  M 1 i  to 1  O CO  -J  O W  o  —  1 1  O  09  09  ~1  1 1  1 c CO  o ro  c I—*  co  o  o  o  -a  so  o ho  R e c o n s t r u c t e d IP M o d e l Figure 4.25: Reconstructed Models with * = « V -  to o  o  Chapter 4. Combined Inversion  R e c o n s t r u c t e d IP M o d e l Figure 4.26: Reconstructed Models with ^function product.  91  Bibliography  Anger, G . , Gorenno, R., Jochmann, H . , Moritz, H . and Webers, W.: Inverse Problems: Principles and Applications in Geophysics, Technology and Medicine, Berlin: Akademie Verlag, (1993). Courant, R. and Hilbert, D.: Methods of Mathematical Physics, Interscience, New York, (1953). Dey, A . , and Morrison, H . F. Resistivity modeling for arbitrarily shaped twodimensional structures. Geophysical Prospecting 27, 106-136, (1979). Gerkens, J . C.: Foundation of Exploration Geophysics. Elsvier Science Publishers B . V . , The Netherlands, 1989. Golub, G . H . and Van Loan, C. F.: Matrix computations. The Johns Hopkins University Press, Maryland (1983). Groetsch, C. W.: Generalized inverses of linear operators: representation and approximation. Marcel Dekker, New York (1977). Hadamard, J.: Lectures on Cauchy's problems in linear partial differential equations. Yale University Press, New Haven, 1923. Heinz W . Engl, L.: Regularization methods for the stable solution of inverse problems. Surveys on Mathematics for Industry 3: 71-143 (1993). Martin Hanke, K . and Per Christian Hansen, L.: Regularization methods for largescale problems. Surveys on Mathematics for Industry 3: 253-315 (1993). Lorrain, P., Corson, D. P. and Lorrain, F.: Electromatnetic fields and waves, W. H . Freeman and Company, New York, 1988. McGillivray, P. R. and Oldenburg, D.W.: Methods for computing Frechet derivatives and sensitivities in the nonlinear inverse problem-a comparative study. Geophys. Prosp. 38: 499-524 (1990). McGillivray, P. R.: Ph.D Thesis, University of British Columbia, (1992). McMillan, W . J . : Porphyry Deposits in the Canadian cordillera; Ore deposits, tectonics and metallogeny in the Canadian Cordillera. Province of British Columbia; Ministry of Energy, Mines and Petroleum Resources, (1994). 92  Bibliography  93  Morozov, V . A.: Methods for solving incorrectly posed problems. Springer, New York (1984). Oldenburg, D. W. and L i , Y . : Subspace Linear Inverse Method, Inverse Problems 10: 915-935 (1994). Oldenburg, D. W. and Li, Y . : Inversion of Induced Polarization Data, to appear. Rodi, W . L . : A technique for improving the accuracy of frnited element solutions for M T data. Geophysical Journal of the Royal Astronomical Society 44, 483-506 (1976) . Sheriff, R. E.: Encyclopedic dictionary of exploration geophysics. Society of Exploration Geophysics, 1991. Smith, N . C . and Vozoff, K.: Two-dimensional D C resistivity inversion for dipoledipole data. I E E E Transactions on Geoscience and Remote Sensing GE-22, 21-28 (1984). Telford, W . M . , GelDart, L. P. and Keys, D. A.: Applied Geophysics. Cambridge; New YorK: Cambridge University Press, (1990). Siegel, H . 0.: A theory of induced polarization effects for step-function excitation. In Overvoltage Research and Geophysical Applications, J . R. Wait, ed. pp. 4-21. London: Pergamon, (1959). Tikhonov, A. N . , Arsenin, V . Y . : Solutions of ill-posed problems. Wiley, New York (1977) . Van Loan, C. F.: Generalizing the singular value decomposition. SIAM J . Numerical Analysis 13: 76-83 (1976). Varah, J . M . : A practical examination of some numerical methods for linear discrete ill-posed problems. SIAM Review 21: 100-111(1979). Varah, J . M . : Pitfalls in the numerical solution of linear ill-posed problems. SIAM J. Sci. Statis. Comput. 4: 164-176 (1983). Wahba, G.: The practical approximate solutions to linear operator equations when the data are noisy. SIAM J . Numer. Anal. 14: 651-667 (1977).  


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items