UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Fast imaging with surface-related multiples Tu, Ning 2015

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata


24-ubc_2015_september_tu_ning.pdf [ 8.43MB ]
JSON: 24-1.0166623.json
JSON-LD: 24-1.0166623-ld.json
RDF/XML (Pretty): 24-1.0166623-rdf.xml
RDF/JSON: 24-1.0166623-rdf.json
Turtle: 24-1.0166623-turtle.txt
N-Triples: 24-1.0166623-rdf-ntriples.txt
Original Record: 24-1.0166623-source.json
Full Text

Full Text

Fast imaging with surface-related multiplesbyNing TuB.Eng., Beijing University of Posts and Telecommunications, 2005M.Sc., Tsinghua University, 2009a thesis submitted in partial fulfillmentof the requirements for the degree ofDoctor of Philosophyinthe faculty of graduate and postdoctoralstudies(Geophysics)The University of British Columbia(Vancouver)August 2015c© Ning Tu, 2015AbstractSurface-related multiples, which are waves that bounce more than once be-tween the water surface and the subsurface reflectors, constitute a signifi-cant part of the data acquired in marine seismic surveys. If left untreated,they can lead to misplaced phantom reflectors in the image, and result inerroneous interpretations of the subsurface structure. As a result, thesemultiples are removed before the imaging procedure in conventional seis-mic data processing. However, because they interact more with the sub-surface medium, they may carry extra information that is not present inthe primaries. Therefore instead of removing these multiples, a more de-sirable alternative is to make active use of them. We derive from the well-established “Surface-Related Multiple Elimination” relation, and arrive at alinearized expression of the wave-equation based modelling that incorporatesthe surface-related multiples. We then present a computationally efficientapproach to iteratively invert this expression to obtain an image of the sub-surface from data that contain multiples. We achieve the computationalefficiency inside each iteration by (i) using the wave-equation solver to im-plicitly carry out the expensive multiple prediction; and (ii) reducing thenumber of wave-equation solves during data simulation by subsampling themonochromatic source experiments. We show that, compared with directlyapplying the cross-correlation/deconvolutional imaging conditions, the pre-sented approach can suppress the coherent imaging artifacts from multiplesmore effectively. We also show that, by curvelet-domain sparsity promot-ing and occasionally drawing new data samples during the inversion, theproposed inversion method gains improved robustness to velocity errors inthe background model, as well as modelling errors incurred during lineariza-tion of the wave-equation. To combine the information encoded in boththe primaries and the multiples, we then propose a highly-accurate sourceestimation method to jointly invert the total upgoing wavefield. We showwith field data examples that we can reap benefits from both the relativenoise-free primaries and the extra illumination coverage of the multiples.iiWe also demonstrate that the inclusion of multiples help mitigate the am-plitude ambiguity during source estimation. We conclude the thesis with anoutlook for future research directions, as well as potential extensions of theproposed work.iiiPrefaceAll of the presented work is carried out with the supervision of Dr. FelixHerrmann, in the Seismic Laboratory for Imaging and Modelling at the Uni-versity of British Columbia. Dr. Felix Herrmann was extensively involvedin all my research projects as the supervisory author, from conceptual for-mation to experiment design and manuscript edits.A version of Chapter 2 has been published [Ning Tu and Felix J. Her-rmann, Fast imaging with surface-related multiples by sparse inversion, Geo-physical Journal International, 2015, vol. 201, p. 304-317]. I was the leadinvestigator and the manuscript composer.A version of Chapter 3 has been published [Ning Tu, Xiang Li, and FelixJ. Herrmann, Controlling linearization errors in `1 regularized inversion byrerandomization, SEG Technical Program Expanded Abstracts, 2013, vol.32, p. 4640-4644]. I was the lead investigator and the manuscript composer.Xiang Li contributed in the development of the algorithm.A version of Chapter 4 has been published [Ning Tu, Tristan van Leeuwen,and Felix J. Herrmann, Limitations of the deconvolutional imaging conditionfor two-way propagators, in SEG Technical Program Expanded Abstracts,2013, vol. 32, p. 3916-3920]. I was the lead investigator and the manuscriptcomposer. Tristan van Leeuwen was involved in the concept formation.A version of Chapter 5 has been submitted for publication [Ning Tu,Aleksandr Aravkin, Tristan van Leeuwen, Tim Lin and Felix J. Herrmann,Source estimation with multiples—fast ambiguity-resolved seismic imaging,submitted on May 14 2015]. I was the lead investigator, responsible for con-cept formation, carrying out numerical experiments, and manuscript com-position. Aleksandr Aravkin and Tristan van Leeuwen provided proof ofthe theory and developed the numerical algorithm that I use the solve theproblem. Tim Lin contributed in the concept formation.A version of Chapter 6 has been published [Ning Tu and Felix J. Her-rmann, Fast least-squares imaging with surface-related multiples: applica-tion to a North-Sea data set, The Leading Edge, 2015, vol. 34, no. 7, p.iv788-794]. I was the lead investigator and the manuscript composer.A version of Chapter 7 has been published [Felix J. Herrmann, NingTu, and Ernie Esser, Fast “online” migration with Compressive Sensing,77th EAGE Conference & Exhibition, 2015, Madrid, Spain]. Felix moti-vated and supervised this research project, and composed the manuscript. Icontributed to the concept formation, developed the algorithm, and carriedout the numerical experiments. Ernie Esser was involved in the conceptformation.A version of Chapter 8 has been published [Rajiv Kumar, Ning Tu,Tristan van Leeuwen, and Felix J. Herrmann, Least-squares extended imag-ing with surface-related multiples, CSEG GeoConvention 2015, Calgary,Canada]. Rajiv Kumar was the leading investigator and the manuscriptcomposer. I contributed to the concept formation, algorithm development,and experiment design. Tristan van Leeuwen was involved in the early de-velopment of the algorithm.vTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . viList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiGlossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xviiiAcknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . xix1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Surface-related multiples: the origin and the problem . . . . . 11.2 A treasure in disguise . . . . . . . . . . . . . . . . . . . . . . 21.3 Curvelet-domain sparsity promoting . . . . . . . . . . . . . . 31.4 Theme and objectives of the thesis . . . . . . . . . . . . . . . 61.5 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 Fast imaging with surface-related multiples by sparse in-version . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132.2.1 Contributions of this work . . . . . . . . . . . . . . . . 152.2.2 Paper outline . . . . . . . . . . . . . . . . . . . . . . . 152.3 Imaging with multiples . . . . . . . . . . . . . . . . . . . . . . 162.3.1 Physics of the free surface . . . . . . . . . . . . . . . . 162.3.2 Modelling the free surface via areal sources . . . . . . 18vi2.3.3 Migration with multiples by cross-correlation . . . . . 212.4 Fast sparsity-promoting imaging with multiples . . . . . . . . 232.4.1 Subsampling monochromatic source experiments . . . 232.4.2 Removal of source crosstalks by promoting sparsity . . 242.4.3 Further acceleration by rerandomization . . . . . . . . 262.4.4 Sparsity promoting by `1 or `2 . . . . . . . . . . . . . 262.4.5 Putting it all together . . . . . . . . . . . . . . . . . . 282.5 Performance analysis . . . . . . . . . . . . . . . . . . . . . . . 282.5.1 Solution path and model error . . . . . . . . . . . . . 292.5.2 Computational considerations . . . . . . . . . . . . . . 312.6 Synthetic case study . . . . . . . . . . . . . . . . . . . . . . . 312.7 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332.7.1 The source wavelet . . . . . . . . . . . . . . . . . . . . 332.7.2 Imaging with multiples only . . . . . . . . . . . . . . . 352.7.3 Sensitivity to the background velocity model . . . . . 372.7.4 Internal multiples . . . . . . . . . . . . . . . . . . . . . 392.7.5 Other applications . . . . . . . . . . . . . . . . . . . . 392.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393 Controlling linearization errors in `1 regularized inversionby rerandomization . . . . . . . . . . . . . . . . . . . . . . . . 443.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453.3 The full-data case . . . . . . . . . . . . . . . . . . . . . . . . 463.4 The subsampling case . . . . . . . . . . . . . . . . . . . . . . 473.5 Synthetic examples . . . . . . . . . . . . . . . . . . . . . . . . 483.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514 Limitations of the deconvolutional imaging condition fortwo-way propagators . . . . . . . . . . . . . . . . . . . . . . . 534.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534.3 Theoretical analysis . . . . . . . . . . . . . . . . . . . . . . . 544.4 Stylized examples . . . . . . . . . . . . . . . . . . . . . . . . . 564.5 Imaging of multiples . . . . . . . . . . . . . . . . . . . . . . . 574.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 595 Source estimation with multiples . . . . . . . . . . . . . . . 635.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 635.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64vii5.2.1 Related work and our contribution . . . . . . . . . . . 655.2.2 Paper outline . . . . . . . . . . . . . . . . . . . . . . . 665.3 Problem formulation and method . . . . . . . . . . . . . . . . 665.3.1 Compressive imaging . . . . . . . . . . . . . . . . . . . 665.3.2 Source estimation . . . . . . . . . . . . . . . . . . . . 675.3.3 Variable projection . . . . . . . . . . . . . . . . . . . . 685.3.4 Relaxing the `1-norm constraint . . . . . . . . . . . . 705.3.5 Acceleration by redrawing random subsets . . . . . . . 705.4 Resolving scaling ambiguities with multiples . . . . . . . . . . 715.4.1 Compressive imaging with total upgoing wavefields . . 725.4.2 Putting it all together . . . . . . . . . . . . . . . . . . 745.5 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 745.5.1 Imaging with primaries only . . . . . . . . . . . . . . . 755.5.2 Scaling images with multiples . . . . . . . . . . . . . . 815.5.3 Convergence analysis . . . . . . . . . . . . . . . . . . . 855.5.4 Robustness to wavelet initial guess . . . . . . . . . . . 875.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 885.6.1 A note on the “true-amplitude” seismic imaging . . . 885.6.2 Source estimation as a “garbage collector” . . . . . . . 895.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 906 Application to a field data set . . . . . . . . . . . . . . . . . 926.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 926.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 936.3 Imaging with multiples & source estimation . . . . . . . . . . 966.4 Challenges during imaging with multiples . . . . . . . . . . . 986.4.1 Data preprocessing . . . . . . . . . . . . . . . . . . . . 986.4.2 The need to invert . . . . . . . . . . . . . . . . . . . . 986.4.3 Imaging & source estimation with multiples . . . . . . 996.4.4 Fast randomized iterative inversion . . . . . . . . . . . 1036.5 North-sea case study . . . . . . . . . . . . . . . . . . . . . . . 1036.5.1 Joint versus separate inversions . . . . . . . . . . . . . 1046.5.2 Suppressing imaging artifacts from multiples . . . . . 1076.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1107 Future work: sparse seismic imaging simplified . . . . . . . 1117.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1117.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1117.3 Fast sparse solvers . . . . . . . . . . . . . . . . . . . . . . . . 113viii7.3.1 Solution by Randomized Iterative Shrinkage-ThresholdingAlgorithm (RISTA) . . . . . . . . . . . . . . . . . . . 1137.3.2 Solution by Randomized Iterative Shrinkage-thresholdingKaczmarz Algorithm with linearized Bregman (RISKA)1147.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1167.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1168 Further extension: least-squares extended imaging withsurface-related multiples . . . . . . . . . . . . . . . . . . . . . 1198.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1198.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1208.3 Extended imaging with free-surface multiples . . . . . . . . . 1208.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1228.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1239 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1279.1 Significance and potential application . . . . . . . . . . . . . 1289.2 Limitation and future work . . . . . . . . . . . . . . . . . . . 129Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131A Additional details of methodology . . . . . . . . . . . . . . . 141A.1 Solving the BPDN problem using SPG`1 . . . . . . . . . . . . 141A.2 Subsampling noise and approximate message passing . . . . . 142ixList of TablesTable 2.1 SNRs as a function of the number of monochromatic sourceexperiments and the number of iterations. . . . . . . . . . 32Table 2.2 SNRs as a function of the number of simultaneous sourcesand frequencies. The number of monochromatic sources aswell as the number of iterations are fixed. . . . . . . . . . . 32xList of FiguresFigure 1.1 Illustrations of the origin of the surface-related multi-ples. (a) The ray-paths of the primaries. (b) The ray-paths of a primary event (in blue) and its correspond-ing first/second/third-order surface-related multiples (inred/purple/green respectively). . . . . . . . . . . . . . . . 3Figure 1.2 (a) The waveform of primaries. (b) The waveform ofdata that contain multiples. . . . . . . . . . . . . . . . . . 4Figure 1.3 (a) The RTM image of primaries. (b) The RTM imageof data that contain multiples. . . . . . . . . . . . . . . . 5Figure 1.4 Illustrations showing the wider illumination coverage ofmultiples compared with primaries. The translucent bluecolor highlights the illuminated subsurface structure bythe primaries (top) and the surface-related multiples (bot-tom). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6Figure 1.5 Curvelets at different angles and scales. . . . . . . . . . . 7Figure 1.6 Decay of coefficients of a geological model in the physicaldomain as well as in the curvelet domain. The bottomfigure shows a zoomed-in section of the top figure. . . . . 8Figure 1.7 Illustration that minimizing the `1-norm leads to a sparsesolution while using `2-norm does not. . . . . . . . . . . . 9Figure 2.1 The synthetic salt dome model and two RTM images.(a) The background model. (b) Model perturbation. (c)RTM of total data but only with the primary imagingoperator. (d) RTM of total data by including the totaldowngoing wavefield in the source wavefield. . . . . . . . . 22xiFigure 2.2 Examples using the salt dome model. (a) Inversion withall the data. (b) Migration with 10 simultaneous sources.(c) Inversion with 10 simultaneous sources. (d) Inversionwith 2 simultaneous sources and 15 frequencies withoutrerandomization. (e) Same as (d) but with rerandomiza-tion. (f) Same as (e) but by `2 norm minimization. . . . 27Figure 2.3 Decreases of the data misfit and model errors, with (ingreen) and without (in blue) rerandomization. (a) De-crease of the relative data misfit. (b) Decrease of therelative model errors. . . . . . . . . . . . . . . . . . . . . 30Figure 2.4 The cropped Sigsbee 2B model. (a) The true model. (b)The background model. . . . . . . . . . . . . . . . . . . . 34Figure 2.5 RTM image of the Sigsbee 2B model, using data contain-ing multiples. . . . . . . . . . . . . . . . . . . . . . . . . . 35Figure 2.6 Inversion results of the Sigsbee 2B model. Both imagesare obtained after curvelet thresholding to remove rem-nant incoherent noise in the image. The arrows in a, band in Figure 2.5 point to the same location of the model.(a) Fast inversion result using the proposed method. (b)Fast inversion result without accounting for the multiples. 36Figure 2.7 Dealing with the unknown source signature. Both imagesare after curvelet thresholding. (a) Fast imaging of thetotal upgoing wavefield with source estimation. (b) Fastimaging of multiples only. . . . . . . . . . . . . . . . . . 38Figure 2.8 The inaccurate background model. (a) The wrong back-ground velocity model. (b) One vertical trace of (a). . . 40Figure 2.9 Images using the proposed method, without (a) and with(b) the velocity error. . . . . . . . . . . . . . . . . . . . 41Figure 2.10 Images without curvelet-domain sparsity promotion, with-out (a) and with (b) the velocity error. . . . . . . . . . . 42xiiFigure 3.1 Examples with the two-layer model. The images are plot-ted with the same color scale. (a). Traces of the linearizeddata (blue), the forward modelling data (green), and theirdifference (red). (b). Inversion of the full linearized data.(c). Inversion of the full forward modelling data with thetrue σ. (d). Inversion of the full forward modelling datawith σ = 0. (e). A comparison of the input forwardmodelling data (blue) and the linearized modelling datawith the inversion results of (d) (green). (f). The middletraces of (b) and (d). (g). Inversion of the subsampledlinearized data, σ = 0. (h). Inversion of the subsampledforward modelling data, σ = 0. (i). Otherwise the sameas (h) but with rerandomization, σ = 0. . . . . . . . . . . 49Figure 3.2 Case study with the synthetic SEG/EAGE 2D salt model.All images are plotted with the same color scale except theRTM image. (a). Traces of the linearized data (blue),the forward modelling data (green), and their difference(red). (b). True model perturbation. (c). The RTMimage of the forward modelling data. (d). Inversion ofthe subsampled linearized data. (e). Inversion of thesubsampled forward modelling data. (f). Otherwise thesame as (e) but with rerandomization. (g). Traces of theinput forward modelling data (blue) and the linearizedmodelling data with the inversion results of (e) (green)and (f) (red). (h). Zoomed subsalt areas of (c) (left)and (f) (right). . . . . . . . . . . . . . . . . . . . . . . . . 52Figure 4.1 The model and the imaging results of the stylized ex-ample. The arrow in (d) or (e) indicates the imagedistortion. (a). The background model, stars indicatesource positions and triangles indicate receivers. (b).True model perturbation. (c). The RTM image. (d).Image by the deconvolutional imaging condition. (e). Im-age by the smoothing imaging condition. (f). Image bysparsity promoting migration. . . . . . . . . . . . . . . . . 58Figure 4.2 The cropped Sigsbee 2B model. (a) The backgroundmodel. (b) True model perturbation. . . . . . . . . . . . 60Figure 4.3 The RTM (a) and the deconvolutional (b) images of mul-tiples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61xiiiFigure 4.4 Image of multiples by our fast sparsity promoting migra-tion, where the simulation cost is roughly the same asFigure 4.3. . . . . . . . . . . . . . . . . . . . . . . . . . . 62Figure 5.1 A slightly modified 2D slice of the SEG/EAGE salt model.(a) True velocity model. (b) Smoothed background ve-locity model. . . . . . . . . . . . . . . . . . . . . . . . . . 76Figure 5.2 Imaging results of the ideal primary-only data set. Thearrows point to the same subsurface locations. (a) Con-ventional RTM with the true source wavelet. (b) to (d)Compressive imaging results with the true source wavelet(b), with the wrong source wavelet that has a -0.1s timeshift (c), and with the proposed source-estimation method(d). It is evident that the images in (b) and (d) arehighly comparable, and are of much higher spatial reso-lution than the conventional RTM image in (a). . . . . . 78Figure 5.3 Source estimates from the ideal primary-only data set.(a) and (b): Amplitude (a) and phase (b) spectra of thetrue (in blue) and the estimated (in green) source waveletduring the last LASSO(τ) subproblem, after amplitudenormalization. . . . . . . . . . . . . . . . . . . . . . . . . 79Figure 5.4 Imaging results of the primary-only data set modelled us-ing iWave. The arrows point to the same subsurface loca-tions. (a) Conventional RTM image. (b) and (c) Com-pressive imaging results using the true source wavelet (b)and with the proposed source-estimation method (c). . . 80Figure 5.5 Source estimates from the primary-only data set modelledusing iWave. (a) and (b): Amplitude (a) and phase (b)spectra of the true (in blue) and the estimated (in green)source wavelet during the last LASSO(τ) problem, afteramplitude normalization. . . . . . . . . . . . . . . . . . . 81Figure 5.6 A sedimentary part of the Sigsbee 2B model. (a) Truevelocity. (b) Smooth background velocity. . . . . . . . . . 83Figure 5.7 Compressive imaging-with-multiple result of the ideal dataset containing multiples. (a) The inverted model per-turbations. (b) Estimate of the true velocity model byadding the inverted model perturbations back to the back-ground model. . . . . . . . . . . . . . . . . . . . . . . . . 84xivFigure 5.8 Source estimates from the ideal data set containing multi-ples. (a) and (b): Amplitude (a) and phase (b) spectraof the true (in blue) and the estimated (in green) sourcewavelet during the last LASSO(τ) subproblem. No nor-malization is applied. . . . . . . . . . . . . . . . . . . . . 85Figure 5.9 Compressive imaging-with-multiples results of the dataset modelled using iWave. (a) and (b): Image obtainedwith the true source wavelet (a) and with source estima-tion (b). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86Figure 5.10 Source estimates using multiples from the data set mod-elled using iWave. (a) and (b): Amplitude (a) and phase(b) spectra of the true (in blue) and the estimated (ingreen) source wavelet during the last LASSO(τ) subprob-lem, after amplitude normalization and correction with|ω|. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87Figure 5.11 Model misfit represented by the normalized cross-correlation(NCC) as a function of the number of iterations. Eachdot indicates the finish of one LASSO subproblem. (a)NCC curves for the imaging-with-primary examples inFigure 5.4, using the true wavelet (in blue), and withsource estimation (in green). (b) NCC curves for theimaging-with-multiple examples in Figure 5.9, using thetrue wavelet (in blue) and with source estimation (in green). 88Figure 5.12 (a) The true source wavelet and the two erroneous initialguesses of the wavelet. The initial guess in green has asignificant time shift, and the one in red has a 90 degreephase rotation plus a time shift. (b) The correspondingsolution paths in terms of model errors measured usingNCC. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89Figure 5.13 The spectra of the predicted Green’s functions using theinverted model perturbations with the true source wavelet(in blue), with source estimation (in green), versus that ofthe input data (wavelet is deconvolved from the data, inred). The spectra is obtained by summing over all tracesand is amplitude-normalized. (a) Imaging-with-multipleexamples. (b) Imaging-with-primary examples. . . . . . . 91xvFigure 6.1 Broader illumination coverage of multiples compared withprimaries: an illustration. The model is cropped from thesedimentary part of the Sigsbee 2B FS (with free-surface)model. (a) and (b): Illustrations of the illuminated sec-tions from the primaries and the multiples, respectively. . 94Figure 6.2 Broader illumination coverage of multiples compared withprimaries: numerical examples. (a) and (b): The itera-tive inversion results (after 60 iterations) of the primariesand the multiples, respectively. . . . . . . . . . . . . . . 95Figure 6.3 Illustration of the imaging artifacts stemming from surface-related multiples. In both figures, the input data containprimaries and surface-related multiples. (a) ConventionalRTM with spatially impulsive sources. (b) RTM withareal sources. . . . . . . . . . . . . . . . . . . . . . . . . . 100Figure 6.4 True model and the migrated images. (a) The true modelperturbation. (b) Conventional RTM with spatially im-pulsive sources and multiple-free data. (c) The same as(b) but now for input data with surface-related multiples.(d) The same as (c) but now with areal sources given inequation 6.2 . . . . . . . . . . . . . . . . . . . . . . . . . . 101Figure 6.5 Migrated images using the proposed method. (a) Imagingof total data but only with primary-imaging operator. (b)and (c) Joint imaging of primaries and multiples, withthe true source (b) and with source estimation (c). (d)Source estimates (in green) versus the true source function(in blue) at the frequencies selected in the last iteration. . 102Figure 6.6 LS images of (a) primaries only, and (b) total upgoingdata. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105Figure 6.7 Separate images of primary-only data (a) and multiple-only data (b). . . . . . . . . . . . . . . . . . . . . . . . . 106Figure 6.8 Joint imaging of total data consisting of primaries andmultiples. . . . . . . . . . . . . . . . . . . . . . . . . . . . 107Figure 6.9 (a) RTM of multiples only. (b) LS image of multiplesonly. Both with the downgoing receiver data acting asareal sources. . . . . . . . . . . . . . . . . . . . . . . . . . 108Figure 6.10 Zoomed-in comparisons of sections A and B between Fig-ure 6.9(a) and Figure 6.9(b). . . . . . . . . . . . . . . . . 109xviFigure 7.1 Comparison of randomized sparse inversions on primaryonly data using SPG`1 (a) and RISKA (b). It is clearthat for the same computational budget, the inversionresult for RISKA is less noisy and recovers the reflectorsbelow the Salt better. . . . . . . . . . . . . . . . . . . . . 117Figure 7.2 Comparison of randomized sparse inversions on data withsurface-related multiples using SPG`1 (a) and RISKA(b). Again, the much simpler RISKA does an excellentjob compared to SPG`1. . . . . . . . . . . . . . . . . . . . 118Figure 8.1 (a)True velocity model. (b) Background velocity modelused in extended imaging. . . . . . . . . . . . . . . . . . . 124Figure 8.2 Least-squares reverse-time migration. (a) Primary dataonly. (b) Total up-going data is used but areal source isnot used. We can clearly see the ghost reflector when wedo not use the areal source. (c) Total up-going data isused along with the areal source. Incorporation of arealsource removes the ghost reflector. . . . . . . . . . . . . . 125Figure 8.3 Least-squares common-image gather extracted along x =1000m and for all z. (a) Primary data only. (b) Total up-going data is used but areal source is not used. (c) Totalup-going data is used along with the areal source. We canclearly see the effect of ghost reflectors (in the middle)which can be removed (on the right) via incorporatingthe areal source. . . . . . . . . . . . . . . . . . . . . . . . 126xviiGlossaryRTM - Reverse-Time MigrationSRME - Surface-Related Multiple EliminationEPSI - Estimation of Primaries by Sparse InversionSPG - Spectral Projected GradientBPDN - Basis Persuit De-NoiseLASSO - Least Absolute Shrinkage and Selection OperatorAMP - Approximate Message PassingxviiiAcknowledgmentsFirst and foremost, I would like to thank my research supervisor Dr. FelixHerrmann. The high standards that he abide by in conducting scientificresearch have profoundly influenced me, and will form a solid foundationfor my career as a responsible scientist. The critical thinking skills that Ilearned through his training will benefit me in all aspects of my life, and I amdeeply grateful for this great gift. I am also thankful for the opportunitieshe provided to get myself exposed to international geophysical communitiesvia all kinds of conferences and meetings.Thanks to Michael Bostock, Eldad Haber, Michael Friedlander, andOzgur Yilmaz for sitting on my supervisory committee. Their support andhelp during the last six years are very important.My gratitudes also go to all my colleagues at the SLIM lab. I wouldlike to thank Henryk Modzelewski, Manjit Dosanjh, Miranda Joyce, andIan Hanlon for their professional assistance, and everyone else — especiallyXiang Li and Tim Lin — for their valuable scientific insights as researchcollaborators, and for the past six-year’s company and friendship.Many thanks to Dr. Eric Verschuur and Dr. Joost van der Neut fromTU Delft, as well as Dr. Shaoping Lu, Dr. Dan Whitmore, and Dr. SteveKelly from PGS, for the very insightful technical discussions. I also wouldlike to thank the authors of SPG`1, the SEG/EAGE salt model, and theSigsbee 2B model, which I used extensively in my research. Many thanksto PGS for providing the field data set that I used in Chapter 6.I am utterly fortunate to know and marry my wife during my PhD study.Her unconditional support and love encourage me everyday to conquer theobstacles in my research and other aspects of life. I also thank my parentsfor tolerating me for six years to study and live at such a distant place fromtheir home in China. I am grateful to my kids — they bring new purposesto my life and motivate me to make this world a better place through myefforts.Thanks to all funding agencies that supported my research — the Chi-xixnese Scholarship Council, NSERC, and all SINBAD consortium sponsors.This thesis is not possible without their generous financial support.xxTo my dear wife Yuan.You make me who I am today.To my dear Sarah and Jonathan.You give my life new purposes.And a new lifelong job, twice.To all PhD candidates.You will make it.xxiChapter 1IntroductionIn this chapter, we first briefly explain the reflection seismic method in ma-rine towed-streamer acquisitions, the origin of the surface-related multiples,and the challenges they impose on the following seismic data processing andimaging workflow. We then state the theme of the thesis, and continue withthe goals we aim to achieve as a result of the presented work. We concludethe chapter with the outline of this thesis. We will keep this introductionchapter concise, as each of the main chapters also contains its own introduc-tion and literature review.1.1 Surface-related multiples: the origin and theproblemIn marine towed-streamer seismic acquisitions, practitioners collect seismicreflection data by exciting artificial seismic sources, and gather the reflectedseismic waves with receivers deployed along one or more cables, known as thestreamers. Both the source(s) and the streamer(s) are towed by specially-equipped vessels, and are placed at certain depth(s) beneath the surface ofthe water. Once the data are collected, they undergo a series of processingprocedures, such as the removal of both coherent and incoherent noises in thedata, before they are migrated to produce a seismic image that displays thegeological structures of the subsurface. The term migration, which we useinterchangeably with imaging throughout this thesis, as defined in Yilmaz(2001), refers to the process of mapping reflection/diffraction seismic datato their true subsurface positions. Geologists then extract information fromthe seismic image to locate potential oil/gas reservoirs. The more faithfulthis image is to the true geology, the more valuable insights it can provide1in locating the oil/gas reservoirs.Cartoon 1.1(a) illustrates a 2D line of the marine towed-streamer seismicsurvey in a simplified medium. Seismic waves propagate from the source,reflect at boundaries between two different media, and are picked up by thereceivers. In this cartoon, S indicates a seismic source, and R1 through R4indicate several receivers along the streamer. The solid blue lines indicatethe ray paths of the primaries that are picked up by these receivers. Theterm primaries, as defined in Verschuur (2006), refer to seismic waves thatundergo only one upward reflection. However, primaries are not the onlywave modes that are picked up by these receivers. Because the water-aircontact acts as a perfect reflector, the upgoing waves picked up by the re-ceivers will continue to propagate upward until they hit the water surface,bounce at the surface, and propagate downward into the subsurface again.As a result, the surface-related multiples are also picked up by the receivers.As shown in Figure 1.1(b), a primary event is recorded by R1, and its corre-sponding first/second/third-order surface-related multiples are recorded byR2 through R4 respectively.Throughout this thesis, we define surface-related multiples as the wave-fields that undergo at least one downward reflection at the water surface(Verschuur, 2006). In the data space, these multiples resemble displacedcopies of the primaries, and if left untreated in the data, they can leadto phantom reflectors in the seismic image. We use a simple homogenousmodel with one horizontal reflector to show these multiples in a common-shot gather in Figure 1.2, and the resulting phantom reflectors in the reverse-time migrated (RTM, Baysal et al. (1983)) seismic image in Figure 1.3. Wealso show the waveforms and the RTM image of the primaries to comparewith. The phantom reflectors arising from the multiples can lead to erro-neous interpretations, and as a result, conventional data processing entailsthe removal of these multiple events in the data. A comprehensive literaturereview of existing de-multiple methods is included in Chapter 2.1.2 A treasure in disguiseBecause multiples undergo extra bounces between the surface and the sub-surface reflectors compared with primaries, they are exposed more to thesubsurface structures, and may therefore provide extra information that isabsent in the primaries. As illustrated in Figure 1.4, for a given seismicsource and a certain range of receiver coverage, surface-related multiplescan illuminate a wider range of subsurface structures, as all the bouncingpoints at the water surface now act as secondary sources. In the literature,2water&surfacesubsurface&reflectorS R4R1 R2 R3(a)R1 R2 R3 R4Ssubsurface&reflectorwater&surface(b)Figure 1.1: Illustrations of the origin of the surface-related multiples.(a) The ray-paths of the primaries. (b) The ray-paths of a pri-mary event (in blue) and its corresponding first/second/third-order surface-related multiples (in red/purple/green respec-tively).Verschuur (2011) and Lu et al. (2014a) showed that in cases where the illu-mination from the primaries is not enough, these multiples help to removethe acquisition footprints. Therefore it is desirable to make active use of thesurface-related multiples rather than removing them by a de-multiple pro-cedure during seismic data processing. A comprehensive literature review ofexisting imaging-with-multiple methods — including their respective bene-fits and limitations — is detailed in Chapter 2.1.3 Curvelet-domain sparsity promotingThe inversion scheme we use to image surface-related multiples borrowsinsights from recent developments of compressive sensing. We reduce theper-iteration data-simulation cost by subsampling the source experiments,which are proportional to the number of wave-equation solves. To stabilize3Offset (m)Time (s)0 500 100000. (m)Time (s)0 500 100000. 1.2: (a) The waveform of primaries. (b) The waveform of datathat contain multiples.the inversion of the subsampled system, we promote curvelet-domain spar-sity, i.e., we look for the solution that has the most sparse representations inthe curvelet domain where the sparsity is measured using the `1-norm. Weuse several figures to illustrate the necessity of using the curvelet transformand the `1-norm.Curvelets are designed to detect curved singularities (Cande`s and Donoho,2004). Figure 1.6 shows several curvelet “atoms” of certain angles and scales(adapted from Herrmann and Hennenfent, 2008). We can see that curveletsare continuous in one direction and oscillatory in the perpendicular direc-tion, plus they are multi-scale and directional. Therefore they can representcurved singularities with only a few coefficients. Figure 1.6 shows the decayof coefficients in the physical domain as well as in the curvelet domain. We4Lateral distance (m)Depth (m)0 500 1000 1500 200002004006008001000(a)Lateral distance (m)Depth (m)0 500 1000 1500 200002004006008001000(b)Figure 1.3: (a) The RTM image of primaries. (b) The RTM imageof data that contain multiples.can see that the curvelet representation is much sparser than in the physicaldomain.Figure 1.7 shows that the solution with the minimal `1-norm is a sparsesolution, because the `1-ball almost always touches the solution space atone of its vertex. Compared with the `1-norm, using the `2-norm leads to anon-sparse solution.5S R4R1 R2 R3subsurface)reflectorwater)surface(a)R1 R2 R3 R4Ssubsurface)reflectorwater)surface(b)Figure 1.4: Illustrations showing the wider illumination coverage ofmultiples compared with primaries. The translucent blue colorhighlights the illuminated subsurface structure by the primaries(top) and the surface-related multiples (bottom).1.4 Theme and objectives of the thesisThis thesis focuses on an effective and computationally efficient inversionapproach to image the surface-related multiples. We aim to achieve thefollowing objectives through this thesis:1. To understand the necessity of using an inversion-based approach toimage the surface-related multiples, to suppress the coherent artifactsassociated with these multiples such as the phantom reflectors shownin Figure 1.3(b). We seek insights by carefully comparing inversionwith direct application of either the cross-correlation imaging condi-tion (such as RTM of multiples, as reported by Liu et al. (2011)), or thedeconvolutional imaging condition (“deconvolution” in the sense thatthe source wavefield is deconvolved from the cross-correlated image,as reported by Muijs et al. (2007); Whitmore et al. (2010)).6Figure 1.5: Curvelets at different angles and scales.2. To significantly reduce the high computational costs of the inversionprocedure, without compromising the image quality. The computa-tional costs inside each iteration has two main sources: (i) carryingout multiple prediction; and (ii) solving wave-equations in data simu-lation. We aim to reduce the computational costs from both factors.3. To improve the robustness of the inversion procedure w.r.t. modellingerrors incurred during the linearization of the wave-equation (i.e., thelinearization errors), and velocity errors in the background model. Asmigration is based on the linearization of the wave-equation (migra-tion equates to the adjoint operation of the linearized modelling, as re-ported by Lailly (1983)), linearization errors can significantly degradethe image quality if not properly dealt with. As the wave-equation islinearized at a given background velocity model (Plessix and Mulder,2004), and the linear (i.e., first-order) approximation is based on theassumption that the model perturbations (i.e., the difference betweenthe true model and the background model) are small enough for thehigher-order terms (of the Taylor expansion of the wave-equation w.r.t.the model parameters) to vanish, a biased estimate of the backgroundvelocity model will also introduce errors to the image, for examplecausing reflectors to shift in depth or even defocus. We aim to develop70 2 4 6 8x 10500. of samplesNormalized amplitude  Physical domainCurvelet domain0 1 2 3x 10600. of samplesNormalized amplitude  Physical domainCurvelet domainSample numberSample numberFigure 1.6: Decay of coefficients of a geological model in the physicaldomain as well as in the curvelet domain. The bottom figureshows a zoomed-in section of the top figure.8L1Solution(spaceL2Solution(space(x,(0)(x,y)Figure 1.7: Illustration that minimizing the `1-norm leads to a sparsesolution while using `2-norm does not.an inversion approach that is relatively robust to both types of errors.4. To jointly use the information encoded in both the primaries, whichare of relatively high signal-to-noise-ratio (SNR), and multiples thatprovide extra illumination coverage. Compared with imaging the twowavefields separately, joint imaging produces one image, with properlyscaled contributions from primaries and multiples without the need toseparate them. Given the true source wavelet, we aim to performthe joint inversion by matching the observed total upgoing wavefieldincluding primaries and multiples. For field seismic data, as the sourcewavelet is typically unknown, we aim to combine source estimation intothe joint imaging procedure. As this problem falls into the category ofthe blind-deconvolutional type of problems (Stockham et al., 1975), itcan be plagued by amplitude and phase ambiguities. We also aim toinvestigate whether and how we can resolve the possible ambiguitiesby exploiting the extra information from multiples.1.5 OutlineIn Chapter 2, we explain the proposed iterative inversion method and themeasures we take to reduce its computational cost, given the knowledgeof the background velocity model and the source wavelet. We first pro-9vide a comprehensive review of the literature on existing methods thatdeal with surface-related multiples in the data. We then follow the well-established Surface-Related Multiple Elimination (SRME) relation, and de-rive a linearization of the wave-equation-based modelling that incorporatesthe surface-related multiples. After showing the limitation of the con-ventional cross-correlation imaging condition in imaging the multiples, wepresent the proposed sparsity-promoting inversion approach and explain itskey ingredients using a stylized synthetic model. Next we discuss the in-fluence of the inversion parameters on the performance of the algorithm,and how we deal with certain practical issues in imaging field seismic data— such as the unknown source wavelet, and possible velocity errors in thebackground model. We verify the effectiveness of the proposed method usingrealistic synthetic examples.In Chapter 3, we focus on a single topic we hardly touch in Chapter 2,namely how the sparsity-promoting inversion approach presented in Chapter2 gains robustness w.r.t. linearization errors (i.e., modelling errors incurredduring the linearization of the wave equation) by a procedure we refer toas “rerandomization”. The linearization errors can be interpreted as thecoherent mismatch between the linearized modelling and the input datathat contain non-linear wave phenomena. We demonstrate the effectivenessof the proposed approach using realistic synthetic examples.In Chapter 4, the main question to answer is whether we can obviate theiterative inversion procedure — and alternatively use the deconvolutionalimaging condition with two-way propagators. We provide first a theoret-ical analysis, and then synthetic examples to reveal the limitation of thedeconvolutional imaging condition. We reach a conclusion that a true iter-ative inversion approach is necessary to properly image the surface-relatedmultiples.In Chapter 5, we deal with the problem of the unknown source waveletin seismic imaging, especially in the joint imaging of primaries and mul-tiples (in Chapter 2 we assume the source wavelet is known). We firstprovide the motivation for this research, review the related literature, andexplain the contribution of our work. We then detail the method, and ex-plain the challenge in obtaining deterministic — i.e., ambiguity-resolved —source estimates as well as how we address this issue by actively making useof self-consistency between primaries and the corresponding surface-relatedmultiples. In the end, we verify the efficacy of the proposed approach usingrealistic synthetic examples, quantitatively analyze the performance of themethod, and study its robustness to the initial guess of the wavelet.In Chapter 6, we apply the proposed inversion approach — together with10the source estimation method — to a North-sea field data set. We study itseffectiveness in suppressing the coherent artifacts associated with multiples,and the benefits of the joint imaging of primaries and multiples comparedwith imaging them separately.In Chapter 7, we present a prototype of a simplified sparse-inversionalgorithm via so-called “linearized Bregman projection”. It will be an im-portant topic for our future research, as it leads to fast convergence, yieldshigh-fidelity images, and affords easy algorithmic implementation. We ex-plain the algorithm, compare it with a representative existing approach,and show promising preliminary imaging results — including imaging withmultiples — using realistic synthetic models and noise-free data.In Chapter 8, we present an extension of the proposed work to extendedimaging — least-squares extended imaging with surface-related multiples.Extended imaging is widely used in migration velocity analysis. Throughthis work, we aim for a data processing workflow in the future that obviatesthe primary-multiple separation procedure altogether, which is currentlynecessary because velocity analysis relies on multiple-free data. We presentthe theory and show promising preliminary results using stylized syntheticexamples.11Chapter 2Fast imaging withsurface-related multiples bysparse inversion2.1 SummaryIn marine exploration seismology, surface-related multiples are usually treatedas noise mainly because subsequent processing steps, such as migration ve-locity analysis and imaging, require multiple-free data. Failure to removethese wavefield components from the data may lead to erroneous estimatesfor migration velocity or result in strong coherent artifacts that interferewith the imaged reflectors. However, multiples can carry complementaryinformation compared to primaries, as they interact with the free surfaceand are therefore exposed more to the subsurface. Recent work has shownthat when processed correctly multiples can improve seismic illumination.Given a sufficiently accurate background velocity model and an estimatefor the source signature, we propose a new and computationally efficienttwo-way wave-equation based linearized inversion procedure that producesaccurate images of the subsurface from the total upgoing wavefield includ-ing surface-related multiples. Modelling of the surface-related multiples inthe proposed method derives from the well-known surface-related multi-ple elimination method. We incur a minimal overhead from incorporatingthe multiples by having the wave-equation solver carry out the multipleA version of this chapter has been published in Geophysical Journal Internation, 2015,vol 201, pp. 304-31712predictions via the inclusion of an areal source instead of expensive densematrix-matrix multiplications. By using subsampling techniques, we obtainhigh-quality true-amplitude least-squares migrated images at computationalcosts of roughly a single reverse-time migration with all the data. These im-ages are virtually free of coherent artifacts from multiples. Proper inversionof the multiples would be computationally infeasible without using thesetechniques that significantly brings down the cost. By promoting sparsity inthe curvelet domain and using rerandomization, out method gains improvedrobustness to errors in the background velocity model, and errors incurredin the linearization of the wave-equation with respect to the model. Wedemonstrate the superior performance of the proposed method compared tothe conventional reverse-time migration using realistic synthetic examples.2.2 IntroductionWhile reverse-time migration (RTM) has been very successful, its applica-tion to data sets with strong surface-related multiples is challenging becausethis imaging technology generally hinges on the single-scattering approxi-mation. Ignoring these multiples can lead to erroneous velocity models andimages that are prone to coherent artifacts that can lead to misguided in-terpretation. Throughout this paper, we discuss surface-related multiplesunless specified otherwise.To overcome this problem, industry has been adopting workflows thatare designed to remove surface-related multiples prior to velocity analysisand migration. Early methods employed periodicity of multiples, which leadto trace-based predictive deconvolution (Peacock and Treitel, 1969), whilelater multi-trace techniques, such as NMO and the Radon transform, wereused to separate primaries and multiples based on kinematic differences inthe data space (Hampson, 1986; Foster and Mosher, 1992).Unfortunately, these above techniques fail in cases where primaries andmultiples overlap and where discrimination based on periodicity and kine-matics alone is insufficient. In these cases, wave-equation based methods of-fer a viable, although computationally expensive alternative. Amongst thesewave-equation based methods, we count Surface-Related Multiple Elimina-tion (SRME) (Verschuur et al., 1992; Guitton and Verschuur, 2004; Her-rmann et al., 2008c; Wang et al., 2008), model-based multiple predictionmethods as reported by Wiggins (1988) and Lu et al. (1999), the inversescattering method by Weglein et al. (1997), and wave-equation based move-out discrimination by Nemeth et al. (2000) and Sava and Guitton (2005).13An alternative to multiple removal is to actively use them to benefit fromthe extra information they carry, and to altogether avoid the prediction-subtraction de-multiple paradigm that may hurt the primaries (e.g., SRME).In data space processing, we count Estimation of Primaries by Sparse Inver-sion (EPSI), which inverts the SRME relation by identifying the surface-freedipole Green’s function and the source wavelet as unknowns (van Groen-estijn and Verschuur, 2009b; Lin and Herrmann, 2013). Compared to theprediction-subtraction paradigm of SRME where multiples and therefore theinformation they carry are removed, EPSI maps the surface-related multiplesback to the primary impulse response via multi-dimensional deconvolutions.As a result, primaries are better preserved in EPSI (see van Groenestijn andVerschuur, 2009b, Figure (4) for example). However, this progress comesat a price. Its iterative inversion procedure involves significantly more com-putational work (e.g., multiplications of dense matrices, the computationalcomplexity of which is O(n3) for n × n matrices) than SRME. This maypartially explain the relatively slow uptake of EPSI by industry for 3D seis-mic, where the computation can be prohibitively expensive. In model spaceprocessing such as imaging, many works have shown that surface-relatedmultiples are useful signals and can increase illumination coverage whenused properly (see examples in Berkhout, 1993; Guitton, 2002; Verschuur,2011; Long et al., 2013). Among these works, the first school of methodstreat the downgoing receiver wavefield as the generalized areal source forthe surface-related multiples (Berkhout, 1993; Guitton, 2002; Shan, 2003;Schuster et al., 2004; Berkhout and Verschuur, 2006; Muijs et al., 2007; Heet al., 2007; Whitmore et al., 2010; Liu et al., 2011; Verschuur, 2011; Luet al., 2011; Long et al., 2013; Zuberi and Alkhalifah, 2013, and counting).These methods can be derived from the same multiple prediction principleas in SRME, by identifying the surface-related multiples to be the multi-dimensional convolution of the Green’s function and the downgoing receiverwavefield. Most of these methods apply the cross-correlation imaging con-dition to image these multiples, which results in coherent artifacts in theimage (a more detailed explanation can be found in Muijs et al., 2007; Liuet al., 2011). Others propose to use the deconvolutional imaging conditionto reduce these artifacts (Shan, 2003; Muijs et al., 2007; Whitmore et al.,2010), but with limited success when complex geology is present (see Pooleet al., 2010, and Chapter 4 for a detailed explanation). Methods exploit-ing least-squares migration to achieve this goal have also emerged in theliterature (He and Schuster, 2003; Lin et al., 2010a; Verschuur, 2011; Tuand Herrmann, 2012b; Wong et al., 2012, 2014). However, wide adoptionof these iterative methods is largely hindered by the expensive simulation14cost (in terms of the number of wave-equation solves). The second school ofimaging-with-multiple methods relies on including a free surface (and / orother reflecting boundaries) in the background model (e.g., Reiter et al.,1991; Brown and Guitton, 2005; Wong et al., 2012). These methods donot fit into current single-scattering based migration paradigm, and are lessdiscussed in the literature. Nevertheless, such a method that accounts fordifferent orders of surface-related multiples still relies on a proper iterativeinversion procedure (for example Brown and Guitton, 2005; Wong et al.,2012) and therefore also faces the challenge of expensive computation.2.2.1 Contributions of this workInspired by the computationally expensive but successful SRME-based iter-ative inversion approaches (e.g., EPSI and the least-squares type imaging-with-multiples methods), we present a new and computationally efficientinversion method to image data with surface-related multiples. As far as weknow, our approach is the first instance where primaries and all orders ofsurface-related multiples are imaged jointly using a two-way wave-equationbased linearized inversion method in a computationally efficient manner.As we will show, the method eliminates dense, and therefore prohibitivelyexpensive, matrix-matrix multiplications by solving wave-equations w.r.t.an areal-source term that contains the total downgoing wavefield. By in-corporating this areal source into recently proposed compressive imaging(Herrmann and Li, 2012), we arrive at a formulation that creates high fi-delity images at low computational costs. We control the coherent imagingartifacts from multiples and the computational cost by promoting curvelet-domain sparsity in the model space.Without loss of generality, we limit our discussion to a single inversionwhere the velocity model is not updated and where obtaining high-fidelityartifact-free images is our main emphasis. However, our proposed methodcan be applied recursively, which would lead to a non-linear migration typeof approach as presented by Me´tivier (2011); Me´tivier et al. (2011). Whenthe velocity model is sufficiently accurate, this will also take care of internalmultiples.2.2.2 Paper outlineWe organize the paper as follows: first, we introduce the matrix equationthat establishes the physical relationship between the up- and downgoingwavefields at the free surface. Next, we combine this relationship with lin-15earized forward modelling using the solution operator of the time-harmonicHelmholtz system. This combination enables the use of all the techniquesdiscussed in this paper to bring down the computation. With the lin-earized forward model, we introduce migration with multiples and sparsity-promoting imaging with multiples, using compressive imaging techniques tobring down the cost of our method to roughly equal that of a conventionalRTM. We conclude by evaluating our algorithm using a stylized example anda realistic synthetic case study that demonstrate the superior performanceof the proposed method compared to conventional RTM.2.3 Imaging with multiplesBecause we want to image the surface-related multiples, we start our trea-tise by first introducing the time-harmonic relationship between the up- anddowngoing wavefields described by the SRME relation, and then propose tocombine this relation with wave-equation based linearized inversion. Thiswill allow us to directly relate the total data, including surface-related mul-tiples, to the perturbations of the subsurface model.2.3.1 Physics of the free surfaceThe SRME relation explicitly reveals the role of the surface-free Green’sfunction that relates the total up- and downgoing pressure wavefields atthe free surface. Using monochromatic matrices, it can be expressed as(Verschuur et al., 1992):Pi = Xi(Si + RiPi), (2.1)where subscript i = 1, · · · , nf indexes the frequencies where nf is the totalnumber of discretized frequencies, Pi is the discretized total upgoing receiverwavefield at the free surface, Xi is the surface-free dipole Green’s function,and Si is the downgoing point-source wavefield. The operator Ri modelsthe reflectivity at the free surface, and as a result, RiPi is the downgoingreceiver wavefield at the surface that acts as a generalized “areal-source”wavefield for the surface-related multiples. Throughout this paper, we ap-proximate the surface reflectivity to be −1, i.e., the action of Ri simply flipsthe sign of Pi. The above relation reveals that the total upgoing wavefieldcan be represented by a frequency-domain (dense) matrix-matrix product ofthe surface-free dipole Green’s function and the total downgoing wavefield(including the downgoing point-source wavefield and the downgoing receiver16wavefield). Note that because direct waves do not reflect at the surface, theyare excluded from both the data Pi and the dipole Green’s function Xi (Ver-schuur, 1991; Dragoset and Jericˇevic´, 1998). Internal multiples are part ofthe primary response (i.e., XiSi) in the SRME relation. To obtain the upgo-ing wavefield at the surface, receiver ghosts (i.e., the downgoing data) needto be removed (source ghosts should be kept untouched) (Verschuur, 1991;Verschuur et al., 1992; Dragoset and Jericˇevic´, 1998), and then the upgoingdata need to be extrapolated from the receiver level to the free surface.In Equation (2.1), the monochromatic upgoing wavefield is representedas a complex-valued matrix, Pi ∈ Cnr×ns , with ns common-shot gathers inits columns and nr common-receiver gathers in its rows. The dipole Green’sfunction Xi is organized in the same way. Equation (2.1) assumes fixed-spread acquisition with co-located sources and receivers (i.e., ns = nr), andtherefore the point-source wavefield Si ∈ Cnr×ns can also be organized in thisway so that the addition of Si and RiPi makes physical sense. With theseassumptions, the matrix product in the above expression corresponds tomulti-dimensional convolutions consisting of a temporal convolution alongthe time dimension and a spatial convolution along the source / receiverdimension. For marine streamer acquisitions, we use reciprocity on-the-fly to mimic fixed-spread acquisition, which leads to minor computationaloverhead. Equation (2.1) describes 2D seismic lines but it can be extendedto 3D seismic data where frequency slices can also be organized into matrices(Verschuur et al., 1992; Dragoset et al., 2010).Here are more details about Xi and Si. In the continuous case, thedipole Green’s function is the vertical derivative of the monopole Green’sfunction, which can be alternatively represented using the particle-velocitywavefield (again see Fokkema and van den Berg, 1993, Equation (12.25)).In the discretized case, we approximate Xi by the response of the subsurfacedue to the injection of a (closely placed) pair of monopole impulsive sources(with the opposite sign) above and below the free surface, properly scaledto include the subsurface response due to source ghost. The point-sourcewavefield Si is a diagonal matrix, with the diagonal entries being the Fourierspectrum of the source wavelet at a given frequency, i.e., Si = wiI with withe spectrum of the source wavelet at the ith frequency, and I the identitymatrix (Verschuur et al., 1992).While the Equation (2.1) has been responsible for major breakthroughsin the mitigation of surface-related multiples, including SRME (Verschuuret al., 1992) and EPSI (van Groenestijn and Verschuur, 2009a; Lin andHerrmann, 2013), it poses the following challenges: (i) it relies on densematrix-matrix multiplications that are computationally expensive in terms17of CPU time, memory requirement, and computer input / output, and (ii)these dense matrix products require full data sampling, rendering expensiveon-the-fly data interpolations necessary to compute these matrix products.While low-rank approximation techniques can be used to alleviate the costof performing matrix products (Jumah and Herrmann, 2014), we proposeto altogether avoid these dense matrix products by using the wave-equationsolvers to implicitly carry them out.2.3.2 Modelling the free surface via areal sourcesTo arrive at a formulation that is conducive to imaging data with surface-related multiples, we combine the physics of the free surface, described in theprevious section, with the wave physics of the Earth interior, i.e., we incor-porate the free surface as modelled in Equation (2.1) into the solution of thewave equation. As imaging is based on a linearization of the wave-equationat a smooth background model m0, we first approximate the direct-wave-free upgoing dipole Green’s function Xi via Born modelling, assuming thatthe high-order reflections are weak. This approximation can be expressedasXi ≈ ω2DrHi[m0]−1Hi[m0]−1diag(δm)D∗sI (2.2).= ∇F̂i[m0, δm](D∗sI).= ∇Fi[m0, δm; I],where ∇Fi denotes linearized forward modelling at the background modelm0, which is linear in both the model perturbations δm and the source I(denoting the impulsive source array for the Green’s function). We use thesymbol ≈ to indicate that by linearized modelling, the higher-order scat-terings are ignored. The symbol “.=” means “defined as”, and the notation∇F̂i in the second line denotes the same forward modelling, but is used ina way that explicitly shows the role of the source wavefield. The matrixDr is a detection operator that collects data at receiver locations, and D∗sinjects the source wavefield. Again for the dipole-source effect, a pair ofmonopole sources is injected at each physical source location (with a posi-tive sign) and the mirrored location w.r.t the free surface (with a negativesign). The matrix Hi represents the discretized monochromatic Helmholtzoperator with absorbing boundaries (including at the surface, as the Green’sfunction is surface free, i.e., it does not include a reflecting surface). TheHelmholtz operator is discretized using the finite difference method with a9-point stencil (Jo et al., 1996). Assuming constant density, the operator18can be parameterized by the square of the slowness (s2/m2) discretized inm.To incorporate surface-related multiples, a linearized modelling of thetotal upgoing wavefield can be derived by replacing Xi in Equations (2.1)by Equation (2.2), yielding:Pi ≈ ∇Fi[m0, δm; I](Si −Pi). (2.3)Compared to Equation (2.1), this linearized modelling involves the samedense matrix-matrix multiplications to predict the multiples, plus we needto simulate the full Green’s function Xi by solving wave-equations for all(sequential) source experiments. To avoid the expensive dense matrix prod-ucts, we simplify the forward modelling using the associativity of the matrixproduct, i.e., ABC = A(BC), and arrive at the following identity:Pi ≈ ∇Fi[m0, δm; I](Si −Pi)= ∇F̂i[m0, δm](D∗sI)(Si −Pi)= ∇F̂i[m0, δm](D∗s(Si −Pi))= ∇Fi[m0, δm; Si −Pi]. (2.4)This simplified expression forms the basis of the proposed fast linearizedinversion procedure to image data with surface-related multiples. The seem-ingly trivial algebraic manipulations in Equation (2.4) have profound impli-cations on the computations. First, the expensive matrix-matrix productsbetween the Green’s function and the total downgoing wavefield (the secondline) are turned into wave-equations solves with the same downgoing wave-field injected as the source (the third line). This major improvement goesat the expense of working with a more complicated areal source, which mayaffect the performance of certain modelling engines. We expect this effect tobe minor compared with the computational gain. Second, the dense sourcesampling required by the SRME relation can now be relaxed in seismic imag-ing: compared with the data-space inversion procedure, namely EPSI (vanGroenestijn and Verschuur, 2009b; Lin and Herrmann, 2013), which iden-tifies the entire Green’s function Xi in Equation (2.1) as the unknown, weoptimize in the model space and therefore have much less unknowns in themodel parameters m. This significant reduction in the number of unknownsleaves room for us to use less source experiments to recover the model pa-rameters. Note that in this case the subsampled sources should still co-locatewith certain receivers, i.e., the underlying source and receiver grid should19co-locate but there is not necessarily a source deployed at each receiver lo-cation. Third, this simplified expression enables us to reduce the numberof wave-equation solves by subsampling the source experiments (note thatone source experiment corresponds to one column of the data matrices Piand Si −Pi, and by source subsampling, we reduce the number of columnsof these data matrices) (Herrmann and Li, 2012). According to the secondline of Equation (2.4), we still need to solve wave-equations for the fully-sampled sequential sources (i.e., the full source array I with the number ofcolumns equal to the number of rows) to obtain the full Green’s functionXi to perform matrix-matrix product, i.e., the subsampling does not bringdown the simulation cost. According to the third line, we only need to solvewave-equations for the subsampled source wavefield, which has less numberof columns than the full source array I in the second line. Without thesimplified expression by Equation (2.4), significant reduction in simulationcost would be impossible.To facilitate further discussion and problem formulation, we cast theimaging problem in the canonical form of solving a linear system of equa-tions, i.e., Ax = b where A is a linear operator and x and b are vec-tors. Throughout this paper, lower case symbols refer to vectors and upper-case symbols to matrices or operators. By vectorizing the wavefield Pi andrewriting the forward modelling in Equation (2.4) in a linear operator formthat explicitly reveals the model perturbations as the unknown, we have thefollowing monochromatic expressionpi = vec(Pi) (2.5)≈ vec(∇Fi[m0, δm; Si −Pi]).= ∇Fi[m0; Si −Pi]δm,where “vec” means to vectorize a matrix, and ∇Fi is the linearized demi-gration (i.e., Born modelling) operator. The above expression for multiplefrequencies can be obtained by stacking over the rows (i.e., vertical concate-nation along the columns). This expression relates the medium perturba-tions δm to the total upgoing wavefield in a linear manner. Remember thatthese multiples are modelled by injecting the downgoing receiver wavefieldas an areal source, rather than by placing a free surface in the backgroundmodel. With this linearized expression in place, we will now show how con-ventional RTM (no explicit free-surface in the background model) fails tocorrectly image the surface-related multiples, which motivates us to exploitthe least-squares inversion approach with sparsity promotion.202.3.3 Migration with multiples by cross-correlationImaging with multiples by applying the cross-correlation imaging conditioncan be mathematically interpreted as applying the adjoint of the linearizeddemigration operator to the total upgoing data. In our case, we use two-waypropagators and this amounts to the reverse-time migration (Baysal et al.,1983; Lailly, 1983). Using Equation (2.5), we obtain the RTM image aftersumming over all frequencies:δ˜m =nf∑i=1∇F∗i [m0; Si −Pi]pi (2.6)where δ˜m denotes the RTM image, and superscript ∗ denotes the adjoint.As there are primaries and different orders of surface-related multiples in thetotal up- and downgoing wavefields, cross-correlation between acausal pairsof up- and downgoing wavefields will lead to artifacts in the image. A causalpair of surface reflections comprises the interaction between the kth-orderdowngoing and the k + 1th order upgoing multiples for k = 0, 1, 2, · · · andwith primaries regarded as the 0th-order multiples. By acausal, we mean anypair other than the causal pairs (see detailed explanations by Muijs et al.,2007; Liu et al., 2011).We use a relatively simple velocity Earth model to illustrate these ar-tifacts. The background model and the model perturbations are shown inFigure 2.1(a) and 2.1(b). The model dimensions are 1250 m deep and 2235 mwide with a five-meter grid distance in both dimensions. We make linearizeddata according to Equation (2.5) assuming a known source wavelet spectrumw for which we use a Ricker wavelet, and known areal-source wavefields−Pi that we simulate using finite-difference modelling with a free-surfaceboundary condition. The frequency band of the source wavelet goes up to60 Hz. The total recording time is 2.044 s with 4 ms sampling interval, whichcorresponds to a total number of 122 non-zero frequencies in the specifiedfrequency range. There are 150 co-located sources and receivers with 15 mspacing.We show two RTM images here. In both cases, the input data con-tain surface-related multiples. The first image is based on the conventionalsingle-scattering forward model (i.e., the demigration operator only modelsthe primaries). The result is shown in Figure 2.1(c). We can clearly see thephantom reflectors imaged from multiples (indicated by the arrows in thefigure). This is understandable since the multiples are not taken into accountby the modelling operator. The second image is by Equation (2.6), which21(a) (b)(c) (d)Figure 2.1: The synthetic salt dome model and two RTM images. (a)The background model. (b) Model perturbation. (c) RTM oftotal data but only with the primary imaging operator. (d)RTM of total data by including the total downgoing wavefieldin the source wavefield.models the surface-related multiples by using the total downgoing wave-field as the areal source wavefield. This image is shown in Figure 2.1(d).Although the multiples are modelled by the modelling operator, we stillobserve severe coherent artifacts (again indicated by the arrows in the fig-ure) caused by acausal cross-correlations between acausal pairs of up- anddowngoing wavefields that contain multiples. To overcome these harmfulartifacts, we propose to properly invert the linearized modelling operatorinstead of applying the adjoint.222.4 Fast sparsity-promoting imaging withmultiplesThe computational cost of inverting Equation (2.5) for all the frequenciesby the iterative least-squares migration is excessive, because each iterationinvolves one reverse-time migration and demigration (Nemeth et al., 1999).Without storing the wavefield, this amounts to solving four wave equationsfor each monochromatic source experiment. Therefore the total number ofwave-equation solves roughly equals to 4 × nf × ns × niter, with niter thenumber of iterations of the inversion. We reduce this excessive simulationcost by subsampling the monochromatic source experiments and promotingcurvelet-domain sparsity of the model parameters (Herrmann et al., 2009;Herrmann and Li, 2012).We use synthetic examples to help readers understand the different com-ponents of the method. In these examples, we use the same model as inFigure 2.1. To obtain a baseline image, we first run an inversion of allthe data for 60 iterations. For a fair comparison with the following com-pressive imaging examples, we cast the inversion as the same Basis PursuitDe-Noise (BPDN) problem (Chen et al., 2001; van den Berg and Friedlan-der, 2008), and use the same SPG`1 solver (van den Berg and Friedlander,2008). Details about the BPDN problem and the SPG`1 solver will follow.The result is shown in Figure 2.2(a), which is as expected more or less ar-tifact free as all surface-related multiples are correctly mapped to locationsof the true reflectors. The coherent artifacts in Figure 2.1(d) are largelyremoved by adopting the inversion approach. However, the computationfor Figure 2.2(a) is expensive, involving roughly 4.4 million wave-equationsolves.2.4.1 Subsampling monochromatic source experimentsTo reduce the number of wave-equation solves, which is proportional to thenumber of monochromatic source experiments, we follow Herrmann et al.(2009) and subsample over both frequencies and source experiments. First,we only use a randomized subset of frequencies, denoted by Ω. Second, wereplace the monochromatic forward modelling in Equation (2.5) bypi= vec(PiE) (2.7)≈ ∇Fi[m0; (Si −Pi)E]δm= ∇Fi[m0; Si −Pi]δm.23In this expression, the underlined quantities refer to subsampled source andreceiver wavefields by randomized source superimpositions (i.e., formingsimultaneous sources) via the action of a tall (i.e., more number of rowsthan columns) source-encoding matrix E on the right, which has Gaussian-distributed randomized entries. The number of simultaneous sources is de-fined as the number of columns of the encoding matrix E. Since thereare now fewer source experiments, we have fewer wave-equations to solve.However, this speed-up goes at the expense of introducing source crosstalksinto an RTM image, caused by the randomized source superimposition (seeRomero et al. (2000), Figure (1), and Herrmann and Li (2012), Figure (1b)).We also show such an example here. Instead of using all 150 sequentialsources, we use 10 simultaneous sources. For this example we use all thefrequencies. We then compute the RTM image of the 10 simultaneous shot-gathers and show it in Figure 2.2(b). In this image we can identify thenoisy source crosstalks as well as the coherent acausal artifacts caused bythe surface-related multiples. We propose to remove these artifacts usingideas from sparse recovery.2.4.2 Removal of source crosstalks by promoting sparsityFollowing sparse-recovery and randomized sampling ideas from CompressiveSensing (CS) (Cande`s et al., 2006b; Donoho, 2006), we remove these arti-facts by promoting sparsity of the model perturbations using the `1-norm.To maximally exploit the sparsity, we incorporate the curvelet transformthat affords a sparse representation of the model perturbations (Cande`s andDonoho, 2004; Herrmann et al., 2008a,b; Herrmann and Hennenfent, 2008;Neelamani et al., 2010). As a result, we aim to solve the following BasisPursuit De-Noise (BPDN) problem:argminx‖x‖1 (2.8)subject to∑i∈Ω‖pi−∇Fi[m0; Si −Pi]C∗x‖22 ≤ σ2,where C∗ represent the curvelet synthesis operator, i.e., its action is toapply the inverse curvelet transform (Cande`s et al., 2006a), and x is thecurvelet coefficient vector for the image (i.e., δm = C∗x). This optimiza-tion program finds, among all curvelet coefficient vectors, the one with thesmallest `1-norm subject to the fact that its inverse curvelet transform yieldsdemigrated (simultaneous) data that explain the input (simultaneous) datawithin a user-specified σ that allows for noise in the data and modelling24errors (e.g., from ignoring the higher-order reflections during linearization).We solve the above BPDN problem using the SPG`1 solver, which, insteadof directly dealing with the BPDN formulation, solves a series of underly-ing Least Absolute Shrinkage and Selection Operator (LASSO) subproblems(Tibshirani, 1996). We refer to Appendix A.1 for a more detailed explana-tion of how SPG`1 solves a general BPDN problem.The CS theory states that solving such a convex sparsity program (cf.Equation (2.8)) can be used to invert underdetermined systems of equationsas long as these systems behave approximately as Gaussian matrices, andδm permits a sufficiently sparse curvelet representation. The latter argu-ment is made relatively easy by referring to the body of literature that hasappeared on the compressibility of seismic images w.r.t. curvelets (Her-rmann et al., 2008a,b). The first argument is a bit more difficult to makealthough we can argue that the combination of curvelets, randomized sourcesuperimposition, demigration, and free-surface-multiple generation leads tomatrices that are well-mixed and not too far off from Gaussian matrices.However, the demigration operator has a null space caused by finite acquisi-tion aperture and complex geologies, which may result in unresolvable steepdips and shadow zones (see e.g., Prucha and Biondi, 2002; Herrmann et al.,2008a). Therefore we cannot expect perfect recoveries.Using the same setup as for Figure 2.2(b), i.e., 10 simultaneous sources,we obtain an inversion result after 60 SPG`1 iterations. The result is shownin Figure 2.2(c). Compared to Figure 2.2(b), the source crosstalks, as wellas the acausal artifacts associated with the multiples, are both removed inFigure 2.2(c) by adopting the above optimization formulation that promotessparsity in the curvelet domain (cf. Equation (2.8)). Compare to Figure2.2(a), we reduce the number of wave-equation solves by 15 folds.While a 15× speed-up is encouraging, Figure 2.2(c) still involves roughly8× the simulation cost of a conventional RTM of all the data. We need tobring down the cost even further to make the method computationally fea-sible for large data sets. However, the SPG`1 solver converges too slowlyif we subsample too much. We show an example where we only use twosimultaneous sources and 15 frequencies. The image after 305 iterations isshown in Figure 2.2(d), where we still see lots of remnant noisy artifacts.As explained in recent work on compressive imaging and Approximate Mes-sage Passing (AMP), these artifacts are related to the correlation built upbetween the model iterate and the randomized subsets of monochromaticsource experiments (Herrmann and Li, 2012; Herrmann, 2012; Montanari,2010). We refer to Appendix A.2 for more details about the cause of theseartifacts. We will now explain how we remove these artifacts.252.4.3 Further acceleration by rerandomizationMotivated by recent findings in AMP (Donoho et al., 2009; Montanari, 2010),we proposed a simple rerandomization of the BPDN problem (cf. Equa-tion (2.8)) to remove the remnant incoherent artifacts related to correlationbuild up shown in Figure 2.2(d). For our problem, this corresponds to se-lecting a new set of randomized source experiments and a new randomizedfrequency subset for each LASSO subproblem (Herrmann and Li, 2012; Tuand Herrmann, 2012a). This practice misses a strict theoretical justificationbut is close in spirit to other randomized algorithms that have let to fast so-lutions of overdetermined problems, e.g., the randomized (block) Kaczmarzalgorithm (Strohmer and Vershynin, 2009; Needell and Tropp, 2014).To verify the efficacy of rerandomization by comparing with Figure 2.2(d)where no rerandomization is used, we rerun the above example using thesame setup (i.e., two simultaneous sources, 15 frequencies, 305 SPG`1 it-erations) with rerandomization. The result is shown in Figure 2.2(e), andwe can see that the image quality has drastically improved compared toFigure 2.2(d). Compared with Figure 2.2(a), we achieve a 120× speed-upwithout notable compromise on the image quality. In fact, the simulationcost to obtain Figure 2.2(e) roughly equals to that of a single RTM of all thedata, i.e., the rerandomization technique enables us to obtain a high-fidelitytrue-amplitude least-squares migrated image when the computational bud-get only allows for one RTM with all the data.2.4.4 Sparsity promoting by `1 or `2While the success of the speed-up really hinges on the combination of thetwo contributing factors—sparsity promoting and rerandomization—the rolethat the `1-norm plays in promoting curvelet-domain sparsity should not beunderstated.To demonstrate the importance of using the `1-norm as a measurementof sparsity (Mallat, 2009), we solve a problem that is otherwise the sameas Equation (2.8) except that we replace the `1-norm by the `2-norm in theobjective function. Again we use two simultaneous sources, 15 frequencies,305 iterations and rerandomization. The result is shown in Figure 2.2(f).We can see that it has more noise compared with Figure 2.2(e) where weuse the `1-norm objective. Therefore the `1-norm should be used.26(a) (b)(c) (d)(e) (f)Figure 2.2: Examples using the salt dome model. (a) Inversion withall the data. (b) Migration with 10 simultaneous sources. (c)Inversion with 10 simultaneous sources. (d) Inversion with 2simultaneous sources and 15 frequencies without rerandomiza-tion. (e) Same as (d) but with rerandomization. (f) Same as(e) but by `2 norm minimization.272.4.5 Putting it all togetherNow we have explained each component of the proposed method, we sum-marize and explain the algorithm in Algorithm (1), which is an adaptedversion of the compressive imaging algorithm by Herrmann and Li (2012).Input and initialization [lines 1–4]As we stated before, our inversion algorithm requires estimates of the sourcewavelet and the background velocity model m0. The solution vector x0 isinitialized to be a zero vector. We select the tolerance parameter σ simply tobe zero (i.e., we only stop when we reach the maximal number of iteration),and choose the number of simultaneous source experiments n′s  ns, thenumber of randomized frequencies n′f  nf and the maximal number ofiterations kmax based on our computational budget (Chapter 3). We use lto index the LASSO subproblems.Main loop [lines 5–10]Given the input and after the initialization, we enter into the main loop inwhich we solve the LASSO subproblems. For each subproblem, we draw newindependent randomized subsets of frequencies Ω and source experiments bydrawing new source encoding matrix E (line 6). Without this “redraw” step,the algorithm would be identical to the original SPG`1. Following van denBerg and Friedlander (2008), we compute the sparsity constraint for thel-th LASSO subproblem τl using τl−1 and σ using Newton’s method on thePareto curve (line 7), and solve the LSτ subproblem (line 8).2.5 Performance analysisWhile the stylized example clearly showed qualitative improvements in thesparse recovery with rerandomization, we would like to study the behaviorof the solver more quantitatively. In particular, we are interested in studyingthe solution path and the model error as a function of the number of wave-equation solves, as it can give us insights into the way that rerandomizationworks to speed up the convergence. We are also interested in the recoveryerror as a function of the number of simultaneous sources, frequencies, anditerations, which will help us to choose the optimal inversion parameters forthe proposed algorithm given a fixed computational budget.28Algorithm 1 Fast imaging with multiples1: Input:2: total upgoing wavefield Pi, point-source wavefield Si,background velocity model m0, tolerance σ = 0, iteration limit kmax3: Initialization:4: k ← 0, l← 0,xl ← 05: while k < kmax do6: Ωl,El ← new independent draw, Pi = PiEl, Si = SiEl, pi = vec(Pi)7: τl ← determine from τl−1 and σ by root finding on the Pareto curve8: xl ←{argminx∑i∈Ωl‖pi−∇Fi[m0; Si −Pi]C∗x‖22subject to ‖x‖1 ≤ τl//warm startwith xl−1, solved in kl iterations9: k ← k + kl, l← l + 110: end while11: Output: Model perturbation estimate δm = C∗x2.5.1 Solution path and model errorTo understand the significant difference between optimizing with and with-out rerandomization (results in Figure 2.2(d) and 2.2(e)), we plot their re-spective solution path, i.e., the (relative) `2-norm of the data misfit as afunction of the `1-norm of the solution vector. We also plot the decreases ofthe model errors (i.e., the relative `2-norm of the misfit between the modeliterate and the true model perturbations) in the two cases as a function ofthe number of wave-equation solves.Shown in Figure (2.3(a)), the solution path for the optimization withrerandomization behaves differently (expected, as we are solving a differentproblem with rerandomization) from that of the standard SPG`1, where the`2-norm of the data residual decreased close to zero albeit slowly. Becausethe residual is larger for the case with rerandomization, one would normallyexpect a deterioration rather than an improvement in the image qualityby using rerandomization. However, we can see in Figure (2.3(b)) that themodel errors decay continuously with rerandomization that breaks down thecorrelation built up between the model iterate and the randomized subsetsof monochromatic source experiments, but almost stall at an early stageof the optimization in the case of no rerandomization (i.e., no significantimprovement by using more iterations).Because the artifacts in Figure 2.2(d) with wrongly identified supportlie close to the nullspace of the compressive modelling operator (as the blue290 20 40 60 80 100 12000.−norm of solution vectorRelative two−norm residual  w/o rerandomizationw/ rerandomization(a)1 2 3 4 5 6x 1040. of wave−equation solvesRelative model error  w/o rerandomizationw/ rerandomization(b)Figure 2.3: Decreases of the data misfit and model errors, with (ingreen) and without (in blue) rerandomization. (a) Decreaseof the relative data misfit. (b) Decrease of the relative modelerrors.curve decreased close to zero in Figure 2.3(a)), they are difficult to removeeven after extended number of iterations. However, by occasionally drawingnew compressive modelling operators with independent subsets of frequen-cies and source experiments, these artifacts can be gradually removed. Thisobservation was also made by Herrmann and Li (2012), where using reran-domization greatly promotes the convergence in terms of model error.302.5.2 Computational considerationsAs we use wave-equation solvers to carry out the SRME type multi-dimensionalconvolutions implicitly, which also enables us to reduce the simulation costby subsampling the monochromatic source experiments, the costs of ourimaging with multiple algorithm are dominated by the number of wave-equation solves. This raises the question of identifying the best inversionstrategy given a certain computational budget in terms of the number ofwave-equation solves, which is proportional to the product of the number ofmonochromatic sources and the number of iterations, i.e., nf × ns × niter.In conventional inversion approaches, people usually work with all sourcesand frequencies, but can only afford a few iterations. With the proposedapproach, we can work with significantly smaller data subsets and can there-fore afford many more iterations because each iteration is now much cheaper.While both approaches are parallelizable over independent monochromaticsource experiments in the data space or via domain decomposition in themodel space, our approach has the advantage of partially removing the com-puter input / output bottleneck as the subsampling also reduces the datasize.Using the same synthetic model / data as in the previous section, wecompare the performance of the proposed method with different combina-tions of randomized monochromatic source experiments and iterations. Thesignal-to-noise ratios (SNR) of the recoveries compared to the true modelperturbation are shown in Table (2.1). From the table we can see that moreiterations with cheaper per-iteration cost lead to better recoveries, whichalso offers flexibility in the design of parallel algorithms.Another aspect is to understand the interplay between the number ofsimultaneous sources and frequencies once we fix the number of monochro-matic source experiments. We compare five different setups and show theresults in Table (2.2). We can see that these combinations lead to almostequally good recoveries.2.6 Synthetic case studyTo evaluate the performance of the proposed method for a more realisticsetup, we verify our method with a more complex Earth model. The modelwe use here is cropped from the Sigsbee 2B model that contains abundantsedimentary layers. The Sigsbee 2B model is designed to contain a thinhigh-velocity ocean bottom layer to generate strong surface-related multiples(The SMAART JV, 2014). We reduced the thickness of the water column31No. of mono. sources. 15 30 60 120 240 480Aspect ratio of ∇F 0.02 0.04 0.08 0.16 0.32 0.64No. of iterations 610 305 152 76 39 20SNR (dB) 6.5 6.2 5.8 5.1 4.2 3.4Table 2.1: SNRs as a function of the number of monochromatic sourceexperiments and the number of iterations.No. of sim. sources 2 3 6 10 15No. of frequencies 15 10 5 3 2SNR (dB) 6.2 6.2 6.2 6.3 6.7Table 2.2: SNRs as a function of the number of simultaneous sourcesand frequencies. The number of monochromatic sources as wellas the number of iterations are fixed.to allow for more orders of surface-related multiples. The modified trueand background models are shown in Figure 2.4. The background model isobtained by smoothing the true model and is kinematically close to the truemodel. The grid spacing is 7.62 m. There are 261 co-located sources andreceivers with 22.86 m spacing. We model the data using iWave with a free-surface boundary condition (Terentyev et al., 2014). We use a Ricker waveletwith 15 Hz peak frequency, and record data for 8.184 seconds. We simulateboth the pressure and particle velocity wavefields to separate the upgoingand downgoing wavefields (Wapenaar, 1998). We then remove the directarrivals by subtracting data modelled with the background model. Next, weextrapolate the upgoing wavefields to the surface level (see Verschuur et al.,1992, for more details). In the imaging procedure, we use our own frequency-domain modelling engine to mimic the discrepancy between the observeddata and the modelled data in field seismic data processing (van Leeuwenand Herrmann, 2012a). For the same reason, we use forward modellinginstead of linearized Born modelling to simulate the data. However, ourproposed method is robust to this linearization error by virtue of usingrerandomization (see Chapter 3 for more details).For comparison, we first computed the RTM image of data with multipleswith all the data according to Equation (2.6). The result is shown in Figure2.5. Again we see the coherent acausal artifacts caused by surface-related32multiples, which are indicated by the arrow.For the proposed method, we use 26 simultaneous sources from all 261sequential sources (roughly 10× subsampling), 31 frequencies randomly cho-sen from all 311 discretized frequencies (roughly 10× subsampling), and runthe inversion for 50 iterations. In this way the simulation cost remains com-parable to a single RTM of all the data (e.g., Figure 2.5). After the inversionis finished, we perform a simple additional curvelet thresholding to removethe remaining incoherent noise in the image (see Herrmann et al., 2008a,for more details). We choose such a threshold level that the filtered-outnoise does not contain noticeable coherent energy. The result is included inFigure 2.6(a). This thresholding procedure is also applied to the imagingresults in the Discussion section. Again we obtain an image with minimalcoherent acausal artifacts arising from the surface-related multiples by us-ing the proposed inversion method. These weak remnant artifacts have thefollowing sources: (i) we use different modelling engine for data simulationand inversion; (ii) the use of a pair of monopole sources to approximate thesource-ghost effect is inexact; and (iii) the prediction of multiples is alsoinexact because of finite aperture. Despite these remnant artifacts, the re-sulting image is of higher spatial resolution compared to Figure 2.5, as weproperly invert the demigration operator that includes the source wavelet.To highlight the efficacy of the proposed method in reducing the coherentartifacts caused by the surface-related multiples, we run another inversionwith the same setup, but ignore the presence of the surface-related multiples(i.e., the modelling operator only models primaries). The result is shown inFigure 2.6(b), where we can see strong coherent artifacts that mimic thosein Figure Discussion2.7.1 The source waveletWe assumed that we have access to the true source wavelet. In practice,however, source estimation is never a trivial task because of the amplitude /phase ambiguity in blind deconvolution (Ulrych and Sacchi, 2005, and Chap-ter 5). The EPSI algorithm successfully eliminates these ambiguities byusing information from surface-related multiples (van Groenestijn and Ver-schuur, 2009b; Lin and Herrmann, 2013). However, it is computationallyexpensive because it uses an iterative algorithm with dense matrix-matrixmultiplies.33(a)(b)Figure 2.4: The cropped Sigsbee 2B model. (a) The true model. (b)The background model.34Figure 2.5: RTM image of the Sigsbee 2B model, using data contain-ing multiples.Alternatively, source signatures can be estimated during the imaging it-self using an approach known as variable projection (Golub and Pereyra,2003; Aravkin and van Leeuwen, 2012). Recent work by Aravkin et al.(2013b) and Tu et al. (2013) successfully applied this alternating optimiza-tion method to recover the image as well as an estimate for the sourcefunction. Figure 2.7(a) contains an example of this method obtained usingthe same inversion parameters as in the previous section, i.e., we use 26 si-multaneous sources, 31 randomized frequencies, and run the inversion for 50iterations. Compared to Figure 2.6(a), a 50% computational overhead is in-curred by the variable-projection step within each iteration, as the primaryand multiple wavefields have to be modelled separately to update the sourceestimate (see Chapter 5 for more details). This extra computation, whichis about half of a single RTM of all the data, is still insignificant comparedwith a conventional least-squares migration of all the data. We report thissource-estimation method in more details in Chapter Imaging with multiples onlyOur proposed method can also be used in cases where a reliable primary /multiple separation is available, e.g. after a conventional multiple predic-tion step. In that situation, we can image multiples separately and ex-35(a)(b)Figure 2.6: Inversion results of the Sigsbee 2B model. Both imagesare obtained after curvelet thresholding to remove remnant inco-herent noise in the image. The arrows in a, b and in Figure 2.5point to the same location of the model. (a) Fast inversionresult using the proposed method. (b) Fast inversion resultwithout accounting for the multiples.36ploit their extra illumination properties without explicit knowledge of thesource wavelet. (However, multiple predictions by the SRME-type multi-dimensional convolutions typically need to apply the inverse of the wavelet.)As before, we obtain these images by including an areal source, which nowconsists of the downgoing receiver wavefield only. Compared to earlier work(e.g. Guitton, 2002; Muijs et al., 2007; Whitmore et al., 2010; Liu et al.,2011), our approach leads to a true inversion result while others merely ap-ply the deconvolutional imaging condition or the cross-correlation imagingcondition, which is known to lead to undesirable acausal artifacts. We usethe same inversion parameters as for Figure 2.7(a), and the result is includedin Figure 2.7(b).2.7.3 Sensitivity to the background velocity modelOur method hinges, as all other wave-equation based imaging procedures, onthe availability of a smooth background velocity model that accurately cap-tures the kinematics of reflected waves. We have also seen that incorporatingsurface-related multiples in the formulation calls for an iterative sparsity-promoting inversion procedure and the question arises how sensitive thismore elaborate imaging procedure is to errors in the background velocitymodel. This question is particularly pertinent because migration velocityanalysis tools also often suffer from surface-related multiples. Moreover, it-erative inversion procedures such as (sparsity-promoting) least-squares mi-gration may be more sensitive to velocity-model errors than conventionalRTMs and it is not clear how the inclusion of surface-related multiples playsinto this.To address these concerns we conducted an experiment where we delib-erately introduce errors in the velocity model that cumulate with depth (seeFigure 2.8(a) and 2.8(b)). The results of this exercise are included in Fig-ure 2.9(a) to 2.10(b) and show that for this example least-squares imagingwith multiples is relatively well behaved with imaging results that degradegracefully. As expected, imaging results with curvelet-domain sparsity pro-motion are more coherent by virtue of that fact that curvelets allow forsmall kinematic errors as reported in the literature (Herrmann et al., 2008c).The results without curvelet-domain sparsity were obtained by putting thesparsity-constraint parameter τ in each LSτ subproblem to infinity (it isessentially the block Kaczmarz method (see e.g. Needell and Tropp, 2014)),and eliminating the use of the curvelet transform. While more work isneeded to fully understand the influence of the background velocity modelon our iterative inversion algorithm, these results show that our method37(a)(b)Figure 2.7: Dealing with the unknown source signature. Both imagesare after curvelet thresholding. (a) Fast imaging of the totalupgoing wavefield with source estimation. (b) Fast imaging ofmultiples only.38is relatively well-behaved w.r.t. possible errors in the background velocitymodel.2.7.4 Internal multiplesThe proposed work here addresses dominant surface-related multiples in alinearized inversion framework. Aside from surface-related multiples, thereare also internal multiples in the acquired seismic data. When used correctly,internal multiples also increase illumination (Cavalca and Lailly, 2005; Mal-colm et al., 2009). In cases where fine-scale heterogeneity exists, includinginternal multiples can be crucial to produce an accurate image (Delprat-Jannaud and Lailly, 2008).Our proposed approach is based on the same linearization as full-waveforminversion. In addition, our sparsity-promoting inversion can be viewedas a sparsity-constrained Gauss-Newton update (Herrmann and Li, 2012).Therefore, recursive application of the proposed method with updates onthe velocity model can be construed as an instance of non-linear migration.As outlined by Me´tivier (2011); Me´tivier et al. (2011), this type of recur-sive inversion will account for internal multiples for a sufficiently accuratestarting velocity model.2.7.5 Other applicationsIn essence, the proposed method entails multi-dimensional deconvolutions ofwavefields combined with a linearized imaging procedure. While we special-ized this approach to image with surface-related multiples, there is nothingthat prevents the application of our proposed method to other areas as longas the receivers are sampled adequately. For instance, the method couldrelatively easily be adapted to imaging with receiver functions. In that case,the observed converted shear waves are deconvolved with the pressure wavesto image the pressure-to-shear transmission coefficients (Ryberg and Weber,2000; Shang et al., 2012). Another example would be the deconvolutionalinterferometric imaging as recently proposed by van der Neut and Herrmann(2013).2.8 ConclusionWe proposed a new and computationally efficient method to image seismicdata with surface-related multiples. Our contributions are threefold. First,our formulation combines the wave equation with surface-related-multiple39(a)0 1000 2000 3000150020002500Depth (m)Velocity (m/s)  model w/o errormodel w/ error(b)Figure 2.8: The inaccurate background model. (a) The wrong back-ground velocity model. (b) One vertical trace of (a).40(a)(b)Figure 2.9: Images using the proposed method, without (a) and with(b) the velocity error.41(a)(b)Figure 2.10: Images without curvelet-domain sparsity promotion,without (a) and with (b) the velocity error.42prediction by using a generalized areal-source term that contains the down-going receiver wavefield. By incorporating this source term, we arrive ata linearized formulation that predicts, and therefore correctly images, datawith surface-related multiples. Instead of predicting the multiples via ex-plicit dense matrix-matrix multiplications as performed in surface-relatedmultiple elimination, which is prohibitively expensive for large seismic datasets, our method predicts multiples implicitly by solving wave equations forthe areal-source wavefields, and therefore eliminate the cost of the matrixproduct altogether. Second, our combination of multiple prediction and lin-earized forward modelling with the Born approximation allows us to comeup with a proper curvelet-domain sparsity-promoting inversion frameworkwhere primaries and surface-related multiples are jointly inverted to pro-duce a high-resolution true-amplitude image of the subsurface medium per-turbations. Compared to non-iterative correlation-based imaging methodswith multiples, this approach leads to high-resolution images with minimalartifacts. Third, we overcome prohibitive computational cost of sparsity-promoting inversion by choosing to work with small independent random-ized subsets of data. This subsampling and rerandomization strategy leadsto a simulation cost that is comparable to the cost of a single reverse-timemigration with all the data, which makes our method feasible in practice.More importantly, our method produces by virtue of the inversion high-fidelity true-amplitude images, an accomplishment that is difficult, if at allpossible, to achieve with conventional methods. Because of its superior per-formance on complex synthetics, in conjunction with a reduced sensitivityto modelling errors and errors in the velocity model, we expect our methodto be suitable to field data.43Chapter 3Controlling linearizationerrors in `1 regularizedinversion by rerandomization3.1 SummaryLinearized inversion is a data fitting procedure that tries to match the ob-served seismic data with data predicted by linearized modelling. In practice,the observed data is not necessarily in the column space of the linearizedmodelling operator. This can be caused by lack of an accurate backgroundvelocity model or by coherent noises not explained by linearized modelling.Through carefully designed experiments, we observe that a moderate datamismatch does not pose an issue if we can use all the data in the inversion.However, artifacts do arise from the mismatch when randomized dimen-sionality reduction techniques are adopted to speed up the inversion. Tostabilize the inversion for dimensionality reduction with randomized sourceaggregates, we propose to rerandomize by drawing independent simultane-ous sources occasionally during the inversion. The effect of this rerandom-ization is remarkable because it results in virtually artifact-free images ata cost comparable to a single reverse-time migration. Implications of ourmethod are profound because we are now able to resolve fine-scale steepsubsalt features in a computationally feasible manner.A version of this chapter has been published in SEG Technical Program ExpandedAbstracts, 2013, vol. 32, p. 4640-4644443.2 IntroductionLinearized inversion aims to obtain the perturbation of the subsurface mediumparameterized by a model vector m, by fitting the predicted data to theobserved seismic data. The predicted data is identified as a linear func-tion of the model perturbation δm over a background model m0 wherem = m0 + δm. The underlying optimization problem can be written as:LS : minimizeδm‖∇F[m0,q]δm− d‖2, (3.1)where ∇F is the linearized Born scattering operator, q is the vectorizedsource wavefield for all sources, which is either assumed to be known, orcan be estimated by variable projection (Aravkin et al., 2013b; Tu et al.,2013), and d is the observed data, usually after removal of surface-relatedmultiples.Despite all its advantages, such as true-amplitude preservation and ro-bustness to missing data (Tu et al., 2013; Nemeth et al., 1999), linearizedinversion is not widely adopted because of its high computational cost. Solv-ing problem LS usually requires iterative evaluations of the Born scatteringoperator and its adjoint, while each evaluation is already expensive as itentails several wave-equation solves (e.g., four wave-equation solves with-out saving the background wavefield) for each monochromatic source ex-periment. Since the simulation cost is mainly determined by the numberof sources and frequencies, dimensionality reduction techniques—such assource encoding and using randomized frequency subsets—are proposed toreduce these costs (Dai et al., 2011; Herrmann and Li, 2012). To overcomethe negative effects these subsampling techniques have (e.g., source cross-talks) on the seismic image, Herrmann and Li (2012) proposed to promoteCurvelet sparsity in the model space by solving the following `1 minimiza-tion problem (known as the Basis Pursuit De-Noise (BPDN) problem, Chenet al., 2001; van den Berg and Friedlander, 2008):BPσ : minimize ‖x‖1 (3.2)subject to ‖∇F[m0,q]C∗x− d‖2 ≤ σ,where underlined variables denote subsampled quantities, i.e., d = RMdwhere RM is the subsampling matrix (see Herrmann and Li, 2012, for moredetails). Matrix C∗ is the Curvelet synthesis operator (Cande`s et al., 2006a),and σ is adjusted to allow for data mismatch due to noise and modellingerrors. To solve the above BPDN problem, we use the SPG`1 solver that45implicitly solves a series of Least Absolute Shrinkage and Selection Operator(LASSO) subproblems with gradually relaxed τ ’s (Tibshirani, 1996; van denBerg and Friedlander, 2008):LSτ : minimize ‖∇F[m0,q]C∗x− d‖2 (3.3)subject to ‖x‖1 ≤ τ.In practice, the input data d does not necessarily lie in the column spaceof the linearized modelling operator ∇F. This can be caused by lack of anaccurate background model, or by coherent events (e.g., internal multiples)or incoherent noises in the data that the linearized modelling fails to ex-plain. As a result, there will be a mismatch between the input data andthe predicted data. However, it is rarely discussed in the literature how tochoose the “right” σ for the above BPσ problem in this “noisy” case. More-over, how the mismatch is going to affect the image quality, especially whenwe subsample the data, is not well understood. In this article, we carefullydesign several experiments to find out the answers to the above questions.We also propose possible solutions if the image quality is compromised dueto the data mismatch. We start with the case where hypothetically we canuse all the data (i.e., RM is simply a Dirac operator in this case), and thendelve into the case where we randomly subsample the data to speed up theinversion. In the end, we verify our observations with a case study using arealistic synthetic model.3.3 The full-data caseWe use a simple two-layer model to examine how the choice of σ in problemBPσ affects the inversion. The model is 400 m wide and 400 m deep, witha 5 m grid spacing. The true model m has two layers. The top layer rangesfrom the surface to 145 m in depth, and has a velocity of 1500 m/s. Thebottom layer ranges from 145 m to 400 m in depth, and has a velocity of2500 m/s. We smooth the true model to get the background model m0, andobtain the perturbation δm by subtracting the background model from thetrue model. There are 21 sources put at a depth of five meters with 20 mlateral spacing, and 81 receivers put at the same depth as the sources with5 m lateral spacing.We generate two sets of data here. We make the first data set by lin-earized modelling (i.e., ∇F[m0,q]δm), which we refer to as the linearizeddata. We make the second data set by first modelling the entire wavefieldwith the true model (i.e., F[m,q] where F is the forward modelling op-46erator), and then subtracting the direct waves modelled with the smoothbackground model (i.e., F[m0,q]). We refer to this data set as the forwardmodelling data. For both data sets, we use a Ricker wavelet of 25 Hz peakfrequency as the source. From an inversion point of view, the major differ-ence between the two data sets is that the linearized data is perfectly in thecolumn space of the Born scattering operator, while the forward modellingdata is not. Traces from the two data sets are plotted in Figure 3.1(a). Thevisual difference between the two data sets is mainly a phase shift, mostprobably caused by the slightly different kinematics between the true andthe background model.We compare three scenarios here. In the first scenario, we obtain abaseline image by inverting the linearized data. As the data is completelynoise-free, we choose σ = 0 and run for 100 iterations using the SPG`1solver (van den Berg and Friedlander, 2008). In the second scenario, we usethe forward modelling data as the input, and use the true σ defined as the`2-norm of the difference (by subtraction) between the linearized data andthe forward modelling data. The third scenario is otherwise the same as thesecond scenario, except that we simply choose σ = 0 and let the solver runfor 100 iterations.The inversion results are shown in Figure 3.1(b) to 3.1(d). We can seethat the results of the first and the third scenarios are comparable. Thefailure of the second scenario indicates that the choice of σ is inappropriate.As we can see from Figure 3.1(a), although the phase shift between thetwo data sets is minimal, the difference (in red) is almost as large as theinput data itself. Therefore using the “true” σ, which is the `2-norm ofthe difference, will terminate the optimization prematurely. As a result, thesecond scenario stopped after only four iterations in our experiment. In fact,the linearized modelling in the third scenario well fits the forward modellingdata (Figure 3.1(e)). Figure 3.1(f) shows that the phase shift between thetwo data sets (Figure 3.1(a)) is reasonably explained mainly by a slightspatial shift of the model perturbation. Although this spatial shift is notdesirable, at least it leads to a coherent image in Figure 3.1(d) comparedwith 3.1(c).3.4 The subsampling caseWith the same setup as the previous section, we examine how the datamismatch affects the image quality when we subsample the data to speed upthe inversion. Instead of using all 21 sequential sources and 61 frequencies,we use 5 simultaneous sources by randomized source superposition, and 1547randomly selected frequencies (see Herrmann and Li, 2012, for details).Based on our observations in the previous section, we choose σ = 0 and runfor 100 iterations.We first compare two scenarios. Again, we use linearized data as theinput in the first scenario and the forward modelling data in the secondscenario. The results are shown in Figure 3.1(g) and 3.1(h). We can see thatthe mismatch does result in significant interfering artifacts in 3.1(h) whenwe subsample the data. In other words, the subsampled BPDN problem failsto converge to the same solution as the full problem. This is closely relatedto the calibration problem in compressive sensing (Herman and Strohmer,2010; Gribonval et al., 2012).In our previous work, we noticed that using a technique that we refer toas rerandomization can greatly improve the convergence in terms of modelerrors decrease (Tu and Herrmann, 2012a), by suppressing the imaging ar-tifacts that arise from source cross-talks due to significant subsampling. Byrerandomization, we mean to draw an independent random subsamplingmatrix RM for each LASSO subproblem, leveraging SPG`1’s warm startfeature. To maximize the benefit of rerandomization, we redraw both si-multaneous sources and randomized frequency subsets (Tu and Herrmann,2012a).As the interfering artifacts from linearization errors are also related tothe subsampling procedure, we suppose the rerandomization technique to bealso able to mitigate this type of artifacts. To verify this hypothesis, we setupthe third scenario, which is otherwise the same as the second scenario exceptthat we use rerandomization. The result is shown in Figure 3.1(i), whichis virtually artifact-free. Although not shown here, we observe that themiddle traces of Figure 3.1(d) and Figure 3.1(i) highly resemble each other.Therefore we argue that rerandomization enables the subsampled BPDNproblem to converge to virtually the same solution as the full problem.This also explains the improved convergence observed by Tu and Herrmann(2012a).3.5 Synthetic examplesIn this section, we verify our previous observations using a more challeng-ing model modified from a 2D slice of the SEG /EAGE salt model. Wepad 10 grid points to the water layer of the model for a more accurate re-moval of direct waves (i.e., we can better retain the velocity in the shallowwater layer after spatially smoothing this model to obtain the backgroundmodel, which we use to simulate the direct waves). The model is 3.9 km480.15 0.2 0.25−15−10−5051015Time (s)Amplitude(a)Lateral distance (m)Depth (m)0 200 4000100200300400(b)Lateral distance (m)Depth (m)0 200 4000100200300400(c)Lateral distance (m)Depth (m)0 200 4000100200300400(d)0.15 0.2 0.25−15−10−5051015Time (s)Amplitude(e)0 200 400−0.15−0.1−0.0500.050.10.15Depth (m)Perturbation(f)Lateral distance (m)Depth (m)0 200 4000100200300400(g)Lateral distance (m)Depth (m)0 200 4000100200300400(h)Lateral distance (m)Depth (m)0 200 4000100200300400(i)Figure 3.1: Examples with the two-layer model. The images are plot-ted with the same color scale. (a). Traces of the linearized data(blue), the forward modelling data (green), and their difference(red). (b). Inversion of the full linearized data. (c). Inver-sion of the full forward modelling data with the true σ. (d).Inversion of the full forward modelling data with σ = 0. (e).A comparison of the input forward modelling data (blue) andthe linearized modelling data with the inversion results of (d)(green). (f). The middle traces of (b) and (d). (g). Inversionof the subsampled linearized data, σ = 0. (h). Inversion of thesubsampled forward modelling data, σ = 0. (i). Otherwise thesame as (h) but with rerandomization, σ = 0.49deep and 15.7 km wide with 24.384 m (80 feet) grid spacing. We smooththe true model to obtain the background model. As salt creates high veloc-ity contrast with the surrounding medium, smoothing a salt boundary canresult in significant linearization errors (remember the linearization of thewave-equation is based on the assumption that the model perturbations aresmall). We obtain the true model perturbations by subtracting the true andthe background models, displayed in Figure 3.2(b). We put 323 co-locatedsources and receivers every two grid points laterally at the second grid pointin depth. We again generate two sets of data: the linearized data and theforward modelling data. For both data sets, we use a Ricker wavelet of 5 Hzpeak frequency as the source and record for 8 seconds. Traces from the twodata sets, as well as their difference, are plotted in Figure 3.2(a).We compare four scenarios here. In the first scenario, we run a reverse-time migration (RTM) with all the data to obtain a baseline image. Inthe second scenario, we invert from the subsampled linearized data. In thethird scenario, we invert from the subsampled forward modelling data. Inthe fourth scenario, we again invert from the subsampled forward modellingdata, but with rerandomization. In all the inversion scenarios, we use 15frequencies and 15 simultaneous sources. We set the misfit allowance σ tobe 0 and run for 100 iterations. With such a setup, the simulation cost isabout 1.45X of a single RTM with all the data. We also mute the waterlayer during each iteration of the inversion, as the artifacts imaged insidethe water layer can adversely affect data fitting.The results are shown in Figure 3.2(c) to 3.2(f). The target zone thatwe are most interested in lies below the salt structure. Figure 3.2(d) and3.2(e) show that the linearization errors does significantly degrade the im-age quality when the input data is subsampled, especially in the target zone.However, by rerandomization, the image quality is greatly improved (Fig-ure 3.2(f)). To understand this significant improvement in the image qual-ity from the perspective of data-fitting, we demigrate the inverted modelperturbations from the (subsampled) forward modelling data, both with-out (Figure 3.2(e)) and with (Figure 3.2(f)) rerandomization, and comparethem with the input forward modelling data. Traces from these data setsare plotted in Figure 3.2(g). We can see that improvements in the imagequality indeed leads to improved data fitting.We have yet another important observation from this set of examples.Compared with inversion, RTM (the adjoint operation of the linearized mod-elling) is usually considered to be more robust to coherent mismatches inthe data. However, using the proposed fast linearized inversion approachwith rerandomization, we can better resolve the subsalt structures despite50the linearization errors. To provide a detailed comparison, we zoom intothe target zones in Figure 3.2(c) and 3.2(f), and compare them in Figure3.2(h). We can see that the image in Figure 3.2(f) is more faithful to thetrue model perturbations shown in 3.2(b).3.6 ConclusionWe analyzed the effects of data mismatches on `1-norm regularized inversionrelated to errors in the linearization. This can lead to major artifacts espe-cially when randomized sources are used to make the `1-norm regularizedinversion computationally feasible. Allowing for a tolerance depending onthe linearization errors makes the situation only worse because the inversionstops prematurely in that case. We overcome this detrimental weakness of`1-norm regularized inversion by choosing independent subsets of random-ized sources during the inversion. Not only does this rerandomization leadto faster convergence but it also removes many of the artifacts related toerrors in the linearization. We tested this algorithm on stylized and realis-tic synthetic examples. Our method produced virtually artifact-free high-resolution images at low computational cost despite errors in the linearizedforward model. Contrary to relying on the fold to stack out imaging relatederrors during reverse-time migration, our method uses redundancy in thedata to stabilize the inversion. This allows us to create high-resolution im-ages at a computational cost comparable to the cost of a single reverse-timemigration with all data.510 0.5 1 1.5 2 2.5−2−101Time (s)Amplitude(a)Lateral distance (m)Depth (m)0 5000 10000 150000100020003000(b)Lateral distance (m)Depth (m)0 5000 10000 150000100020003000(c)Lateral distance (m)Depth (m)0 5000 10000 150000100020003000(d)Lateral distance (m)Depth (m)0 5000 10000 150000100020003000(e)Lateral distance (m)Depth (m)0 5000 10000 150000100020003000(f)0 0.5 1 1.5 2 2.5−2−101Time (s)Amplitude(g)Zoom of (c) Zoom of (f)Lateral distance (m)Depth (m)0 5000 10000 150000100020003000Lateral distance (m)Depth (m)0 50 10000 150000100020003000(h)Figure 3.2: Case study with the synthetic SEG/EAGE 2D salt model.All images are plotted with the same color scale except theRTM image. (a). Traces of the linearized data (blue), theforward modelling data (green), and their difference (red). (b).True model perturbation. (c). The RTM image of the forwardmodelling data. (d). Inversion of the subsampled linearizeddata. (e). Inversion of the subsampled forward modelling data.(f). Otherwise the same as (e) but with rerandomization. (g).Traces of the input forward modelling data (blue) and the lin-earized modelling data with the inversion results of (e) (green)and (f) (red). (h). Zoomed subsalt areas of (c) (left) and (f)(right).52Chapter 4Limitations of thedeconvolutional imagingcondition for two-waypropagators4.1 SummaryThe deconvolutional imaging condition has gained wide attention in recentyears, as it is often used to image surface-related multiples. However, wenoticed on close inspection that this condition was derived from one-waypropagation principles. Now that two-way wave-equation based simulationshave become more affordable, we revisit the deconvolutional imaging condi-tion and reveal its limitations for two-way propagators. First, it can distortthe image due to receiver-side propagation effects. Second, when used toimage surface-related multiples, it is not capable of removing all interferingphantom reflectors.4.2 IntroductionSurface-related multiples in marine acquisition can be strong enough to in-terfere with the primary reflections. If not handled correctly, they can re-sult in phantom reflectors that hinder correct interpretation. On the otherA version of this chapter has been published in SEG Technical Program ExpandedAbstracts, 2013, vol. 32, p. 3916-392053hand, if correctly used, they can provide additional information on the sub-surface. Based on the SRME formulation (Verschuur et al., 1992), surface-related multiples can be identified as the seismic response of the subsurfacemedium when the downgoing receiver wavefield at the surface is injectedas the source wavefield (Berkhout, 1993). However, even with the sourcewavefield correctly identified, imaging the multiples still remains a challengeas the cross-correlation imaging condition fails to yield an artifacts-free seis-mic image due to acausal cross-correlations (Muijs et al., 2007; Liu et al.,2011). To address the issue, some researchers proposed to use the decon-volutional imaging condition (or the smoothing imaging condition derivedfrom it) (Claerbout, 1971; Guitton et al., 2007; Muijs et al., 2007; Whitmoreet al., 2010; Lu et al., 2011), while others suggested that an inversion ap-proach (e.g., least-squares migration) should be used (Verschuur, 2011; Tuand Herrmann, 2012a).The deconvolutional imaging condition was derived from one-way prop-agation principles (Claerbout, 1971). Now that the two-way wave-equationbased simulations have become more affordable, we scrutinize the applicabil-ity of the deconvolutional imaging condition for two-way propagators. Ourcontribution is organized as follows. First, we do a theoretical analysis ofthe deconvolutional imaging condition in the context of one-way and two-way wave propagations. As it is extensively used to image surface-relatedmultiples, we continue to examine its efficacy for imaging surface-relatedmultiples, and compare it with our linearized inversion approach, namelythe sparsity promoting migration with surface-related multiples (Herrmannand Li, 2012; Tu and Herrmann, 2012a).4.3 Theoretical analysisFor simplicity, we use the detail-hiding matrix notation to describe the one-way wave propagation (Berkhout and Wapenaar, 1993). Tensor U+ is thedown-going wavefield of the subsurface, obtained by downward propagatingthe down-going source wavefield at the surface S+, and V− is the up-goingwavefield of the subsurface, obtained by downward propagating the upgoingreceiver wavefield at the surface D−. In frequency domain modelling, bothU+ and V− are representations of the physical position (x, z), the source isindexed by i, and the frequency is indexed by j. Denoting the monochro-matic downward continuation operator as W+j , and the upward continu-ation operator as W−j , and the reflectivity operator as R(i,j) , we haveU+(i,j) = W+j S+(i,j) and the receiver wavefield D−(i,j) = W−j R(i,j)W+j S+(i,j).54Note that with fixed indices (i, j), all wavefields here are vectors. In caseswhere the exact inverse of the upward continuation operator W−j can be de-rived so that V−(i,j) = (W−j )−1D−(i,j), the zero-offset deconvolutional imagingcondition can be put as (Claerbout, 1971)R˜ =∑i,jV−(i,j)/U+(i,j) (4.1)=∑i,j(W−j )−1W−j R(i,j)W+j S+(i,j)W+j S+(i,j)=∑i,jdiag(R(i,j)),where R˜ is the reconstructed image, the division is performed in a element-wise manner, R is a diagonal operator with the reflectivity coefficients onthe diagonal entries (Verschuur and Berkhout, 2009), and notation “diag”means to obtain the diagonal entries. In this case, the image is a summationof the reflectivities for all sources and frequencies. However, as pointed outby Gray (1997), (W−j )∗ (star means the adjoint) is often used in practice dueto the instability of inverting W−j . In this case, the image can be distorteddue to receiver side propagation effects, i.e., (W−j )∗W−j .Now we turn to two-way wave-equation migrations. A fundamental dif-ference from its one-way counterpart is that an image is not formed fromthe reflectivity, but from the perturbations in medium parameters δm overa background medium m0. Instead of having the down-going wavefield U+and up-going wavefield V− by downward continuation, we now have the tem-porally forward-propagated source wavefield U and backward-propagatedreceiver wavefield V. Denoting the time-harmonic Helmholtz operator asHj = H[m0, ωj ], we can write U(i,j) = H−1j P∗S(i,j) where operator P re-stricts data at the surface and P∗ maps surface data to the entire domain. Bywriting W+j.= H−1j P∗ and W−j.= PH−1j here, we have U(i,j) = W+j S(i,j),and the linearized receiver wavefieldD(i,j) = ω2jW−j diag(δm)W+j S(i,j).Here we overload the notation “diag” to mean inserting a vector into thediagonal entries of a matrix. Now we clearly see the analogue between theone-way and two-way propagators. By a time-reversal propagation of the55receiver wavefield, we haveV(i,j) = (H∗j )−1P∗D(i,j) = (W−j )∗D(i,j).Since the adjoint of W−j is used, as in the case of using one-way propagators,the propagation effects on the receiver side can again distort the image ifthe deconvolutional imaging condition is applied.On the other hand, apply the cross-correlation imaging condition fortwo-way propagators (i.e., reverse-time migration, RTM) remains the ad-joint of the linearized forward modelling (Lailly, 1983). As the decon-volutional imaging condition is often regarded as an “inversion” scheme(see Guitton et al., 2007, equation (3)), we also compare the approachwith our iterative linearized inversion approach, where we invert the lin-earized Born scattering operator by promoting Curvelet sparsity in themodel space (Herrmann and Li, 2012). With the notations written above,the monochromatic linearized Born scattering operator ∇Fj is expressed as:W−j diag(δm)W+j Sj.= ∇Fj [m0; Sj ]δm, and is a function of the monochro-matic source wavefield Sj and the background model m0.Since the deconvolutional imaging condition is plagued by zeros in thedenominator (see Equation (4.1)), two stabilized versions have been reportedin the literature. First, using a damping term, as reported by Claerbout(1971):R˜ =∑i,jV(i,j)U∗(i,j)‖U(i,j)‖22 + . (4.2)In our experiments, we choose the damping parameter  by  = λ∗mean(‖U(i,j)‖22).Second, the smoothing imaging condition, as reported by Guitton et al.(2007):R˜ =∑i,jV(i,j)U∗(i,j)< ‖U(i,j)‖22 >(x,z), (4.3)where the symbol <>x,z means smoothing vertically and horizontally overa window.4.4 Stylized examplesWe design the following experiment to verify our theoretical analysis. Thebackground model we use is shown in Figure (4.1(a)). The left and the righthalves are symmetric. There are 3000 m/s velocity anomalies (boundariesare smoothed) over a background velocity of 1500 m/s. The model pertur-56bation is shown in Figure 4.1(b). The model is 594 m deep and 1194 mwide with 6 m grid distance. There are 50 sources put on the left side and50 receivers symmetrically put on the right side at 6 m in depth with 12 mlateral spacing. By designing such a model and acquisition geometry, thewaves to image the left half of the perturbation are more affected by thesource-side propagation effects, while the right half is more affected by thereceiver-side effects. We generate data by applying the linearized Born scat-tering operator ∇F to the true model perturbation δm. A Ricker waveletof 20 Hz peak frequency is used.We compare three scenarios here. In the first scenario, we obtain a base-line RTM image. In the second scenario, we use two-way propagators, butuse the damped deconvolutional imaging condition (Equation 4.2), with thedamping parameter λ = 0.05, and the smooth imaging condition (Equa-tion 4.3), with a 100 m smoothing window. In the third scenario, we runour sparsity promoting migration with all the data for 100 iterations. Theresults are shown in Figure 4.1(c) to 4.1(f).Comparing the images, we can see that the deconvolutional imagingcondition, as well as its smoothing variant, lead to wrong positions of themodel perturbation on the right half of the model (indicated by arrows).Combining this observation with our theoretical analysis, we conclude thatfor two-way propagators, the deconvolutional imaging condition can distortthe image due to receiver side propagation effects. On the other hand, bothRTM and the linearized inversion give us correct images.4.5 Imaging of multiplesMost applications of the deconvolutional imaging condition that we are in-terested in are to image surface-related multiples, where it is used somewhatas an “invesion” approach. Now we have shown that it is not really an inver-sion approach, we are going to investigate the plausibility to use it to imagethe surface-related multiples by comparing it with the iterative linearizedinversion approach.However, linearized inversion for a model of realistic dimensions is usu-ally considered to be computationally unaffordable as it requires repeatedevaluations of the Born scattering operator and its adjoint, while each eval-uation itself is already expensive by having to solve four wave-equations(without saving the background wavefield) for each monochromatic sourceexperiment. In the case of imaging multiples, the computational cost is sig-nificantly increased by performing multi-dimensional convolution for multi-ple prediction. However, we have shown in Chapter 2 that the computational57Lateral distance (m)Depth0 200 400 600 800 10000100200300400500(a)Lateral distance (m)Depth0 200 400 600 800 10000100200300400500(b)Lateral distance (m)Depth0 200 400 600 800 10000100200300400500(c)Lateral distance (m)Depth0 200 400 600 800 10000100200300400500(d)Lateral distance (m)Depth0 200 400 600 800 10000100200300400500(e)Lateral distance (m)Depth0 200 400 600 800 10000100200300400500(f)Figure 4.1: The model and the imaging results of the stylized exam-ple. The arrow in (d) or (e) indicates the image distortion. (a).The background model, stars indicate source positions and tri-angles indicate receivers. (b). True model perturbation. (c).The RTM image. (d). Image by the deconvolutional imagingcondition. (e). Image by the smoothing imaging condition. (f).Image by sparsity promoting migration.58cost can be greatly reduced by: (i) identifying the total down-going wavefieldat the surface (i.e., D with our notation) as the (generalized) areal sources,and letting the wave-equation solver to carry out the multi-dimensional con-volution implicitly; and (ii) reducing per-iteration simulation cost by usingsimultaneous sources and randomized subsets of frequencies. By introducingrerandomization (Tu and Herrmann, 2012a), the computational cost can befurther reduced without compromising the image quality. Here we use thesetechniques to achieve a fast inversion.We use a more realistic model cropped from the sedimentary part of theSigsbee 2B model (courtesy of the SMAART JV). The background modelis shown in Figure 4.2(a), and the true model perturbation is shown inFigure 4.2(b). The model is 3.8 km deep and 5.9 km wide, with 7.62 m gridspacing. There are 261 co-located sources and receivers at a depth of 7.62m with 22.86 m lateral spacing. We generate the surface-related multiplesby applying the linearized Born scattering operator with generalized source∇F[m0,D] to the true model perturbation δm. We model the total dataD using a time-domain finite difference code with a free-surface boundarycondition.We do the same comparisons as we did for the stylized example. We skipthe smoothing imaging condition as it yield similar results as the dampeddeconvolutional imaging condition (e.g., compare Figure 4.1(d) and 4.1(e)).The damping parameter λ for the deconvolutional imaging is tweaked to be0.4 (a smaller λ yields to instability in division). In our sparsity promotingmigration, we use 15 frequencies, 8 simultaneous sources, and run 338 iter-ations. This leads to a simulation cost roughly the same as a single RTMof all the data. The results are shown in Figure 4.3(a) to 4.4.Many authors have concluded that for multiples, the cross-correlationimaging condition (i.e., RTM) introduces artifacts caused by acausal cross-correlations (a replicate of the ocean bottom reflection is indicated by thearrow in Figure 4.3(a), see Muijs et al., 2007; Liu et al., 2011, for de-tails). Now from Figure 4.3(b), we can see that the deconvolutional imagingcondition is neither the right choice (the ocean bottom replicate is againindicated by an arrow). The same artifact is also observed by Muijs et al.(2007) (Figure 11b) using one-way propagators. On the other hand, we doget an virtually artifact-free image using our fast iterative inversion method.4.6 ConclusionWe analyzed the limitations of the deconvolutional imaging condition fortwo-way propagators. We show that it can distort the image due to receiver-59Lateral distance (m)Depth0 1000 2000 3000 4000 50000100020003000(a)Lateral distance (m)Depth0 1000 2000 3000 4000 50000100020003000(b)Figure 4.2: The cropped Sigsbee 2B model. (a) The backgroundmodel. (b) True model perturbation.60Lateral distance (m)Depth0 1000 2000 3000 4000 50000100020003000(a)Lateral distance (m)Depth0 1000 2000 3000 4000 50000100020003000(b)Figure 4.3: The RTM (a) and the deconvolutional (b) images of mul-tiples.61Lateral distance (m)Depth0 1000 2000 3000 4000 50000100020003000Figure 4.4: Image of multiples by our fast sparsity promoting migra-tion, where the simulation cost is roughly the same as Figure 4.3.side propagation effects. When used to image the surface-related multiples,it fails to yield an image free of coherent artifacts as it is not really aninversion approach. In light of this, a true inversion scheme is necessary toimage the surface-related multiples, and this can be done efficiently usingour fast sparsity promoting migration approach with rerandomization.62Chapter 5Source estimation withmultiples5.1 SummaryWe propose a fast “wavelet-free” least-squares imaging procedure that pro-duces high-accuracy seismic images without the knowledge of the sourcewavelet. Conventional reverse-time migration requires the knowledge of thesource wavelet, which is either unavailable or very difficult to accuratelydetermine; inaccurate estimates of the source wavelet can result in seriouslydegraded reverse-time migrated images, and therefore wrong geological in-terpretations. To solve this problem, we present a “wavelet-free” imagingprocedure that simultaneously inverts for the source wavelet and the seis-mic image, by tightly integrating source estimation into a fast least-squaresimaging framework, namely compressive imaging, given a reasonably accu-rate background velocity model. However, this joint inversion problem isdifficult to solve as it is plagued with local minima and the ambiguity withrespect to amplitude scalings, because of the multiplicative, and thereforenonlinear, appearance of the source wavelet in the otherwise linear formal-ism. We have found a way to solve this nonlinear joint-inversion problemusing a technique called variable projection, and a way to overcome thescaling ambiguity by including surface-related multiples in our imaging pro-cedure following recent developments in surface-related multiple predictionby sparse inversion. As a result, we obtain highly accurate estimates of thesource wavelet and high-resolution seismic images, comparable in quality toA version of this chapter has been submitted for publication63images obtained assuming the true source wavelet is known. By leverag-ing the computationally efficient compressive-imaging methodology, theseresults are obtained at affordable computational costs compared with con-ventional processing work flows that include surface-related multiple removaland reverse-time migration.5.2 IntroductionConventional reverse-time migration (RTM, Baysal et al., 1983) requiresthe knowledge of the source wavelet as prior information, which is usedduring the forward propagation of the source wavefield. Unfortunately, thisknowledge is usually unavailable, or very difficult to accurately determine forfield seismic data. Inaccurate estimates of the source wavelet will introduceerrors to the forward propagated source wavefield, and later to the seismicimage during the cross-correlation of the source and receiver wavefields.For example, wrong estimates in the phase of the source wavelet can resultin misplaced structures in the seismic image, and therefore lead to wronggeological interpretations.To eliminate the dependence of seismic imaging on the knowledge of thesource wavelet, we are motivated to incorporate source estimation into seis-mic imaging via joint inversion of the seismic image and the source wavelet,by extending the least-squares-migration formalism (Nemeth et al., 1999).Because the forward modelling of the seismic data is linear with respectto the source wavelet, this type of joint inversion problem is known as theseparate least-squares problem (Golub and Pereyra, 1973; Kaufman, 1975;Golub and Pereyra, 2003; Aravkin and van Leeuwen, 2012). To reducethe excessive simulation cost of conventional least-squares migrations, weleverage curvelet-domain sparsity of the seismic image (Herrmann et al.,2008a), and tightly integrate source estimation into a computationally effi-cient least-squares imaging formalism (Herrmann and Li, 2012), also knownas the compressive imaging method.Although a separable least-squares problem can usually be effectivelysolved using variable projection (Golub and Pereyra, 2003), applying vari-able projection to the proposed joint inversion procedure is complicated bythe following two issues. First, the compressive imaging formalism adopts asparsity-promoting objective function, which is different from typical sepa-rable least-squares problems (Golub and Pereyra, 2003). We follow van denBerg and Friedlander (2008) and Aravkin et al. (2013a), and derive an al-ternative problem formulation that turns the sparsity-promoting objective64into a sparse constraint. Second, there is an ambiguity in the amplitudescaling between the model perturbations and the estimated source wavelet,which cannot be resolved using variable projection alone, as in any blind-deconvolution type of problem (Stockham Jr et al., 1975). Inspired by recentdevelopments in Estimation of Primaries by Sparse Inversion (EPSI, vanGroenestijn and Verschuur, 2009a; Lin and Herrmann, 2013), we proposeto resolve the ambiguity by incorporating surface-related multiples in theinversion. In this way we leverage the self-consistency between the primaryevents and their corresponding surface-related multiples, and obtain defi-nite estimates of the source wavelet as well as the seismic image. Assumingnegligible modelling errors, our method can recover the original amplitudesof the seismic image and the source wavelet with high fidelity. On a sidenote, as any other imaging algorithm, our proposed method relies on theknowledge of a reasonably accurate background model.5.2.1 Related work and our contributionThe variable projection method has found its applications in a variety ofgeophysical problems in recent years, for example, in characterization ofP-S wave conversion (Fomel et al., 2003), in velocity model building (vanLeeuwen and Mulder, 2009), and notably in source estimation during full-waveform inversion (Aravkin et al., 2012; Rickett, 2013; Li et al., 2013). Inall these applications, an objective function that measures the data misfitis used for the variable projection technique to be applicable. The misfit isusually measured using the `2-norm (i.e., a least-squares formulation), butother differentiable misfit functions can also be used (Aravkin et al., 2012).In this paper, however, we minimize the `1-norm of the solution vectorto promote sparsity, with the least-squares data-fitting term acting as aconstraint (Equation (5.2)). To the authors’ knowledge, this work is the firstinstance of applying the variable projection technique to an optimizationproblem with a sparsity-promoting objective.Regarding the use of multiples in source estimation, successful applica-tions have been demonstrated in the literature of Surface-Related MultipleElimination (SRME, Verschuur et al., 1992), and EPSI (van Groenestijn andVerschuur, 2009a; Lin and Herrmann, 2013; Esser et al., 2015). However,these approaches did not (i) recognize and tightly integrate the source es-timation with variable projection; neither did they (ii) clearly demonstratewhy incorporation of the surface-related multiples into sparsity-promotingimaging helps resolve the ambiguity in amplitude scalings that plague sourceestimations during imaging. This paper is the first, within the context of65seismic imaging, to properly address the above issues.5.2.2 Paper outlineThe paper is organized as follows. First, we formulate the joint sparsity-promoting source- and image-estimation problem, followed by a technicaldiscussion on how to solve `1-norm minimization problems with source es-timation via variable projection. Next, we explain why including surface-related multiples into the formulation helps us resolve the scaling ambiguityduring imaging and source estimation. We conclude by demonstrating theadvantages of the proposed method over conventional RTM, using carefullyelaborated synthetic examples.5.3 Problem formulation and methodAs we propose the joint inversion approach in the compressive-imagingframework, we will first briefly review the compressive-imaging formalismand formulate the source estimation problem. Afterwards, we will explainhow to solve the problem using variable projection.5.3.1 Compressive imagingGiven the knowledge of the source wavelet, compressive imaging aims toobtain least-squares migrated images in a computationally efficient manner,by solving the following Basis Pursuit De-Noise (BPDN) problem (Herrmannand Li, 2012):BPDN(w, σ) : minx‖x‖1subject to∑i∈Ω∑j∈Σ‖di,j −∇F[m0, wisj ]C∗x‖22 ≤ σ2.(5.1)In this formulation, vector w represents the Fourier spectrum of the sourcewavelet, and the tolerance parameter σ allows for data mismatch due tonoise in the data and modelling errors. Vector x is the curvelet coefficientsof the image δm, i.e., δm = C∗x with C the curvelet synthesis operator(Cande`s and Donoho, 2004), and δm itself being the perturbations over agiven gridded background velocity model m0 parameterized by the squareof the slowness. In the data-fitting constraint of the above optimization pro-gram, i indexes the discretized frequencies, and j indexes the different sourceexperiments. Instead of using all nf discretized frequencies and ns source66experiments, Herrmann and Li (2012) proposed to subsample the monochro-matic source experiments to reduce the simulation cost in forward modelling:notations Ω and Σ denote the randomly selected n′f  nf frequencies andn′s  ns source experiments. By subsampling, the number of wave-equationsolves in each iteration is reduced by a factor of (nf ∗ns)/(n′f ∗n′s). We referto Herrmann and Li (2012) and Chapter 2 on the details of the subsam-pling procedure. The underlined variables di,j and wisj represent the jthcolumn of the subsampled primary-only receiver wavefields (i.e., simultane-ous data) and point-source wavefields (i.e., simultaneous sources) respec-tively. The operator ∇F represents the linearized Born scattering operator.Note that the source weight wi is separable from the source vector sj , i.e.,∇F[m0, wisj ]δm = wi∇F[m0, sj ]δm. Throughout this paper, we form theoperator∇F based on the two-way wave equation, therefore the evaluation ofthe adjoint of ∇F is equivalent to the reverse-time migration (RTM, Baysalet al., 1983; Lailly, 1983). The above formulation means that among allpossible solutions, the optimization program looks for the sparsest curveletcoefficients that, after curvelet synthesis and linearized modelling with thesimultaneous sources, predict the simultaneous observed data up to a noisethreshold specified by σ. With the above compressive-imaging formalism inplace, we now continue to formulate the source estimation problem.5.3.2 Source estimationNow we identify the source wavelet wi, i ∈ Ω (collected in the vector w)as an additional unknown in the compressive imaging formulation, i.e., wehaveBPDN(σ) : minx,w‖x‖1subject to∑i∈Ω∑j∈Σ‖di,j − wi∇F[m0, sj ]C∗x‖22 ≤ σ2.(5.2)While conceptually attractive, BPDN(σ) does not lend itself to developtractable algorithms. Following early work by van den Berg and Fried-lander (2008), we interchange the objective and constraint to arrive at anextension—remember the formulation now includes the source as an un-known, which appears nonlinearly in the BPDN(σ) program, of the LeastAbsolute Shrinkage and Selection Operator (LASSO, Tibshirani (1996)) for-67mulation:LASSO(τ) : minx,wf(x,w).=∑i∈Ω∑j∈Σ‖di,j − wi∇F[m0, sj ]C∗x‖22subject to ‖x‖1 ≤ τ.(5.3)As shown by Aravkin et al. (2013a), the set of minimizers of problemBPDN(σ) and LASSO(τ) coincide, given the condition that each minimizerof problem BPDN(σ) satisfies its constraint to equality (i.e., f(x,w) = σ2).Not coincidently, when this condition is met, each minimizer also satisfiesthe constraint of the LASSO(τ) problem to equality, meaning that we canlook for such a τ that the infimum of f(x,w) with |x|1 ≤ τ has a value thatequals σ2. Because the objective function of problem LASSO(τ) adopts thecanonical form of a separable least-squares problem (Golub and Pereyra,2003), we can now solve the problem using variable projection combinedwith projection onto a convex set. In the next few sections, we will firstdiscuss how we solve problem LASSO(τ) with a given τ , and then discusshow we compute the right τ so that problem LASSO(τ) converges to thesame solution as problem BPDN(σ).5.3.3 Variable projectionTo incorporate variable projections for the source wavelet into `1-norm spar-sity promoting imaging, let us first consider iterations that involve softthresholding that undergirds this type of `1-norm optimization. Model it-erates at the kth iteration are in this case projected onto the `1-norm ball‖x‖1 ≤ τ , and the involved projected-gradient step is given byxk+1 = PX[xk + λ∇xf(x,w)∣∣∣∣x=xk,w=wk]. (5.4)In this expression, λ is a line-search parameter, ∇xf(x,w)∣∣∣∣x=xk,w=wkisthe gradient of the objective function in Equation (5.3) with respect to xat the kth iteration, and PX denotes the projection onto the feasible setX = {x : ‖x‖1 ≤ τ}.Contrary to conventional `1-norm promoting imaging where w is as-sumed to be known, we need to evaluate this gradient ∇xf(x,w)∣∣∣∣x=xk,w=wkwith variable projection at each iteration for unknown wk’s. During each68iteration, we accomplish this by solving the following optimization problemfor each frequency:minwi∑j∈Σ‖di,j − wi∇F[m0, sj ]C∗x‖22, (5.5)which permits the following closed-form solution for the source wavelet(Pratt, 1999):w˜i(x) =∑j∈Σ〈di,j ,∇F[m0, sj ]C∗x〉∑j∈Σ〈∇F[m0, sj ]C∗x,∇F[m0, sj ]C∗x〉. (5.6)Here, the angular brackets 〈·, ·〉 denote the inner product of two vectors,and w˜i(x), i ∈ Ω denote complex-valued estimates of the source wavelet atdifferent angular frequencies. The above solution enables us to eliminate theunknown source wavelet term in problem LASSO(τ), rendering the problemto solely and nonlinearly depend on x:minxf(x).=∑i∈Ω∑j∈Σ‖di,j − w˜i(x)∇F[m0, sj ]C∗x‖22subject to ‖x‖1 ≤ τ.(5.7)At first inspection, the above problem becomes more complicated thanLASSO(τ) as we need to evaluate the derivative of w˜i(x), i ∈ Ω with re-spect to x. However, Aravkin and van Leeuwen (2012) (Corollary 2.3) haveproved that a stationary point of the above problem remains a station-ary point of problem LASSO(τ). Furthermore, we have (Aravkin and vanLeeuwen, 2012, Theorem 2.1)∇xf(x) = ∇xf(x, w˜(x)). (5.8)The above equation means that in each iteration, after vector x is mod-ified according to Equation (5.4), we can compute ∇xf(x) by evaluating∇xf(x,w) of problem LASSO(τ) at w = w˜(x) via Equation (5.6).While there is no guarantee the variable projection approach will yieldthe globally optimal solution as the problem remains non-convex, numericalexperiments show that in most cases it enables the optimization program toconverge within less iterations (Golub and Pereyra, 2003). Rickett (2013)also showed its superior performance over the simultaneous descent methodin source estimation for full-waveform inversion applications. Furthermore,the projections themselves (cf. Equation (5.6)) are computationally afford-69able as they do not involve additional expensive operations such as wave-equation solves.5.3.4 Relaxing the `1-norm constraintIn this section, we describe our strategy to select the right τ so we actu-ally solve problem BPDN(σ) by solving problem LASSO(τ). We follow themethodology proposed by van den Berg and Friedlander (2008) in SPG`1,where a series of τ ’s are found by solving a root-finding problem on thePareto trade-off curve. With the unknown source wavelet, we find the τ ’sby solving the value function ν(τ).= inf f(x,w)||x|1≤τ = σ2, which yieldsthe following `1-norm constraint for the (l + 1)th LASSO subproblem usingthe Newton’s method with an initial guess of τ0 = 0:τ l+1 = τ l −ν(τ l)− σ2ν ′(τ l). (5.9)Using this approach, we arrive at the solution of BPDN(σ) by solving aseries of LASSO(τ) subproblems for gradually increasing τ ’s. While ν(τk)in the above equation can be evaluated straightforwardly with the previous τestimate, the evaluation of ν ′(τk) is more difficult, because the nonlinearityintroduced by the unknown source wavelet violates the linearity assumptionon the forward model, which is needed to compute this value (Aravkin et al.,2013a). However, by treating w as fixed, we approximate ν ′(τk) by (van denBerg and Friedlander, 2008; Aravkin et al., 2013a):ν ′(τk) ≈ −‖∑i∈Ω∑j∈ΣC∇F∗[m0, wisj ] (di,j −∇F[m0, wisj ]C∗x)‖∞. (5.10)With numerical examples, we will show that this approximation works rea-sonably well.5.3.5 Acceleration by redrawing random subsetsExcept for the extra variable projection step, solving LASSO(τ) is similarto compressive imaging (Herrmann and Li, 2012), during which computa-tional costs can be reduced by working with randomized subsets of data. Asshown by Herrmann and Li (2012) and Chapter 2, solutions of BPDN(w;σ)for a given source wavelet can be accelerated significantly by selecting newindependent randomized subsets of frequencies and simultaneous sources,after each corresponding LASSO subproblem is solved. We found empiri-70cally that working with these new independent subsets leads to faster decayof the model error and to improved robustness with respect to possible lin-earization errors (Chapter 3)—i.e., di,j − F[m0, wisj ] ≈ ∇F[m0, wisj ]δm.Both findings can be explained because independent subsets tend to breakcorrelations between errors in the solution and the randomly selected subsetsof data (Herrmann, 2012).While these empirical findings lead to a computationally feasible androbust compressive imaging scheme, we need to justify that working withdifferent randomized subsets does not interact adversely with the proposedsource estimation approach. Aside from additional empirical evidence fromvarious examples included below, which suggests that there is no measurableadverse effect, we argue that (i) numerical of the value function, ν ′(τ),and its derivative ν ′(τ) remain similar as long as we keep the number offrequencies and simultaneous sources the same; and (ii) we only draw newsubsets after each LASSO(τ l) is solved. The latter argument assumes thatestimates of the source at the lth subproblem have converged, which is areasonable to make because the number of knowns in Equation (5.6) for eachfrequency far exceeds the dimensionality of the single unknown complex-value for the source.5.4 Resolving scaling ambiguities with multiplesDespite the fact that there are indications that we arrived at a practicalalgorithm to estimate source wavelet as well as high-resolution least-squaresimages, fundamental issues related to the ambiguity with respect to ampli-tude scalings remain, which are intrinsic to any blind-deconvolution type ofproblems where both the source wavelet and the seismic image are unknown(Stockham Jr et al., 1975). However, compared to source estimation in dataspace using the convolution model (Ulrych et al., 1995), source estimationin the image space does not suffer from ambiguities with respect to globalphase shifts, which can be explained by the multiplicity of data and move-out characteristics that are specific to the background-velocity model beingused. As stated in the introduction, we assume this background model tobe given and to be relatively accurate.Unfortunately, the same observation does not apply to the ambiguitywith respect to amplitude scalings, which correspond mathematically to aninvariance of our objective functions with respect to amplitude scalings by71α ∈ R+—i.e., we havef(x,w) = f(1αx, αw). (5.11)This invariance results in amplitude ambiguity in the source estimation,which originates from the fact that the objective of LASSO(τ) is bi-linear,namely linear in both the source wavelet w and the solution vector x.Unfortunately, including sparsity constraints via the `1-norm or even viathe non-convex `1/`2-norm (Esser et al., 2015), are inadequate to resolve theabove amplitude ambiguity. We therefore alternatively resort to the physicsof the free surface. By incorporating the free surface, our problem becomesnonlinear because the forward model now includes a “feedback loop”, whichgenerates surface-related multiples from primaries scaled by the (unknown)source wavelet. Extending the physics in this way allows us to resolve thescaling ambiguity by using observed surface-related multiples explicitly toestimate the source wavelet as proposed during EPSI (van Groenestijn andVerschuur, 2009a; Lin and Herrmann, 2013). Following recent work shownin Chapter 2, we propose to do this by incorporating predictions of surface-related multiples into our formulation for compressive imaging.5.4.1 Compressive imaging with total upgoing wavefieldsIf we ignore internal multiples, the total upgoing wavefield, including pri-maries and surface-related multiples, can be modelled by including the ob-served upgoing wavefields ui,j at the water surface as areal sources in thelinearized Born scattering operator (Chapter 2), yieldingui,j ≈ ∇F[m0, wisj − ui,j ]. (5.12)For a perfect free surface with a reflection coefficient of −1, the downgoingspatially impulsive source wavefields wisj are replaced by “areal” sourceswisj − ui,j , i.e., we include the downgoing receiver wavefields at the watersurface −ui,j into the source wavefields. We obtain ui,j after careful prepro-cessing to the observed seismic data, such as applying source-receiver reci-procity (to fill in missing traces), up-down decompositions (for receiver-sidedeghosting), and extrapolation of the upgoing wavefield from the receiverlevel to the free surface (see e.g., Verschuur et al., 1992). As a result, we ar-rive at a formulation that remains conducive to sparsity-promoting imagingwith source estimation via variable projection. Because prediction for themultiples are carried out by the wave-equation solver, this forward model is72also computationally viable compared to processing work flows that involveseparate multiple-prediction and RTM-imaging procedures.To arrive at our final formulation, we first replace our objective byf(x,w) =∑i∈Ω∑j∈Σ‖ui,j −∇F[m0, wisj − ui,j ]C∗x‖22, (5.13)which we obtain by substituting the primary wavefield with the total upgoingwavefield and the spatially impulsive sources with areal sources containingthe total downgoing wavefield at the surface. Given this new definitionfor the objective, we proceed by solving the LASSO(τ) subproblems withvariable projections. For this end, we solve for all (simultaneous) sourcesand for each frequency the following problem:minwi∑j∈Σ‖ui,j −∇F[m0,−ui,j ]C∗x− wi∇F[m0, sj ]C∗x‖22, i ∈ Ω,which in turn permits the following analytic solution:w˜i(x) =∑j∈Σ〈ui,j −∇F[m0,−ui,j ]C∗x,∇F[m0, sj ]C∗x〉∑j∈Σ〈∇F[m0, sj ]C∗x,∇F[m0, sj ]C∗x〉. (5.14)To arrive at the modified solution for the source wavelet above, we sub-stitute the primary-only wavefield by an estimate for the primaries givenby the total total upgoing wavefield minus the current prediction for thesurface-related multiples—i.e., di,j 7→ ui,j −∇F[m0,−ui,j ]C∗x. The multi-ple predictions do not require information on the source wavelet; rather, theyare carried out by solving wave-equations with the areal sources given by−ui,j . With these substitutions, we are able to estimate the source waveletby variable projection at additional costs of a single linearized forward mod-elling operation to predict the multiples.In return of the increased computational costs, including surface-relatedmultiples during imaging with source estimation has several key advantages,namely (i) including the multiples resolves ambiguities in the source esti-mation, which leads to a correct scaling of the image and therefore of thepredicted primaries and multiples; (ii) sparsity-promoting imaging with mul-tiples removes coherent artifacts related to surface-related multiples by map-ping multiple energy onto primaries and therefore onto the true image; and(iii) our inversion procedure properly accounts for the source wavelet result-ing in true-amplitude source-deconvolved high-resolution images (Chapter2). This latter claim of course hinges on the assumption that amplitudes73are correctly modelled by the linearization.5.4.2 Putting it all togetherWith the building blocks of our sparsity-promoting imaging with source esti-mation and multiples in place, we summarize our proposed imaging schemein Algorithm (2).Input and initialization [Line 1-6]As discussed above, our proposed inversion approach requires the knowl-edge of a reasonably accurate background velocity model m0. As discussedin Chapter 3, we choose a zero σ to prevent the optimization algorithmfrom being terminated prematurely (Line 3). The initial guess of the sourcewavelet is simply an impulse at zero time, which corresponds to a flat Fourierspectrum with unit amplitude and zero phase (Line 6). Both the solutionvector x and the sparsity level τ are simply initialized as zeros.Main loop [Line 7-16]In the main loop we solve a series of LASSO(τ l) subproblems. For each sub-problem, we draw new independent subsets of randomized frequencies andsimultaneous sources (Line 8). We update the sparsity parameter τ l usingNewton’s method (Line 9), and solve for the model parameters collected invector x (Line 10) as well as the source wavelet collected in vector w (Line13) by variable projection. Note that we do not impose any assumption onthe phase of the source wavelet (e.g., minimum phase assumption as usedin Robinson, 1967) during source estimation. We terminate the optimiza-tion program when the pre-specified maximal number of iterations kmax isreached.5.5 ExamplesTo validate and test the proposed imaging scheme with on-the-fly source es-timation, we conduct a series of synthetic examples designed to illustrate keyaspects of our algorithm including sensitivity to errors in the source wavelet,robustness with respect to modelling errors, and resolving ambiguity withmultiples.74Algorithm 2 Fast imaging with source estimation and multiples1. Input:2. total upgoing wavefield u, background velocity model m0,3. tolerance σ = 0, iteration limit kmax4. Initialization:5. iteration index k ← 0, LASSO subproblem index l← 0, τ0 ← 0, x0 ← 06. wi = 1 for all i ∈ 1, · · · , nf7. while k < kmax do8. Ωl, Σl, ui,j , sj ← new independent draw9. τ l ← determined from τ l−1 and σ using Newton’s method10. xl ←{argminx∑i∈Ωl,j∈Σl‖ui,j −∇Fi[m0, wi(x)sj − ui,j ]C∗x‖22subject to ‖x‖1 ≤ τ l11. //warm start with xl−1, solved in kl iterations12. //in each iteration, update the source by13. wi(x) =∑j∈Σ〈ui,j−∇F[m0,−ui,j ]C∗x,∇F[m0,sj ]C∗x〉∑j∈Σ〈∇F[m0,sj ]C∗x,∇F[m0,sj ]C∗x〉14. k ← k + kl, l← l + 115. end while16. Output: model perturbations δx = C∗x5.5.1 Imaging with primaries onlyMost conventional seismic data processing work flows use the primary wave-field as the input for migration, i.e., after a de-multiple procedure to the ob-served data, for example using SRME (Verschuur et al., 1992). We thereforefirst apply our method to primary-only data to demonstrate its advantagesover RTM when used in conventional data work flows.Experiment setupWe use a 2D slice of the SEG/EAGE salt model. We pad 10 grid pointsat the top of the model so that the water layer becomes slightly thicker, tobetter retain the water velocity near the surface when we smooth the truemodel to get the background model. In this way we can model and thereforeremove the direct waves more accurately. The true and the backgroundvelocity models are shown in Figure 5.1. The model is 3.9 km deep and15.7 km long, with a grid spacing of 24.38 m (80 ft). We use a fixed-spreadconfiguration, and put 323 co-located sources and receivers with a 48.77m lateral grid spacing (i.e., every other grid point) at a depth of 24.38 m(i.e., the second vertical grid point). We use a Ricker wavelet with a peak75(a)(b)Figure 5.1: A slightly modified 2D slice of the SEG/EAGE salt model.(a) True velocity model. (b) Smoothed background velocitymodel.frequency of 5 Hz, and the peak of the wavelet is at 0.25 s. We record for 8seconds, which in the frequency domain yields 96 discretized frequencies upto 12 Hz.Ideal caseTo rule out the influence of any imperfection in the data, we first test ourmethod with an ideal setup, i.e., we synthesize and invert the primary-onlydata using the same linearized modelling engine. To obtain a baseline image,we run a conventional RTM by fictitiously assuming that we know the truesource wavelet (Figure 5.2(a)). To remove the low-wavenumber artifacts inthe RTM image, we apply a high-pass filtering along the depth dimension(Mulder and Plessix, 2003). To verify the necessity of the proposed on-the-fly source estimation procedure, we first show the detrimental effect ofinverting with a wrong wavelet (Figure 5.2(b)). Next, we compare it withusing the true source wavelet (Figure 5.2(c)), and then we show how to avoidthe adverse effect wavelet using the proposed method (Figure 5.2(d)). Thewrong source wavelet we use is obtained by introducing a time shift of -0.1s (the minus sign indicates an advance in time) to the true source wavelet.76We can see that compared with conventional RTM (Figure 5.2(a)), theproposed approach produces an image of higher spatial resolution given thetrue source wavelet (Figure 5.2(b)). However, the image quality is signifi-cantly compromised when the wavelet is wrong. In this case the shifted phaseof the wavelet not only produces an image contaminated with subsampling-related noises (Figure 5.2(c)), but also leads to a shift of the subsurfacestructures in depth, which can result in erroneous interpretations (althoughnot shown here, the structures in the RTM image using the wrong waveletare also shifted in depth). With the proposed source-estimation approach,we obtain a faithful image (Figure 5.2(d)), as if we use the true sourcewavelet (Figure 5.2(b)).Remarks on the experiment: In terms of computational cost, we use 16randomly selected frequencies, 16 simultaneous sources, and 60 iterationsfor all inversion results; as a result, the four images in Figure 5.2 involveroughly the same number of wave-equation solves. For all inversion resultsin this and later sections, after the inversion is finished, we apply a curveletthresholding to remove residual incoherent noises in the image (Herrmannet al., 2008b). The threshold is chosen in such a way that the thresholdednoise does not contain noticeable coherent energy. As our solution vectorx is already in the curvelet domain, extra computation incurred in thisthresholding step is minimal.As a quality-control procedure, we also plot in Figure 5.3 the estimates ofthe source wavelet outputted by the last LASSO(τ) subproblem (in green),and compare it against the true source wavelet (in blue). Both amplitudeand phase spectra are plotted. Because of the amplitude ambiguity wediscussed above, we normalize the amplitude spectra of both the true andthe estimated source wavelet to compare them. The source estimates arehighly accurate except for the absolute amplitude scaling.More realistic caseTo test the robustness of our method to imperfections in the data — such aslinearization errors (as explained earlier), internal multiples, and discrepan-cies between the modelling engines used for data simulation and inversion —we now apply the method to a more realistic primary-only data set modelledwith iWave (Terentyev et al., 2014). We compare the following examples.For comparison, we again get the baseline image using conventional RTMwith the true wavelet (Figure 5.4(a)). Next we obtain the image using theproposed method with the true source wavelet (Figure 5.4(b)) and withsource estimation (Figure 5.4(c)). In Figure 5.5 we plot the amplitude (with77(a)(b)(c)(d)Figure 5.2: Imaging results of the ideal primary-only data set. Thearrows point to the same subsurface locations. (a) ConventionalRTM with the true source wavelet. (b) to (d) Compressiveimaging results with the true source wavelet (b), with the wrongsource wavelet that has a -0.1s time shift (c), and with theproposed source-estimation method (d). It is evident that theimages in (b) and (d) are highly comparable, and are of muchhigher spatial resolution than the conventional RTM image in(a).780 2 4 6 8 10 1200. (Hz)Amplitude(a)0 2 4 6 8 10 12−3−2−10123Frequency (Hz)Phase(b)Figure 5.3: Source estimates from the ideal primary-only data set.(a) and (b): Amplitude (a) and phase (b) spectra of the true(in blue) and the estimated (in green) source wavelet during thelast LASSO(τ) subproblem, after amplitude normalization.normalization) and phase spectra of the true wavelet and the estimatedwavelet during the last LASSO(τ) subproblem.We observe that, in the presence of the imperfections in the data, (i) theproposed source-estimation method still yields an image that highly resem-bles the one obtained with the true source wavelet; (ii) both images are stillof higher spatial resolution than the conventional RTM image; and (iii) thesource estimates degrade gracefully, compared with the case where we invertthe ideal data (see Figure 5.3). The imperfections do result in some imag-ing artifacts beneath the salt structure, but they do not particularly causeproblems for the source-estimation approach. The above observations leadto the conclusion that the proposed source-estimation approach is relativelyrobust with respect to coherent noises in the primary-only data.Remarks on the experiment: In generating the data with iWave, we usean absorbing boundary condition at the surface to avoid generating surface-related multiples. The observed data is the difference between the datamodelled with the true model and the background model. We use the sameinversion parameters as the above set of examples, and as a result, thecomputational costs of the three images in Figure 5.4 remain roughly thesame.As demonstrated by the two sets of examples above, our proposed sourceestimation method, when applied to conventional primary-only data, pro-duces seismic images that are comparable to images obtained using true79(a)(b)(c)Figure 5.4: Imaging results of the primary-only data set modelled us-ing iWave. The arrows point to the same subsurface locations.(a) Conventional RTM image. (b) and (c) Compressive imag-ing results using the true source wavelet (b) and with the pro-posed source-estimation method (c).source wavelets. The method is also relatively robust to imperfections inthe input data. However, as we discussed above, the scaling ambiguitycannot be solved by imaging with primaries alone because of the blind-deconvolutional nature of the problem. Next we will demonstrate how weresolve the ambiguity using surface-related multiples.800 2 4 6 8 10 1200. (Hz)Amplitude(a)0 2 4 6 8 10 12−3−2−10123Frequency (Hz)Phase(b)Figure 5.5: Source estimates from the primary-only data set modelledusing iWave. (a) and (b): Amplitude (a) and phase (b) spectraof the true (in blue) and the estimated (in green) source waveletduring the last LASSO(τ) problem, after amplitude normaliza-tion.5.5.2 Scaling images with multiplesTo validate our proposal to incorporate surface-related multiples in the imag-ing procedure to resolve the amplitude ambiguity, we apply the proposedmethod to data that contain surface-related multiples. As above, we studythe performance of the proposed approach both with an ideal data set andwith a more realistic data set. The initial guess for the source wavelet re-mains an impulse at t = 0.Experiment setupWe use a sedimentary part of the Sigsbee 2B model (The SMAART JV,2014), which by design has a strong ocean-bottom reflector, to carry outexamples using surface-related multiples. We reduce the water depth of theoriginal model, to allow for more orders of multiples to be recorded. The trueand the background velocity models are shown in Figure 5.6. The model is3.8 km deep and 6 km long, with 7.62 m (25 ft) grid spacing. We again usea fixed-spread configuration, and put 261 co-located sources and receiverswith 22.86m lateral grid spacing (i.e., every three grid point) at a depthof 15.24 meters (i.e., the third vertical grid point). We use a zero-phaseRicker wavelet with a peak frequency of 15 Hz, and record for 8.184 seconds(i.e., 1024 time samples with 8 ms sampling interval). This setup yields 31181discretized frequencies up to 38 Hz.Ideal caseWe synthesize the ideal data set including surface-related multiples usinglinearized modelling following Equation (5.12). Note that in this case, themultiples are not generated by including an explicit free surface in the model,but by injecting the downgoing receiver wavefield as areal sources. To showthe removal of amplitude ambiguity, we compare the imaging results ob-tained with source estimation, including both the image and estimates ofthe source wavelet during the last LASSO(τ) problem, against the truemodel and source wavelet. The inverted model perturbations are shown inFigure 5.7(a). To show that it is indeed of the true amplitude and phase, weadd it back to the background model to estimate the true model. The re-sult is shown in Figure 5.7(b), and is comparable to the true velocity modelshown in Figure 5.6(a). We juxtapose the amplitude and phase spectraof the true and estimated source wavelets in Figure 5.8, without applyingany normalization. This comparison shows that in this ideal case, the pro-posed imaging-with-multiple approach can retrieve the original amplitude(and phase) of both the model perturbations and the source wavelet withhigh accuracy.Remarks on the experiment: In the inversion, we use 31 randomly se-lected frequencies, 26 simultaneous sources, and 50 iterations. As a result,the number of wave-equation solves is roughly 1.5× of one RTM with allthe data (as discussed above, source estimation with multiples introduces a50% increase in simulation cost for each iteration).More realistic caseWe model a more realistic data set that contains multiples using iWave witha free-surface boundary condition (Terentyev et al., 2014). The observeddata is the difference between the data modelled with the true model andthe background model. Note that the imperfections in the data now alsooriginate from approximating the observed surface-related multiples by theSRME-type multiple prediction, although this prediction is carried out im-plicitly using wave-equation solvers in our proposed approach (Chapter 2).Because the modelling engines used in data simulation and inversion havedifferent scalings, we can no longer recover the original amplitude. Thereforewe compare the images obtained with the true source wavelet (Figure 5.9(a))and with source estimation (Figure 5.9(b)). The inversion parameters and82(a)(b)Figure 5.6: A sedimentary part of the Sigsbee 2B model. (a) Truevelocity. (b) Smooth background velocity.83(a)(b)Figure 5.7: Compressive imaging-with-multiple result of the idealdata set containing multiples. (a) The inverted model pertur-bations. (b) Estimate of the true velocity model by adding theinverted model perturbations back to the background model.840 10 20 30020406080100Frequency (Hz)Amplitude(a)0 10 20 30−3−2−10123Frequency (Hz)Phase(b)Figure 5.8: Source estimates from the ideal data set containing multi-ples. (a) and (b): Amplitude (a) and phase (b) spectra of thetrue (in blue) and the estimated (in green) source wavelet duringthe last LASSO(τ) subproblem. No normalization is applied.therefore the computational costs remain the same as the above set of ex-amples. The two images are comparable, both containing minimal coherentartifacts from surface-related multiples (e.g., periodic copies of the oceanbottom at deeper section of the image).We plot the amplitude and phase spectra of the true and the estimatedsource wavelets, in Figure 5.10(a) and 5.10(b). Because we do not recoverthe original amplitude, we need to normalize the amplitude spectra. We ob-serve that in the presence of imperfections in the data, the estimated sourcewavelet again degrades gracefully, as in the primary-only case. The imper-fections in the data do cause the the estimated source wavelet to be biasedtowards low frequencies, which is presumably caused by the “obliquity” fac-tor (Dragoset and Jericˇevic´, 1998), and can be mitigated by applying theabsolute value of the angluar frequencies, i.e., |ω|. We will discuss aboutthe biased source estimates in the Disucssion section with a more detailedanalysis.5.5.3 Convergence analysisTo obtain a more quantitative understanding of the proposed method, westudy and compare the convergence behaviors of the recovery algorithm withthe true wavelet and with source estimation. We use the normalized cross-correlation (NCC) to measure how well the inverted results approximate true85(a)(b)Figure 5.9: Compressive imaging-with-multiples results of the dataset modelled using iWave. (a) and (b): Image obtained withthe true source wavelet (a) and with source estimation (b).860 10 20 3000. (Hz)Amplitude(a)0 10 20 30−3−2−10123Frequency (Hz)Phase(b)Figure 5.10: Source estimates using multiples from the data set mod-elled using iWave. (a) and (b): Amplitude (a) and phase(b) spectra of the true (in blue) and the estimated (in green)source wavelet during the last LASSO(τ) subproblem, afteramplitude normalization and correction with |ω|.model perturbations. For two vectors v1 and v2 (in our case, we vectorize thetrue and inverted model perturbations first), the NCC is the inner productof two vectors divided by the product of the `2 norms of the two vectors(i.e., normalized inner product). Geometrically the NCC is the cosine of theangle between two vectors and is not sensitive to the amplitude of eithervector. The smaller the angle, the closer the two vectors are, and the closerthe NCC is to one. We plot the NCC between the model iterates (we savea model iterate whenever a LASSO subproblem is finished) and the truemodel perturbation.We plot the NCC curves in Figure 5.11 for both the imaging-with-primary and the imaging-with-multiple results in the presence of data imper-fections. In both cases, we can see that the convergence behaviors of usingthe true source wavelet and source estimation are comparable, demonstrat-ing the efficacy of the proposed source-estimation method.5.5.4 Robustness to wavelet initial guessWe use an impulsive source as the initial guess of the source wavelet in allexamples shown in this paper and get reasonable recoveries of the seismicimages as well as the source wavelets. In this section, we would like tostudy the performance of the proposed method when the initial guess of the870 10 20 30 40 50 6000. cross−correlation(a)0 10 20 30 40 5000. cross−correlation(b)Figure 5.11: Model misfit represented by the normalized cross-correlation (NCC) as a function of the number of iterations.Each dot indicates the finish of one LASSO subproblem. (a)NCC curves for the imaging-with-primary examples in Fig-ure 5.4, using the true wavelet (in blue), and with source es-timation (in green). (b) NCC curves for the imaging-with-multiple examples in Figure 5.9, using the true wavelet (inblue) and with source estimation (in green).source wavelet contains notable phase errors. Robustness of the proposedmethod to these phase errors is desirable. We use the same experiment setupas the imaging-with-multiples examples. We show the two erroneous initialguesses of the source wavelet in Figure 5.12(a), and show their correspondingsolution paths in terms of model errors in Figure 5.12(b). We can see despitethat both initial guesses of the source wavelet can lead to local minima, thesolution paths and the final model errors are comparable to the case where weuse an impulsive initial guess, demonstrating the robustness of the proposedmethod with respect to errors in the the initial guess of the wavelet.5.6 Discussion5.6.1 A note on the “true-amplitude” seismic imagingFrom the above discussion, we can see that unless the true source waveletis known, obtaining the true-amplitude seismic image is out of the questionwhen primary-only data are used due to the amplitude ambiguity. Incorpo-88−0.1 −0.05 0 0.05 0.1 0.15−2−1012Time (S)Amplitude(a)0 10 20 30 40 5000. cross−correlation(b)Figure 5.12: (a) The true source wavelet and the two erroneous ini-tial guesses of the wavelet. The initial guess in green has asignificant time shift, and the one in red has a 90 degree phaserotation plus a time shift. (b) The corresponding solutionpaths in terms of model errors measured using NCC.rating surface-related multiples in the inversion formulation can resolve theambiguity, i.e., we get both the source estimates and the image with definiteamplitudes. However, these definite amplitudes are the “true” amplitudesonly if the linearized modelling engine can accurately reproduce the inputwavefield given the true source wavelet, i.e., there is no modelling error.In practice, many factors hinder this requirement from being satisfied, forexample, scaling of the modelling engine, and modelling errors includinglinearization errors (Chapter 3), and oversimplifications of physics such aselasticity and attenuation etc. Nevertheless, the presented method providesa viable approach for the least-squares type of seismic imaging algorithmwith an unknown source wavelet to be one step closer towards the retrievalof true amplitudes.5.6.2 Source estimation as a “garbage collector”To understand the biased source estimates in Figure 5.10(a), we performa more careful analysis of the source estimation scheme in the proposedmethod. According to the objective function of the source estimation step(Equation (5.5)), we estimate the source wavelet in such a way that thelinearized modelling produces predicted data that best match the observeddata. Because the spectra of the data is the product of the spectrum of the89source wavelet and that of the Green’s function, the estimated source waveletaims to collect whatever in the data that is not adequately explained by thepredicted Green’s function. In cases where the predicted Green’s functionis biased because of modelling errors, the source estimates can also departfrom the true sources to accommodate this bias.We therefore plot the frequency spectra of the predicted Green’s function(by linearized modelling in the frequency domain using the inverted modelperturbations with an impulsive source wavelet), and compare them withthe Green’s function corresponding to the input data (i.e., the true sourcewavelet is deconvolved from the data). For easier visualization, we sum overall data traces to obtain one trace that approximately shows the spectra en-velope. The result is shown in Figure 5.13(a). We can see that because thepredicted Green’s function has underestimated low frequencies, the sourceestimates in return are biased towards the low frequencies to compensate forthis underestimation. The underestimation of the predicted Green’s func-tion at low frequencies is caused by modelling errors—such as linearizationerrors—as we do not see the biased source estimates in Figure 5.8(a) wherewe use the ideal input data. Nevertheless, what is more important is that de-spite the bias in the source estimates, the predicted Green’s function and theinverted model perturbations are comparable to the ones obtained using thetrue source wavelet (Figure 5.13(a), blue and green lines, and Figure 5.9(a)and 5.9(b)). In contrast, for the imaging-with-primary examples, there isnot such a significant underestimate or overestimate of the predicted Green’sfunction (Figure 5.13(b)), and we therefore obtain roughly unbiased sourceestimates (Figure 5.5(a)).5.7 ConclusionWe have proposed a fast least-squares imaging procedure that, unlike con-ventional reverse-time migration, doesn’t require prior knowledge of thesource wavelet which is difficult to accurately determine. We formulatedthis wavelet-free imaging procedure as a nonlinear sparse-inversion prob-lem, by tightly integrating source estimation into the compressive imagingframework, and proposed to solve the problem using variable projection.Because of the blind-deconvolutional nature of this problem, applying vari-able projection alone could not address the scaling ambiguity in estimatingthe source wavelet and the seismic image. We therefore continued by resolv-ing the issue through the incorporation of surface-related multiples in theinversion process (although at the expense of an increased computationalcost). We subsequently provided a series of examples demonstrating that900 10 20 3000. (Hz)Amplitude(a)0 2 4 6 8 10 1200. (Hz)Amplitude(b)Figure 5.13: The spectra of the predicted Green’s functions using theinverted model perturbations with the true source wavelet (inblue), with source estimation (in green), versus that of theinput data (wavelet is deconvolved from the data, in red).The spectra is obtained by summing over all traces and isamplitude-normalized. (a) Imaging-with-multiple examples.(b) Imaging-with-primary examples.our method can produce highly accurate estimates of the source wavelet andhigh-resolution seismic images that are comparable to the ones obtained withthe true source wavelet. We also provided examples where we overcome thescaling ambiguity by incorporating surface-related multiples, yielding def-inite amplitudes for the estimates of the source wavelet as well as for theseismic image. In all these examples, we controlled the simulation cost of theinversion to be more or less comparable to a single reverse-time migrationwith all the data, without compromising the quality of the images.91Chapter 6Application to a field dataset6.1 SummaryIn marine seismic acquisition, surface-related multiples constitute a signifi-cant portion of the acquired data. Typically, multiples are removed duringearly-stage data-processing as they can lead to phantom reflectors duringmigration that may result in erroneous geological interpretations. However,if properly dealt with, multiples can provide valuable extra information andcomplement primaries in illuminating the subsurface. In this article, wedemonstrate the limitation of the reverse-time migration in imaging thesemultiples, and present an alternative inversion procedure that is computa-tionally efficient, that jointly maps both primaries and multiples to the truereflectors, and where the source function is estimated on the fly. As a re-sult, we obtain high-quality, mostly artifact-free, broad-band images wherethe imprint of the source-function are partly removed at a computationallyaffordable expense compared to the combined costs of the wave-equationbased surface-related multiple elimination and the reverse-time migration.We achieve all this by including the total downgoing wavefields as arealsources in the least-squares migration in combination with curvelet-domainsparsity promotion. We apply the proposed method to a shallow-watermarine data set from the North sea, which contains abundant short-periodsurface-related multiples, and show its efficacy in eliminating coherent imag-A version of this chapter has been published in The Leading Edge, 2015, vol. 34, no.7, p. 788-79492ing artifacts associated with these multiples. We also demonstrate the ben-efits of joint imaging of primaries and multiples compared to imaging thesesignal components separately.6.2 IntroductionMarine seismic data usually contain strong surface-related multiples, whichbounce more than once between the free surface (i.e., the air-water contact)and subsurface reflectors. Amongst these multiples, waves that bounce in thewater column are often the most pronounced, which can lead to challengingshort-period multiples. Acting on first impulse, surface-related multiplesare typically removed during conventional seismic data processing. Whilethese approaches, such as Surface-Related Multiple Elimination (SRME,Verschuur et al., 1992), have been successful in removing the imprint ofmultiples on the final image, they discard valuable information containedin these multiples whose wave-number contents and apparent aperture areenriched by the fact that multiples undergo multiple bounces. We illustratethis effect in Figure 6.2 . For this reason, we propose to embrace surface-related multiples in our imaging formulation so we can maximally leveragetheir complementary information.The idea of using surface-related multiples by itself is not new. Berkhout(1993) presented with his W−R+W+ model the theoretical foundation forlater attempts to image multiples by modifying the source wavefield, ei-ther using the conventional cross-correlation imaging condition (i.e., RTM,Guitton, 2002; Schuster et al., 2004; Liu et al., 2011; Zuberi and Alkhali-fah, 2013), or the deconvolutional imaging condition to control the coherentimaging artifacts associated with multiples using RTM (Muijs et al., 2007;Whitmore et al., 2010; Lu et al., 2014b). As pointed out by Poole et al.(2010), as well as in Chapter 4, even the latter approach may fail in thepresence of complex geologies, as shown in Muijs et al. (2007), Figure (11b).Recent work by Lin et al. (2010b), Verschuur (2011), and Wong et al. (2012)showed that these artifacts can be largely removed by casting the imaging-with-multiples problem as a proper iterative least-squares migration prob-lem at the expense of expensive, and perhaps prohibitive in 3D, simulationcosts. The main contributions of this work is fourfold, namely (i) we in-corporate surface-related multiples in RTM via wave-equation solves w.r.t.areal sources, and thereby avoid expensive multiple predictions that plagueSRME and Estimation of Primaries by Sparse Inversion (EPSI, van Groen-estijn and Verschuur, 2009a; Lin and Herrmann, 2013); (ii) we estimatethe source function on-the-fly during the iterations making use of multiples93primary illuminated region(a)multiple illuminated region(b)Figure 6.1: Broader illumination coverage of multiples compared withprimaries: an illustration. The model is cropped from the sed-imentary part of the Sigsbee 2B FS (with free-surface) model.(a) and (b): Illustrations of the illuminated sections from theprimaries and the multiples, respectively.94(a)(b)Figure 6.2: Broader illumination coverage of multiples compared withprimaries: numerical examples. (a) and (b): The iterativeinversion results (after 60 iterations) of the primaries and themultiples, respectively.95(Chapter 5), following ideas from variable projection and using the uniquerelation between primaries and multiples and the source (Verschuur et al.,1992); (iii) we remove cross-talk between multiples by promoting curvelet-domain sparsity in the model space; and finally (iv) we reduce the numberof wave-equation solves to roughly the same number of a single RTM usingall shots (Chapter 2).To demonstrate the possible advantages of our joint imaging (i.e., usingboth primaries and multiples, details in Chapter 2), we apply this methodto a North-sea field data set. In particular, we are interested in specific ben-efits of imaging primaries and multiples together instead separately (e.g., inGuitton, 2002). After demonstrating the performance of the joint inversionapproach on a stylized example, we apply it to a real North-sea data set.6.3 Imaging with multiples & source estimationFor a given background-velocity model m0, reverse-time migration (RTM)derives from the following linear relationshipδd = ∇F[m0; s]δm (6.1)between the observed upgoing primary wavefields collected in the vector δd,the unknown source wavefields collected in the vector s, and the unknownmedium perturbation collected in δm. This linear relationship is basedon the single-scattering approximation, embodied in the linearized Bornscattering operator ∇F.During conventional imaging, surface-related multiples are removed fromthe observed data; the source is estimated, and a background-velocity modelis built using (traveltime) tomography or migration-velocity analyses. Inthis paper, we assume the background-velocity model m0 to be known whilewe estimate both the source signature s and the image δm from the totalupgoing wavefield, which contains imprints of both the source signature andthe surface-related multiples. Without loss of generality, we assume spatiallyimpulsive sources with the same time-harmonic source signature si = q(ω)for all ns sources i = 1 · · ·ns with ω the angular frequency.Our formulation for imaging with multiples derives from three importantmodifications to conventional RTM where the source signature s is assumedto be known—i.e.,δmrtm = ∇F∗[m0; s(q)]δd,where the image is obtained by applying the adjoint of ∇F to the de-96multipled data—namely,1. we replace the above spatially impulsive sources by areal sources thatcontains the downgoing data—i.e., we haves˜(q) 7→[s(q)− δd˜]. (6.2)Note that now the observed data δd˜ also contains all upgoing surface-related multiples. Contrary to conventional imaging, the source wave-field is now made up of the downgoing point-source wavefield plus thedowngoing data (i.e., the sign-reversed upgoing data as we approxi-mate the surface reflectivity by −1);2. instead of carrying out migration with the areal source described ear-lier, we solve the curvelet-domain (denoted by the matrix C) sparsity-promoting optimization programminimizeδx ‖δx‖1subject to ‖δd˜−∇F[m0; s˜(q)]C∗δx‖22 ≤ σ2,(6.3)within some user-specified tolerance σ. Similarly to Robust EPSI(REPSI, Lin and Herrmann, 2013), this sparsity-promoting programinverts the surface-related multiples but, instead of estimating the pri-maries, it produces a sparse curvelet representation for the multi-freeleast-squares image δxLS with a give source estimate q. The imageis then readily derived by taking the inverse curvelet transform viaδmLS = C∗δxLS;3. we estimate the source function on-the-fly after the kth iteration of theabove optimization program—i.e,minimizeq ‖δd˜−∇F[m0; s˜(q)]C∗δxk‖22. (6.4)As we will demonstrate, the proposed formulation produces high-resolution, high-fidelity, artifact-free images as well as accurate esti-mates for the source signature. The imprint of the source and surface-related multiples are removed by combining the linearized forwardmodel with SRME. The latter relates the total upgoing wavefield tothe surface-free Green’s function acting on the total downgoing wave-field. Compared to SRME our approach has the advantage of invert-ing, instead of removing, the surface-related multiples and thereby pre-serving multiple-scattering information. In addition, the proposed for-97malism uses wave-equation solvers to carry out the multi-dimensionalconvolutions implicitly (Chapter 2), which can lead to considerablecomputational savings. However, as REPSI, our approach is basedon an iterative inversion procedure that requires multiple linearizedBorn scatterings and migrations, which can be prohibitively expensivefor large models. Before providing details how to reduce the compu-tational cost, let us first describe common difficulties one encounterswhen imaging with surface-related multiples.6.4 Challenges during imaging with multiples6.4.1 Data preprocessingAs we mentioned earlier, our formulation derives from the SRME relation,which requires dense fixed-spread sampling with co-located sources and re-ceivers. For marine towed-streamer acquisition, we fill in the missing halfof data using reciprocity, and interpolate the missing (cross-line/near-offset)traces as in standard SRME. To obtain a reasonable estimate for this to-tal upgoing wavefield, the direct waves and the downgoing wavefield at thesurface (i.e., receiver ghosts) need to be removed, and the upgoing wave-field needs to be extrapolated from the receiver level to the free surface(Verschuur et al., 1992). The source-side ghosts, on the other hand, shouldbe left intact so that we mimic a dipole source as required by the theory(Verschuur et al., 1992).6.4.2 The need to invertContrary to previous attempts to mitigate the imprint of surface-relatedmultiples, the proposed formulation calls, as in REPSI, for a sparsity-promotinginversion of the linear forward modelling operator that ties perturbations inthe velocity model to data that contain surface-related multiples. Failureto invert this operator leads to images that suffer from the imprint of thesource and, perhaps more importantly, from artifacts related to coherentinterferences between the areal sources and the observed data due to thepresence of multiples.To explain the physical origin of these artifacts, we included in Figure 6.3a cartoon of the ray paths of primaries associated with conventional RTM(Figure 6.3(a)) and RTM with areal sources (Figure 6.3(b)). In this plots,the star indicates the seismic source and the triangles indicate the receivers.98Moreover, the solid arrows depict the ray paths of the primaries P0, and the1st to 3rd-order surface-related multiples M1, · · · ,M3. We use solid circlesto delineate imaging points that correspond to true reflectors, and dashedcircles to delineate imaging points that produce phantom reflectors. As wecan see from the illustration in Figure 6.3(a), conventional RTM produceserroneous events that do not correspond to physical reflectors when the in-put data contains surface-related multiples. Only primaries contribute toimaging the true reflectors. (We ignore internal multiples for now.) Con-versely, when we include the total down-going wavefield as an areal source,the RTM image contains a mix of proper and erroneous contributions frominteractions between multiples present in the data and the areal sources.Neighboring — e.g., P0 and M1, and M1 and M2, etc. — pairs of multiplesimage to physical reflectors while the others create coherent interferences,which are similar to artifacts when carrying out conventional migration ondata that contain surface-related multiples.To illustrate interferences between surface-related multiples on the sourceand receiver sides, we include a stylized synthetic example consisting of ahomogeneous water body, which also serves as the background model, anda single shallow perturbation as depicted in Figure 6.4(a) . In this model,we generate linearized primary-only data, by applying equation 6.1 withspatially impulsive sources, and data with surface-related multiples by in-jecting the areal sources given by equation 6.2 . The output of conventionalRTM with the primary-only data is of reasonable quality, as shown in Fig-ure 6.4(b), demonstrating RTM’s ability to reproduce subsurface structuresfrom multiple-free data. Unfortunately, the same observation does not ap-ply when we image data including surface-related multiples (juxtapose Fig-ures 6.4(b) and 6.4(c)). In that case, the image contains erroneous multiple-related artifacts. Unfortunately, including the total downgoing wavefield asareal sources during RTM is insufficient as we can see the result still con-tains interferences that can lead to wrong interpretations (juxtapose Fig-ures 6.4(b) and 6.4(d)).As we will demonstrate below, these artifacts can be removed by properlyinverting the multiple-prediction operator, now disguised as areal sources inthe Born scattering operator, with curvelet-domain sparsity promotion asoutlined above.6.4.3 Imaging & source estimation with multiplesAs the stylized example in the previous section clearly illustrates, usingareal sources (in equation 6.2) alone is obviously inadequate to image the99Free$surfaceTrue$reflectorPhantom$reflectorsP0 M1 M2 M3(a)Free$surfaceTrue$reflectorPhantom$reflectorsP0 M1 M2 M3(b)Figure 6.3: Illustration of the imaging artifacts stemming fromsurface-related multiples. In both figures, the input data con-tain primaries and surface-related multiples. (a) ConventionalRTM with spatially impulsive sources. (b) RTM with arealsources.100Lateral distance (m)Depth (m)0 500 1000 1500 200002004006008001000(a)Lateral distance (m)Depth (m)0 500 1000 1500 200002004006008001000(b)Lateral distance (m)Depth (m)0 500 1000 1500 200002004006008001000(c)Lateral distance (m)Depth (m)0 500 1000 1500 200002004006008001000(d)Figure 6.4: True model and the migrated images. (a) The true modelperturbation. (b) Conventional RTM with spatially impulsivesources and multiple-free data. (c) The same as (b) but nowfor input data with surface-related multiples. (d) The same as(c) but now with areal sources given in equation 6.2 .surface-related multiples. Furthermore, RTM also needs prior knowledge ofthe temporal source signature of the spatially impulsive sources, which istypically unknown unfortunately. However, when we invert the linearizedrelationship with sparsity promotion, we obtain far superior results for theabove example as shown in Figure 6.5 where we either know the sourcesignature beforehand, or we estimate this source signature on the fly.We can make the following observations: (i) sparsity-promoting inver-sion of data with multiples, but without using the areal sources, leads toa high resolution image that contains artifacts related to the multiples (seeFigure 6.5(a)); (ii) when provided with the correct source wavelet, carryingout the inversion with areal sources leads to a high-resolution artifact-freeimage (see Figure 6.5(b)); (iii) the same applies when we invert for thewavelet on the fly (see Figure 6.5(c) ); (iv) source estimation produces ac-curate estimates for the source signature (see Figure 6.5(d)).The above observations are truly remarkable because we have, follow-101Lateral distance (m)Depth (m)0 500 1000 1500 200002004006008001000(a)Lateral distance (m)Depth (m)0 500 1000 1500 200002004006008001000(b)Lateral distance (m)Depth (m)0 500 1000 1500 200002004006008001000(c)0 10 20 30 40 5000. (Hz)Amplitude(d)Figure 6.5: Migrated images using the proposed method. (a) Imag-ing of total data but only with primary-imaging operator. (b)and (c) Joint imaging of primaries and multiples, with the truesource (b) and with source estimation (c). (d) Source esti-mates (in green) versus the true source function (in blue) at thefrequencies selected in the last iteration.ing the procedure outlined above, been able to solve three problems in onego, namely (i) we successfully jointly map primaries and surface-relatedmultiples to true reflectors so we maximally benefit from the higher Signal-to-Noise Ratio (SNR) of the primaries and the extra illumination of themultiples; (ii) we estimate the unknown source function accurately whileremoving its imprint; and (iii) we properly scale the contributions from pri-maries and multiples that both contribute to the image. While the resultsfor this stylized example are certainly encouraging, we will need to firstremedy the computational costs associated with this procedure, which re-lies on many iterations involving computationally expensive migrations anddemigrations.1026.4.4 Fast randomized iterative inversionLeast-squares migrations, especially with sparsity-promoting, are in mostcases computationally prohibitively expensive because they rely on expen-sive iterations, the simulation cost of which scales linearly with the numberof (monochromatic) source experiments. To reduce these costs, we bor-row ideas from Herrmann and Li (2012) and limit the total number ofwave-equation solves by working with randomized subsets of frequenciesand source experiments during each iteration. As described in Chapter 2,this type of randomized procedure—where we regularly draw new random-ized experiments for each subproblem to expedite the convergence—leadsto significant cost savings bringing the required number of wave-equationsolves of our iterative procedure down to roughly equal a single RTM withall sources. Remember that by virtue of using areal sources, our imagingprocedure also avoids expensive iterative multi-dimensional convolutionaloperations required by SRME and EPSI. Instead, these expensive opera-tions are carried out by the wave-equation solver (Chapter 2), leading towhat we believe a computationally viable approach that can be applied to2D field data sets as shown below, and in the future, to 3D field data.6.5 North-sea case studyTo evaluate the performance of the proposed method, we consider a 2D rel-atively shallow (90 m water depth) marine data set from the North Sea,acquired by PGS with dual-sensor streamers. Because this type of acquisi-tion gives us access to both the pressure and the vertical particle-velocitywavefields, this data set is ideal for us to accurately separate the observedwavefield into its up- and down-going constituents. To get the best results,we apply the following preprocessing steps: near-offset data interpolation us-ing the Radon transform; up-down wavefield decomposition to remove thereceiver ghost and removal of the direct waves. These processing steps, incombination with application of source-receiver reciprocity to mimic a fixedspread acquisition, gives us an estimate for the downgoing wavefield with401 co-located sources and receivers with 12.5 m grid spacing that serves asinput to our imaging algorithm. Only the first 4092 ms of the data record areimaged up to 48 Hz. We derive the 3×5 km background-velocity model with6.25 m grid spacing from the stacking velocity. To assess the quality of ourimaging results, we also ran REPSI to get independent estimates for the pri-maries and surface-related multiples. The least-squares (LS) image (we willuse this term for images obtained by solving the above sparsity-promoting103program) from the primary-only data is included in Figure 6.6(a) for refer-ence. The image of the total upgoing data using the proposed method isshown in Figure 6.6(b) . For both images, we estimate the source wavelet onthe fly. On the computational side, both images are obtained with roughlythe same wave-equation solves as a single RTM with all the data (in thejoint-imaging case, the source estimation step necessitates some extra datamodelling that results in 50% more simulation cost, we in return reduce thenumber of iterations by one-third for Figure 6.6(b)).6.5.1 Joint versus separate inversionsWith the proposed method, we can either image primaries and multiplesseparately, or image the total upgoing wavefields in one step via joint inver-sion with on-the-fly source estimation. Compared to imaging with primariesand multiples separately, this joint approach has the advantage that the im-ages from these two components are properly scaled as we optimize over thesource wavelet to fit both primaries and multiples. As a result, we avoidthe source-scaling ambiguity, which hampers imaging with primaries only(Chapter 5). In addition, we produce one combined imaging result thatmaximally uses the information residing in both primaries and multiples.To substantiate this claim, we conduct a series of experiments. First,we compare images from primaries only (Figure 6.7(a)); from multiples only(Figure 6.7(b)), and from primaries and multiples jointly (Figure 6.8). Allresults are obtained with our sparsity-promoting inversion method usingtheir respective source wavefields (see Discussions in Chapter 2). When wecompare the images obtained from primaries and multiples only, we findthat in general the image of the primaries alone is, as expected, cleaner. Forexample, the structures indicated by the red arrows are better resolved withthe primaries, and so are the two sections marked inside the blue rectan-gles. However, while scaled differently, the image from the multiples alonereveals most of the major reflectors (cf. Figures 6.7(a) and 6.7(b)) and evenleads to an improved image for the reflector indicated by the grey arrows inFigure 6.7(b) .Comparing our results obtained by inverting primaries- and multiples-only data with the joint inversion suggests that inverting the total data maylead to superior results that benefit from the relatively noise-free primariesand extra illumination from the multiples. As the results in Figure 6.8 in-dicate, our joint inversion strikes a good compromise and produces a singleimage that combines the benefits of both primaries and multiples — com-pared with Figure 6.7(a), the structures indicated by the grey arrows are104Lateral distance (m)Depth (m)0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000050010001500200025003000(a)Lateral distance (m)Depth (m)0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000050010001500200025003000(b)Figure 6.6: LS images of (a) primaries only, and (b) total upgoingdata.105Lateral distance (m)Depth (m)0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000050010001500200025003000(a)Lateral distance (m)Depth (m)0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000050010001500200025003000(b)Figure 6.7: Separate images of primary-only data (a) and multiple-only data (b).better imaged in Figure 6.8; and compared with Figure 6.7(b), the structuresindicated by the red and blue arrows are better imaged in Figure 6.8.106Lateral distance (m)Depth (m)0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000050010001500200025003000Figure 6.8: Joint imaging of total data consisting of primaries andmultiples.6.5.2 Suppressing imaging artifacts from multiplesTo further substantiate our claim that the proposed inversion scheme canremove the coherent artifacts associated with multiples, we study the detri-mental effects that surface-related multiples can have when handled incorrectly—i.e., when included but not inverted. For this purpose, we again draw yourattention to imaging with multiples only by comparing the result of a sin-gle RTM (Figure 6.9(a)) with the result obtained with sparsity-promotinginversion (Figure 6.9(b)). In both cases, we use areal sources defined inequation 6.2 . Even though using areal sources leads to true reflectors suchas the sea floor (c.f. the solid circles in Figure 6.3(b)), it also result ininterferences (c.f. the dashed circles in Figure 6.3(b)), which rings acrossthe section (Figure 6.9(a)). The presence of these artifacts exposes a fun-damental shortcoming of RTM’s correlation-based imaging condition, whichwe overcome largely by sparsity-promoting inversion. As we can be readilyobserve from Figure 6.9(b) and the zoomed in sections A and B included inFigure 6.10(a) and 6.10(b), the strong artifacts associated with the multi-ples in Figure 6.9(a) are mostly suppressed in Figure 6.9(b) . These resultsdemonstrate that the energy of surface-related multiples can be mapped totrue reflectors via a sparsity-promoting inversion procedure using the down-going receiver data as areal sources. The results also support our claim thatinverting primaries and multiples jointly allows us to benefit from the mul-tiples without the need to separate primaries and multiples upfront, which107Lateral distance (m)Depth (m)0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000050010001500200025003000AB(a)Lateral distance (m)Depth (m)0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000050010001500200025003000AB(b)Figure 6.9: (a) RTM of multiples only. (b) LS image of multiplesonly. Both with the downgoing receiver data acting as arealsources.is an computationally expensive proposition.108Lateral distance (m)1000 1200 1400 1600 18001002003004005006007008009001000Lateral distance (m)1000 1200 1400 1600 18001002003004005006007008009001000(a)Depth (m)600 800 1000 1200 1400 1600 1800 20002300240025002600Lateral distance (m)Depth (m)600 800 1000 1200 1400 1600 1800 20002300240025002600(b)Figure 6.10: Zoomed-in comparisons of sections A and B between Fig-ure 6.9(a) and Figure 6.9(b).1096.6 ConclusionBecause surface-related multiples bounce more than once between the freesurface and the subsurface reflectors, they generate coherent artifacts whennot treated properly during imaging—even in situations where conventionalsources are replaced by areal sources that include the downgoing data. How-ever, if we combine these areal sources with a proper sparsity-promotinginversion procedure—during which the deconvolved image and source signa-ture are estimated—we arrive at a formulation that maximally leverages pri-mary and multiple energy via a joint-inversion procedure. During this inver-sion, the adverse coherent artifacts arising from multiples are turned into co-herent images, and we can benefit from the multiples that are better sampledthan primaries as their bounce points act as secondary sources. Aside frommapping multiple-related artifacts to the correct reflectors, our method alsoavoids the expensive combination of wave-equation based surface-relatedmultiple elimination and subsequent reverse-time migration of primariesand/or multiples. Instead, our method proposes to image the primariesand multiples jointly during an iterative sparsity-promoting inversion pro-cedure. By adaptively estimating the source wavelet, we achieve an optimalbalance between the contributions from primaries and multiples. By us-ing randomized subsampling techniques, we limit the number of requiredwave-equation solves to roughly equal a single reverse-time migration withall data. Since the modelling of the multiple is done by the wave-equationsolver, no explicit multiple prediction and removal are needed. Applicationof our inversion framework on a real marine field dataset clearly demon-strates the advantage of inverting primaries and multiples jointly comparedto estimating and imaging these wave constituents separately.110Chapter 7Future work: sparse seismicimaging simplified7.1 SummaryWe present a novel adaptation of a recently developed relatively simple itera-tive algorithm to solve large-scale sparsity-promoting optimization problems.Our algorithm is particularly suitable to large-scale geophysical inversionproblems, such as sparse least-squares reverse-time migration or Kirchoffmigration since it allows for a tradeoff between parallel computations, mem-ory allocation, and turnaround times, by working on subsets of the data withdifferent sizes. Comparison of the proposed method for sparse least-squaresimaging shows a performance that rivals and even exceeds the performanceof state-of-the-art one-norm solvers that are able to carry out least-squaresmigration at the cost of a single migration with all data.7.2 MotivationWave-equation based migration corresponds to computingδm =K∑i=1∇F∗i (m0,qi)δdi, (7.1)A version of this chapter has been published in the proceedings of the 77th EAGEConference & Exhibition 2015, Madrid, Spain111where the sum runs over the number of experiments, indexed by i = 1 · · ·K.For seismic applications, this number K is typically very large. Theseexperiments—with sources qi and corresponding data δdi—involve eitherindividual traces for different source/receiver pairs as in Kirchoff migrationor (monochromatic) shot records as in (time-harmonic) reverse-time migra-tion. In either case, the computational costs, including data-storage andmovement, grows with K and the size of the survey area. To make mattersworse, migration often relies on increasing the fold, to stack out unwantedartifacts in the image space, or on carrying out least-squares migration. Ineither case, the aforementioned increased costs strain even the largest com-putational resources stifling the adaptation of sparsity-promoting inversiontechniques.To overcome these costs, which are exacerbated by rapidly increasingcosts of RTM as the frequency increases, we proposed randomized algo-rithms based on certain subsampling techniques that work on randomizedsubsets of the sources and corresponding data. These techniques have ei-ther lead to so-called batching techniques (van Leeuwen and Herrmann,2012), where the above sum is approximated by carrying out the sum oversubsets of randomly selected sources (or simultaneous sources), or to tech-niques based on Compressive Sensing (Herrmann and Li, 2012). Contrary tobatching techniques, where the error is controlled by increasing the size of thesubsets, sampling related errors during Compressive Imaging are controlledby promoting curvelet-domain sparsity. By incorporating rerandomizationtechniques from stochastic optimization into sparsity-promoting imaging, wearrived at an imaging technology that is able to carry out sparsity-promotingleast-squares migration working on small subsets at the cumulative costs ofroughly one RTM carried out over all data (Herrmann and Li, 2012, andChapter 2).While our Compressive Imaging technology is certainly promising, itrelies on the combination of a sophisticated `1-norm solver (SPG`1, vanden Berg and Friedlander, 2008), rerandomizations, and warm starts. Thismakes the resulting algorithm complicated, and therefore difficult to imple-ment on industry-scale problems. In addition, there is, aside from empiricalobservations, no formal proof what optimization problem this heuristic ran-domized approach actually solves. Below, we present a new highly flexiblesparse inversion methodology that derives from a recently proposed sparseKaczmarz solver for Compressive Sensing problems using linearized Breg-man iterations (Lorenz et al., 2014). The resulting algorithm is relativelysimple and suitable for seimic imaging problems where measurements—read application of rows/row-blocks of the linearized modelling operator and112therefore terms in the sum of Equation 7.1—take time and parallel resourcesto evaluate.We organize our work as follows. First, we briefly state a simplifiedversion of the algorithm undergirding our Compressive Imaging technologyand juxtapose it with the proposed algorithm. Next, we compare the per-formance of the proposed algorithm on seismic imaging problems, followedby conclusions.7.3 Fast sparse solversWe are interested in solving the following (partly) separable—i.e., we haveaccess to individual rows as in trace-based migration or blocks of rows asin shot migration—linear inverse problems, where we aim to estimate theimage x from measurements b ∈ RM given by b = Ax, with x sparse orcompressible, and A ∈ RM×N tall—i.e., M  N . This overdeterminedproblem is different from Compressive Sensing where M  N and couldin principle be solved easily by inverting A with least-squares or by simplyapplying the adjoint. Unfortunately, since we are dealing with problemswhere there are many rows of A that are expensive to evaluate, we cannot follow this approach and we have to resort to working on randomizedpairs {Ai,bi}, made of randomly selected subsets of rows only—i.e., we onlyhave access to Ai ∈ Rm×N and bi ∈ Rm with m  N  M . Our aim isnow to come up with an algorithm where the total amount of work, readthe total number of rows participating in the inversion, is roughly the sameas the total number of rows of the total system , or terms in the sum ofEquation (7.1)—i.e., L = m× nk ∼ M with nk the number of iterations ofthe solver.7.3.1 Solution by Randomized IterativeShrinkage-Thresholding Algorithm (RISTA)While many algorithms have been developed to solve the convex Basis Pur-suit problem for underdetermined A’s with m NBP : minimizex‖x‖1 subject to Ax = b,its solution often requires too many iterations, and therefore evaluations ofA and its adjoint A∗, to make this a viable option for industrial problems,where the matrices are either very large or expensive to evaluate or both.To overcome this problem, the authors proposed a method to accelerate this113type of algorithms using ideas from approximate message passing. The keyidea of that approach was to “break” buildup of correlations between themodel iterates, xk, and the matrix A by drawing randomized pairs {Ak,bk}for each iteration k. This results in Algorithm 3, a randomized version of theIterative Shrinkage-Thresholding Algorithm (ISTA,Yin et al. (2008);Beckand Teboulle (2009)). For a fixed pair {Ak,bk} and appropriately chosenstep length tk and λ → 0+ (read as λ goes to 0 from above), Algorithm 3provably solves BP. While RISTA type algorithms—where techniques fromstochastic gradients are combined with sparsity promotion—lead to fast andtherefore practical algorithms, their performance hinges on choices for thethreshold. Not only do small λ’s lead to slow convergence, there is alsono general proof that Algorithm 3 solves BP when choosing different pairs{Ak,bk} for each iteration. So the question is whether there exists an alter-native iterative solver that makes few assumptions on A and that is capableof working with small randomized subsets of rows and that is relativelysimple to implement.Algorithm 3 Randomized Iterative Shrinkage-Thresholding Algorithm(RISTA) for appropriately chosen step lengths tk’s and threshold λ for thesoft thresholding with Sλ(x) = max(|x| − λ, 0).Result: Estimate for the model xk1. for k = 0, 1, · · · do2. Draw Ak and bk3. zk+1 = xk − tkA∗k(Akxk − bk)4. xk+1 = Sλ(zk+1)5. end for7.3.2 Solution by Randomized IterativeShrinkage-thresholding Kaczmarz Algorithm withlinearized Bregman (RISKA)Consider the following slight modification—i.e. we replace line 3 of Algo-rithm 3 by zk+1 = zk− tkA∗k(Akxk−bk). We call this Solution by Random-ized Shrinkage-Thresholding Kaczmarz Algorithm with linearized Bregman(RISKA,Yin et al. (2008); Maharramov and Biondi (2014)). According torecent work by Lorenz et al. (2014), this modified algorithm solves for step114lengths, tk =‖Akxk−bk‖2‖A∗k(Akxk−bk‖2 ,KB : minimizexλ‖x‖1 +12‖x‖22 subject to Ax = b.For large enough λ (van Leeuwen and Herrmann, 2013), RISKA provablyconverges to the solution of BP. While the modification w.r.t. RISTA isat first sight relatively minor, its implications are profound and manifold.First, compared to RISTA the inversion involves two iterates, namely zk+1,which seeks to minimize the `2-norm of the residual, and sparse model iter-ates xk obtained by soft thresholding zk+1 with the threshold λ. Contraryto RISTA, updates for the zk’s are calculated by subtracting gradients com-puted with respect to the sparse model iterate xk. Third, the role of thethreshold λ fundamentally changes. Within RISTA, the λ determines theprojection of the model iterates on a λ dependent `1 ball. For a fixed λthe iterations without drawing new pairs {Ak,bk} solve the unconstrainedproblem minx 12‖Ax−b‖+λ‖x‖1. By taking the limit λ→ 0+ these uncon-strained problems solve BP. Unfortunately, small λ lead to slow convergence(Hennenfent et al., 2008). For large-scale problems this makes it necessaryto devise ingenious cooling techniques where the `1-norm constraint is grad-ually relaxed, e.g., by decreasing the λ’s as the algorithm progresses. Thisprocess allows components to enter into the solution slowly while bringingdown the residual. Thresholding within RISKA, on the other hand, playsa completely different role since it balances `1- and `2-norms on the model.This means in a way that RISKA is “auto tuning”—i.e., when we chooseλ too high, the xk will be zeros but not the zk’s. As k increases, the largeentries—these are the first candidates to enter the solution—in zk will growwhile the noisy crosstalk will become smaller if we chose to work with ran-domized {Ak,bk}’s. After sufficient iterations, the number of which can bereduced by using intelligent step lengths, components will enter the solu-tion. A too large λ for RISTA, on the other hand, would lead to zeros andwould require a lowering of λ. Finally, RISKA solves a constrained problem,i.e., the data is fitted exactly (this can be relaxed to different norms errorballs (Lorenz et al., 2014)). This has the advantage that step lengths can belarger leading to a very fast but above all extremely simple algorithm thatlends itself well to (i) work on (randomized) subsets of the data. Depend-ing on the available computational resources, this gives us the freedom tochoose, for a fixed number of passes through the rows L, an optimal batchsize m by trading of numbers of iterations and numbers of available nodespartaking in the inversion. For RTM this means that we can control the115number of PDE solves while for trace-based migration this would limit thetotal number of source-receiver pairs touched by migration; (ii) implementRISKA in existing industrial settings as long as one has access to random-ized subsets of data bk, the Born modelling Ak, and its adjoint migrationA∗k. For RTM this means randomly selected shots or simultaneous shotswhile for Kirchhoff migration one should have access to randomized traces;(iii) select data and rows adaptively, e.g., according to the norm of the rowsof A or condition number of block of rows as described in the literature(Strohmer and Vershynin, 2009; Needell and Tropp, 2014).7.4 ExamplesTo illustrate the performance of RISKA, we compare sparsity-promotinglinearized imaging results for primaries only and data including primariesand surface-related multiples obtained with our randomized SPG`1 solverand with the above described algorithm. In all examples, we fix the numberof wave-equation solves to be equivalent to a single RTM with all data. Theresults included in Figure 7.1 show improvements, especially below the Salt(juxtapose Figure 7.1(a) and 7.1(b)). These results are remarkable becausewe are able to obtain high-fidelity inversion results at the cost of a singleRTM. An advantage of RISKA is that the underlying algorithm is simple toimplement and relatively robust w.r.t. choices of λ.For the same budget (cost of 1 RTM), the proposed algorithm also pro-duces improved results for an imaging example that includes surface-relatedmultiples. In this case, we invert the linearized Born scattering operatorwith an areal source wavefield that contains the downgoing receiver wave-field (Chapter 2). Again, the results in Figure 7.2 are excellent despite therelative simplicity of the algorithm.7.5 ConclusionWe presented a simplified randomized sparsity-promoting solver that prov-ably converges to a mixed one- two-norm penalized solution while workingon arbitrarily small subsets of data. This latter ability, in combination withits simplicity and “auto-tuning”, makes this algorithm particular suitablefor large-scale and parallel industrial applications.116Lateral distance (m)Depth (m)0 5000 10000 150000100020003000(a)Lateral distance (m)Depth (m)0 5000 10000 150000100020003000(b)Figure 7.1: Comparison of randomized sparse inversions on primaryonly data using SPG`1 (a) and RISKA (b). It is clear that forthe same computational budget, the inversion result for RISKAis less noisy and recovers the reflectors below the Salt better.117Lateral distance (m)Depth (m)0 1000 2000 3000 4000 50000500100015002000250030003500(a)Lateral distance (m)Depth (m)0 1000 2000 3000 4000 50000500100015002000250030003500(b)Figure 7.2: Comparison of randomized sparse inversions on data withsurface-related multiples using SPG`1 (a) and RISKA (b).Again, the much simpler RISKA does an excellent job comparedto SPG`1.118Chapter 8Further extension:least-squares extendedimaging with surface-relatedmultiples8.1 SummaryCommon image gathers are used in building velocity models, inverting aniso-tropy parameters, and analyzing reservoir attributes. Often primary reflec-tions are used to form image gathers and multiples are typically attenuatedin processing to remove strong coherent artifacts generated by multiplesthat interfere with the imaged reflectors. However, researchers have shownthat, if correctly used, multiples can actually provide extra illumination ofthe subsurface in seismic imaging, especially for delineating the near-surfacefeatures. In this work, we borrow ideas from literatures on imaging withsurface-related multiples, and apply these ideas to extended imaging. Thisway we save the massive computation cost in separating multiples from thedata before using them during the formation of image gathers. Also, we mit-igate the strong coherent artifacts generated by multiples which can sendthe migration velocity analysis type algorithms in wrong direction. Syn-thetic examples on a four-layer model show the efficacy of the proposedformulation.A version of this chapter has been published in the proceedings of CSEG GeoConven-tion 2015, Calgary, Canada1198.2 IntroductionAn extended image is a multi-dimensional correlation of source and receiverwavefields, as a function of all subsurface offsets (see Sava and Vasconcelos,2011, and references therein for a recent overview). There are many differentapplications in which extended images are used extensively like constructionof angle-domain common-image gathers (ADCIGs) and migration velocityanalysis (MVA) (Biondi and Symes, 2004; Shen and Symes, 2008; Symes,2008; Sava and Vasconcelos, 2011; Kumar et al., 2013, 2014; Wong et al.,2014). Other than that, extended images are used to study the rock proper-ties and fluid indicators by estimating the amplitudes of reflected waves asa function of incident angle at the interface. One of the current limitationsis that the extended imaging conditions cannot handle multiple reflections.However, Lu et al. (2014a) showed that primaries may not provide enoughillumination to image near-surface targets. To overcome this situation, Luet al. (2014a) proposed to use multiples where they apply the same imagingcondition to multiples by putting in the primaries as the source wavefield andthe multiples as the receiver wavefield. However, efforts during seismic dataprocessing to extract multiple wavefields are always challenging, computa-tionally expensive and can risk damage to the underlying signal. Therefore,to mitigate these impediments and to get benefit from the multiples, we fol-low earlier work on joint imaging of primaries and surface-related multiples(Chapter 2 and Chapter 5) and adapt it to extended imaging which poten-tially leads to better images. The idea is to combine EPSI (Estimation ofPrimaries via Sparse Inversion) (van Groenestijn and Verschuur, 2009a) withmigration as a way to benefit from surface-related multiples. Following thesame ideas, we show that we can exploit the EPSI relationship while formingthe image gathers. As a result, we need to move to a least-squares extendedimaging rather than a simple correlation-based imaging condition. The pro-posed way of imaging the surface-related multiples along with primaries donot increase the dominant computational cost, which is the solution of thewave-equation. Finally, we show the efficacy of the proposed formulation ona four-layer synthetic velocity model.8.3 Extended imaging with free-surface multiplesGiven the time-harmonic source and receiver wavefields as matrices U andV, extended images at a single frequency, for all subsurface offsets andfor all subsurface points can be written as the outer product of these two120matrices, i.e., we haveE = VU∗, (8.1)where U, V are calculated using the two-way wave-equation and each col-umn of U, V corresponds to a source experiment. The explicit expressionof E can be written asE = H−∗PTr DQ∗PsH−∗, (8.2)where H is a discretization of the Helmholtz operator (ω2m +∇2) where mis the squared slowness, matrix Q represents the source function, ∗ repre-sents the conjugate-transpose, D is the primary reflection data matrix, andmatrices Ps and Pr sample the wavefield at the source and receiver positions(and hence, their transpose injects the sources and receivers into the grid).Note here that H−1 is a PDE solve and constitutes the main computationalcost in equation 8.2. In order to incorporate multiples along with primary,we follow the SRME (surface-related multiple estimation) formulation pro-posed by Verschuur et al. (1992) where the total up- and down-going pressurewavefields, the Green’s function and the source are related as followsP = G(Q + RP), (8.3)where G represents the Green’s function, P is the total up-going wavefield,RP represents the down-going receiver wavefield at the surface that acts asa generalized areal-source wavefield for the surface-related multiples, and Qis the down-going point-source wavefield. In this article we assume that thesurface reflectivity can be approximated by −1. Note that each quantity inthe above expression represents monochromatic variables. Therefore, we canreplace the primary reflections data matrix in equation 8.2 with the totalup-going data matrix P and redefine the extended image E asE = H−∗PTr P(Q−P)∗PsH−∗. (8.4)In 2D, the extended image is a 5-dimensional function of all subsurfaceoffsets and temporal shifts. So even in this case, it is prohibitively expensiveto compute and store the extended image for all the subsurface points. Toovercome this problem, we select l columns from E implicitly by multiplyingthis matrix with the tall matrix W = [w1, . . . ,wl] yielding:E˜ = EW, (8.5)where wi = [0, . . . , 0, 1, 0, . . . , 0] represents a single scattering point with121the location of 1 corresponding to the ith grid location of a point scatterer.Each column of E˜ represents a common-image point gather (CIPs) at thelocations represented by wi. We follow van Leeuwen and Herrmann (2012b)to efficiently compute E˜ by combining equations (4) and (5) asE˜ = EW = H−∗PTr P(Q−P)∗PsH−∗W. (8.6)As we can see the computational cost of calculating E˜ is 2l PDE solvesplus the cost of correlating the areal-source and the data matrices. Thus,the cost of computing the CIPs does not depend on the number of sources orthe number of subsurface offsets, as it does in the conventional methods forcomputing image gathers (Sava and Vasconcelos, 2011). This is particularlybeneficial when we are interested in computing only a few CIPs. Here,Equation 8.6 defines an extended migration operator that maps the totalup-going data matrix to an extended image. The corresponding extendeddemigration operator, F , is defined asF(E˜) = PrH−1E˜((Q−P)∗PsH−∗W)∗, (8.7)where F is a linear operator w.r.t the sources. To compute reliable ampli-tudes for extended image, we estimate the least-squares extended image bysolvingminimizeE˜12‖P−F(E˜)‖2F , (8.8)where ‖.‖2F is the Frobenius norm of the matrix (sum of the squaredentries). In the above expression we assume that Q is known, however, inthe proposed framework Q can be estimated as well (Aravkin et al., 2013b).8.4 ResultsTo test the proposed algorithm for computing common-image gathers, weuse a synthetic four-layer velocity model (with a grid sampling of 10 m) asshown in Figure 8.1(a). We place the sources and receivers on the surfacewith a spatial interval of 10 m. We use a finite-difference frequency-domaincode to generate the synthetic data sets. The source signature is a Rickerwavelet with a peak frequency of 10 Hz. Figure 8.1(b) shows the backgroundvelocity model used to obtain least-squares reverse-time migration (LSRTM)images and common-image gathers (CIGs). We use 15 LSQR iterationsto obtain the LSRTM images and the CIGs. We construct common-imagegathers at x = 1000 m for all z. Figures 8.2(a) and 8.3(a) show the migrated122image and CIGs when we only use primary reflection data and Q as thesource function. As expected, the LSRTM image and the image gathers arefully focused. Next, we use the total up-going wavefield P as input, but stilluse Q as the source function. Since the multiple reflections are not properlydealt with, we can see the focused ghost event generated by the multiplewavefields in Figures 8.2(b) and 8.3(b). As mentioned before, the presenceof such ghost events can mislead the velocity analysis process. Finally, weobtain the image and the image gathers using the total up-going wavefieldP along with the areal source function Q−P. We can see in Figures 8.2(c)and 8.3(c) that incorporating the areal sources in the LSRTM and extendedimages help mitigate the ghost reflector.8.5 ConclusionWe have proposed a methodology for forming extended image gathers fromdata containing both primaries and multiples. The approach is based onearlier developments in simultaneous least-squares imaging of primaries andmultiples, and matrix-probing methods to compute the extended image ina computationally efficient manner. The result is a method that creates atrue-amplitude extended image gather directly from data containing surfacerelated multiples, at a computational cost that is roughly equivalent to aleast-squares migration. Numerical results on a simple synthetic model arepromising and encourage further research.123x (m)z (m)  0 500 1000 1500 200002004006008001000 1700180019002000210022002300(a)x (m)z (m)  0 500 1000 1500 200002004006008001000 1700180019002000210022002300(b)Figure 8.1: (a)True velocity model. (b) Background velocity modelused in extended imaging.124z (m)x (m)0 500 1000 1500 200002004006008001000(a)z (m)x (m)0 500 1000 1500 200002004006008001000(b)z (m)x (m)0 500 1000 1500 200002004006008001000(c)Figure 8.2: Least-squares reverse-time migration. (a) Primary dataonly. (b) Total up-going data is used but areal source is notused. We can clearly see the ghost reflector when we do not usethe areal source. (c) Total up-going data is used along with theareal source. Incorporation of areal source removes the ghostreflector.125z (m)x − xo (m)−400 −200 0 200 40002004006008001000(a)z (m)x − xo (m)−400 −200 0 200 40002004006008001000(b)z (m)x − xo (m)−400 −200 0 200 40002004006008001000(c)Figure 8.3: Least-squares common-image gather extracted along x =1000m and for all z. (a) Primary data only. (b) Total up-goingdata is used but areal source is not used. (c) Total up-goingdata is used along with the areal source. We can clearly see theeffect of ghost reflectors (in the middle) which can be removed(on the right) via incorporating the areal source.126Chapter 9ConclusionIn response of the objectives we aim to achieve through this research, whichwe detailed in the introduction chapter, we draw the following conclusions:1. When used to image surface-related multiples, direct application ofthe cross-correlation imaging condition, i.e., reverse-time migrationof these multiples, produces strong coherent artifacts associated withthese multiples. Using the deconvolutional imaging condition can sup-press these artifacts to some extent, but its performance is compro-mised in the presence of complex geologies. Through the presentedwork, we have observed that using an iterative inversion approach,which maps the surface-related multiples to the true “primary-image”,can significantly suppress these artifacts.2. Iterative inversions are prohibitively expensive in computation, as eachiteration involves several wave-equation solves for each monochromaticsource experiment. The incorporation of surface-related multiples ex-acerbates the issue as one also need to predict these multiples in eachiteration, for example via the surface-related-multiple-elimination typeof data convolutions (Verschuur et al., 1992), which are also computa-tionally expensive. In this work, we address the two issues by (i) usingwave-equation solvers to implicitly carry out the multiple predictionand therefore save most of the cost of performing data convolutions;and (ii) reducing the number of wave-equation solves by subsamplingthe monochromatic source experiments. As a result, we can obtainan inverted image with roughly the same number of wave-equationsolves as in one reverse-time migration with all the data. We effec-tively remove the cross-talk artifacts introduced by the subsampling127by adopting curvelet-domain sparsity-promoting and rerandomization(i.e., we draw new data samples every once in a while) techniques, andtherefore the image quality is not compromised.3. We have observed that the proposed method gains robustness (i) tovelocity errors in the background migration model by curvelet-domainsparsity promoting; and (ii) to linearization errors by adopting thererandomization technique. Robustness to these types of errors areimportant for the method to be applicable to field data sets.4. We can obtain true-amplitude images by the joint inversion of pri-maries and all orders of surface-related multiples. Using field seismicdata, we observe that by joint inversion, we can reap benefits from thehigh signal-to-noise ratio of primaries and the wider illumination cov-erage of multiples by joint inversion. The joint inversion does not relyon the knowledge of the wavelet — we estimate the source wavelet onthe fly — and we show with synthetic examples that the image we pro-duce with source estimation is comparable to that obtained with thetrue source wavelet. A key contribution of this work is that by incorpo-rating surface-related multiples in the joint imaging process, we resolvethe amplitude ambiguity in both the image and the source wavelet, byexploiting the self-consistency between primary events and their cor-responding surface-related-multiple events. We show with syntheticexamples that, assuming no modelling errors, we can recover the orig-inal true amplitude of both the image and the source wavelet.9.1 Significance and potential applicationAs we discussed above, an iterative inversion approach, also known as theleast-squares migration, is needed to image the surface-related multiples.Although the promising benefits of using surface-related multiples in seis-mic imaging have been reported in the literature, the wide adoption of theleast-squares migration by the oil and gas exploration industry has longbeen hindered by the expensive computational cost, especially for 3D datasets. We show with synthetic and field data examples that we can producehigh-quality least-squares migrated images with roughly the same numberof wave-equation solves as one reverse-time migration, compared with thecombined cost of one surface-related multiple elimination and one reverse-time migration in conventional seismic data processing. Therefore we believethat the proposed work has great potential to promote the adoption of the128least-squares migration that makes active use of surface-related multiples bythe industry, especially for 3D data sets.Moreover, via the adaptive source estimation method proposed in thiswork, joint imaging of primaries and all orders of surface-related multiplesobviates the necessity of primary/multiple separation during the imagingprocedure, given the knowledge of a reasonably accurate velocity model.With potential extension of the proposed methodology to extended imaging,as discussed in Chapter 8, we may be able to avoid this wavefield separationfor velocity-model building as well. We are therefore making great progresstowards a seismic data processing workflow that does not involve this wave-field separation altogether. This improvement is especially important asseismic acquisition techniques move towards simultaneous/blended acquisi-tions. As de-multiple methods such as surface-related multiple eliminationrelies on conventional “sequential” data, geophysical practitioners now firstneed to recover conventional data from the simultaneous/blended data toseparate primaries and multiples. Our contribution will potentially enablegeophysical practitioners to directly process seismic data from simultane-ous/blended acquisitions.On a side note, the proposed method, apart from being computationallyefficient, also facilitates parallel computing and can therefore benefit frommodern parallel computing hardware. The algorithm can be parallelizedover frequencies or source experiments. As in each iteration, we only usea small subset of data, we greatly decrease the memory requirement foreach computational thread, and therefore the computational time spent ondata transmission that greatly stains computer input/output. The abovebenefit facilitates the use of massively parallel accelerators such as GraphicalProcessing Units (GPUs).9.2 Limitation and future workIn this presented work, we implemented and tested the proposed methodfor 2D seismic data. A future research topic is to extend the work for 3Dseismic data, to account for the 3D wave-propagation effects in field seismicdata.We mainly discussed the relatively strong surface-related multiples, andlargely ignored the usually much weaker internal multiples. As we discussedin Chapter 2, when properly used, internal multiples can also improve sub-surface illumination. By extending our proposed method, via a non-linear-migration type of approach that gradually builds up sharp contrasts in thevelocity model, we will investigate how to properly image the internal mul-129tiples.The proposed method is designed for marine surface data, i.e., bothsources and receivers are placed close to the surface, for example, the towed-streamer acquisition. We will investigate how we can adapt the method towork with ocean-bottom data, and how the change of the receiver horizonwill change the way that primaries and multiples interact.The proposed method requires the input data to be free of the downgoingwavefield at the surface (i.e., receiver ghosts). Therefore, a receiver-side de-ghost process should be applied to a data set before we apply the proposedmethod to it. We will investigate how we can incorporate the receiver ghosts,instead of removing them, in the proposed inversion approach.In our implementation of the proposed method, we used the highly so-phisticated SPG`1 solver for the sparsity-promoting optimization program.The complexity of the algorithm can hinder the uptake of the proposedmethod by the industry. We have already started experimenting with al-ternative approaches, notably the linearized Bregman projection approachthat we discussed in Chapter 7. Preliminary results are promising, and wewill investigate its performance in more details in our future research.To avoid the primary/multiple separation altogether in the seismic dataprocessing workflow, we also attempt to extend the proposed methodologyto extended imaging used in migration velocity analysis, as we discussed inChapter 8. We also see promising results and will continue our research onthis topic.130BibliographyAravkin, A. Y., J. Burke, and M. Friedlander, 2013a, Variationalproperties of value functions: SIAM Journal on Optimization, 23,1689–1717. → pages 64, 68, 70Aravkin, A. Y., and T. van Leeuwen, 2012, Estimating nuisance parametersin inverse problems: Inverse Problems, 28. → pages 35, 64, 69Aravkin, A. Y., T. van Leeuwen, H. Calandra, and F. J. Herrmann, 2012,Source estimation for frequency-domain FWI with robust penalties:Presented at the EAGE technical program, EAGE. → pages 65Aravkin, A. Y., T. van Leeuwen, and N. Tu, 2013b, Sparse seismic imagingusing variable projection: Acoustics, Speech and Signal Processing(ICASSP), 2013 IEEE International Conference on, 2065–2069. → pages35, 45, 122Baysal, E., D. Kosloff, and J. Sherwood, 1983, Reverse time migration:GEOPHYSICS, 48, 1514–1524. → pages 2, 21, 64, 67Beck, A., and M. Teboulle, 2009, A fast iterative shrinkage-thresholdingalgorithm for linear inverse problems: SIAM J. Img. Sci., 2, 183–202. →pages 114Berkhout, A. J., 1993, Migration of multiple reflections: SEG TechnicalProgram Expanded Abstracts, SEG, 1022–1025. → pages 14, 54, 93Berkhout, A. J., and D. J. Verschuur, 2006, Imaging of multiple reflections:Geophysics, 71, SI209–SI220. → pages 14Berkhout, A. J., and C. P. A. Wapenaar, 1993, A unified approach toacoustical reflection imaging. ii: The inverse problem: The Journal ofthe Acoustical Society of America, 93, 2017–2023. → pages 54Biondi, B., and W. W. Symes, 2004, Angle-domain common-image gathersfor migration velocity analysis by wavefield-continuation imaging:Geophysics, 69, 1283. → pages 120Brown, M., and A. Guitton, 2005, Least-squares joint imaging of multiplesand primaries: GEOPHYSICS, 70, S79–S89. → pages 15Cande`s, E. J., L. Demanet, D. L. Donoho, and L. Ying, 2006a, Fast131discrete curvelet transforms: Multiscale Modeling and Simulation, 5, no.3, 861–899. → pages 24, 45Cande`s, E. J., and D. L. Donoho, 2004, New tight frames of curvelets andoptimal representations of objects with piecewise c2 singularities:Communications on Pure and Applied Mathematics, 57, 219–266. →pages 4, 24, 66Cande`s, E. J., J. K. Romberg, and T. Tao, 2006b, Stable signal recoveryfrom incomplete and inaccurate measurements: Communications onPure and Applied Mathematics, 59, 1207–1223. → pages 24Cavalca, M., and P. Lailly, 2005, 652, in Prismatic reflections for thedelineation of salt bodies: SEG, 2550–2553. → pages 39Chen, S. S., D. L. Donoho, and M. A. Saunders, 2001, Atomicdecomposition by basis pursuit: SIAM Review, 43, 129. → pages 23, 45Claerbout, J. F., 1971, Toward a unified theory of reflector mapping:Geophysics, 36, no. 3, 467–481. → pages 54, 55, 56Dai, W., X. Wang, and G. T. Schuster, 2011, Least-squares migration ofmultisource data with a deblurring filter: Geophysics, 76, R135–R146.→ pages 45Delprat-Jannaud, F., and P. Lailly, 2008, Wave propagation inheterogeneous media: Effects of fine-scale heterogeneity: Geophysics, 73,T37–T49. → pages 39Donoho, D., 2006, Compressed sensing: Information Theory, IEEETransactions on, 52, 1289 –1306. → pages 24Donoho, D., A. Maleki, and A. Montanari, 2009, Message passingalgorithms for compressed sensing: Proceedings of the NationalAcademy of Sciences, 106, 18914–18919. → pages 26, 142Dragoset, B., E. Verschuur, I. Moore, and R. Bisley, 2010, A perspectiveon 3d surface-related multiple elimination: Geophysics, 75,75A245–75A261. → pages 17Dragoset, W. H., and Zˇ. Jericˇevic´, 1998, Some remarks on surface multipleattenuation: Geophysics, 63, 772–789. → pages 17, 85Esser, E., T. T. Lin, R. Wang, and F. J. Herrmann, 2015, A lifted `1/`2constraint for sparse blind deconvolution: Presented at the 77th EAGEconference and exhibition 2015, EAGE. → pages 65, 72Fokkema, J. T., and P. M. van den Berg, 1993, Seismic applications ofacoustic reciprocity: Elsevier. → pages 17Fomel, S., M. Backus, M. DeAngelo, P. Murray, and B. A. Hardage, 2003,Multicomponent seismic data registration for subsurface characterizationin the shallow gulf of mexico: Presented at the Offshore TechnologyConference. → pages 65132Foster, D., and C. Mosher, 1992, Suppression of multiple reflections usingthe radon transform: Geophysics, 57, 386–395. → pages 13Golub, G., and V. Pereyra, 1973, The differentiation of pseudo-inversesand nonlinear least squares problems whose variables separate: SIAMJournal on Numerical Analysis, 10, 413–432. → pages 64——–, 2003, Separable nonlinear least squares: the variable projectionmethod and its applications: Inverse problems, 19, R1–R26. → pages35, 64, 68, 69Gray, S. H., 1997, True-amplitude seismic migration: A comparison ofthree approaches: Geophysics, 62, 929–936. → pages 55Gribonval, R., G. Chardon, and L. Daudet, 2012, Blind calibration forcompressed sensing by convex optimization: Acoustics, Speech andSignal Processing (ICASSP), 2012 IEEE International Conference on,2713–2716. → pages 48Guitton, A., 2002, Shot-profile migration of multiple reflections: SEGTechnical Program Expanded Abstracts, 1296–1299. → pages 14, 37, 93,96Guitton, A., A. Valenciano, D. Bevc, and J. Claerbout, 2007, Smoothingimaging condition for shot-profile migration: Geophysics, 72, S149–S154.→ pages 54, 56Guitton, A., and D. J. Verschuur, 2004, Adaptive subtraction of multiplesusing the `1-norm: Geophysical prospecting, 52, 27–38. → pages 13Hampson, D., 1986, Inverse velocity stacking for multiple elimination:SEG Technical Program Expanded Abstracts, 422–424. → pages 13He, R., B. Hornby, and G. T. Schuster, 2007, 3d wave-equationinterferometric migration of vsp free-surface multiples: Geophysics, 72,S195–S203. → pages 14He, R., and G. Schuster, 2003, Least-squares migration of both primariesand multiples: SEG Technical Program Expanded Abstracts, SEG,1035–1038. → pages 14Hennenfent, G., E. van den Berg, M. P. Friedlander, and F. J. Herrmann,2008, New insights into one-norm solvers from the Pareto curve:Geophysics, 73. → pages 115Herman, M., and T. Strohmer, 2010, General deviants: An analysis ofperturbations in compressed sensing: Selected Topics in SignalProcessing, IEEE Journal of, 4, 342–349. → pages 48Herrmann, F. J., 2012, Pass on the message: recent insights in large-scalesparse recovery: Presented at the 74th EAGE Conference & Exhibition,EAGE. → pages 25, 71, 142Herrmann, F. J., Y. A. Erlangga, and T. T. Lin, 2009, Compressive133simultaneous full-waveform simulation: Geophysics, 74, A35. → pages23Herrmann, F. J., and G. Hennenfent, 2008, Non-parametric seismic datarecovery with curvelet frames: Geophysical Journal International, 173,233–248. → pages 4, 24Herrmann, F. J., and X. Li, 2012, Efficient least-squares imaging withsparsity promotion and compressive sensing: Geophysical Prospecting,60, 696–712. → pages 15, 20, 23, 24, 25, 26, 28, 30, 39, 45, 48, 54, 56,64, 66, 67, 70, 103, 112Herrmann, F. J., P. P. Moghaddam, and C. Stolk, 2008a, Sparsity- andcontinuity-promoting seismic image recovery with curvelet frames:Applied and Computational Harmonic Analysis, 24, 150–173. → pages24, 25, 33, 64Herrmann, F. J., D. Wang, G. Hennenfent, and P. P. Moghaddam, 2008b,Curvelet-based seismic data processing: a multiscale and nonlinearapproach: Geophysics, 73, A1–A5. → pages 24, 25, 77Herrmann, F. J., D. Wang, and D. J. Verschuur, 2008c, Adaptivecurvelet-domain primary-multiple separation: Geophysics, 73, A17–A21.→ pages 13, 37Jo, C., C. Shin, and J. H. Suh, 1996, An optimal 9-point, finite-difference,frequency-space, 2-d scalar wave extrapolator: Geophysics, 61, 529–537.→ pages 18Jumah, B., and F. J. Herrmann, 2014, Dimensionality-reduced estimationof primaries by sparse inversion: Geophysical Prospecting, 62, 972–993.→ pages 18Kaufman, L., 1975, A variable projection method for solving separablenonlinear least squares problems: BIT Numerical Mathematics, 15,49–57. → pages 64Kumar, R., T. van Leeuwen, and F. J. Herrmann, 2013, Ava analysis andgeological dip estimation via two-way wave-equation based extendedimages: Presented at the SEG. → pages 120——–, 2014, Extended images in action: efficient WEMVA via randomizedprobing: Presented at the EAGE. → pages 120Lailly, P., 1983, The seismic inverse problem as a sequence of before stackmigrations: Conference on inverse scattering: theory and application,Society for Industrial and Applied Mathematics, Philadelphia, PA,206–220. → pages 7, 21, 56, 67Li, M., J. Rickett, and A. Abubakar, 2013, Application of the variableprojection scheme for frequency-domain full-waveform inversion:Geophysics, 78, R249–R257. → pages 65134Lin, T. T., and F. J. Herrmann, 2013, Robust estimation of primaries bysparse inversion via one-norm minimization: Geophysics, 78,R133–R150. → pages 14, 17, 19, 33, 65, 72, 93, 97Lin, T. T., N. Tu, and F. J. Herrmann, 2010a, Sparsity-promotingmigration from surface-related multiples: SEG Technical ProgramExpanded Abstracts, SEG, 3333–3337. → pages 14——–, 2010b, Sparsity-promoting migration from surface-related multiples:SEG Technical Program Expanded Abstracts, SEG, 3333–3337. →pages 93Liu, Y., X. Chang, D. Jin, R. He, H. Sun, and Y. Zheng, 2011, Reversetime migration of multiples for subsalt imaging: Geophysics, 76,WB209–WB216. → pages 6, 14, 21, 37, 54, 59, 93Long, A. S., S. Lu, D. Whitmore, H. LeGleut, R. Jones, N. Chemingui,and M. Farouki, 2013, Mitigation of the 3d cross-line acquisitionfootprint using separated wavefield imaging of dual-sensor streamerseismic: Presented at the 75th EAGE Conference & Exhibitionincorporating SPE EUROPEC 2013, EAGE. → pages 14Lorenz, D., F. Schpfer, and S. Wenger, 2014, The linearized Bregmanmethod via split feasibility problems: Analysis and generalizations:SIAM Journal on Imaging Sciences, 7, 1237–1262. → pages 115Lorenz, D. A., S. Wenger, F. Scho¨pfer, and M. Magnor, 2014, A sparseKaczmarz solver and a linearized Bregman method for onlinecompressed sensing: Presented at the Image Processing (ICIP), 201421th IEEE International Conference on. → pages 112, 114Lu, G., B. Ursin, and J. Lutro, 1999, Model-based removal of water-layermultiple reflections: Geophysics, 64, 1816–1827. → pages 13Lu, S., D. Whitmore, A. Valenciano, and N. Chemingui, 2014a, Enhancedsubsurface illumination from separated wavefield imaging: First Break,32, 87–92. → pages 3, 120——–, 2014b, Enhanced subsurface illumination from separated wavefieldimaging: First Break, 32, 87–92. → pages 93Lu, S., N. D. Whitmore, and A. A. Valenciano, 2011, Imaging of primariesand multiples with 3d seam synthetic: SEG Technical ProgramExpanded Abstracts, 3217–3221. → pages 14, 54Maharramov, M., and B. Biondi, 2014, Improved depth imaging byconstrained full-waveform inversion: SEP report, 155, 19–24. → pages114Malcolm, A. E., B. Ursin, and V. Maarten, 2009, Seismic imaging andillumination with internal multiples: Geophysical Journal International,176, 847–864. → pages 39135Mallat, S. G., 2009, A wavelet tour of signal processing: Academic Press.→ pages 26Me´tivier, L., 2011, Interlocked optimization and fast gradient algorithm fora seismic inverse problem: Journal of Computational Physics, 230,7502–7518. → pages 15, 39Me´tivier, L., P. Lailly, F. Delprat-Jannaud, and L. Halpern, 2011, A 2dnonlinear inversion of well-seismic data: Inverse Problems, 27, 055005.→ pages 15, 39Montanari, A., 2010, Graphical models concepts in compressed sensing:arXiv:1011.4328v3 [cs.IT]. → pages 25, 26, 142Muijs, R., J. O. A. Robertsson, and K. Holliger, 2007, Prestack depthmigration of primary and surface-related multiple reflections: Parti—imaging: Geophysics, 72, S59–S69. → pages 6, 14, 21, 37, 54, 59, 93Mulder, W. A., and R. E. Plessix, 2003, 225, in One-way and two-waywave-equation migration: SEG, 881–884. → pages 76Needell, D., and J. A. Tropp, 2014, Paved with good intentions: Analysisof a randomized block kaczmarz method: Linear Algebra and itsApplications, 441, 199 – 221. (Special Issue on Sparse ApproximateSolution of Linear Systems). → pages 26, 37, 116Neelamani, R., A. Baumstein, and W. Ross, 2010, Adaptive subtractionusing complex-valued curvelet transforms: Geophysics, 75, V51–V60. →pages 24Nemeth, T., H. Sun, and G. T. Schuster, 2000, Separation of signal andcoherent noise by migration filtering: Geophysics, 65, 574–583. → pages13Nemeth, T., C. Wu, and G. T. Schuster, 1999, Least-squares migration ofincomplete reflection data: Geophysics, 64, 208–221. → pages 23, 45, 64Peacock, K. L., and S. Treitel, 1969, Predictive deconvolution: theory andpractice: Geophysics, 34, 155–169. → pages 13Plessix, R.-E., and W. A. Mulder, 2004, Frequency-domain finite-differenceamplitude-preserving migration: Geophysical Journal International,157, 975–987. → pages 7Poole, T., A. Curtis, J. Robertsson, and D.-J. van Manen, 2010,Deconvolution imaging conditions and cross-talk suppression:Geophysics, 75, W1–W12. → pages 14, 93Pratt, R. G., 1999, Seismic waveform inversion in the frequency domain,part 1: Theory and verification in a physical scale model: Geophysics,64, 888–901. → pages 69Prucha, M., and B. Biondi, 2002, Subsalt event regularization withsteering filters: SEG Technical Program Expanded Abstracts,1361176–1179. → pages 25Reiter, E. C., M. N. Toksoz, T. H. Keho, and G. M. Purdy, 1991, Imagingwith deep-water multiples: Geophysics, 56, 1081–1086. → pages 15Rickett, J., 2013, The variable projection method for waveform inversionwith an unknown source function: Geophysical Prospecting, 61,874–881. → pages 65, 69Robinson, E. A., 1967, Predictive decomposition of time series withapplication to seismic exploration: GEOPHYSICS, 32, 418–484. →pages 74Romero, L., D. Ghiglia, C. Ober, and S. Morton, 2000, Phase encoding ofshot records in prestack migration: GEOPHYSICS, 65, 426–436. →pages 24Ryberg, T., and M. Weber, 2000, Receiver function arrays: a reflectionseismic approach: Geophysical Journal International, 141, 1–11. →pages 39Sava, P., and A. Guitton, 2005, Multiple attenuation in the image space:Geophysics, 70, V10–V20. → pages 13Sava, P., and I. Vasconcelos, 2011, Extended imaging conditions forwave-equation migration: Geophysical Prospecting, 59, 35–55. → pages120, 122Schuster, G. T., J. Yu, J. Sheng, and J. Rickett, 2004,Interferometric/daylight seismic imaging: Geophysical JournalInternational, 157, 838–852. → pages 14, 93Shan, G., 2003, Source-receiver migration of multiple reflections: SEGTechnical Program Expanded Abstracts, SEG, 1008–1011. → pages 14Shang, X., M. V. de Hoop, and R. D. van der Hilst, 2012, Beyond receiverfunctions: Passive source reverse time migration and inverse scattering ofconverted waves: Geophysical Research Letters, 39, L15308. → pages 39Shen, P., and W. W. Symes, 2008, Automatic velocity analysis via shotprofile migration: Geophysics, 73, VE49–VE59. → pages 120Stockham, T.G., J., T. Cannon, and R. Ingebretsen, 1975, Blinddeconvolution through digital signal processing: Proceedings of theIEEE, 63, 678–692. → pages 9Stockham Jr, T. G., T. M. Cannon, and R. B. Ingebretsen, 1975, Blinddeconvolution through digital signal processing: Proceedings of theIEEE, 63, 678–692. → pages 65, 71Strohmer, T., and R. Vershynin, 2009, A randomized kaczmarz algorithmwith exponential convergence: Journal of Fourier Analysis andApplications, 15, 262–278. → pages 26, 116Symes, W. W., 2008, Migration velocity analysis and waveform inversion:137Geophysical Prospecting, 56, 765–790. → pages 120Terentyev, I. S., T. Vdovina, W. W. Symes, X. Wang, and D. Sun., 2014,iWave: a framework for wave simulation:http://www.trip.caam.rice.edu/software/iwave/doc/html/index.html. (Online;accessed 16-January-2014). → pages 32, 77, 82The SMAART JV, 2014, Sigsbee2B FS & NFS 2D synthetic datasets:http://www.delphi.tudelft.nl/SMAART/sigsbee2b.htm. (Online; accessed24-February-2014). → pages 31, 81Tibshirani, R., 1996, Regression shrinkage and selection via the lasso:Journal of the Royal Statistical Society Series B Methodological, 58,267–288. → pages 25, 46, 67Tu, N., A. Y. Aravkin, T. van Leeuwen, and F. J. Herrmann, 2013, Fastleast-squares migration with multiples and source estimation: Presentedat the 75th EAGE Conference & Exhibition incorporating SPEEUROPEC 2013, EAGE. → pages 35, 45Tu, N., and F. J. Herrmann, 2012a, Imaging with multiples accelerated bymessage passing: SEG Technical Program Expanded Abstracts, SEG,1–6, Chapter 682. → pages 26, 48, 54, 59——–, 2012b, Least-squares migration of full wavefield with sourceencoding: Presented at the EAGE. → pages 14Ulrych, T. J., and M. D. Sacchi, 2005, Information-based inversion andprocessing with applications: Elsevier Science. → pages 33Ulrych, T. J., D. R. Velis, and M. D. Sacchi, 1995, Wavelet estimationrevisited: The Leading Edge, 14, 1139–1143. → pages 71van den Berg, E., and M. P. Friedlander, 2008, Probing the pareto frontierfor basis pursuit solutions: SIAM Journal on Scientific Computing, 31,890–912. → pages 23, 28, 45, 46, 47, 64, 67, 70, 112, 141van der Neut, J., and F. J. Herrmann, 2013, Interferometric redatuming bysparse inversion: Geophysical Journal International, 192, 666–670. →pages 39van Groenestijn, G. J. A., and D. J. Verschuur, 2009a, Estimatingprimaries by sparse inversion and application to near-offset datareconstruction: Geophysics, 74, A23–A28. → pages 17, 65, 72, 93, 120——–, 2009b, Estimation of primaries and near-offset reconstruction bysparse inversion: Marine data applications: Geophysics, 74, R119–R128.→ pages 14, 19, 33van Leeuwen, T., and F. J. Herrmann, 2012, Fast waveform inversionwithout source encoding: Geophysical Prospecting. → pages 112——–, 2012a, A parallel, object-oriented framework for frequency-domainwavefield imaging and inversion.: Technical Report TR-2012-03,138Department of Earth and Ocean Sciences, University of BritishColumbia, Vancouver. → pages 32——–, 2012b, Wave-equation extended images: computation and velocitycontinuation: Presented at the EAGE technical program, EAGE. →pages 122——–, 2013, Mitigating local minima in full-waveform inversion byexpanding the search space: Geophysical Journal International, 195,661–667. → pages 115van Leeuwen, T., and W. A. Mulder, 2009, A variable projection methodfor waveform inversion: Presented at the 71st EAGE Conference &Exhibition. → pages 65Verschuur, D. J., 1991, Surface-related multiple elimination, an inverseapproach: PhD thesis, Delft University of Technology. → pages 17——–, 2006, Seismic multiple removal techniques: Past, present andfuture: EAGE Publications BV. → pages 2——–, 2011, Seismic migration of blended shot records with surface-relatedmultiple scattering: Geophysics, 76, A7–A13. → pages 3, 14, 54, 93Verschuur, D. J., and A. J. Berkhout, 2009, Target-oriented, least-squaresimaging of blended data: SEG Technical Program Expanded Abstracts,2889–2893. → pages 55Verschuur, D. J., A. J. Berkhout, and C. P. A. Wapenaar, 1992, Adaptivesurface-related multiple elimination: Geophysics, 57, 1166–1177. →pages 13, 16, 17, 32, 54, 65, 72, 75, 93, 96, 98, 121, 127Wang, D., R. Saab, O. Yilmaz, and F. J. Herrmann, 2008, Bayesianwavefield separation by transform-domain sparsity promotion:Geophysics, 73, 1–6. → pages 13Wapenaar, K., 1998, Reciprocity properties of one-way propagators:Geophysics, 63, 1795–1798. → pages 32Weglein, A., F. Gasparotto, P. Carvalho, and R. Stolt, 1997, Aninversescattering series method for attenuating multiples in seismicreflection data: GEOPHYSICS, 62, 1975–1989. → pages 13Whitmore, N. D., A. A. Valenciano, and W. Sollner, 2010, Imaging ofprimaries and multiples using a dual-sensor towed streamer: SEGTechnical Program Expanded Abstracts, 3187–3192. → pages 6, 14, 37,54, 93Wiggins, J. W., 1988, Attenuation of complex water-bottom multiples bywave-equation-based prediction and subtraction: Geophysics, 53,1527–1539. → pages 13Wong, M., B. Biondi, and S. Ronen, 2012, Imaging with multiples usinglinearized full-wave inversion: SEG Technical Program Expanded139Abstracts, SEG, 1–5, Chapter 706. → pages 14, 15, 93——–, 2014, Imaging with multiples using least-squares reverse timemigration: The Leading Edge, 33, 970–976. → pages 14, 120Yilmaz, ., 2001, Seismic data analysis: Society of ExplorationGeophysicists. → pages 1Yin, W., S. Osher, D. Goldfarb, and J. Darbon, 2008, Bregman iterativealgorithms for `1-minimization with applications to compressed sensing:SIAM Journal on Imaging Sciences, 1, 143–168. → pages 114Zuberi, A., and T. Alkhalifah, 2013, Imaging by forward propagating thedata: theory and application: Geophysical Prospecting, 61, 248–267. →pages 14, 93140Appendix AAdditional details ofmethodologyA.1 Solving the BPDN problem using SPG`1Using canonical linear algebra notations, the BPDN formulation to inverta linear system of equations Ax = b with sparsity promoting using the`1-norm followsBPσ : minx‖x‖1 subject to ‖b−Ax‖2 ≤ σ, (A.1.1)where τ is an estimate for the noise level in the data.To efficiently solve the above BPσ problem, SPG`1 leverages the Spec-trum Projected Gradient (SPG) method that efficiently solves the `1-constrainedleast-squares problem, aka the LASSO problem (van den Berg and Fried-lander, 2008):LSτ : minx‖b−Ax‖2 subject to. ‖x‖1 ≤ τ, (A.1.2)where τ is a sparsity constraint on the solution vector x.van den Berg and Friedlander (2008) found that problem BPσ and LSτshare the same solution for certain choices of σ and τ , which are related bythe Pareto trade-off curve between the `2-norm of the data residual (i.e.,the objective function of problem LSτ ) and the `1-norm of the solutionvector (i.e., the objective function of problem BPσ). As the Pareto curve iscontinuously differentiable, SPG`1 uses the Newton’s method on the Paretocurve to determine the optimal value of τ given an estimate of σ. As a result,141SPG`1 effectively solves a series of warm-started LASSO subproblems forgradually increasing τ ’s. By “warm-started”, we mean that the solution forthe lth subproblem with τ l < τ l+1 serves as the initial guess for the nextsubproblem. Solving each subproblem LSτ itself using the SPG methodinvolves several gradient updates. In each update, we need to evaluate the(expensive) action of A and A∗, perform a line search, followed by a (cheap)`1 projection that promotes sparsity.A.2 Subsampling noise and approximate messagepassingIn simple terms, the key idea of compressive sensing is to work with a flatsampling matrix A that creates noisy interferences, i.e., the action of A∗Aon a sparse vector yield a “noisy” sparse vector where the “noise” actuallyis incoherent crosstalks with a noise level that depends on degree of sub-sampling (aspect ratio of A, defined as the ratio of the number of rows tothe number of columns) and the sparsity of x. In principle, this interfer-ence noise should be easier to remove than coherent noise. As explained byDonoho et al. (2009) and Herrmann (2012), this statement holds true forthe first iteration where some noises are removed by projection onto the `1ball (which corresponds to a single soft thresholding with the appropriatethreshold level). However, later iterations suffer from the correlation builtup between the model iterate and the sampling matrix A, which makes thedenoising step less efficient and may result in coherent artifacts.In the context of our compressive imaging problem, this means that whenwe reduce the number of simultaneous sources and frequencies, which corre-sponds to a reduced aspect ratio of A, we may get more coherent artifactsin the image that cannot be removed by simply using more iterations.Recent insights into approximate message passing find that with a cor-rection term in the iterative soft thresholding, this denoising step can contin-uously bring down the model errors in later iterations (Donoho et al., 2009;Montanari, 2010). While this approach relies on careful analysis, we foundthat rerandomization of source experiments has statistically the same effectin breaking down this correlation buildup, and fits well in the optimizationframework of the proposed method (Herrmann, 2012).142


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items