Estimation of Surface-free Data by Curvelet-domain Matched Filtering and Sparse Inversion by Mufeed H. AlMatar B.S.c. Computer Engineering, King Fahad University of Petroleum and Minerals, Dhahran, 2005 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in THE FACULTY OF GRADUATE STUDIES (Geophysics) The University Of British Columbia (Vancouver) April 2011 c©Mufeed H. AlMatar, 2011 Abstract A recent robust multiple-elimination technique, based on the underlying principle that relates primary impulse response to total upgoing wavefield, tries to change the paradigm that sees surface-related multiples as noise that needs to be removed from the data prior to imaging. This technique, estimation of primaries by sparse inversion (EPSI), van Groenestijn and Verschuur; Lin and Herrmann, proposes an inversion procedure during which the source function and surface-free impulse re- sponse are directly calculated from the upgoing wavefield using an alternating op- timization procedure. EPSI hinges on a delicate interplay between surface-related multiples and pri- maries. Finite aperture and other imperfections may violate this relationship. In this thesis, we investigate how to make EPSI more robust by incorporating curvelet- domain matching in its formulation. Compared to surface-related multiple removal (SRME), where curvelet-domain matching was used successfully, incorporating this step has the additional advantage that matches multiples to multiples rather than predicated multiples to total data as in SRME. ii Table of Contents Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Theme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.3 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.4 Theoretical background . . . . . . . . . . . . . . . . . . . . . . . 4 2 Primary-multiple separation techniques . . . . . . . . . . . . . . . . 8 2.1 Primary-multiple separation techniques . . . . . . . . . . . . . . 8 2.2 Surface-related multiples model . . . . . . . . . . . . . . . . . . 9 2.2.1 Surface-related multiples formulation for normal-incidence plane wave in a horizontally stratified medium . . . . . . 9 2.2.2 Data matrix . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.2.3 Two-dimensional formulation . . . . . . . . . . . . . . . 11 2.3 Iterative surface-related multiple removal . . . . . . . . . . . . . 14 2.4 Curvelet-domain matched-filtering . . . . . . . . . . . . . . . . . 15 iii 2.5 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3 Convexified EPSI and curvelet-domain matched-filtering . . . . . . 19 3.1 Estimation of primaries by sparse inversion . . . . . . . . . . . . 19 3.1.1 Convexified EPSI . . . . . . . . . . . . . . . . . . . . . . 20 3.1.2 Bi-convex optimization . . . . . . . . . . . . . . . . . . . 21 3.2 EPSI with curvelet-domain matched-filtering . . . . . . . . . . . 23 3.2.1 Frequency-domain monochromatic matching . . . . . . . 23 3.2.2 Time-domain matching . . . . . . . . . . . . . . . . . . . 24 4 Observations and results . . . . . . . . . . . . . . . . . . . . . . . . 26 4.1 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 4.1.1 Synthetic data . . . . . . . . . . . . . . . . . . . . . . . . 26 4.1.2 Results evaluation . . . . . . . . . . . . . . . . . . . . . 26 4.2 Factors affecting our results . . . . . . . . . . . . . . . . . . . . . 27 4.2.1 The smoothing and positivity of the curvelet-domain diag- onal scaling . . . . . . . . . . . . . . . . . . . . . . . . . 27 4.2.2 When and how frequent should we do the matching? . . . 32 4.2.3 Optimized result . . . . . . . . . . . . . . . . . . . . . . 41 5 Conclusion and future works . . . . . . . . . . . . . . . . . . . . . . 43 5.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 5.2 Future works . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 Appendix A Implementations . . . . . . . . . . . . . . . . . . . . . . . 48 iv List of Tables Table 4.1 Synthetic time-domain finite difference acoustic forward mod- eling data parameters. . . . . . . . . . . . . . . . . . . . . . . 27 v List of Figures Figure 1.1 An illustration of land seismic survey. A seismic source sends energy into the earth. Some of this energy is reflected back to the surface at rocks layers discontinuities. This energy is then recorded by geophones at the surface. . . . . . . . . . . . . . 2 Figure 1.2 An illustration that shows a primary and two types of multi- ples. The solid blue line has only one upward reflection, so it is a primary reflection. The green dashed line has more than one upward reflection and it has its shallowest downward re- flection at the surface, hence, it is a surface-related multiple. Finally, the internal-multiple, the red dashed line, has its shal- lowest reflection at an internal rock layer and it has more than one upward reflection. . . . . . . . . . . . . . . . . . . . . . 3 Figure 1.3 Mesh view of a 2D curvelet. Curvelets are oscillatory in one direction, but smooth in the other. They resemble a windowed sinusoid in the oscillatory direction and a very smooth Gaus- sian window in the other direction. . . . . . . . . . . . . . . . 5 Figure 1.4 Spatial (left) and frequency (right) representations of six curvelets with different angles and scales. Notice how each curvelet is localized in the spatial domain with a rapid decaying amplitude and how it has a compact support in the frequency domain. Notice also the 90◦ rotation between any two correspondence representations in both domains of each curvelet. . . . . . . . 6 vi Figure 1.5 a) Synthetic shot gather b) Curvelet decomposition of the shot gather into different scales and angles. Five scales are shown here. Notice how the angles double every other scale and how the different parts of the shot gather are decomposed at the various scales and angles. The coarsest (center) scale shows the DC and low frequencies. The second scale has 4 angles and the third and the fourth has 8. The fifth (finest) scale has 16 angles. . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 Figure 1.6 Curvelet correlation with wavefronts. A large coefficient is yield when the curvelet and the waterfront have locally the same direction and frequency contents. Otherwise, a small co- efficient is yielded. . . . . . . . . . . . . . . . . . . . . . . . 7 Figure 2.1 Two reflection events: a primary (solid blue line) and a surface- related multiple (dashed red line). Both events have similar ar- rival time but have traveled different paths. The primary event has traveled deeper into the subsurface and has seen higher velocity than the multiple event. Therefore, it arrived with a smaller angle at the receiver. . . . . . . . . . . . . . . . . . . 9 Figure 2.2 Feedback diagram for surface-related multiples generation. First, surface-free data is generated by convolving the source signa- ture s+ with surface-free impulse response g. Then, surface- free data feedbacks into the earth to generate higher order mul- tiples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 Figure 2.3 a) An illustration that shows the generation of surface-related multiples. b) Source signature s(t). c) Surface-free impulse response g(t). d) Surface-free data p0(t). e) First-order multi- ples m1(t). f) Second-order multiples m2(t). g) Total response p(t). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 Figure 2.4 After convolving the source gather from total data P with the common receivers gather from surface-free impulse response G for each point in xk, the convolution result is summed up to produce the predicted surface-related multiple trace at xr. . . . 13 vii Figure 2.5 A common-offset section from North Sea field dataset with results from two different multiples-elimination methods. a) Total data plotted with automatic-gain control. b) Primary es- timation from one term SRME. c) Scaled Bayesian iterative threshold primaries estimate. Adapted from Herrmann et al. (2008b). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Figure 4.1 Zero-offset EPSI synthetic data results with no matching. a) Total upgoing wavefield section (P̂). b) Primaries estimation (P̂0). c) primaries estimation minus the total data (P̂− P̂0 = −ĜP̂), i.e. the multiples estimation d) Final estimated source wavelet (Q̂) . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 Figure 4.2 Mosaic and 2D plots for the curvelet-domain matched-filtering scaling for one shot (offset = 1260m) for λ = 0.46 and η = 10,35 and 100. The λ and η used to generate each subfigure are stated underneath it. . . . . . . . . . . . . . . . . . . . . 30 Figure 4.3 Zero-offset synthetic data results for λ = 0.46 and η = 10, 35 and 100. The λ and η used to generate each subfigure are stated underneath it. a-c) zero-offset primaries estimates. d-f) zero-offset multiples estimates. . . . . . . . . . . . . . . . . . 31 Figure 4.4 Mosaic and 2D plots for the curvelet-domain matched-filtering scaling for one shot (offset = 1260m) for η = 35 and λ = 0.046, 0.46 and 46.4. The λ and η used to generate each subfigure are stated underneath it. . . . . . . . . . . . . . . . . . . . . 33 Figure 4.5 Zero-offset synthetic data results for η = 35 and λ = 0.046, 0.46 and 46.4. The λ and η used to generate each subfigure are stated underneath it. a-c) zero-offset primaries estimates. d-f) zero-offset multiples estimates. . . . . . . . . . . . . . . 34 Figure 4.6 a-d) Primaries estimates at different iterations. e-h) Multiples estimates calculated by subtracting the primaries estimate for the total data P̂− ĜQ̂. i-l) Multiples estimates calculated by convolving the Green’s function with the total upgoing wave- field −ĜP̂. . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 viii Figure 4.7 Zero-offset results of EPSI with only one matching step at the first alternating optimization loop. a) Zero-offset primaries es- timate. b) Zero-offset multiple estimate calculated by subtract- ing the primaries estimate form the total upgoing wavefield. . 37 Figure 4.8 Zero-offset results of EPSI with six matching steps starting af- ter the second alternating optimization loop (even loops only). a) Zero-offset primaries estimate. b) Zero-offset multiple esti- mate calculated by subtracting the primaries estimate form the total upgoing wavefield. . . . . . . . . . . . . . . . . . . . . 38 Figure 4.9 Zero-offset results of EPSI with only one matching step at the twenty first alternating optimization loop. a) Zero-offset pri- maries estimate. b) Zero-offset multiple estimate calculated by subtracting the primaries estimate form the total upgoing wavefield. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 Figure 4.10 Zero-offset results of EPSI with only one matching step at the twelve alternating optimization loop. a) Zero-offset primaries estimate. b) Zero-offset multiple estimate calculated by sub- tracting the primaries estimate form the total upgoing wavefield. 40 Figure 4.11 Zero-offset EPSI synthetic data result with curvelet-domain matched filtering incorporated into loop 12. The number of it- erations η = 60 and the trade off parameter λ = 0.46. a) Total upgoing wavefield section (P̂). b) Primaries estimation (P̂0). c) primaries estimation minus the total data (P̂− P̂0 = −ĜP̂), i.e. the multiples estimation d) Final estimated source wavelet (Q̂) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 ix Acknowledgments I would like to thank my supervisor Felix Herrmann for his continuous support and encouragement, and creative ideas without which this thesis would not be possible. I would like also to thank all slim members for their support in filling the gaps in my Geophysical background. I am very fortunate to have Michael Bostock and Eldad Haber as part of my supervisory committee. I thank Michael for his two insightful courses and Eldad for the few lectures I attending during his course last fall. I would like also to thank my management at Saudi Aramco for giving me the opportunity to study at UBC and for their support and funding during my studies. A special thank goes to my wife Sakeel for her encouragement and patient during my research period especially after the twins came in board. Thank you all. x Dedication To my parents, my wife Sakeel, my daughters Zahraa and Zainab xi Chapter 1 Introduction In early nineteenth century, searching for oil was very simple, easy and straight- forward. People were looking for oil seepages at the surface near oil springs. Back then, the demand for oil was limited and oil was mainly used for medical pur- poses (Gadallah and Fisher, 2005). But the situation has changed since then, many objects around us are made from oil or made by machines that use energy from oil and gas. The paper used to print this thesis was manufactured using oil energy. The ink itself is an oil product. The demand for oil has changed dramatically and so do procedures to search for oil. The objective of the seismic survey is to reveal the subsurface geology. It is a very costly and complicated procedure. On land, dynamite (impulsive source) or Vibroseis (trucks that propagate energy into the earth over an extended period of time), sources send energy into the subsurface. Some of this energy is reflected at the boundaries between rocks layers back to the surface where it is recorded by an array of geophones. The same procedure applies at sea but the seismic source is now an air gun and the receivers are hydrophones. Figure 1.1 illustrates the seismic survey on land. Most seismic imaging algorithms assume that the reflected energy recorded by geophones at surface has reflected only once in the subsurface (Verschuur, 2002), as shown by the solid line in Figure 1.2. However, in practice, this assumption is not true. Some of the energy we send bounce between rocks layers before being recorded by geophones as shown by the dashed lines in Figure 1.2. As a result 1 Figure 1.1: An illustration of land seismic survey. A seismic source sends energy into the earth. Some of this energy is reflected back to the sur- face at rocks layers discontinuities. This energy is then recorded by geophones at the surface. multiple reflections events occur. These events are usually considered as undesired events that need to be removed from the seismic data before processing to avoid confusion during the interpretation of seismic images. The success of multiple- removal or multiple-suppression methods can make the difference between hitting the target and making profit or missing the target and losing millions of dollars. One way to categorize multiples is by the interface where they have their shal- lowest downward bounce. The first downward bounce for the green dashed line in Figure 1.2 happens at the surface. Hence it is called a surface-related multiple while the first downward bounce for the red dashed line happens at an internal interface and so it is called an internal-multiple (Verschuur, 2002). 1.1 Theme The main theme of this thesis is to extend a recent robust multiple-elimination technique that estimates primaries by sparse inversion, EPSI (van Groenestijn and Verschuur, 2009; Lin and Herrmann, 2009). This technique proposes an inversion procedure during which the source function and surface-free impulse response are 2 Figure 1.2: An illustration that shows a primary and two types of multiples. The solid blue line has only one upward reflection, so it is a primary reflection. The green dashed line has more than one upward reflection and it has its shallowest downward reflection at the surface, hence, it is a surface-related multiple. Finally, the internal-multiple, the red dashed line, has its shallowest reflection at an internal rock layer and it has more than one upward reflection. directly calculated from the upgoing wavefield using an alternating optimization procedure. We propose to incorporate a curvelet-domain matched-filtering step into this optimization procedure to mitigate possible adverse effects that stem from amplitude mismatches that vary smoothly as a function of position and dip along the predicted wavefronts of the current estimate for the surface-free data. The implementation and the different parameters affecting this matched-filtering step are investigated. 1.2 Objectives This thesis aims to incorporate a matched-filtering operator into the formulation of EPSI. This operator will be able to absorb amplitude errors of non-ideal reflections at the surface, finite-aperture, and other unknown effects. 3 1.3 Outline Chapter 2 introduces primary-multiple separation followed by an explanation of the theory underlying the model behind surface-related multiple removal, SRME (Verschuur et al., 1992), and primary estimation by sparse inversion, EPSI (van Groenestijn and Verschuur, 2009). Finally, we discuss the theory behind curvelet- domain matched-filtering. In Chapter 3, we introduce estimation of primaries by sparse inversion, before incorporating a matching step into the convexified EPSI procedure. Chapter 4 studies the factors affecting the outcome of incorporating a matched- filtering step into EPSI. We conclude this chapter by presenting our results. We will conclude our research in chapter 5 by a summary and suggested future work. The last section of this chapter introduces the curvelet transform and highlights its importance as a sparsifying domain to the success of the matched-filtering pro- cedure. 1.4 Theoretical background The ability of the transform to handle seismic data with conflicting dips and to min- imize the overlap between the primary and multiples coefficients in the transform domain is significant to the success of the primary-multiple separation algorithms (Herrmann, 2008b). In this section, we show that the curvelet transform (Candes et al., 2006) addresses these issues by efficiently representing seismic data sparsely and by its ability to handle conflicting dips. Curvelet transform was first introduced in Candes and Donoho (2000). Curvelet transform is a localized, multiscale, multi-directional and redundant transform. Curvelets are oscillatory in one direction and smooth in the perpendicular direc- tion. They resemble a windowed sinusoid in the oscillatory direction and a very smooth Gaussian window in the other direction, see Figure 1.3. Each curvelet is spatially localized in the spatial domain and has a compact support in the frequency domain. In spatial domain, its amplitude decays rapidly outside a certain region. Curvelets obey the parabolic scaling principle with length ≈ width2 (Neelamani et al., 2008). 4 The curvelet transform computational complexity of data of size N is O(NlogN) which is not an issue. However, the curvelet transform demands memory; it is an overcomplete transform, i.e. the resulting transform coefficients are more than the image coefficients. It is about 8 times redundant for the 2D curvelet transform and 24 times redundant for the 3D case. Figure 1.4 shows a few curvelets of different scales and angles with a rapid amplitude decay in the spatial domain and a compact support in the frequency domain and Figure 1.5 shows a decomposition of a shot gather at different scales and angles. Curvelet correlate very well locally with wavefronts of the same direction and frequency contents, see Figure 1.6. This permits the curvelet transform to rep- resent seismic data sparsely as a small set of significant coefficients which are location, scale, and dip dependent. This sparse representation of seismic data helps in mapping the different seismic events to different coefficients and that will make the overlap of the primaries and multiples coefficients minimal and hence leads to a successful separation. 5 Figure 1.3: Mesh view of a 2D curvelet. Curvelets are oscillatory in one direction, but smooth in the other. They resemble a windowed sinusoid in the oscillatory direction and a very smooth Gaussian window in the other direction. Figure 1.4: Spatial (left) and frequency (right) representations of six curvelets with different angles and scales. Notice how each curvelet is localized in the spatial domain with a rapid decaying amplitude and how it has a compact support in the frequency domain. Notice also the 90◦ rotation between any two correspondence representations in both domains of each curvelet. 6 (a) (b) Figure 1.5: a) Synthetic shot gather b) Curvelet decomposition of the shot gather into different scales and angles. Five scales are shown here. No- tice how the angles double every other scale and how the different parts of the shot gather are decomposed at the various scales and angles. The coarsest (center) scale shows the DC and low frequencies. The second scale has 4 angles and the third and the fourth has 8. The fifth (finest) scale has 16 angles. Figure 1.6: Curvelet correlation with wavefronts. A large coefficient is yield when the curvelet and the waterfront have locally the same direction and frequency contents. Otherwise, a small coefficient is yielded. 7 Chapter 2 Primary-multiple separation techniques In this chapter, we start briefly by categorizing the primary-multiple separation techniques. Then, we introduce the model behind surface-related multiple removal (SRME) and estimation of primaries via sparse inversion (EPSI). Finally, we layout the theory behind the curvelet-domain matched-filtering. 2.1 Primary-multiple separation techniques Multiples removal techniques can be categorized in two main classes: • filtering methods based on move-out and dip discrimination. • data-driven methods based on periodicity and predictability. Filtering methods rely on the fact that multiple events traveled through different rock layers. Hence, they traveled at an apparent different velocity or reflected at different geological structures than the primary events, see Figure 2.1. These meth- ods use a transform, e.g. F-K or Radon transform, to map the seismic data events, primaries and multiplies, into different regions, i.e. mapping different frequency contents or different dip angles into different regions. Then, they eliminate the multiples events and transform the remaining events, the primaries, back to the original time-offset domain (Verschuur, 2002). 8 Figure 2.1: Two reflection events: a primary (solid blue line) and a surface- related multiple (dashed red line). Both events have similar arrival time but have traveled different paths. The primary event has traveled deeper into the subsurface and has seen higher velocity than the multiple event. Therefore, it arrived with a smaller angle at the receiver. Data-driven methods exploit the relation between primaries and multiples events to suppress the repetitive patterns of multiples. These methods involve two steps: a prediction and a subtraction step. First, multiples are predicted and then subtracted through a matching procedure from total data, i.e. primaries, surface-related mul- tiples and internal multiples. Assumptions are needed for both steps. For example the energy after the subtraction step is assumed to be minimum (Verschuur, 2002). 2.2 Surface-related multiples model This section discusses the model behind SRME and EPSI for 1D and 2D situation. We discuss theoretical details for SRME and EPSI in Section 2.3 and Section 3.1, respectively. 9 2.2.1 Surface-related multiples formulation for normal-incidence plane wave in a horizontally stratified medium Consider a horizontal plane wave propagating in a laterally invariant earth. This wave reflects at the subsurface geology to produce surface-free data p0, which contains primaries and internal multiples but no surface-related multiples (blue solid and red dashed lines in Figure 2.2a). At the surface, this surface-free data acts as a new source of events that will feedback into the earth to produce first- order multiples (green dashed lines). The first-order multiples will then feedback into the earth again to produce higher order multiples. The feedback diagram in Figure 2.2b illustrates this process. We can express the surface-free data mathematically as p0(t) = g(t)∗ s(t), where p0 is the surface-free data, g is the surface-free impulse response, ∗ denotes convolution, and s(t) is the source signature. The feedback diagram in Figure 2.2 permits us to formulate the total response with the surface related multiples p as p(t) = g(t)∗ [s(t)− p(t)] = p0(t)−g(t)∗ p(t). (2.1) Figure 2.3 illustrates the generation of surface-related multiples with the source signature shown in Figure 2.3b, for a model with only one-reflection layer, Fig- ure 2.3c. The generated surface-free data p0, which consists of only one primary is shown in Figure 2.3d. This surface-free data feedbacks to the earth and convolves with the surface-free impulse response g to produce first-order multiples m1, Fig- ure 2.3e. The same process produces second-order multiples m2 (Figure 2.3f). Fi- nally, the total response (surface-free data and surface-related multiples) is shown in Figure 2.3g. 2.2.2 Data matrix Our seismic data in the 2D case is three-dimensional: time, source position and receiver position. By considering the earth as a linear time-invariant system, we can represent an experiment, i.e. a shot gather (one source and multiple receivers), 10 (a) (b) Figure 2.2: Feedback diagram for surface-related multiples generation. First, surface-free data is generated by convolving the source signature s+ with surface-free impulse response g. Then, surface-free data feedbacks into the earth to generate higher order multiples. as a superposition of many fully independent monochromatic experiments in the Fourier domain. After taking the shot gather into the frequency domain and se- lecting a single frequency, we obtain a vector of measurements for the different receivers for that frequency. Repeating the process for all shot gathers produces a data matrix, indicated by bold capital P (Berkhout, 1985), whose columns repre- sent common source point gathers and rows represent common receivers gathers. 2.2.3 Two-dimensional formulation Convolving surface-free impulse response G with total upgoing wavefield P− gen- erates surface-related multiples. Figure 2.4 illustrates the surface-related multiples prediction for 2D case interpreted as wave field extrapolation. We start with a source positioned at xs and a set of receivers (geophones) positioned at xk. Our 11 (a) (b) (c) (d) (e) (f) (g) Figure 2.3: a) An illustration that shows the generation of surface-related multiples. b) Source signature s(t). c) Surface-free impulse response g(t). d) Surface-free data p0(t). e) First-order multiples m1(t). f) Second-order multiples m2(t). g) Total response p(t). 12 Figure 2.4: After convolving the source gather from total data P with the common receivers gather from surface-free impulse response G for each point in xk, the convolution result is summed up to produce the predicted surface-related multiple trace at xr. goal is to calculate the wavefield at a receiver positioned at xr, which represent the surface-related multiples. To do that, we need the Green’s function to extrapolate the path for a source positioned at xr and whose energy is measured at xk. The surface-free impulse response G can act as the Green’s function to predict all possible surface-related multiples. First, we convolve the two wavefields, the shot gather from the total data P and the common receivers gather from the surface- free impulse response G, for each point in xk. Then, we sum the convolution result to get the wavefield at xr and that gives the surface-related multiples. In practice, both the data P and the surface free impulse response G are trans- formed to the frequency domain and then the convolution and the summation are repeated over all source-receiver combinations to obtain the predicted surface re- lated multiples M0. If we adapt a similar data matrix notation as in Berkhout (1985), then the multiple prediction can be formulated as a matrix-matrix multipli- cation: M̂0 =−ĜP̂, (2.2) where the minus sign represents the reflection at the surface and the hat is used to represents a monochromatic matrix. 13 Furthermore, we can use the same approach in Equation 2.2 to generalize Equa- tion 2.1 for the 2D case to get P̂ = P̂0+M̂0, (2.3) where P̂0 is the surface-free data and M̂0 is the surface-related multiples. Expand- ing Equation 2.3 yields P̂ = ĜQ̂+ ĜRP̂, (2.4) where Q̂ is the source signature and R is the surface reflectivity, which is ap- proximated to be −I, i.e. P̂ = Ĝ(Q̂− P̂). (2.5) 2.3 Iterative surface-related multiple removal In this section, we aim to give a brief description of the iterative Surface-Related Multiple Removal (SRME) procedure (Verschuur et al., 1992). We start from the surface-related multiples model, Equation 2.4. Now, let A = Q−1R be the sur- face operator. Next, we rewrite the surface-related multiples as M̂0 = ĜRP̂ = ĜQ̂Q̂−1RP̂ = P̂0AP. Substitution this expression in Equation 2.3 yields P̂ = P̂0+ P̂0AP. (2.6) By assuming the source function Q to be omnidirectional, i.e. Q̂ = Iq̂ with q̂ the discretized Fourier representation of the source function, the above equation can be solved iteratively as, P̂(n+1)0 = P̂−A(n+1)P̂(n)0 P, (2.7) where n is the iteration index. The iterative procedure starts with approximating the surface-free data with total data, i.e. P̂(0)0 = P. Next, it estimates A in a matching process. This iterative procedure (Equation 2.7) matches predicted multiples to total data, which may lead to throwing out multiples energy. It also requires regular sampling. 14 2.4 Curvelet-domain matched-filtering Many seismic processing techniques depend on matching seismic wavefield am- plitudes whether for adaptive subtraction of multiples (Verschuur et al., 1992; Her- rmann et al., 2008b) or ground roll (Yarham and Herrmann, 2008) predictions from total data or for restoration of migration amplitudes through image to migrated- demigrated-image matching (Herrmann et al., 2008a). For a successful matching, we need to control a possible overfitting that may inadvertently lead to primary energy removal. Furthermore, we need to handle conflicting dips and apply the separation after the matching stably. We will see in this section that curvelet sat- isfies all these conditions by its ability to decompose seismic data sparsely into position, scale and dip dependant coefficients and by imposing smoothness on its coefficients in the phase space (Herrmann, 2008a). In Herrmann (2008a), the curvelet-domain matched-filtering is based on two assumptions: • A global Fourier matching removes the stationary differences between the two matched wavefields. The Fourier matching removes only amplitude- spectra mismatches and global kinematic errors and fails to remove errors that vary spatial as a function of dip. In SRME (Verschuur et al., 1992), these stationary differences come from the convolution of the source signa- ture during the multiples prediction step. • The remaining non-stationary differences, which come from the non-ideal circumstances in the prediction, e.g. the absence of 3D effects, vary smoothly as a function of position and dip along wavefronts. After compensating for the stationary differences via global Fourier matching procedure, i.e. no kinematic errors, we can approximate the non-stationary differ- ences mathematically by the operatorΨ, whose action on a d-dimensional function f is given by (Ψ f )(x) = ∫ Rd e jk·xa(x,ζ ) f̂ (ζ )dζ , (2.8) where x,ζ are the spatial coordinate and wavenumber vectors, f̂ (ζ ) is the Fourier transform coefficients of f (x), and a(x,ζ ) is a space- and spatial-frequency depen- dent filter (Herrmann et al., 2008a). 15 In practice, the operator Ψ, which after discretizing becomes a full positive- definite matrix, acts on 2D images, e.g. shot gathers or common receivers gathers. We usually have two images to match, g and f . In the context of SRME, f would be predicted multiples and g would be total data. Then, the mismatched can be modeled as a matrix-vector multiplication, i.e., g =Ψ f . (2.9) Following Herrmann et al. (2008a), we approximate the action of Ψ operator by a positive diagonal scaling in the curvelet domain (Ψ f )(x)≈ C∗DΨC f (x), (2.10) where C is the forward curvelet transform and C∗ is its adjoint, which is also its pseudo inverse since we are using curvelet-domain via wrapping (Candes et al., 2006), which is a tight frame with C∗C= I. The diagonal matrix DΨ approximates the symbol a(x,ζ ) evaluated at the curvelet centers. After substituting Equation 2.10 into Equation 2.9, the problem of finding the action of the operator Ψ reduces to finding the elements of the curvelet diago- nal scaling matrix DΨ, which we calculate by solving the following least-squares problem argmin z 1 2 ||g−C∗diag(C f )z||22, (2.11) where z represents the diagonal elements of DΨ. Many issues complicate the estimation of the curvelet diagonal scaling of Equa- tion 2.11. The forward model is underdetermined, which leads to mapping many scaling coefficients to the same image under the adjoint operator C∗. There is also the possibility of overfitting that may lead to primary energy removal or incorrect amplitude corrections. To solve these issues, a smoothness constraint is imposed on the diagonal scaling to give argmin z 1 2 ||g−C∗diag(C f )z||22+λ ||Lz||22, (2.12) 16 with L = [LT1 L T 2 L T θ L T scale] T , (2.13) where λ is the trade off parameter and L is the sharpening operator, which pe- nalizes fluctuations among neighboring coefficients in the diagonal scaling z. The sharpening operator penalizes the scaling coefficients of each wedge (Figure 1.5) along the first and second direction, L1 and L2. It also penalizes coefficients be- tween adjacent wedges at the same scale Lθ , and between wedges of adjacent scales Lscale, see Herrmann et al. (2008a); Shahidi and Herrmann (2009). To ensure positivity of the diagonal scaling, it is suggested in Herrmann et al. (2008a) to replace z with ez. This is not implemented in the curvelet-domain matched-filtering used to produce our result and we leave its implementation to future works. 2.5 Motivation To illustrate the potential benefit of adding curvelet-domain matching, let us con- sider a common-offset section from North Sea field dataset. Figure 2.5 shows the results of primaries estimation from two different multiples-elimination methods, namely Surface-Related Multiple Elimination (SRME, Figure 2.5b), and Bayesian threshold (Figure 2.5c, Saab et al. (2007)), which uses curvelet-domain matched- filtering. Comparing the primary estimations from SRME and Bayesian thresh- old reveals improved continuity and amplitude preservation for the primaries (near the lower two arrows in each plot around 2.6 and 3.1s) of the Bayesian threshold method. Furthermore, the strong residual of the first- and second-order water bot- tom multiples in the shallow part (near the top two arrows in each plot around 0.75 and 1.2s) are better suppressed. 17 (a) (b) (c) Figure 2.5: A common-offset section from North Sea field dataset with re- sults from two different multiples-elimination methods. a) Total data plotted with automatic-gain control. b) Primary estimation from one term SRME. c) Scaled Bayesian iterative threshold primaries estimate. Adapted from Herrmann et al. (2008b). 18 Chapter 3 Convexified EPSI and curvelet-domain matched-filtering In this chapter, we layout the theory behind estimation of primaries by sparse in- version (EPSI), before incorporating a matching step into the convexified EPSI procedure. Will incorporating a matching step into EPSI optimization problem give better results than using EPSI alone, knowing that using EPSI one would match multiples to multiples instead of matching predicted multiplies to the total data as in SRME? The remainder of this thesis is dedicated to answering this question. 3.1 Estimation of primaries by sparse inversion EPSI is based on the same model as SRME. However, it estimates primaries in an inversion process rather than in a prediction and a subtraction process. Unlike SRME, EPSI is not sensitive to limited sampling because it can reconstruct missing offset during the inversion process. EPSI tries to change the paradigm that sees the surface-related multiples as noise, which needs to be removed from the data prior to imaging. It proposes an operator, which is based on the underlying principle that relates the primary im- 19 pulse response to total upgoing wavefield that includes the source signature and surface related multiples. EPSI monochromatic relationship is expressed mathe- matically as: Ĝ︸︷︷︸ surface-free impulse response downgoing wavefield︷ ︸︸ ︷[ Q̂+RP̂ ] ≈ upgoing wavefield︷︸︸︷P̂ , (3.1) where the hat indicates a monochromatic representation of a wavefield arranged into a matrix, see Section 2.2.3 for details. The matrix product of any two-hatted wavefields corresponds to a non-stationary convolution in the time domain. The original EPSI (van Groenestijn and Verschuur, 2009) assumes the impulse response G to be sparse in the time domain and the source function Q is omni- directional, i.e. a constant-source for all shots or Q̂ = Iq̂ with q̂ the discretized Fourier representation of the source function. Sparsity is enforced on the updates of the surface-free impulse response by a zero norm. An increasing time window in each iteration is placed over these updates in which the biggest events per trace are selected. Increasing the window size improves the convergence. A Fourier smoothness (shortness) is also enforced on the source function. We obtain a reasonable estimate for primary impulse response G by choosing the largest τ (sparsity level) elements of the gradient and setting the rest to zeros. We keep repeating this process till we obtain the desired image of primaries. 3.1.1 Convexified EPSI Many issues affect the practical adoption of the original EPSI formulation. It is difficult to estimate many of its inversion parameters, e.g. the sparsity level of the primaries in each iteration or the precise knowledge of the time window containing them. The original EPSI also relies on the zero norm which makes it unstable. Lin and Herrmann (2009) address these issues via a `0 to `1 convexification, where the one norm on the surface-free impulse response is minimized alternately with the two-norm on the source function. Introducing `1 convexification will not only preserve sparsity on the Green’s function but will also stabilize the problem. It will also eliminate most the inversion parameters. The time window, which is used to prevent the inversion from being trapped in a local minima, is no longer needed 20 since convex problems have no local minima. The sparsity level of primaries is replaced with a more practical and easy to estimate parameter σ , which is related to the noise level of measured data, see Equation 3.3. 3.1.2 Bi-convex optimization EPSI optimization involves two alternating optimization problems that solve for the unknown surface-free Green’s function G and the unknown source function Q. To be consistent with the notation of optimization problems, indicated by lower bold face letters e.g. p = vec(P) in our notation, we rewrite Equation 3.1 in a vectorized form in this bi-linear forward model formulation E[Q̂]g :=F ∗t blockdiag ([ Q̂+RP̂ ]∗ 1···n f ⊗ I ) Ftg = p, (3.2) where E[Q̂] is a linear operator, which depends on the source signature Q and the upgoing wavefield P that acts on the vectorized surface-free Green’s function g, ∗ denotes the conjugate transpose, and ⊗ denotes the Kronecker product. This expression reformulates the matrix-matrix product into a matrix-vector product. Ft is the Fourier transform, whose action on a vectorized wavefield g is defined as Ftg := ĝ = [ĝ1, ĝ2, · · · , ĝn f ]∗ with n f the number of frequencies and F ∗t is the inverse Fourier transform operator that brings the vectorized wavefield back to the time domain. The blockdiag lays out each frequency slice (1 · · ·n f ) in a vectorized form and p is the vectorized upgoing wavefield. For now, we will assume R =−I. Equation (3.2) is linear in both the surface-free Green’s function and the Fourier coefficients of the source function. This permits us to invert for these two un- knowns by alternatingly solving two optimization problems. First, we solve for free impulse response g subject to data fitting, i.e. the one norm optimization prob- lem on the unknown surface-free Green’s function, g̃ = argmin g ||g||1 subject to ||p−E[Q̂]g||2 ≤ σ , (3.3) where g̃ is the vectorized estimation (estimations are indicated by )̃ for the surface- free Green’s function and σ is the residual, which is linked to the noise level of the data. Then, we solve the regularized least-squares problem for the unknown source 21 function, ˜̂q = argmin q 1 2 ||ỹ−B[ ˜̂G]q̂||22+λF ||LF q̂||22, (3.4) where ỹ = vec([P̂+ ˜̂GP̂]1···n f ) is the data vector with ˜̂G = vec−1(g̃) the current es- timate for the surface-free Green’s function and B[ ˜̂G] := blockdiag( ˜̂G1···n f ⊗ I). In this expression, we assume the source to be omnidirectional, i.e. Q1···n f = Iq1···n f The sharpening operator LF in the `2 penalty term imposes smoothness on the es- timated Fourier coefficients of the source function ˜̂q, which corresponds to enforc- ing decay in the time domain. The trade off parameter λF balances Fourier domain smoothness versus data misfit to avoid overfitting, which may lead to leakage of multiples into the source function. In this bi-convex optimization, we start with estimate for the source function and then solve for the surface-free Green’s function by an one-norm optimization problem, which seeks a sparse vector that after convolution with the downgoing wavefield explains the total upgoing wavefield, see Equation (3.3). Given this sparse estimation for the Green’s function, the regularized least-squares problem is solved that seeks the Fourier coefficients of the source function, see Equation (3.4). A sparsfying Transform To further enforce the sparsity assumption, we can define a suitable sparsifying transform on the estimated surface-free Green’s function. In Lin and Herrmann (2009), they define this sparsifying transform as a Kronecker product between the 2D curvelet transform along source-receiver coordinates, and the wavelet trans- form along time coordinate, i.e. S := C⊗W, with C, W the curvelet and wavelet transform operators, respectively. The synthesis operator S∗ then sits between E[Q̂] and g in Equation 3.3, and the solution seeks the sparsest set of transform domain coefficients of the surface-free Green’s function Sg̃ instead of the sparse Green’s function g̃. Due to the redundancy of the curvelet transform, about 8 times redundant in 2D case, and the high demand for memory of our implementation of Algorithm 1, see Section 3.2, the above sparsifying transform is not used in our matching. 22 3.2 EPSI with curvelet-domain matched-filtering EPSI implicitly assumes that the source is omnidirectional, i.e. constant source for all shot gathers; absence of 3D effects; infinite aperture; and ideal total surface reflectivity, i.e. R = −I. Although in principle EPSI can be extended to 3D, it is been shown that this assumption is more problematic to handle in practice as expe- riences with SRME-based method have shown (Lin et al., 2004). Deviation form these assumptions may lead to detrimental effects on the quality of the estimated surface-free data. Following the successful application of curvelet-domain matched-filtering within the SRME procedure, we propose to incorporate a matching step into our bi-convex optimization procedure to mitigate possible adverse effects that stem from ampli- tude mismatches that vary smoothly as a function of position and dip along the predicted wavefronts of the current estimate for the surface-free data. We include a non-trivial surface reflectivity operator, i.e. R 6= −I, that will absorb amplitude errors of non-ideal reflections at the surface, finite-aperture, and other unknown effects. We start with the assumption that the operator R is symmetric positive defi- nite and pseudo local (no kinematic shifts). Then, we approximate this space-dip- dependent filter by a simple curvelet domain scaling R≈C∗diag(z)C. (3.5) After that, the diagonal curvelet scaling is estimated in the regularized least-squares problem Equation 2.11 that matches multiples to multiples and not predicted mul- tiples to total data as in SRME. 3.2.1 Frequency-domain monochromatic matching Our first attempt to incorporate the surface reflectivity operator R into our formula- tion was to assume that this operator is frequency independent, even though it can be easily extended to include angular frequency dependency. Then we incorporate this operator in Equation 3.1 to get P̂− ĜQ̂ = ĜC∗diag(z)CP̂, (3.6) 23 which we can solve by least-squares to get the curvelet scaling z: argmin z 1 2 ||ũ−M̂z||22+λ ||Lz||22, (3.7) where M̂ := ˜̂GC∗diag(CP̂) and ũ := vec([P̂− ˜̂G˜̂Qi])with i the index corresponding to the frequency for which ˜̂q is maximum. This method solves for the curvelet scaling fast, 200x faster than the method proposed in the next section, and does not demand large memory to execute. How- ever, it does not give the desired space-dip dependent amplitude corrections corre- sponding to the symbol of the pseudo differential operator a(x,ζ ) in Equation 2.8 since it is acting on a frequency slice that has two spatial coordinates and no time coordinate. To solve this problem, the scaling should act on a spatial-time slice, e.g. a shot gather or a common receivers gather. 3.2.2 Time-domain matching In this approach, we apply the spatial-dip dependant operator R to each shot record of the upgoing wavefield P separately, to give the desired spatial-dip dependent amplitude corrections. Then, we take the scaled upgoing wavefield to the Fourier domain and substitute it into EPSI monochromatic relationship Equation 3.1, to get vec(P̂− ˜̂G ˜̂Q) = ĜM̂z1···ns , (3.8) where M̂i = C∗diag(CPi) with i the shot gather index and ns is the number of sources. Our operator R now acts on each shot record of the upgoing wavefield P separately. This new approach seeks a smooth positive spatial-dip dependent curvelet- domain scaling for all the shot gathers of the total upgoing wavefield simultane- ously, which makes it time and memory demanding approach, especially with the curvelet redundancy of about 8 times for the 2D case. Incorporation of this expression into our formulation, yields a tri-convex opti- 24 mization problem that now includes curvelet-domain matching: z1···ns = argmin z1···ns 1 2 ||ũ−Nz1···ns ||22+λ ||Lz1···ns ||22, (3.9) where N := blockdiag([ ˜̂G ˜̂M]1···n f ⊗ I) and ũ := vec(P̂− ˜̂G ˜̂Q]1···n f ). The trade-off parameter λ and the sharpening operator L are of Equation 2.12 respectively and ns and n f are the number of sources and the number of frequencies respectively. To solve this tri-convex optimization, we start by assuming R = −I. Then, given an estimate for the source signature we solve Equation 3.3 for the sparse surface-free Green’s function. After that, we substitute our Green’s function esti- mate into Equation 3.4, and solve the least-squares problem to obtain a new source estimate. Now that we have an estimate for both G and Q, we can start the matching step by assuming R 6=−I and solve Equation 3.9 for the diagonal curvelet-domain scaling. The new estimate for the surface reflectivity operator R is then used in the next estimations of the surface-free Green’s function and the source function, see Algorithm 1. Algorithm 1 Stabilized Estimation of Primaries by Sparse Inversion with Curvelet Domain Matched Filtering Result: Estimate for g and Q choose noise level σ , iteration increment ρ g0←− 0, Q0←− 0, R0 ←−−I, k←− 1 while ‖p−E[Q̂,R]g‖2 ≥ σ do gk ← solve (3.3) using initial guess Qk−1,Rk−1 with SPG`1 1at least ρk itera- tions. Qk ← solve (3.4) for the source signature using gk,Rk−1 with LSQR z← solve (3.9) with LSQR Rk ← C∗diag(z)C end while g← gk Q← Qk 1see van den Berg and Friedlander (2008) 25 Chapter 4 Observations and results This chapter studies the factors affecting the outcome of incorporating a matched- filtering step into EPSI. We conclude this chapter by showing our results. 4.1 Methodology 4.1.1 Synthetic data We consider synthetic time-domain finite-difference acoustic forward modeling data with 128 shot gathers each with 128 receivers (traces). Each trace has 256 time samples and a time sampling interval of 6.4ms; and the distance between any two sources or receivers is 20m. Delphi f dacmod was used to generate this data, see Table 4.1 for detailed parameters. We assume the data to be receiver ghost free. Unless otherwise stated, all the shot gathers shown in this section are the middle shot gather, i.e. shot index of 64 (offset = 1260m). 4.1.2 Results evaluation We want to compare the primary multiples separation results of EPSI with and without the curvelet-domain matched-filtering step. However, since we do not have the true primaries to evaluate the separation results, e.g. with signal-to-noise ratio, we rely on prior knowledge of the locations of these primaries to evaluate our results. 26 Table 4.1: Synthetic time-domain finite difference acoustic forward modeling data parameters. Parameter Value Parameter Value time sample interval 6.4ms sources depth 5m samples/trace 256 receivers depth 5m number of sources 128 grid size (dx) 5m number of receivers 128 grid size (dz) 5m wavelet type Ricker wavelet model size (nx) 509 frequency range 0-60Hz model size (nz) 251 source type monopole The matching step will be evaluated by the smoothness and the positivity of the curvelet-domain matched-filtering diagonal scaling. 4.2 Factors affecting our results In this section, we study the factors that control the output of the matching process, namely the number of least-squares iterations η and the trade off parameter λ which control the smoothing and positivity of the curvelet-domain scaling. We will also answer the question of when and how frequent should we do the matching. Before experimenting with these factors, we present EPSI results without match- ing of the synthetic data described in Table 4.1. Figure 4.1a shows the zero-offset section of total upgoing wavefield. After applying EPSI to this data, it converges after 22 alternating optimization loops to give an estimate for the primaries (Fig- ure 4.1b) and an estimate for the source wavelet (Figure 4.1d). The multiples es- timate (Figure 4.1c) is calculated by subtracting the primary estimate form total upgoing wavefield. 4.2.1 The smoothing and positivity of the curvelet-domain diagonal scaling In Section 2.4, the second assumption behind the curvelet-domain matched-filtering states that the non-stationary differences vary smoothly as a function of location and dip and then it can be modeled mathematically by a matrix-vector multiplica- 27 (a) (b) (c) (d) Figure 4.1: Zero-offset EPSI synthetic data results with no matching. a) To- tal upgoing wavefield section (P̂). b) Primaries estimation (P̂0). c) pri- maries estimation minus the total data (P̂− P̂0 = −ĜP̂), i.e. the multi- ples estimation d) Final estimated source wavelet (Q̂) tion of a positive definite matrix on a reference vector, see Equation 2.9. To test the effects of the number of least-squares matching iterations η of Equation 3.9 and the trade off parameter λ on the positivity and smoothness of the curvelet-domain scaling vector z1···ns , we carry out two experiments on which we fix one of the parameters and allow the other to vary. Then, we compare the smoothness of the curvelet-domain scaling of the middle shot gather (offset = 1260m) with a mosaic plot, see Figure 1.5 for detailed description. We also show 2D-plots of the same curvelet-domain scaling to judge the positivity. After that we present the final zero-offset primary multiple separation results, i.e. the estimated 28 primaries P̂0 and the calculated multiples M̂0. Fixing λ and varying η In this experiment, we test the effect of the number of curvelet-domain matched- filtering iteration η on the positivity and smoothness of the curvelet-domain scal- ing vector by fixing the trade off parameter λ = 0.46 and varying the number of iteration η = 10,35 and 100. Figure 4.2a, 4.2c and 4.2e show mosaic plots of the curvelet-domain scaling coefficients for η = 10,35 and 100, respectively and Figure 4.2b, 4.2d and 4.2f show the corresponding 2D-plots of the same coefficients. Since the surface reflec- tivity coefficients are negative, approximated by R=−I in EPSI without matching, and to avoid the confusion with literature (Herrmann et al., 2008a; Yarham and Herrmann, 2008) where the curvelet-domain scaling vector z1···ns is positive, we will plot −z1···ns instead of z1···ns . The corresponding final primaries and multiples estimations are shown in Figure 4.3(a)-(f). As the number of iterations increases, the curvelet-domain scaling vector be- comes more smooth and positive as shown in Figure 4.2. The smoothness and positivity also affect the final EPSI output. In Figure 4.3a, we notice a significant leakage of multiples into the zero offset primaries section, indicated by the small black arrows at around 0.48, 0.75, 0.8 and 1s. However, in Figure 4.3b, these indicated multiples are either eliminated or faded. Figure 4.3c has slightly less multiples than Figure 4.3b. For example, the multiple, indicated by the third arrow, has been eliminated. Figure 4.3(d)-(f) show the same effects, so as the number of iterations increases the multiples become more preserved. Increasing the number of iterations produces a more smooth and positive curvelet- domain scaling vector, which results in a better primary-multiples separation. How- ever, that comes at the expense of execution time. Although we are utilizing the parallel execution and the distributed arrays of MATLAB in our implementation of Algorithm 1, one matching iteration still takes about 10 minutes to execute on 5 nodes each with 8GB memory shared between 2 workers. 29 (a) λ = 0.46,η = 10 (b) λ = 0.46,η = 10 (c) λ = 0.46,η = 35 (d) λ = 0.46,η = 35 (e) λ = 0.46,η = 100 (f) λ = 0.46,η = 100 Figure 4.2: Mosaic and 2D plots for the curvelet-domain matched-filtering scaling for one shot (offset = 1260m) for λ = 0.46 and η = 10,35 and 100. The λ and η used to generate each subfigure are stated underneath it. . 30 (a) λ = 0.46,η = 10 (b) λ = 0.46,η = 35 (c) λ = 0.1,η = 100 (d) λ = 0.46,η = 10 (e) λ = 0.46,η = 35 (f) λ = 0.46,η = 100 Figure 4.3: Zero-offset synthetic data results for λ = 0.46 and η = 10, 35 and 100. The λ and η used to generate each subfigure are stated under- neath it. a-c) zero-offset primaries estimates. d-f) zero-offset multiples estimates. 31 Fixing η and varying λ This experiment aims to test the effect of the trade off parameter λ on the positiv- ity and smoothness of the curvelet-domain scaling vector by fixing the number of matching iterations η = 35 and allowing the trade off parameter to vary λ = 0.046, 0.46 and 46.4. The mosaic plots of the curvelet-domain scaling coefficients for λ = 0.046, 0.46 and 46.4 are shown in Figure 4.4a, 4.4c and 4.4e respectively. Next to each mosaic plot, a 2D-plot of the same coefficients is shown, Figure 4.4b, 4.4d, and 4.4f. The corresponding final primaries and multiples estimations are shown in Figure 4.5(a)-(f). Increasing the trade off parameter λ will obviously make the curvelet-domain scaling smoother and more positive, see Figure 4.4, but λ also balances the curvelet- domain smoothness versus data misfit to avoid overfitting, which may lead to leak- age of multiples into primaries estimate. The later point is obvious in Figure 4.5, where we notice severe leakage of multiples (Figure 4.5f) into the primaries (Fig- ure 4.5c). Figure 4.5a has slightly less multiples than Figure 4.5b especially at the deepest parts near the lowest three arrows. A large value of the trade off parameter λ and the number of matching iter- ations η are desirable, but if not chosen correctly might lead to severe leakage of multiples into primaries estimate or to longer execution time. Hence, in order to practically incorporate the matching step, we need to choose these parameters wisely to avoid data overfitting and timely execution. 4.2.2 When and how frequent should we do the matching? In this section, we carry out different experiments to determine the best way to incorporate curvelet-domain matched-filtering step into EPSI. We present these experiments starting with the worst result. Then, we evaluate the quality of each result by comparing it to EPSI result with no matching presented in Figure 4.1. In all the experiments, the number of iteration η and the trade off parameter λ are fixed, η = 30 and λ = 0.46. Before we carry out the experiments, we will take a look at different alternat- ing optimization iterations snapshots (Figure 4.6). In particular, we look at the 32 (a) λ = 0.046,η = 35 (b) λ = 0.046,η = 35 (c) λ = 0.46,η = 35 (d) λ = 0.46,η = 35 (e) λ = 46.4,η = 35 (f) λ = 46.4,η = 35 Figure 4.4: Mosaic and 2D plots for the curvelet-domain matched-filtering scaling for one shot (offset = 1260m) for η = 35 and λ = 0.046, 0.46 and 46.4. The λ and η used to generate each subfigure are stated underneath it. 33 (a) (b) (c) (d) (e) (f) Figure 4.5: Zero-offset synthetic data results for η = 35 and λ = 0.046, 0.46 and 46.4. The λ and η used to generate each subfigure are stated under- neath it. a-c) zero-offset primaries estimates. d-f) zero-offset multiples estimates. 34 primaries estimates for different iterations P̂0 = ĜQ̂, Figure 4.6a- 4.6d, the mul- tiples estimations calculated by subtracting the primaries estimates from the total upgoing wavefield MS0 = P̂− ĜQ̂, Figure 4.6e- 4.6h, and finally at the multiples estimates generated by convolving the Green’s functions with the total upgoing wavefield M̂C0 − ĜP̂, Figure 4.6i- 4.6l. Referring to Figure 4.6, we notice the importance of the first few iterations; they define the way that the upcoming iterations follow. After these few iterations, we notice only small changes in the inversion process, i.e. small changes in P̂0, M̂S0 and MC0 . For example, there is a slight change between the primaries estimate after the sixth iteration (Figure 4.6c) and after the twenty-first iteration (Figure 4.6d) compared to a significant change between the primaries estimate after the first iter- ation (Figure 4.6a) and after the third iteration (Figure 4.6b). Notice also that after these few iterations, the two multiples estimates, M̂S0 and M̂C0 start to become closer to each other. In curvelet-domain matched-filtering, we are trying to estimate the surface-reflectivity by taking advantage of EPSI by matching multiples to multiples instead of predicted multiples to total data like in SRME. Hence, it is desirable to have M̂S0 close to M̂ C 0 , but how close? That is what we will try to find out in the following experiments. Only one matching at the first alternating optimization loop In this experiment, we only solve one curvelet-domain matched-filtering optimiza- tion problem, Equation 3.9, right after we get the first estimation of the Green’s and the source function. Then, we use the estimated curvelet-domain scaling in all the upcoming iterations. Figure 4.7a- 4.7b show the zero-offset primaries and multiples estimates, re- spectively. There is obvious leakage of multiples into the primaries estimate, indi- cated by the small black arrows. The primaries estimate after the first alternating loop is still unstable and will go under a lot of changes during the first few loops, which might lead to this leakage. On the other hand, we notice a leakage of pri- maries into the multiples estimate, which might come form the fact that the two multiples M̂S0 and M̂ C 0 are far apart from each other when we do the matching. Therefore, incorporating a matched-filtering step into EPSI in the first few iter- 35 (a) Iteration = 1 (b) Iteration = 3 (c) Iteration = 6 (d) Iteration = 21 (e) Iteration = 1 (f) Iteration = 3 (g) Iteration = 6 (h) Iteration = 21 (i) Iteration = 1 (j) Iteration = 3 (k) Iteration = 6 (l) Iteration = 21 Figure 4.6: a-d) Primaries estimates at different iterations. e-h) Multiples es- timates calculated by subtracting the primaries estimate for the total data P̂− ĜQ̂. i-l) Multiples estimates calculated by convolving the Green’s function with the total upgoing wavefield −ĜP̂. 36 (a) (b) Figure 4.7: Zero-offset results of EPSI with only one matching step at the first alternating optimization loop. a) Zero-offset primaries estimate. b) Zero-offset multiple estimate calculated by subtracting the primaries estimate form the total upgoing wavefield. ations produces an undesirable leakage of events between primaries and multiples estimates. Multiple matching starting from the second alternating optimization loop In this experiment, we solve six curvelet-domain matched-filtering optimization problems at even loops starting from the second loop, i.e. in loops indexed 2, 4, ..., 12. The zero-offset primaries and multiples estimates are shown in Figure 4.8a- 4.8b, respectively. We immediately notice that the primaries are better preserved 37 (a) (b) Figure 4.8: Zero-offset results of EPSI with six matching steps starting after the second alternating optimization loop (even loops only). a) Zero- offset primaries estimate. b) Zero-offset multiple estimate calculated by subtracting the primaries estimate form the total upgoing wavefield. and there is much less primaries leaking into the multiples. However, there are ob- viously still multiples leaking into the primaries estimate, indicated by the arrows. This leakage might be coming form the first two matching steps at the second and fourth loop where the inversion process still goes under significant changes. This result is better than the previous one. However, solving more than one matched-filtering optimization problem is impractical. Solving one matched-filtering optimization problem takes longer than solving the same problem without the matching step. 38 (a) (b) Figure 4.9: Zero-offset results of EPSI with only one matching step at the twenty first alternating optimization loop. a) Zero-offset primaries es- timate. b) Zero-offset multiple estimate calculated by subtracting the primaries estimate form the total upgoing wavefield. Only one matching near the last alternating optimization loop In this experiment, we only solve one matched-filtering optimization problem near the last loop, loop 21, just before the inversion problem converges. The zero-offset primaries and multiples estimates are shown in Figure 4.9a- 4.9b, respectively. This result is very close to the original result with no curvelet- domain matched-filtering but it suffers from weak multiples events leaking into the primaries estimate, indicated by the black arrows. In the last few loops, the inversion problem goes through small changes and the two multiples estimates M̂S0 and M̂C0 are very close and there is no much that the matching step can correct. 39 (a) (b) Figure 4.10: Zero-offset results of EPSI with only one matching step at the twelve alternating optimization loop. a) Zero-offset primaries esti- mate. b) Zero-offset multiple estimate calculated by subtracting the primaries estimate form the total upgoing wavefield. Only one matching at the middle alternating optimization loop In this experiment, we solve one matched-filtering optimization problem near the middle loop, loop 12. At this stage in the inversion problem, the dramatic changes of the first few iterations have already stabilized and we still have many upcoming iterations before the inversion problem exits. Figure 4.10a- 4.10b show the zero-offset primaries and multiples estimates, respectively. This result is better than all previous results including the one without curvelet-domain matched-filtering one. We notice less multiples leakage into the primaries estimate especially where the two black arrows indicate. 40 4.2.3 Optimized result We will conclude this section by first presenting an optimized result, a result where λ and η are chosen wisely and the matched-filtering optimization is solved only once in the middle loop, and then compare it to EPSI result without the curvelet- domain matched-filtering. The optimized result is shown in Figure 4.11. Figure 4.11a- 4.11c show the zero-offset section of the total upgoing wavefield, the estimate for the primaries and the calculated multiples, respectively. The estimated source wavelet is shown in Figure 4.11d. This result is better than EPSI without the curvelet-domain matched-filtering step presented in Figure 4.1. By comparing this result primaries estimate (Fig- ure 4.11b) to the one without matching estimated primaries (Figure 4.1b), we no- tice that the multiple event indicated by the top arrow is now weaker and more interestingly the other multiple indicated by the bottom arrow is no longer there. The number of iterations (η = 60) and the trade off parameter (λ = 0.49) are chosen to avoid overfitting the data and to solve the least-squares optimization problem in a reasonable time and at the same time does not sacrifice the positivity and smoothness of the curvelet-domain scaling vector. The matching step is incor- porate at the twelve loop away from the instability of the few first iteration to avoid leakage between the primaries and the multiples. 41 (a) (b) (c) (d) Figure 4.11: Zero-offset EPSI synthetic data result with curvelet-domain matched filtering incorporated into loop 12. The number of iterations η = 60 and the trade off parameter λ = 0.46. a) Total upgoing wave- field section (P̂). b) Primaries estimation (P̂0). c) primaries estimation minus the total data (P̂− P̂0 = −ĜP̂), i.e. the multiples estimation d) Final estimated source wavelet (Q̂) 42 Chapter 5 Conclusion and future works 5.1 Conclusion The aim of this thesis is to incorporate a curvelet-domain matched-filtering step into the formulation of EPSI and investigate the factors that affect the outcome of this matching step on the outcome of EPSI. We started by leveraging the curvelet transform ability to represent seismic data and images sparsely as a function of location, scale and dip to end up with a solid matched-filtering formulation that has the ability to control a possible overfitting that might lead to primary energy removal. Then we incorporated this formulation into EPSI formulation to produce a tri- convex optimization problem in which after we invert for the Green’s and source function, we solve our matching-filtering problem. After that we looked at the different factors that affect the outcome of our inversion namely, the number of iteration η , the value of the trade off parameter λ and when and how frequent should we solve the matched-filtering problem. We also saw that the number of iterations and the value of the regularization parameter should be chosen wisely to balance the smoothness and positivity of the curvelet- domain scaling versus data overfitting and long execution time. Then, we concluded by showing that solving the curvelet-domain matched- filtering step around the middle alternating optimization loop gives the best results since in that stage the inversion problem have already stabilized and the dramatic 43 changes of the first few iterations have already done. 5.2 Future works In this section we try to improve our method by making it faster. We suggest two ways to do that either by changing the problem formulation or by imposing frequency regularization over the curvelet-domain scaling vector. New formulation The problem with the current formulation Equation 3.9 is that we have to solve for the curvelet domain scaling for the whole data at the same time which requires a great deal of time and memory. We start by assuming R =−I and then change the formulation a bit to get P−GQ =−C∗diag(z)CGP. (5.1) This formulation is solved for each shot gather separately which makes it fast and easy to parallelized. However, this formulation does not have a physical meaning like matching for the reflectivity coefficients (Equation 3.9). Frequency regularization In Shahidi and Herrmann (2009), they show that by imposing a smoothness over the curvelet-domain scaling vector in the frequency domain, faster convergence is realized. By incorporating this in our formulation we could shorten the execution time of Equation 3.9. 44 Bibliography A. Berkhout. Seismic migration : imaging of acoustic energy by wave field extrapolation. Elsevier, 1985. → pages A. Berkhout. Multiple removal based on the feedback model. The Leading Edge, 18:127–131, 1999. → pages E. Candes, L. Demanet, D. Donoho, and L. Ying. Fast discrete curvelet transforms. Multiscale Modeling and Simulation, 5:861–899, 2006. → pages E. J. Candes and D. L. Donoho. Curvelets: a surprisingly effective nonadaptive representation of objects with edges. In in Curve and Surface Fitting: Saint-Malo. University Press, 2000. → pages M. R. Gadallah and R. L. Fisher. Applied Seismology: A Comprehensive Guide to Seismic Theory and Application. PennWell Corporation, 2005. → pages G. Hennenfent and F. J. Herrmannn. Seismic denoising with nonunformly sampled curvelets. Computing in Science and Engineering, 8:16–25, 2006. → pages F. J. Herrmann. Curvelet-domain matched filtering. In SEG Technical Program Expanded Abstracts, volume 27, pages 3643–3647. SEG, November 2008a. → pages F. J. Herrmann. Curvelet-domain matched filtering. In SEG Technical Program Expanded Abstracts, volume 27, pages 3643–3647. SEG, November 2008b. → pages F. J. Herrmann and G. Hennenfent. Non-parametric seismic data recovery with curvelet frames. Geophysical Journal International, 173:233–248, 2008. → pages 45 F. J. Herrmann, P. P. Moghaddam, and C. C. Stolk. Sparsity- and continuity-promoting seismic image recovery with curvelet frames. Applied and Computational Harmonic Analysis, 24:150–173, 2008a. URL http://slim.eos.ubc.ca/Publications/Public/Journals/HerrmannACHA07.pdf. Accepted for publication in the Journal of Applied and Computational Harmonic Analysis. → pages F. J. Herrmann, D. Wang, and D. J. Verschuur. Adaptive curvelet-domain primary-multiple separation. Geophysics, 73(3):A17–A21, 2008b. doi:10.1190/1.2904986. URL http://slim.eos.ubc.ca/Publications/Public/ Journals/herrmann08acd/paper html/paper.html. → pages D. Lin, J. Young, Y. Huang, and M. Hartmann. 3d srme application in the gulf of mexico. SEG Technical Program Expanded Abstracts, 23(1):1257–1260, 2004. doi:10.1190/1.1851098. URL http://link.aip.org/link/?SGA/23/1257/1. → pages T. Lin and F. J. Herrmann. Unified compressive sensing framework for simultaneous acquisition with primary estimation. In SEG, 2009. → pages R. Neelamani, A. I. Baumstein, D. G. Gillard, M. T. Hadidi, and W. L. Soroka. Coherent and random noise attenuation using the curvelet transform. The Leading Edge, 27:240–248, 2008. → pages C. C. Paige and M. A. Saunders. Lsqr: An algorithm for sparse linear equations and sparse least squares. ACM Trans. Math. Softw., 8:43–71, March 1982. ISSN 0098-3500. doi:http://doi.acm.org/10.1145/355984.355989. URL http://doi.acm.org/10.1145/355984.355989. → pages R. Saab, D. Wang, O. Yilmaz, and F. Herrmann. Curvelet-based primary-multiple separation from a bayesian perspective. In SEG International Exposition and 77th Annual Meeting, 2007. URL http: //slim.eos.ubc.ca/Publications/Public/Conferences/SEG/2007/saab07seg.pdf. → pages R. Shahidi and F. J. Herrmann. Curvelet-domain matched filtering with frequency-domain regularization. In SEG Technical Program Expanded Abstracts, volume 28, page 3645. SEG, 2009. → pages E. van den Berg and M. P. Friedlander. Probing the pareto frontier for basis pursuit solutions. SIAM Journal on Scientific Computing, 31(2):890–912, 2008. doi:10.1137/080714488. URL http://link.aip.org/link/?SCE/31/890. → pages 46 G. J. A. van Groenestijn and D. J. Verschuur. Estimating primaries by sparse inversion and application to near-offset data reconstruction. Geophysics, 74(3): A23–A28, 2009. doi:10.1190/1.3111115. URL http://link.aip.org/link/?GPY/74/A23/1. → pages D. J. Verschuur, A. J. Berkhout, and C. P. A. Wapenaar. Adaptive surface-related multiple elimination. Geophysics, 57:1166–1177, 1992. → pages E. Verschuur. Seismic multiple removal techniques: Past, present and future. EAGE Publications BV, 2002. → pages C. Yarham and F. J. Herrmann. Bayesian ground-roll seperation by curvelet-domain sparsity promotion. In SEG Technical Program Expanded Abstracts, volume 27, pages 2576–2580. SEG, November 2008. → pages 47 Appendix A Implementations There are many implementations of the curvelet-domain matched-filtering namely, a MATLAB, a Python and a partially MPI C++ implementation. Although the last implementation is the fastest and has more features, we chose to extend the MAT- LAB implementation for these reasons. MATLAB does not only make developing and testing algorithms fast and easy, but a recent extension to MATLAB, namely Parallel Computing Toolbox, makes solving memory and time intensive algorithms possible. It provides MPI free programming of distributed arrays, different parts of one large array reside at different computational nodes, and distributed algorithm execution, different parts of an algorithm is executed at different computational nodes at the same time. The latter feature makes our algorithm, see Section 3.2.2, possible. For example, the diagonal scaling in Equation 3.9 for our small synthetic data has more than 32 million elements. SPG`1, a gradient-based solver that uses spectral step length and Pareto root finding (van den Berg and Friedlander, 2008), is used for solving the `1 optimiza- tion problem in Equation 3.3 and for the `2 optimization problem in Equation 3.4, the least-squares solver described in Paige and Saunders (1982) is used. We have chosen the curvelet-domain transform via wrapping (Candes et al., 2006) as our curvelet transform with 6 scales and 16 angles at the finest scale. For this choice of implementation, the pseudo-inverse equals the adjoint. 48
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Estimation of surface-free data by curvelet-domain...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Estimation of surface-free data by curvelet-domain matched filtering and sparse inversion AlMatar, Mufeed H. 2011
pdf
Notice for Google Chrome users:
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
Page Metadata
Item Metadata
Title | Estimation of surface-free data by curvelet-domain matched filtering and sparse inversion |
Creator |
AlMatar, Mufeed H. |
Publisher | University of British Columbia |
Date Issued | 2011 |
Description | A recent robust multiple-elimination technique, based on the underlying principle that relates primary impulse response to total upgoing wavefield, tries to change the paradigm that sees surface-related multiples as noise that needs to be removed from the data prior to imaging. This technique, estimation of primaries by sparse inversion (EPSI), (van Groenestijn and Verschuur, 2009; Lin and Herrmann, 2009), proposes an inversion procedure during which the source function and surface-free impulse response are directly calculated from the upgoing wavefield using an alternating optimization procedure. EPSI hinges on a delicate interplay between surface-related multiples and pri- maries. Finite aperture and other imperfections may violate this relationship. In this thesis, we investigate how to make EPSI more robust by incorporating curvelet- domain matching in its formulation. Compared to surface-related multiple removal (SRME), where curvelet-domain matching was used successfully, incorporating this step has the additional advantage that matches multiples to multiples rather than predicated multiples to total data as in SRME. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2011-05-09 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution 3.0 Unported |
DOI | 10.14288/1.0053426 |
URI | http://hdl.handle.net/2429/34367 |
Degree |
Master of Science - MSc |
Program |
Geophysics |
Affiliation |
Science, Faculty of Earth, Ocean and Atmospheric Sciences, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2011-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by/3.0/ |
AggregatedSourceRepository | DSpace |
Download
- Media
- 24-ubc_2011_fall_almatar_mufeed.pdf [ 2.95MB ]
- Metadata
- JSON: 24-1.0053426.json
- JSON-LD: 24-1.0053426-ld.json
- RDF/XML (Pretty): 24-1.0053426-rdf.xml
- RDF/JSON: 24-1.0053426-rdf.json
- Turtle: 24-1.0053426-turtle.txt
- N-Triples: 24-1.0053426-rdf-ntriples.txt
- Original Record: 24-1.0053426-source.json
- Full Text
- 24-1.0053426-fulltext.txt
- Citation
- 24-1.0053426.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
data-media="{[{embed.selectedMedia}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.24.1-0053426/manifest