Estimation of Surface-free Data by Curvelet-domain Matched Filtering and Sparse Inversion by Mufeed H. AlMatar B.S.c. Computer Engineering, King Fahad University of Petroleum and Minerals, Dhahran, 2005 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in THE FACULTY OF GRADUATE STUDIES (Geophysics) The University Of British Columbia (Vancouver) April 2011 c Mufeed H. AlMatar, 2011 Abstract A recent robust multiple-elimination technique, based on the underlying principle that relates primary impulse response to total upgoing wavefield, tries to change the paradigm that sees surface-related multiples as noise that needs to be removed from the data prior to imaging. This technique, estimation of primaries by sparse inversion (EPSI), van Groenestijn and Verschuur; Lin and Herrmann, proposes an inversion procedure during which the source function and surface-free impulse response are directly calculated from the upgoing wavefield using an alternating optimization procedure. EPSI hinges on a delicate interplay between surface-related multiples and primaries. Finite aperture and other imperfections may violate this relationship. In this thesis, we investigate how to make EPSI more robust by incorporating curveletdomain matching in its formulation. Compared to surface-related multiple removal (SRME), where curvelet-domain matching was used successfully, incorporating this step has the additional advantage that matches multiples to multiples rather than predicated multiples to total data as in SRME. ii Table of Contents Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii 1 2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Theme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.3 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.4 Theoretical background . . . . . . . . . . . . . . . . . . . . . . . 4 Primary-multiple separation techniques . . . . . . . . . . . . . . . . 8 2.1 Primary-multiple separation techniques . . . . . . . . . . . . . . 8 2.2 Surface-related multiples model . . . . . . . . . . . . . . . . . . 9 2.2.1 Surface-related multiples formulation for normal-incidence plane wave in a horizontally stratified medium . . . . . . 9 2.2.2 Data matrix . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.2.3 Two-dimensional formulation . . . . . . . . . . . . . . . 11 2.3 Iterative surface-related multiple removal . . . . . . . . . . . . . 14 2.4 Curvelet-domain matched-filtering . . . . . . . . . . . . . . . . . 15 iii 2.5 3 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 Convexified EPSI and curvelet-domain matched-filtering . . . . . . 19 3.1 Estimation of primaries by sparse inversion . . . . . . . . . . . . 19 3.1.1 Convexified EPSI . . . . . . . . . . . . . . . . . . . . . . 20 3.1.2 Bi-convex optimization . . . . . . . . . . . . . . . . . . . 21 EPSI with curvelet-domain matched-filtering . . . . . . . . . . . 23 3.2.1 Frequency-domain monochromatic matching . . . . . . . 23 3.2.2 Time-domain matching . . . . . . . . . . . . . . . . . . . 24 Observations and results . . . . . . . . . . . . . . . . . . . . . . . . 26 4.1 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 4.1.1 Synthetic data . . . . . . . . . . . . . . . . . . . . . . . . 26 4.1.2 Results evaluation . . . . . . . . . . . . . . . . . . . . . 26 Factors affecting our results . . . . . . . . . . . . . . . . . . . . . 27 3.2 4 4.2 4.2.1 The smoothing and positivity of the curvelet-domain diagonal scaling . . . . . . . . . . . . . . . . . . . . . . . . . 27 4.2.2 When and how frequent should we do the matching? . . . 32 4.2.3 Optimized result . . . . . . . . . . . . . . . . . . . . . . 41 Conclusion and future works . . . . . . . . . . . . . . . . . . . . . . 43 5.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 5.2 Future works . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 Appendix A Implementations . . . . . . . . . . . . . . . . . . . . . . . 48 5 iv List of Tables Table 4.1 Synthetic time-domain finite difference acoustic forward modeling data parameters. . . . . . . . . . . . . . . . . . . . . . . v 27 List of Figures Figure 1.1 An illustration of land seismic survey. A seismic source sends energy into the earth. Some of this energy is reflected back to the surface at rocks layers discontinuities. This energy is then recorded by geophones at the surface. . . . . . . . . . . . . . Figure 1.2 2 An illustration that shows a primary and two types of multiples. The solid blue line has only one upward reflection, so it is a primary reflection. The green dashed line has more than one upward reflection and it has its shallowest downward reflection at the surface, hence, it is a surface-related multiple. Finally, the internal-multiple, the red dashed line, has its shallowest reflection at an internal rock layer and it has more than one upward reflection. . . . . . . . . . . . . . . . . . . . . . Figure 1.3 3 Mesh view of a 2D curvelet. Curvelets are oscillatory in one direction, but smooth in the other. They resemble a windowed sinusoid in the oscillatory direction and a very smooth Gaussian window in the other direction. . . . . . . . . . . . . . . . Figure 1.4 5 Spatial (left) and frequency (right) representations of six curvelets with different angles and scales. Notice how each curvelet is localized in the spatial domain with a rapid decaying amplitude and how it has a compact support in the frequency domain. Notice also the 90◦ rotation between any two correspondence representations in both domains of each curvelet. . . . . . . . vi 6 Figure 1.5 a) Synthetic shot gather b) Curvelet decomposition of the shot gather into different scales and angles. Five scales are shown here. Notice how the angles double every other scale and how the different parts of the shot gather are decomposed at the various scales and angles. The coarsest (center) scale shows the DC and low frequencies. The second scale has 4 angles and the third and the fourth has 8. The fifth (finest) scale has 16 angles. Figure 1.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 Curvelet correlation with wavefronts. A large coefficient is yield when the curvelet and the waterfront have locally the same direction and frequency contents. Otherwise, a small coefficient is yielded. . . . . . . . . . . . . . . . . . . . . . . . Figure 2.1 7 Two reflection events: a primary (solid blue line) and a surfacerelated multiple (dashed red line). Both events have similar arrival time but have traveled different paths. The primary event has traveled deeper into the subsurface and has seen higher velocity than the multiple event. Therefore, it arrived with a smaller angle at the receiver. . . . . . . . . . . . . . . . . . . Figure 2.2 9 Feedback diagram for surface-related multiples generation. First, surface-free data is generated by convolving the source signature s+ with surface-free impulse response g. Then, surfacefree data feedbacks into the earth to generate higher order multiples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Figure 2.3 11 a) An illustration that shows the generation of surface-related multiples. b) Source signature s(t). c) Surface-free impulse response g(t). d) Surface-free data p0 (t). e) First-order multiples m1 (t). f) Second-order multiples m2 (t). g) Total response p(t). Figure 2.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 After convolving the source gather from total data P with the common receivers gather from surface-free impulse response G for each point in xk , the convolution result is summed up to produce the predicted surface-related multiple trace at xr . . . . vii 13 Figure 2.5 A common-offset section from North Sea field dataset with results from two different multiples-elimination methods. a) Total data plotted with automatic-gain control. b) Primary estimation from one term SRME. c) Scaled Bayesian iterative threshold primaries estimate. Adapted from Herrmann et al. (2008b). Figure 4.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Zero-offset EPSI synthetic data results with no matching. a) ˆ b) Primaries estimation Total upgoing wavefield section (P). (Pˆ0 ). c) primaries estimation minus the total data (Pˆ − Pˆ0 = ˆ P), ˆ i.e. the multiples estimation d) Final estimated source −G ˆ . . . . . . . . . . . . . . . . . . . . . . . . . . . wavelet (Q) Figure 4.2 28 Mosaic and 2D plots for the curvelet-domain matched-filtering scaling for one shot (offset = 1260m) for λ = 0.46 and η = 10, 35 and 100. The λ and η used to generate each subfigure are stated underneath it. . . . . . . . . . . . . . . . . . . . . Figure 4.3 30 Zero-offset synthetic data results for λ = 0.46 and η = 10, 35 and 100. The λ and η used to generate each subfigure are stated underneath it. a-c) zero-offset primaries estimates. d-f) zero-offset multiples estimates. . . . . . . . . . . . . . . . . . Figure 4.4 31 Mosaic and 2D plots for the curvelet-domain matched-filtering scaling for one shot (offset = 1260m) for η = 35 and λ = 0.046, 0.46 and 46.4. The λ and η used to generate each subfigure are stated underneath it. . . . . . . . . . . . . . . . . . . . . Figure 4.5 33 Zero-offset synthetic data results for η = 35 and λ = 0.046, 0.46 and 46.4. The λ and η used to generate each subfigure are stated underneath it. a-c) zero-offset primaries estimates. d-f) zero-offset multiples estimates. . . . . . . . . . . . . . . Figure 4.6 34 a-d) Primaries estimates at different iterations. e-h) Multiples estimates calculated by subtracting the primaries estimate for ˆ Q. ˆ i-l) Multiples estimates calculated by the total data Pˆ − G convolving the Green’s function with the total upgoing waveˆ P. ˆ . . . . . . . . . . . . . . . . . . . . . . . . . . . field −G viii 36 Figure 4.7 Zero-offset results of EPSI with only one matching step at the first alternating optimization loop. a) Zero-offset primaries estimate. b) Zero-offset multiple estimate calculated by subtracting the primaries estimate form the total upgoing wavefield. . Figure 4.8 37 Zero-offset results of EPSI with six matching steps starting after the second alternating optimization loop (even loops only). a) Zero-offset primaries estimate. b) Zero-offset multiple estimate calculated by subtracting the primaries estimate form the total upgoing wavefield. . . . . . . . . . . . . . . . . . . . . Figure 4.9 38 Zero-offset results of EPSI with only one matching step at the twenty first alternating optimization loop. a) Zero-offset primaries estimate. b) Zero-offset multiple estimate calculated by subtracting the primaries estimate form the total upgoing wavefield. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 Figure 4.10 Zero-offset results of EPSI with only one matching step at the twelve alternating optimization loop. a) Zero-offset primaries estimate. b) Zero-offset multiple estimate calculated by subtracting the primaries estimate form the total upgoing wavefield. 40 Figure 4.11 Zero-offset EPSI synthetic data result with curvelet-domain matched filtering incorporated into loop 12. The number of iterations η = 60 and the trade off parameter λ = 0.46. a) Total ˆ b) Primaries estimation (Pˆ0 ). upgoing wavefield section (P). ˆ P), ˆ c) primaries estimation minus the total data (Pˆ − Pˆ0 = −G i.e. the multiples estimation d) Final estimated source wavelet ˆ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . (Q) ix 42 Acknowledgments I would like to thank my supervisor Felix Herrmann for his continuous support and encouragement, and creative ideas without which this thesis would not be possible. I would like also to thank all slim members for their support in filling the gaps in my Geophysical background. I am very fortunate to have Michael Bostock and Eldad Haber as part of my supervisory committee. I thank Michael for his two insightful courses and Eldad for the few lectures I attending during his course last fall. I would like also to thank my management at Saudi Aramco for giving me the opportunity to study at UBC and for their support and funding during my studies. A special thank goes to my wife Sakeel for her encouragement and patient during my research period especially after the twins came in board. Thank you all. x Dedication To my parents, my wife Sakeel, my daughters Zahraa and Zainab xi Chapter 1 Introduction In early nineteenth century, searching for oil was very simple, easy and straightforward. People were looking for oil seepages at the surface near oil springs. Back then, the demand for oil was limited and oil was mainly used for medical purposes (Gadallah and Fisher, 2005). But the situation has changed since then, many objects around us are made from oil or made by machines that use energy from oil and gas. The paper used to print this thesis was manufactured using oil energy. The ink itself is an oil product. The demand for oil has changed dramatically and so do procedures to search for oil. The objective of the seismic survey is to reveal the subsurface geology. It is a very costly and complicated procedure. On land, dynamite (impulsive source) or Vibroseis (trucks that propagate energy into the earth over an extended period of time), sources send energy into the subsurface. Some of this energy is reflected at the boundaries between rocks layers back to the surface where it is recorded by an array of geophones. The same procedure applies at sea but the seismic source is now an air gun and the receivers are hydrophones. Figure 1.1 illustrates the seismic survey on land. Most seismic imaging algorithms assume that the reflected energy recorded by geophones at surface has reflected only once in the subsurface (Verschuur, 2002), as shown by the solid line in Figure 1.2. However, in practice, this assumption is not true. Some of the energy we send bounce between rocks layers before being recorded by geophones as shown by the dashed lines in Figure 1.2. As a result 1 Figure 1.1: An illustration of land seismic survey. A seismic source sends energy into the earth. Some of this energy is reflected back to the surface at rocks layers discontinuities. This energy is then recorded by geophones at the surface. multiple reflections events occur. These events are usually considered as undesired events that need to be removed from the seismic data before processing to avoid confusion during the interpretation of seismic images. The success of multipleremoval or multiple-suppression methods can make the difference between hitting the target and making profit or missing the target and losing millions of dollars. One way to categorize multiples is by the interface where they have their shallowest downward bounce. The first downward bounce for the green dashed line in Figure 1.2 happens at the surface. Hence it is called a surface-related multiple while the first downward bounce for the red dashed line happens at an internal interface and so it is called an internal-multiple (Verschuur, 2002). 1.1 Theme The main theme of this thesis is to extend a recent robust multiple-elimination technique that estimates primaries by sparse inversion, EPSI (van Groenestijn and Verschuur, 2009; Lin and Herrmann, 2009). This technique proposes an inversion procedure during which the source function and surface-free impulse response are 2 Figure 1.2: An illustration that shows a primary and two types of multiples. The solid blue line has only one upward reflection, so it is a primary reflection. The green dashed line has more than one upward reflection and it has its shallowest downward reflection at the surface, hence, it is a surface-related multiple. Finally, the internal-multiple, the red dashed line, has its shallowest reflection at an internal rock layer and it has more than one upward reflection. directly calculated from the upgoing wavefield using an alternating optimization procedure. We propose to incorporate a curvelet-domain matched-filtering step into this optimization procedure to mitigate possible adverse effects that stem from amplitude mismatches that vary smoothly as a function of position and dip along the predicted wavefronts of the current estimate for the surface-free data. The implementation and the different parameters affecting this matched-filtering step are investigated. 1.2 Objectives This thesis aims to incorporate a matched-filtering operator into the formulation of EPSI. This operator will be able to absorb amplitude errors of non-ideal reflections at the surface, finite-aperture, and other unknown effects. 3 1.3 Outline Chapter 2 introduces primary-multiple separation followed by an explanation of the theory underlying the model behind surface-related multiple removal, SRME (Verschuur et al., 1992), and primary estimation by sparse inversion, EPSI (van Groenestijn and Verschuur, 2009). Finally, we discuss the theory behind curveletdomain matched-filtering. In Chapter 3, we introduce estimation of primaries by sparse inversion, before incorporating a matching step into the convexified EPSI procedure. Chapter 4 studies the factors affecting the outcome of incorporating a matchedfiltering step into EPSI. We conclude this chapter by presenting our results. We will conclude our research in chapter 5 by a summary and suggested future work. The last section of this chapter introduces the curvelet transform and highlights its importance as a sparsifying domain to the success of the matched-filtering procedure. 1.4 Theoretical background The ability of the transform to handle seismic data with conflicting dips and to minimize the overlap between the primary and multiples coefficients in the transform domain is significant to the success of the primary-multiple separation algorithms (Herrmann, 2008b). In this section, we show that the curvelet transform (Candes et al., 2006) addresses these issues by efficiently representing seismic data sparsely and by its ability to handle conflicting dips. Curvelet transform was first introduced in Candes and Donoho (2000). Curvelet transform is a localized, multiscale, multi-directional and redundant transform. Curvelets are oscillatory in one direction and smooth in the perpendicular direction. They resemble a windowed sinusoid in the oscillatory direction and a very smooth Gaussian window in the other direction, see Figure 1.3. Each curvelet is spatially localized in the spatial domain and has a compact support in the frequency domain. In spatial domain, its amplitude decays rapidly outside a certain region. Curvelets obey the parabolic scaling principle with length ≈ width2 (Neelamani et al., 2008). 4 The curvelet transform computational complexity of data of size N is O(NlogN) which is not an issue. However, the curvelet transform demands memory; it is an overcomplete transform, i.e. the resulting transform coefficients are more than the image coefficients. It is about 8 times redundant for the 2D curvelet transform and 24 times redundant for the 3D case. Figure 1.4 shows a few curvelets of different scales and angles with a rapid amplitude decay in the spatial domain and a compact support in the frequency domain and Figure 1.5 shows a decomposition of a shot gather at different scales and angles. Curvelet correlate very well locally with wavefronts of the same direction and frequency contents, see Figure 1.6. This permits the curvelet transform to represent seismic data sparsely as a small set of significant coefficients which are location, scale, and dip dependent. This sparse representation of seismic data helps in mapping the different seismic events to different coefficients and that will make the overlap of the primaries and multiples coefficients minimal and hence leads to a successful separation. 5 Figure 1.3: Mesh view of a 2D curvelet. Curvelets are oscillatory in one direction, but smooth in the other. They resemble a windowed sinusoid in the oscillatory direction and a very smooth Gaussian window in the other direction. Figure 1.4: Spatial (left) and frequency (right) representations of six curvelets with different angles and scales. Notice how each curvelet is localized in the spatial domain with a rapid decaying amplitude and how it has a compact support in the frequency domain. Notice also the 90◦ rotation between any two correspondence representations in both domains of each curvelet. 6 (a) (b) Figure 1.5: a) Synthetic shot gather b) Curvelet decomposition of the shot gather into different scales and angles. Five scales are shown here. Notice how the angles double every other scale and how the different parts of the shot gather are decomposed at the various scales and angles. The coarsest (center) scale shows the DC and low frequencies. The second scale has 4 angles and the third and the fourth has 8. The fifth (finest) scale has 16 angles. Figure 1.6: Curvelet correlation with wavefronts. A large coefficient is yield when the curvelet and the waterfront have locally the same direction and frequency contents. Otherwise, a small coefficient is yielded. 7 Chapter 2 Primary-multiple separation techniques In this chapter, we start briefly by categorizing the primary-multiple separation techniques. Then, we introduce the model behind surface-related multiple removal (SRME) and estimation of primaries via sparse inversion (EPSI). Finally, we layout the theory behind the curvelet-domain matched-filtering. 2.1 Primary-multiple separation techniques Multiples removal techniques can be categorized in two main classes: • filtering methods based on move-out and dip discrimination. • data-driven methods based on periodicity and predictability. Filtering methods rely on the fact that multiple events traveled through different rock layers. Hence, they traveled at an apparent different velocity or reflected at different geological structures than the primary events, see Figure 2.1. These methods use a transform, e.g. F-K or Radon transform, to map the seismic data events, primaries and multiplies, into different regions, i.e. mapping different frequency contents or different dip angles into different regions. Then, they eliminate the multiples events and transform the remaining events, the primaries, back to the original time-offset domain (Verschuur, 2002). 8 Figure 2.1: Two reflection events: a primary (solid blue line) and a surfacerelated multiple (dashed red line). Both events have similar arrival time but have traveled different paths. The primary event has traveled deeper into the subsurface and has seen higher velocity than the multiple event. Therefore, it arrived with a smaller angle at the receiver. Data-driven methods exploit the relation between primaries and multiples events to suppress the repetitive patterns of multiples. These methods involve two steps: a prediction and a subtraction step. First, multiples are predicted and then subtracted through a matching procedure from total data, i.e. primaries, surface-related multiples and internal multiples. Assumptions are needed for both steps. For example the energy after the subtraction step is assumed to be minimum (Verschuur, 2002). 2.2 Surface-related multiples model This section discusses the model behind SRME and EPSI for 1D and 2D situation. We discuss theoretical details for SRME and EPSI in Section 2.3 and Section 3.1, respectively. 9 2.2.1 Surface-related multiples formulation for normal-incidence plane wave in a horizontally stratified medium Consider a horizontal plane wave propagating in a laterally invariant earth. This wave reflects at the subsurface geology to produce surface-free data p0 , which contains primaries and internal multiples but no surface-related multiples (blue solid and red dashed lines in Figure 2.2a). At the surface, this surface-free data acts as a new source of events that will feedback into the earth to produce firstorder multiples (green dashed lines). The first-order multiples will then feedback into the earth again to produce higher order multiples. The feedback diagram in Figure 2.2b illustrates this process. We can express the surface-free data mathematically as p0 (t) = g(t) ∗ s(t), where p0 is the surface-free data, g is the surface-free impulse response, ∗ denotes convolution, and s(t) is the source signature. The feedback diagram in Figure 2.2 permits us to formulate the total response with the surface related multiples p as p(t) = g(t) ∗ [s(t) − p(t)] = p0 (t) − g(t) ∗ p(t). (2.1) Figure 2.3 illustrates the generation of surface-related multiples with the source signature shown in Figure 2.3b, for a model with only one-reflection layer, Figure 2.3c. The generated surface-free data p0 , which consists of only one primary is shown in Figure 2.3d. This surface-free data feedbacks to the earth and convolves with the surface-free impulse response g to produce first-order multiples m1 , Figure 2.3e. The same process produces second-order multiples m2 (Figure 2.3f). Finally, the total response (surface-free data and surface-related multiples) is shown in Figure 2.3g. 2.2.2 Data matrix Our seismic data in the 2D case is three-dimensional: time, source position and receiver position. By considering the earth as a linear time-invariant system, we can represent an experiment, i.e. a shot gather (one source and multiple receivers), 10 (a) (b) Figure 2.2: Feedback diagram for surface-related multiples generation. First, surface-free data is generated by convolving the source signature s+ with surface-free impulse response g. Then, surface-free data feedbacks into the earth to generate higher order multiples. as a superposition of many fully independent monochromatic experiments in the Fourier domain. After taking the shot gather into the frequency domain and selecting a single frequency, we obtain a vector of measurements for the different receivers for that frequency. Repeating the process for all shot gathers produces a data matrix, indicated by bold capital P (Berkhout, 1985), whose columns represent common source point gathers and rows represent common receivers gathers. 2.2.3 Two-dimensional formulation Convolving surface-free impulse response G with total upgoing wavefield P− generates surface-related multiples. Figure 2.4 illustrates the surface-related multiples prediction for 2D case interpreted as wave field extrapolation. We start with a source positioned at xs and a set of receivers (geophones) positioned at xk . Our 11 (a) (b) (c) (d) (e) (f) (g) Figure 2.3: a) An illustration that shows the generation of surface-related multiples. b) Source signature s(t). c) Surface-free impulse response g(t). d) Surface-free data p0 (t). e) First-order multiples m1 (t). f) Second-order multiples m2 (t). g) Total response p(t). 12 Figure 2.4: After convolving the source gather from total data P with the common receivers gather from surface-free impulse response G for each point in xk , the convolution result is summed up to produce the predicted surface-related multiple trace at xr . goal is to calculate the wavefield at a receiver positioned at xr , which represent the surface-related multiples. To do that, we need the Green’s function to extrapolate the path for a source positioned at xr and whose energy is measured at xk . The surface-free impulse response G can act as the Green’s function to predict all possible surface-related multiples. First, we convolve the two wavefields, the shot gather from the total data P and the common receivers gather from the surfacefree impulse response G, for each point in xk . Then, we sum the convolution result to get the wavefield at xr and that gives the surface-related multiples. In practice, both the data P and the surface free impulse response G are transformed to the frequency domain and then the convolution and the summation are repeated over all source-receiver combinations to obtain the predicted surface related multiples M0 . If we adapt a similar data matrix notation as in Berkhout (1985), then the multiple prediction can be formulated as a matrix-matrix multiplication: ˆ P, ˆ 0 = −G ˆ M (2.2) where the minus sign represents the reflection at the surface and the hat is used to represents a monochromatic matrix. 13 Furthermore, we can use the same approach in Equation 2.2 to generalize Equation 2.1 for the 2D case to get ˆ 0, Pˆ = Pˆ 0 + M (2.3) ˆ 0 is the surface-related multiples. Expandwhere Pˆ 0 is the surface-free data and M ing Equation 2.3 yields ˆQ ˆ + GR ˆ P, ˆ Pˆ = G (2.4) ˆ is the source signature and R is the surface reflectivity, which is apwhere Q proximated to be −I, i.e. ˆ Q ˆ − P). ˆ Pˆ = G( 2.3 (2.5) Iterative surface-related multiple removal In this section, we aim to give a brief description of the iterative Surface-Related Multiple Removal (SRME) procedure (Verschuur et al., 1992). We start from the surface-related multiples model, Equation 2.4. Now, let A = Q−1 R be the surˆ Pˆ = ˆ 0 = GR face operator. Next, we rewrite the surface-related multiples as M ˆQ ˆQ ˆ −1 RPˆ = Pˆ 0 AP. Substitution this expression in Equation 2.3 yields G Pˆ = Pˆ 0 + Pˆ 0 AP. (2.6) ˆ = Iqˆ with qˆ By assuming the source function Q to be omnidirectional, i.e. Q the discretized Fourier representation of the source function, the above equation can be solved iteratively as, (n+1) (n) = Pˆ − A(n+1) Pˆ 0 P, Pˆ 0 (2.7) where n is the iteration index. The iterative procedure starts with approximating the (0) surface-free data with total data, i.e. Pˆ = P. Next, it estimates A in a matching 0 process. This iterative procedure (Equation 2.7) matches predicted multiples to total data, which may lead to throwing out multiples energy. It also requires regular sampling. 14 2.4 Curvelet-domain matched-filtering Many seismic processing techniques depend on matching seismic wavefield amplitudes whether for adaptive subtraction of multiples (Verschuur et al., 1992; Herrmann et al., 2008b) or ground roll (Yarham and Herrmann, 2008) predictions from total data or for restoration of migration amplitudes through image to migrateddemigrated-image matching (Herrmann et al., 2008a). For a successful matching, we need to control a possible overfitting that may inadvertently lead to primary energy removal. Furthermore, we need to handle conflicting dips and apply the separation after the matching stably. We will see in this section that curvelet satisfies all these conditions by its ability to decompose seismic data sparsely into position, scale and dip dependant coefficients and by imposing smoothness on its coefficients in the phase space (Herrmann, 2008a). In Herrmann (2008a), the curvelet-domain matched-filtering is based on two assumptions: • A global Fourier matching removes the stationary differences between the two matched wavefields. The Fourier matching removes only amplitudespectra mismatches and global kinematic errors and fails to remove errors that vary spatial as a function of dip. In SRME (Verschuur et al., 1992), these stationary differences come from the convolution of the source signature during the multiples prediction step. • The remaining non-stationary differences, which come from the non-ideal circumstances in the prediction, e.g. the absence of 3D effects, vary smoothly as a function of position and dip along wavefronts. After compensating for the stationary differences via global Fourier matching procedure, i.e. no kinematic errors, we can approximate the non-stationary differences mathematically by the operator Ψ, whose action on a d-dimensional function f is given by (Ψ f )(x) = Rd e jk·x a(x, ζ ) fˆ(ζ )dζ , (2.8) where x, ζ are the spatial coordinate and wavenumber vectors, fˆ(ζ ) is the Fourier transform coefficients of f (x), and a(x, ζ ) is a space- and spatial-frequency dependent filter (Herrmann et al., 2008a). 15 In practice, the operator Ψ, which after discretizing becomes a full positivedefinite matrix, acts on 2D images, e.g. shot gathers or common receivers gathers. We usually have two images to match, g and f . In the context of SRME, f would be predicted multiples and g would be total data. Then, the mismatched can be modeled as a matrix-vector multiplication, i.e., g = Ψf. (2.9) Following Herrmann et al. (2008a), we approximate the action of Ψ operator by a positive diagonal scaling in the curvelet domain (Ψ f )(x) ≈ C∗ DΨ C f (x), (2.10) where C is the forward curvelet transform and C∗ is its adjoint, which is also its pseudo inverse since we are using curvelet-domain via wrapping (Candes et al., 2006), which is a tight frame with C∗ C = I. The diagonal matrix DΨ approximates the symbol a(x, ζ ) evaluated at the curvelet centers. After substituting Equation 2.10 into Equation 2.9, the problem of finding the action of the operator Ψ reduces to finding the elements of the curvelet diagonal scaling matrix DΨ , which we calculate by solving the following least-squares problem 1 arg min ||g − C∗ diag(C f )z||22 , 2 z (2.11) where z represents the diagonal elements of DΨ . Many issues complicate the estimation of the curvelet diagonal scaling of Equation 2.11. The forward model is underdetermined, which leads to mapping many scaling coefficients to the same image under the adjoint operator C∗ . There is also the possibility of overfitting that may lead to primary energy removal or incorrect amplitude corrections. To solve these issues, a smoothness constraint is imposed on the diagonal scaling to give 1 arg min ||g − C∗ diag(C f )z||22 + λ ||Lz||22 , 2 z 16 (2.12) with L = [L1T L2T LθT T Lscale ]T , (2.13) where λ is the trade off parameter and L is the sharpening operator, which penalizes fluctuations among neighboring coefficients in the diagonal scaling z. The sharpening operator penalizes the scaling coefficients of each wedge (Figure 1.5) along the first and second direction, L1 and L2 . It also penalizes coefficients between adjacent wedges at the same scale Lθ , and between wedges of adjacent scales Lscale , see Herrmann et al. (2008a); Shahidi and Herrmann (2009). To ensure positivity of the diagonal scaling, it is suggested in Herrmann et al. (2008a) to replace z with ez . This is not implemented in the curvelet-domain matched-filtering used to produce our result and we leave its implementation to future works. 2.5 Motivation To illustrate the potential benefit of adding curvelet-domain matching, let us consider a common-offset section from North Sea field dataset. Figure 2.5 shows the results of primaries estimation from two different multiples-elimination methods, namely Surface-Related Multiple Elimination (SRME, Figure 2.5b), and Bayesian threshold (Figure 2.5c, Saab et al. (2007)), which uses curvelet-domain matchedfiltering. Comparing the primary estimations from SRME and Bayesian threshold reveals improved continuity and amplitude preservation for the primaries (near the lower two arrows in each plot around 2.6 and 3.1s) of the Bayesian threshold method. Furthermore, the strong residual of the first- and second-order water bottom multiples in the shallow part (near the top two arrows in each plot around 0.75 and 1.2s) are better suppressed. 17 (a) (b) (c) Figure 2.5: A common-offset section from North Sea field dataset with results from two different multiples-elimination methods. a) Total data plotted with automatic-gain control. b) Primary estimation from one term SRME. c) Scaled Bayesian iterative threshold primaries estimate. Adapted from Herrmann et al. (2008b). 18 Chapter 3 Convexified EPSI and curvelet-domain matched-filtering In this chapter, we layout the theory behind estimation of primaries by sparse inversion (EPSI), before incorporating a matching step into the convexified EPSI procedure. Will incorporating a matching step into EPSI optimization problem give better results than using EPSI alone, knowing that using EPSI one would match multiples to multiples instead of matching predicted multiplies to the total data as in SRME? The remainder of this thesis is dedicated to answering this question. 3.1 Estimation of primaries by sparse inversion EPSI is based on the same model as SRME. However, it estimates primaries in an inversion process rather than in a prediction and a subtraction process. Unlike SRME, EPSI is not sensitive to limited sampling because it can reconstruct missing offset during the inversion process. EPSI tries to change the paradigm that sees the surface-related multiples as noise, which needs to be removed from the data prior to imaging. It proposes an operator, which is based on the underlying principle that relates the primary im19 pulse response to total upgoing wavefield that includes the source signature and surface related multiples. EPSI monochromatic relationship is expressed mathematically as: downgoing wavefield ˆ G ˆ + RPˆ Q upgoing wavefield Pˆ ≈ , (3.1) surface-free impulse response where the hat indicates a monochromatic representation of a wavefield arranged into a matrix, see Section 2.2.3 for details. The matrix product of any two-hatted wavefields corresponds to a non-stationary convolution in the time domain. The original EPSI (van Groenestijn and Verschuur, 2009) assumes the impulse response G to be sparse in the time domain and the source function Q is omniˆ = Iqˆ with qˆ the discretized directional, i.e. a constant-source for all shots or Q Fourier representation of the source function. Sparsity is enforced on the updates of the surface-free impulse response by a zero norm. An increasing time window in each iteration is placed over these updates in which the biggest events per trace are selected. Increasing the window size improves the convergence. A Fourier smoothness (shortness) is also enforced on the source function. We obtain a reasonable estimate for primary impulse response G by choosing the largest τ (sparsity level) elements of the gradient and setting the rest to zeros. We keep repeating this process till we obtain the desired image of primaries. 3.1.1 Convexified EPSI Many issues affect the practical adoption of the original EPSI formulation. It is difficult to estimate many of its inversion parameters, e.g. the sparsity level of the primaries in each iteration or the precise knowledge of the time window containing them. The original EPSI also relies on the zero norm which makes it unstable. Lin and Herrmann (2009) address these issues via a 0 to 1 convexification, where the one norm on the surface-free impulse response is minimized alternately with the two-norm on the source function. Introducing 1 convexification will not only preserve sparsity on the Green’s function but will also stabilize the problem. It will also eliminate most the inversion parameters. The time window, which is used to prevent the inversion from being trapped in a local minima, is no longer needed 20 since convex problems have no local minima. The sparsity level of primaries is replaced with a more practical and easy to estimate parameter σ , which is related to the noise level of measured data, see Equation 3.3. 3.1.2 Bi-convex optimization EPSI optimization involves two alternating optimization problems that solve for the unknown surface-free Green’s function G and the unknown source function Q. To be consistent with the notation of optimization problems, indicated by lower bold face letters e.g. p = vec(P) in our notation, we rewrite Equation 3.1 in a vectorized form in this bi-linear forward model formulation ˆ := Ft∗ blockdiag E[Q]g ˆ + RPˆ Q ∗ 1···n f ⊗ I Ft g = p, (3.2) ˆ is a linear operator, which depends on the source signature Q and the where E[Q] upgoing wavefield P that acts on the vectorized surface-free Green’s function g, ∗ denotes the conjugate transpose, and ⊗ denotes the Kronecker product. This expression reformulates the matrix-matrix product into a matrix-vector product. Ft is the Fourier transform, whose action on a vectorized wavefield g is defined as Ft g := gˆ = [ˆg1 , gˆ 2 , · · · , gˆ n f ]∗ with n f the number of frequencies and Ft∗ is the inverse Fourier transform operator that brings the vectorized wavefield back to the time domain. The blockdiag lays out each frequency slice (1 · · · n f ) in a vectorized form and p is the vectorized upgoing wavefield. For now, we will assume R = −I. Equation (3.2) is linear in both the surface-free Green’s function and the Fourier coefficients of the source function. This permits us to invert for these two unknowns by alternatingly solving two optimization problems. First, we solve for free impulse response g subject to data fitting, i.e. the one norm optimization problem on the unknown surface-free Green’s function, g = arg min ||g||1 ˆ subject to ||p − E[Q]g|| 2 ≤ σ, (3.3) g where g is the vectorized estimation (estimations are indicated by ) for the surfacefree Green’s function and σ is the residual, which is linked to the noise level of the data. Then, we solve the regularized least-squares problem for the unknown source 21 function, 1 ˆ q|| ˆ 22 + λF ||LF q|| ˆ 22 , qˆ = arg min ||y − B[G] 2 q (3.4) ˆ = vec−1 (g) the current esˆ P] ˆ 1···n f ) is the data vector with G where y = vec([Pˆ + G ˆ 1···n ⊗ I). In ˆ := blockdiag(G timate for the surface-free Green’s function and B[G] f this expression, we assume the source to be omnidirectional, i.e. Q1···n f = Iq1···n f The sharpening operator LF in the penalty term imposes smoothness on the esˆ which corresponds to enforctimated Fourier coefficients of the source function q, 2 ing decay in the time domain. The trade off parameter λF balances Fourier domain smoothness versus data misfit to avoid overfitting, which may lead to leakage of multiples into the source function. In this bi-convex optimization, we start with estimate for the source function and then solve for the surface-free Green’s function by an one-norm optimization problem, which seeks a sparse vector that after convolution with the downgoing wavefield explains the total upgoing wavefield, see Equation (3.3). Given this sparse estimation for the Green’s function, the regularized least-squares problem is solved that seeks the Fourier coefficients of the source function, see Equation (3.4). A sparsfying Transform To further enforce the sparsity assumption, we can define a suitable sparsifying transform on the estimated surface-free Green’s function. In Lin and Herrmann (2009), they define this sparsifying transform as a Kronecker product between the 2D curvelet transform along source-receiver coordinates, and the wavelet transform along time coordinate, i.e. S := C ⊗ W, with C, W the curvelet and wavelet ˆ transform operators, respectively. The synthesis operator S∗ then sits between E[Q] and g in Equation 3.3, and the solution seeks the sparsest set of transform domain coefficients of the surface-free Green’s function Sg instead of the sparse Green’s function g. Due to the redundancy of the curvelet transform, about 8 times redundant in 2D case, and the high demand for memory of our implementation of Algorithm 1, see Section 3.2, the above sparsifying transform is not used in our matching. 22 3.2 EPSI with curvelet-domain matched-filtering EPSI implicitly assumes that the source is omnidirectional, i.e. constant source for all shot gathers; absence of 3D effects; infinite aperture; and ideal total surface reflectivity, i.e. R = −I. Although in principle EPSI can be extended to 3D, it is been shown that this assumption is more problematic to handle in practice as experiences with SRME-based method have shown (Lin et al., 2004). Deviation form these assumptions may lead to detrimental effects on the quality of the estimated surface-free data. Following the successful application of curvelet-domain matched-filtering within the SRME procedure, we propose to incorporate a matching step into our bi-convex optimization procedure to mitigate possible adverse effects that stem from amplitude mismatches that vary smoothly as a function of position and dip along the predicted wavefronts of the current estimate for the surface-free data. We include a non-trivial surface reflectivity operator, i.e. R = −I, that will absorb amplitude errors of non-ideal reflections at the surface, finite-aperture, and other unknown effects. We start with the assumption that the operator R is symmetric positive definite and pseudo local (no kinematic shifts). Then, we approximate this space-dipdependent filter by a simple curvelet domain scaling R ≈ C∗ diag(z)C. (3.5) After that, the diagonal curvelet scaling is estimated in the regularized least-squares problem Equation 2.11 that matches multiples to multiples and not predicted multiples to total data as in SRME. 3.2.1 Frequency-domain monochromatic matching Our first attempt to incorporate the surface reflectivity operator R into our formulation was to assume that this operator is frequency independent, even though it can be easily extended to include angular frequency dependency. Then we incorporate this operator in Equation 3.1 to get ˆQ ˆ = GC ˆ ∗ diag(z)CP, ˆ Pˆ − G 23 (3.6) which we can solve by least-squares to get the curvelet scaling z: 1 ˆ 22 + λ ||Lz||22 , arg min ||u − Mz|| 2 z (3.7) ˆ i ]) with i the index corresponding ˆQ ˆ ∗ diag(CP) ˆ and u := vec([P− ˆ G ˆ := GC where M to the frequency for which qˆ is maximum. This method solves for the curvelet scaling fast, 200x faster than the method proposed in the next section, and does not demand large memory to execute. However, it does not give the desired space-dip dependent amplitude corrections corresponding to the symbol of the pseudo differential operator a(x, ζ ) in Equation 2.8 since it is acting on a frequency slice that has two spatial coordinates and no time coordinate. To solve this problem, the scaling should act on a spatial-time slice, e.g. a shot gather or a common receivers gather. 3.2.2 Time-domain matching In this approach, we apply the spatial-dip dependant operator R to each shot record of the upgoing wavefield P separately, to give the desired spatial-dip dependent amplitude corrections. Then, we take the scaled upgoing wavefield to the Fourier domain and substitute it into EPSI monochromatic relationship Equation 3.1, to get ˆ Q) ˆ =G ˆ Mz ˆ 1···ns , vec(Pˆ − G (3.8) ˆ i = C∗ diag(CPi ) with i the shot gather index and ns is the number of where M sources. Our operator R now acts on each shot record of the upgoing wavefield P separately. This new approach seeks a smooth positive spatial-dip dependent curveletdomain scaling for all the shot gathers of the total upgoing wavefield simultaneously, which makes it time and memory demanding approach, especially with the curvelet redundancy of about 8 times for the 2D case. Incorporation of this expression into our formulation, yields a tri-convex opti- 24 mization problem that now includes curvelet-domain matching: 1 z1···ns = arg min ||u − Nz1···ns ||22 + λ ||Lz1···ns ||22 , z1···ns 2 (3.9) ˆ 1···n ). The trade-off ˆ Q] ˆ M] ˆ 1···n f ⊗ I) and u := vec(Pˆ − G where N := blockdiag([G f parameter λ and the sharpening operator L are of Equation 2.12 respectively and ns and n f are the number of sources and the number of frequencies respectively. To solve this tri-convex optimization, we start by assuming R = −I. Then, given an estimate for the source signature we solve Equation 3.3 for the sparse surface-free Green’s function. After that, we substitute our Green’s function estimate into Equation 3.4, and solve the least-squares problem to obtain a new source estimate. Now that we have an estimate for both G and Q, we can start the matching step by assuming R = −I and solve Equation 3.9 for the diagonal curvelet-domain scaling. The new estimate for the surface reflectivity operator R is then used in the next estimations of the surface-free Green’s function and the source function, see Algorithm 1. Algorithm 1 Stabilized Estimation of Primaries by Sparse Inversion with Curvelet Domain Matched Filtering Result: Estimate for g and Q choose noise level σ , iteration increment ρ g0 ←− 0, Q0 ←− 0, R0 ←− −I, k ←− 1 ˆ R]g 2 ≥ σ do while p − E[Q, gk ← solve (3.3) using initial guess Qk−1 , Rk−1 with SPG 1 1 at least ρk iterations. Qk ← solve (3.4) for the source signature using gk , Rk−1 with LSQR z ← solve (3.9) with LSQR Rk ← C∗ diag(z)C end while g ← gk Q ← Qk 1 see van den Berg and Friedlander (2008) 25 Chapter 4 Observations and results This chapter studies the factors affecting the outcome of incorporating a matchedfiltering step into EPSI. We conclude this chapter by showing our results. 4.1 Methodology 4.1.1 Synthetic data We consider synthetic time-domain finite-difference acoustic forward modeling data with 128 shot gathers each with 128 receivers (traces). Each trace has 256 time samples and a time sampling interval of 6.4ms; and the distance between any two sources or receivers is 20m. Delphi f dacmod was used to generate this data, see Table 4.1 for detailed parameters. We assume the data to be receiver ghost free. Unless otherwise stated, all the shot gathers shown in this section are the middle shot gather, i.e. shot index of 64 (offset = 1260m). 4.1.2 Results evaluation We want to compare the primary multiples separation results of EPSI with and without the curvelet-domain matched-filtering step. However, since we do not have the true primaries to evaluate the separation results, e.g. with signal-to-noise ratio, we rely on prior knowledge of the locations of these primaries to evaluate our results. 26 Table 4.1: Synthetic time-domain finite difference acoustic forward modeling data parameters. Parameter Value Parameter Value time sample interval samples/trace number of sources number of receivers wavelet type frequency range source type 6.4ms 256 128 128 Ricker wavelet 0-60Hz monopole sources depth receivers depth grid size (dx) grid size (dz) model size (nx) model size (nz) 5m 5m 5m 5m 509 251 The matching step will be evaluated by the smoothness and the positivity of the curvelet-domain matched-filtering diagonal scaling. 4.2 Factors affecting our results In this section, we study the factors that control the output of the matching process, namely the number of least-squares iterations η and the trade off parameter λ which control the smoothing and positivity of the curvelet-domain scaling. We will also answer the question of when and how frequent should we do the matching. Before experimenting with these factors, we present EPSI results without matching of the synthetic data described in Table 4.1. Figure 4.1a shows the zero-offset section of total upgoing wavefield. After applying EPSI to this data, it converges after 22 alternating optimization loops to give an estimate for the primaries (Figure 4.1b) and an estimate for the source wavelet (Figure 4.1d). The multiples estimate (Figure 4.1c) is calculated by subtracting the primary estimate form total upgoing wavefield. 4.2.1 The smoothing and positivity of the curvelet-domain diagonal scaling In Section 2.4, the second assumption behind the curvelet-domain matched-filtering states that the non-stationary differences vary smoothly as a function of location and dip and then it can be modeled mathematically by a matrix-vector multiplica- 27 (a) (b) (c) (d) Figure 4.1: Zero-offset EPSI synthetic data results with no matching. a) Toˆ b) Primaries estimation (Pˆ0 ). c) prital upgoing wavefield section (P). ˆ P), ˆ i.e. the multimaries estimation minus the total data (Pˆ − Pˆ0 = −G ˆ ples estimation d) Final estimated source wavelet (Q) tion of a positive definite matrix on a reference vector, see Equation 2.9. To test the effects of the number of least-squares matching iterations η of Equation 3.9 and the trade off parameter λ on the positivity and smoothness of the curvelet-domain scaling vector z1···ns , we carry out two experiments on which we fix one of the parameters and allow the other to vary. Then, we compare the smoothness of the curvelet-domain scaling of the middle shot gather (offset = 1260m) with a mosaic plot, see Figure 1.5 for detailed description. We also show 2D-plots of the same curvelet-domain scaling to judge the positivity. After that we present the final zero-offset primary multiple separation results, i.e. the estimated 28 ˆ 0. primaries Pˆ 0 and the calculated multiples M Fixing λ and varying η In this experiment, we test the effect of the number of curvelet-domain matchedfiltering iteration η on the positivity and smoothness of the curvelet-domain scaling vector by fixing the trade off parameter λ = 0.46 and varying the number of iteration η = 10, 35 and 100. Figure 4.2a, 4.2c and 4.2e show mosaic plots of the curvelet-domain scaling coefficients for η = 10, 35 and 100, respectively and Figure 4.2b, 4.2d and 4.2f show the corresponding 2D-plots of the same coefficients. Since the surface reflectivity coefficients are negative, approximated by R = −I in EPSI without matching, and to avoid the confusion with literature (Herrmann et al., 2008a; Yarham and Herrmann, 2008) where the curvelet-domain scaling vector z1···ns is positive, we will plot −z1···ns instead of z1···ns . The corresponding final primaries and multiples estimations are shown in Figure 4.3(a)-(f). As the number of iterations increases, the curvelet-domain scaling vector becomes more smooth and positive as shown in Figure 4.2. The smoothness and positivity also affect the final EPSI output. In Figure 4.3a, we notice a significant leakage of multiples into the zero offset primaries section, indicated by the small black arrows at around 0.48, 0.75, 0.8 and 1s. However, in Figure 4.3b, these indicated multiples are either eliminated or faded. Figure 4.3c has slightly less multiples than Figure 4.3b. For example, the multiple, indicated by the third arrow, has been eliminated. Figure 4.3(d)-(f) show the same effects, so as the number of iterations increases the multiples become more preserved. Increasing the number of iterations produces a more smooth and positive curveletdomain scaling vector, which results in a better primary-multiples separation. However, that comes at the expense of execution time. Although we are utilizing the parallel execution and the distributed arrays of MATLAB in our implementation of Algorithm 1, one matching iteration still takes about 10 minutes to execute on 5 nodes each with 8GB memory shared between 2 workers. 29 (a) λ = 0.46, η = 10 (b) λ = 0.46, η = 10 (c) λ = 0.46, η = 35 (d) λ = 0.46, η = 35 (e) λ = 0.46, η = 100 (f) λ = 0.46, η = 100 Figure 4.2: Mosaic and 2D plots for the curvelet-domain matched-filtering scaling for one shot (offset = 1260m) for λ = 0.46 and η = 10, 35 and 100. The λ and η used to generate each subfigure are stated underneath it. . 30 (a) λ = 0.46, η = 10 (b) λ = 0.46, η = 35 (c) λ = 0.1, η = 100 (d) λ = 0.46, η = 10 (e) λ = 0.46, η = 35 (f) λ = 0.46, η = 100 Figure 4.3: Zero-offset synthetic data results for λ = 0.46 and η = 10, 35 and 100. The λ and η used to generate each subfigure are stated underneath it. a-c) zero-offset primaries estimates. d-f) zero-offset multiples estimates. 31 Fixing η and varying λ This experiment aims to test the effect of the trade off parameter λ on the positivity and smoothness of the curvelet-domain scaling vector by fixing the number of matching iterations η = 35 and allowing the trade off parameter to vary λ = 0.046, 0.46 and 46.4. The mosaic plots of the curvelet-domain scaling coefficients for λ = 0.046, 0.46 and 46.4 are shown in Figure 4.4a, 4.4c and 4.4e respectively. Next to each mosaic plot, a 2D-plot of the same coefficients is shown, Figure 4.4b, 4.4d, and 4.4f. The corresponding final primaries and multiples estimations are shown in Figure 4.5(a)-(f). Increasing the trade off parameter λ will obviously make the curvelet-domain scaling smoother and more positive, see Figure 4.4, but λ also balances the curveletdomain smoothness versus data misfit to avoid overfitting, which may lead to leakage of multiples into primaries estimate. The later point is obvious in Figure 4.5, where we notice severe leakage of multiples (Figure 4.5f) into the primaries (Figure 4.5c). Figure 4.5a has slightly less multiples than Figure 4.5b especially at the deepest parts near the lowest three arrows. A large value of the trade off parameter λ and the number of matching iterations η are desirable, but if not chosen correctly might lead to severe leakage of multiples into primaries estimate or to longer execution time. Hence, in order to practically incorporate the matching step, we need to choose these parameters wisely to avoid data overfitting and timely execution. 4.2.2 When and how frequent should we do the matching? In this section, we carry out different experiments to determine the best way to incorporate curvelet-domain matched-filtering step into EPSI. We present these experiments starting with the worst result. Then, we evaluate the quality of each result by comparing it to EPSI result with no matching presented in Figure 4.1. In all the experiments, the number of iteration η and the trade off parameter λ are fixed, η = 30 and λ = 0.46. Before we carry out the experiments, we will take a look at different alternating optimization iterations snapshots (Figure 4.6). In particular, we look at the 32 (a) λ = 0.046, η = 35 (b) λ = 0.046, η = 35 (c) λ = 0.46, η = 35 (d) λ = 0.46, η = 35 (e) λ = 46.4, η = 35 (f) λ = 46.4, η = 35 Figure 4.4: Mosaic and 2D plots for the curvelet-domain matched-filtering scaling for one shot (offset = 1260m) for η = 35 and λ = 0.046, 0.46 and 46.4. The λ and η used to generate each subfigure are stated underneath it. 33 (a) (b) (c) (d) (e) (f) Figure 4.5: Zero-offset synthetic data results for η = 35 and λ = 0.046, 0.46 and 46.4. The λ and η used to generate each subfigure are stated underneath it. a-c) zero-offset primaries estimates. d-f) zero-offset multiples estimates. 34 ˆ Q, ˆ Figure 4.6a- 4.6d, the mulprimaries estimates for different iterations Pˆ 0 = G tiples estimations calculated by subtracting the primaries estimates from the total ˆ Q, ˆ Figure 4.6e- 4.6h, and finally at the multiples upgoing wavefield MS = Pˆ − G 0 estimates generated by convolving the Green’s functions with the total upgoing ˆ P, ˆ C −G ˆ Figure 4.6i- 4.6l. wavefield M 0 Referring to Figure 4.6, we notice the importance of the first few iterations; they define the way that the upcoming iterations follow. After these few iterations, ˆS we notice only small changes in the inversion process, i.e. small changes in Pˆ 0 , M 0 and MC0 . For example, there is a slight change between the primaries estimate after the sixth iteration (Figure 4.6c) and after the twenty-first iteration (Figure 4.6d) compared to a significant change between the primaries estimate after the first iteration (Figure 4.6a) and after the third iteration (Figure 4.6b). ˆ S and Notice also that after these few iterations, the two multiples estimates, M 0 ˆ C start to become closer to each other. In curvelet-domain matched-filtering, M 0 we are trying to estimate the surface-reflectivity by taking advantage of EPSI by matching multiples to multiples instead of predicted multiples to total data like in ˆ S close to M ˆ C , but how close? That is what SRME. Hence, it is desirable to have M 0 0 we will try to find out in the following experiments. Only one matching at the first alternating optimization loop In this experiment, we only solve one curvelet-domain matched-filtering optimization problem, Equation 3.9, right after we get the first estimation of the Green’s and the source function. Then, we use the estimated curvelet-domain scaling in all the upcoming iterations. Figure 4.7a- 4.7b show the zero-offset primaries and multiples estimates, respectively. There is obvious leakage of multiples into the primaries estimate, indicated by the small black arrows. The primaries estimate after the first alternating loop is still unstable and will go under a lot of changes during the first few loops, which might lead to this leakage. On the other hand, we notice a leakage of primaries into the multiples estimate, which might come form the fact that the two ˆ S and M ˆ C are far apart from each other when we do the matching. multiples M 0 0 Therefore, incorporating a matched-filtering step into EPSI in the first few iter- 35 (a) Iteration = 1 (b) Iteration = 3 (c) Iteration = 6 (d) Iteration = 21 (e) Iteration = 1 (f) Iteration = 3 (g) Iteration = 6 (h) Iteration = 21 (i) Iteration = 1 (j) Iteration = 3 (k) Iteration = 6 (l) Iteration = 21 Figure 4.6: a-d) Primaries estimates at different iterations. e-h) Multiples estimates calculated by subtracting the primaries estimate for the total data ˆ Q. ˆ i-l) Multiples estimates calculated by convolving the Green’s Pˆ − G ˆ P. ˆ function with the total upgoing wavefield −G 36 (a) (b) Figure 4.7: Zero-offset results of EPSI with only one matching step at the first alternating optimization loop. a) Zero-offset primaries estimate. b) Zero-offset multiple estimate calculated by subtracting the primaries estimate form the total upgoing wavefield. ations produces an undesirable leakage of events between primaries and multiples estimates. Multiple matching starting from the second alternating optimization loop In this experiment, we solve six curvelet-domain matched-filtering optimization problems at even loops starting from the second loop, i.e. in loops indexed 2, 4, ..., 12. The zero-offset primaries and multiples estimates are shown in Figure 4.8a4.8b, respectively. We immediately notice that the primaries are better preserved 37 (a) (b) Figure 4.8: Zero-offset results of EPSI with six matching steps starting after the second alternating optimization loop (even loops only). a) Zerooffset primaries estimate. b) Zero-offset multiple estimate calculated by subtracting the primaries estimate form the total upgoing wavefield. and there is much less primaries leaking into the multiples. However, there are obviously still multiples leaking into the primaries estimate, indicated by the arrows. This leakage might be coming form the first two matching steps at the second and fourth loop where the inversion process still goes under significant changes. This result is better than the previous one. However, solving more than one matched-filtering optimization problem is impractical. Solving one matched-filtering optimization problem takes longer than solving the same problem without the matching step. 38 (a) (b) Figure 4.9: Zero-offset results of EPSI with only one matching step at the twenty first alternating optimization loop. a) Zero-offset primaries estimate. b) Zero-offset multiple estimate calculated by subtracting the primaries estimate form the total upgoing wavefield. Only one matching near the last alternating optimization loop In this experiment, we only solve one matched-filtering optimization problem near the last loop, loop 21, just before the inversion problem converges. The zero-offset primaries and multiples estimates are shown in Figure 4.9a4.9b, respectively. This result is very close to the original result with no curveletdomain matched-filtering but it suffers from weak multiples events leaking into the primaries estimate, indicated by the black arrows. In the last few loops, the ˆS inversion problem goes through small changes and the two multiples estimates M 0 ˆ C are very close and there is no much that the matching step can correct. and M 0 39 (a) (b) Figure 4.10: Zero-offset results of EPSI with only one matching step at the twelve alternating optimization loop. a) Zero-offset primaries estimate. b) Zero-offset multiple estimate calculated by subtracting the primaries estimate form the total upgoing wavefield. Only one matching at the middle alternating optimization loop In this experiment, we solve one matched-filtering optimization problem near the middle loop, loop 12. At this stage in the inversion problem, the dramatic changes of the first few iterations have already stabilized and we still have many upcoming iterations before the inversion problem exits. Figure 4.10a- 4.10b show the zero-offset primaries and multiples estimates, respectively. This result is better than all previous results including the one without curvelet-domain matched-filtering one. We notice less multiples leakage into the primaries estimate especially where the two black arrows indicate. 40 4.2.3 Optimized result We will conclude this section by first presenting an optimized result, a result where λ and η are chosen wisely and the matched-filtering optimization is solved only once in the middle loop, and then compare it to EPSI result without the curveletdomain matched-filtering. The optimized result is shown in Figure 4.11. Figure 4.11a- 4.11c show the zero-offset section of the total upgoing wavefield, the estimate for the primaries and the calculated multiples, respectively. The estimated source wavelet is shown in Figure 4.11d. This result is better than EPSI without the curvelet-domain matched-filtering step presented in Figure 4.1. By comparing this result primaries estimate (Figure 4.11b) to the one without matching estimated primaries (Figure 4.1b), we notice that the multiple event indicated by the top arrow is now weaker and more interestingly the other multiple indicated by the bottom arrow is no longer there. The number of iterations (η = 60) and the trade off parameter (λ = 0.49) are chosen to avoid overfitting the data and to solve the least-squares optimization problem in a reasonable time and at the same time does not sacrifice the positivity and smoothness of the curvelet-domain scaling vector. The matching step is incorporate at the twelve loop away from the instability of the few first iteration to avoid leakage between the primaries and the multiples. 41 (a) (b) (c) (d) Figure 4.11: Zero-offset EPSI synthetic data result with curvelet-domain matched filtering incorporated into loop 12. The number of iterations η = 60 and the trade off parameter λ = 0.46. a) Total upgoing waveˆ b) Primaries estimation (Pˆ0 ). c) primaries estimation field section (P). ˆ P), ˆ i.e. the multiples estimation d) minus the total data (Pˆ − Pˆ0 = −G ˆ Final estimated source wavelet (Q) 42 Chapter 5 Conclusion and future works 5.1 Conclusion The aim of this thesis is to incorporate a curvelet-domain matched-filtering step into the formulation of EPSI and investigate the factors that affect the outcome of this matching step on the outcome of EPSI. We started by leveraging the curvelet transform ability to represent seismic data and images sparsely as a function of location, scale and dip to end up with a solid matched-filtering formulation that has the ability to control a possible overfitting that might lead to primary energy removal. Then we incorporated this formulation into EPSI formulation to produce a triconvex optimization problem in which after we invert for the Green’s and source function, we solve our matching-filtering problem. After that we looked at the different factors that affect the outcome of our inversion namely, the number of iteration η, the value of the trade off parameter λ and when and how frequent should we solve the matched-filtering problem. We also saw that the number of iterations and the value of the regularization parameter should be chosen wisely to balance the smoothness and positivity of the curveletdomain scaling versus data overfitting and long execution time. Then, we concluded by showing that solving the curvelet-domain matchedfiltering step around the middle alternating optimization loop gives the best results since in that stage the inversion problem have already stabilized and the dramatic 43 changes of the first few iterations have already done. 5.2 Future works In this section we try to improve our method by making it faster. We suggest two ways to do that either by changing the problem formulation or by imposing frequency regularization over the curvelet-domain scaling vector. New formulation The problem with the current formulation Equation 3.9 is that we have to solve for the curvelet domain scaling for the whole data at the same time which requires a great deal of time and memory. We start by assuming R = −I and then change the formulation a bit to get P − GQ = −C∗ diag(z)CGP. (5.1) This formulation is solved for each shot gather separately which makes it fast and easy to parallelized. However, this formulation does not have a physical meaning like matching for the reflectivity coefficients (Equation 3.9). Frequency regularization In Shahidi and Herrmann (2009), they show that by imposing a smoothness over the curvelet-domain scaling vector in the frequency domain, faster convergence is realized. By incorporating this in our formulation we could shorten the execution time of Equation 3.9. 44 Bibliography A. Berkhout. Seismic migration : imaging of acoustic energy by wave field extrapolation. Elsevier, 1985. → pages A. Berkhout. Multiple removal based on the feedback model. The Leading Edge, 18:127–131, 1999. → pages E. Candes, L. Demanet, D. Donoho, and L. Ying. Fast discrete curvelet transforms. Multiscale Modeling and Simulation, 5:861–899, 2006. → pages E. J. Candes and D. L. Donoho. Curvelets: a surprisingly effective nonadaptive representation of objects with edges. In in Curve and Surface Fitting: Saint-Malo. University Press, 2000. → pages M. R. Gadallah and R. L. Fisher. Applied Seismology: A Comprehensive Guide to Seismic Theory and Application. PennWell Corporation, 2005. → pages G. Hennenfent and F. J. Herrmannn. Seismic denoising with nonunformly sampled curvelets. Computing in Science and Engineering, 8:16–25, 2006. → pages F. J. Herrmann. Curvelet-domain matched filtering. In SEG Technical Program Expanded Abstracts, volume 27, pages 3643–3647. SEG, November 2008a. → pages F. J. Herrmann. Curvelet-domain matched filtering. In SEG Technical Program Expanded Abstracts, volume 27, pages 3643–3647. SEG, November 2008b. → pages F. J. Herrmann and G. Hennenfent. Non-parametric seismic data recovery with curvelet frames. Geophysical Journal International, 173:233–248, 2008. → pages 45 F. J. Herrmann, P. P. Moghaddam, and C. C. Stolk. Sparsity- and continuity-promoting seismic image recovery with curvelet frames. Applied and Computational Harmonic Analysis, 24:150–173, 2008a. URL http://slim.eos.ubc.ca/Publications/Public/Journals/HerrmannACHA07.pdf. Accepted for publication in the Journal of Applied and Computational Harmonic Analysis. → pages F. J. Herrmann, D. Wang, and D. J. Verschuur. Adaptive curvelet-domain primary-multiple separation. Geophysics, 73(3):A17–A21, 2008b. doi:10.1190/1.2904986. URL http://slim.eos.ubc.ca/Publications/Public/ Journals/herrmann08acd/paper html/paper.html. → pages D. Lin, J. Young, Y. Huang, and M. Hartmann. 3d srme application in the gulf of mexico. SEG Technical Program Expanded Abstracts, 23(1):1257–1260, 2004. doi:10.1190/1.1851098. URL http://link.aip.org/link/?SGA/23/1257/1. → pages T. Lin and F. J. Herrmann. Unified compressive sensing framework for simultaneous acquisition with primary estimation. In SEG, 2009. → pages R. Neelamani, A. I. Baumstein, D. G. Gillard, M. T. Hadidi, and W. L. Soroka. Coherent and random noise attenuation using the curvelet transform. The Leading Edge, 27:240–248, 2008. → pages C. C. Paige and M. A. Saunders. Lsqr: An algorithm for sparse linear equations and sparse least squares. ACM Trans. Math. Softw., 8:43–71, March 1982. ISSN 0098-3500. doi:http://doi.acm.org/10.1145/355984.355989. URL http://doi.acm.org/10.1145/355984.355989. → pages R. Saab, D. Wang, O. Yilmaz, and F. Herrmann. Curvelet-based primary-multiple separation from a bayesian perspective. In SEG International Exposition and 77th Annual Meeting, 2007. URL http: //slim.eos.ubc.ca/Publications/Public/Conferences/SEG/2007/saab07seg.pdf. → pages R. Shahidi and F. J. Herrmann. Curvelet-domain matched filtering with frequency-domain regularization. In SEG Technical Program Expanded Abstracts, volume 28, page 3645. SEG, 2009. → pages E. van den Berg and M. P. Friedlander. Probing the pareto frontier for basis pursuit solutions. SIAM Journal on Scientific Computing, 31(2):890–912, 2008. doi:10.1137/080714488. URL http://link.aip.org/link/?SCE/31/890. → pages 46 G. J. A. van Groenestijn and D. J. Verschuur. Estimating primaries by sparse inversion and application to near-offset data reconstruction. Geophysics, 74(3): A23–A28, 2009. doi:10.1190/1.3111115. URL http://link.aip.org/link/?GPY/74/A23/1. → pages D. J. Verschuur, A. J. Berkhout, and C. P. A. Wapenaar. Adaptive surface-related multiple elimination. Geophysics, 57:1166–1177, 1992. → pages E. Verschuur. Seismic multiple removal techniques: Past, present and future. EAGE Publications BV, 2002. → pages C. Yarham and F. J. Herrmann. Bayesian ground-roll seperation by curvelet-domain sparsity promotion. In SEG Technical Program Expanded Abstracts, volume 27, pages 2576–2580. SEG, November 2008. → pages 47 Appendix A Implementations There are many implementations of the curvelet-domain matched-filtering namely, a MATLAB, a Python and a partially MPI C++ implementation. Although the last implementation is the fastest and has more features, we chose to extend the MATLAB implementation for these reasons. MATLAB does not only make developing and testing algorithms fast and easy, but a recent extension to MATLAB, namely Parallel Computing Toolbox, makes solving memory and time intensive algorithms possible. It provides MPI free programming of distributed arrays, different parts of one large array reside at different computational nodes, and distributed algorithm execution, different parts of an algorithm is executed at different computational nodes at the same time. The latter feature makes our algorithm, see Section 3.2.2, possible. For example, the diagonal scaling in Equation 3.9 for our small synthetic data has more than 32 million elements. SPG 1 , a gradient-based solver that uses spectral step length and Pareto root finding (van den Berg and Friedlander, 2008), is used for solving the tion problem in Equation 3.3 and for the 2 1 optimiza- optimization problem in Equation 3.4, the least-squares solver described in Paige and Saunders (1982) is used. We have chosen the curvelet-domain transform via wrapping (Candes et al., 2006) as our curvelet transform with 6 scales and 16 angles at the finest scale. For this choice of implementation, the pseudo-inverse equals the adjoint. 48
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Estimation of surface-free data by curvelet-domain...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Estimation of surface-free data by curvelet-domain matched filtering and sparse inversion AlMatar, Mufeed H. 2011
pdf
Page Metadata
Item Metadata
Title | Estimation of surface-free data by curvelet-domain matched filtering and sparse inversion |
Creator |
AlMatar, Mufeed H. |
Publisher | University of British Columbia |
Date | 2011 |
Date Issued | 2011-05-09 |
Description | A recent robust multiple-elimination technique, based on the underlying principle that relates primary impulse response to total upgoing wavefield, tries to change the paradigm that sees surface-related multiples as noise that needs to be removed from the data prior to imaging. This technique, estimation of primaries by sparse inversion (EPSI), (van Groenestijn and Verschuur, 2009; Lin and Herrmann, 2009), proposes an inversion procedure during which the source function and surface-free impulse response are directly calculated from the upgoing wavefield using an alternating optimization procedure. EPSI hinges on a delicate interplay between surface-related multiples and pri- maries. Finite aperture and other imperfections may violate this relationship. In this thesis, we investigate how to make EPSI more robust by incorporating curvelet- domain matching in its formulation. Compared to surface-related multiple removal (SRME), where curvelet-domain matching was used successfully, incorporating this step has the additional advantage that matches multiples to multiples rather than predicated multiples to total data as in SRME. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | Eng |
Collection |
Electronic Theses and Dissertations (ETDs) 2008+ |
Date Available | 2011-05-09 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution 3.0 Unported |
DOI | 10.14288/1.0053426 |
Degree |
Master of Science - MSc |
Program |
Geophysics |
Affiliation |
Science, Faculty of Earth, Ocean and Atmospheric Sciences, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 2011-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by/3.0/ |
URI | http://hdl.handle.net/2429/34367 |
Aggregated Source Repository | DSpace |
Download
- Media
- ubc_2011_fall_almatar_mufeed.pdf [ 2.95MB ]
- Metadata
- JSON: 1.0053426.json
- JSON-LD: 1.0053426+ld.json
- RDF/XML (Pretty): 1.0053426.xml
- RDF/JSON: 1.0053426+rdf.json
- Turtle: 1.0053426+rdf-turtle.txt
- N-Triples: 1.0053426+rdf-ntriples.txt
- Original Record: 1.0053426 +original-record.json
- Full Text
- 1.0053426.txt
- Citation
- 1.0053426.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Country | Views | Downloads |
---|---|---|
United States | 11 | 0 |
China | 3 | 0 |
Russia | 1 | 0 |
France | 1 | 0 |
Saudi Arabia | 1 | 0 |
Japan | 1 | 0 |
City | Views | Downloads |
---|---|---|
Ashburn | 4 | 0 |
Long Beach | 4 | 0 |
Sunnyvale | 3 | 0 |
Unknown | 2 | 0 |
Shenyang | 1 | 0 |
Beijing | 1 | 0 |
Jinan | 1 | 0 |
Dhahran | 1 | 0 |
Tokyo | 1 | 0 |
{[{ mDataHeader[type] }]} | {[{ month[type] }]} | {[{ tData[type] }]} |
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.24.1-0053426/manifest