SEISMIC SINGULARITY CHARACTERIZATION WITH REDUNDANT DICTIONARIES

by

CATHERINE MAREVA DUPUIS

Graduée en Ingénierie, École Nationale Supérieure de Techniques Avancées, Paris, 2003

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES (Mathematics)

THE UNIVERSITY OF BRITISH COLUMBIA

July 2005

© Catherine Dupuis, 2005

Abstract

We consider seismic signals as a superposition of waveforms parameterized by their fractional-orders. Each waveform models the reflection of a seismic wave at a particular transition between two lithological layers in the subsurface. The location of the waveforms in the seismic signal corresponds to the depth of the transitions in the subsurface, whereas their fractional-order constitutes a measure of the sharpness of the transitions. By considering fractional-order transitions, we generalize the zero-order transition model of the conventional deconvolution problem, and aim at capturing the different types of transitions. The goal is to delineate and characterize transitions from seismic signals by recovering the locations and fractional-orders of their corresponding waveforms. This problem has received increasing interest, and several methods have been proposed, including multi- and monoscale analysis based on Mallat's wavelet transform modulus maxima, and seismic atomic decomposition. We propose a new method based on a two-step approach, which divides the initial problem of delineating and characterizing transitions over the whole seismic signal into two easier subproblems. The algorithm first partitions the seismic signal into its major components, and then estimates the fractional-orders and locations of each component.
Both steps are based on the sparse decomposition of seismic signals in overcomplete dictionaries of waveforms parameterized by their fractional-orders, and involve $\ell_1$ minimizations solved by an iterative thresholding algorithm. We present the method and show numerical results on both synthetic and real data.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgments

Part I. Capita Prima

Chapter 1. Introduction

Chapter 2. Background
2.1 Model for seismic transitions
2.1.1 The seismic reflectivity
2.1.2 The seismic source function
2.1.3 The model for the imaged seismic signal
2.2 Previous work
2.2.1 Complex seismic trace analysis
2.2.2 Multiscale analysis
2.2.3 Monoscale analysis
2.2.4 Seismic Atomic Decomposition

Chapter 3. Problem Statement
3.1 Dictionaries and Decomposition
3.1.1 Basis Pursuit
3.2 Uniqueness condition and equivalence between (P0) and (P1)
3.2.1 Empirical studies for the performance of BP
3.3 Detection-Estimation approach

Chapter 4. Detection-Estimation Method
4.1 Detection
4.1.1 Detection dictionary
4.1.2 Decomposition in Detection
4.1.3 Algorithm
4.1.4 Windowing
4.2 Estimation
4.2.1 Decomposition in Estimation
4.2.2 Dictionary in Estimation
4.2.3 Resolution
4.3 Numerical scheme

Chapter 5. Numerical Experiments
5.1 Application to synthetic data
5.1.1 Simple 1D synthetic data
5.1.2 Comparison with Basis Pursuit and Matching Pursuit
5.1.3 Realistic synthetic data
5.2 Application to real data

Chapter 6. Discussion
6.1 Summary and Conclusions
6.2 Limitations of our method
6.3 Future work

Part II. Capita Selecta

Chapter 7. Atomic Decompositions with Redundant Dictionaries
7.1 Problem formulation
7.2 Method of Frames
7.2.1 The algorithm
7.2.2 Concept of sparsity
7.3 Matching Pursuit
7.4 Basis Pursuit
7.4.1 The formulation
7.4.2 Linear Programming (LP)
7.4.3 Basis Pursuit De-Noising
7.5 Basis Pursuit versus Matching Pursuit
7.5.1 Results
7.5.2 Discussion
7.6 Block Coordinate Relaxation Method
7.6.1 Problem statement
7.6.2 Algorithm
7.6.3 Numerical scheme
7.6.4 Simple examples
7.6.5 Multiple Block Coordinate Relaxation Method
7.6.6 Results
7.6.7 Discussion
7.7 Conclusion

Bibliography

Appendix A. Wavelets
A.1 Overview

Appendix B. Bases
B.1 Fractional Splines
B.2 Fractional Derivatives of the Gaussian

Appendix C. Soft Thresholding
C.1 Derivation of the formula

Appendix D. Fourier Transform and Fast Fourier Transform
D.1 Theoretical results
D.1.1 Definitions
D.1.2 Approximation of the Fourier coefficients with the FFT
D.1.3 Fourier coefficients and Fourier Transform
D.1.4 Fourier Transform and Fast Fourier Transform
D.2 Fourier Transform and Fast Fourier Transform in Matlab
D.2.1 Approximation of the Fourier Transform in Matlab
D.2.2 Matlab code for the Fourier Transform

Appendix E.
Explanation of the software

List of Tables

5.1 CPU running times of BP, MP and our method SEEDE

List of Figures

1.1 Detection-estimation method

2.1 Causal and anti-causal fractional splines
2.2 Varying order transitions
2.3 Wavelet Transform Modulus Maxima Lines
2.4 Multifractal analysis of a well-log measurement
2.5 Global multiscale analysis on different well-log measurements
2.6 Monoscale analysis of sedimentary records

3.1 Probability of success of BP

4.1 An example of the window and its parameters
4.2 Illustration of the windowing
4.3 Fractional derivatives of the Gaussian with similar fractional-orders
4.4 Detection-Estimation Algorithm

5.1 Estimation of the average waveform of seismic data with a Fourier spectrum fit
5.2 Synthetic signal with 20 events
5.3 Synthetic signal with 30 events
5.4 Comparison between the detection-estimation algorithm, BP and MP
5.5 Synthetic slice with 16 reflectors
5.6 Detected reflectors for the synthetic slice with 16 reflectors
5.7 Original fractional-orders for the synthetic slice with 16 reflectors
5.8 Recovered fractional-orders for the synthetic slice with 16 reflectors
5.9 Synthetic slice with 24 reflectors
5.10 Detected reflectors for the synthetic slice with 24 reflectors
5.11 Original fractional-orders for the synthetic slice with 24 reflectors
5.12 Recovered fractional-orders for the synthetic slice with 24 reflectors
5.13 Real data
5.14 Detected reflectors in the real data
5.15 Recovered fractional-orders in the real data
5.16 Real data with recovered fractional-orders

6.1 Improved detection-estimation method

7.1 Decomposition of a cosine using the DCT and the Haar wavelets
7.2 Analysis of a very sparse coefficient with an MDWT dictionary with 2 α
7.3 Analysis of a very sparse coefficient with an MDWT dictionary with 10 α
7.4 Analysis of a very sparse coefficient with 2 very different α
7.5 Analysis of a very sparse coefficient with an MDWT dictionary with 2 similar α
7.6 Analysis of a less sparse coefficient with an MDWT dictionary with 2 α
7.7 Analysis of a less sparse coefficient with an MDWT dictionary with 10 α
7.8 Analysis of a less sparse coefficient with an MDWT dictionary with 2 close α
7.9 Analysis of a less sparse coefficient with an MDWT dictionary with 2 different α
7.10 Analysis of two close components in an MDWT dictionary with two different α
7.11 Analysis of two close components in an MDWT dictionary with ten different α
7.12 Block Coordinate Relaxation Algorithm
7.13 BCR algorithm: separating the noise, a cosine function and a step function
7.14 BCR algorithm: separating the noise, a cosine function and bumps
7.15 Original and recovered signals
7.16 Original and recovered coefficients
7.17 Five first original and recovered signals
7.18 Five last original and recovered signals
7.19 Five first original and recovered coefficients
7.20 Five last original and recovered coefficients
7.21 Original and recovered signals
7.22 Original and recovered coefficients
7.23 Original and recovered signals
7.24 Original and recovered coefficients
7.25 Original signals
7.26 Recovered signals
7.27 Original coefficients
7.28 Recovered coefficients
7.29 BCR separation with decimated fractional spline wavelets

Acknowledgments

First, I would like to acknowledge my engineering school ENSTA (École Nationale Supérieure de Techniques Avancées) in Paris, France, whose excellent program enabled me to study at the University of British Columbia. I am also very grateful to Dr. Robert Braid and all my professors at ENSTA for their help and support. I would also like to thank my supervisor, Dr.
Felix Herrmann, for providing me with a very interesting project and guiding me in my research. I also thank him for enabling me to participate in the MGA (Multiscale and Geometry Analysis) program organized by IPAM (Institute for Pure and Applied Mathematics) at the University of California, Los Angeles. Doing research at IPAM during the Fall of 2004 as part of the MGA program was a wonderful experience.

I would like to thank Richard Kublik for his help and support, especially during difficult times. I also thank Professor Ozgur Yilmaz for useful discussions and suggestions on my work, and my officemates Peyman Poor Moghaddam, Gilles Hennenfent and Urs Boeniger for their help and friendship. I gratefully acknowledge a Jean-Walter Zellidja scholarship for my first year at UBC. Finally, I give personal thanks to my parents for their love, support and encouragement throughout my studies.

Part I

Capita Prima

Chapter 1

Introduction

The Earth's subsurface consists of layers of different materials separated by interfaces, also called transitions. Transitions are characteristic of regions where the Earth's properties (indicated by changes in the acoustic properties) vary rapidly compared to the scale of the seismic wavelet. These sudden changes in the properties at the transitions create edges in the medium and provoke the reflection of seismic waves. These reflections are recorded at the surface, giving rise to a seismic signal (seismic trace) which contains the seismic events triggered by the reflection of the waves at the different transitions. To a good approximation, a seismic signal can be written as the convolution of the seismic reflectivity and the seismic source function (seismic wavelet) [5]. Here the reflectivity depends on the transitions in the acoustic impedance, which is the product of the compressional wave speed and the density of the materials.
Seismic data thus carries information on the geometrical relationships between subsurface lithological boundaries, which is usually used to build geological models to explain the origin of stratigraphic features. Extracting local information about transitions from seismic data has recently received increasing interest [6, 7, 9, 11, 12, 18, 22, 23, 34, 35], as accurate knowledge of the subsurface is more and more needed. Accurately finding the types of transitions is important because the information on the type of transitions complements the information about amplitude and phase already used by geologists. Considered together, the type of transition, along with its amplitude and phase, can be used in various geophysical applications, ranging from improving the geological interpretation of the subsurface, to detecting changes in the lithology and constraining the critical points in the percolation model proposed by Herrmann et al. [19]. In this work [19], a bi-compositional mixture model for transport properties (elastic and fluid) of rocks has been proposed. At the critical points, where the more permeable of the rock constituents connect, the model predicts a multi-scale transition in the transport properties. The exponents of the transitions associated with the critical points could explain and characterize the transitions in the subsurface. In this study, we are interested in characterizing the transitions in the subsurface, as this information may be used to infer the permeability properties of the rocks.

The diversity of transitions in the subsurface has recently been established by Herrmann et al. [7, 14, 16, 22, 27] in their study of seismic and well data using multiscale analysis with wavelets. The results of Herrmann, Muller and others showed that the Earth's subsurface behaved like a multifractal, giving rise to a medium consisting of varying order singularities [7, 14, 16, 18, 27, 35].
Transitions no longer follow the typical zero-order discontinuity model (i.e. step function), but can have any fractional-order discontinuity, which can be obtained by fractionally differentiating/integrating a zero-order discontinuity [17, 33]. The conventional deconvolution problem (spiky decon) considers seismic transitions in the subsurface as zero-order discontinuities, and aims at recovering the seismic reflectivity as a spike train signal. Beyond this model, several methods have been proposed to delineate and characterize transitions.

Starting in 1972 with Harms and Tackenberg [13], followed by Payton in 1977 [28], and later Taner et al. in 1979 [32], attempts were made to characterize transitions using complex trace analysis. Complex trace analysis is based on instantaneous phase behavior, where the instantaneous phase is seen as a seismic trace attribute giving information on lithological properties. This approach relies on the non-local Hilbert transform, which makes it noise sensitive. In this analysis, transitions are modeled as a combination of several sub-wavelength zero- and/or first-order discontinuities. However, this transition model is still simplistic, as it does not incorporate all the different types of transitions.

We propose a mathematical model for transitions that incorporates instantaneous phase behavior through any fractional-order discontinuity [7]. Our model allows for any fractional-order transitions in the subsurface, leading to a broader class of reflectivity signals. This makes our model more general, and an extension of the conventional deconvolution model. In the case where transitions are modeled by zero-order discontinuities, our model reduces to the conventional deconvolution model.

Following their multiscale studies of seismic and well data, Herrmann et al. [16, 17, 20] proposed a method to characterize the sharpness of transitions using monoscale analysis.
In this method, zero-order transitions were generalized to any fractional-order. The key to this method is to fractionally integrate/differentiate the seismic signal until the disappearance/appearance of a local maximum. The amount of integration/differentiation at the point of disappearance/appearance of the local maximum gives the estimate for the fractional-order of the transition. Although relatively successful, this method lacks robustness and is highly sensitive to noise and to any phase rotation of the waveforms in the signal.

Following the monoscale analysis, Herrmann [18] proposed a parameterization of seismic signals with redundant dictionaries. In this framework, the signal is written as a weighted superposition of waveforms parameterized by their fractional-order. These waveforms are taken from a suitably chosen collection, called a dictionary [2, 26]. Given a dictionary, the signal can be represented by a vector containing the weighting coefficients. Assuming sparse reflectivity, the idea is to find the sparsest set of coefficients in the chosen dictionary that represents the signal. Under certain conditions, the sparsest representation contains a few large coefficients corresponding to the different seismic events in the signal. The problem of finding these coefficients, which amounts to solving an ill-posed problem, was attempted using Matching Pursuit [15, 18, 26] and Basis Pursuit [2, 18], two algorithms that approximate the sparsest representation. The results of Matching Pursuit and Basis Pursuit showed the ill-posedness of the problem and the difficulties in finding the sparsest representation for complicated signals in large dictionaries [18]. Even though both algorithms are able to find the sparsest representation for simple configurations, both fail when the dictionary becomes too big or the signal too complex. Motivated by the conclusions of these studies [18], we reformulate the problem into a more tractable approach.
We propose a two-step method that divides the resolution of the problem into two subproblems. In the first part, which we call detection, we find the major events in the seismic signal. In the second part, which we call estimation, we estimate their fractional-orders.

By finding the largest sets of coefficients in the approximate sparse representation of the seismic signal in a chosen detection dictionary $\mathcal{D}_d$, the detection algorithm locates the major features of the seismic signal that correspond to predominant reflectors in the Earth. The detection dictionary contains several parameterized waveforms that mimic the behavior of the seismic signal. The richness of the detection dictionary accounts for non-stationarity in the seismic signal. The approximate representation $c$ of the signal $s$ in $\mathcal{D}_d$ is obtained by minimizing

$$\frac{1}{2}\|s - \Phi_d c\|_2^2 + \lambda_d \|c\|_1 \qquad (1.1)$$

with respect to $c$, where $c$ is the representation of the seismic signal $s$ with respect to the dictionary $\mathcal{D}_d$, $\Phi_d$ is the matrix of the detection dictionary $\mathcal{D}_d$, and $\lambda_d$ is a trade-off parameter between the data misfit and the $\ell_1$ norm of the coefficient vector $c$. The columns of $\Phi_d$ are the different waveforms in $\mathcal{D}_d$ after normalization. The detection step can be viewed as a spiky deconvolution problem, where the goal is to find a spike train signal containing the locations of the predominant components.

The different signal components associated with the largest coefficients in the representation $c$ constitute a partitioning of the signal $s$ into its prevailing localized components (major reflection events). The extraction of these components is done by multiplying the signal with localized windows centered at the different component locations. The extracted signals, now localized, are analyzed separately in the estimation part. We denote by $s_i$ the $i$-th signal component extracted from $s$.
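As the abstract notes, minimizations of the form (1.1) are solved by an iterative thresholding algorithm (the soft-thresholding formula is derived in Appendix C). The following Python code is an illustrative sketch only, not the thesis's Matlab implementation: it applies iterative soft thresholding (ISTA) to problem (1.1) with a small random dictionary whose size, signal, and λ value are made-up example choices.

```python
import numpy as np

def soft_threshold(x, t):
    # Soft-thresholding operator: the proximal map of t * ||.||_1.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(Phi, s, lam, n_iter=500):
    # Minimize 0.5 * ||s - Phi @ c||_2^2 + lam * ||c||_1 by iterative soft thresholding.
    L = np.linalg.norm(Phi, 2) ** 2      # gradient Lipschitz constant: largest squared singular value
    c = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        c = soft_threshold(c + Phi.T @ (s - Phi @ c) / L, lam / L)
    return c

# Hypothetical example: a 2-spike coefficient vector in a random overcomplete dictionary.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((64, 128))
Phi /= np.linalg.norm(Phi, axis=0)       # normalized columns, as in the text
c_true = np.zeros(128)
c_true[10], c_true[90] = 2.0, -1.5
s = Phi @ c_true
c_hat = ista(Phi, s, lam=0.05)           # the two largest |c_hat| entries locate the events
```

The small λ keeps the bias low while still driving all but a few coefficients to zero, which is exactly the "spike train" behavior the detection step relies on.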
The estimation algorithm operates on relatively simple and localized signals $s_i$ assumed to be a linear combination of a limited number of waveforms. The estimation process delineates the different waveforms in each signal $s_i$, and estimates their fractional-order by finding the sparsest representation of the signal in another dictionary of parameterized waveforms, denoted $\mathcal{D}_e$ for estimation dictionary. Each $s_i$ is analyzed in the estimation part through the minimization of

$$\frac{1}{2}\|s_i - \Phi_e c_i\|_2^2 + \lambda_e \|c_i\|_1 \qquad (1.2)$$

with respect to $c_i$, where $s_i$ denotes the $i$-th signal component of the seismic signal $s$, and $c_i$ its representation with respect to the estimation dictionary $\mathcal{D}_e$. Here, we aim at an exact reconstruction, which requires the parameter $\lambda_e$ to be very small. When $\lambda_e \to 0$, (1.2) behaves like Basis Pursuit [2] and translates into

$$\min \|c_i\|_1 \quad \text{subject to} \quad s_i = \Phi_e c_i. \qquad (1.3)$$

The coefficients in the sparse representation $c_i$ correspond to waveforms in $\mathcal{D}_e$ whose parameters $\alpha$ constitute estimates for the fractional-orders of $s_i$. The final step collects the different waveforms and associated fractional-orders pertaining to each signal component $s_i$ into a global vector describing the complete seismic signal $s$. The global vector contains the fractional-orders of all the reflection events of the total seismic signal $s$. Figure 1.1 shows the main steps of our detection-estimation method.

Figure 1.1: Detection-estimation method. The numbers 4.1, 4.1.4 and 4.2 refer to the sections where the particular topic is explained in detail.

Chapter 2

Background

2.1 Model for seismic transitions

Significant variations in the Earth's properties over the length-scale of the seismic wavelet create singularities in the subsurface and make seismic waves reflect.
These singularities occur at transitions between layers, and are typically modeled by zero-order discontinuities (i.e. step discontinuities). However, multiscale analysis on sedimentary records (vertical profiles of the broadband fluctuation of the elastic properties) performed by Herrmann et al. [14, 16, 27] revealed the existence of accumulations of varying order singularities in the subsurface, which give rise to any fractional-order discontinuity. The different types of transitions correspond to the different materials in the Earth and to the various ways in which materials are deposited and mixed together at the transitions.

2.1.1 The seismic reflectivity

The mathematical model

Following Herrmann et al. [15, 17, 18, 21], we extend the usual zero-order transition model to any fractional-order transition, modeled by the causal and anti-causal fractional splines [33] (also called fractional onset functions):

$$\chi_\pm^\alpha(x) = \frac{(\pm x)_+^\alpha}{\Gamma(\alpha + 1)}, \qquad x_+ := \max(x, 0), \qquad (2.1)$$

where $\alpha > 0$, and $\Gamma$ is the Gamma function defined as $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\, dt$ for $x \in \mathbb{R}$. When $\alpha > 1$, these functions are differentiable. When $\alpha$ is between 0 and 1, they become non-differentiable but remain continuous. For $-1 < \alpha < 0$, they are discontinuous (and non-differentiable). When $\alpha < -1$, the fractional splines cease to be locally integrable, and become tempered distributions [17]. Causal and anti-causal fractional splines are linked together by the relationship $\forall x \in \mathbb{R}$, $\chi_-^\alpha(x) = \chi_+^\alpha(-x)$. Mathematically, the fractional exponents $\alpha$ specify the regularity of the fractional splines, as they coincide with their Lipschitz exponent. Each of these fractional splines [33] is Lipschitz $\alpha$, where the Lipschitz regularity is defined as:

Definition 2.1.1 (Lipschitz regularity [25])
• A function $f$ is pointwise Lipschitz $\alpha \ge 0$ at $v$ if there exist $K > 0$ and a polynomial $p_v$ of degree $m = \lfloor \alpha \rfloor$ such that $\forall t \in \mathbb{R}$, $|f(t) - p_v(t)| \le K |t - v|^\alpha$.
• A function $f$ is uniformly Lipschitz $\alpha$ over $[a, b]$ if it satisfies the above inequality for all $v \in [a, b]$, with a constant $K$ that is independent of $v$.
• The Lipschitz regularity of $f$ at $v$ or over $[a, b]$ is the supremum of the $\alpha$ such that $f$ is Lipschitz $\alpha$.

Our choice of fractional splines is motivated by their regularity and scale invariance. Thus we can characterize the sharpness of a transition irrespective of the scale:

$$\forall x \in \mathbb{R},\ \forall \sigma > 0,\ \alpha \ge 0: \quad \chi_\pm^\alpha(\sigma x) = \sigma^\alpha \chi_\pm^\alpha(x).$$

As explained before, each of these functions has a particular order of singularity $\alpha$. For $\alpha > 1$, the function $\chi_\pm^\alpha$ has bounded derivatives and a singularity at $v$ with Lipschitz exponent $\alpha$. However, when $\alpha < 1$, $\chi_\pm^\alpha$ is not differentiable at $v$ and $\alpha$ characterizes the type of singularity. Examples of causal and anti-causal fractional splines are shown in Figure 2.1.

Figure 2.1: Examples of causal and anti-causal fractional splines with different fractional-orders ($\alpha$ = 0, 0.1, 0.3, 0.5). The top two rows show causal fractional splines, and the bottom two rows show anti-causal fractional splines.

Implementation of the $\chi_\pm^\alpha$ basis

We construct the basis $B_\alpha^\pm$, made of the family of all translations of $\chi_\pm^\alpha$ for each $\alpha$. Except for $\alpha = 0$, the family $B_\alpha^\pm$, obtained by all the translations of $\chi_\pm^\alpha$, gives a degenerate collection of waveforms, as the last basis function is zero everywhere. This is due to the fact that the first sample of $\chi_\pm^\alpha$ is zero for any $\alpha$ except $\alpha = 0$. By adding 1 to all the $\chi_\pm^\alpha$, we ensure all the basis functions are non-zero. Here the translations are non-circular.
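To make the fractional splines concrete, here is a small Python sketch (the thesis's own software is in Matlab; this is an illustrative re-implementation, and the function name `chi_plus` and the grid are our own choices) that samples $\chi_+^\alpha$ from definition (2.1) and numerically checks the scale-invariance relation $\chi_\pm^\alpha(\sigma x) = \sigma^\alpha \chi_\pm^\alpha(x)$:

```python
import numpy as np
from scipy.special import gamma

def chi_plus(x, alpha):
    # Causal fractional spline: chi_+^alpha(x) = x^alpha / Gamma(alpha + 1) for x >= 0, else 0.
    out = np.zeros_like(x, dtype=float)
    pos = x >= 0
    out[pos] = np.power(x[pos], alpha) / gamma(alpha + 1.0)
    return out

x = np.linspace(-1.0, 2.0, 601)
step = chi_plus(x, 0.0)    # zero-order transition (step function)
half = chi_plus(x, 0.5)    # a fractional-order transition between step and ramp
ramp = chi_plus(x, 1.0)    # first-order transition (ramp), since Gamma(2) = 1

# Scale invariance: chi_+^alpha(sigma * x) equals sigma^alpha * chi_+^alpha(x).
sigma, alpha = 2.0, 0.5
lhs = chi_plus(sigma * x, alpha)
rhs = sigma ** alpha * chi_plus(x, alpha)
```

The check confirms that rescaling the depth axis only rescales the amplitude by $\sigma^\alpha$, so the sharpness measure $\alpha$ is indeed scale-free.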
If we use circular translations (or circular permutations), we create a singularity of order 0 for all the basis functions in all the bases $B_\alpha^\pm$. A circular translation $C_p$, $p \in \mathbb{Z}$, of a vector $v \in \mathbb{R}^N$ has the property that, for $1 \le n \le N$,

$$C_p v(n) = \begin{cases} v((n + p) \bmod N) & \text{if } n + p \not\equiv 0 \pmod{N}, \\ v(N) & \text{if } n + p \equiv 0 \pmod{N}. \end{cases}$$

Let $N$ be the length of the vector $\chi_+^\alpha$ and define $B_\alpha^+ = \{\chi_i^\alpha\}_{1 \le i \le N}$, such that $\forall\, 1 \le n \le N$, $\chi_i^\alpha(n) = \chi_+^\alpha(n - i + 1)$, where $\chi_+^\alpha(n) = 0$ for all $n < 0$. It is easy to prove that for any $\alpha > 0$, $B_\alpha^+$ is a basis; see the proof in Appendix B.1. A similar argument is used to show that $B_\alpha^-$ is a basis.

Transitions and reflectivity

We denote by $f$ the 1D vertical earth model that represents the transitions as a superposition of fractional splines:

$$f(z) = \sum_i c_i\, \chi_\pm^{\alpha_i}(z - z_i), \qquad (2.2)$$

where $z$ corresponds to the depth in the subsurface, and $z_i$ to the location of the $i$-th transition in the subsurface. By differentiating $f$, given by (2.2), we obtain the seismic reflectivity $r$ as

$$r(z) = \sum_{i \in \Lambda_c} \kappa_i\, \chi_+^{\alpha_i - 1}(z - z_i) + \sum_{i \in \Lambda_a} \kappa_i\, \chi_-^{\alpha_i - 1}(z - z_i), \qquad (2.3)$$

where $\kappa_i = \pm c_i$, and $\Lambda_c$ and $\Lambda_a$ are the sets of indices for the transitions with causal and anti-causal discontinuities respectively.

2.1.2 The seismic source function

In most geophysical surveys, and especially those using explosive sources, the seismic source function $\psi$ is unknown, and is typically modeled by the Ricker/Mexican hat wavelet (the second derivative of the Gaussian with opposite sign). In the rest of the paper, we will only use the name Ricker wavelet. We relax this model and discuss our assumptions on the seismic source function with regard to its smoothness and wiggliness. We first introduce some definitions. Following Unser et al.
[33], a $\gamma$-order causal fractional B-spline, $\beta_+^\gamma$, is defined by taking the $(\gamma+1)$-th fractional difference of a one-sided power function:

$$\beta_+^\gamma(x) = \Delta_+^{\gamma+1} \frac{x_+^\gamma}{\Gamma(\gamma + 1)} = \frac{1}{\Gamma(\gamma + 1)} \sum_{k \ge 0} (-1)^k \binom{\gamma + 1}{k} (x - k)_+^\gamma, \qquad (2.4)$$

where $\gamma > -1$, $x \in \mathbb{R}$, and $\binom{\gamma+1}{k}$ is the generalization of the binomial coefficients defined as

$$\binom{\gamma + 1}{k} = \frac{\Gamma(\gamma + 2)}{\Gamma(k + 1)\, \Gamma(\gamma + 2 - k)}$$

[33]. The $(\gamma+1)$-th difference of a function $g$ is defined by $\Delta_+^{\gamma+1} g(x) = \sum_{k \ge 0} (-1)^k \binom{\gamma+1}{k}\, g(x - k)$. In the Fourier domain, (2.4) translates to

$$\hat\beta_+^\gamma(\omega) = \left(\frac{1 - e^{-i\omega}}{i\omega}\right)^{\gamma + 1}, \qquad (2.5)$$

with the frequency $\omega \in \mathbb{R}$. The anti-causal and symmetric fractional splines [33] are defined in a similar way by

$$\text{anti-causal:}\quad \hat\beta_-^\gamma(\omega) = \hat\beta_+^\gamma(-\omega), \qquad \text{symmetric:}\quad \hat\beta_*^\gamma(\omega) = \left|\frac{\sin(\omega/2)}{\omega/2}\right|^{\gamma + 1}. \qquad (2.6)$$

The fractional B-splines are the extensions of the traditional B-splines to non-integer exponents $\gamma$. For further extensions, Blu et al. [1] defined the $(\gamma, \tau)$-fractional B-splines

$$\hat\beta_\tau^\gamma(\omega) = \left(\frac{1 - e^{-i\omega}}{i\omega}\right)^{\frac{\gamma+1}{2} + \tau} \left(\frac{1 - e^{i\omega}}{-i\omega}\right)^{\frac{\gamma+1}{2} - \tau}, \qquad (2.7\text{--}2.8)$$

where $\gamma$ is the usual fractional exponent, and $\tau$ controls the phase of the function. Unlike the traditional B-splines, fractional splines do not have compact support, but are of rapid decay, with the decay property [33]

$$\beta_\tau^\gamma(x) = O\!\left(|x|^{-(\gamma + 2)}\right). \qquad (2.9)$$

The fractional B-splines and $(\gamma, 0)$-fractional B-splines converge to the infinitely smooth Gaussian function when $\gamma \to \infty$. We consider the source function as a $\delta$-th derivative of a $(\gamma, \tau)$-fractional B-spline, parameterized by its fractional exponents $\gamma$ and $\delta$, and its phase determined by $\tau$:

$$\psi(t) = \partial^\delta \beta_\tau^\gamma(t) \quad \text{for } \delta < \gamma + 1, \qquad (2.10)$$

where $t$ is the time variable. The conventional Ricker wavelet model for the seismic source function $\psi$ is a particular case of our model for $\delta = 2$, $\tau = 0$, and $\gamma \to \infty$. In addition to this model, we discuss two properties of the seismic source function $\psi$: the decay of its Fourier spectrum for frequencies going to infinity, as $\psi \in C^\gamma(\mathbb{R})$ ($\psi$ is a $\gamma$-times continuously differentiable function on $\mathbb{R}$),

$$\int_{\mathbb{R}} |\hat\psi(\omega)|\, |\omega|^\gamma\, d\omega < \infty,$$
and the differentiability of the Fourier transform of $\psi$ at zero frequency [18]:

$$\int_{\mathbb{R}} t^q\, \psi(t)\, dt = 0 \iff \left.\partial_\omega^q \hat\psi(\omega)\right|_{\omega = 0} = 0 \quad \text{for } q \in \mathbb{N} \text{ and } q < M.$$

The first condition limits the high-frequency content of $\psi$ by setting the asymptotic decay rate for high frequencies, and the second condition requires that $\psi$ be orthogonal to some finite degree polynomial, hence describing its wiggliness [18]. This latter property defines the number of vanishing moments of $\psi$ [25], denoted $M$.

2.1.3 The model for the imaged seismic signal

We consider the linearized single scattering approximation, where the imaged seismic signal $s$ can be written as:

$$s(z) \propto (r * \psi_z)(z). \qquad (2.11)$$

Here $r$ is the imaged travel-time depth-converted reflectivity, and $\psi_z$ is the depth-converted seismic source function. For clarity purposes, we simplify the notations and refer to the depth-converted seismic source function $\psi_z$ as $\psi$. Because we are only interested in the singularity properties of the seismic signal, we simplify the notations and omit the constants. We therefore write the seismic signal $s$ as the convolution of the reflectivity and the seismic source function:

$$s(z) = (r * \psi)(z) \qquad (2.12a)$$
$$= \sum_{i \in \Lambda_c} \kappa_i\, \chi_+^{\alpha_i - 1}(z - z_i) * \psi(z) + \sum_{i \in \Lambda_a} \kappa_i\, \chi_-^{\alpha_i - 1}(z - z_i) * \psi(z), \qquad (2.12b)$$

where $r$ is given by equation (2.3). The seismic signal $s$ can also be written as a superposition of fractional derivatives/integrations of the seismic wavelet $\psi$, where the definition of fractional differentiation is given by Liouville [24] in his generalization of differentiation to fractional-orders:

Definition 2.1.2 (Fractional Derivatives [24]) Let $f$ be a tempered distribution in $\mathcal{S}'(\mathbb{R})$ and $\chi_+^{-\alpha-1}$ be the causal fractional spline. The fractional derivative of $f$ of order $\alpha$ is defined as

$$D^\alpha f = \chi_+^{-\alpha - 1} * f,$$

where the convolution has to be taken in the sense of distributions.
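In the Fourier domain, the fractional derivative of Definition 2.1.2 acts as multiplication by $(i\omega)^\alpha$. A quick numerical sanity check (an illustrative Python sketch, not the thesis's implementation; it works on periodic signals, and zeroing the DC term is our own implementation convention) verifies that two half-derivatives compose to one ordinary derivative:

```python
import numpy as np

def frac_deriv(f, alpha, dt):
    # Fractional derivative via the Fourier multiplier (i * omega)^alpha, alpha > 0.
    w = 2.0 * np.pi * np.fft.fftfreq(len(f), d=dt)
    mult = (1j * w) ** alpha
    mult[0] = 0.0                                    # convention: drop the DC component
    return np.fft.ifft(np.fft.fft(f) * mult).real

N = 512
dt = 2.0 * np.pi / N
t = np.arange(N) * dt
f = np.sin(t)                                        # exactly periodic on this grid

g = frac_deriv(frac_deriv(f, 0.5, dt), 0.5, dt)      # D^{1/2} D^{1/2} f = D f = cos(t)
```

Because the multipliers compose exactly ($((i\omega)^{1/2})^2 = i\omega$), the result matches the ordinary derivative to machine precision.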
Under the condition that the orders $\alpha_i$ given in equation (2.12b) are all non-integer, and that the seismic wavelet $\psi$ is either even or odd, we can write:

$$\chi_+^{\alpha_i - 1}(z - z_i) * \psi(z) = C_i\, D^{-\alpha_i}\psi(z - z_i) \qquad (2.13)$$

and

$$\chi_-^{\alpha_i - 1}(z - z_i) * \psi(z) = \begin{cases} C_i\, D^{-\alpha_i}\psi(z_i - z) & \text{if } \psi \text{ is odd,} \\ -C_i\, D^{-\alpha_i}\psi(z_i - z) & \text{if } \psi \text{ is even,} \end{cases} \qquad (2.14)$$

which gives a seismic signal $s$ of the form

$$s(z) = \sum_{i \in \Lambda_c} C_i\, D^{-\alpha_i}\psi(z - z_i) \pm \sum_{i \in \Lambda_a} C_i\, D^{-\alpha_i}\psi(z_i - z), \qquad (2.15)$$

where the $C_i$ are constants proportional to the $\kappa_i$ of (2.3), with a proportionality factor given by a ratio of Gamma functions.

In Figure 2.2, we present an example of varying order transitions and their corresponding seismic signals. The seismic source function is a Ricker wavelet. Figure 2.2a shows the different transitions with fractional-orders ranging from 0 to 1, and Figure 2.2b shows the seismic signals obtained by convolving the reflectivity associated with these transitions with the Ricker wavelet. By interpolating the fractional-order of the transitions between 0 and 1, we obtain seismic signals going from the Ricker wavelet (associated with the zero-order transition) to the antiderivative of the Ricker wavelet (associated with the first-order transition).

Figure 2.2: Varying order transitions and their associated seismic signals obtained with a Ricker wavelet. (a) Transitions: $\alpha$ between 0 and 1. (b) Seismic signals with Ricker wavelet. The transitions are set to continuously interpolate between a zero-order discontinuity (step function) and a first-order discontinuity (ramp function). One can clearly see that different order transitions create seismic signals that differ not only in amplitude, but also in waveform.
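The convolution model $s = r * \psi$ of (2.12a) is straightforward to simulate. The following sketch is illustrative only (the Ricker wavelet is written out explicitly, and the peak frequency `f0` and the depth/time grids are arbitrary example choices, not values from the thesis): it convolves a spike reflectivity, i.e. a zero-order transition, with the wavelet, and the signal reproduces the wavelet at the transition depth.

```python
import numpy as np

def ricker(t, f0=1.0):
    # Ricker (Mexican hat) wavelet: second derivative of a Gaussian with opposite sign.
    a = (np.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

dz = 0.004
z = np.linspace(0, 4, 1001)              # depth axis
t = np.linspace(-2, 2, 1001)             # wavelet support, centered at t = 0

r = np.zeros_like(z)
r[len(z) // 2] = 1.0                     # spike reflectivity: one zero-order transition

s = np.convolve(r, ricker(t), mode="same")   # s = r * psi, as in eq. (2.12a)
```

Replacing the spike by the fractional reflectivities of (2.3) would produce the family of waveforms shown in Figure 2.2b.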
2.2 Previous work

2.2.1 Complex seismic trace analysis

Beginning with Harms and Tackenberg [13] in 1972, followed by Payton in 1977 [28], and later Taner et al. [32] in 1977 and 1979, many attempts have been made to characterize transitions using complex trace analysis. Complex seismic trace analysis calculates attributes from a seismic trace. Attributes are defined as specific information characterizing seismic traces that can be computed or extracted from them. For complex trace analysis, the attributes are the amplitude, phase, instantaneous frequency, weighted average frequency and apparent polarity [32]. This analysis considers a seismic trace $s(t)$ as the real part of a complex signal $S(t)$ [32], where $S$ is defined as:

$$S(t) = s(t) + i\, s^*(t), \quad \forall t > 0.$$

$s^*$ is called the quadrature component and is uniquely determined by the following two conditions:

1. $s^*$ is calculated from $s$ by a convolution;
2.
if $s(t) = A\cos(\omega t + \theta)$ for $A, \theta \in \mathbb{R}$ and $\omega > 0$, then $s^*(t) = A\sin(\omega t + \theta)$ ($s^*$ is the Hilbert transform of $s$).

$s$, considered as a complex signal, can be written as $S(t) = A(t)e^{i\theta(t)}$, where $s(t)$ is the real part of $S$, and $s^*(t)$ the imaginary part. From this, we refer to the amplitude $|A(t)|$ as the reflection strength, and to the phase $\theta(t)$ as the instantaneous phase [32]. The instantaneous frequency $\omega$ can be computed by calculating

$$\omega(t) = \frac{d\theta(t)}{dt}.$$

Using the reflection strength, the instantaneous phase and the instantaneous frequency, we deduce the weighted average frequency as [32]

$$\bar{\omega}(t) = \frac{\int A(t-\tau)\,\omega(t-\tau)\,L(\tau)\,d\tau}{\int A(t-\tau)\,L(\tau)\,d\tau},$$

and the apparent polarity. The apparent polarity is defined as the sign of $s(t)$ when $A(t)$ has a local maximum [32].

Taner et al. [32] interpreted these attributes as information on geological and lithological features. For instance, they interpreted high reflection strength as "major lithological changes between adjacent rock layers, such as unconformities and boundaries associated with sharp changes in sea level or depositional environments", and showed that "phase displays are effective in showing discontinuities ... and events with different dip attitudes which interfere with each other" [32]. This analysis is based on instantaneous phase behavior, where the instantaneous phase is seen as a seismic trace attribute giving information on lithological properties. The complex trace analysis approach is also noise sensitive, as it has the disadvantage of being based on the nonlocal Hilbert transform. The complex analysis transition model is also quite simplistic. In this paper, we propose a mathematical model for transitions that incorporates any instantaneous phase behavior of the form $(i\omega)^\alpha = e^{i\pi\alpha/2}|\omega|^\alpha$ through fractional-order transitions, as fractional differentiations/integrations yield phase rotations [7].

2.2.2 Multiscale analysis

Herrmann et al. [14, 16, 21] performed a multiscale analysis of seismic data using wavelets.
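For illustration, the complex-trace attributes above can be computed from a discrete analytic signal. The sketch below (an assumption-laden toy, not the cited authors' implementation, assuming a uniformly sampled trace) uses `scipy.signal.hilbert`, which returns $S(t) = s(t) + i s^*(t)$, and derives the reflection strength, instantaneous phase, and instantaneous frequency:

```python
import numpy as np
from scipy.signal import hilbert

def complex_trace_attributes(s, dt):
    """Reflection strength, instantaneous phase and frequency of a trace s."""
    S = hilbert(s)                      # analytic signal S(t) = s(t) + i s*(t)
    strength = np.abs(S)                # reflection strength |A(t)|
    phase = np.unwrap(np.angle(S))      # instantaneous phase theta(t)
    freq = np.gradient(phase, dt)       # instantaneous frequency d(theta)/dt [rad/s]
    return strength, phase, freq

# Sanity check on a pure cosine: both attributes should be constant.
dt = 1e-3
t = np.arange(0, 1, dt)
w0 = 2 * np.pi * 30.0                   # a 30 Hz monochromatic trace
strength, phase, freq = complex_trace_attributes(np.cos(w0 * t), dt)
```

For the monochromatic trace, condition 2 above gives $S(t) = e^{i\omega_0 t}$, so the strength is identically 1 and the instantaneous frequency is $\omega_0$.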
Theorems and Definitions

Given a mother wavelet $\psi$, where $\psi$ is a normalized function in $L^2(\mathbb{R})$ with zero average, we can create a family of wavelets through translations and dilations of the mother wavelet:

$$\forall t \in \mathbb{R},\ \forall (a,\sigma) \in \mathbb{R} \times \mathbb{R}^+,\quad \psi_{a,\sigma}(t) = \frac{1}{\sqrt{\sigma}}\,\psi\!\left(\frac{t-a}{\sigma}\right),$$

where $a$ is the position of the wavelets and $\sigma > 0$ their scale. The wavelet coefficients of a function $f$ are computed by taking the inner product of $f$ with the different wavelets $\psi_{a,\sigma}$:

$$W\{f,\psi\}(a,\sigma) = \langle f, \psi_{a,\sigma} \rangle. \qquad (2.16)$$

See Appendix A for more details on wavelets.

Theorem 2.2.1 (Jaffard [25]; relation between the Lipschitz exponent and the decay rate of the wavelet coefficients) If $f \in L^2(\mathbb{R})$ is Lipschitz $\alpha \le n$ at $v$, then there exists $A > 0$ such that

$$\forall (u,s) \in \mathbb{R} \times \mathbb{R}^+,\quad |Wf(u,s)| \le A\, s^{\alpha + 1/2}\left(1 + \left|\frac{u-v}{s}\right|^{\alpha}\right).$$

Conversely, if $\alpha < n$ is not an integer and there exist $A > 0$ and $\alpha' < \alpha$ such that

$$\forall (u,s) \in \mathbb{R} \times \mathbb{R}^+,\quad |Wf(u,s)| \le A\, s^{\alpha + 1/2}\left(1 + \left|\frac{u-v}{s}\right|^{\alpha'}\right),$$

then $f$ is Lipschitz $\alpha$ at $v$.

This theorem proves that the local Lipschitz regularity of $f$ at $v$ depends on the decay rate at fine scales of $|Wf(u,s)|$ in the neighborhood of $v$, and that the decay can be controlled from its local maximal values [25].

Definition 2.2.1 (Modulus Maximum and Maxima Lines) A modulus maximum is a point $(u_0, s_0)$ where $|Wf(u, s_0)|$ is maximum at $u = u_0$. This implies

$$\frac{\partial Wf(u_0, s_0)}{\partial u} = 0.$$

This local maximum should be a strict local maximum in either the right or the left neighborhood of $u_0$, to avoid having any local maxima where $|Wf(u, s_0)|$ is constant. We define a maxima line to be any connected curve $s(u)$ in the scale-space plane $(u,s)$ along which all points are modulus maxima. The Wavelet Transform Modulus Maxima Lines are referred to as WTMML. Singularities are detected by finding the abscissae where the wavelet modulus maxima converge at fine scales, i.e., as $s \to 0$.
Definition 2.2.2 (Self-similarity [25]) A set $S \subset \mathbb{R}^n$ is said to be self-similar if it is the union of disjoint subsets $S_1, S_2, \ldots, S_k$ that can be obtained from $S$ by a scaling, translation and rotation. This self-similarity often implies an infinite multiplication of details, which creates irregular structures.

Definition 2.2.3 (Capacity dimension [25]) The capacity dimension is a simplification of the Hausdorff dimension that is easier to compute numerically. Let $S$ be a bounded set in $\mathbb{R}^n$. We count the minimum number $N(s)$ of balls of radius $s$ required to cover $S$. If $S$ is a set of dimension $D$ with a finite length ($D=1$), surface ($D=2$) or volume ($D=3$), then

$$N(s) \sim s^{-D}, \quad \text{so} \quad \lim_{s\to 0} \frac{-\log N(s)}{\log s} = D.$$

The capacity dimension $D$ of $S$ generalizes this result and is defined by

$$D = \liminf_{s\to 0} \frac{-\log N(s)}{\log s}.$$

The Hausdorff dimension is a refined fractal measure that considers all covers of $S$ with balls of radius smaller than $s$. It is most often equal to the capacity dimension, but not always.

Definition 2.2.4 (Singularity Spectrum [25]) Let $S_\alpha$ be the set of all points $t \in \mathbb{R}$ where the pointwise Lipschitz regularity of $f$ is equal to $\alpha$. The spectrum of singularity $D(\alpha)$ of $f$ is the fractal dimension of $S_\alpha$. The support of $D(\alpha)$ is the set of $\alpha$ such that $S_\alpha$ is not empty.

The singularity spectrum gives the proportion of Lipschitz $\alpha$ singularities that appear at any scale $s$. A multifractal is said to be homogeneous if all singularities have the same Lipschitz exponent $\alpha_0$, which means that the support of $D(\alpha)$ is restricted to $\{\alpha_0\}$.

Definition 2.2.5 (Partition function [25]) Let $\{u_p(s)\}_{p\in\mathbb{Z}}$ be the positions of all local maxima of $|Wf(u,s)|$ at a fixed scale $s$. The partition function $Z$ measures the sum at a power $q$ of all these wavelet modulus maxima:

$$Z(q,s) = \sum_p |Wf(u_p, s)|^q.$$

For each $q \in \mathbb{R}$, the scaling exponent $\tau(q)$ measures the asymptotic decay of $Z(q,s)$ at fine scales $s$:

$$\tau(q) = \liminf_{s\to 0} \frac{\log Z(q,s)}{\log s}.$$

This typically means that $Z(q,s) \sim s^{\tau(q)}$.

Theorem 2.2.2 (Arneodo, Bacry, Jaffard, Muzy [25]) Let $\Lambda = [\alpha_{\min}, \alpha_{\max}]$ be the support of $D(\alpha)$. Let $\psi$ be a wavelet with $n > \alpha_{\max}$ vanishing moments. If $f$ is a self-similar signal, then

$$\tau(q) = \min_{\alpha \in \Lambda}\big(q(\alpha + 1/2) - D(\alpha)\big). \qquad \text{(Legendre transform)}$$

Proposition 2.2.1 [25] The scaling exponent $\tau(q)$ is a convex and increasing function of $q$. The Legendre transform is invertible if and only if $D(\alpha)$ is convex, in which case

$$D(\alpha) = \min_{q}\big(q(\alpha + 1/2) - \tau(q)\big).$$

The spectrum $D(\alpha)$ of self-similar signals is convex. We have the relation $\frac{d\tau}{dq} = \alpha(q) + 1/2$, where the $1/2$ is due to the $L^2$ normalization of the wavelets.

Numerical calculations [25]
1. Maxima: compute $Wf(u,s)$ and the modulus maxima at each scale $s$. Chain the wavelet maxima across scales.
2. Partition function: compute $Z(q,s) = \sum_p |Wf(u_p,s)|^q$.
3. Scaling: compute $\tau(q)$ with a linear regression of $\log_2 Z(q,s) \approx \tau(q)\log_2 s + C(q)$.
4. Spectrum: compute $D(\alpha) = \min_q\big(q(\alpha + 1/2) - \tau(q)\big)$.

Local analysis

Herrmann [14] applied the result of Theorem 2.2.1 to well data, and was able to find the local exponents $\alpha$ by estimating the slope of the wavelet coefficients in the log-log plane. In the log-log plane, the relationship between the wavelet coefficients and the local exponent $\alpha$ at position $a$ and scale $\sigma$ is

$$\log(|W\{a,\sigma\}|) \le \log(A) + \alpha\log(\sigma), \qquad (2.17)$$

where $A$ is a positive constant. From equation (2.17), the estimate for the local exponent $\alpha$ is computed via linear regression. In Figure 2.3 and Figure 2.4, we show the local multiscale analysis performed on a simple synthetic signal and on a sonic well-log measurement [16]. In Figure 2.3, the top plot shows the data with the different singularities, the middle plot shows the continuous wavelet transform with the WTMML superimposed, and the bottom plots show the local multiscale analysis associated to each singularity.
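The slope estimation behind (2.17) can be sketched numerically. The toy example below (an illustration under stated assumptions, not Herrmann's implementation) measures the decay of wavelet coefficients of a cusp $|t|^{\alpha}$ across dyadic scales, using a Ricker (Mexican-hat) wavelet with $L^2$ normalization, and regresses log-magnitude against log-scale; the recovered slope is $\alpha + 1/2$, so subtracting $1/2$ gives the exponent estimate:

```python
import numpy as np

def ricker(x):
    # Mexican-hat (Ricker) wavelet: unnormalized second derivative of a Gaussian.
    return (1.0 - x**2) * np.exp(-x**2 / 2.0)

def wavelet_coeff(f, t, u, s):
    # W(u, s) = < f, s^{-1/2} ricker((t - u)/s) >, discretized on the grid t.
    dt = t[1] - t[0]
    return dt * np.sum(f * ricker((t - u) / s) / np.sqrt(s))

# Cusp |t|^alpha with alpha = 0.5, sampled finely around the singularity at t = 0.
alpha = 0.5
t = np.linspace(-8.0, 8.0, 32001)
f = np.abs(t) ** alpha

# Decay across fine scales: log2|W(0, s)| ~ (alpha + 1/2) log2 s + const.
scales = 2.0 ** np.arange(-4, 1, 0.5)
logW = np.log2([abs(wavelet_coeff(f, t, 0.0, s)) for s in scales])
slope = np.polyfit(np.log2(scales), logW, 1)[0]
alpha_hat = slope - 0.5        # remove the 1/2 from the L^2 normalization
```

An even wavelet is deliberately chosen here: for an odd wavelet the leading moment against the even cusp vanishes and the decay estimate degenerates.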
In Figure 2.4, the top plot shows the sonic well-log, the middle plot shows the continuous wavelet transform with the WTMML superimposed, and the bottom plots present the local multiscale analysis for selected singularities. We can clearly see that the well-log contains singularities everywhere, as the WTMML converge everywhere. This property therefore shows that the Earth behaves like a multifractal. The method is appropriate for functions with isolated singularities because it is based on a local analysis. However, for functions with accumulated singularities (which is the case for seismic data), it is difficult to detect and characterize the singularities using a local analysis.

Global analysis

Global analysis yields an estimate for the singularity spectrum $f(\alpha)$ (defined in Definition 2.2.4), which gives the "rate of occurrence" [14] of a particular singularity $\alpha$ in the analyzed signal. Theorem 2.2.2 and Proposition 2.2.1 link the singularity spectrum $f(\alpha)$ to the scaling exponent $\tau(q)$ (defined in Definition 2.2.5) through the following relations:

$$\tau(q) = \min_{\alpha}\big(q(\alpha + 1/2) - f(\alpha)\big), \quad \text{and} \quad f(\alpha) = \min_{q}\big(q(\alpha + 1/2) - \tau(q)\big).$$

Figure 2.3: Regularity analysis through local analysis. The top plot shows the data with different singularities. The second plot shows the continuous wavelet transform with the location of the WTMML (Wavelet Transform Modulus Maxima Lines) superimposed. The bottom plots show the local multiscale analysis for each singularity. The slope of these different lines gives an estimate for the singularity order. Taken from [16].

Figure 2.4: Local analysis of a sonic well-log measurement. The top plot shows the sonic well-log (compressional wave speed). The middle plot shows the continuous wavelet transform with the WTMML (Wavelet Transform Modulus Maxima Lines) superimposed.
The bottom plots present the local multiscale analysis for selected singularities. We can clearly see that the well-logs are singular everywhere, and therefore present multifractal characteristics. Taken from [16].

"$f(\alpha)$ and $\tau(q)$ contain information on the global integrability and differentiability of the signal $s$" [14]. The minimum exponent $\alpha_{\min}$ refers to the strongest singularity, associated with the least regular regions, whereas $\alpha_{\max}$ refers to the weakest singularity, associated with the most regular regions. Figure 2.5 [16] shows the result of the global multiscale analysis on three different well-logs associated with the compressional wave speed, the shear wave speed and the density. The top three plots present the well-logs, the bottom left plot shows the function $\tau(q)$ for all three logs, and the bottom right plot shows the singularity spectra $f(\alpha)$ for the three logs as well. Note the similarity between the three logs.

Figure 2.5: Comparison between three different well-logs. The top plot presents the compressional wave speed well-log, the second plot the shear wave well-log, and the third plot the density well-log. The bottom left figure shows the $\tau(q)$ function for all three logs, and the bottom right plot shows the singularity spectra $f(\alpha)$. Taken from [16].

The results of Herrmann [14] after applying the global analysis on seismic data showed that the scale exponent $\alpha$ was different at every point, which is one of the characteristics of multifractals. One explanation for the multifractality of seismic data was given by Herrmann [14]: "The multifractality of the Earth's sedimentary deposits can be understood because the sedimentation process is closely linked to the hydrodynamics of the atmosphere.
Hydrodynamic turbulence is one of the classical examples of a process showing evidence of multifractal behavior across a large inertial scale range. Hence the variability in the sedimentary deposits can be considered as a frozen state of turbulence." The results of the global multiscale analysis on seismic data provide the geophysicist with information on its global scaling exponents, but not on their localization in the data. In full band data, it is possible to estimate the local exponent $\alpha$ from $f(\alpha)$. However, because seismic data is bandwidth limited, this is more difficult. Indeed, the multiscale method needs all the scales (or frequencies) in order to derive the asymptotic results, and these are not available in bandwidth limited data. Multiscale analysis is therefore not the best method to obtain information about the local scale exponent, although several attempts have been made [14, 35].

2.2.3 Monoscale analysis

Herrmann et al. [14, 16, 20] proposed a monoscale analysis method to analyze seismic data and unravel its singularities. In this method, they generalize the zero-order transitions to any fractional-order. The key to the method is to fractionally integrate/differentiate the seismic signal until the disappearance/appearance of a local maximum. The amount of integration/differentiation at the point of disappearance/appearance of the local maximum gives the estimate for the fractional-order of the transition. The monoscale analysis method is based on a generalization of the standard continuous wavelet transform.
The continuous wavelet transform of $f$, defined in equation (2.16), can be rewritten as a function of the scale $\sigma$ and the number of differentiations $M$ [16]:

$$W\{f, \psi^M\}(\sigma, a) = \sigma^M \frac{d^M}{da^M}(f * \phi_\sigma)(a) = (f * \psi^M_\sigma)(a), \qquad (2.18)$$

where $\phi_\sigma$ is an $M$-times differentiable, real and symmetric smoothing function with support proportional to $\sigma$, and $\psi^M_\sigma$ is the wavelet generated by dilations of $\psi^M$, defined as $\psi^M(x) = (-1)^M \frac{d^M}{dx^M}\phi(x)$ for all $x \in \mathbb{R}$ [16].

Herrmann et al. [14, 16, 20] generalized the wavelet transform to fractional-order derivatives $\beta$, with $\beta \in \mathbb{R}$:

$$W\{f, \psi^\beta\}(\sigma, a) = \sigma^\beta \partial^\beta (f * \phi_\sigma)(a). \qquad (2.19)$$

When $\beta$ is negative, $\partial^\beta$ is a fractional integration, and when $\beta$ is positive, $\partial^\beta$ is a fractional differentiation [16, 20, 21]. Instead of varying the scale $\sigma$ as in the multiscale method, $\beta$ is the varying parameter, changing the fractional number of differentiations or integrations [16]. Modulus maxima lines are defined in the same way as for the standard wavelet transform (Definition 2.2.1). For fractional differentiation ($\beta > 0$), the local estimate for the fractional exponent $\alpha$ is given by the minimum value of $\beta$ for which a local maximum appears along these $\beta$ maxima lines:

$$\alpha(\sigma, a) = \inf\{\beta : \partial_a |W\{f, \psi^\beta\}(\sigma, a)| = 0\}. \qquad (2.20)$$

For fractional integration ($\beta < 0$), the estimate for $\alpha$ is given by

$$\alpha(\sigma, a) = \sup\{\beta : \partial_a |W\{f, \psi^\beta\}(\sigma, a)| = 0\}. \qquad (2.21)$$

In Figure 2.6 we present the result of the monoscale analysis method on sedimentary records [17]. The top plot shows a well data set, and the second plot the singularity orders of the well data found by the monoscale analysis. The third plot shows an ice core measurement, and the bottom plot presents the singularity orders of the ice core measurement. The color defines the singularity order, and the size of the dots their magnitude. This method gives quite good results.
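The basic operation behind the monoscale idea, applying a fractional derivative or integral of order $\beta$ to a signal, can be sketched spectrally. The snippet below is a minimal illustration under the definition $\partial^\beta \leftrightarrow (i\omega)^\beta$ (not the authors' code), implemented with the FFT:

```python
import numpy as np

def fractional_derivative(f, dt, beta):
    """Spectral fractional derivative/integral: multiply the spectrum by (i*omega)^beta.

    beta > 0 differentiates, beta < 0 integrates. The zero-frequency component
    is left untouched to avoid a division by zero when beta < 0.
    """
    n = len(f)
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=dt)
    mult = np.ones(n, dtype=complex)
    nz = omega != 0
    mult[nz] = (1j * omega[nz]) ** beta
    return np.real(np.fft.ifft(np.fft.fft(f) * mult))

# Sanity check: for beta = 1 this reduces to an ordinary derivative.
dt = 1.0 / 512
t = np.arange(0, 1, dt)
g = np.sin(2 * np.pi * 3 * t)
dg = fractional_derivative(g, dt, 1.0)
```

Note that composing two half-derivatives ($\beta = 0.5$ twice) reproduces the ordinary derivative, which also illustrates the phase-rotation property mentioned in Section 2.2.1.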
However, it does not always maintain the lateral continuity of the exponents along the reflectors, lacks robustness, and is highly sensitive to noise and to any phase rotation of the waveforms in the signal.

Figure 2.6: Application of the monoscale method. The top plot shows the smoothed acoustic impedance, and the second plot displays the singularity orders of the impedance found by the monoscale analysis. The third plot shows an ice core measurement, and the bottom plot displays the singularity characterization of the ice core measurement. The color defines the singularity order, and the size of the dots their magnitude. Taken from [17].

2.2.4 Seismic Atomic Decomposition

Following the monoscale analysis, Herrmann proposed in 2005 [18] a parameterization of seismic signals using redundant dictionaries. In this framework, the signal is decomposed in a dictionary of waveforms parameterized by their fractional-order, and represented by its coefficients. Assuming sparse reflectivity, the idea is to find the sparsest set of coefficients that represents the signal in the chosen dictionary. The sparsest representation contains a few large coefficients pertaining to the different seismic events in the signal. This problem was tackled using Matching Pursuit [15, 18, 26] and Basis Pursuit [2, 18], which are two algorithms that approximate the sparsest representation. The results show the ill-posedness of the problem and the difficulties in finding the sparsest representation for complicated signals in large dictionaries. For simple configurations, both algorithms are able to find the sparsest representation, but they fail when the signal becomes too complex or the dictionary too large.
A dictionary is defined in Chapter 3, along with a brief explanation of the Matching Pursuit and Basis Pursuit methods.

Chapter 3

Problem Statement

Characterizing transitions in the Earth involves finding their locations, amplitudes, and fractional exponents. In particular, given the seismic signal

$$s(z) = \sum_{i \in \Lambda} c_i\, D^{-\alpha_i}\psi(z - z_i) \pm \sum_{i \in \Lambda^c} c_i\, D^{-\alpha_i}\psi(z_i - z)$$

as given in (2.15), the goal is to estimate the locations $z_i$, the amplitudes $c_i$ and the fractional exponents $\alpha_i$. One of the methods proposed to solve this problem is atomic decomposition [18], whose results constitute the starting point of our detection-estimation method.

3.1 Dictionaries and Decomposition

In the atomic decomposition method, the aim is to uniquely decompose the whole seismic signal with respect to a dictionary and to represent it by its coefficients. We use terminology introduced by Mallat and Zhang [26]. In $\mathbb{R}^n$, a dictionary $\mathcal{D}$ is a collection of parameterized waveforms $(\varphi_\gamma)_{\gamma \in \Gamma}$, where the waveforms $\varphi_\gamma$ are discrete-time signals of finite length $n$, called atoms, and $\Gamma$ is the set of the indexing parameter $\gamma$. In the case of a frequency or Fourier dictionary, $\gamma$ is the indexing frequency, whereas for a time/scale dictionary, $\gamma$ indexes time and scale jointly. A dictionary is called complete if it contains exactly $n$ linearly independent atoms, or overcomplete if it is full rank and contains more than $n$ atoms. Given an overcomplete dictionary $\mathcal{D}$ of normalized atoms, we consider the decomposition $c = (c_\gamma)_{\gamma \in \Gamma}$ of a signal $s$ as

$$s = \sum_{\gamma \in \Gamma} c_\gamma \varphi_\gamma, \qquad (3.1)$$

where the representation of the signal $s$ is given by the vector $c = (c_\gamma)_{\gamma \in \Gamma}$. For discrete signals, we can consider the atoms of $\mathcal{D}$ as columns of a matrix $\Phi$ and write equation (3.1) in the matrix form

$$s = \Phi c, \qquad (3.2)$$

where $\Phi$ is an $n \times m$ ($m > n$) matrix representing the dictionary $\mathcal{D}$, $s \in \mathbb{R}^n$ and $c \in \mathbb{R}^m$. We also use the adjoint of $\Phi$, denoted by $\Phi^*$. $\Phi$ and $\Phi^*$ satisfy the relation

$$\forall x \in \mathbb{R}^m,\ \forall y \in \mathbb{R}^n,\quad \langle \Phi x, y \rangle = \langle x, \Phi^* y \rangle.$$

We adopt terminology used by Chen et al. [2]. The matrix $\Phi$ represents the synthesis operator, which consists of building up a signal $s$ by superposing atoms of $\mathcal{D}$. The matrix $\Phi^*$ represents the analysis operator, whose action is to find a vector of coefficients $c$ that correspond to the atoms in $s$. The normalization of the dictionary, leading to $\|\Phi\| = 1$, is crucial for the convergence of any minimization involving dictionary decompositions. In the rest of the paper, we will use normalized dictionaries and will refer to $\Phi$ as the matrix of $\mathcal{D}$. Finding the solution $c$ of (3.2) in an overcomplete dictionary is an underdetermined problem, and thus has an infinite number of solutions [2, 26].

3.1.1 Basis Pursuit

The nonuniqueness of $c$ enables us to choose the representation $c$ that is most suited to our problem. Even though the seismic reflectivity yielded by sedimentary records is a non-sparse signal, as it reflects everywhere, we can still treat it as a sparse signal by only considering the few large outlying reflectors. Seismic reflectivity is therefore assumed to be a sparse signal. Given sparse reflectivity, and assuming the dictionary is chosen to give a sparse representation of seismic signals, it is natural to impose sparsity on the representation $c$ and choose the sparsest $c$ that represents $s$. We define an event to be any localized waveform in $\mathbb{R}^n$. The sparsest representation is the representation with minimum $\ell^0$ norm (i.e., the fewest number of non-zero entries), which is referred to as the solution of the $(P_0)$ problem [8]:

$$(P_0)\quad \min_c \|c\|_0 \quad \text{subject to} \quad s = \Phi c. \qquad (3.3)$$

For the remainder of this paper, we will describe the sparsest solution as the solution of the $(P_0)$ problem. Solving the $(P_0)$ problem is NP-hard, because its computational complexity increases exponentially with the number of atoms in $\mathcal{D}$ [2].
Basis Pursuit (BP) [2, 8] approximates the sparsest representation by convexifying the $(P_0)$ problem into the following problem:

$$(P_1)\quad \min_c \|c\|_1 \quad \text{subject to} \quad s = \Phi c. \qquad (3.4)$$

For seismic signals, the decomposition involves a dictionary of waveforms parameterized by their fractional-order $\alpha$, as opposed to a single wavelet in the spiky deconvolution problem. Examples of atoms chosen for seismic applications are fractional B-splines and fractional spline wavelets [18]. The idea is to find the few atoms in $\mathcal{D}$ that best match the different events in $s$, and to deduce the fractional-order of the events from these atoms. Herrmann [18] proposed an atomic decomposition of seismic signals with BP, and showed that BP manages to find the sparsest representation when the dictionary is small or the signal simple. However, as shown by Herrmann [18], the larger the dictionary, the more similar the atoms in $\mathcal{D}$; if the dictionary gets too large, BP will have difficulty distinguishing between similar waveforms and hence will not find the sparsest solution.

3.2 Uniqueness condition and equivalence between $(P_0)$ and $(P_1)$

Herrmann [18] showed empirically that for certain configurations of signals and dictionaries, the solution to the $(P_1)$ problem was the solution to the $(P_0)$ problem, which, in general, is not the case. There exist conditions [8] under which a representation $c$ is the unique sparsest representation, and stronger conditions under which the $(P_0)$ and $(P_1)$ problems are equivalent. To discuss these conditions, we need to introduce the following definitions.

Definition 3.2.1 (Spark [8]) Let $\Phi$ be a matrix of size $n$ by $m$, with $m > n$. The Spark of $\Phi$ is defined as the smallest possible number $\sigma$ such that there exists a subgroup of $\sigma$ columns from $\Phi$ that are linearly dependent.
Definition 3.2.2 ($\mu_{1/2}(G)$ [8]) For $G$ a symmetric matrix, $\mu_{1/2}(G)$ denotes the smallest integer $k$ such that some collection of $k$ off-diagonal magnitudes arising in a single row or column of $G$ sums at least to $\frac{1}{2}$.

Although the Spark and the rank of a matrix are in some ways similar [8], they remain two separate concepts:

Example 3.2.1 We show a full rank matrix $B$ with a Spark of 2:

$$B = \begin{pmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix}.$$

Clearly $B$ is full rank, as it has 4 independent columns ($\mathrm{rank}(B) = 4$), but $B$ also has two identical columns, which reduces its Spark to 2.

The Spark is an important concept that measures the similarity between any two columns of $\Phi$, which, for dictionaries, translates to the similarity between any two atoms in $\mathcal{D}$. The uniqueness condition involves the Spark of $\mathcal{D}$, and is given in the following theorem:

Theorem 3.2.1 (Uniqueness [8]) Consider a dictionary $\mathcal{D}$ with matrix representation $\Phi$. If there exists a representation $c_0$ of $s$ such that $s = \Phi c_0$ and $\|c_0\|_0 < \sigma/2$, then $c_0$ is the sparsest solution possible, i.e., $c_0$ solves the $(P_0)$ problem.

Given a signal $s$ and dictionary $\mathcal{D}$, Theorem 3.2.1 states the sparsity condition under which a representation $c$ is the unique sparsest representation of $s$ in $\mathcal{D}$. If a representation $c$ has strictly fewer than $\sigma/2$ non-zero entries, it is the sparsest representation. However, this theorem does not show whether this sparsest representation can be the solution to an $\ell^1$ minimization, which is stated in Theorem 3.2.2.

The equivalence between the $(P_0)$ and $(P_1)$ problems involves the quantity $\mu_{1/2}(G)$ ($\mu_{1/2}(G) \le \sigma/2$), where $G$ is the Gram matrix of $\Phi$ defined by $G = \Phi^*\Phi$. ($\Phi^*$ denotes the adjoint of $\Phi$.) The following theorem gives the equivalence condition [8]:

Theorem 3.2.2 [8] Consider a dictionary $\mathcal{D}$ with matrix $\Phi$ and Gram matrix $G$. If there exists a representation $c_0$ such that $s = \Phi c_0$ and $\|c_0\|_0 < \mu_{1/2}(G)$, then $c_0$ is the unique solution of $(P_1)$ and is the sparsest solution.
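The rank/Spark distinction of Example 3.2.1 can be checked numerically. The brute-force spark computation below is illustrative only (it enumerates column subsets, so it is feasible only for tiny matrices, exactly the intractability noted in Section 3.2.1), and the matrix is one illustrative choice of a full-rank matrix with a repeated column:

```python
from itertools import combinations
import numpy as np

def spark(Phi):
    """Smallest number of linearly dependent columns (brute force; tiny matrices only)."""
    m = Phi.shape[1]
    for k in range(1, m + 1):
        for cols in combinations(range(m), k):
            if np.linalg.matrix_rank(Phi[:, cols]) < k:
                return k
    return np.inf   # columns in general position

# A full-rank 4 x 5 matrix whose first and last columns coincide.
B = np.array([[1, 0, 0, 0, 1],
              [0, 1, 0, 0, 0],
              [0, 0, 1, 0, 0],
              [0, 0, 0, 1, 0]], dtype=float)
```

Here `rank(B)` is 4 while `spark(B)` is 2, since the duplicated column already forms a dependent pair.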
For a given signal $s$ and dictionary $\mathcal{D}$, solving the $(P_1)$ problem gives the unique sparsest solution only if its solution is sparse enough in $\mathcal{D}$ (i.e., its $\ell^0$ norm is strictly less than the quantity $\mu_{1/2}(G)$). The choice of $\mathcal{D}$ is hence very important, as it determines the sparsity condition under which the solution of $(P_1)$ is also a solution of $(P_0)$. However, the condition stated in Theorem 3.2.2 is sufficient, but not necessary, as solutions to $(P_1)$ with $\ell^0$ norm larger than $\mu_{1/2}(G)$ may still be solutions to $(P_0)$. When solutions to $(P_1)$ do not satisfy the condition of Theorem 3.2.2, the situation is less clear, as such solutions may or may not be solutions to $(P_0)$. In our case, and for the type of dictionaries we use, solutions to $(P_1)$ often do not satisfy the condition of Theorem 3.2.2. However, Starck et al. [30] empirically demonstrated the gap between theoretical results (based on worst-case analysis) and the empirical evidence of BP's performance. The goal of our new detection-estimation method is to improve on the empirical bounds shown for Basis Pursuit.

3.2.1 Empirical studies for the performance of BP

In many theoretical and practical applications, it is necessary to know the value of the Spark and of $\mu_{1/2}(G)$ for a given dictionary $\mathcal{D}$. However, computing these quantities involves several $\ell^0$ minimizations and requires enumerating subsets of columns of $\Phi$, which is computationally intractable as the algorithm complexity grows exponentially with the size of $\mathcal{D}$ [8]. Donoho et al. [8] proposed a numerical scheme for computing an upper bound on the Spark by replacing the $\ell^0$ minimizations by $\ell^1$ minimizations. However, although the scheme is now computationally tractable, its numerical results are very difficult to interpret [8]. Due to this computational barrier, the performance of BP must be evaluated through numerical experiments.
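A numerical experiment of this kind can be sketched as follows. The toy study below (an illustration, not the thesis code) solves $(P_1)$ as a linear program via the standard split $c = c^+ - c^-$ with $c^+, c^- \ge 0$, using `scipy.optimize.linprog`, and checks whether a randomly drawn sparse representation is recovered in a random overcomplete dictionary; the dictionary size and sparsity level here are arbitrary choices:

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Phi, s):
    """Solve (P1): min ||c||_1 subject to Phi c = s, via c = c+ - c-."""
    n, m = Phi.shape
    res = linprog(c=np.ones(2 * m),
                  A_eq=np.hstack([Phi, -Phi]), b_eq=s,
                  bounds=[(0, None)] * (2 * m), method="highs")
    return res.x[:m] - res.x[m:]

rng = np.random.default_rng(0)
n, m, k = 64, 128, 3                      # 3-sparse signal, 2x overcomplete dictionary
Phi = rng.standard_normal((n, m))
Phi /= np.linalg.norm(Phi, axis=0)        # normalized atoms
c0 = np.zeros(m)
c0[rng.choice(m, size=k, replace=False)] = rng.standard_normal(k)
c_hat = basis_pursuit(Phi, Phi @ c0)
```

Repeating this over many random draws while increasing $k$ traces out a success-probability curve of the kind shown in Figure 3.1; with highly correlated atoms (as in fractional spline dictionaries) the curve drops much earlier than for this incoherent random dictionary.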
Figure 3.1: Probability of success of BP for different numbers of non-zero entries in the representation $c$. We used a fractional spline wavelet dictionary with ten different fractional-orders $\alpha$ ranging from 0.5 to 2. One can clearly see that the probability of success of BP decreases very fast with the number of non-zero entries in the coefficient vector.

We performed simulations with fractional spline wavelet dictionaries used to decompose seismic signals [18], and showed that, as soon as the representation $c$ had more than two non-zero entries, the probability of success of BP was very low. This is illustrated in Figure 3.1. These results explain the failure of Herrmann's atomic decomposition [18] for large dictionaries and complicated signals, and show that for large seismic dictionaries, the representation of the signal must be highly sparse (its representation $c$ must satisfy $\|c\|_0 \le 2$) in order to be the unique solution of an $\ell^1$ minimization.

3.3 Detection-Estimation approach

We address the above issues through a two-step approach: we first partition the seismic signal $s$ into its major features, and then analyze each of them separately by performing their $\ell^1$ decomposition in a large dictionary of waveforms parameterized by their fractional-order. Each major feature, containing only a few events, is now a simple and localized signal that can be successfully decomposed with respect to large dictionaries. We consider dictionaries made of all translations of parameterized waveforms. For example, a dictionary containing waveforms parameterized by two different fractional-orders $\alpha_1$ and $\alpha_2$ will be written as

$$\mathcal{D} = [\mathcal{D}_{\alpha_1}, \mathcal{D}_{\alpha_2}], \qquad (3.5)$$

where $\mathcal{D}_{\alpha_1}$ contains all the translations of a waveform parameterized by $\alpha_1$, and $\mathcal{D}_{\alpha_2}$ contains all the translations of a waveform parameterized by $\alpha_2$.
In matrix form, equation (3.5) can be written as

$$\Phi = [\Phi_{\alpha_1}, \Phi_{\alpha_2}]. \qquad (3.6)$$

Due to this construction, the synthesis operator associated to $\Phi$ acts as a convolution, and the analysis operator associated to $\Phi^*$ as a correlation.

Chapter 4

Detection-Estimation Method

We now discuss our approach to the problem of delineating and characterizing transitions from seismic data. Studies on sedimentary records have shown that seismic reflectivity is made of an accumulation of varying order singularities [14, 16, 27], leading to a non-sparse signal. However, unlike white noise, sedimentary records clearly contain distinct large outlying reflectors with strong singularities evidenced by small fractional-orders, giving rise to a few large "spiky" events in the reflectivity. Currently, we are only interested in finding these outlying reflectors, and consider the reflectivity as a function of these large outliers. We therefore assume that the reflectivity contains a few events, and denote the seismic signal $s$. The principle of our method is to divide the original problem into two easier subproblems. We start by partitioning the signal $s$ into its major features (pertaining to the predominant reflectors in the Earth), and then proceed by analyzing and characterizing each feature separately. We name these two steps detection and estimation, respectively.

4.1 Detection

The purpose of the detection is to find the locations and characteristic scales of the major features in the signal $s$. In particular, the detection step answers two specific questions:

1. What is the partition of the signal $s$ into its prevailing components?
2. What is the characteristic scale of each detected component?

This information is then used to build windowing functions to extract the major components from the signal $s$.
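The convolution/correlation structure of a translation-invariant dictionary noted after (3.6) can be sketched as follows. In this illustrative toy (which ignores atom normalization; the two waveforms stand in for two fractional-orders $\alpha_1, \alpha_2$), stacking all circular shifts of each waveform makes $\Phi c$ a sum of convolutions of per-order coefficient tracks with the waveforms, and $\Phi^* s$ a stack of correlations:

```python
import numpy as np

def synthesis(coeffs, waveforms, n):
    """Phi c: superpose circular convolutions of coefficient tracks with waveforms."""
    s = np.zeros(n)
    for c, w in zip(coeffs, waveforms):
        s += np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(w, n)))
    return s

def analysis(s, waveforms):
    """Phi* s: circular correlation of the signal with each waveform."""
    n = len(s)
    return [np.real(np.fft.ifft(np.fft.fft(s) * np.conj(np.fft.fft(w, n))))
            for w in waveforms]

n = 128
x = np.linspace(-3, 3, 31)
waveforms = [(1 - x**2) * np.exp(-x**2 / 2),   # even, Ricker-like atom
             -x * np.exp(-x**2 / 2)]           # odd (phase-rotated) atom
coeffs = [np.zeros(n), np.zeros(n)]
coeffs[0][40] = 1.0                            # one event per fractional-order
coeffs[1][90] = -0.5
s = synthesis(coeffs, waveforms, n)
```

The adjoint relation $\langle \Phi x, y\rangle = \langle x, \Phi^* y\rangle$ from Section 3.1 holds exactly for this pair of operators, which is what makes the correlation the analysis operator.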
The principle of the detection is to find an approximate sparse representation of the seismic signal $s$ in an appropriate dictionary, and to select the largest sets of coefficients, which correspond to the predominant features in $s$.

4.1.1 Detection dictionary

As in any matched-filter approach, it is important to choose atoms that correlate well with the events in $s$ in order to guarantee the sparsity of its representation. To satisfy this condition, we choose to estimate the average waveform of $s$ by finding the waveform $\varphi_{\bar{\alpha}}$ whose Fourier spectrum best fits the Fourier spectrum of the signal $s$. Because we work with waveforms parameterized by their fractional-order (and implicitly or explicitly parameterized by their frequency), we estimate the average waveform in terms of its fractional-order $\bar{\alpha}$ and central frequency $\bar{f}$. We construct a larger detection dictionary $\mathcal{D}_d$ by selecting the estimated average waveform $\varphi_{\bar{\alpha}}$ and some neighboring ones. By neighboring waveforms, we refer to waveforms whose parameter $\alpha$ and frequency $f$ are close to the parameter $\bar{\alpha}$ and frequency $\bar{f}$ of the average waveform. We choose $\alpha$ and $f$ such that

$$|\alpha - \bar{\alpha}| < \eta \quad \text{and} \quad \epsilon_1 \le \frac{f}{\bar{f}} \le \epsilon_2.$$

From experiments we choose $\eta = 1$, $\epsilon_1 = \frac{1}{2}$ and $\epsilon_2 = 2$. The choices we made for $\eta$, $\epsilon_1$ and $\epsilon_2$ are not fixed, and can be modified, as long as the neighboring waveforms obtained from these changes still correlate well with the seismic signal. Adding more fractional-orders and more frequency content in the detection dictionary accounts for the non-stationary properties of the seismic signal, and ensures the sparsity of its representation with respect to $\mathcal{D}_d$ (multiple spiky deconvolutions). The detection dictionary $\mathcal{D}_d$ is therefore made of the selected waveforms (the average waveform and its neighbors) and all their translations, and can be seen as the concatenation of several dictionaries.

For clarification, consider a detection dictionary $\mathcal{D}_d$ made of $p$ selected waveforms and their translations.
D_d can be written as

    D_d = [D_{d_1}, ..., D_{d_p}],    (4.1)

where D_{d_i} is the detection dictionary made of the i-th waveform and all its translations. For discrete signals, D_d can be represented by its matrix Φ_d, which can also be written as a concatenation of p matrices

    Φ_d = [Φ_{d_1}, ..., Φ_{d_p}],    (4.2)

where Φ_{d_i} is the matrix of D_{d_i}.

4.1.2 Decomposition in Detection

We envision an approximate sparse decomposition

    s = Σ_{i=1}^M c_{α_i} φ_{α_i} + r(M),    (4.3)

where r(M) is the residual and c = (c_{α_i})_{i=1}^M the representation of s with respect to the detection dictionary. This approximation selects the M largest coefficients c_{α_i}, which correspond to the M main features of the signal s, and approximately represent s in a sparse solution c. Because the aim of the detection step is to extract only the major features of the signal s, it is sufficient to approximately reconstruct the signal s and consider the above approximate decomposition. Equation (4.3) can be written in matrix form as

    s = Φ_d c + r,    (4.4)

where Φ_d is the matrix of D_d of size n × m (m > n), s and r ∈ R^n, and c ∈ R^m. c refers to the solution of

    min_c ½ ‖s − Φ_d c‖₂² + λ_d ‖c‖₁,    (4.5)

where λ_d > 0 is a trade-off parameter between the ℓ¹ norm of c and the data misfit, Φ_d = [Φ_{d_1}, ..., Φ_{d_p}] and c = [c₁ᵀ, ..., c_pᵀ]ᵀ. The size of the residual, and hence the number of selected atoms M in the representation c, is controlled by λ_d. The residual goes to zero as λ_d → 0, leading to an exact reconstruction. On the other hand, if λ_d → ∞, the residual gets large, with lim_{λ_d → ∞} r = s and lim_{λ_d → ∞} c = 0. For noisy data, (4.5) becomes a denoising problem, where the value of λ_d has to be set according to the noise level. Chen et al. [2] proposed to set

    λ_d = σ_n √(2 log p),    (4.6)

where σ_n is the noise level, and p the cardinality of the dictionary. Chen et al. [2] refer to (4.5) as Basis Pursuit De-Noising (BPDN).
Similarly to BP, BPDN can be rewritten as a perturbed Linear Programming problem, and can be solved using the Simplex algorithm or an Interior-Point scheme [2].

4.1.3 Algorithm

Because of limitations associated with the Basis Pursuit and Matching Pursuit methods, which we will discuss later, we propose to solve (4.5) using an iterative thresholding algorithm [4] on each coefficient c_i. The resulting algorithm amounts to a Block Coordinate Relaxation Method [29, 30] performed on c, with a Landweber iteration with thresholding applied at each iteration step on each c_i [4]. At each iteration k we minimize the functional

    ½ ‖s − Φ_d c‖₂² + λ_k ‖c‖₁,    (4.7)

with respect to the coefficient c, where λ_k > 0 is a decreasing parameter. We start with a large λ_k and then decrease it to some predetermined minimum value λ_min. The choice of λ_min depends on how small we want the data misfit to be, and is specified at the beginning of the algorithm. A small λ_min will give a small data misfit and an exact reconstruction, whereas a large λ_min will contribute to a large data misfit and an approximate reconstruction. For normalized signals, and to a good approximation, we choose λ_min = 10⁻⁴.

At each iteration k, we solve for the vector c^k = [c₁^k, ..., c_p^k], by separately solving for its different components c_i^k, assuming the other components c_j^k, j ≠ i, are fixed. The representation c_i^k is updated according to the rule [4, 30]

    c_i^{k+1} = S_{λ_k}(c_i^k + Φ_{d_i}*(R_i^k − Φ_{d_i} c_i^k)),    (4.8)

where Φ_{d_i}* denotes the adjoint operator of Φ_{d_i} and R_i^k = s − Σ_{j≠i} Φ_{d_j} c_j^k. S_{λ_k} is the componentwise soft thresholding operator by λ_k, defined by S_{λ_k}(x) = sign(x)(|x| − λ_k)₊ for x ∈ R, where y₊ = max(0, y) for y ∈ R.
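For a single (concatenated) dictionary matrix, the thresholded Landweber iteration can be sketched as follows. This is a minimal sketch, assuming Φ is scaled so that ‖Φ‖₂ ≤ 1 (otherwise a step size must be introduced); the function names and default parameters are illustrative, not the thesis implementation.

```python
import numpy as np

def soft(x, lam):
    # componentwise soft thresholding: S_lam(x) = sign(x) * (|x| - lam)_+
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def iterative_thresholding(Phi, s, lam0=1.0, lam_min=1e-4, speed=0.1, n_iter=500):
    """Thresholded Landweber iteration for
        min_c 0.5*||s - Phi c||_2^2 + lam*||c||_1,
    with the threshold decreased exponentially, lam_k = (1 + lam0)*exp(-k*speed),
    and floored at lam_min. Assumes ||Phi||_2 <= 1 (rescale beforehand if not)."""
    c = np.zeros(Phi.shape[1])
    lam_bar = 1.0 + lam0
    for k in range(n_iter):
        lam_k = max(lam_bar * np.exp(-k * speed), lam_min)
        # Landweber step on the full residual, then soft thresholding
        c = soft(c + Phi.T @ (s - Phi @ c), lam_k)
    return c
```

The block-coordinate version of the thesis cycles over the sub-vectors c_i against the partial residuals R_i^k instead of updating all of c at once; the update formula is the same.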
The choice of the iterative thresholding algorithm to solve (4.5) is motivated by the algorithm's flexibility and the control it offers over the decrease of the parameter λ_k and the choice of λ_min. The way the parameter λ_k decreases determines the accuracy of the algorithm: the slower the decrease, the more accurate the solution c. In our case, we are only interested in the major features in s, which correspond to the larger coefficients in c. For that reason, we choose an exponential decrease of the parameter λ_k, which makes the algorithm slowly capture the large coefficients at the beginning of the iterations, and then move on faster over the small coefficients. The expression for the exponential decrease of λ_k is given by

    λ_k = λ e^{−ks},    (4.9)

where λ = 1 + λ₀ and s is a predetermined speed parameter.

The representation c, obtained by iteratively minimizing (4.5), is numerically not sparse, as none of its entries is strictly zero. However, most of them are within a tolerance of 10⁻⁶ ≪ λ_min, which is negligible.

We now present our approach to detect the major events in s. According to the notation introduced in equation (4.2), the representation c obtained by minimizing (4.5) is a vector of length m (m = pn) that can be written as

    c = [c₁, ..., c_p],    c_i ∈ R^n  ∀i.

In order to incorporate the contribution of each dictionary D_i in the detection process, we consider the different vectors c_i as columns of a matrix C ∈ R^{n×p}. Each row k in C, corresponding to the location k in s, contains the coefficients of s with respect to the different dictionaries D_i at the location k. By taking the entry of largest magnitude in each row, we construct another vector c^T ∈ R^n. c^T describes the best contribution of the dictionaries D_i at the different locations in s, and the largest entries in c^T correspond to the major features in s.
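The construction of C and of the vector c^T can be sketched in a few lines (a sketch; the function name is illustrative):

```python
import numpy as np

def best_contributions(c, p, n):
    """Fold the length m = p*n representation c = [c_1, ..., c_p] into an
    n x p matrix C (one column per sub-dictionary) and keep, per location k,
    the entry of largest magnitude: c_T[k] = C[k, argmax_j |C[k, j]|]."""
    C = c.reshape(p, n).T                  # column i holds c_i
    idx = np.argmax(np.abs(C), axis=1)     # best sub-dictionary per location
    return C[np.arange(n), idx]
```

Note that the signed coefficient is kept, not its magnitude, so later thresholding can still use the sign.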
We now select the largest entries of c^T, by only considering those larger than a threshold μ. The choice of μ depends on the type of signals involved, and is very important, especially when dealing with noisy data. μ is chosen to be a percentage of the ℓ² norm of c^T, written as

    μ = ρ ‖c^T‖₂,

where ρ is the percentage. From experimental results, we choose to set ρ = 0.1, but this value can be modified. For noisy data, for example, ρ should be set at a large enough value in order to be above the noise level. Note that for noisy data, λ_min can also be set to a larger value depending on the noise level, as stated in (4.6). We call c_t^T the thresholded vector c^T with threshold level μ. For clarity purposes, we omit the index T and denote the thresholded vector c_t.

4.1.4 Windowing

The next step in the detection is the analysis of the thresholded vector c_t, and the selection of the major features in s. The representation c_t contains localized sets of nonzero entries corresponding to the main components of s, and can be written as

    c_t = Σ_{i=1}^M c_{t_i},

where M is the number of sets in c_t, c_{t_i} is a vector of length n containing the i-th set of largest nonzero entries in c_t, and ∀i ≠ j, supp(c_{t_i}) ∩ supp(c_{t_j}) = ∅. An example of such a partition is given below:

    c_t = (0, 1, 2, 0, 0, −1, −2, 0),    (4.11a)
    c_{t_1} = (0, 1, 2, 0, 0, 0, 0, 0),    (4.11b)
    c_{t_2} = (0, 0, 0, 0, 0, −1, −2, 0).    (4.11c)

We now define m_i as the largest entry in the modulus of c_{t_i},

    m_i = max_l |c_{t_i}(l)|,    (4.12)

and denote by φ_{α_i} the atom in D that corresponds to the entry m_i. Each vector c_{t_i} describes a particular feature in s which does not overlap with the other ones, as stated by the condition ∀i ≠ j, supp(c_{t_i}) ∩ supp(c_{t_j}) = ∅. The location l_i and frequency f_i of each atom φ_{α_i} determine the location and characteristic scale of the i-th main feature in s. In order to extract these detected features from the signal s, we adopt the methodology of multiplying the signal by localized windows centered at the locations l_i.
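Thresholding by μ and splitting the survivors into disjoint-support sets, as in the example (4.11), can be sketched as follows. This is a sketch; treating maximal runs of contiguous nonzeros as the "localized sets" is an assumption on our part, and the function name is illustrative.

```python
import numpy as np

def threshold_and_partition(cT, rho=0.1):
    """Zero out entries with |cT[k]| <= mu, where mu = rho*||cT||_2, then
    split the surviving entries into maximal contiguous runs of nonzeros
    (the disjoint-support sets c_t_i of the partition)."""
    mu = rho * np.linalg.norm(cT)
    ct = np.where(np.abs(cT) > mu, cT, 0.0)

    sets, current = [], None
    for k, v in enumerate(ct):
        if v != 0.0:
            if current is None:           # open a new set
                current = np.zeros_like(ct)
            current[k] = v
        elif current is not None:         # a run just ended
            sets.append(current)
            current = None
    if current is not None:
        sets.append(current)
    return ct, sets
```

By construction the returned sets sum to the thresholded vector and have pairwise disjoint supports.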
To facilitate the discussion, we introduce the following notation. Let W_i be the window associated to the i-th selected atom φ_{α_i}, centered at l_i. Each window is chosen to be a discrete function given by

    W_i(k) =
        0,  if 0 ≤ k ≤ l_i − (β + δ)σ_i,
        exp(−(k − (l_i − βσ_i))² / (2σ_{g_i}²)),  if −(β + δ)σ_i < k − l_i < −βσ_i,
        1,  if |k − l_i| ≤ βσ_i,    (4.13)
        exp(−(k − (l_i + βσ_i))² / (2σ_{g_i}²)),  if βσ_i < k − l_i < (β + δ)σ_i,
        0,  if l_i + (β + δ)σ_i ≤ k ≤ n,

where σ_i is the standard deviation of φ_{α_i}, and σ_{g_i} is the standard deviation of the Gaussian function used for the decay of the window at the tips.

Figure 4.1: The i-th window and its associated parameters β and δ. The function plotted is multiplied with the signal to extract the i-th detected signal component. The parameters β and δ are as given in equation (4.13).

The parameters β and δ specify the width of the window. We provide an illustration of the window in Figure 4.1. We choose default values β = 2, δ = 2, leading to a window width of 8σ_i, which gives a good approximation of the width of φ_{α_i}. The window is equal to 1 over the width 2βσ_i centered at l_i, and ends at both tips with a Gaussian decay of width δσ_i. Both β and δ depend on the data and should be modified accordingly. Our current choices for these parameters come from numerical experiments, and seem well suited to our type of data.

By multiplying the signal s by each window W_i, we partition s into its prevailing components s_i, where s_i ∈ R^n. The multiplication between the signal s and the window W_i is done componentwise, such that

    ∀ 1 ≤ k ≤ n,    s_i(k) = s(k) W_i(k).    (4.14)

Figure 4.2: The top plot shows the signal and the window applied to one detected event. The bottom plot shows the windowed event.
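The window of equation (4.13) can be sketched directly (a sketch, assuming integer sample locations; parameter names are illustrative):

```python
import numpy as np

def detection_window(n, li, sigma_i, sigma_g, beta=2.0, delta=2.0):
    """Window W_i of equation (4.13): equal to 1 on |k - li| <= beta*sigma_i,
    Gaussian decay of width delta*sigma_i on each side, 0 elsewhere."""
    k = np.arange(n, dtype=float)
    W = np.zeros(n)

    # flat top of width 2*beta*sigma_i centered at li
    W[np.abs(k - li) <= beta * sigma_i] = 1.0

    # Gaussian decays over the tapers of width delta*sigma_i
    left = (k > li - (beta + delta) * sigma_i) & (k < li - beta * sigma_i)
    W[left] = np.exp(-(k[left] - (li - beta * sigma_i))**2 / (2 * sigma_g**2))
    right = (k > li + beta * sigma_i) & (k < li + (beta + delta) * sigma_i)
    W[right] = np.exp(-(k[right] - (li + beta * sigma_i))**2 / (2 * sigma_g**2))
    return W

# componentwise extraction of equation (4.14):  s_i = s * W_i
```

With the defaults β = δ = 2 the support has total width 8σ_i, matching the text.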
Each signal s_i is now a simple and localized signal assumed to be a superposition of a small number of atoms. We show an example of a windowing in Figure 4.2: the top plot shows the signal and the window applied to one detected event, and the bottom plot shows the windowed event.

4.2 Estimation

We now treat each windowed signal s_i as a generic signal s. The estimation algorithm operates on simple and localized signals s assumed to be superpositions of a small number of waveforms parameterized by their fractional-order. The purpose of the estimation step is to unravel and approximate the fractional-orders of these waveforms through the decomposition of the localized signal with respect to a large dictionary. In particular, the estimation step answers two quantitative questions:

1. How many waveforms does s contain?
2. What is the fractional-order of each of these waveforms?

4.2.1 Decomposition in Estimation

We consider a signal s:

    s = Σ_{i=1}^K a_{α_i} φ_{α_i},    (4.15)

where φ_{α_i} is a localized waveform parameterized by its fractional-order α_i, and K a small integer, leading to a very sparse representation a = (a_{α_i})_{i=1}^K. We denote by D_K the dictionary containing the waveforms (φ_{α_i})_{i=1}^K. The principle of the estimation step is to find the waveforms that make up the signal s (or an approximation of these waveforms), through the decomposition of s in a large estimation dictionary of waveforms parameterized by their fractional-order. The signal s can be decomposed as

    s = Σ_{i=1}^N c_{α_i} ψ_{α_i},    (4.16)

where the ψ_{α_i} are waveforms taken from an overcomplete estimation dictionary D_e, and c = (c_{α_i})_{i=1}^N is a very sparse representation of s. Equation (4.16) can be written in matrix form:

    s = Φ_e c,    (4.17)

where Φ_e is the matrix of D_e of size n × m (m > n), s ∈ R^n and c ∈ R^m.
In the ideal case, where the estimation dictionary D_e contains the waveforms (φ_{α_i})_{i=1}^K (i.e. D_K ⊂ D_e), equation (4.15) is identical to (4.16), and the representation a coincides with the representation c. When D_e contains a subset D_P of D_K, D_P = (φ_{α_{o_i}})_{i=1}^P, with P < K and o_i ∈ {1, ..., K}, both representations a and c coincide on the subset of coefficients that correspond to the waveforms in D_P. Because D_K is unknown, D_e does not in general contain the waveforms (φ_{α_i})_{i=1}^K. However, if a subset of waveforms in D_e converges to the waveforms in D_K, we hope to have the representation c converge to a.

As stated in section 3.1, there is no unique solution c to the underdetermined problem (4.17), and we cannot guarantee that the representation c will converge to a. However, following the sparsity of a, and by imposing sparsity on the representation c, we hope to find the representation c that will converge to a in the case where a subset of waveforms in D_e converges to the waveforms in D_K. The sparsity assumption on c naturally leads to solving the (P₀) problem. However, the (P₀) problem is in general computationally intractable [2, 8], so we consider the easier (P₁) problem:

    min ‖c‖₁ subject to s = Φ_e c.    (4.18)

The result of Theorem 3.2.2, given in section 3.2, describes the sparsity condition under which the solution to (P₀) is also the solution to (P₁): if there exists a representation c₀ such that s = Φc₀ and ‖c₀‖₀ < μ_{1/2}(G), then c₀ is the unique solution of (P₁) and is the sparsest solution. For large dictionaries, the above condition requires a high sparsity of the solution of (P₀), in order to guarantee its being the solution to (P₁) as well. Theorem 3.2.2 reveals the trade-off between the size of the dictionary and the complexity of the signal. The larger the dictionary, the smaller μ_{1/2}(G), and the sparser c₀ needs to be in order to also be the solution of (P₁).
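The sparsity condition above can be probed numerically through the Gram matrix G = Φ*Φ. The sketch below uses the classical mutual-coherence form of the uniqueness bound, ‖c₀‖₀ < ½(1 + 1/μ(G)), which is a related but not necessarily identical quantity to the μ_{1/2}(G) of Theorem 3.2.2; function names are illustrative.

```python
import numpy as np

def mutual_coherence(Phi):
    """Largest off-diagonal entry (in magnitude) of the Gram matrix of the
    column-normalized dictionary. Assumes mu > 0 (at least two correlated
    columns); mu = 0 would mean an orthogonal dictionary."""
    Phi = Phi / np.linalg.norm(Phi, axis=0)
    G = np.abs(Phi.T @ Phi)
    np.fill_diagonal(G, 0.0)
    return G.max()

def sparsity_bound(Phi):
    # classical l0/l1 equivalence bound: ||c0||_0 < (1 + 1/mu)/2
    mu = mutual_coherence(Phi)
    return 0.5 * (1.0 + 1.0 / mu)
```

As the fractional-order grid in the dictionary is refined, neighboring atoms become nearly parallel, μ approaches 1 and the bound collapses toward 1, which is exactly the dictionary-size trade-off discussed above.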
On the other hand, μ_{1/2}(G) increases as the size of the dictionary decreases, which relaxes the sparsity requirement on c₀. The choice of the dictionary D_e is thus very important.

4.2.2 Dictionary in estimation

We now discuss our choice of dictionary D_e. Based on the strong sparsity of c assumed in equation (4.17), we can relax the size of the dictionary and allow for a large one. Following the argument developed in section 4.1.1, it is necessary to select atoms that correlate well with the signal s. In particular, we choose atoms that are waveforms parameterized by their fractional-order α, leading to

    D_e = (φ_α)_{α∈A},    (4.19)

where A is the indexing set of fractional-orders taken from the singularity spectrum f(α) given by a multiscale analysis [14, 25] of the seismic signal. In the estimation, it is crucial to obtain an accurate estimate of the fractional-orders contained in the signal s we analyze. For that purpose, it is necessary that the estimation dictionary contain a large number of sufficiently close fractional-orders α. There is a trade-off between the size of the dictionary and the accuracy of the estimation algorithm. When the fractional-orders in the dictionary are too far apart, we lose accuracy. On the other hand, if we choose the fractional-orders of the dictionary too close together, the dictionary becomes too large and the algorithm prohibitively slow. For our purposes, we choose the distance between two consecutive fractional-orders in the dictionary D_e to be 0.2, as it gives a good compromise between accuracy and speed. Figure 4.3 shows two waveforms with fractional-orders differing by 0.2. The waveforms are fractional derivatives of the Gaussian, whose fractional-orders are the fractional amounts of differentiation performed on the Gaussian. The top plot shows a waveform with fractional-order 2.2 and the bottom plot shows a waveform with fractional-order 2.4.
Visually, these waveforms look similar, but our algorithm can still distinguish between them. Waveforms with closer fractional-orders will not be distinguished by the algorithm, but this is sufficient for our purposes. The same argument applies to dictionaries explicitly parameterized by their fractional-orders and frequencies.

4.2.3 Resolution

We propose to solve (4.18) using the iterative thresholding algorithm [4] described in section 4.1.3. Recall the iterative thresholding algorithm, whose solution c is given by

    c = argmin_c (½ ‖s − Φ_e c‖₂² + λ_e ‖c‖₁),    (4.20)

where s = Φ_e c + r.

Figure 4.3: Fractional derivatives of the Gaussian with similar fractional-orders. Here we see two almost identical waveforms with fractional-orders differing by 0.2. The top plot shows a waveform with fractional-order 2.2, and the bottom plot displays a waveform with fractional-order 2.4. We choose the estimation dictionary so that our algorithm can distinguish two waveforms whose fractional-orders differ by at least 0.2. If the fractional-orders are closer than 0.2, our method will not distinguish them.

In the estimation step, the signal s is decomposed with respect to a large dictionary. Unlike the detection, where we consider an approximate decomposition, the signal s is entirely decomposed, leading to an exact decomposition of s. This is very important, as we want to estimate the fractional-orders of s. We can solve (4.18) using the iterative thresholding algorithm because, when λ_e → 0, the residual r goes to zero and the solution c behaves exactly like BP applied to s. From experiments, we choose λ_min = λ_e = 10⁻¹⁰.
Although numerically non-sparse, most of the coefficients of the representation c are within a tolerance of 10⁻¹⁰ ≪ λ_min. It is then necessary to threshold c. As in the detection step, we set the threshold to a percentage of the ℓ² norm of c,

    μ′ = ρ′ ‖c‖₂,

where ρ′ is the percentage. From experiments, we choose ρ′ = 0.1. We denote the thresholded vector c_t. The nonzero entries of the solution c_t determine the different events in s. In particular, each nonzero entry in c_t corresponds to a particular atom φ_α in D, whose parameter α gives the fractional exponent of the corresponding event. In the case of dictionaries parameterized by fractional exponents and frequencies, the nonzero entries in c_t correspond to atoms φ_{(α,f)}, whose parameters α and f give the fractional-orders and frequencies of the estimated events.

4.3 Numerical scheme

We present the numerical scheme for our algorithm. We use the abbreviation BCR (Block Coordinate Relaxation) for the iterative thresholding algorithm, as our thresholding algorithm is based on the Block Coordinate Relaxation Method [29, 31]. T_μ denotes the hard thresholding operator by μ. The function Estimate approximates the fractional-orders of the different windowed signals s_i. The function Global gathers the estimated fractional-orders of the windowed signals into a global vector α. The vector α contains the fractional-orders of the different events in s. The algorithm is given in Figure 4.4.

- Build the detection dictionary D_d
- Build the estimation dictionary D_e
- For each trace s:
  - DETECTION
    1. Decompose s with respect to D_d using iterative thresholding: c = BCR(s)
    2. Threshold c with threshold level μ: c_t = T_μ(c)
    3. Select the M sets of largest coefficients (c_{t_i})_{i=1}^M
    4. Window out the detected parts (s_i)_{i=1}^M in s corresponding to (c_{t_i})_{i=1}^M
  - ESTIMATION: for each part s_i:
    1. Decompose s_i with respect to D_e: c_i = BCR(s_i)
    2. Select the waveforms and estimate their fractional-orders: α_i = Estimate(c_i)
  - Global vector of fractional-orders: α = Global(α_i)

Figure 4.4: Detection-Estimation Algorithm.

Chapter 5

Numerical Experiments

We now present the results of our detection-estimation method on both synthetic and real data. We consider dictionaries made of fractional derivatives of a centered Gaussian. The atoms of the dictionaries are thus parameterized by their fractional-order α and their standard deviation σ (equivalent to the scale and the frequency). The dictionaries can be written as

    D = (φ_{α,σ})_{(α,σ)∈(R₊)²},

where α and σ are the indexing fractional-orders and standard deviations. The fractional-order α of the waveform φ_{α,σ} is given by the fractional amount of differentiation performed on the Gaussian:

    ∀α > 0, ∀x ∈ R,    φ_{α,σ}(x) = D^α g_σ(x),    (5.1)

where g_σ is the centered Gaussian function with standard deviation σ, defined by g_σ(x) = e^{−x²/(2σ²)} for x ∈ R. Our choice of dictionaries made of fractional derivatives of the Gaussian is motivated by several factors. First, dictionaries made of fractional derivatives of the Gaussian provide flexibility over the choice of both the fractional-orders and the frequencies of the atoms. They hence provide adaptable dictionaries and, in particular, allow for the construction of bandwidth-limited dictionaries containing atoms with very localized frequencies. Bandwidth-limited dictionaries are particularly well suited for seismic data, as seismic data is bandwidth limited. Second, we noticed that fractional derivatives of the Gaussian showed good correlation with seismic data, both in the time and frequency domains. In particular, their Fourier spectrum fits quite well the Fourier spectrum of seismic data. We show an example of the Fourier fitting in Figure 5.1.
Figure 5.1: Fourier spectrum fit. The top plot shows the average waveform in the data, found by fitting the Fourier spectrum of the data with the Fourier spectrum of a fractional derivative of the Gaussian. The bottom plot shows the Fourier transform of the data (in red) and of the average waveform (in blue).

The results displayed in Figure 5.1 have been obtained on real seismic data gratefully received from ChevronTexaco. The top plot shows the average waveform of the data, whereas the bottom plot displays the superposition of the Fourier transform of the average waveform and the Fourier transform of the first trace of the data (see appendix D).

5.1 Application to synthetic data

5.1.1 Simple 1D synthetic data

We start by presenting the results of our detection-estimation algorithm on simple synthetic signals. We consider synthetic signals s as superpositions of fractional derivatives of the Gaussian:

    s = Σ_j Σ_i c_k φ_{α_i,σ_j},    where k = i + (j − 1)N,

and the waveforms φ_{α_i,σ_j} are taken from an overcomplete synthesis dictionary D_s = (φ_{α,σ})_{(α,σ)∈(R₊)²}. The synthesis dictionary refers to the dictionary we use to synthesize the signal s. We apply our detection-estimation algorithm to two different synthetic signals: one with 20 events (N = 20), and one with 30 events (N = 30). We choose a synthesis dictionary D_s with 30 different fractional-orders α ranging from 1 to 4, and 25 standard deviations between 0.007 and 0.013. We choose λ_min = 10⁻³ for the iterative thresholding algorithm of the detection part, and λ_min = 10⁻⁹ for the iterative thresholding algorithm of the estimation part.
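The atoms φ_{α,σ} = D^α g_σ of equation (5.1) can be generated spectrally, since fractional differentiation has Fourier multiplier (iω)^α. A minimal sketch (boundary effects and the exact branch convention for (iω)^α are glossed over; the function name and parameters are illustrative):

```python
import numpy as np

def frac_gauss_deriv(alpha, sigma, n, dt):
    """Fractional derivative of order alpha of a centered Gaussian with
    standard deviation sigma, sampled on n points with spacing dt, computed
    spectrally as irfft((i*w)**alpha * rfft(g)). Assumes the Gaussian is
    well localized inside the window (negligible wrap-around)."""
    t = (np.arange(n) - n // 2) * dt
    g = np.exp(-t**2 / (2 * sigma**2))          # centered Gaussian g_sigma
    w = 2 * np.pi * np.fft.rfftfreq(n, d=dt)    # angular frequencies
    spec = (1j * w)**alpha * np.fft.rfft(g)     # D^alpha in the Fourier domain
    return np.fft.irfft(spec, n)
```

For integer α this reduces to the usual Gaussian derivatives (α = 2 gives a Ricker-like wavelet), so the construction can be checked against the analytic second derivative.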
The detection decomposition is performed with a dictionary of size 3, and the estimation decomposition with a dictionary of size 48: the estimation dictionary contains 16 different fractional-orders between 1 and 4, and 3 different standard deviations. The frequencies of the waveforms contained in the estimation dictionary can be chosen close to the frequency of the windowed event, which makes the estimation dictionary scale- or frequency-adaptable. This property of our algorithm enables us to reduce the number of standard deviations, and hence the size of the estimation dictionary. This reduction increases the speed of the estimation decompositions performed on each windowed event. We use this adaptability property in the analysis of the two synthetic signals.

The results obtained with our detection-estimation algorithm are shown in Figures 5.2 and 5.3. Figure 5.2a shows the signal with 20 events, Figure 5.2b presents the original fractional exponents at their locations in the signal, and Figure 5.2c shows the fractional exponents found by the algorithm. In this simple example, the locations and fractional-orders of the different events in s are correctly found. Figure 5.3a shows the synthetic signal s containing 30 events, Figure 5.3b shows the original fractional exponents α at their locations in the signal, and Figure 5.3c shows the fractional exponents found by the algorithm. All the events in s are found and have the correct locations, and most of the fractional-orders are correct. However, some of the fractional-orders found by the algorithm are far from the original ones. We attribute these few errors in the estimated fractional-orders to an erroneous windowing.
An incorrect windowing will not capture the whole event and therefore not extract it entirely, providing the estimation algorithm with the wrong waveform. A poor windowing is more likely to happen when the signal s contains closely-spaced events relative to the scale of the seismic wavelet.

Figure 5.2: Analysis of a synthetic signal with 20 events.

5.1.2 Comparison with Basis Pursuit and Matching Pursuit

In this section, we compare the performance of our detection-estimation algorithm with the performance of Basis Pursuit (BP) and Matching Pursuit (MP). We consider a signal s with 25 events (N = 25) and synthesize it with a synthesis dictionary D_s made with 30 fractional-orders α between 1 and 4, and 17 standard deviations between 0.09 and 0.011. We set the parameters of our algorithm to λ_min = 10⁻³ for the iterative thresholding algorithm of the detection part, and λ_min = 10⁻⁹ for the iterative thresholding algorithm of the estimation part. We choose a detection dictionary of size 3 and an estimation dictionary of size 54. In this experiment we do not use the adaptability property of the algorithm, in order to have a fixed estimation dictionary for all windowed events. In that way, we can use the estimation dictionary as the analysis dictionary for BP and MP.

Figure 5.3: Analysis of a synthetic signal with 30 events.
Both BP and MP decompose the signal s in an analysis dictionary D_a, and try to approximate the sparsest representation c such that s = Φ_a c, where Φ_a is the matrix of the dictionary D_a. We compare the results of our algorithm with the results of BP and MP, and use the estimation dictionary of our method as the analysis dictionary D_a for BP and MP. Each algorithm gives a representation of the signal s in the dictionary Φ_a. For simple signals and small dictionaries, MP, BP and our algorithm (which we call SEEDE, for Scale Exponent Estimation) give the same representation. However, when the signal complexity is high and the dictionary large, each algorithm provides a different representation of the same signal s.

The results of our comparison are shown in Figure 5.4. The top plot shows the original fractional exponents at their locations in the signal s, the second plot shows the fractional exponents found by our algorithm SEEDE, the third plot shows the fractional exponents found by BP, and the bottom plot shows the fractional exponents found by MP.

Figure 5.4: Comparison between the detection-estimation algorithm, BP and MP.

We can see that for this simple signal s, BP and MP globally find the correct events at the correct locations, but find other events as well. BP still finds more events than MP because of its super-resolution property [2]. However, neither BP nor MP finds the events located at abscissa 300, because too many events are closely spaced there. For each event in the original vector, BP and MP also find multiple events with different fractional-orders.
BP is nevertheless better than MP, because for most of the events in the original vector, the correct fractional-order is often among the multiple fractional-orders found by BP. Unlike BP and MP, our method does not resolve closely-spaced events, and its resolution is lower than that of BP and MP. However, for each recovered event, our method finds only one fractional-order. We can see that most of the fractional-orders recovered by our method are correct. The incorrect fractional-orders are those belonging to events influenced by other neighboring events. This is due to the windowing, which extracts a part of the signal and prevents the estimation analysis from "seeing" the complete signal. Information is therefore "cut" by the windowing. As long as the events are not too closely spaced, and no matter how many events the signal contains, our method will find and correctly estimate all the fractional-orders. However, as soon as the events start to interfere with each other, our method starts to have difficulties in finding the correct fractional-orders, and then the events themselves. One great advantage of our method compared to MP and BP is that it gives only one estimated fractional-order for each event found. We feel that further work is required to obtain an optimal windowing and improve the resolution of our algorithm.

The numerical complexities of BP, MP and our method SEEDE are quite different. Table 5.1 displays the CPU times in seconds spent in running the different decomposition techniques on a signal of length 512, for dictionaries of various sizes. All computation was done on a Macintosh G5 workstation. MP is the fastest decomposition. MP has a quasi-linear complexity, depending on the number of chosen atoms [2, 26]. BP, on the other hand, is very slow. The complexity of BP depends on the complexity of the conjugate gradient solver and the complexity of the primal-dual logarithmic barrier interior-point algorithm.
The complexity of BP is also affected by the size of the dictionary and the complexity of the signal.

Table 5.1: CPU running times of BP, MP and our method SEEDE (problem size: 512).

    Dictionary Size |       BP |      MP |   SEEDE
                 18 | 22630.68 |  7.1450 | 10.4283
                 30 | 66605.47 | 11.0123 | 33.4668
                 42 |  5902.68 |  9.4068 | 56.5617
                 54 | 44589.83 | 12.4648 | 71.5451

The numerical complexity of BP can nonetheless be modified by using different parameter settings. Our algorithm finds a compromise between BP and MP. One should note that all our experiments were done in Matlab. The speed of these algorithms could therefore be largely increased by using C++ or Fortran.

5.1.3 Realistic synthetic data

We now present our results on more realistic synthetic data. Seismic data is usually stored in seismic cubes or seismic matrices, where each column is a seismic trace. We use synthetic data of size 512 × 100. Figures 5.5, 5.6, 5.7 and 5.8 present the results of our algorithm on a synthetic slice (or matrix) with 16 events. Figure 5.5 displays the synthetic slice, Figure 5.6 shows the detected reflectors, Figure 5.7 presents the original fractional-orders at their locations in the different traces, and Figure 5.8 shows the fractional-orders found by our algorithm. We used a synthesis dictionary made with 20 different fractional-orders ranging from 1 to 4, and 3 standard deviations (scales) between 0.005 and 0.04. In the algorithm, we used a detection dictionary of size 2 (2 standard deviations) made with the average waveform estimated from the synthetic data. We chose the adaptive estimation dictionary. In this example, the events are quite well separated, so the algorithm can detect them.

Figure 5.5: Synthetic slice with 16 reflectors.

Figure 5.6: Detected reflectors.
Figure 5.7: Original fractional-orders.
Figure 5.8: Recovered fractional-orders.

This example shows that if the events are well separated, the algorithm detects all of them and correctly estimates most of them. The lateral continuity of the recovered fractional-orders along the different reflectors shown in Figure 5.8 is overall quite well preserved. This property is very important for seismic data, as fractional-orders should remain laterally continuous along reflectors. However, although most of the recovered fractional-orders are correct and laterally continuous along the reflectors, we can clearly see that three distinct reflectors have fluctuating fractional-orders. The fractional-orders oscillate between the correct value and the maximum value of the fractional-orders in the dictionary. This phenomenon is due to an incorrect estimation of the frequency in the detection part. The frequency of the analyzed event is indeed very important for a correct estimation. If the estimated frequency of the atom is incorrect, the estimation will be unable to find the correct fractional-order, as the largest inner product between the event and the atoms in the estimation dictionary will not correspond to the atom with the correct fractional-order. In this example, we clearly see that the detection sometimes outputs an incorrect frequency. The reason for this is not well understood at present, and we feel that further work is required to fully understand the whole algorithm. In Figures 5.9, 5.10, 5.11 and 5.12, we show the results of our algorithm on a different synthetic slice containing 24 events.
Figure 5.9 shows the synthetic data, Figure 5.10 shows the detected reflectors, Figure 5.11 displays the original fractional-orders of the different reflectors, and Figure 5.12 presents the fractional-orders found by our algorithm. As in the previous example, we used a synthesis dictionary made with 20 fractional-orders between 1 and 4, and 3 standard deviations between 0.005 and 0.04. In the algorithm, we used a detection dictionary of size 3, containing 3 different standard deviations and made with the average waveform. We chose the adaptive estimation dictionary. In this example, the events start interfering with each other, although they are not extremely close. One can clearly notice that the second event was not found by the algorithm, because it was too close to the first one.

Figure 5.9: Synthetic slice with 24 reflectors.
Figure 5.10: Detected reflectors.

The algorithm lacks the super-resolution property of BP and does not resolve two closely-spaced events. The first two reflectors represent a thin layer in the subsurface, but the algorithm considers them as only one reflector. In theory, this is not a problem, as the estimation algorithm should be able to detect closely-spaced events as well. However, due to numerical complexities, the current implementation of the algorithm does not yet resolve closely-spaced events. Incorporating the super-resolution property into our implementation is still ongoing research. Oscillating fractional-orders along certain reflectors can also clearly be seen in this example. One solution would be to increase the size of the detection dictionary and incorporate many frequencies and possibly different fractional-orders.
However, the use of large detection dictionaries leads to denser coefficients that are hard to analyze. Despite these flaws, we can see that most of the fractional-orders found by our algorithm have the correct values, and remain overall laterally continuous along the different reflectors. Further work is still required to precisely understand the role of the different parameters involved in the algorithm. The algorithm can be largely improved by an adequate choice of dictionaries, minimization parameters and efficient minimization solvers. The choice of the iterative thresholding algorithm is a first step in the implementation of our method, and will be replaced by more efficient solvers in the future.

Figure 5.11: Original fractional-orders.
Figure 5.12: Recovered fractional-orders.

5.2 Application to real data

We now present the results of our algorithm on real seismic data. We used a detection dictionary of size 1, made with the average waveform estimated through a Fourier spectrum fitting; the fitting was shown in Figure 5.1. We chose a detection dictionary of size 1 in order to obtain a sparser and clearer detection coefficient c. As stated in section 5.1.3, the larger the detection dictionary, the denser the output coefficient of the detection minimization, which makes it hard to detect the events and therefore defeats the purpose of the detection. In theory, and for very clean data, the larger the dictionary the better, as larger dictionaries give more information than smaller ones. However, in practice and for real data, the output of the detection algorithm gives a noisy coefficient that is difficult to analyze.
We noticed from experiments on real data that the efficiency of the detection algorithm is increased by using a smaller dictionary. The optimal performance of the detection algorithm was obtained with a detection dictionary of size 1. We used an estimation dictionary of size 54, with 18 fractional-orders and 3 standard deviations. The fractional-orders and standard deviations of the estimation dictionary are chosen according to the fractional-order and standard deviation of the average waveform. In that sense, both the detection and estimation dictionaries are adapted to the data. In the detection minimization we used λ_min = 10^-4, and λ_min = 10^-10 in the estimation minimization. We present the results of our algorithm in Figures 5.13, 5.14 and 5.15.

Figure 5.13: Real data.

Figure 5.13 shows the first slice of a real seismic data cube, Figure 5.14 presents the detected reflectors, and Figure 5.15 displays the results of our algorithm on this slice. In Figure 5.15, we can clearly distinguish most of the reflectors contained in the data, which shows a good performance of the detection algorithm. Some reflectors remain hard to discern, in particular at the bottom of the seismic slice, which is also the case in the real data shown in Figure 5.13. Overall, the results of our algorithm are consistent with the real data, and the detected reflectors follow the reflectors present in the real data. The lateral continuity of the fractional-orders along the different reflectors is quite well preserved for certain reflectors that are isolated and easy to detect. For a general setting of reflectors, we can notice some fluctuations in the fractional-orders, which can be due to a wrong estimation of the frequency of the detected event (the phenomenon mentioned earlier), and/or to the choice of fractional-orders in the estimation dictionary.
The changes in the fractional-orders can also be due to actual changes in the nature of the transitions in the subsurface.

Figure 5.14: Detected reflectors.
Figure 5.15: Recovered fractional-orders.

Despite limitations in our current implementation of the method, we feel that our preliminary results are encouraging with respect to detecting the reflectors and estimating their fractional-orders. Figure 5.16 presents the superposition of the real data with the fractional-orders recovered by our algorithm.

Figure 5.16: The real data is presented in grey scale and the recovered fractional-orders are displayed in colors.

Chapter 6

Discussion

6.1 Summary and Conclusions

In this paper, we have presented a new method for characterizing and delineating transitions in the subsurface. Our method is based on optimal decompositions of the data [2] with respect to overcomplete dictionaries of waveforms parameterized by their fractional-orders. We propose a two-step approach that divides the initial ℓ1 decomposition problem into a detection step followed by an estimation step. The detection part locates the predominant events in the data and partitions the data into its prevailing components. The estimation part approximates the fractional-orders of each detected component. The detection and estimation algorithms both exploit the sparsity of the representation of the data with respect to the dictionaries, by minimizing the ℓ1 norm of the data representation. We use an iterative thresholding algorithm [4] with a Block Coordinate Relaxation method [29, 30] to solve the minimization.
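As an illustration, the core of such an iterative soft-thresholding step can be sketched as follows. This is a minimal Python sketch, not our Matlab implementation: the toy dictionary, step size and fixed threshold are illustrative assumptions, and the Block Coordinate Relaxation over sub-dictionaries is omitted.

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding operator, the proximal map of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(Phi, s, lam, n_iter=500):
    """Minimize ||s - Phi c||_2^2 + lam ||c||_1 by iterative soft thresholding."""
    L = np.linalg.norm(Phi, 2) ** 2          # squared spectral norm of Phi
    c = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        # gradient step on the misfit, then shrinkage toward sparsity
        c = soft(c + Phi.T @ (s - Phi @ c) / L, lam / (2 * L))
    return c
```

For an orthonormal dictionary this reduces to a single shrinkage of the analysis coefficients; for redundant dictionaries the iteration converges to a sparse approximate representation.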
We show that by first partitioning the data, and then analyzing each detected event separately, we improve on the performance of BP with respect to finding the original coefficient. The probability of success in finding the original representation of the data with respect to a given dictionary D is increased by our new technique. In particular, the probability of detecting the events in the data with our new method is quite large. We show that our method achieves quite good performance in detecting events, provided the events are not too closely-spaced. The method also gives encouraging results with respect to the estimation of the fractional-orders of the events, and provides a higher probability of correctly detecting and estimating the fractional-orders of the events than BP. Unlike BP, where the use of large dictionaries is prohibitive, our new method can give good and accurate results with fairly large dictionaries. Besides, the computational complexity of our method can be reduced by changing the speed parameters of both the detection and estimation decomposition algorithms. However, the complexity of our algorithm is still significant, and increases with the size of the dictionaries. In comparison to BP, our method achieves lower computational cost, but it is computationally more expensive than MP. In that respect, our method offers a compromise between MP and BP. In order to improve the resolution of our method, we proposed to replace the iterative thresholding algorithm of the estimation step by BP. However, incorporating BP into the estimation part made it prohibitively expensive when dealing with large dictionaries. We therefore feel that further work is still required to achieve super-resolution of thin layers with our new method. We propose a fast and efficient detection algorithm by using a small dictionary.
The goal of the detection algorithm is both to detect the major events in the data and to estimate their frequency through an approximate ℓ1 decomposition in a small dictionary. The estimated frequency of the events is indeed very important, as it constitutes the basis of the frequency content of the estimation dictionary. If the estimation dictionary contains the wrong frequencies, the largest inner product between the atoms in the dictionary and the analyzed event will pertain to an atom with an incorrect fractional-order. It is therefore crucial that the detection algorithm gives a good approximation of the correct frequency of the events. There is a trade-off between the optimal performance of the detection, which requires a small detection dictionary (especially for real and noisy data), and a good approximation of the frequency of the detected events, which requires a larger detection dictionary with a wide frequency content. Contrary to the detection algorithm, the estimation algorithm uses very large dictionaries with a wide range of fractional-orders, which provides accuracy in the estimated fractional-orders. Due to the size of the dictionaries involved, the estimation decomposition, unlike the detection decomposition, is a slow and computationally intensive algorithm. However, there are many possible ways to increase the speed of the estimation part, including changing the speed parameters in the iterative thresholding algorithm, and reducing the size of the estimation dictionary by adapting its frequency content to the frequency of the analyzed event. The estimation dictionary can therefore be adapted to the event by narrowing its frequency content and selecting only a few frequencies around the frequency estimated in the detection. However, in that case, if the detection provides an incorrect estimated frequency, the estimation will be unable to find the correct fractional-order.
One should note that incorporating a very large frequency content in the estimation dictionary, in addition to its wide range of fractional-orders, leads to an extremely slow and computationally intensive algorithm. The experiments performed on both synthetic and real data nonetheless show promising outcomes. The performance of our algorithm on synthetic data is very encouraging, as the lateral continuity along the different reflectors is quite well preserved. The results on real data, although lacking some lateral continuity along reflectors, also seem quite promising.

6.2 Limitations of our method

The method suffers from a limitation in the resolution of two closely-spaced events in the data. At the current stage, the estimation algorithm does not resolve two closely-spaced events. The method therefore loses the super-resolution property of BP, but gains in lower computational cost. Another limitation is the strong dependence of the estimation algorithm on the detection algorithm. A correct approximation of the frequencies of the detected events is very important for the estimation algorithm, as it constitutes the baseline for the frequency content of the estimation dictionary. Besides, there is a large trade-off between the size of the detection dictionary, which should contain a wide range of frequencies to ensure a correct approximation of the frequency of the detected events, and the efficiency of the detection algorithm. The wider the frequency content of the detection dictionary, the more accurate the frequency of the detected events, but the less efficient the detection of the events, in particular for real and noisy data. The analysis of the coefficients obtained in the detection algorithm therefore needs to be improved in order to correctly detect the events in large dictionaries with wide frequency content.
Although our method achieves good performance with fairly large dictionaries, its efficiency is still limited by the size of the dictionaries, due to computational costs. The estimation dictionary cannot incorporate a wide range of frequencies, as its size would lead to a prohibitively slow algorithm. The windowing of the detected events is also a very sensitive part that plays an important role in the estimation algorithm, as it determines the efficiency of the estimation. A wrong windowing can lead to the extraction of an incorrect event, and therefore to an incorrect fractional-order. If the window is too small, the event will be "cut", and not extracted entirely. On the other hand, if the window is too large, the extracted event will contain remnants of other events. In those cases, depending on how much the window cuts from or adds to the event, the estimated fractional-order will be correct or incorrect. Further work is still required to build an optimal windowing. Even with these limitations, our method still performs quite well in the detection of the events, and relatively well in the estimation of their fractional-orders. In particular, the results obtained on real data showed that most of the reflectors were detected by our algorithm, and that the lateral continuity of the fractional-orders along the reflectors was relatively well preserved, taking into account that actual changes in the fractional-orders of the reflectors also exist in the data. The goal of our algorithm is to detect these changes, and we feel that it performed quite well with respect to the smooth changes in the fractional-orders of the reflectors in the data.

6.3 Future work

In this paper, we developed a new method that demonstrated encouraging results, in particular in the detection part.
It also showed that, no matter how many events were in the data, as long as they were not too closely-spaced, our algorithm would detect them and estimate them quite accurately. In the future, it would be beneficial to improve the algorithm by making it faster and more efficient. The possible improvements are listed and explained below:

1. The current method uses an iterative thresholding algorithm with a Block Coordinate Relaxation method to solve an ℓ1 minimization, which enables us to use relatively large dictionaries. However, iterative minimizations lead to slow algorithms and therefore limit their performance. The numerical complexity of our algorithm, although inferior to that of BP, is still significant. It would therefore be good to increase the speed of the iterative thresholding solver, and to incorporate, in the future, more efficient ℓ1 solvers involving global optimization tools.

2. Working with large datasets and large dictionaries involves numerous and costly computations, and therefore requires powerful programming languages. Our algorithm, currently written in Matlab, would gain in speed if it were programmed in another language like C++ or Fortran. In the future, it will be necessary to convert our Matlab routines into one of these languages.

3. Resolving thin layers in the subsurface is also very important, and constitutes a major issue when analyzing seismic data. In order to achieve super-resolution, it would be interesting to incorporate BP in our algorithm, but in a way that would not lead to a prohibitively slow algorithm. BP is a very powerful solver with respect to its multiresolution property, and we feel that it could help our algorithm resolve closely-spaced events.

4. The efficiency of our method highly depends on the choice of the dictionaries, in particular on the choice of their fractional-orders and frequencies.
Many questions therefore arise: (a) What is the range of fractional-orders and frequencies? (b) How finely sampled should the fractional-orders and frequencies be? (c) Can we reduce the range of fractional-orders to [0,1] and build any other fractional-order a > 1 through a linear combination of fractional-orders in [0,1]?

5. Global multiscale analysis with wavelets performed on seismic data (see section 2.2.2) yields an estimate of the singularity spectrum f(a) of the data. The singularity spectrum f(a) gives the probability of occurrence of a particular singularity in the data, and therefore provides information on the different fractional-orders contained in the data. It would be interesting to use this information as a weighting function for the coefficients obtained in the ℓ1 minimizations of the algorithm. The ℓ1 norm of the decomposition of the data with respect to a dictionary associated with a rare singularity should be highly sparse, and therefore should be heavily penalized in the minimizations. On the other hand, the ℓ1 norm of the decomposition of the data with respect to a dictionary associated with a frequent singularity should be less sparse, and therefore should not be penalized in the minimizations. The weighting function w should thus behave like the reciprocal of f(a), where f(a) is the singularity spectrum of the data. For a signal s of length n, and a dictionary D of cardinality p, the minimization should be:

min_c ||s - Φc||_2^2 + λ||wc||_1,   (6.1)

where w is the weighting vector constructed according to the fractional-orders contained in the dictionary Φ, c ∈ R^m with m = np, and wc denotes the componentwise multiplication: for all 1 ≤ k ≤ m, (wc)(k) = w(k)c(k). If we write the dictionary D as:

Φ = [Φ_{a_1}, ..., Φ_{a_p}],   (6.2)

w can be obtained by the following computation:

w = [ (1/f(a_1)) 1_n, ..., (1/f(a_p)) 1_n ],

where 1_n = [1, ..., 1] is a vector of ones of length n, and w ∈ R^m with m = np.
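Assembling such a weighting vector is straightforward; the following Python sketch uses hypothetical spectrum values f(a_i) and a toy signal length purely for illustration:

```python
import numpy as np

def build_weights(f_alpha, n):
    """Stack one constant block of length n per fractional-order in the
    dictionary: atoms of a rare singularity (small f) get a large weight,
    i.e. a strong l1 penalty; frequent singularities get a weak penalty."""
    return np.concatenate([np.full(n, 1.0 / f) for f in f_alpha])

# hypothetical spectrum values f(a_1), f(a_2), f(a_3), signal length n = 4
w = build_weights([0.5, 1.0, 0.25], n=4)   # length m = n * p = 12
```

The resulting w would then multiply c componentwise inside the ℓ1 penalty of the minimization.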
6. Because the estimation of the frequencies of the detected events is very important, it would be interesting to include in the algorithm another estimation decomposition that would estimate the frequency of each detected event. The global algorithm would then consist of a detection, a first estimation approximating the frequency of each detected event, and a second estimation directed at finding the fractional-orders of each detected event. The first estimation dictionary would contain a wide range of frequencies and a few fractional-orders, whereas the second estimation dictionary would contain a wide range of fractional-orders and a few frequencies centered around the estimated frequency. The proposed algorithm aims at improving the estimation of the fractional-orders, but at an increased computational cost: performing two estimations with two large dictionaries can lead to a prohibitively slow algorithm with a high numerical complexity. Further work is therefore required to lower its computational cost. Figure 6.1 presents the modified algorithm.

Figure 6.1: Improved detection-estimation method.

Part II

Capita Selecta

Chapter 7

Atomic decompositions with Redundant Dictionaries

7.1 Problem formulation

As stated in chapter 2 by equation (2.15), we model seismic signals s as

s(z) = Σ_{i∈A} c_i (D^{-a_i} ψ)(z - z_i),   (7.1)

where z_i is the location of the i-th transition in the subsurface, c_i is the reflection coefficient of the i-th transition (given in section 2.1.1), ψ is the seismic source function, and D^{-a_i} is the Liouville fractional derivative operator of order -a_i defined in Definition 2.1.2. The goal is to estimate the locations z_i of the reflectors in the subsurface, their amplitudes c_i, and their fractional exponents a_i.
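To make the model concrete, the synthesis of a single waveform D^{-a}ψ can be sketched numerically. This is only a rough illustration, assuming a Gaussian source ψ and the Fourier-domain rule (iω)^{-a} for fractional integration; the precise definitions are those of chapter 2:

```python
import numpy as np

def fractional_atom(n, alpha, sigma=0.02, dt=1.0 / 512):
    """Sketch of one waveform D^{-alpha} psi: a Gaussian source psi,
    integrated to fractional order alpha in the Fourier domain."""
    t = (np.arange(n) - n // 2) * dt
    psi = np.exp(-t**2 / (2 * sigma**2))           # assumed source function
    omega = 2 * np.pi * np.fft.fftfreq(n, d=dt)
    mult = np.zeros(n, dtype=complex)
    nz = omega != 0
    mult[nz] = (1j * omega[nz]) ** (-alpha)        # fractional integration
    atom = np.real(np.fft.ifft(np.fft.fft(psi) * mult))
    return atom / np.max(np.abs(atom))             # normalized for display
```

Varying alpha changes the shape of the atom, which is what allows the fractional-order of a transition to be read off from the best-matching waveform.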
For that, we consider seismic signals s as a superposition of waveforms φ_a parameterized by their fractional-order a:

s = Σ_a c_a φ_a,   (7.2)

and envision their decomposition:

s = Σ_{a∈S} c_a φ_a,   (7.3)

with respect to an overcomplete dictionary D = (φ_a)_{a∈S}, where the atoms φ_a are waveforms parameterized by their fractional-order a, and S is the indexing set of the parameter a. In matrix form, equation (7.3) can be written as:

s = Φc,   (7.4)

where Φ is the matrix of the dictionary D. The goal is to find the representation c of the signal s which satisfies equation (7.4). However, due to the overcompleteness of D, there is no unique solution for c; but this nonuniqueness provides a flexibility which enables us to choose the representation c best suited to our problem. In the case of seismic signals, assumed to be a superposition of a limited number of waveforms, it is natural to try to minimize the number of components of c. There have been several approaches to this problem, including the Method of Frames [3], Matching Pursuit [26], Basis Pursuit [2], and the Block Coordinate Relaxation method [29, 30]. We describe these methods in the following sections.

7.2 Method of Frames

7.2.1 The algorithm

The Method of Frames [3] is an algorithm that finds the representation c with minimum ℓ2 norm, by solving

min_c ||c||_2 subject to s = Φc.   (7.5)

(7.5) is a quadratic program whose unique solution c† is given by:

c† = Φ† s = Φ^T (Φ Φ^T)^{-1} s.   (7.6)

The matrix Φ† = Φ^T (Φ Φ^T)^{-1} is called the generalized inverse of Φ. In our case, the major disadvantage of this method is the non-sparsity of the solution c† [2]. The solution c† is indeed not sparse, because the ℓ2 norm tends to prefer a large number of small nonzero coefficients rather than a few larger ones.
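Formula (7.6) can be checked on a tiny example (a Python sketch; the 2 x 3 dictionary and signal are arbitrary toy values):

```python
import numpy as np

# Minimum l2-norm representation (Method of Frames):
#   c-dagger = Phi^T (Phi Phi^T)^{-1} s,
# computed on a toy overcomplete 2 x 3 dictionary.
Phi = np.array([[1.0, 0.0, 1.0],
                [0.0, 1.0, 1.0]])
s = np.array([1.0, 2.0])
c_dagger = Phi.T @ np.linalg.inv(Phi @ Phi.T) @ s
assert np.allclose(Phi @ c_dagger, s)                  # exact reconstruction
assert np.allclose(c_dagger, np.linalg.pinv(Phi) @ s)  # matches the pseudo-inverse
# c_dagger = [0, 1, 1]: the representation of minimum l2 norm among all solutions
```

The generalized inverse here coincides with numpy's Moore-Penrose pseudo-inverse, since Phi has full row rank.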
The non-sparsity of c† means that the decomposition of s is spread over a large number of atoms in D, which translates into the fact that we need a large number of atoms to represent s, and contradicts our previous assumption. The solution given by the Method of Frames is therefore not suited to our application.

7.2.2 Concept of sparsity

The concept of sparsity is intrinsically linked to the dictionary we use, as the decomposition of a given signal f can be sparse in one dictionary and not in another. Consider a simple example of a cosine function, denoted g. The representation of g in the Discrete Cosine dictionary (Discrete Cosine Transform) is very sparse, as we only need a few cosine basis functions to construct g. However, the representation of g in the Haar wavelet dictionary is full, as we need a large number of step functions to build g. (The Haar wavelet basis is made of translations and dilations of a step function.) The results are shown in Figure 7.1.

Figure 7.1: Decomposition of a cosine using the Discrete Cosine Transform (DCT) and the Haar wavelets. The top figure shows the cosine function g, the middle one plots the decomposition of g with respect to the DCT dictionary, and the bottom one gives the decomposition of g with respect to the Haar basis.

7.3 Matching Pursuit

Contrary to the Method of Frames, Matching Pursuit (MP) [26] approximates the sparsest representation c_0, where the sparsest representation refers to the representation with minimum ℓ0 norm. MP tries to find the decomposition of the signal s in a dictionary D by iteratively selecting the waveforms that best match the data s.
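The sparsity contrast of Figure 7.1 can be reproduced with a small sketch (Python, using scipy's DCT and a hand-rolled orthonormal Haar transform; the length-128 cosine is an assumed stand-in for g):

```python
import numpy as np
from scipy.fft import dct

def haar(x):
    """Full orthonormal Haar wavelet transform of a length-2^k signal."""
    x = np.asarray(x, dtype=float).copy()
    out = np.empty_like(x)
    n = len(x)
    while n > 1:
        out[n // 2:n] = (x[:n:2] - x[1:n:2]) / np.sqrt(2)  # details
        x[:n // 2] = (x[:n:2] + x[1:n:2]) / np.sqrt(2)     # averages
        n //= 2
    out[0] = x[0]
    return out

def n_coeffs(c, frac=0.95):
    """Number of largest-magnitude coefficients capturing `frac` of the energy."""
    e = np.sort(np.abs(c))[::-1] ** 2
    return int(np.searchsorted(np.cumsum(e) / e.sum(), frac) + 1)

n = 128
g = np.cos(2 * np.pi * 4 * np.arange(n) / n)   # a pure cosine, 4 cycles
n_dct = n_coeffs(dct(g, norm='ortho'))         # very few coefficients
n_haar = n_coeffs(haar(g))                     # many more coefficients
```

Counting coefficients by captured energy makes the comparison robust: one DCT coefficient carries almost all of the cosine's energy, while the Haar representation spreads it over many step functions.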
MP is an iterative algorithm that selects, at each iteration, the waveform that minimizes the ℓ2-difference between the current residue and its projection onto a waveform. Consider a dictionary D made of atoms g_γ. The Matching Pursuit algorithm begins by projecting s on a vector g_{γ0} ∈ D and computing the residue Rs [25]:

s = <s, g_{γ0}> g_{γ0} + Rs,   (7.7)

where <s, g_{γ0}> is the inner product between s and g_{γ0}. Since Rs is orthogonal to g_{γ0}, we can write [25]

||s||^2 = |<s, g_{γ0}>|^2 + ||Rs||^2.

To minimize ||Rs||^2, we need to choose g_{γ0} ∈ D such that |<s, g_{γ0}>| is maximum, which is computationally done by finding an almost optimal vector g_{γ0} that satisfies [25]

|<s, g_{γ0}>| ≥ α sup_γ |<s, g_γ>|,

where α ∈ [0,1] is an optimality factor [25]. By setting R^0 s = s and subdecomposing the residue at each iteration, with the selection rule for the m-th iteration

choose g_{γm} s.t. |<R^m s, g_{γm}>| ≥ α sup_γ |<R^m s, g_γ>|,   (7.8)

we obtain the following decomposition of the residue at iteration m:

R^m s = <R^m s, g_{γm}> g_{γm} + R^{m+1} s.   (7.9)

If we stop after M terms, we obtain a sparse approximation of s:

s~(t) = Σ_{m=0}^{M-1} <R^m s, g_{γm}> g_{γm}(t).   (7.10)

As M → ∞, this decomposition becomes exact; with M ≪ ∞, it provides a sparse approximate representation. The adaptivity of the dictionary and the subsequent selection of the best-fitting waveforms concentrate the energy of the data in as few waveforms as possible, yielding a high non-linear approximation rate. However, this method often fails to locate two closely-spaced events in the coefficient vector. Due to the greediness of the algorithm, MP can also select the wrong basis function at the beginning (because two events are too close) and then spend the rest of the time correcting for its mistake [2].
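The greedy loop of (7.8)-(7.9) can be sketched as follows (a Python sketch assuming unit-norm atoms in the columns of Phi, and taking the fully optimal choice α = 1):

```python
import numpy as np

def matching_pursuit(Phi, s, n_iter):
    """Greedy MP: repeatedly pick the atom most correlated with the residue
    and subtract its projection, accumulating the coefficients."""
    residue = np.array(s, dtype=float)
    c = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        corr = Phi.T @ residue               # inner products <R^m s, g_gamma>
        k = int(np.argmax(np.abs(corr)))     # best-matching atom (alpha = 1)
        c[k] += corr[k]
        residue = residue - corr[k] * Phi[:, k]
    return c, residue
```

For an orthonormal dictionary the residue vanishes after as many iterations as there are nonzero coefficients; for redundant dictionaries the greedy choice can lock onto a wrong atom, which is the failure mode described above.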
7.4 Basis Pursuit

7.4.1 The formulation

Basis Pursuit (BP) is a method that decomposes a signal s into an optimal superposition of dictionary elements, where optimal means having the smallest ℓ1 norm among all such decompositions [2]. BP requires that the decomposition of the signal s in the dictionary D be sparse, i.e., that we only need a few waveforms from D to build s. BP solves the following minimization problem:

min_c ||c||_1 subject to s = Φc,   (7.11)

where Φ is the matrix of the dictionary D. BP tries to find the sparsest solution c by replacing the NP-hard ℓ0 minimization by the ℓ1 minimization stated above. The constraint imposed on the coefficient vector c, requiring a minimization of the number of nonzero components, is very well suited to our type of problem.

7.4.2 Linear Programming (LP)

BP is a convex, non-quadratic optimization problem that can be rewritten as a Linear Programming (LP) problem [2]:

min d^T x subject to Ax = b, x ≥ 0,   (7.12)

where x ∈ R^p, b ∈ R^n, and x ≥ 0 requires all components of x to be nonnegative. The reformulation of BP into the LP problem comes from the following change of variables [2]:

p = 2m; x = (u, v); d = (1, ..., 1); A = (Φ, -Φ); b = s,   (7.13)

where u and v refer to the positive and negative parts of c respectively ((u, v) ∈ (R_+^m)^2), and d is a vector of ones in R^p. First, we can write

Ax = (Φ, -Φ)(u, v) = Φ(u - v) = Φc.   (7.14)

As Φc = s = b, (7.14) is equivalent to Ax = b. Secondly, we can write

d^T x = (1, ..., 1)(u, v) = Σ_k u_k + Σ_k v_k = ||c||_1.   (7.15)

To clarify the notation, consider the following example:

c = (1, -2, 3, -4)^T, with u = (1, 0, 3, 0)^T, v = (0, 2, 0, 4)^T, and c = u - v.

We can easily check that d^T x = ||c||_1:

d^T x = (1, ..., 1)(u, v) = Σ_k u_k + Σ_k v_k = 1 + 2 + 3 + 4 = 10 = ||c||_1.   (7.16)

The solution given by BP is obtained by an Interior-Point scheme [2] based on a Primal-Dual Log-Barrier method [2, 10].
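The change of variables (7.13) can be exercised with an off-the-shelf LP solver; the following Python sketch uses scipy's linprog rather than the primal-dual log-barrier code of [2], and the 2 x 3 dictionary and signal are toy assumptions:

```python
import numpy as np
from scipy.optimize import linprog

Phi = np.array([[1.0, 0.0, 0.6],
                [0.0, 1.0, 0.8]])            # toy dictionary, unit-norm atoms
s = np.array([0.6, 0.8])
m = Phi.shape[1]
# LP (7.12): minimize 1^T x  s.t.  (Phi, -Phi) x = s,  x = (u, v) >= 0
res = linprog(c=np.ones(2 * m), A_eq=np.hstack([Phi, -Phi]), b_eq=s,
              bounds=(0, None))
c_bp = res.x[:m] - res.x[m:]                 # recover c = u - v
# the third atom alone reproduces s, so the minimum-l1 solution is (0, 0, 1)
```

Here s can also be written as 0.6 times the first atom plus 0.8 times the second (ℓ1 norm 1.4), but BP finds the sparser single-atom representation with ℓ1 norm 1.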
The algorithm starts from an initial feasible solution given by the Method of Frames [3], and proceeds by applying the Log-Barrier algorithm. The speed of the algorithm depends on the data and on the size and implementation of the dictionary, whereas the accuracy depends on the Conjugate Gradient solver.

7.4.3 Basis Pursuit De-Noising

Like any real data, seismic signals contain noise. If we decompose the entire noisy data d = s + n with respect to a dictionary D, we also fit the noise. In order to avoid fitting the noise as well as the signal, we need to denoise the data and decompose the denoised data in the dictionary D. In this case, we do not want an exact decomposition of d, but an approximate one. Basis Pursuit De-Noising (BPDN) [2] adapts the original Basis Pursuit to the case of noisy data of the form d = s + n, where s is the signal and n the noise. The denoising problem can be written as

min_c ||d - Φc||_2^2 + λ||c||_1,   (7.17)

where we minimize both the data misfit and the ℓ1 norm of the representation of the signal s, which should be sparse in the dictionary D. BPDN can be rewritten as a perturbed LP [2, 10], and is solved using a Simplex or an Interior-Point method.

7.5 Basis Pursuit versus Matching Pursuit

7.5.1 Results

In this section, I present a comparison between Basis Pursuit and Matching Pursuit with respect to decomposing a signal s in a particular dictionary D. The aim of my numerical experiments was to see whether the method involving the decomposition of the signal s using redundant dictionaries could find the right decomposition c, and hence accurately localize the right waveforms in s. For each dictionary D, I started with a known coefficient vector c, synthesized a signal x = Φc, where Φ was the matrix of D containing the different fractional-orders a, and then decomposed x using BP or MP.
In my experiments I varied the following parameters:

• the size of the dictionaries (number of different fractional orders α);
• the number of nonzero components in the representation c (sparsity of c);
• the distance between two different fractional orders α.

In all my experiments, I considered a signal x of length n = 128. In this section, I only present the results for the decimated fractional spline wavelet dictionary, abbreviated MDWT, as the results are the same for the fractional spline and undecimated fractional spline wavelet dictionaries.

Decimated Fractional Spline Wavelets

In the eight following figures, the top two plots show the superposition of the original signal s and the recovered signal: the top left plot shows the result of BP and the top right plot the result of MP. The two middle plots show the coefficient c found by BP (left) and MP (right). The two bottom plots show the original coefficient c.

1. Coefficient c with only four nonzero components
• Comparison between two dictionaries (small with two α and large with ten α)

Figure 7.2: Analysis of a very sparse coefficient with an MDWT dictionary with 2 α.

Figure 7.2 compares the efficiency of BP with the efficiency of MP for a dictionary containing only two different α (α₁ = 0 and α₂ = 5). The difference in α in D ensures a certain dissimilarity between the atoms, and hence prevents D from being too badly conditioned. As we see in Figure 7.2, both BP and MP recover the right coefficient c, first because there are very few nonzero entries in the original c, and second because the dictionary D is small.
Figure 7.3: Analysis of a very sparse coefficient with an MDWT dictionary with 10 α.

Figure 7.3 compares the efficiency of BP and MP for a larger dictionary D containing ten different α between 0 and 5. Because the different fractional orders are closer to each other than in the previous example, the atoms in D are more similar. As we see in Figure 7.3, BP does not recover exactly the right coefficient c, whereas MP does. The reason is that D is now much larger and c is not sparse enough for D. In fact, for this particular example, BP finds exactly the right coefficient c with a probability strictly less than 1 (empirically found to be around 0.75), and we need fewer than four nonzero entries in c for BP to find exactly the right coefficient with probability 1.

• Comparison between two dictionaries (one with two very different α and one with two similar α)

Figure 7.4: Analysis of a very sparse coefficient with an MDWT dictionary with 2 very different α.

In Figure 7.4, the original coefficient vector c has only four nonzero entries, and both MP and BP find it exactly. In this case, the dictionary D is very small and contains dissimilar atoms. For this specific dictionary, c is sparse enough for BP and MP to find it.

Figure 7.5: Analysis of a very sparse coefficient with an MDWT dictionary with 2 similar α.

In Figure 7.5 however, only MP finds the original coefficient exactly. BP finds the nonzero entries of the original c, but adds another nonzero one. BP has more trouble finding the right coefficient because the atoms in D are much more similar. In this case, the original c is not sparse enough for BP to find it with probability 1.
As in Figure 7.3, BP finds the right coefficient c exactly with a probability strictly less than 1 (empirically found to be around 0.75). The more similar the α, the lower the probability that BP finds the right coefficient c for a given original c with a fixed number of nonzero entries. MP has the same problem, but seems able to recover coefficients that are less sparse than the coefficients BP can recover.

2. Coefficient c with ten nonzero components
• Comparison between two dictionaries (small with two α and large with ten α)

Figure 7.6: Analysis of a less sparse coefficient with an MDWT dictionary with 2 α.

Figure 7.6 shows the efficiency of BP and MP in a small dictionary for a given original coefficient c. In this case, both BP and MP find the original coefficient exactly. The original coefficient, although containing 10 nonzero entries, is sparse enough for MP and BP to find it in this dictionary D.

Figure 7.7: Analysis of a less sparse coefficient with an MDWT dictionary with 10 α.

Unlike Figure 7.6, Figure 7.7 shows that neither BP nor MP exactly finds the right coefficient, although MP seems a little better. In this case, the dictionary is too large for BP and MP to find the original coefficient: for this particular dictionary with ten different α, the original c is not sparse enough to be found exactly. This result is explained by the Spark of the dictionary. The larger D, the smaller its Spark, and hence the sparser c needs to be in order to be found by BP and MP. This condition is given in the equivalence theorem mentioned in section 3.2 of Chapter 3, where the definition of the Spark is also given.
• Comparison between two dictionaries (one with two very different α and one with two similar ones)

Figure 7.8: Analysis of a less sparse coefficient with an MDWT dictionary with 2 close α.

In Figure 7.8, s is decomposed in a small dictionary D with two very similar α (0.5 and 1). When the α are similar, neither BP nor MP recovers exactly the right coefficient. In fact, both recover the initial coefficient with a probability strictly less than 1: the original coefficient c is not sparse enough.

Figure 7.9: Analysis of a less sparse coefficient with an MDWT dictionary with 2 different α.

In Figure 7.9, s is decomposed in a small dictionary D with two very different α (0 and 5). When the α are very different, BP and MP can recover the right coefficient; BP does not recover it exactly, but comes very close.

3. Coefficient c with two very close components

Figure 7.10: Analysis of two close components in an MDWT dictionary with two different α.

Figures 7.10 and 7.11 compare the efficiency of BP and MP in resolving two close components in c. In Figure 7.10, the dictionary D is small and contains only two different α. In this case, both BP and MP recover exactly the right coefficient c, no matter how close the components are.

Figure 7.11: Analysis of two close components in an MDWT dictionary with ten different α.

In Figure 7.11, D is large and contains ten different α between 0 and 5.
In this case, BP finds exactly the right coefficient, but MP cannot resolve the two close components. In small enough dictionaries MP can resolve two close components, but it fails to do so in larger ones. BP is more sensitive to the number of nonzero entries than to how close the events in the original coefficient c are.

7.5.2 Discussion

I showed some results for dictionaries made of decimated fractional spline wavelets, and obtained similar results for fractional spline and undecimated fractional spline wavelet dictionaries. The results revealed the importance of the size of the dictionary (its redundancy), the orders α, and the number of nonzero entries in the coefficient c. There is a trade-off between the redundancy of the dictionary D and the sparsity of c. For a given dictionary D, there is a maximum number of nonzero components in c that guarantees its being the unique solution of (P₁) (and hence (P₀)). The definitions of the (P₀) and (P₁) problems are given in section 3.1.1 of Chapter 3. For our type of dictionaries (fractional B-splines, decimated and undecimated fractional spline wavelets), the results are as follows:

1. If D is small (e.g. made of only 2 different α) and the α very different, the maximum number of nonzero entries in c is around 10.

2. If D is small (e.g. made of only 2 different α) and the α more similar, the maximum number of nonzero entries in c is close to 10. BP and MP find the correct coefficient with high probability.

3. If D is large (e.g. made of 10 different α or more), the maximum number of nonzero entries in c is very small (around 1 or 2, depending on D).

From these results, we can infer two major problems:

1. Seismic data is made of too many different events for our type of dictionaries.

2. Seismic data contains a wide variety of fractional orders, which requires large dictionaries.

We therefore cannot rely on BP or MP to decompose and find the fractional orders α of a seismic trace.
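The sparsity limits above can be tied to the dictionary's conditioning through a cheap proxy for the Spark: the mutual coherence μ(Φ), for which the classical result [8] guarantees uniqueness of a k-sparse solution whenever k < (1 + 1/μ)/2. A hedged check on a toy random dictionary (not the MDWT one) illustrates how the guaranteed sparsity level shrinks with redundancy:

```python
import numpy as np

def mutual_coherence(Phi):
    """Largest absolute inner product between two distinct unit-norm atoms."""
    G = Phi.T @ Phi
    np.fill_diagonal(G, 0.0)
    return float(np.max(np.abs(G)))

rng = np.random.default_rng(1)

# One random dictionary; nested sub-dictionaries of growing redundancy.
Phi = rng.standard_normal((16, 256))
Phi /= np.linalg.norm(Phi, axis=0)
mus = [mutual_coherence(Phi[:, :m]) for m in (16, 64, 256)]

# Coherence bound: a k-sparse solution is unique if k < (1 + 1/mu)/2.
k_max = [0.5 * (1.0 + 1.0 / mu) for mu in mus]
print(mus, k_max)
```

By construction the coherence is nondecreasing as atoms are added, so the guaranteed sparsity level drops as the dictionary grows, which is consistent with the empirical limits listed above.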
Another way of tackling the problem is to consider s as a sum of different signals s_i, each having a different fractional order α_i. Using the Block Coordinate Relaxation Method (BCR), which is an iterative algorithm, we hope to be able to separate the different signals.

7.6 Block Coordinate Relaxation Method

7.6.1 Problem statement

We consider the seismic signal s as

s = s_{α1} + ... + s_{αn} + r,

where s_{αi} is an αi-th order signal, and r is the residual containing the rest of the orders of s that we did not include in our model s_{α1} + ... + s_{αn}. The residual r can also contain the noise if we work with noisy data. Given s, we would like to find its different components s_{αi}, and in particular their associated orders αi. As in BP, we will try to find the representation c of the signal s with respect to an overcomplete dictionary D made of waveforms parameterized by their fractional orders α. Examples of such dictionaries include fractional B-spline, and decimated and undecimated fractional spline wavelet dictionaries. We consider a dictionary D as

D = [D_{α1}, ..., D_{αn}],

and its matrix Φ = [Φ_{α1}, ..., Φ_{αn}], where Φ_{αi} is the matrix of D_{αi} containing the different translations of the αi-th order waveforms. A representation c in D is of the form c = [c_{α1}, ..., c_{αn}]. Following these notations, we can write our model as

s = Φc + r. (7.18)

In the rest of the paper, we will write s_{αi} = s_i, c_{αi} = c_i and Φ_{αi} = Φ_i. Our main assumption is that the representation of each s_i is sparse in the dictionary D_i, so that we only need a few atoms in D_i to build s_i. This is the same idea as in BP, because we would like each αi-th order event of the signal s to be described with as few atoms as possible. Ideally, each event would be described by only one atom, and hence the order αi of this event could be read off its associated atom. We propose to solve the following minimization problem:

min (β_1||c_1||_1 + ... + β_n||c_n||_1 + ||s − Φc||_2^2), (7.19)

where β_i > 0, for 1 ≤ i ≤ n.
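The objective (7.19) can be written down directly; a minimal sketch, with generic block matrices Φ_i as toy stand-ins for the fractional spline dictionaries:

```python
import numpy as np

def bcr_objective(s, Phis, cs, betas):
    """Objective (7.19): sum_i beta_i*||c_i||_1 + ||s - sum_i Phi_i c_i||_2^2."""
    misfit = s - sum(Phi @ c for Phi, c in zip(Phis, cs))
    penalty = sum(b * np.abs(c).sum() for b, c in zip(betas, cs))
    return penalty + misfit @ misfit

# Toy check with two orthonormal blocks (identity matrices): the data is
# explained exactly, so only the weighted l1 penalties contribute.
s = np.array([1.0, 0.0, 2.0, 0.0])
Phis = [np.eye(4), np.eye(4)]
cs = [np.array([1.0, 0.0, 0.0, 0.0]), np.array([0.0, 0.0, 2.0, 0.0])]
val = bcr_objective(s, Phis, cs, betas=[0.5, 0.5])
print(val)  # 0.5*1 + 0.5*2 + 0 = 1.5
```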
The parameters β_i are trade-off parameters between the different ℓ¹ norms and the data misfit.

1. The first part of the functional minimizes the ℓ¹ norm of the representation of each signal s_i in the corresponding dictionary D_i.

2. The second part minimizes the data misfit. This second part relaxes the constraint: if some content of the signal s is not represented sparsely by the dictionaries D_i, the above formulation will tend to allocate this content to the residual [30].

Remark: What makes the minimization (7.19) succeed in separating the different signals s_i is that each dictionary D_i is very good at representing s_i and bad at representing the other s_j, for j ≠ i. The better this assumption is verified, the more accurate the algorithm.

7.6.2 Algorithm

I solved the minimization (7.19) using a slightly modified version of the Block Coordinate Relaxation Method (BCR) implemented by Starck et al. [30]. If Φ_i*Φ_i = I, the minimization can be solved by Soft Thresholding [30]. If we fix every coefficient vector c_i except c_{i0}, we can solve for c_{i0}:

min β_{i0}||c_{i0}||_1 + ||S_{i0} − Φ_{i0}c_{i0}||_2^2, with S_{i0} = s − Σ_{i≠i0} Φ_i c_i. (7.20)

If Φ_{i0}*Φ_{i0} = I, (7.20) is equivalent to

min β_{i0}||c_{i0}||_1 + ||Φ_{i0}*S_{i0} − c_{i0}||_2^2, (7.21)

which can be written as

Σ_{k=1}^{N} ( β_{i0}|c_{i0}(k)| + (c_{i0}(k) − z_{i0}(k))^2 ),

where z_{i0}(k) is the k-th component of Φ_{i0}*S_{i0}. This is equivalent to N scalar, convex and independent minimization problems that are solved by Soft Thresholding applied componentwise to c_{i0} [4]. The minimization can still be solved by Soft Thresholding [29] when the matrix is a concatenation of matrices A_i that satisfy A_i*A_i = I. See Appendix C for the derivation of the soft thresholding solution. The undecimated orthonormal wavelet transform consists of all shifted versions of the decimated orthonormal wavelet transform.
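The componentwise solution of these scalar problems is the soft-thresholding operator (cf. [4]); a minimal numerical check of the closed form, where soft(z, t) minimizes t|c| + (1/2)(c − z)², so that for the form (7.21) the threshold is β_{i0}/2:

```python
import numpy as np

def soft_threshold(z, t):
    """Closed-form minimizer of t*|c| + 0.5*(c - z)**2, applied componentwise."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# Brute-force check of the closed form against a fine grid search.
z, t = 1.3, 0.5
grid = np.linspace(-3.0, 3.0, 60001)
brute = grid[np.argmin(t * np.abs(grid) + 0.5 * (grid - z) ** 2)]
print(soft_threshold(z, t), brute)  # both close to 0.8
```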
The matrix of the decimated orthonormal wavelet transform is unitary, and the matrix of the undecimated wavelet transform is a concatenation (modulo a reordering) of unitary matrices. We can hence use Soft Thresholding to solve the minimization problem when using undecimated orthonormal wavelet transforms.

7.6.3 Numerical scheme

In Figure 7.12, I present the basic numerical scheme for separating two signals s_1 and s_2, assuming s is of the form s = s_1 + s_2 + r, where s_1 is an α_1-th order signal, s_2 is an α_2-th order signal, and r is the residual, which can contain other α_j-th order signals and/or noise. The dictionary we use can be written as Φ = [Φ_1, Φ_2], where Φ_1 is adapted to s_1 and Φ_2 to s_2. The algorithm given in Figure 7.12 is for two signals, but I programmed a general algorithm for any number of signals and any dictionaries.

Initialize L_max, the number of iterations, and the thresholds δ_1 = β_1·L_max and δ_2 = β_2·L_max.
Perform L_max times:
  Part A: update c_2, assuming c_1 fixed
    • Synthesize the signal s_1 = Φ_1 c_1^old
    • Calculate the residual R_2 = s − s_1
    • Soft threshold Φ_2*(R_2) with δ_2, i.e. solve the minimization problem
        min β_2||c_2||_1 + ||R_2 − Φ_2 c_2||_2^2
    • Update the threshold δ_2 = δ_2 − β_2
  Part B: update c_1, assuming c_2 fixed
    • Synthesize the signal s_2 = Φ_2 c_2
    • Calculate the residual R_1 = s − s_2
    • Soft threshold Φ_1*(R_1) with δ_1, i.e. solve the minimization problem
        min β_1||c_1||_1 + ||R_1 − Φ_1 c_1||_2^2
    • Update the threshold δ_1 = δ_1 − β_1
end

Figure 7.12: Block Coordinate Relaxation (BCR) Algorithm

7.6.4 Simple examples

I applied the BCR algorithm to two examples: one that I made myself and one taken from Starck et al. [31].
In both examples we deal with data d of the form d = f_1 + f_2 + n, where f_1 and f_2 are two different signals, and n is the noise.

1. In the first example, f_1 is a cosine function and f_2 is a step function. I used the Discrete Cosine Transform as the dictionary Φ_1 adapted to f_1 and the Haar wavelet basis as the dictionary Φ_2 adapted to f_2. The results are shown in Figure 7.13.

Figure 7.13: BCR algorithm: separating the noise, a cosine function and a step function.

In Figure 7.13, the top plot shows the original cosine function (blue line) and the recovered cosine function (red line), the second plot shows the original step function (blue line) and the recovered step function (red line), the third plot displays the sum of the original signals (blue line) superimposed with the sum of the recovered signals (red line), and the bottom plot shows the residual.

2. In Starck's example [30], f_1 is a cosine function and f_2 consists of bumps (Gaussians). I used the Discrete Cosine Transform as the dictionary Φ_1 and the à trous wavelet basis as the dictionary Φ_2. The results are shown in Figure 7.14.

Figure 7.14: BCR algorithm: separating the noise, a cosine function and bumps.

In Figure 7.14, the top plot shows the original cosine function (blue line) and the recovered cosine function (red line), the second plot shows the original bumps (blue line) and the recovered bumps (red line), the third plot displays the sum of the original signals (blue line) superimposed with the sum of the recovered signals (red line), and the bottom plot shows the residual.
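A runnable sketch of this separation idea, under the unitary-dictionary assumption of the algorithm, but using two toy orthonormal dictionaries (spikes and the DCT) instead of the Haar and à trous bases of the figures:

```python
import numpy as np
from scipy.fft import dct, idct

n = 128

# Content type 1: an oscillation, sparse in the orthonormal DCT dictionary.
c1_true = np.zeros(n)
c1_true[7] = 5.0
s1 = idct(c1_true, norm="ortho")

# Content type 2: a few spikes, sparse in the identity dictionary.
s2 = np.zeros(n)
s2[[20, 70, 100]] = [3.0, -3.0, 2.0]

s = s1 + s2   # observed signal

# BCR loop (cf. Figure 7.12) with a linearly decreasing threshold.
soft = lambda z, t: np.sign(z) * np.maximum(np.abs(z) - t, 0.0)
L = 100
delta0 = max(np.abs(dct(s, norm="ortho")).max(), np.abs(s).max())
c1 = np.zeros(n)
c2 = np.zeros(n)
for i in range(L):
    t = delta0 * (1.0 - i / L)
    c1 = soft(dct(s - c2, norm="ortho"), t)   # update c1, c2 fixed
    c2 = soft(s - idct(c1, norm="ortho"), t)  # update c2, c1 fixed

err1 = np.linalg.norm(idct(c1, norm="ortho") - s1) / np.linalg.norm(s1)
err2 = np.linalg.norm(c2 - s2) / np.linalg.norm(s2)
print(err1, err2)  # both should be small
```

The separation works here because each component is large in its own dictionary and small in the other's, which is exactly the dissimilarity assumption discussed in the text.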
7.6.5 Multiple Block Coordinate Relaxation Method

The multiple BCR iteratively performs a BCR decomposition on each of the separated signals resulting from a previous BCR decomposition. The idea is to run the BCR iteratively on the different signal contents until we separate them. I empirically noticed that running the BCR more than three times does not really improve the results; I therefore limited the iterations to three.

7.6.6 Results

I present here some results on the performance of the BCR algorithm for the decimated fractional spline wavelet dictionary. The results are similar for fractional B-spline and undecimated fractional spline wavelet dictionaries. I tested the BCR algorithm on different settings and different synthetic seismic signals that I generated randomly in order to avoid the creation of patterns. The experiments test the following cases:

1. The analyzing dictionary D exactly contains the waveforms in s
2. The analyzing dictionary D contains the waveforms in s as well as other ones
3. The analyzing dictionary D does not contain the waveforms in s

The synthetic signal s was built like a real seismic signal by randomly building its different coefficients. By hypothesis, s is a summation of signals s_i with different fractional orders α_i. In order to construct s, I created each coefficient c_i with random positions and numbers of nonzero entries, and synthesized s_i by computing s_i = Φ_i c_i, where Φ_i is the matrix of the dictionary D_i associated with the order α_i. Given s, the BCR algorithm tries to separate the different content types of s in the analyzing dictionary D.

1. The analyzing dictionary D exactly contains the waveforms in s
• Dictionary D made of 4 different α between 0 and 5
Figure 7.15: Original and recovered signals.

In this example, s is synthesized and analyzed with the same dictionary D, and the coefficient c of the signal s in D has 27 nonzero entries. Figure 7.15 shows that the different content types in s are successfully separated and the coefficients exactly recovered. With 27 nonzero entries in c, the BCR algorithm is able to separate the different content types in s and finds the right coefficient. This is an improvement over BP and MP.

Figure 7.16: Original and recovered coefficients.

Figure 7.16 shows the superposition of the original coefficients and the coefficients c_i found by the BCR algorithm. The original coefficient has 27 nonzero entries. The different coefficients of the different content types are perfectly recovered by the algorithm.

• Dictionary made of 10 different α between 0 and 1
Figure 7.17: Five first original and recovered signals. Figure 7.18: Five last original and recovered signals.

The original coefficient c has 33 nonzero entries. In Figures 7.17 and 7.18, we can see that the separation of the different signals is not very successful. Because the dictionary is large and the waveforms very similar (the fractional orders are much closer), the sub-dictionaries become more similar. In particular, a dictionary Φ_i is not bad at representing the other signals s_j for j ≠ i, which is why the separation is not perfect. The more similar the dictionaries, the less the BCR algorithm is able to separate the different content types of s.

Figure 7.19: Five first original and recovered coefficients. Figure 7.20: Five last original and recovered coefficients.

Figures 7.19 and 7.20 illustrate the performance of the BCR algorithm on the coefficients.
2. The analyzing dictionary D contains the waveforms in s, as well as other ones
• Synthesizing dictionary made of two α between 0 and 1 and analyzing dictionary made of four α between 0 and 3

Figure 7.21: Original and recovered signals.

In this example, the synthetic signal s is only made of two different content types, one with order 0 and one with order 1, and the original coefficient has only five nonzero entries. The analyzing dictionary is made of four different α (0, 1, 2 and 3), so s is decomposed into four signals with orders 0, 1, 2 and 3 respectively. Figure 7.21 shows the original signals in blue (top two plots) and the separated ones in dashed red (in the four plots). The separation is perfect.

Figure 7.22: Original and recovered coefficients.

Figure 7.22 shows the original and recovered coefficients of the different signals, in the same order as for the signals shown in Figure 7.21. They are perfectly recovered.
• Synthesizing dictionary made of two α between 0 and 1 and analyzing dictionary made of four α between 0 and 1

Figure 7.23: Original and recovered signals.

In this case, a similar synthetic signal s with four events is decomposed with respect to a different analyzing dictionary. The analyzing dictionary contains four different fractional orders α ranging from 0 to 1 (0, 1/3, 2/3 and 1). Figure 7.23 shows the original signals in blue (top and bottom plots) and the separated ones in dashed red (in the four plots). The separation is not perfect because the waveforms in D are too similar. However, the separation is still quite good, as the amplitudes of the signals with orders 1/3 and 2/3 are very small (these amplitudes should in theory be zero). The non-optimal choice of D for the BCR algorithm affects the performance of the separation.

Figure 7.24: Original and recovered coefficients.

Figure 7.24 shows the coefficients of the different signals.

3. The analyzing dictionary D does not contain the waveforms in s
• Synthesizing dictionary made of five α between 0 and 3 and analyzing dictionary made of thirty α between 0 and 3

Figure 7.25: Original signals. Figure 7.26: Recovered signals.

Figures 7.25 and 7.26 illustrate the decomposition of a signal s containing 30 events, and show the original signals and the recovered ones. Figure 7.26 shows that s is separated into many more content types than it actually has, which implies that the events are not found appropriately.
Not only can wrong events be selected, but new events can be created in the reconstructed signals.

Figure 7.27: Original coefficients. Figure 7.28: Recovered coefficients.

Figures 7.27 and 7.28 show the same results for the coefficients.

Figure 7.29: BCR separation with decimated fractional spline wavelets.

Figure 7.29 shows the decomposition of the signal s with respect to a large dictionary whose fractional orders are different from the fractional orders contained in s. It shows the superposition of the original content types of s and the reconstructed ones with the closest fractional order. The four top plots show the original signals (in blue) and the recovered signals (in red), and the bottom plot shows the residual. Except for the zero-order signal (first plot), which is well recovered, and the third-order signal (fourth plot), which is partially recovered, the other content types of s are not.
The residual seems to contain most of the events. The failure of the decomposition is most probably due to the size of the analyzing dictionary I used, as well as the fact that the waveforms in s were not in the analyzing dictionary. The analyzing dictionary was fairly large and contained very similar waveforms, which reduced the efficiency of the BCR algorithm.

7.6.7 Discussion

The above results show the efficiency of the BCR algorithm in separating a signal s into different content types described by their singularity orders α_i. As explained earlier in this paper, the BCR algorithm can be very efficient if each dictionary used for each content type is very good at representing that content type and bad at representing the other ones. In that sense, the more similar the atoms in the dictionaries, the less efficient the BCR algorithm. The decomposition we envision is badly conditioned, because the dictionaries we need for seismic signals contain similar atoms. However, in certain cases, the BCR algorithm succeeds in separating s into the correct content types.

First, the decrease of the parameters δ_i in the BCR algorithm presented in Figure 7.12 is of great importance. The slower the decrease, the better the separation, because the algorithm then has time to distinguish between the various content types. The BCR algorithm looks at the successive inner products between the signal s and the waveforms φ_i in Φ, and puts the largest inner products (larger than the threshold δ_i) into the content type of order α_i. If δ_i decreases too fast, a large number of inner products corresponding to various orders will be larger than δ_i, and thus will wrongly be put into the content type of order α_i. There is also a concern about the dictionary.
If the dictionary contains waveforms that are too similar, or if the dictionary is too large (which also produces very similar waveforms), it is more difficult to distinguish between the different inner products, because they have similar values. One solution could be to make the parameters δ_i decrease very slowly, but that would make the algorithm prohibitively slow, and moreover would only work optimally if the analyzing and synthesizing dictionaries were the same. There is also a trade-off between the size of the dictionary and the fractional orders α that we choose in D. The number of nonzero entries in the original coefficient c is not important here, because it is the dissimilarity between the different content types of s, and the adequate choice of the dictionaries Φ_i, that make the BCR algorithm succeed.

From these studies on the performance of the BCR algorithm, we can conclude that if the analyzing dictionary D consists of waveforms that are in s, the BCR algorithm will succeed or partially succeed in separating s into the correct content types, provided D is not too large and the fractional orders in D are not too close. However, when the dictionary contains atoms that are not in s, the BCR algorithm starts having trouble finding the correct decomposition. The last issue with the BCR is its computational cost. Depending on the dictionary and its implementation, the normalization of the dictionary can become computationally intensive, and the BCR algorithm prohibitively slow. This is for example the case for the undecimated fractional spline wavelet dictionary.

7.7 Conclusion

The studies of MP, BP and the BCR algorithm show that it is very difficult to estimate the fractional orders of seismic signals by computing their representation in an overcomplete dictionary parameterized by the fractional orders α. These algorithms work only for very particular coefficients and well-conditioned dictionaries.
As soon as the dictionary gets too large, the waveforms too similar, or the coefficient vector not sparse enough, none of the above algorithms is able to find the right representation c. Real data falls under these conditions, as we need a large dictionary (and thus one with similar waveforms) to find the fractional-orders contained in s. The estimated fractional-orders are limited to the fractional-orders contained in the dictionary, and the maximal error between the fractional-orders α found and the real ones in s is given by the largest distance between two consecutive fractional-orders contained in D. To avoid large errors, it is therefore necessary to decompose s into a large dictionary containing many different fractional-orders, which triggers the failure of MP, BP and the BCR algorithm. The representation c of seismic signals is not sparse enough for the type of dictionary we need.

Bibliography

[1] T. Blu and M. Unser. A complete family of scaling functions: the (α, τ) fractional splines. In Proceedings of the Twenty-Eighth IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'03), volume 6, pages 505-508, Hong Kong SAR, People's Republic of China, April 6-10 2003.
[2] Scott Shaobing Chen, David L. Donoho, and Michael A. Saunders. Atomic decomposition by basis pursuit. SIAM Journal on Scientific Computing, 20(1):33-61, 1999.
[3] I. Daubechies. Time-frequency localization operators: a geometric phase approach. IEEE Transactions on Information Theory, 34:605-612, 1988.
[4] I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Comm. Pure Appl. Math., 57:1413-1457, 2004.
[5] M. V. de Hoop and N. Bleistein. Generalized Radon transform inversions for reflectivity in anisotropic elastic media. Inverse Probl., 13(3):669-690, 1997.
[6] F. J. Dessing, E. V. Hoekstra, F. J. Herrmann, and C. P. A. Wapenaar.
Multiscale edge detection by means of multiscale migration, pages 459-462. Soc. of Expl. Geophys., 1996.
[7] F. J. Dessing. A Wavelet Transform Approach to Seismic Processing. PhD thesis, Delft University of Technology, 1997.
[8] D. L. Donoho and M. Elad. Optimally sparse representation in general (non-orthogonal) dictionaries via l1 minimization. Proc. Nat. Acad. Sci., 100:2197-2202, 2003.
[9] R. Ghose and J. Goudswaard. Integrating S-wave seismic-reflection data and cone-penetration-test data using a multiangle multiscale approach. In Geophysics, volume 69, pages 440-459. Soc. of Expl. Geophys., 2004.
[10] P. E. Gill, W. Murray, D. B. Ponceleon, and M. A. Saunders. Solving the KKT systems in barrier methods for linear and quadratic programming. Technical Report SOL 91-7, July 1991.
[11] J. Goudswaard, M. Dillen, and C. Wapenaar. Multiangle processing and multiscale characterization of walkaway VSP data, pages 178-181. Soc. of Expl. Geophys., 2000.
[12] J. C. M. Goudswaard and C. P. A. Wapenaar. Resolving well log singularities from seismic data, pages 150-153. Soc. of Expl. Geophys., 1997.
[13] J. C. Harms and P. Tackenberg. Seismic signatures of sedimentation models. Geophysics, 37(1):45-58, February 1972.
[14] Felix J. Herrmann. Multiscale analysis of well- and seismic data. Mathematical Methods in Geophysical Imaging V, SPIE, 3453:180-208, 1998.
[15] Felix J. Herrmann. Fractional Spline Matching Pursuit: a quantitative tool for seismic stratigraphy. In Soc. Expl. Geoph., 2001.
[16] Felix J. Herrmann. Multi- and monoscale attributes for well- and seismic data. Chapter in the annual report of the Borehole Acoustics and Logging and Reservoir Delineation Consortia at MIT, 2001.
[17] Felix J. Herrmann. Singularity Characterization by Monoscale Analysis: Applications to Seismic Imaging. Applied and Computational Harmonic Analysis, 11:64-88, 2001.
[18] Felix J. Herrmann.
Seismic deconvolution by atomic decomposition: a parametric approach with sparseness constraints. Integr. Computer-Aided Eng., 2005.
[19] Felix J. Herrmann and Yves Bernabe. Seismic singularities at upper-mantle discontinuities: a site percolation model. Geoph. J. Int., 2004.
[20] Felix J. Herrmann, William J. Lyons, and Colin Stark. Seismic facies characterization by monoscale analysis. Geophysical Research Letters, 28(19):3781-3784, October 2001.
[21] Felix J. Herrmann and Colin Stark. Monoscale analysis of edges/reflectors using fractional differentiations/integrations. In Soc. Expl. Geoph., 1999.
[22] F. J. Herrmann. A Scaling Medium Representation, a Discussion on Well-Logs, Fractals and Waves. PhD thesis, Delft University of Technology, 1997.
[23] C. Liner, C.-F. Li, A. Gersztenkorn, and J. Smythe. SPICE: A new general seismic attribute, pages 433-436. Soc. of Expl. Geophys., 2004.
[24] J. Liouville. Sur une formule pour les différentielles à indices quelconques à l'occasion d'un mémoire de M. Tortolini. J. Math. Pures Appl., unnumbered (in French), 1855.
[25] S. Mallat. A Wavelet Tour of Signal Processing. 1997.
[26] S. Mallat and Z. Zhang. Matching pursuit in a time-frequency dictionary. IEEE Transactions on Signal Processing, pages 3397-3415, 1993.
[27] J. Muller, I. Bokn, and J. L. McCauley. Multifractal analysis of petrophysical data. Ann. Geophysicae, 10:735-761, 1992.
[28] C. Payton. Seismic stratigraphy - applications to hydrocarbon exploration. AAPG, 1977.
[29] S. Sardy, A. G. Bruce, and P. Tseng. Block Coordinate Relaxation Methods for Nonparametric Wavelet Denoising. Journal of Computational and Graphical Statistics, 9, 2000.
[30] J. L. Starck, M. Elad, and D. L. Donoho. Image Decomposition Via the Combination of Sparse Representation and a Variational Approach. IEEE Transactions on Image Processing.
[31] J. L. Starck, M. Elad, and D. L. Donoho.
Redundant Multiscale Transforms and Their Application for Morphological Component Analysis. Advances in Imaging and Electron Physics, 132, 2004.
[32] M. T. Taner, F. Koehler, and R. E. Sheriff. Complex seismic trace analysis. Geophysics, 44(6):1041-1063, June 1979.
[33] M. Unser and T. Blu. Fractional Splines and Wavelets. SIAM Review, 42:43-67, 2000.
[34] C. P. A. Wapenaar, J. Goudswaard, and A. J. van Wijngaarden. Multiangle, multiscale inversion of migrated seismic data. The Leading Edge, 18(8):928-932, 1999.
[35] C. P. A. Wapenaar. Amplitude-variations-with-angle behavior of self-similar interfaces. Geophysics, 64:1928, 1999.
[36] Yves Meyer. Ondelettes et algorithmes concurrents, volume 1435 of Actualités Scientifiques et Industrielles [Current Scientific and Industrial Topics]. Hermann, Paris, 1992.

Appendix A

Wavelets

A.1 Overview

Definition A.1.1 (Wavelet, [25]) A wavelet is a function ψ ∈ L²(ℝ) of zero average,

∫_ℝ ψ(t) dt = 0,

which is dilated with a scale parameter a > 0 and translated by b ∈ ℝ:

ψ_(a,b)(t) = (1/√a) ψ((t − b)/a).

Definition A.1.2 (Continuous Wavelet Transform, [25]) The wavelet transform of a function f ∈ L²(ℝ) at the scale a and position b, Wf(a, b), is computed by correlating f with the wavelet at position b and scale a:

Wf(a, b) = ∫_{−∞}^{+∞} f(t) ψ*_(a,b)(t) dt.

Any signal f ∈ L²(ℝ) can be written as a continuous summation of wavelets:

∀ t ∈ ℝ, f(t) = K ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} Wf(a, b) ψ_(a,b)(t) (da db)/a²,

where K is a constant. This representation is very redundant, and the coefficients Wf(a, b) are highly correlated. In order to reduce this redundancy, we can limit ourselves to dyadic wavelets and hence obtain a basis of L²(ℝ).

Definition A.1.3 (Orthonormal Wavelet Bases, [26]) The dyadic wavelets

ψ_(j,k)(t) = 2^(−j/2) ψ(2^(−j) t − k), (j, k) ∈ ℤ²,

generate an orthonormal basis of L²(ℝ).
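In the discrete setting, the simplest instance of such an orthonormal wavelet basis is the Haar wavelet, for which the dyadic dilations and translations can be assembled explicitly and checked for orthonormality. The following Python/NumPy sketch is an illustration added here (it is not part of the thesis software); it builds the discrete Haar basis recursively.

```python
import numpy as np

def haar_matrix(n):
    # Discrete Haar basis for R^n (n a power of two): one scaling row plus
    # dyadic dilations/translations of the Haar wavelet, each unit-norm.
    if n == 1:
        return np.array([[1.0]])
    H = haar_matrix(n // 2)
    top = np.kron(H, [1.0, 1.0])                    # coarser scales, refined
    bottom = np.kron(np.eye(n // 2), [1.0, -1.0])   # finest-scale wavelets
    W = np.vstack([top, bottom])
    return W / np.linalg.norm(W, axis=1, keepdims=True)

W = haar_matrix(8)
assert np.allclose(W @ W.T, np.eye(8))      # rows form an orthonormal basis
assert np.allclose(W[1:].sum(axis=1), 0.0)  # wavelet rows have zero average
```

Because the rows are orthonormal, the wavelet coefficients of a discrete signal are plain inner products, exactly as in the continuous statement below.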
Because the basis is orthonormal, the wavelet coefficients Wf(j, k) of a function f ∈ L²(ℝ) are simply computed by taking the inner product between f and each wavelet:

Wf(j, k) = ⟨f, ψ_(j,k)⟩ = ∫_{−∞}^{+∞} f(t) ψ*_(j,k)(t) dt.

Any signal f can be represented uniquely as

∀ t ∈ ℝ, f(t) = Σ_{j=−∞}^{+∞} Σ_{k=−∞}^{+∞} ⟨f, ψ_(j,k)⟩ ψ_(j,k)(t).

Appendix B

Bases

B.1 Fractional Splines

Proposition B.1.1 Let α = 0, and let φ_0 = φ_(α=0) be a vector of ℝ^N such that

∀ 1 ≤ n ≤ N, φ_0(n) = 1, and ∀ n ≤ 0, φ_0(n) = 0.

Define

D_0 = {v_k^0 ∈ ℝ^N, 1 ≤ k ≤ N, s.t. ∀ 1 ≤ n ≤ N, v_k^0(n) = φ_0(n − k + 1)}.

D_0 is a basis of ℝ^N.

PROOF: D_0 consists of N vectors of ℝ^N, so we only need to show that the vectors of D_0 are linearly independent, or that D_0 can represent any vector of ℝ^N. We show that the vectors v_k^0 are linearly independent. Suppose there exist N numbers β_1, ..., β_N such that

Σ_{i=1}^N β_i v_i^0 = 0. (B.1)

Equation (B.1) is equivalent to

∀ 1 ≤ n ≤ N, Σ_{i=1}^N β_i v_i^0(n) = 0 (B.2)

⟺ ∀ 1 ≤ n ≤ N, Σ_{i=1}^N β_i φ_0(n − i + 1) = 0. (B.3)

We prove recursively that all the β_i are zero. Define the following property for an integer p, 1 ≤ p ≤ N:

∀ 1 ≤ j ≤ p, β_j = 0. (B.4)

1. Let's show that this property is true for p = 1. If we evaluate equation (B.3) at n = 1, we obtain

Σ_{i=1}^N β_i φ_0(2 − i) = 0, (B.5)

which is equivalent to

β_1 φ_0(1) = 0, (B.6)

as φ_0(2 − i) is only nonzero for i = 1. As φ_0(1) = 1, we deduce that β_1 = 0, which is property (B.4) for p = 1.

2. Now, we show that if property (B.4) is satisfied for an integer p between 1 and N − 1, it is also satisfied for p + 1. Let p be such an integer, and suppose that property (B.4) is true for p: ∀ 1 ≤ j ≤ p, β_j = 0. We have

Σ_{i=p+1}^N β_i v_i^0 = 0 ⟺ ∀ 1 ≤ n ≤ N, Σ_{i=p+1}^N β_i φ_0(n − i + 1) = 0. (B.7)

If we evaluate equation (B.7) at n = p + 1, we obtain

Σ_{i=p+1}^N β_i φ_0(p − i + 2) = 0, (B.8)

which is equivalent to

β_(p+1) φ_0(1) = 0, (B.9)

as φ_0(p − i + 2) is only nonzero for i = p + 1. As φ_0(1) = 1, we deduce that β_(p+1) = 0, which is property (B.4) for p + 1.

3.
Property (B.4) is true for p = 1. If it is true for p, it is also true for p + 1, so it is true for all 1 ≤ n ≤ N, i.e.

∀ 1 ≤ n ≤ N, β_n = 0.

Conclusion: the v_k^0, for 1 ≤ k ≤ N, are linearly independent, and D_0 is a basis of ℝ^N.

Proposition B.1.2 Let α > 0, and let φ_α be a vector of ℝ^N such that

∀ 1 ≤ n ≤ N, φ_α(n) = n^α, and ∀ n ≤ 0, φ_α(n) = 1.

Define

D_α = {v_k^α ∈ ℝ^N, 1 ≤ k ≤ N, s.t. ∀ 1 ≤ n ≤ N, v_k^α(n) = φ_α(n − k + 1)}.

D_α is a basis of ℝ^N.

PROOF: This proof is very similar to the previous one. D_α has N vectors in ℝ^N, so it is sufficient to prove that the vectors v_k^α are linearly independent. Suppose there exist N numbers β_1, ..., β_N such that

Σ_{i=1}^N β_i v_i^α = 0 (B.10)

⟺ ∀ 1 ≤ n ≤ N, Σ_{i=1}^N β_i v_i^α(n) = 0 (B.11)

⟺ ∀ 1 ≤ n ≤ N, Σ_{i=1}^N β_i φ_α(n − i + 1) = 0. (B.12)

If we evaluate equation (B.12) at n = 1, n = 2, ..., n = N, we obtain the following N equations (using φ_α(1) = 1 and φ_α(n) = 1 for n ≤ 0):

Σ_{i=1}^N β_i = 0 (B.13a)

φ_α(2)β_1 + Σ_{i=2}^N β_i = 0 (B.13b)

φ_α(3)β_1 + φ_α(2)β_2 + Σ_{i=3}^N β_i = 0 (B.13c)

...

φ_α(N)β_1 + φ_α(N − 1)β_2 + ... + φ_α(2)β_(N−1) + β_N = 0. (B.13d)

We again use a recursive argument based on property (B.4).

1. Let's show that property (B.4) is true for p = 1. Combining equations (B.13a) and (B.13b), we obtain (φ_α(2) − 1)β_1 = 0, which implies β_1 = 0 because φ_α(2) ≠ 1. So property (B.4) is true for p = 1.

2. Now, let p be an integer, 1 ≤ p < N, and suppose that property (B.4) is true for p:

∀ 1 ≤ j ≤ p, β_j = 0.

With this property, the set of N equations becomes

Σ_{i=p+1}^N β_i = 0 (B.14a)

φ_α(2)β_(p+1) + Σ_{i=p+2}^N β_i = 0 (B.14b)

φ_α(3)β_(p+1) + φ_α(2)β_(p+2) + Σ_{i=p+3}^N β_i = 0 (B.14c)

...

φ_α(N − p)β_(p+1) + ... + φ_α(2)β_(N−1) + β_N = 0. (B.14d)

As for p = 1, combining equations (B.14a) and (B.14b) gives us (φ_α(2) − 1)β_(p+1) = 0, which implies that β_(p+1) = 0 because φ_α(2) = 2^α ≠ 1. So property (B.4) is true for p + 1.

3.
Property (B.4) is true for p = 1. If it is true for p, it is also true for p + 1, so it is true for all 1 ≤ n ≤ N.

Conclusion: the v_k^α, for 1 ≤ k ≤ N, are linearly independent, and D_α is a basis of ℝ^N.

B.2 Fractional Derivatives of the Gaussian

Let α be a positive fractional number, and let g_α be the α-th derivative of the Gaussian. A Gaussian doesn't have a compact support, but since it decays fast, we can approximate it by a function that has a compact support. If we sample g_α, we obtain a vector of length N that we call u_α. Now let's define S as the set of integers m where the entries of the vector u_α are less than a tolerance (tol = 10^(−3), but this can be changed depending on the application):

S = {m ∈ ℕ, such that for 1 ≤ m ≤ N, |u_α(m)| < tol},

and let's set these coefficients to zero: ∀ n ∈ S, u_α(n) = 0. Define the vector φ_α as the first circularly shifted version of u_α whose first entry is nonzero and whose last entry is zero:

φ_α = S^l u_α, with l ∈ ℕ the first shift such that φ_α(1) ≠ 0 and φ_α(N) = 0,

where S is the circular shift operator. By construction, ∀ n > N and n ≤ 0, φ_α(n) = 0.

Proposition B.2.1 Define

D_α = {v_k^α ∈ ℝ^N, 1 ≤ k ≤ N, s.t. ∀ 1 ≤ n ≤ N, v_k^α(n) = φ_α(n − k + 1)}.

D_α is a basis of ℝ^N.

PROOF: The proof is similar to the proof of the fractional splines for α = 0, because, for 2 ≤ k ≤ N, each vector v_k^α has one more zero component than the previous vector v_(k−1)^α. As for the fractional splines, we only need to prove that the vectors v_k^α are linearly independent, and we prove it recursively. Suppose there exist N numbers β_1, ..., β_N such that

Σ_{i=1}^N β_i v_i^α = 0 (B.15)

⟺ ∀ 1 ≤ n ≤ N, Σ_{i=1}^N β_i v_i^α(n) = 0 (B.16)

⟺ ∀ 1 ≤ n ≤ N, Σ_{i=1}^N β_i φ_α(n − i + 1) = 0. (B.17)

1. Property (B.4) for p = 1. If we evaluate equation (B.17) at n = 1, we get

Σ_{i=1}^N β_i φ_α(2 − i) = 0, (B.18)

which is equivalent to

β_1 φ_α(1) = 0, (B.19)

as φ_α(2 − i) is only nonzero for i = 1. As φ_α(1) ≠ 0, we deduce that β_1 = 0, which is property (B.4) for p = 1.

2. Now, we show that if property (B.4) is satisfied for an integer p between 1 and N − 1, it is also satisfied for p + 1.
Let p be an integer, 1 ≤ p < N, and suppose that property (B.4) is true for p:

∀ 1 ≤ j ≤ p, β_j = 0.

So we have

Σ_{i=p+1}^N β_i v_i^α = 0 (B.20)

⟺ ∀ 1 ≤ n ≤ N, Σ_{i=p+1}^N β_i φ_α(n − i + 1) = 0. (B.21)

If we evaluate equation (B.21) at n = p + 1, we obtain

Σ_{i=p+1}^N β_i φ_α(p − i + 2) = 0, (B.22)

which is equivalent to

β_(p+1) φ_α(1) = 0, (B.23)

as φ_α(p − i + 2) is only nonzero for i = p + 1. As φ_α(1) ≠ 0, we deduce that β_(p+1) = 0, which is property (B.4) for p + 1.

3. Property (B.4) is true for p = 1. If it is true for p, it is also true for p + 1, so it is true for all 1 ≤ n ≤ N, i.e. ∀ 1 ≤ n ≤ N, β_n = 0.

Conclusion: the v_k^α, for 1 ≤ k ≤ N, are linearly independent, and D_α is a basis of ℝ^N.

Appendix C

Soft Thresholding

C.1 Derivation of the formula

Proposition C.1.1 (Soft Thresholding) Let (a, y) ∈ ℝ² and λ ∈ ℝ₊; then

sign(a) (|a| − 1/(2λ))₊ = argmin_x ( λ(a − x)² + |x| ),

where y₊ = max(0, y), for y ∈ ℝ.

PROOF: Let f(x) = |x| + λ(a − x)², and let's call x̂ = argmin_x f(x). We have

f(x) = x + λ(a − x)² if x > 0, λa² if x = 0, −x + λ(a − x)² if x < 0,

and

f'(x) = 1 − 2λ(a − x) if x > 0, −1 − 2λ(a − x) if x < 0.

f is not differentiable at 0, because f'(0₊) = 1 − 2λa and f'(0₋) = −1 − 2λa. Let x ≠ 0. Then

f'(x) = 0 ⟺ x = a − 1/(2λ) if x > 0, i.e. a > 1/(2λ); x = a + 1/(2λ) if x < 0, i.e. a < −1/(2λ). (C.1)

Equation (C.1) can be rewritten as

x = sign(a) (|a| − 1/(2λ)) if |a| > 1/(2λ). (C.2)

Now, if |a| < 1/(2λ), f' can never be zero, and hence the minimum of f cannot occur where the derivative is zero. So consider |a| < 1/(2λ), and note that

|a| < 1/(2λ) ⟺ −1/(2λ) < a < 1/(2λ) ⟺ −1 < −2λa < 1 ⟺ 1 − 2λa > 0 and −1 − 2λa < 0.

Recall the expression of f':

f'(x) = 1 − 2λa + 2λx if x > 0, −1 − 2λa + 2λx if x < 0.

On ℝ₊, f'(x) = 1 − 2λa + 2λx, where 1 − 2λa > 0 and 2λx > 0. In this case, the sign of f' is given by the y-intercept (1 − 2λa), which is a positive quantity. f' is therefore positive on ℝ₊.
In the same way, on ℝ₋, f'(x) = −1 − 2λa + 2λx, where −1 − 2λa < 0 and 2λx < 0. In this case, the sign of f' is given by the y-intercept (−1 − 2λa), which is a negative quantity. f' is therefore negative on ℝ₋.

We can then conclude that f, defined on ℝ, is decreasing on ℝ₋ and increasing on ℝ₊, which means that the minimum of f is at zero. Summarizing the results, we get

if |a| > 1/(2λ), then x̂ = sign(a) (|a| − 1/(2λ));
if |a| ≤ 1/(2λ), then x̂ = 0,

which can be written as one equation:

x̂ = sign(a) (|a| − 1/(2λ))₊. (C.3)

Appendix D

Fourier Transform and Fast Fourier Transform

This appendix explains how to compute the Fourier Transform of a function via the Fast Fourier Transform (FFT).

D.1 Theoretical results

D.1.1 Definitions

Definition D.1.1 (Fourier Transform) Let f ∈ L¹(ℝ). The Fourier Transform (FT) of f is the map defined on ℝ by

FT f(ξ) = ∫_{−∞}^{+∞} f(t) e^(−2πiξt) dt.

Definition D.1.2 (Fourier Series) Let f be a T-periodic function. The Fourier coefficients (c_n(f))_(n∈ℤ) of f are defined by

c_n(f) = (1/T) ∫_0^T f(t) e^(−2πint/T) dt.

Definition D.1.3 (Fast Fourier Transform (FFT)) Let f be a T-periodic function. The Fourier coefficients of f, (a_n(f))_(n=0)^(N−1), computed by the FFT are given by

∀ 0 ≤ n ≤ N − 1, a_n(f) = (1/N) Σ_{k=0}^{N−1} f(kT/N) ω_N^(−nk),

where ω_N = e^(2πi/N) is the N-th root of unity.

D.1.2 Approximation of the Fourier coefficients with the FFT

Let f be a T-periodic function. The Fourier coefficients c_n(f) are given by

∀ n ∈ ℤ, c_n(f) = (1/T) ∫_0^T f(t) e^(−2πint/T) dt. (D.1)

We can approximate (D.1) using the lower Riemann sum

c_n(f) ≈ (1/T) Σ_{k=0}^{N−1} (T/N) f(kT/N) e^(−2πink/N) = (1/N) Σ_{k=0}^{N−1} f(kT/N) ω_N^(−nk).

An approximate value for the Fourier coefficients c_n(f) is therefore given by a_n(f): c_n(f) ≈ a_n(f).

D.1.3 Fourier coefficients and Fourier Transform

Let f ∈ L¹(ℝ). We set

φ(t) = Σ_{n∈ℤ} f(t + nT),

where φ is T-periodic. Computing the Fourier coefficients of φ gives

c_n(φ) = (1/T) ∫_0^T Σ_k f(t + kT) e^(−2πint/T) dt = (1/T) ∫_{−∞}^{+∞} f(t) e^(−2πint/T) dt = (1/T) FT f(n/T).

The Fourier coefficients of the periodized function φ are thus samples of the Fourier Transform of f.
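The identity c_n(φ) = (1/T) FT f(n/T) can be checked numerically on a Gaussian, whose Fourier Transform is known in closed form (the FT of e^(−πt²) is e^(−πξ²)). The short Python/NumPy sketch below is an illustration only; the period T and the number of samples N are arbitrary choices, not parameters from the thesis.

```python
import numpy as np

T, N = 8.0, 256
t = np.arange(N) * T / N                 # N samples of one period [0, T)
f = lambda x: np.exp(-np.pi * x**2)      # FT of f is exp(-pi * xi^2)
phi = f(t) + f(t - T)                    # periodization; further copies are ~0
c = np.fft.fft(phi) / N                  # FFT approximation of c_n(phi)
# Compare against (1/T) * FT f(n/T) = f(n/T) / T for the first coefficients.
for n in range(4):
    assert abs(c[n] - f(n / T) / T) < 1e-8
```

Because the periodized Gaussian is smooth, the Riemann-sum (FFT) approximation of the Fourier coefficients converges extremely fast, so even a modest N gives near machine-precision agreement.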
We can therefore interpret the sequence (FT f(n/T))_(n∈ℤ) as a sampling of the Fourier Transform of f.

D.1.4 Fourier Transform and Fast Fourier Transform

To approximate the Fourier Transform of a function f with the FFT, f must be a continuous function with compact support [36]:

f ∈ C⁰(ℝ) and ∀ |x| > T₀, f(x) = 0.

Choose T₁ > T₀, and consider the function f as the restriction to [−T₁, T₁] of a function f̃:

f = f̃|_[−T₁,T₁],

where f̃ is 2T₁-periodic. Since f̃ is 2T₁-periodic, we can compute its Fourier coefficients [36]:

c_k(f̃) = (1/(2T₁)) ∫_{−T₁}^{T₁} f(t) e^(−2πikt/(2T₁)) dt = (1/(2T₁)) FT f(k/(2T₁)). (D.2)

When T₁ ≫ T₀, the sampling defined above is sufficiently precise for our purposes. When f does not have compact support, its Fourier Transform can be approximated by the Discrete Cosine Transform.

D.2 Fourier Transform and Fast Fourier Transform in Matlab

D.2.1 Approximation of the Fourier Transform in Matlab

In Matlab, the FFT of a vector v of length N (noted u) is defined as

FFT(v)(k) = u(k) = Σ_{n=1}^N v(n) e^(−2πi(k−1)(n−1)/N), for 1 ≤ k ≤ N.

The inverse FFT is given by

IFFT(u)(n) = v(n) = (1/N) Σ_{k=1}^N u(k) e^(2πi(k−1)(n−1)/N), for 1 ≤ n ≤ N.

Let f be a continuous function with compact support [−T₀, T₀], i.e. f(x) = 0 for |x| > T₀. Let T₁ > T₀. From equation (D.2) [36], we have

c_n(f̃) = (1/(2T₁)) FT f(n/(2T₁)) for n ∈ ℤ.

We can also approximate the Fourier coefficients of f̃ by

c_n(f̃) ≈ (1/N) Σ_{k=1}^N x_k e^(−2πi(k−1)(n−1)/N) for n ∈ ℤ,

where the x_k are the samples of the 2T₁-periodic function f̃ at N equispaced points of one period. The approximated Fourier coefficients are N-periodic (i.e. c_(n+pN) ≈ c_n for p ∈ ℤ). If we take N large enough, we obtain a good approximation of the Fourier coefficients, and can write

c_n(f̃) ≈ (1/N) FFT(f̃)(n).

Now, c_n(f̃) ≈ (1/N) FFT(f̃)(n) and c_n(f̃) = (1/(2T₁)) FT f(n/(2T₁)), which leads to

FT f(n/(2T₁)) ≈ (2T₁/N) FFT(f̃)(n), for 1 ≤ n ≤ N or −N/2 + 1 ≤ n ≤ N/2.
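This approximation FT f(n/(2T₁)) ≈ (2T₁/N) FFT(f̃)(n) can be sketched in Python/NumPy as well. The fftshift(fft(fftshift(.))) construction below mirrors the one used in the Matlab routine of the next section, but the function itself (ft_via_fft) and its test values are hypothetical illustrations, not the thesis code.

```python
import numpy as np

def ft_via_fft(f, T1, N):
    # Approximate FT f(xi) = integral of f(t) exp(-2 pi i xi t) dt on the
    # grid xi_n = n/(2 T1), n = -N/2 .. N/2-1, assuming f is ~0 outside
    # [-T1, T1].  The inner fftshift centers the samples, the outer one
    # reorders the frequencies; together they absorb the (-1)^n phase.
    t = -T1 + np.arange(N) * (2 * T1 / N)       # samples of [-T1, T1)
    x = f(t)
    F = np.fft.fftshift(np.fft.fft(np.fft.fftshift(x))) * (2 * T1 / N)
    xi = np.fft.fftshift(np.fft.fftfreq(N, d=2 * T1 / N))
    return xi, F

# Gaussian test: the FT of exp(-pi t^2) is exp(-pi xi^2).
xi, F = ft_via_fft(lambda t: np.exp(-np.pi * t**2), T1=8.0, N=512)
assert np.allclose(F.real, np.exp(-np.pi * xi**2), atol=1e-6)
assert np.max(np.abs(F.imag)) < 1e-6
```

Taking T₁ well beyond the effective support of f (here T₁ = 8 for a unit-width Gaussian) keeps the truncation and aliasing errors negligible, as required by the T₁ ≫ T₀ condition above.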
The above formula gives an approximation of the Fourier Transform of a function f with the FFT. The Matlab code for computing the Fourier Transform using the FFT is given below.

D.2.2 Matlab code for the Fourier Transform

function [N,xi,ft,XI,F] = FT(f,T1_x,FLAG)
% function [N,xi,ft,XI,F] = FT(f,T1_x,FLAG)
% Compute the COMPLEX Fourier Transform of the sampled function f.
% BE CAREFUL! f has to have a COMPACT SUPPORT!
%
% INPUTS:
%   Mandatory
%     f:      Function f
%     T1_x:   Half period in x
%   Optional
%     FLAG:   if 1 -> normalizes, if 0 -> no normalization
%
% OUTPUTS:
%     N:      Length of the Fourier Transform F, and of XI
%     xi:     Frequencies on which ft is evaluated
%     ft:     Fourier Transform of f
%
% (c) Catherine Dupuis, January 2005

if nargin < 3
    FLAG = 0;
end

n = length(f);
a = -1; b = 1;
stepx = (b-a)/(n-1);
x = a:stepx:b;

%% ZERO PADDING
T1_x = 60;                             % period in x
kpts = floor((T1_x - x(end))/stepx);   % kpts points are added on the right to
                                       % zero-pad the signal up to the closest
                                       % sample to the left of T1_x
T1_s = n/2 + kpts;                     % T1 in samples
N = 2*T1_s;                            % always even
k = 1:N;
ind = -T1_x + (2*T1_x*(k-1))/(N-1);    % points where the function sig is
                                       % evaluated
sig = [zeros((N-n)/2,1); f; zeros((N-n)/2,1)];               % zero padding
xN = x(n/2)-(N/2-1)*stepx : stepx : x(n/2+1)+(N/2-1)*stepx;  % whole x axis
                                       % after zero padding
%% FOURIER TRANSFORM
F = fftshift(fft(fftshift(sig)));
ind = (N-n)/2+1 : (N-n)/2+n;
ft = F(ind);                           % restriction of F
nn = -(N-2)/2-1 : (N-2)/2;
XI = nn./(2*T1_x);
xi = XI((N-n)/2+1 : (N-n)/2+n);

%% NORMALIZATION
if FLAG == 1
    F = F./max(abs(F));
    ft = ft./max(abs(ft));
end

Appendix E

Explanation of the software

README

SeisCharac1_4.m is the program that locates the reflectors and estimates their fractional-orders. The program works for any size of the data (matrix or cube).

1. BCRDaubechies1_2.m (Version 1.2)

Iterative thresholding algorithm with the Block Coordinate Relaxation method.

function [signoise,coeffrec] = BCRDaubechies1_2(s,NameOfDict,par1,par2,par3,
    beta,czero,Lmax,limit,expl,speed1,speed2)

Solves the following minimization problem with respect to the coefficients,
using the Block Coordinate Relaxation method:

    min  beta1*||c1||_1 + ... + betan*||cn||_1 + (||s - s1 - ... - sn||_2)^2

INPUTS:
  s:               Signal we want to decompose
  NameOfDict:      Name of the merged dictionary
  par1,par2,par3:  Parameters of the merged dictionary
  beta:            Parameter of the minimization
  Lmax:            Max.
# of iterations (vector)
  czero:           Initial guess for the coefficients
  limit:           Stopping criterion for the algorithm
  expl:            0 -> linear decrease
                   1 -> exponential decrease
  speed1:          Speed of the linear/exponential decrease
  speed2:          Speed of the power decrease (for exponential decrease only)

OUTPUTS:
  signoise:        Matrix whose columns contain the different content parts,
                   as well as the residual in the last column
  coeffrec:        Matrix whose columns contain the coefficients of the
                   different content parts

2. BuildDic1_3.m (Version 1.3)

Build the detection and estimation dictionaries.

function [] = BuildDic1_3(whichdic,n,pardic_ave,savedetectest,nb_alpha,
    nb_sigma,dist_alpha,percent_sigma)

Builds the dictionaries for the detection and the estimation and saves them.

INPUTS:
  whichdic:        String, values -> 'detection' or 'both'
                   1- if whichdic = 'detection' -> only computes the
                      detection dictionary
                   2- if whichdic = 'both' -> computes the detection &
                      estimation dictionaries
  n:               Length of the signal
  pardic_ave:      Average parameters of the different traces
                   1- pardic_ave(1) contains the average fractional order
                      of the traces
                   2- pardic_ave(2) contains the average std deviation of
                      the traces
  savedetectest:   Name of the detection or both dictionaries
  nb_alpha:        Vector of length 2 containing the # of frac orders in
                   the detection and estimation dictionaries
                   1- nb_alpha(1): # of frac orders in the detection dic
                   2- nb_alpha(2): # of frac orders in the estimation dic
  nb_sigma:        Vector of length 2 containing the # of std deviations
                   in the detection and estimation dictionaries
                   1- nb_sigma(1): # of std deviations in the
detection dic
                   2- nb_sigma(2): # of std deviations in the estimation dic
  dist_alpha:      Vector of length 2 containing the distances between the
                   fractional orders for the detection and estimation
                   dictionaries
                   1- dist_alpha(1): for the detection dic
                   2- dist_alpha(2): for the estimation dic
  percent_sigma:   Maximum percentage of the average std deviation added to
                   the average std deviation. sigma_ave*(1 +/- percent_sigma)
                   will be the furthest std deviation allowed in the
                   dictionary

3. BuildFracGauss_dic1_2.m (Version 1.2)

Build a dictionary with fractional derivatives of the Gaussian.

function [dict,par1,par2,par3,norm_dic,alpha_dic,sigma_dic,loopsigma] =
    BuildFracGauss_dic1_2(n,alpha_ave,sigma_ave,dist_alpha,percentage,
    nb_alpha,nb_sigma)

Builds a merged dictionary with fractional derivatives of Gaussians.

INPUTS:
  n:               Length of the signal
  alpha_ave:       Average alpha in the signal
  sigma_ave:       Average sigma in the signal
  dist_alpha:      Distance between the alphas in the dictionary
  percentage:      Maximum percentage of the average sigma (~ frequency)
                   added or subtracted from the average sigma
  nb_alpha:        Nb of alphas in the dictionary
  nb_sigma:        Nb of sigmas in the dictionary

OUTPUTS:
  dict:            Name of the dictionary (merged dictionary)
  par1,par2,par3:  Parameters of the dictionary
  norm_dic:        Norm of the matrix of the dictionary
  alpha_dic:       Vector of alphas in the dictionary
  sigma_dic:       Vector of sigmas in the dictionary
  loopsigma:       Flag: = 1 if the loop on sigma to build the dictionary
                   is the 1st one, = 0 otherwise

See also MakeFracGauss1_2.m, MakeDic.m, NormFracGauss.m

4. chunk1_1.m (Version 1.1)

Process and analyze the output of the iterative thresholding
algorithm.

function [chunks_c,chunks_index,nbchunk,NbOfChunks,ind] = chunk1_1(c,tol)

Chunks the coefficient vector c into subgroups of events.

INPUTS:
  c:               Coefficient vector c
  tol:             Threshold level (keep moduli of c that are larger than
                   tol). tol is called mu in the paper.

OUTPUTS:
  chunks_c:        Cell array containing the different chunks of c (has
                   NbOfChunks cells)
  chunks_index:    Cell array containing the different chunks of indices
                   of c (has NbOfChunks cells)
  nbchunk:         Vector that keeps track of which chunk the indices are
                   put in. Usage: nbchunk(find(ind==index)) gives the nb
                   of the chunk index is in.
  NbOfChunks:      # of chunks found
  ind:             Indices of the moduli of c larger than tol

5. Estimate1_1.m (Version 1.1)

Estimate the fractional-orders of the detected events.

function [v_alpha] = Estimate1_1(s,cmat,cfold,tolchunk,distblob,
    alpha_dic,sigma_dic,loopalpha,table)

Estimates the fractional order alpha of the different events.

INPUTS:
  s:               Signal analyzed
  cmat:            Matrix of the coefficients (output of BCRDaubechies)
  cfold:           Vector of the folded coefficients cmat (length(cfold)=n)
  tolchunk:        Tolerance for the function chunk (keep the moduli
                   larger than tolchunk)
  distblob:        Minimum distance between the chunks allowed (in GetBlob)
  alpha_dic:       Vector of the fractional orders for the dictionary in
                   which cmat is decomposed
  sigma_dic:       Vector of the frequencies for the dictionary in which
                   cmat is decomposed
  loopalpha:       Flag: = 1 if the loop on alpha to build the dictionary
                   is the 1st one, = 0 otherwise
  table:           Vector giving the locations in the signal of the
                   coefficients corresponding to a particular atom in the
                   FracGauss dictionaries

OUTPUT:
  v_alpha:         Vector whose nonzero entries give the fractional order
                   of the signal at this particular location. Its length
                   is the length of the signal.

See also chunk1_1.m, Getblob1_1.m, TableFracGauss1_2.m, whichalpha1_1.m,
BCRDaubechies1_2.m

6. Fold1_1.m (Version 1.1)

Reshape the output of the iterative thresholding algorithm.

function [cfold] = Fold1_1(c,n,NameOfDict,par1,par2,par3)

Folds the vector c onto the vector cfold.

INPUTS:
  c:               Vector c of length NumberOfDicts*n
  n:               Length of the signal
  NameOfDict:      Name of the merged dictionary
  par1,par2,par3:  Parameters of the dictionary

OUTPUT:
  cfold:           Folded vector of length n by 1, obtained by summing the
                   absolute values of the different coefficients obtained
                   with the different wavelets in NameOfDict

See also SeisCharac1_4.m

7. Getblob1_1.m (Version 1.1)

Process the output of chunk1_1.m for the analysis of the coefficient
vector obtained from the iterative thresholding algorithm.

function [blobs_c,blobs_index,nbblob,NbOfBlobs,indblob] =
    Getblob1_1(c,chunks_c,chunks_index,nbchunk,NbOfChunks,dist)

Gets the coefficients and puts them into blobs.

INPUTS:
  c:               Coefficient vector (merged)
  chunks_c:        Cell array containing the different chunks of c (has
                   NbOfChunks cells)
  chunks_index:    Cell array containing the different chunks of indices
                   of c (has NbOfChunks cells)
  nbchunk:         Vector that keeps track of which chunk the indices are
                   put in. Usage: nbchunk(find(ind==index)) gives the nb
                   of the chunk index is in.
  NbOfChunks:      # of chunks found
  dist:            Minimum distance between the chunks allowed

OUTPUTS:
  blobs_c:         Cell array containing the different blobs (of chunks)
                   of c (has NbOfBlobs cells)
  blobs_index:     Cell array containing the different blobs of indices
                   of c (has NbOfBlobs cells)
  nbblob:          Vector that keeps track of which blob the indices are
                   put in. Usage: nbblob(find(ind==index)) gives the nb
                   of the blob index is in.
  NbOfBlobs:       # of blobs found
  indblob:         Indices of the entries of c which are in the different
                   blobs

See also chunk1_1.m

8. InitSeisCharac.m

Give examples of parameters for the algorithm.

function [par_detection,par_window,par_estimation,nb_alpha,nb_sigma,
    dist_alpha,percent_sigma,par_bcr_d,par_bcr_e] = InitSeisCharac(n,ratio)

Initializes the default parameters for SeisCharac1_3.m.

INPUTS:
  n:               Length of the signal
  ratio:           Ratio between the initial length of the traces and the
                   length of the traces currently analyzed

OUTPUTS:
  par_detection:   Vector of length 2 containing the parameters for the
                   detection
                   1- par_detection(1): percentage of the norm of the
                      coefficients => par_detection(1)*maxval is the
                      threshold level for the detection
                   2- par_detection(2): minimum width of the sets of
                      coefficients associated to one event
  par_window:      Vector of length 2 containing the parameters for the
                   windowing
                   1- par_window(1): # of sigma we want for half the width
                      of the window
                   2- par_window(2): # of sigma we want for the width of
                      the decay of the window
  par_estimation:  Vector of length 2 containing the parameters for the
                   estimation (see par_detection)
  par_bcr(_d;_e):  Vector of length 4 containing the parameters for the
                   BCR for detection and estimation (_d or _e)
                   1- par_bcr(1): beta
                   2- par_bcr(2): if =1 -> exponential decrease
                                  if =0 -> linear decrease
                   3- par_bcr(3): speed1 for Block Solver
                      (BCRDaubechies1_2.m)
                   4- par_bcr(4): speed2 for Block Solver
                      (BCRDaubechies1_2.m)

See also SeisCharac1_4.m

9. NormFracGauss.m

Compute the norm of the dictionary made of fractional derivatives of the
Gaussian.

function [norme] = NormFracGauss(n,dict,par1,par2,par3)

Computes the norm of the FracGauss dictionary.

INPUTS:
  n:               Length of the signal
  dict:            Name of the dictionary
  par1,par2,par3:  Parameters of the dictionary

OUTPUT:
  norme:           Norm of the matrix of the dictionary

See also SeisCharac1_4.m, SizeOfDict.m, FastS.m, FastA.m

10. PrecomputeDict1_2.m (Version 1.2)

Precompute a table of estimation dictionaries.

function [dict_P,par1_P,par2_P,par3_P,no_P,alpha_dic_P,sigma_dic_P] =
    PrecomputeDict1_2(n,savedict,alpha_dic_P,sigma_P)

Computes the different dictionaries for the estimation.

INPUTS:
  Mandatory
    n:             Length of the signal
    savedict:      Name of the mat file where the precomputed dictionary
                   is stored
  Optional
    alpha_dic_P:   Vector of the different fractional orders
    sigma_P:       Vector of different standard deviations for the start
                   of the computation of the estimation dictionaries

OUTPUTS:
  dict_P:          Cell array containing the names of the different
                   estimation dictionaries corresponding to the different
                   sigmas in the vector sigma
  par1_P,par2_P,par3_P:  Cell arrays: parameters of the dictionaries
  no_P:            Vector of the norms of the different dictionaries
  alpha_dic_P:     Vector of the fractional orders. All the fractional
                   orders are in each dictionary
  sigma_dic_P:     Cell array of the different sigma vectors for each
                   dictionary

See also MakeFracGauss1_2.m, NormFracGauss.m, MakeDic.m, SeisCharac1_4.m

11. SeisCharac1_4.m (Version 1.4)

Delineate and characterize transitions from seismic events.

function [v_detection,v_alpha] = SeisCharac1_3(data,ratio,
w h i c h d i c , s a v e d e t e c t e s t , s a v e d i c t , trace_nb,par_detection,par_window, par_estimation,par_bcr_d,par_bcr_e,FLAG) F i n d the l o c a t i o n s and f r a c t i o n a l orders of seismic events INPUTS: data: Seismic data ratio: Ratio between the i n i t i a l length of the t r a c e s and the length of the t r a c e s c u r r e n t l y analyzed S t r i n g : values -> 'detection', 'estimation' or 'both' 1- i f whichdic = 'detection' -> only computes the detection dictionary whichdic: savedetectest savedict: trace_nb: par_detection: (matrix or cube) 2- i f whichdic = 'both' -> computes the d e t e c t i o n & estimation d i c t i o n a r i e s Name of the d e t e c t i o n d i c t i o n a r y Name of the precomputed estimation d i c t i o n a r y Vector of length 2 c o n t a i n i n g the # of the 1st t r a c e to be processed and the # of the l a s t t r a c e t o be processed. 1- t r a c e _ n b ( l ) : # of the 1st t r a c e 2- trace_nb(2): # of the l a s t t r a c e Vector of length 2 c o n t a i n i n g the parameters f o r the detection 1- p a r _ d e t e c t i o n ( l ) : percentage of the norm of the c o e f f i c i e n t => par_detection(l)*maxval t h r e s h o l d l e v e l f o r the d e t e c t i o n i s the 2- p a r _ d e t e c t i o n ( 2 ) : minimum width of the sets of c o e f f i c i e n t s a s s o c i a t e d t o one event par_window: Vector of length 2 c o n t a i n i n g the parameters f o r the windowing 1- par_window(l): # of sigma we want f o r h a l f the width of the window 2- par_window(2): # of sigma we want f o r the width of the decay of the window par_estimation: par_bcr(_d;_e): Vector of length 2 c o n t a i n i n g the parameters f o r the estimation (see par_detection) Vector of length 3 c o n t a i n i n g the parameters f o r the BCR f o r d e t e c t i o n and estimation (_d or _e) 1- p a r _ b c r ( l ) : beta 2- par_bcr(2): i f =1 -> exponential decrease =0 -> l i n e a r decrease 143 3- p a r _ b c r ( 
3 ) : speedl f o r Block Solver (BCRDaubechiesl_2.m) 4- p a r _ b c r ( 4 ) : speed2 f o r Block Solver (BCRDaubechiesl_2.m) FLAG: OUTPUT: v_detection: v_alpha: = 1 -> only d e t e c t i o n performed (default 0) Vector containing the l o c a t i o n s of the events detected i n the Detection part Vector containing the f r a c t i o n a l orders of the d i f f e r e n t events at t h e i r l o c a t i o n . v_alpha(k) = a l p h a l means that at the l o c a t i o n k, the f r a c t i o n a l order i s a l p h a l See a l s o B u i l d D i c l _ 3 . m , BCRDaubechiesl_2.m, B P _ I n t e r i o r . m , F o l d l _ l . m , Windowl_3.m,Estimatel_l.m, TableFracGaussl_2.m 12. whichalphal_l .m Version 1.1 Give the fractional-order of a specific atom in a dictionary made with fractional derivatives of the Gaussian function[alpha_answer] = whichalphal_l(ind,n,sigma,alpha,loopalpha) ONLY FOR FRACGAUSS DICTIONARY! Gives the alpha corresponding to the index i n d i n the long c o e f f i c i e n t vector i n the merged d i c t i o n a r y made of the d i f f e r e n t sigmas i n the vector sigma and the d i f f e r e n t alphas i n the vector alpha INPUTS: ind: n: sigma: alpha: loopalpha: Index i n the c o e f f i c i e n t vector Length of the s i g n a l Vector of the d i f f e r e n t sigmas f o r the merged d i c t i o n a r y Vector of the d i f f e r e n t alphas f o r the merged d i c t i o n a r y Flag: = 1 i f loop on alpha to b u i l d the d i c t i o n a r y i s the 1st one = 0 otherwise OUTPUTS: alpha_answer: Alpha corresponding to ind See also c h u n k l _ l . m , E s t i m a t e l _ l . m 144 whichsigmal_l .m Version 1.1 Give the standard deviation of a specific atom in a dictionary made with fractional derivatives of the Gaussian function[sigma_answer_sample,sigma_answer] = whichsigmal_l(ind,n,sigma, alpha,loopsigma) ONLY FOR FRACGAUSS DICTIONARY! 
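The index arithmetic behind whichalpha1_1.m (and the analogous whichsigma1_1.m) can be sketched in Python. This is a hypothetical reimplementation, not the thesis code: it assumes the merged dictionary concatenates blocks of n coefficients, one block per (sigma, alpha) pair, with the alpha loop innermost when loopalpha = 1.

```python
def which_alpha(ind, n, sigmas, alphas, loopalpha=1):
    """Fractional order alpha of the atom at flat index `ind` in the merged
    coefficient vector (assumed layout: n coefficients per (sigma, alpha)
    block; loopalpha == 1 means alpha varies fastest from block to block)."""
    block = ind // n  # which (sigma, alpha) block the index falls in
    if loopalpha == 1:
        return alphas[block % len(alphas)]
    return alphas[block // len(sigmas)]
```

For example, with sigmas = [2, 4], alphas = [0.5, 1.0, 1.5] and n = 8, index 8 falls in the second block, so which_alpha returns 1.0 when the alpha loop is innermost and 0.5 otherwise.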
Gives the sigma corresponding to the index ind in the long coefficient vector
in the merged dictionary made of the different sigmas in the vector sigma and
the different alphas in the vector alpha

INPUTS:
  ind:       Index in the coefficient vector
  n:         Length of the signal
  sigma:     Vector of the different sigmas for the merged dictionary
  alpha:     Vector of the different alphas for the merged dictionary
  loopsigma: Flag: = 1 if loop on sigma to build the dictionary is the 1st one
                   = 0 otherwise

OUTPUTS:
  sigma_answer_sample: Sigma in # of samples
  sigma_answer:        Sigma corresponding to ind

See also chunk1_1.m, Estimate1_1.m

14. Window1_3.m  Version 1.3

Window out the detected events.

function[M,sigma_blob_sample,alpha_blob,sigma_blob,v_detection] =
        Window1_3(s,cmat,cfold,tolchunk,distblob,alpha_dic,sigma_dic,
        loopsigma,table,halfnb_sigma_width,nb_sigma_decay)

Window out the signals

INPUTS:
  s:         Signal we analyze
  cmat:      Matrix of the coefficients (output of BCRDaubechies)
  cfold:     Vector of the folded coefficients of cmat (length(cfold)=n)
  tolchunk:  Tolerance for the function chunk (keep the moduli larger than
             tolchunk)
  distblob:  Minimum distance between the chunks allowed (in GetBlob.m)
  alpha_dic: Vector of the fractional orders for the dictionary in which cmat
             is decomposed
  sigma_dic: Vector of the frequencies for the dictionary in which cmat is
             decomposed
  loopsigma: Flag: = 1 if loop on sigma to build the dictionary is the 1st one
                   = 0 otherwise
  table:     Vector giving the locations in the signal of the coefficient
             corresponding to a particular atom in the FracGauss dictionaries
  halfnb_sigma_width: How many times sigma for half the width of the window
  nb_sigma_decay:     How many times sigma for the width of the gaussian decay
                      of the window

OUTPUTS:
  M:                 Matrix whose columns contain the different windowed
                     signals from s
  sigma_blob_sample: Vector containing the std deviations of the atoms
                     correlating the best with each of the windowed signals
                     (in # of samples)
  alpha_blob:        Vector containing the alphas of the atoms correlating the
                     best with each of the windowed signals
  sigma_blob:        Vector containing the std deviations of the atoms
                     correlating the best with each of the windowed signals
  v_detection:       Vector containing the locations of the events
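The window shape implied by the halfnb_sigma_width and nb_sigma_decay parameters above can be sketched in Python. This is a minimal sketch under stated assumptions, not the Window1_3.m implementation: make_window and window_out are hypothetical names, and the window is assumed flat (value 1) within halfnb_sigma_width*sigma of the event location, with a Gaussian taper of standard deviation nb_sigma_decay*sigma beyond that core.

```python
import numpy as np

def make_window(n, center, sigma, halfnb_sigma_width=2.0, nb_sigma_decay=1.0):
    """Length-n window: 1 within halfnb_sigma_width*sigma of `center`,
    then a Gaussian decay with std deviation nb_sigma_decay*sigma."""
    t = np.arange(n, dtype=float)
    # Distance past the flat core (zero inside it).
    excess = np.maximum(np.abs(t - center) - halfnb_sigma_width * sigma, 0.0)
    return np.exp(-0.5 * (excess / (nb_sigma_decay * sigma)) ** 2)

def window_out(s, centers, sigmas, **kw):
    """Column k of M is the signal s times the window centered at
    centers[k] with scale sigmas[k]."""
    s = np.asarray(s, dtype=float)
    cols = [s * make_window(len(s), c, sg, **kw) for c, sg in zip(centers, sigmas)]
    return np.stack(cols, axis=1)
```

With sigma = 5 and the defaults, the window equals 1 within 10 samples of the center and has fallen to exp(-1/2) one decay-sigma (5 samples) past the core. Each detected event thus yields one column of M, isolating that event for the estimation step.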
Title | Seismic singularity characterization with redundant dictionaries |
Creator | Dupuis, Catherine Mareva
Date Issued | 2005 |
Genre | Thesis/Dissertation
Type | Text
Language | eng |
Date Available | 2009-12-15 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0080074 |
URI | http://hdl.handle.net/2429/16727 |
Degree | Master of Science - MSc
Program | Mathematics
Affiliation | Science, Faculty of; Mathematics, Department of
Degree Grantor | University of British Columbia |
Graduation Date | 2005-11 |
Campus | UBCV
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |