UBC Theses and Dissertations

Seismic singularity characterization with redundant dictionaries Dupuis, Catherine Mareva 2005

SEISMIC SINGULARITY CHARACTERIZATION WITH REDUNDANT DICTIONARIES

by

CATHERINE MAREVA DUPUIS

Graduée en Ingénierie, École Nationale Supérieure de Techniques Avancées, Paris, 2003

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES (Mathematics)

THE UNIVERSITY OF BRITISH COLUMBIA

July 2005

© Catherine Dupuis, 2005

Abstract

We consider seismic signals as a superposition of waveforms parameterized by their fractional-orders. Each waveform models the reflection of a seismic wave at a particular transition between two lithological layers in the subsurface. The locations of the waveforms in the seismic signal correspond to the depths of the transitions in the subsurface, whereas their fractional-orders constitute a measure of the sharpness of the transitions. By considering fractional-order transitions, we generalize the zero-order transition model of the conventional deconvolution problem, and aim at capturing the different types of transitions. The goal is to delineate and characterize transitions from seismic signals by recovering the locations and fractional-orders of the corresponding waveforms. This problem has received increasing interest, and several methods have been proposed, including multi- and monoscale analysis based on Mallat's wavelet transform modulus maxima, and seismic atomic decomposition. We propose a new method based on a two-step approach, which divides the initial problem of delineating and characterizing transitions over the whole seismic signal into two easier subproblems. The algorithm first partitions the seismic signal into its major components, and then estimates the fractional-orders and locations of each component. Both steps are based on the sparse decomposition of seismic signals in overcomplete dictionaries of waveforms parameterized by their fractional-orders, and involve ℓ1 minimizations solved by an iterative thresholding algorithm. We present the method and show numerical results on both synthetic and real data.

Table of Contents

Abstract  ii
Table of Contents  iii
List of Tables  vi
List of Figures  vii
Acknowledgments  ix

Part I. Capita Prima  1

Chapter 1. Introduction  2

Chapter 2. Background  7
  2.1 Model for seismic transitions  7
    2.1.1 The seismic reflectivity  7
    2.1.2 The seismic source function  11
    2.1.3 The model for the imaged seismic signal  12
  2.2 Previous work  15
    2.2.1 Complex seismic trace analysis  15
    2.2.2 Multiscale analysis  16
    2.2.3 Monoscale analysis  24
    2.2.4 Seismic Atomic Decomposition  25

Chapter 3. Problem Statement  27
  3.1 Dictionaries and Decomposition  27
    3.1.1 Basis Pursuit  28
  3.2 Uniqueness condition and equivalence between (P0) and (P1)  29
    3.2.1 Empirical studies for the performance of BP  31
  3.3 Detection-Estimation approach  32

Chapter 4. Detection-Estimation Method  34
  4.1 Detection  34
    4.1.1 Detection dictionary  35
    4.1.2 Decomposition in Detection  36
    4.1.3 Algorithm  37
    4.1.4 Windowing  39
  4.2 Estimation  42
    4.2.1 Decomposition in Estimation  42
    4.2.2 Dictionary in estimation  43
    4.2.3 Resolution  44
  4.3 Numerical scheme  46

Chapter 5. Numerical Experiments  48
  5.1 Application to synthetic data  49
    5.1.1 Simple 1D synthetic data  49
    5.1.2 Comparison with Basis Pursuit and Matching Pursuit  51
    5.1.3 Realistic synthetic data  55
  5.2 Application to real data  62

Chapter 6. Discussion  66
  6.1 Summary and Conclusions  66
  6.2 Limitations of our method  68
  6.3 Future work  69

Part II. Capita Selecta  73

Chapter 7. Atomic decompositions with Redundant Dictionaries  74
  7.1 Problem formulation  74
  7.2 Method of Frames  75
    7.2.1 The algorithm  75
    7.2.2 Concept of sparsity  75
  7.3 Matching Pursuit  76
  7.4 Basis Pursuit  78
    7.4.1 The formulation  78
    7.4.2 Linear Programming (LP)  78
    7.4.3 Basis Pursuit De-Noising  80
  7.5 Basis Pursuit versus Matching Pursuit  80
    7.5.1 Results  80
    7.5.2 Discussion  92
  7.6 Block Coordinate Relaxation Method  93
    7.6.1 Problem statement  93
    7.6.2 Algorithm  94
    7.6.3 Numerical scheme  95
    7.6.4 Simple examples  97
    7.6.5 Multiple Block Coordinate Relaxation Method  98
    7.6.6 Results  99
    7.6.7 Discussion  111
  7.7 Conclusion  112

Bibliography  113

Appendix A. Wavelets  116
  A.1 Overview  116

Appendix B. Bases  118
  B.1 Fractional Splines  118
  B.2 Fractional Derivatives of the Gaussian  122

Appendix C. Soft Thresholding  125
  C.1 Derivation of the formula  125

Appendix D. Fourier Transform and Fast Fourier Transform  127
  D.1 Theoretical results  127
    D.1.1 Definitions  127
    D.1.2 Approximation of the Fourier coefficients with the FFT  128
    D.1.3 Fourier coefficients and Fourier Transform  128
    D.1.4 Fourier Transform and Fast Fourier Transform  128
  D.2 Fourier Transform and Fast Fourier Transform in Matlab  129
    D.2.1 Approximation of the Fourier Transform in Matlab  129
    D.2.2 Matlab code for the Fourier Transform  130

Appendix E.
Explanation of the software  132

List of Tables

5.1 CPU Running Times of BP, MP and our method SEEDE

List of Figures

1.1 Detection-estimation method  6
2.1 Causal and anti-causal fractional splines  9
2.2 Varying order transitions  14
2.3 Wavelet Transform Modulus Maxima Lines  21
2.4 Multifractal analysis of a well-log measurement  22
2.5 Global multiscale analysis on different well-log measurements  23
2.6 Monoscale analysis of sedimentary records  26
3.1 Probability of success of BP  32
4.1 An example of the window and its parameters  40
4.2 Illustration of the windowing  41
4.3 Fractional derivatives of the Gaussian with similar fractional-orders  45
4.4 Detection Estimation Algorithm  47
5.1 Estimation of the average waveform of seismic data with a Fourier spectrum fit  49
5.2 Synthetic signal with 20 events  51
5.3 Synthetic signal with 30 events  52
5.4 Comparison between the detection-estimation algorithm, BP and MP  53
5.5 Synthetic slice with 16 reflectors  56
5.6 Detected reflectors for the synthetic slice with 16 reflectors  56
5.7 Original fractional-orders for the synthetic slice with 16 reflectors  57
5.8 Recovered fractional-orders for the synthetic slice with 16 reflectors  57
5.9 Synthetic slice with 24 reflectors  59
5.10 Detected reflectors for the synthetic slice with 24 reflectors  59
5.11 Original fractional-orders for the synthetic slice with 24 reflectors  61
5.12 Recovered fractional-orders for the synthetic slice with 24 reflectors  61
5.13 Real data  63
5.14 Detected reflectors in the real data  64
5.15 Recovered fractional-orders in the real data  64
5.16 Real data with recovered fractional-orders  65
6.1 Improved detection-estimation method  72
7.1 Decomposition of a cosine using the DCT and the Haar wavelets  76
7.2 Analysis of a very sparse coefficient with an MDWT dictionary with 2 α  82
7.3 Analysis of a very sparse coefficient with an MDWT dictionary with 10 α  83
7.4 Analysis of a very sparse coefficient with 2 very different α  84
7.5 Analysis of a very sparse coefficient with an MDWT dictionary with 2 similar α  85
7.6 Analysis of a less sparse coefficient with an MDWT dictionary with 2 α  86
7.7 Analysis of a less sparse coefficient with an MDWT dictionary with 10 α  87
7.8 Analysis of a less sparse coefficient with an MDWT dictionary with 2 close α  88
7.9 Analysis of a less sparse coefficient with an MDWT dictionary with 2 different α  89
7.10 Analysis of two close components in an MDWT dictionary with two different α  90
7.11 Analysis of two close components in an MDWT dictionary with ten different α  91
7.12 Block Coordinate Relaxation Algorithm  96
7.13 BCR algorithm: separating the noise, a cosine function and a step function  97
7.14 BCR algorithm: separating the noise, a cosine function and bumps  98
7.15 Original and recovered signals  100
7.16 Original and recovered coefficients  101
7.17 Five first original and recovered signals  102
7.18 Five last original and recovered signals  102
7.19 Five first original and recovered coefficients  103
7.20 Five last original and recovered coefficients  103
7.21 Original and recovered signals  104
7.22 Original and recovered coefficients  105
7.23 Original and recovered signals  106
7.24 Original and recovered coefficients  107
7.25 Original signals  108
7.26 Recovered signals  108
7.27 Original coefficients  109
7.28 Recovered coefficients  109
7.29 BCR separation with decimated fractional spline wavelets  110

Acknowledgments

First, I would like to acknowledge my engineering school ENSTA (École Nationale Supérieure de Techniques Avancées) in Paris, France, whose excellent program enabled me to study at the University of British Columbia. I am also very grateful to Dr. Robert Braid and all my professors at ENSTA for their help and support. I would also like to thank my supervisor, Dr.
Felix Herrmann, for providing me with a very interesting project and guiding me in my research. I also thank him for enabling me to participate in the MGA (Multiscale and Geometry Analysis) program organized by IPAM (Institute for Pure and Applied Mathematics) at the University of California, Los Angeles. Doing research at IPAM in the Fall of 2004 as part of the MGA program was a wonderful experience. I would like to thank Richard Kublik for his help and support, especially during difficult times. I also thank Professor Ozgur Yilmaz for useful discussions and suggestions on my work, and my officemates Peyman Poor Moghaddam, Gilles Hennenfent and Urs Boeniger for their help and friendship. I gratefully acknowledge a Jean-Walter Zellidja scholarship for my first year at UBC. Finally, I give personal thanks to my parents for their love, support and encouragement throughout my studies.

Part I
Capita Prima

Chapter 1
Introduction

The Earth's subsurface consists of layers of different materials separated by interfaces, also called transitions. Transitions are characteristic of regions where the Earth's properties (indicated by changes in the acoustic properties) vary rapidly compared to the scale of the seismic wavelet. These sudden changes in the properties at the transitions create edges in the medium and provoke the reflection of seismic waves. These reflections are recorded at the surface, giving rise to a seismic signal (seismic trace) which contains the seismic events triggered by the reflection of the waves at the different transitions. To a good approximation, a seismic signal can be written as the convolution of the seismic reflectivity and the seismic source function (seismic wavelet) [5]. Here the reflectivity depends on the transitions in the acoustic impedance, which is the product of the compressional wave speed and the density of the materials.
Seismic data thus carries information on the geometrical relationships between subsurface lithological boundaries, which is usually used to build geological models to explain the origin of stratigraphic features. Extracting local information about transitions from seismic data has recently received increasing interest [6, 7, 9, 11, 12, 18, 22, 23, 34, 35], as accurate knowledge of the subsurface is more and more needed. Accurately finding the types of transitions is important because the information on the type of transition complements the information about amplitude and phase already used by geologists. Considered together, the type of transition, along with its amplitude and phase, can be used in various geophysical applications, ranging from improving the geological interpretation of the subsurface, to detecting changes in the lithology and constraining the critical points in the percolation model proposed by Herrmann et al. [19]. In this work [19], a bi-compositional mixture model for the transport properties (elastic and fluid) of rocks has been proposed. At the critical points, where the more permeable of the rock constituents connect, the model predicts a multi-scale transition in the transport properties. The exponents of the transitions associated with the critical points could explain and characterize the transitions in the subsurface. In this study, we are interested in characterizing the transitions in the subsurface, as this information may be used to infer the permeability properties of the rocks. The diversity of transitions in the subsurface has recently been established by Herrmann et al. [7, 14, 16, 22, 27] in their study of seismic and well data using multiscale analysis with wavelets. The results of Herrmann, Muller and others showed that the Earth's subsurface behaves like a multifractal, giving rise to a medium consisting of varying order singularities [7, 14, 16, 18, 27, 35].
Transitions no longer follow the typical zero-order discontinuity model (i.e. step function), but can have any fractional-order discontinuity, which can be obtained by fractionally differentiating/integrating a zero-order discontinuity [17, 33]. The conventional deconvolution problem (spiky decon) considers seismic transitions in the subsurface as zero-order discontinuities, and aims at recovering the seismic reflectivity as a spike train signal. However, several methods have been proposed to delineate and characterize transitions. Starting in 1972 with Harms and Tackenberg [13], followed by Payton in 1977 [28], and later Taner et al. in 1979 [32], attempts were made to characterize transitions using complex trace analysis. Complex trace analysis is based on instantaneous phase behavior, where the instantaneous phase is seen as a seismic trace attribute giving information on lithological properties. This approach is also based on the non-local Hilbert transform, which makes it noise sensitive. In this analysis, transitions are modeled as a combination of several sub-wavelength zero and/or first order discontinuities. However, this transition model is still simplistic, as it does not incorporate all the different types of transitions. We propose a mathematical model for transitions that incorporates instantaneous phase behavior through any fractional-order discontinuity [7]. Our model allows for any fractional-order transition in the subsurface, leading to a broader class of reflectivity signals. This makes our model more general, and an extension of the conventional deconvolution model. In the case where transitions are modeled by zero-order discontinuities, our model reduces to the conventional deconvolution model. Following their multiscale studies of seismic and well data, Herrmann et al. [16, 17, 20] proposed a method to characterize the sharpness of transitions using monoscale analysis.
In this method, zero-order transitions were generalized to any fractional-order. The key to this method is to fractionally integrate/differentiate the seismic signal until the disappearance/appearance of a local maximum. The amount of integration/differentiation at the point of disappearance/appearance of the local maximum gives the estimate for the fractional-order of the transition. Although relatively successful, this method lacks robustness and is highly sensitive to noise and to any phase rotation of the waveforms in the signal. Following the monoscale analysis, Herrmann [18] proposed a parameterization of seismic signals with redundant dictionaries. In this framework, the signal is written as a weighted superposition of waveforms parameterized by their fractional-order. These waveforms are taken from a suitably chosen collection, called a dictionary [2, 26]. Given a dictionary, the signal can be represented by a vector containing the weighting coefficients. Assuming sparse reflectivity, the idea is to find the sparsest set of coefficients in the chosen dictionary that represents the signal. Under certain conditions, the sparsest representation contains a few large coefficients corresponding to the different seismic events in the signal. The problem of finding these coefficients (which corresponds to solving an ill-posed problem) was attempted using Matching Pursuit [15, 18, 26] and Basis Pursuit [2, 18], two algorithms that approximate the sparsest representation. The results of Matching Pursuit and Basis Pursuit showed the ill-posedness of the problem and the difficulties in finding the sparsest representation for complicated signals in large dictionaries [18]. Even though both algorithms are able to find the sparsest representation for simple configurations, both fail when the dictionary becomes too big or the signal too complex. Motivated by the conclusions of these studies [18], we reformulate the problem into a more tractable approach.
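Fractional integration/differentiation of this kind can be illustrated with a small numerical sketch. This is not the thesis' implementation; the function name `frac_deriv` and the spectral handling of the zero-frequency bin are our own choices, based on the Fourier multiplier (iω)^α:

```python
import numpy as np

def frac_deriv(f, alpha, dt=1.0):
    """Order-alpha fractional derivative via the Fourier multiplier (i*omega)**alpha.

    Negative alpha performs fractional integration; the zero-frequency bin is
    zeroed, so the output is defined up to an additive constant."""
    n = len(f)
    omega = 2 * np.pi * np.fft.fftfreq(n, d=dt)
    mult = np.zeros(n, dtype=complex)
    nz = omega != 0
    mult[nz] = (1j * omega[nz]) ** alpha
    return np.real(np.fft.ifft(mult * np.fft.fft(f)))

# sanity check on a band-limited periodic signal
t = np.linspace(0, 2 * np.pi, 256, endpoint=False)
dt = t[1] - t[0]
f = np.sin(t)
df = frac_deriv(f, 1.0, dt)                          # whole-order derivative
half = frac_deriv(frac_deriv(f, 0.5, dt), 0.5, dt)   # two half-derivatives compose
```

Two half-order derivatives compose to the first derivative, which is the semigroup property that sweeping the fractional order relies on.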
We propose a two-step method that divides the resolution of the problem into two subproblems. In the first part, which we call detection, we find the major events in the seismic signal. In the second part, which we call estimation, we estimate their fractional-orders.

By finding the largest sets of coefficients in the approximate sparse representation of the seismic signal in a chosen detection dictionary D_d, the detection algorithm locates the major features of the seismic signal that correspond to predominant reflectors in the Earth. The detection dictionary D_d contains several parameterized waveforms that mimic the behavior of the seismic signal. The richness of the detection dictionary accounts for non-stationarity in the seismic signal. The approximate representation c of the signal s in D_d is obtained by minimizing

    \frac{1}{2}\|s - \Phi_d c\|_2^2 + \lambda_d \|c\|_1,    (1.1)

with respect to c, where c is the representation of the seismic signal s with respect to the dictionary D_d, \Phi_d is the matrix of the detection dictionary D_d, and \lambda_d is a trade-off parameter between the data misfit and the ℓ1 norm of the coefficient vector c. The columns of \Phi_d are the different waveforms in D_d after normalization. The detection step can be viewed as a spiky deconvolution problem, where the goal is to find a spike train signal containing the locations of the predominant components. The different signal components associated with the largest coefficients in the representation c constitute a partitioning of the signal s into its prevailing localized components (major reflection events). The extraction of these components is done by multiplying the signal with localized windows centered at the different component locations. The extracted signals, now localized, are analyzed separately in the estimation part. We denote by s_i the i-th signal component extracted from s.
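A minimization of the form (1.1) can be carried out by iterative soft thresholding. The following is a generic sketch on a random Gaussian dictionary, not the thesis' actual detection dictionary or code; the names (`ista`, `Phi`) and parameter choices are illustrative:

```python
import numpy as np

def ista(Phi, s, lam, n_iter=2000):
    """Iterative soft thresholding for min_c 0.5*||s - Phi c||_2^2 + lam*||c||_1."""
    L = np.linalg.norm(Phi, 2) ** 2      # Lipschitz constant of the gradient
    c = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = c - Phi.T @ (Phi @ c - s) / L                      # gradient step
        c = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return c

# toy problem: recover a 2-sparse coefficient vector in a random dictionary
rng = np.random.default_rng(0)
Phi = rng.standard_normal((64, 128))
Phi /= np.linalg.norm(Phi, axis=0)       # normalized dictionary columns
c_true = np.zeros(128)
c_true[10], c_true[90] = 2.0, -1.5
s = Phi @ c_true
c_hat = ista(Phi, s, lam=0.01)
```

With a small trade-off parameter, the two largest recovered coefficients sit at the true event locations, which is the behavior the detection step exploits.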
The estimation algorithm operates on relatively simple and localized signals s_i, assumed to be linear combinations of a limited number of waveforms. The estimation process delineates the different waveforms in each signal s_i, and estimates their fractional-orders by finding the sparsest representation of the signal in another dictionary of parameterized waveforms, denoted D_e for estimation dictionary. Each s_i is analyzed in the estimation part through the minimization of

    \frac{1}{2}\|s_i - \Phi_e c_i\|_2^2 + \lambda_e \|c_i\|_1,    (1.2)

with respect to c_i, where s_i denotes the i-th signal component of the seismic signal s, and c_i its representation with respect to the estimation dictionary D_e. Here, we aim at an exact reconstruction, which requires the parameter \lambda_e to be very small. When \lambda_e \to 0, (1.2) behaves like Basis Pursuit [2] and translates into

    \min \|c_i\|_1 \quad \text{subject to} \quad s_i = \Phi_e c_i.    (1.3)

The coefficients in the sparse representation c_i correspond to waveforms in D_e whose parameters \alpha constitute estimates for the fractional-orders of s_i. The final step collects the different waveforms and associated fractional-orders pertaining to each signal component s_i into a global vector describing the complete seismic signal s. The global vector contains the fractional-orders of all the reflection events of the total seismic signal s. Figure 1.1 shows the main steps of our detection-estimation method.

Figure 1.1: Detection-estimation method. The numbers 4.1, 4.1.4 and 4.2 refer to the sections where the particular topic is explained in detail.

Chapter 2
Background

2.1 Model for seismic transitions

Significant variations in the Earth's properties over the length-scale of the seismic wavelet create singularities in the subsurface and make seismic waves reflect. These singularities occur at transitions between layers, and are typically modeled by zero-order discontinuities (i.e. step discontinuities). However, multiscale analysis on sedimentary records (vertical profiles of the broadband fluctuation of the elastic properties) performed by Herrmann et al. [14, 16, 27] revealed the existence of accumulations of varying order singularities in the subsurface, which give rise to any fractional-order discontinuity. The different types of transitions correspond to the different materials in the Earth and to the various ways in which materials are deposited and mixed together at the transitions.

2.1.1 The seismic reflectivity

The mathematical model

Following Herrmann et al. [15, 17, 18, 21], we extend the usual zero-order transition model to any fractional-order transition, modeled by the causal and anti-causal fractional splines [33] (also called fractional onset functions):

    \chi_{\pm}^{\alpha}(x) = \frac{(\pm x)_+^{\alpha}}{\Gamma(\alpha + 1)},    (2.1)

where (x)_+ = \max(x, 0), and \Gamma is the Gamma function defined as \Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t} \, dt for x > 0. When \alpha > 1, these functions are differentiable. When \alpha is between 0 and 1, they become non-differentiable but remain continuous. For -1 < \alpha < 0, they are discontinuous (and non-differentiable). When \alpha < -1, the fractional splines cease to be locally integrable, and become tempered distributions [17]. Causal and anti-causal fractional splines are linked together by the relationship: \forall x \in \mathbb{R}, \ \chi_-^{\alpha}(x) = \chi_+^{\alpha}(-x). Mathematically, the fractional exponents \alpha specify the regularity of the fractional splines, as they coincide with their Lipschitz exponent. Each of these fractional splines [33] is Lipschitz \alpha, where the Lipschitz regularity is defined as:

Definition 2.1.1 (Lipschitz regularity [25])
• A function f is pointwise Lipschitz \alpha \ge 0 at v if there exist K > 0 and a polynomial p_v of degree m = \lfloor \alpha \rfloor such that

    \forall t \in \mathbb{R}, \quad |f(t) - p_v(t)| \le K |t - v|^{\alpha}.

• A function f is uniformly Lipschitz \alpha over [a, b] if it satisfies the above inequality for all v \in [a, b], with a constant K that is independent of v.
• The Lipschitz regularity of f at v or over [a, b] is the supremum of the \alpha such that f is Lipschitz \alpha.

Our choice of fractional splines is motivated by their regularity and scale invariance. Thus we can characterize the sharpness of a transition irrespective of the scale:

    \forall x \in \mathbb{R}, \ \forall \alpha \ge 0, \ \forall \sigma > 0, \quad \chi_{\pm}^{\alpha}(\sigma x) = \sigma^{\alpha} \chi_{\pm}^{\alpha}(x).

As explained before, each of these functions has a particular order of singularity \alpha. For \alpha > 1 at v, the function \chi_{\pm} has bounded derivatives and a singularity at v with Lipschitz exponent \alpha. However, when \alpha < 1 at v, \chi_{\pm} is not differentiable at v and \alpha characterizes the type of singularity. Examples of causal and anti-causal fractional splines are shown in Figure 2.1.

Figure 2.1: Examples of causal and anti-causal fractional splines with different fractional-orders (\alpha = 0, 0.1, 0.3, 0.5). The top two rows show causal fractional splines, and the bottom two rows show anti-causal fractional splines.

Implementation of the \chi_{\pm} basis

We construct the basis B_{\alpha}^{\pm}, made of the family of all translations of \chi_{\pm} for each \alpha. Except for \alpha = 0, the family B_{\alpha}^{\pm}, obtained by all the translations of \chi_{\pm}, gives a degenerate collection of waveforms, as the last basis function is zero everywhere. This is due to the fact that the first sample of \chi_{\pm} is zero for any \alpha except \alpha = 0. By adding 1 to all the \chi_{\pm}, we ensure all the basis functions are non-zero. Here the translations are non-circular.
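The fractional-spline definition and its scale invariance can be checked numerically. The sketch below uses the standard normalization x_+^\alpha / \Gamma(\alpha + 1) from Unser et al. [33]; the function name `chi_plus` is ours:

```python
import numpy as np
from math import gamma

def chi_plus(x, alpha):
    """Causal fractional spline x_+**alpha / Gamma(alpha + 1)."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    pos = x > 0
    out[pos] = x[pos] ** alpha / gamma(alpha + 1)
    return out

x = np.linspace(-2.0, 2.0, 401)
alpha, sigma = 0.5, 3.0
lhs = chi_plus(sigma * x, alpha)            # chi(sigma * x)
rhs = sigma ** alpha * chi_plus(x, alpha)   # sigma**alpha * chi(x)
```

For alpha = 0 the spline reduces to the unit step, the conventional zero-order transition.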
If we use circular translations (or circular permutations), we create a singularity of order 0 for all the basis functions in all the bases B_{\alpha}^{\pm}. A circular translation C_p, p \in \mathbb{Z}, of a vector v \in \mathbb{R}^N has the property that

    \forall\, 1 \le n \le N, \quad C_p v(n) = \begin{cases} v((n+p) \bmod N) & \text{if } n+p \not\equiv 0 \pmod{N}, \\ v(N) & \text{if } n+p \equiv 0 \pmod{N}. \end{cases}

Let N be the length of the vector \chi_{\alpha} and define

    B_{\alpha}^{+} = \{\chi_i^{+}\}_{1 \le i \le N}, \quad \text{such that} \quad \forall\, 1 \le n \le N, \ \chi_i^{+}(n) = \chi_+^{\alpha}(n - i + 1), \quad \text{where } \chi_+^{\alpha}(n) = 0 \ \forall n < 0.

It is easy to prove that for any \alpha \ge 0, B_{\alpha}^{+} is a basis; see the proof in Appendix B.1. A similar argument is used to show that B_{\alpha}^{-} is a basis.

Transitions and reflectivity

We denote by f the 1D vertical earth model that represents the transitions as a superposition of fractional splines:

    f(z) = \sum_i \kappa_i \chi_{\pm}^{\alpha_i}(z - z_i),    (2.2)

where z corresponds to the depth in the subsurface, and z_i to the location of the i-th transition in the subsurface. By differentiating f, given by (2.2), we obtain the seismic reflectivity r as

    r(z) = \sum_{i \in \Lambda_c} \kappa_i \chi_+^{\alpha_i - 1}(z - z_i) + \sum_{i \in \Lambda_a} \kappa_i \chi_-^{\alpha_i - 1}(z - z_i),    (2.3)

where \Lambda_c and \Lambda_a are the sets of indices for the transitions with causal and anti-causal discontinuities, respectively.

2.1.2 The seismic source function

In most geophysical surveys, and especially those using explosive sources, the seismic source function \psi is unknown, and is typically modeled by the Ricker/Mexican hat wavelet (the opposite sign of the second derivative of the Gaussian). In the rest of the paper, we will only use the name Ricker wavelet. We relax this model and discuss our assumptions on the seismic source function with regard to its smoothness and wiggliness. We first introduce some definitions. Following Unser et al.
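Equations (2.2) and (2.3) can be illustrated with a toy 1D earth model; the depths, amplitudes and fractional-orders below are arbitrary illustrative choices, and the derivative is taken numerically:

```python
import numpy as np
from math import gamma

def chi_plus(x, alpha):
    """Causal fractional spline x_+**alpha / Gamma(alpha + 1)."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    out[x > 0] = x[x > 0] ** alpha / gamma(alpha + 1)
    return out

# a toy earth model (Eq. 2.2): (depth z_i, amplitude kappa_i, order alpha_i)
events = [(2.0, 1.0, 0.4), (6.0, -0.7, 0.9)]
z = np.linspace(0.0, 10.0, 2001)
f = sum(k * chi_plus(z - zi, a) for zi, k, a in events)
# the reflectivity (Eq. 2.3) is the derivative of the model
r = np.gradient(f, z)
```

The Gamma normalization makes the derivative of \chi^{\alpha} equal to \chi^{\alpha-1} exactly, so away from the transitions the numerical derivative matches the analytic reflectivity.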
[33], a \gamma-order causal fractional B-spline \beta_+^{\gamma} is defined by taking the (\gamma+1)-th fractional difference of a one-sided power function:

    \beta_+^{\gamma}(x) = \Delta_+^{\gamma+1} \chi_+^{\gamma}(x) = \sum_{k \ge 0} (-1)^k \binom{\gamma+1}{k} \chi_+^{\gamma}(x - k),    (2.4)

where \gamma > -1, x \in \mathbb{R}, and \binom{\gamma+1}{k} is the generalization of the binomial coefficients, defined as \binom{\gamma+1}{k} = \frac{\Gamma(\gamma+2)}{\Gamma(k+1)\,\Gamma(\gamma+2-k)} [33]. The (\gamma+1)-th difference of a function g is defined by \Delta_+^{\gamma+1} g(x) = \sum_{k \ge 0} (-1)^k \binom{\gamma+1}{k} g(x - k). In the Fourier domain, (2.4) translates to

    \hat{\beta}_+^{\gamma}(\omega) = \left( \frac{1 - e^{-i\omega}}{i\omega} \right)^{\gamma+1},    (2.5)

with the frequency \omega \in \mathbb{R}. The anti-causal and symmetric fractional B-splines [33] are defined in a similar way by

    \text{anti-causal: } \hat{\beta}_-^{\gamma}(\omega) = \hat{\beta}_+^{\gamma}(-\omega), \qquad \text{symmetric: } \hat{\beta}_*^{\gamma}(\omega) = \left| \frac{\sin(\omega/2)}{\omega/2} \right|^{\gamma+1}.    (2.6)

The fractional B-splines are the extensions of the traditional B-splines to non-integer exponents \gamma. As a further extension, Blu et al. [1] defined the (\gamma, \tau)-fractional B-splines through the factor

    Q(\omega) = \frac{1 - e^{-i\omega}}{i\omega},    (2.7)

    \hat{\beta}_{\tau}^{\gamma}(\omega) = Q(\omega)^{\frac{\gamma+1}{2} + \tau} \, \overline{Q(\omega)}^{\frac{\gamma+1}{2} - \tau},    (2.8)

where \gamma is the usual fractional exponent, and \tau controls the phase of the function. Unlike the traditional B-splines, fractional splines do not have compact support, but they are of rapid decay, with the following decay property [33]:

    \beta_{\tau}^{\gamma}(x) = O(|x|^{-\gamma-2}).    (2.9)

The fractional B-splines and (\gamma, \tau)-fractional B-splines converge to the infinitely smooth Gaussian function when \gamma \to \infty. We consider the source function as the \delta-th derivative of a (\gamma, \tau)-fractional B-spline, parameterized by its fractional exponents \gamma and \delta, and its phase determined by \tau:

    \psi(t) = \partial_t^{\delta} \beta_{\tau}^{\gamma}(t) \quad \text{for } \delta < \gamma + 1,    (2.10)

where t is the time variable. The conventional Ricker wavelet model for the seismic source function \psi is a particular case of our model for \delta = 2, \tau = 0, and \gamma \to \infty. In addition to this model, we discuss two properties of the seismic source function \psi: the decay of its Fourier spectrum for frequencies going to infinity, as \psi \in C^{\gamma}(\mathbb{R}) (\psi is a \gamma-times continuously differentiable function on \mathbb{R}),

    \int_{\mathbb{R}} |\hat{\psi}(\omega)| \, |\omega|^{\gamma} \, d\omega < \infty,
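The Fourier-domain definition (2.5) is easy to evaluate directly; for \gamma = 0 its modulus reduces to the sinc-type spectrum of the unit box. A minimal sketch (the function name is ours):

```python
import numpy as np

def bspline_hat(omega, g):
    """Fourier transform of the causal fractional B-spline of order g, Eq. (2.5)."""
    omega = np.asarray(omega, dtype=complex)
    return ((1.0 - np.exp(-1j * omega)) / (1j * omega)) ** (g + 1)

w = np.linspace(0.1, 10.0, 50)
box_case = np.abs(bspline_hat(w, 0.0))   # g = 0: modulus of the unit box's spectrum
```

The value at \omega \to 0 is 1 for every order, and larger g gives the faster O(|\omega|^{-(g+1)}) high-frequency decay.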
and the differentiability of the Fourier transform of \psi at zero frequency [18]:

    \int_{\mathbb{R}} t^q \psi(t) \, dt = 0 \iff \partial_{\omega}^{q} \hat{\psi}(\omega) \big|_{\omega=0} = 0 \quad \text{for } q \in \mathbb{N}, \ q < M.

The first condition limits the high-frequency content of \psi by setting the asymptotic decay rate for high frequencies, and the second condition requires that \psi be orthogonal to some finite degree polynomial, hence describing its wiggliness [18]. This latter property defines the number of vanishing moments of \psi [25], noted q.

2.1.3 The model for the imaged seismic signal

We consider the linearized single scattering approximation, where the imaged seismic signal s can be written as

    s(z) \propto (r * \psi_z)(z).    (2.11)

Here r is the imaged travel-time depth converted reflectivity, and \psi_z is the depth converted seismic source function. For clarity, we simplify the notation and refer to the depth converted seismic source function \psi_z as \psi. Because we are only interested in the singularity properties of the seismic signal, we also omit the constants. We therefore write the seismic signal s as the convolution of the reflectivity and the seismic source function:

    s(z) = (r * \psi)(z)    (2.12a)
         = \sum_{i \in \Lambda_c} \kappa_i \chi_+^{\alpha_i - 1}(z - z_i) * \psi(z) + \sum_{i \in \Lambda_a} \kappa_i \chi_-^{\alpha_i - 1}(z - z_i) * \psi(z),    (2.12b)

where r is given by equation (2.3). The seismic signal s can also be written as a superposition of fractional derivatives/integrations of the seismic wavelet \psi, where the definition of fractional differentiation is given by Liouville [24] in his generalization of differentiation to fractional-orders:

Definition 2.1.2 (Fractional Derivatives [24]) Let f be a tempered distribution in S'(\mathbb{R}) and \chi_+^{-\alpha-1} be the causal fractional spline. The fractional derivative of f of order \alpha is defined as

    D^{\alpha} f = \chi_+^{-\alpha-1} * f,

where the convolution has to be taken in the sense of distributions.
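The convolutional model (2.12a) can be sketched for the simplest case of a spike-train (zero-order) reflectivity and a Ricker source; the spike positions and amplitudes below are illustrative:

```python
import numpy as np

def ricker(t, f0=25.0):
    """Ricker (Mexican hat) wavelet with peak frequency f0 (Hz)."""
    a = (np.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

t = np.linspace(-0.1, 0.1, 101)              # symmetric support for the wavelet
r = np.zeros(500)                            # spike-train reflectivity (zero-order case)
r[100], r[260], r[400] = 1.0, -0.6, 0.8
s = np.convolve(r, ricker(t), mode="same")   # Eq. (2.12a): s = r * psi
```

Each spike stamps a scaled copy of the wavelet at its location, so the extrema of s sit at the spike positions with the spike amplitudes.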
Under the condition that the orders \alpha_i given in equation (2.12b) are all non-integer, and that the seismic wavelet \psi is either even or odd, we can write

    \chi_+^{\alpha_i - 1}(z - z_i) * \psi(z) = C_i D^{-\alpha_i} \psi(z - z_i)    (2.13)

and

    \chi_-^{\alpha_i - 1}(z - z_i) * \psi(z) = \begin{cases} C_i D^{-\alpha_i} \psi(z_i - z) & \text{if } \psi \text{ is odd,} \\ -C_i D^{-\alpha_i} \psi(z_i - z) & \text{if } \psi \text{ is even,} \end{cases}    (2.14)

which gives a seismic signal s of the form

    s(z) = \sum_{i \in \Lambda_c} C_i D^{-\alpha_i} \psi(z - z_i) \pm \sum_{i \in \Lambda_a} C_i D^{-\alpha_i} \psi(z_i - z),    (2.15)

where the constants C_i collect the amplitudes \kappa_i and the Gamma-function normalization factors.

In Figure 2.2, we present an example of varying order transitions and their corresponding seismic signals. The seismic source function is a Ricker wavelet. Figure 2.2a shows the different transitions with fractional-orders ranging from 0 to 1, and Figure 2.2b shows the seismic signals obtained by convolving the reflectivity associated with these transitions with the Ricker wavelet. By interpolating the fractional-order of the transitions between 0 and 1, we obtain seismic signals going from the Ricker wavelet (associated with the zero-order transition) to the antiderivative of the Ricker wavelet (associated with the first-order transition).

Figure 2.2: Varying order transitions and their associated seismic signals obtained with a Ricker wavelet. (a) Transitions: \alpha between 0 and 1. (b) Seismic signals with Ricker. The transitions are set to continuously interpolate between a zero-order discontinuity (step function) and a first-order discontinuity (ramp function). One can clearly see that different order transitions create seismic signals that not only differ in amplitude, but also in waveform.
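A family of waveforms like the one in Figure 2.2b can be generated by fractionally integrating the Ricker wavelet in the Fourier domain. This is a sketch under our own discretization choices (the DC bin is zeroed), not the thesis' code:

```python
import numpy as np

def ricker(t, f0=25.0):
    a = (np.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def frac_integrate(w, alpha, dt):
    """Fractional integration by the Fourier multiplier (i*omega)**(-alpha).

    The zero-frequency bin is zeroed, so each output is zero-mean."""
    n = len(w)
    om = 2 * np.pi * np.fft.fftfreq(n, d=dt)
    mult = np.zeros(n, dtype=complex)
    nz = om != 0
    mult[nz] = (1j * om[nz]) ** (-alpha)
    return np.real(np.fft.ifft(mult * np.fft.fft(w)))

dt = 0.002
t = np.arange(-0.5, 0.5, dt)
w = ricker(t)
# orders 0 .. 1 interpolate between the Ricker wavelet and its antiderivative
family = [frac_integrate(w, a, dt) for a in (0.0, 0.25, 0.5, 0.75, 1.0)]
```

Two half-order integrations compose to a full integration, mirroring the semigroup property of D^{-\alpha}.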
2.2 Previous work

2.2.1 Complex seismic trace analysis

Beginning with Harms and Tackenberg [13] in 1972, followed by Payton in 1977 [28], and later Taner et al. [32] in 1977 and 1979, many attempts have been made to characterize transitions using complex trace analysis. Complex seismic trace analysis calculates attributes from a seismic trace. Attributes are defined as specific information characterizing seismic traces that can be computed or extracted from them.
For complex trace analysis, the attributes are the amplitude, phase, instantaneous frequency, weighted average frequency, and apparent polarity [32]. This analysis considers a seismic trace $s(t)$ as the real part of a complex signal $S(t)$ [32], where $S$ is defined as

$$S(t) = s(t) + i\,s^*(t), \quad \forall t > 0.$$

$s^*$ is called the quadrature component and is uniquely determined by the following two conditions:

1. $s^*$ is calculated from $s$ by a convolution;
2. if $s(t) = A\cos(\omega t + \theta)$ for $A, \theta \in \mathbb{R}$ and $\omega > 0$, then $s^*(t) = A\sin(\omega t + \theta)$

($s^*$ is the Hilbert transform of $s$). Considered as a complex signal, $s$ can be written as $S(t) = A(t)\,e^{i\theta(t)}$, where $s(t)$ is the real part of $S$, and $s^*(t)$ the imaginary part. From this, we refer to the amplitude $|A(t)|$ as the reflection strength, and to the phase $\theta(t)$ as the instantaneous phase [32]. The instantaneous frequency $\omega$ is computed as the derivative of the instantaneous phase:

$$\omega(t) = \frac{d\theta(t)}{dt}.$$

Using the reflection strength, the instantaneous phase and the instantaneous frequency, we deduce the weighted average frequency [32]

$$\bar{\omega}(t) = \frac{\int A(t-\tau)\,\omega(t-\tau)\,L(\tau)\,d\tau}{\int A(t-\tau)\,L(\tau)\,d\tau},$$

and the apparent polarity. The apparent polarity is defined as the sign of $s(t)$ when $A(t)$ has a local maximum [32].

Taner et al. [32] interpreted these attributes as information on geological and lithological features. For instance, they interpreted high reflection strength as "major lithological changes between adjacent rock layers, such as unconformities and boundaries associated with sharp changes in sea level or depositional environments", and showed that "phase displays are effective in showing discontinuities ... and events with different dip attitudes which interfere with each other" [32]. This analysis relies on the behavior of the instantaneous phase, seen as a seismic trace attribute giving information on lithological properties. The complex trace analysis approach is also noise sensitive, as it has the disadvantage of being based on the nonlocal Hilbert transform.
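The quadrature component $s^*$ and the attributes above can be computed discretely with an FFT-based analytic signal, a standard construction (the sketch below is our own illustration; all names are assumptions):

```python
import numpy as np

def analytic_signal(s):
    """Discrete analytic signal S = s + i s*, built in the frequency domain
    (the imaginary part is the Hilbert transform s* of s)."""
    n = s.size
    F = np.fft.fft(s)
    h = np.zeros(n)              # frequency-domain filter: keep DC, double positives
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(F * h)

t = np.linspace(0.0, 1.0, 512, endpoint=False)
s = np.cos(2.0 * np.pi * 10.0 * t + 0.3)   # A cos(w t + theta)
S = analytic_signal(s)
strength = np.abs(S)                       # reflection strength |A(t)|
phase = np.unwrap(np.angle(S))             # instantaneous phase theta(t)
```

On a pure cosine with an integer number of cycles, this reproduces condition 2 exactly: the imaginary part is the corresponding sine, and the reflection strength is the constant amplitude $A$.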
The complex-analysis transition model is also quite simplistic. In this paper, we propose a mathematical model for transitions that incorporates any instantaneous phase behavior of the form $(i\omega)^{\alpha} = |\omega|^{\alpha}\, e^{\,i\frac{\pi\alpha}{2}\operatorname{sgn}(\omega)}$ through fractional-order transitions, as fractional differentiations/integrations yield phase rotations [7].

2.2.2 Multiscale analysis

Herrmann et al. [14, 16, 21] performed a multiscale analysis of seismic data using wavelets.

Theorems and Definitions

Given a mother wavelet $\psi$, where $\psi$ is a normalized function in $L^2(\mathbb{R})$ with zero average, we can create a family of wavelets through translations and dilations of the mother wavelet:

$$\forall t \in \mathbb{R},\ \forall (a,\sigma) \in \mathbb{R} \times \mathbb{R}^+, \quad \psi_{a,\sigma}(t) = \frac{1}{\sqrt{\sigma}}\,\psi\!\left(\frac{t-a}{\sigma}\right),$$

where $a \in \mathbb{R}$ is the position of the wavelets and $\sigma > 0$ their scale. The wavelet coefficients of a function $f$ are computed by taking the inner product of $f$ with the different wavelets $\psi_{a,\sigma}$:

$$W\{f,\psi\}(a,\sigma) = \langle f, \psi_{a,\sigma}\rangle. \qquad (2.16)$$

See Appendix A for more details on wavelets.

Theorem 2.2.1 [Jaffard] [25] (Relation between the Lipschitz exponent and the decay rate of the wavelet coefficients) If $f \in L^2(\mathbb{R})$ is Lipschitz $\alpha \leq n$ at $v$ (with $n$ the number of vanishing moments of the wavelet), then there exists $A > 0$ such that

$$\forall (u,s) \in \mathbb{R} \times \mathbb{R}^+, \quad |Wf(u,s)| \leq A\, s^{\alpha + 1/2}\left(1 + \left|\frac{u - v}{s}\right|^{\alpha}\right).$$

Conversely, if $\alpha < n$ is not an integer and there exist $A > 0$ and $\alpha' < \alpha$ such that

$$\forall (u,s) \in \mathbb{R} \times \mathbb{R}^+, \quad |Wf(u,s)| \leq A\, s^{\alpha + 1/2}\left(1 + \left|\frac{u - v}{s}\right|^{\alpha'}\right),$$

then $f$ is Lipschitz $\alpha$ at $v$.

This theorem proves that the local Lipschitz regularity of $f$ at $v$ depends on the decay rate at fine scales of $|Wf(u,s)|$ in the neighborhood of $v$, and that the decay can be controlled from its local maximal values [25].

Definition 2.2.1 [Modulus Maximum and Maxima Lines] A modulus maximum is a point $(u_0, s_0)$ where $|Wf(u, s_0)|$ is maximum at $u = u_0$. This implies

$$\frac{\partial Wf(u_0, s_0)}{\partial u} = 0.$$

This local maximum should be a strict local maximum in either the right or the left neighborhood of $u_0$, to avoid having any local maxima when $|Wf(u, s_0)|$ is constant.
We define a maxima line to be any connected curve $s(u)$ in the scale-space plane $(u,s)$ along which all points are modulus maxima.

The Wavelet Transform Modulus Maxima Lines are referred to as WTMML. Singularities are detected by finding the abscissae where the wavelet modulus maxima converge at fine scales, i.e., when $s \to 0$.

Definition 2.2.2 (Self-similarity [25]) A set $S \subset \mathbb{R}^n$ is said to be self-similar if it is the union of disjoint subsets $S_1, S_2, \ldots, S_k$ that can be obtained from $S$ by a scaling, translation and rotation. This self-similarity often implies an infinite multiplication of details, which creates irregular structures.

Definition 2.2.3 (Capacity dimension [25]) The capacity dimension is a simplification of the Hausdorff dimension that is easier to compute numerically. Let $S$ be a bounded set in $\mathbb{R}^n$. We count the minimum number $N(s)$ of balls of radius $s$ required to cover $S$. If $S$ is a set of dimension $D$ with a finite length ($D=1$), surface ($D=2$) or volume ($D=3$), then

$$N(s) \sim s^{-D}, \quad \text{so} \quad D = -\lim_{s \to 0} \frac{\log N(s)}{\log s}.$$

The capacity dimension $D$ of $S$ generalizes this result and is defined by

$$D = -\liminf_{s \to 0} \frac{\log N(s)}{\log s}.$$

The Hausdorff dimension is a refined fractal measure that considers all covers of $S$ with balls of radius smaller than $s$. It is most often equal to the capacity dimension, but not always.

Definition 2.2.4 [Singularity Spectrum [25]] Let $S_\alpha$ be the set of all points $t \in \mathbb{R}$ where the pointwise Lipschitz regularity of $f$ is equal to $\alpha$. The spectrum of singularity $D(\alpha)$ of $f$ is the fractal dimension of $S_\alpha$. The support of $D(\alpha)$ is the set of $\alpha$ such that $S_\alpha$ is not empty.

The singularity spectrum gives the proportion of Lipschitz $\alpha$ singularities that appear at any scale $s$. A multifractal is said to be homogeneous if all singularities have the same Lipschitz exponent $\alpha_0$, which means that the support of $D(\alpha)$ is restricted to $\{\alpha_0\}$.
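As a concrete check of Definition 2.2.3, a box-counting sketch (our own illustration) estimates the capacity dimension of the middle-thirds Cantor set: covering it at scale $s = 3^{-k}$ takes $N(s) = 2^k$ intervals, so the estimate should equal $\log 2/\log 3 \approx 0.63$:

```python
import math

def refine(intervals):
    """One Cantor iteration: keep the left and right thirds of each interval."""
    out = []
    for a, b in intervals:
        t = (b - a) / 3.0
        out.append((a, a + t))
        out.append((b - t, b))
    return out

intervals = [(0.0, 1.0)]
k = 12
for _ in range(k):
    intervals = refine(intervals)

N = len(intervals)              # N(s) = 2^k covering intervals at scale s = 3^-k
s = 3.0 ** (-k)
D = -math.log(N) / math.log(s)  # capacity dimension estimate, log 2 / log 3
```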
Definition 2.2.5 [Partition function [25]] Let $\{u_p(s)\}_{p \in \mathbb{Z}}$ be the positions of all local maxima of $|Wf(u,s)|$ at a fixed scale $s$. The partition function $Z$ measures the sum at a power $q$ of all these wavelet modulus maxima:

$$Z(q,s) = \sum_p |Wf(u_p, s)|^q.$$

For each $q \in \mathbb{R}$, the scaling exponent $\tau(q)$ measures the asymptotic decay of $Z(q,s)$ at fine scales $s$:

$$\tau(q) = \liminf_{s \to 0} \frac{\log Z(q,s)}{\log s}.$$

This typically means that $Z(q,s) \sim s^{\tau(q)}$.

Theorem 2.2.2 [Arneodo, Bacry, Jaffard, Muzy] [25] Let $\Lambda = [\alpha_{\min}, \alpha_{\max}]$ be the support of $D(\alpha)$. Let $\psi$ be a wavelet with $n > \alpha_{\max}$ vanishing moments. If $f$ is a self-similar signal, then

$$\tau(q) = \min_{\alpha \in \Lambda}\big(q(\alpha + 1/2) - D(\alpha)\big). \quad \text{(Legendre transform)}$$

Proposition 2.2.1
- The scaling exponent $\tau(q)$ is a convex and increasing function of $q$.
- The Legendre transform is invertible if and only if $D(\alpha)$ is convex, in which case
  $$D(\alpha) = \min_{q \in \mathbb{R}}\big(q(\alpha + 1/2) - \tau(q)\big).$$
- The spectrum $D(\alpha)$ of self-similar signals is convex.

We have the relation $\frac{d\tau}{dq} = \alpha(q) + 1/2$, where the $1/2$ is due to the $L^2$ normalization of the wavelets.

Numerical calculations [25]:
1. Maxima: compute $Wf(u,s)$ and the modulus maxima at each scale $s$. Chain the wavelet maxima across scales.
2. Partition function: compute $Z(q,s) = \sum_p |Wf(u_p,s)|^q$.
3. Scaling: compute $\tau(q)$ with a linear regression of $\log_2 Z(q,s) \approx \tau(q)\log_2 s + C(q)$.
4. Spectrum: compute $D(\alpha) = \min_q\big(q(\alpha + 1/2) - \tau(q)\big)$.

Local analysis

Herrmann [14] applied the result of Theorem 2.2.1 to well data, and was able to find their local exponents $\alpha$ by estimating the slope of the wavelet coefficients in the log-log plane. In the log-log plane, the relationship between the wavelet coefficients and the local exponent $\alpha$ at position $a$ and scale $\sigma$ is

$$\log\big(|W\{f,\psi\}(a,\sigma)|\big) \leq \log(A) + \alpha \log(\sigma), \qquad (2.17)$$

where $A$ is a positive constant. From equation (2.17), the estimate for the local exponent $\alpha$ is computed via linear regression.
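The regression behind (2.17) can be sketched on synthetic data (our own illustration, not the thesis code): along a maxima line with a known power-law decay $|W| = A\,\sigma^{\alpha}$, the slope in the log-log plane recovers $\alpha$:

```python
import numpy as np

# synthetic wavelet-coefficient magnitudes with a known power-law decay
alpha_true, A = 0.7, 2.0
sigma = 2.0 ** np.arange(-8, 0)        # dyadic scales
W = A * sigma ** alpha_true            # |W{f,psi}(a, sigma)| along a maxima line

# the slope of log|W| against log(sigma) estimates the local exponent alpha
slope, intercept = np.polyfit(np.log(sigma), np.log(W), 1)
```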
In Figure 2.3 and Figure 2.4, we show the local multiscale analysis performed on a simple synthetic signal and on a sonic well-log measurement [16]. In Figure 2.3, the top plot shows the data with the different singularities, the middle plot shows the continuous wavelet transform with the WTMML superimposed, and the bottom plots show the local multiscale analysis associated to each singularity. In Figure 2.4, the top plot shows the sonic well-log, the middle plot shows the continuous wavelet transform with the WTMML superimposed, and the bottom plots present the local multiscale analysis for selected singularities. We can clearly see that the well-log contains singularities everywhere, as the WTMML converge everywhere. This shows that the Earth behaves like a multifractal. The method is appropriate for functions with isolated singularities because it is based on a local analysis. However, for functions with accumulated singularities (which is the case for seismic data), it is difficult to detect and characterize the singularities using a local analysis.

Global analysis

Global analysis yields an estimate for the singularity spectrum $f(\alpha)$ (defined in Definition 2.2.4), which gives the "rate of occurrence" [14] of a particular singularity $\alpha$ in the analyzed signal. Theorem 2.2.2 and Proposition 2.2.1 link the singularity spectrum $f(\alpha)$ to the scaling exponent $\tau(q)$ (defined in Definition 2.2.5) through the following relations:

$$\tau(q) = \min_\alpha\big(q(\alpha + 1/2) - f(\alpha)\big), \quad \text{and} \quad f(\alpha) = \min_q\big(q(\alpha + 1/2) - \tau(q)\big).$$

Figure 2.3: Regularity analysis through local analysis. The top plot shows the data with different singularities. The second plot shows the continuous wavelet transform with the location of the WTMML (Wavelet Transform Modulus Maxima Lines) superimposed. The bottom plots show the local multiscale analysis for each singularity. The slope of these different lines gives an estimate for the singularity order. Taken from [16].
Figure 2.4: Local analysis of a sonic well-log measurement. The top plot shows the sonic well-log (compressional wave speed). The middle plot shows the continuous wavelet transform with the WTMML (Wavelet Transform Modulus Maxima Lines) superimposed. The bottom plots present the local multiscale analysis for selected singularities. We can clearly see that the well-logs are singular everywhere, and therefore present multifractal characteristics. Taken from [16].

"$f(\alpha)$ and $\tau(q)$ contain information on the global integrability and differentiability of the signal $s$" [14]. The minimum exponent $\alpha_{\min}$ refers to the strongest singularity, associated to the least regular regions, whereas $\alpha_{\max}$ refers to the weakest singularity, associated to the most regular regions. Figure 2.5 [16] shows the result of the global multiscale analysis on three different well-logs associated to the compressional wave speed, the shear wave speed and the density. The top three plots present the well-logs, the bottom left plot shows the function $\tau(q)$ for all three logs, and the bottom right plot shows the singularity spectra $f(\alpha)$ for the three logs as well. Note the similarity between the three logs.

Figure 2.5: Comparison between three different well-logs. The top plot presents the compressional wave speed well-log, the second plot the shear wave well-log, and the third plot the density well-log.
The bottom left plot shows the $\tau(q)$ function for all three logs, and the bottom right plot shows the singularity spectra $f(\alpha)$. Taken from [16].

The results of Herrmann [14] after applying the global analysis to seismic data showed that the scale exponent $\alpha$ was different at every point, which is one of the characteristics of multifractals. One explanation for the multifractality of seismic data was given by Herrmann [14]: "The multifractality of the Earth's sedimentary deposits can be understood because the sedimentation process is closely linked to the hydrodynamics of the atmosphere. Hydrodynamic turbulence is one of the classical examples of a process showing evidence of multifractal behavior across a large inertial scale range. Hence the variability in the sedimentary deposits can be considered as a frozen state of turbulence."

The results of the global multiscale analysis on seismic data provide the geophysicist with information on its global scaling exponents, but not on their localization in the data. In full-band data, it is possible to estimate the local exponent $\alpha$ from $f(\alpha)$. However, because seismic data is bandwidth limited, this is more difficult. Indeed, the multiscale method needs all the scales (or frequencies) in order to derive the asymptotic results, and these are not available in bandwidth-limited data. Multiscale analysis is therefore not the best method to obtain information about the local scale exponent, albeit several attempts have been made [14, 35].

2.2.3 Monoscale analysis

Herrmann et al. [14, 16, 20] proposed a monoscale analysis method to analyze seismic data and unravel its singularities. In this method, they generalize the zero-order transitions to any fractional order. The key of the method is to fractionally integrate/differentiate the seismic signal until the disappearance/appearance of a local maximum.
The amount of integration/differentiation at the point of disappearance/appearance of the local maximum gives the estimate for the fractional-order of the transition.

The monoscale analysis method is based on a generalization of the standard continuous wavelet transform. The continuous wavelet transform of $f$, defined in equation (2.16), can be rewritten as a function of the scale $\sigma$ and the number of differentiations $M$ [16]:

$$W\{f,\psi\}(a,\sigma) = \sigma^M \frac{d^M}{da^M}(f * \phi_\sigma)(a) = \sigma^M\big(f * \phi^{(M)}_\sigma\big)(a) = \big(f * \psi^M_\sigma\big)(a), \qquad (2.18)$$

where $\phi_\sigma$ is a $2M$-differentiable, real and symmetric smoothing function with support proportional to $\sigma$, and $\psi^M_\sigma$ is the wavelet generated by dilations of $\psi^M$, defined as $\forall x \in \mathbb{R},\ \psi^M(x) = (-1)^M \phi^{(M)}(x)$ [16].

Herrmann et al. [14, 16, 20] generalized the wavelet transform to any fractional-order derivative $\beta$:

$$W\{f,\psi^\beta\}(a,\sigma) = \sigma^\beta \frac{d^\beta}{da^\beta}(f * \phi_\sigma)(a), \qquad (2.19)$$

with $\beta \in \mathbb{R}$. When $\beta$ is negative, $\frac{d^\beta}{da^\beta}$ is a fractional integration, and when $\beta$ is positive, it is a fractional differentiation [16, 20, 21]. Instead of varying the scale $\sigma$ as in the multiscale method, $\beta$ is the varying parameter, changing the fractional number of differentiations or integrations [16]. Modulus maxima lines are defined in the same way as for the standard wavelet transform (Definition 2.2.1). For fractional differentiation ($\beta > 0$), the local estimate for the fractional exponent $\alpha$ is given by the minimum value of $\beta$ for which a local maximum appears along these $\beta$-maxima lines:

$$\alpha(a,\sigma) = \inf_\beta\big\{\beta : \partial_a|W\{f,\psi^\beta\}(a,\sigma)| = 0\big\}. \qquad (2.20)$$

For fractional integration ($\beta < 0$), the estimate for $\alpha$ is given by

$$\alpha(a,\sigma) = \sup_\beta\big\{\beta : \partial_a|W\{f,\psi^\beta\}(a,\sigma)| = 0\big\}. \qquad (2.21)$$

In Figure 2.6 we present the result of the monoscale analysis method on sedimentary records [17]. The top plot shows a well data set, and the second plot the singularity orders of the well data found by the monoscale analysis.
The third plot shows an ice core measurement, and the bottom plot presents the singularity orders of the ice core measurement. The color defines the singularity order, and the size of the dots their magnitude. This method gives quite good results. However, it doesn't always maintain the lateral continuity of the exponents along the reflectors, lacks robustness, and is highly sensitive to noise and to any phase rotation of the waveforms in the signal.

Figure 2.6: Application of the monoscale method. The top plot shows the smoothed acoustic impedance, and the second plot displays the singularity orders of the impedance found by the monoscale analysis. The third plot shows an ice core measurement, and the bottom plot displays the singularity characterization of the ice core measurement. The color defines the singularity order, and the size of the dots their magnitude. Taken from [17].

2.2.4 Seismic Atomic Decomposition

Following the monoscale analysis, Herrmann proposed in 2005 [18] a parameterization of seismic signals using redundant dictionaries. In this framework, the signal is decomposed in a dictionary of waveforms parameterized by their fractional-order, and represented by its coefficients. Assuming sparse reflectivity, the idea is to find the sparsest set of coefficients that represents the signal in the chosen dictionary. The sparsest representation contains a few large coefficients pertaining to the different seismic events in the signal. This problem was tackled using Matching Pursuit [15, 18, 26] and Basis Pursuit [2, 18], two algorithms that approximate the sparsest representation.
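For reference, Matching Pursuit greedily approximates the sparsest representation: at each step it selects the atom most correlated with the current residual and removes its contribution. A minimal sketch with a toy dictionary (our own illustration, not the implementation used in [18]):

```python
import numpy as np

def matching_pursuit(s, Phi, n_iter=50):
    """Greedy Matching Pursuit: repeatedly pick the normalized atom most
    correlated with the residual and subtract its contribution."""
    r = s.astype(float).copy()
    c = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        corr = Phi.T @ r              # analysis step: correlate atoms with residual
        i = np.argmax(np.abs(corr))
        c[i] += corr[i]               # update the coefficient of the selected atom
        r -= corr[i] * Phi[:, i]      # remove its contribution from the residual
    return c, r

# toy dictionary: the identity plus a few random atoms, columns normalized
rng = np.random.default_rng(0)
Phi = np.concatenate([np.eye(8), rng.standard_normal((8, 4))], axis=1)
Phi /= np.linalg.norm(Phi, axis=0)
s = 3.0 * Phi[:, 2] - 1.5 * Phi[:, 5]   # signal made of two atoms
c, r = matching_pursuit(s, Phi)
```

Basis Pursuit replaces this greedy selection by a global $\ell^1$ minimization, as discussed in Chapter 3.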
The results show the ill-posedness of the problem and the difficulties in finding the sparsest representation for complicated signals in large dictionaries. For simple configurations, both algorithms are able to find the sparsest representation, but fail when the signal becomes too complex or the dictionary too large. A dictionary is defined in Chapter 3, along with a brief explanation of the Matching Pursuit and Basis Pursuit methods.

Chapter 3

Problem Statement

Characterizing transitions in the Earth involves finding their locations, amplitudes, and fractional exponents. In particular, given the seismic signal

$$s(z) = \sum_{i \in \Lambda} C_i\, D^{-\alpha_i}\psi(z - z_i) \pm \sum_{i \in \Lambda^c} C_i\, D^{-\alpha_i}\psi(z_i - z)$$

as given in (2.15), the goal is to estimate the locations $z_i$, the amplitudes $C_i$ and the fractional exponents $\alpha_i$. One of the methods proposed to solve this problem is atomic decomposition [18], whose results constitute the starting point of our detection-estimation method.

3.1 Dictionaries and Decomposition

In the atomic decomposition method, the aim is to uniquely decompose the whole seismic signal with respect to a dictionary and to represent it by its coefficients. We use terminology introduced by Mallat and Zhang [26]. In $\mathbb{R}^n$, a dictionary $\mathcal{D}$ is a collection of parameterized waveforms $(\varphi_\gamma)_{\gamma \in \Gamma}$, where the waveforms $\varphi_\gamma$ are discrete-time signals of finite length $n$, called atoms, and $\Gamma$ is the set of the indexing parameter $\gamma$. In the case of a frequency or Fourier dictionary, $\gamma$ is the indexing frequency, whereas for a time/scale dictionary, $\gamma$ indexes time and scale jointly. A dictionary is called complete if it contains exactly $n$ linearly independent atoms, and overcomplete if it is full rank and contains more than $n$ atoms.
Given an overcomplete dictionary $\mathcal{D}$ of normalized atoms, we consider the decomposition $c = (c_\gamma)_{\gamma \in \Gamma}$ of a signal $s$ as

$$s = \sum_{\gamma \in \Gamma} c_\gamma \varphi_\gamma, \qquad (3.1)$$

where the representation of the signal $s$ is given by the vector $c = (c_\gamma)_{\gamma \in \Gamma}$. For discrete signals, we can consider the atoms of $\mathcal{D}$ as columns of a matrix $\Phi$ and write equation (3.1) in the matrix form

$$s = \Phi c, \qquad (3.2)$$

where $\Phi$ is an $n \times m$ ($m > n$) matrix representing the dictionary $\mathcal{D}$, $s \in \mathbb{R}^n$ and $c \in \mathbb{R}^m$. We also use the adjoint of $\Phi$, denoted by $\Phi^*$. $\Phi$ and $\Phi^*$ satisfy the relation:

$$\forall x \in \mathbb{R}^m,\ \forall y \in \mathbb{R}^n, \quad \langle \Phi x, y\rangle = \langle x, \Phi^* y\rangle.$$

We adopt terminology used by Chen et al. [2]. The matrix $\Phi$ represents the synthesis operator, which consists of building up a signal $s$ by superposing atoms of $\mathcal{D}$. The matrix $\Phi^*$ represents the analysis operator, whose action is to find a vector of coefficients $c$ that correspond to the atoms in $s$. The normalization of the dictionary, leading to $\|\Phi\| = 1$, is crucial for the convergence of any minimization involving dictionary decompositions. In the rest of the paper, we will use normalized dictionaries and will refer to $\Phi$ as the matrix of $\mathcal{D}$. Finding the solution $c$ of (3.2) in an overcomplete dictionary is an underdetermined problem, and thus has an infinite number of solutions [2, 26].

3.1.1 Basis Pursuit

The nonuniqueness of $c$ enables us to choose the representation $c$ that is best suited to our problem. Even though the seismic reflectivity yielded by sedimentary records is a non-sparse signal, as it reflects everywhere, we can still treat it as a sparse signal by only considering the few large outlying reflectors. Seismic reflectivity is therefore assumed to be a sparse signal. Given sparse reflectivity, and assuming the dictionary is chosen to give a sparse representation of seismic signals, it is natural to impose sparsity on the representation $c$ and choose the sparsest $c$ that represents $s$. We define an event to be any localized waveform in $\mathbb{R}^n$.

The sparsest representation is the representation with minimum $\ell^0$ norm (i.e., the fewest number of non-zero entries), which is referred to as the solution of the $(P_0)$ problem [8]:

$$(P_0)\quad \min_c \|c\|_0 \ \text{ subject to }\ s = \Phi c. \qquad (3.3)$$

For the remainder of this paper, we will describe the sparsest solution as the solution of the $(P_0)$ problem. Solving the $(P_0)$ problem is NP-hard, because its computational complexity increases exponentially with the number of atoms in $\mathcal{D}$ [2]. Basis Pursuit (BP) [2, 8] approximates the sparsest representation by convexifying the $(P_0)$ problem into the following problem:

$$(P_1)\quad \min_c \|c\|_1 \ \text{ subject to }\ s = \Phi c. \qquad (3.4)$$

For seismic signals, the decomposition involves a dictionary of waveforms parameterized by their fractional-order $\alpha$, as opposed to a single wavelet in the spiky deconvolution problem. Examples of atoms chosen for seismic applications are fractional B-splines and fractional spline wavelets [18]. The idea is to find the few atoms in $\mathcal{D}$ that best match the different events in $s$, and to deduce the fractional-order of the events from these atoms. Herrmann [18] proposed an atomic decomposition of seismic signals with BP, and showed that BP manages to find the sparsest representation when the dictionary is small or the signal simple. However, as shown by Herrmann [18], the larger the dictionary, the more similar its atoms, so if the dictionary gets too large, BP will have difficulty distinguishing between similar waveforms and hence will not find the sparsest solution.

3.2 Uniqueness condition and equivalence between $(P_0)$ and $(P_1)$

Herrmann [18] showed empirically that for certain configurations of signals and dictionaries, the solution to the $(P_1)$ problem was the solution to the $(P_0)$ problem, which, in general, is not the case. There exist some conditions [8] under which a representation $c$ is the unique sparsest representation, and stronger conditions under which the $(P_0)$ and $(P_1)$ problems are equivalent.
To discuss these conditions, we need to introduce the following definitions.

Definition 3.2.1 (Spark [8]) Let $\Phi$ be a matrix of size $n$ by $m$, with $m > n$. The Spark of $\Phi$ is defined as the smallest possible number $\sigma$ such that there exists a subgroup of $\sigma$ columns from $\Phi$ that are linearly dependent.

Definition 3.2.2 ($\mu_{1/2}(G)$ [8]) For $G$ a symmetric matrix, $\mu_{1/2}(G)$ denotes the smallest integer $k$ such that some collection of $k$ off-diagonal magnitudes arising in a single row or column of $G$ sums to at least $\frac{1}{2}$.

Although the Spark and the rank of a matrix are in some ways similar [8], they remain two separate concepts:

Example 3.2.1 We show a full-rank matrix $B$ with Spark 2:

$$B = \begin{pmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix}.$$

Clearly $B$ is full rank, as it has 4 linearly independent columns ($\operatorname{rank}(B) = 4$), but $B$ also has two identical columns, which reduces its Spark to 2.

The Spark is an important concept that measures the similarity between any two columns of $\Phi$, which, for dictionaries, translates to the similarity between any two atoms in $\mathcal{D}$. The uniqueness condition involves the Spark $\sigma$ of $\mathcal{D}$, and is given in the following theorem:

Theorem 3.2.1 (Uniqueness [8]) Consider a dictionary $\mathcal{D}$ with matrix $\Phi$. If there exists a representation $c_0$ of $s$ such that $s = \Phi c_0$ and $\|c_0\|_0 < \sigma/2$, then $c_0$ is the sparsest solution possible, i.e., $c_0$ solves the $(P_0)$ problem.

Given a signal $s$ and a dictionary $\mathcal{D}$, Theorem 3.2.1 states the sparsity condition under which a representation $c$ is the unique sparsest representation of $s$ in $\mathcal{D}$. If a representation $c$ has strictly fewer than $\sigma/2$ non-zero entries, it is the sparsest representation. However, this theorem doesn't show whether this sparsest representation can be the solution of an $\ell^1$ minimization, which is stated in Theorem 3.2.2.
The following theorem gives the equivalence condition [8]:  Theorem 3.2.2 [8] Consider a dictionary V with matrix <& and Gram matrix G. If there exists a representation c such that s = $ C Q and ||c ||o < Pi/2(C), then c is the unique solution of 0  0  0  (Pi) and is the sparsest solution. For a given signal s and dictionary V, solving the (Pi) problem gives the unique sparsest solution, only if its solution is sparse enough in V (i.e. its £° norm is strictly less than the quantity ^ i / ( G ) ) . 2  The choice of V is hence very important as it determines the sparsity  condition under which the solution of (Pi) is also solution of (Po)- However, the condition stated in theorem 3.2.2 is necessary, but not sufficient, as solutions to (Pi) with £° norm larger than A*i/2(G) may still be solutions to (Po)- When solutions to (Pi) do not satisfy the condition of theorem 3.2.2, the situation is less clear, as such solutions may or may not be solutions to (Po). In our case and for the type of dictionaries we use, solutions to (Pi) often do not satisfy the condition of theorem 3.2.2. However, Starck et al. [30] empirically demonstrated the gap between theoretical results (based on worst-case analysis), and the empirical evidence of B P ' s performance.  The goal of our new detection-estimation method is to increase the empirical  bounds shown for Basis Pursuit.  3.2.1  Empirical studies for the performance of BP  In many theoretical and practical applications, it is necessary to know the value of the Spark and Hi/2(G) for a given dictionary V.  However, computing these quantities involves several  £° minimizations, and requires enumerating subsets of columns of  which is computationally  intractable as the algorithm complexity grows exponentially with the size of V [8]. Donoho et al. [8] proposed a numerical scheme for computing an upper bound on the Spark, by replacing the 1° minimizations by l  x  minimizations. 
However, although the scheme is now computationally tractable, its numerical results are very difficult to interpret [8]. Due to this computational barrier, the performance of BP must be evaluated through numerical experiments.

Figure 3.1: Probability of success of BP for different numbers of non-zero entries in the representation $c$. We used a fractional spline wavelet dictionary with ten different fractional-orders $\alpha$ ranging from 0.5 to 2. One can clearly see that the probability of success of BP decreases very fast with the number of non-zero entries in the coefficient vector.

We performed simulations with fractional spline wavelet dictionaries used to decompose seismic signals [18], and showed that, as soon as the representation $c$ had more than two non-zero entries, the probability of success of BP was very low. This is illustrated in Figure 3.1. These results explain the failure of Herrmann's atomic decomposition [18] for large dictionaries and complicated signals, and show that for large seismic dictionaries, the representation of the signal must be highly sparse (its representation $c$ must satisfy $\|c\|_0 \leq 2$) in order to be the unique solution of an $\ell^1$ minimization.

3.3 Detection-Estimation approach

We address the above issues through a two-step approach: we first partition the seismic signal $s$ into its major features, and then analyze each of them separately by performing their $\ell^1$ decomposition in a large dictionary of waveforms parameterized by their fractional-order. Each major feature, containing only a few events, is now a simple and localized signal that can be successfully decomposed with respect to large dictionaries. We consider dictionaries made of all translations of parameterized waveforms.
For example, a dictionary containing waveforms parameterized by two different fractional-orders $\alpha_1$ and $\alpha_2$ will be written as

$$\mathcal{D} = [\mathcal{D}_{\alpha_1}, \mathcal{D}_{\alpha_2}], \qquad (3.5)$$

where $\mathcal{D}_{\alpha_1}$ contains all the translations of a waveform parameterized by $\alpha_1$, and $\mathcal{D}_{\alpha_2}$ contains all the translations of a waveform parameterized by $\alpha_2$. In matrix form, equation (3.5) can be written as

$$\Phi = [\Phi_{\alpha_1}, \Phi_{\alpha_2}]. \qquad (3.6)$$

Due to this construction, the synthesis operator associated to $\Phi$ acts as a convolution, and the analysis operator associated to $\Phi^*$ as a correlation.

Chapter 4

Detection-Estimation Method

We now discuss our approach to the problem of delineating and characterizing transitions from seismic data. Studies of sedimentary records have shown that seismic reflectivity is made of an accumulation of varying-order singularities [14, 16, 27], leading to a non-sparse signal. However, unlike white noise, sedimentary records clearly contain distinct large outlying reflectors with strong singularities, evidenced by small fractional-orders, giving rise to a few large "spiky" events in the reflectivity. Currently, we are only interested in finding these outlying reflectors, and consider the reflectivity as a function of these large outliers. We therefore assume that the reflectivity contains a few events, and denote the seismic signal $s$. The principle of our method is to divide the original problem into two easier subproblems. We start by partitioning the signal $s$ into its major features (pertaining to the predominant reflectors in the Earth), and then proceed by analyzing and characterizing each feature separately. We name these two steps detection and estimation, respectively.

4.1 Detection

The purpose of the detection is to find the locations and characteristic scales of the major features in the signal $s$. In particular, the detection step answers two specific questions:

1. What is the partition of the signal $s$ into its prevailing components?
2.
What is the characteristic scale of each detected component?

This information is then used to build windowing functions to extract the major components from the signal $s$. The principle of the detection is to find an approximate sparse representation of the seismic signal $s$ in an appropriate dictionary, and to select the largest sets of coefficients, which correspond to the predominant features in $s$.

4.1.1 Detection dictionary

As in any matched-filter approach, it is important to choose atoms that correlate well with the events in $s$, in order to guarantee the sparsity of its representation. To satisfy this condition, we estimate the average waveform of $s$ by finding the waveform $\varphi_{\bar\alpha}$ whose Fourier spectrum best fits the Fourier spectrum of the signal $s$. Because we work with waveforms parameterized by their fractional-order (and implicitly or explicitly parameterized by their frequency), we estimate the average waveform in terms of its fractional-order $\bar\alpha$ and central frequency $\bar f$. We construct a larger detection dictionary $\mathcal{D}_d$ by selecting the estimated average waveform $\varphi_{\bar\alpha}$ and some neighboring ones. By neighboring waveforms, we refer to waveforms whose parameter $\alpha$ and frequency $f$ are close to the parameter $\bar\alpha$ and frequency $\bar f$ of the average waveform. We choose $\alpha$ and $f$ such that

$$|\alpha - \bar\alpha| \leq \eta \quad \text{and} \quad \epsilon_1 \leq \frac{f}{\bar f} \leq \epsilon_2.$$

From experiments we choose $\eta = 1$, $\epsilon_1 = \frac{1}{2}$ and $\epsilon_2 = 2$. The choices we made for $\eta$, $\epsilon_1$ and $\epsilon_2$ are not fixed, and can be modified, as long as the neighboring waveforms obtained from these changes still correlate well with the seismic signal. Adding more fractional-orders and more frequency content to the detection dictionary accounts for the non-stationary properties of the seismic signal, and ensures the sparsity of its representation with respect to $\mathcal{D}_d$ (multiple spiky deconvolutions).
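The central frequency $\bar f$ of the average waveform can be estimated, for instance, as the power-weighted spectral centroid of the signal (an illustrative estimator of our own choosing; the text only requires that the Fourier spectra be fitted):

```python
import numpy as np

def spectral_centroid(s, dt):
    """Estimate a central frequency as the power-weighted mean of the
    positive-frequency amplitude spectrum (illustrative estimator)."""
    spec = np.abs(np.fft.rfft(s)) ** 2
    freqs = np.fft.rfftfreq(s.size, d=dt)
    return np.sum(freqs * spec) / np.sum(spec)

# sanity check on a pure tone: the centroid sits at the tone's frequency
dt = 0.001
t = np.arange(0.0, 1.0, dt)
tone = np.sin(2.0 * np.pi * 30.0 * t)
f_bar = spectral_centroid(tone, dt)
```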
The detection dictionary D_d is therefore made of the selected waveforms (the average waveform and its neighbors) and all their translations, and can be seen as the concatenation of several dictionaries. For clarification, consider a detection dictionary D_d made of p selected waveforms and their translations. D_d can be written as

D_d = [D_{d₁}, ..., D_{d_p}],    (4.1)

where D_{d_i} is the detection dictionary made of the i-th waveform and all its translations. For discrete signals, D_d can be represented by its matrix Φ_d, which can also be written as a concatenation of p matrices

Φ_d = [Φ_{d₁}, ..., Φ_{d_p}],    (4.2)

where Φ_{d_i} is the matrix of D_{d_i}.

4.1.2 Decomposition in Detection

We envision an approximate sparse decomposition

s = Σ_{i=1}^{M} c_{α_i} φ_{α_i} + r^{(M)},    (4.3)

where r^{(M)} is the residual and c = (c_{α_i})_{i=1}^{M} the representation of s with respect to the detection dictionary. This approximation selects the M largest coefficients c_{α_i}, which correspond to the M main features of the signal s, and approximately represents s through a sparse solution c. Because the aim of the detection step is to extract only the major features of the signal s, it is sufficient to reconstruct s approximately and consider the above approximate decomposition. Equation (4.3) can be written in matrix form as

s = Φ_d c + r,    (4.4)

where Φ_d is the matrix of D_d, of size n × m (m > n), s and r ∈ ℝⁿ, and c ∈ ℝᵐ. Here c refers to the solution of

min_c (1/2) ‖s − Φ_d c‖₂² + λ_d ‖c‖₁,    (4.5)

where λ_d > 0 is a trade-off parameter between the ℓ¹ norm of c and the data misfit, Φ_d = [Φ_{d₁}, ..., Φ_{d_p}] and c = [c₁, ..., c_p]ᵀ. The size of the residual, and hence the number of selected atoms M in the representation c, is controlled by λ_d. The residual goes to zero as λ_d → 0, leading to an exact reconstruction. On the other hand, if λ_d → ∞, the residual gets large, with lim_{λ_d→+∞} r = s and lim_{λ_d→+∞} c = 0.
For noisy data, (4.5) becomes a denoising problem, where the value of λ_d has to be set according to the noise level. Chen et al. [2] proposed to set

λ_d = σ_n √(2 log(p)),    (4.6)

where σ_n is the noise level and p the cardinality of the dictionary. Chen et al. [2] refer to (4.5) as Basis Pursuit De-Noising (BPDN). Similarly to BP, BPDN can be rewritten as a perturbed Linear Programming problem, and is solved using the Simplex algorithm or an Interior-Point scheme [2].

4.1.3 Algorithm

Because of limitations associated with the Basis Pursuit and Matching Pursuit methods, which we discuss later, we propose to solve (4.5) using an iterative thresholding algorithm [4] on each coefficient c_i. The resulting algorithm amounts to a Block Coordinate Relaxation method [29, 30] performed on c, with a Landweber iteration with thresholding applied at each iteration step to each c_i [4]. At each iteration k we minimize the functional

(1/2) ‖s − Φ_d c‖₂² + λ_k ‖c‖₁,    (4.7)

with respect to the coefficients c, where λ_k > 0 is a decreasing parameter. We start with a large λ_k and then decrease it to some predetermined minimum value λ_min. The choice of λ_min depends on how small we want the data misfit to be, and is specified at the beginning of the algorithm. A small λ_min will give a small data misfit and a nearly exact reconstruction, whereas a large λ_min will give a large data misfit and an approximate reconstruction. For normalized signals, and to a good approximation, we choose λ_min = 10⁻⁴. At each iteration k, we solve for the vector cᵏ = [c₁ᵏ, ..., c_pᵏ] by separately solving for its different components c_iᵏ, assuming the other components c_jᵏ, j ≠ i, are fixed.
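The noise-level rule (4.6) is straightforward to evaluate; a minimal sketch (the function name is ours):

```python
import numpy as np

def bpdn_lambda(sigma_n, p):
    """Trade-off parameter of eq. (4.6): lambda_d = sigma_n * sqrt(2 log p),
    where sigma_n is the noise level and p the cardinality of the dictionary."""
    return sigma_n * np.sqrt(2.0 * np.log(p))
```

The parameter grows only logarithmically with the dictionary size, so even very redundant dictionaries do not require a dramatically larger threshold.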
The representation c_iᵏ is updated according to the rule [4, 30]

c_i^{k+1} = S_{λ_k}( c_iᵏ + Φ*_{d_i}( R_iᵏ − Φ_{d_i} c_iᵏ ) ),    (4.8)

where Φ*_{d_i} denotes the adjoint operator of Φ_{d_i}, and R_iᵏ = s − Σ_{j≠i} Φ_{d_j} c_jᵏ. S_{λ_k} is the componentwise soft thresholding operator with threshold λ_k, defined by S_{λ_k}(x) = sign(x)(|x| − λ_k)₊ for x ∈ ℝ, where y₊ = max(0, y) for y ∈ ℝ.

The choice of the iterative thresholding algorithm to solve (4.5) is motivated by the algorithm's flexibility and control over the decrease of the parameter λ_k and the choice of λ_min. The way the parameter λ_k decreases determines the accuracy of the algorithm: the slower the decrease, the more accurate the solution c. In our case, we are only interested in the major features in s, which correspond to the larger coefficients in c. For that reason, we choose an exponential decrease of the parameter λ_k, which makes the algorithm slowly capture the large coefficients at the beginning of the iterations, and then move faster over the small coefficients. The expression for the exponential decrease of λ_k is given by

λ_k = λ̄ e^{−ks},    (4.9)

where λ̄ = 1 + λ₀ and s is a predetermined speed parameter. The representation c, obtained by iteratively minimizing (4.5), is numerically not sparse, as none of its entries is strictly zero. However, most of them are within a tolerance of 10⁻⁶ ≪ λ_min, which is negligible. We now present our approach to detect the major events in s. According to the notation introduced in equation (4.2), the representation c obtained by minimizing (4.5) is a vector of length m (m = pn) that can be written as c = [c₁, ..., c_p], where each c_i ∈ ℝⁿ. In order to incorporate the contribution of each dictionary D_i in the detection process, we consider the different vectors c_i as columns of a matrix C. Each row k of C, corresponding to the location k in s, contains the coefficients of s with respect to the different dictionaries D_i at location k.
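The update (4.8) with the exponential cooling (4.9) can be sketched as follows. This is a condensed version that, for readability, updates all coefficients jointly rather than block by block, and assumes the dictionary matrix is normalized so the Landweber step is stable (‖Φ‖₂ ≤ 1); the function names are ours:

```python
import numpy as np

def soft_threshold(x, lam):
    # Componentwise soft thresholding: S_lam(x) = sign(x) * (|x| - lam)_+ .
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def iterative_thresholding(s, Phi, lam0=1.0, speed=0.05,
                           lam_min=1e-4, n_iter=200):
    """Landweber iteration with soft thresholding (cf. eq. 4.8) and an
    exponentially decreasing threshold lam_k = (1 + lam0) * exp(-k * speed)
    (cf. eq. 4.9), floored at lam_min."""
    c = np.zeros(Phi.shape[1])
    lam_bar = 1.0 + lam0
    for k in range(n_iter):
        lam_k = max(lam_bar * np.exp(-k * speed), lam_min)
        c = soft_threshold(c + Phi.T @ (s - Phi @ c), lam_k)
    return c
```

With Φ equal to the identity, each step reduces to soft thresholding of s itself, which makes the behavior of the cooling schedule easy to check.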
By taking the entry of largest magnitude in each row, we construct another vector c_T, where c_T ∈ ℝⁿ and C ∈ ℝ^{n×p}. c_T describes the best contribution of the dictionaries D_i at the different locations in s, and the largest entries in c_T correspond to the major features in s. We now select the largest entries of c_T, by only keeping those larger than a threshold μ. The choice of μ depends on the type of signals involved, and is very important, especially when dealing with noisy data. μ is chosen to be a percentage of the ℓ² norm of c_T, written as

μ = p ‖c_T‖₂,

where p is the percentage. From experimental results, we choose to set p = 0.1, but this value can be modified. For noisy data, for example, p should be set to a large enough value in order to be above the noise level. Note that for noisy data, λ_min can also be set to a larger value depending on the noise level, as stated in (4.6). We call c_T^μ the thresholded vector c_T with threshold level μ. For clarity, we omit the index T and denote the thresholded vector by c_t.

4.1.4 Windowing

The next step in the detection is the analysis of the thresholded vector c_t, and the selection of the major features in s. The representation c_t contains localized sets of nonzero entries corresponding to the main components of s, and can be written as

c_t = Σ_{i=1}^{M} c_{t_i},

where M is the number of sets in c_t, c_{t_i} is a vector of length n containing the i-th set of largest nonzero entries in c_t, and ∀i ≠ j, supp(c_{t_i}) ∩ supp(c_{t_j}) = ∅. An example of such a partition is given below:

c_t = (0, 1, 2, 0, 0, −1, −2, 0)    (4.11a)
c_{t₁} = (0, 1, 2, 0, 0, 0, 0, 0)    (4.11b)
c_{t₂} = (0, 0, 0, 0, 0, −1, −2, 0).    (4.11c)

We now define m_i as the largest entry in the modulus of c_{t_i},

m_i = max_l |c_{t_i}(l)|,    (4.12)

and denote by φ_{α_i} the atom in D that corresponds to the entry m_i. Each vector c_{t_i} describes a particular feature in s which does not overlap with the other ones, as stated by the condition ∀i ≠ j, supp(c_{t_i}) ∩ supp(c_{t_j}) = ∅.
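The row-wise selection and μ-thresholding just described can be sketched in a few lines (the function name is ours):

```python
import numpy as np

def best_contributions(C, p=0.1):
    """Collapse the coefficient matrix C (n locations x p dictionaries)
    into one vector c_T by keeping, at each location, the entry of largest
    magnitude, then hard-threshold at mu = p * ||c_T||_2."""
    idx = np.argmax(np.abs(C), axis=1)            # best dictionary per row
    c_T = C[np.arange(C.shape[0]), idx]           # signed best entries
    mu = p * np.linalg.norm(c_T)                  # threshold level
    return np.where(np.abs(c_T) > mu, c_T, 0.0)   # hard thresholding
```

Entries surviving the threshold mark the locations of the major features; the column index chosen at each surviving row identifies which sub-dictionary contributed the winning atom.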
The location l_i and frequency f_i of each atom φ_{α_i} determine the location and characteristic scale of the i-th main feature in s. In order to extract these detected features from the signal s, we adopt the methodology of multiplying the signal by localized windows centered at the locations l_i. To facilitate the discussion, we introduce the following notation. Let W_i be the window associated to the i-th selected atom φ_{α_i}, centered at l_i. Each window is chosen to be a discrete function given by

W_i(k) =
  0                                        if 0 ≤ k < l_i − (β + δ)σ_i,
  exp(−(k − l_i + βσ_i)² / (2σ_{g_i}²))    if l_i − (β + δ)σ_i ≤ k < l_i − βσ_i,
  1                                        if |k − l_i| ≤ βσ_i,
  exp(−(k − l_i − βσ_i)² / (2σ_{g_i}²))    if βσ_i < k − l_i ≤ (β + δ)σ_i,
  0                                        if l_i + (β + δ)σ_i < k ≤ n,    (4.13)

where σ_i is the standard deviation of φ_{α_i}, and σ_{g_i} is the standard deviation of the Gaussian function used for the decay of the window at the tips.

Figure 4.1: The i-th window and its associated parameters β and δ. The function plotted is multiplied with the signal to extract the i-th detected signal component. The parameters β and δ are as given in equation (4.13).

The parameters β and δ specify the width of the window. We provide an illustration of the window in Figure 4.1. We choose default values β = 2, δ = 2, leading to a window width of 8σ_i, which gives a good approximation of the width of φ_{α_i}. The window is equal to 1 over the width 2βσ_i centered at l_i, and ends at both tips with a Gaussian decay of width δσ_i. Both β and δ depend on the data and should be modified accordingly. Our current choices for these parameters come from numerical experiments, and seem well suited to our type of data. By multiplying the signal s by each window W_i, we partition s into its prevailing components s_i, where s_i ∈ ℝⁿ. The multiplication between the signal s and the window W_i is done componentwise, such that

∀ 1 ≤ k ≤ n,    s_i(k) = s(k) W_i(k).    (4.14)
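One plausible reading of the window in (4.13) — a flat top of half-width βσ_i with Gaussian tapers of width δσ_i on each side — can be sketched as follows. The decay standard deviation sigma_g below is our assumption, chosen so the taper has essentially vanished at the window edge:

```python
import numpy as np

def window(n, center, sigma, beta=2.0, delta=2.0):
    """Flat-top window with Gaussian tapers (one reading of eq. 4.13):
    equal to 1 within beta*sigma of `center`, decaying over an extra
    delta*sigma on each side, and 0 beyond."""
    k = np.arange(n, dtype=float)
    d = np.abs(k - center)
    flat = beta * sigma
    sigma_g = delta * sigma / 3.0          # assumed decay std (3-sigma taper)
    w = np.ones(n)
    taper = d > flat                       # Gaussian decay outside the flat top
    w[taper] = np.exp(-((d[taper] - flat) ** 2) / (2.0 * sigma_g ** 2))
    w[d > flat + delta * sigma] = 0.0      # hard cut at the window edge
    return w
```

Multiplying the signal componentwise by such a window, as in (4.14), extracts one detected event while suppressing its neighbors.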
Figure 4.2: The top plot shows the signal and the window applied to one detected event. The bottom plot shows the windowed event.

Each signal s_i is now a simple and localized signal, assumed to be a superposition of a small number of atoms. We show an example of a windowing in Figure 4.2: the top plot shows the signal and the window applied to one detected event, and the bottom plot shows the windowed event.

4.2 Estimation

We now treat each windowed signal s_i as a generic signal s. The estimation algorithm operates on simple and localized signals s assumed to be superpositions of a small number of waveforms parameterized by their fractional-order. The purpose of the estimation step is to unravel and approximate the fractional-orders of these waveforms through the decomposition of the localized signal with respect to a large dictionary. In particular, the estimation step answers two quantitative questions:

1. How many waveforms does s contain?

2. What is the fractional-order of each of these waveforms?

4.2.1 Decomposition in Estimation

We consider a signal s:

s = Σ_{i=1}^{K} a_{α_i} φ_{α_i},    (4.15)

where φ_{α_i} is a localized waveform parameterized by its fractional-order α_i, and K is a small integer, leading to a very sparse representation a = (a_{α_i})_{i=1}^{K}. We denote by D_K the dictionary containing the waveforms (φ_{α_i})_{i=1}^{K}. The principle of the estimation step is to find the K waveforms that make up the signal s (or an approximation of these waveforms), through the decomposition of s in a large estimation dictionary of waveforms parameterized by their fractional-order.
The signal s can be decomposed as

s = Σ_{i=1}^{N} c_{α_i} ψ_{α_i},    (4.16)

where the (ψ_{α_i})_{i=1}^{N} are waveforms taken from an overcomplete estimation dictionary D_e, and c = (c_{α_i})_{i=1}^{N} is a very sparse representation of s. Equation (4.16) can be written in matrix form:

s = Φ_e c,    (4.17)

where Φ_e is the matrix of D_e, of size n × m (m > n), s ∈ ℝⁿ and c ∈ ℝᵐ. In the ideal case, where the estimation dictionary D_e contains the waveforms (φ_{α_i})_{i=1}^{K} (i.e., D_K ⊂ D_e), equation (4.15) is identical to (4.16), and the representation a coincides with the representation c. When D_e contains a subset D_P of D_K, D_P = (φ_{α_{o_j}})_{j=1}^{P}, with P < K and o_j ∈ {1, ..., K}, both representations a and c coincide on the subset of coefficients that correspond to the waveforms in D_P. Because D_K is unknown, D_e does not in general contain the waveforms (φ_{α_i})_{i=1}^{K}. However, if a subset of waveforms in D_e converges to the waveforms in D_K, we hope to have the representation c converge to a. As stated in section 3.1, there is no unique solution c to the underdetermined problem (4.17), and we cannot guarantee that the representation c will converge to a. However, following the sparsity of a, and by imposing sparsity on the representation c, we hope to find the representation c that converges to a in the case where a subset of waveforms in D_e converges to the waveforms in D_K. The sparsity assumption on c naturally leads to solving the (P₀) problem. However, the (P₀) problem is in general computationally intractable [2, 8], so we consider the easier (P₁) problem:

min_c ‖c‖₁ subject to s = Φ_e c.    (4.18)

The result of Theorem 3.2.2, given in section 3.2, describes the sparsity condition under which the solution to (P₀) is also the solution to (P₁): if there exists a representation c₀ such that s = Φc₀ and ‖c₀‖₀ < μ_{1/2}(G), then c₀ is the unique solution of (P₁) and is the sparsest solution.
For large dictionaries, the above condition requires a high sparsity of the solution of (P₀) in order to guarantee that it is also the solution of (P₁). Theorem 3.2.2 reveals the trade-off between the size of the dictionary and the complexity of the signal. The larger the dictionary, the smaller μ_{1/2}(G), and the sparser c₀ needs to be in order to also be the solution of (P₁). On the other hand, μ_{1/2}(G) increases as the size of the dictionary decreases, which relaxes the sparsity requirement on c₀. The choice of the dictionary D_e is thus very important.

4.2.2 Dictionary in estimation

We now discuss our choice of dictionary D_e. Based on the strong sparsity of c assumed in equation (4.17), we can relax the size of the dictionary and allow for a large one. Following the argument developed in section 4.1.1, it is necessary to select atoms that correlate well with the signal s. In particular, we choose atoms that are waveforms parameterized by their fractional-order α, leading to

D_e = (φ_α)_{α ∈ A},    (4.19)

where A is the indexing set of fractional-orders taken from the singularity spectrum f(α) given by a multiscale analysis [14, 25] of the seismic signal. In the estimation, it is crucial to obtain an accurate estimate of the fractional-orders contained in the signal s we analyze. For that purpose, the estimation dictionary must contain a large number of sufficiently close fractional-orders α. There is a trade-off between the size of the dictionary and the accuracy of the estimation algorithm. When the fractional-orders in the dictionary are too far apart, we lose accuracy. On the other hand, if we choose the fractional-orders of the dictionary too close together, the dictionary becomes too large and the algorithm prohibitively slow. For our purposes, we choose the distance between two consecutive fractional-orders in the dictionary D_e to be 0.2, as it gives a good compromise between accuracy and speed.
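Such atoms can be generated numerically by applying the Fourier multiplier (iω)^α to a sampled Gaussian, one standard definition of the fractional derivative. The following is a minimal sketch under our own conventions (unit interval, amplitude normalization), not the thesis code:

```python
import numpy as np

def frac_gauss_deriv(n, sigma, alpha):
    """Order-alpha fractional derivative of a centered Gaussian, computed
    with the Fourier multiplier (i*omega)**alpha; amplitude-normalized."""
    x = np.linspace(-0.5, 0.5, n, endpoint=False)     # unit interval
    g = np.exp(-x ** 2 / (2.0 * sigma ** 2))          # centered Gaussian
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=1.0 / n)
    # Differentiate in the Fourier domain; the real part is kept because
    # (i*omega)**alpha breaks Hermitian symmetry for fractional alpha.
    phi = np.real(np.fft.ifft((1j * omega) ** alpha * np.fft.fft(g)))
    return phi / np.max(np.abs(phi))

# A small family of atoms with fractional-orders 0.2 apart, as in the text.
atoms = {round(a, 1): frac_gauss_deriv(512, 0.05, a)
         for a in np.linspace(1.0, 4.0, 16)}
```

For α = 2 this reproduces the familiar Ricker-like second derivative of the Gaussian; fractional α interpolates smoothly between the integer-order shapes.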
Figure 4.3 shows two waveforms with fractional-orders differing by 0.2. The waveforms are fractional derivatives of the Gaussian, whose fractional-orders are the fractional amounts of differentiation performed on the Gaussian. The top plot shows a waveform with fractional-order 2.2, and the bottom plot shows a waveform with fractional-order 2.4. Visually, these waveforms look similar, but our algorithm can still distinguish between them. Waveforms with closer fractional-orders will not be distinguished by the algorithm, but this is sufficient for our purposes. The same argument applies to dictionaries explicitly parameterized by their fractional-orders and frequencies.

Figure 4.3: Fractional derivatives of the Gaussian with similar fractional-orders. Here we see two almost identical waveforms with fractional-orders differing by 0.2. The top plot shows a waveform with fractional-order 2.2, and the bottom plot displays a waveform with fractional-order 2.4. We choose the estimation dictionary so that our algorithm can distinguish two waveforms with fractional-orders differing by at least 0.2. If the fractional-orders are closer than 0.2, our method will not distinguish them.

4.2.3 Resolution

We propose to solve (4.18) using the iterative thresholding algorithm [4] described in section 4.1.3. Recall that its solution c is given by

c = argmin_c ( (1/2) ‖s − Φ_e c‖₂² + λ_e ‖c‖₁ ),    (4.20)

where s = Φ_e c + r. In the estimation step, the signal s is decomposed with respect to a large dictionary.
Unlike the detection, where we consider an approximate decomposition, in the estimation the signal s is entirely decomposed, leading to an exact decomposition of s. This is very important, as we want to estimate the fractional-orders of s. We can solve (4.18) using the iterative thresholding algorithm because, when λ_e → 0, the residual r goes to zero and the solution c behaves exactly like BP applied to s. From experiments, we choose λ_min = λ_e = 10⁻¹⁰.

Although numerically not sparse, most of the coefficients of the representation c are within a tolerance of 10⁻¹⁰, which is negligible. It is then necessary to threshold c. As in the detection step, we choose the threshold level to be a percentage of the ℓ² norm of c,

μ′ = p′ ‖c‖₂,

where p′ is the percentage. From experiments, we choose p′ = 0.1. We denote the thresholded vector by c_t′. The nonzero entries of the solution c_t′ determine the different events in s. In particular, each nonzero entry in c_t′ corresponds to a particular atom φ_α in D, whose parameter α gives the fractional exponent of the corresponding event. In the case of dictionaries parameterized by fractional exponents and frequencies, the nonzero entries in c_t′ correspond to atoms φ_{(α,f)}, whose parameters α and f give the fractional-orders and frequencies of the estimated events.

4.3 Numerical scheme

We present the numerical scheme for our algorithm. We use the abbreviation BCR (Block Coordinate Relaxation) for the iterative thresholding algorithm, as our thresholding algorithm is based on the Block Coordinate Relaxation method [29, 31]. T_μ denotes the hard thresholding operator with threshold μ. The function Estimate approximates the fractional-orders of the different windowed signals s_i. The function Global gathers the estimated fractional-orders of the windowed signals into a global vector a. The vector a contains the fractional-orders of the different events in s. The algorithm is given in Figure 4.4.
• Build the detection dictionary D_d
• Build the estimation dictionary D_e
• Perform for each trace s

  DETECTION
  1. Decompose s with respect to D_d using Iterative Thresholding: c = BCR(s)
  2. Threshold c with threshold level μ: c_t = T_μ(c)
  3. Select the M sets of largest coefficients: (c_{t_i})_{i=1}^{M}
  4. Window out the detected parts in s corresponding to (c_{t_i})_{i=1}^{M}: (s_i)_{i=1}^{M}

  ESTIMATION
  — For each part s_i
    1. Decompose s_i with respect to D_e: c_i = BCR(s_i)
    2. Select the waveforms and estimate their fractional-order: â_i = Estimate(c_i)
  — end
  • Global vector of fractional-orders: a = Global(â_i)

• end

Figure 4.4: Detection-Estimation Algorithm

Chapter 5 Numerical Experiments

We now present the results of our detection-estimation method on both synthetic and real data. We consider dictionaries made with fractional derivatives of a centered Gaussian. The atoms of the dictionaries are thus parameterized by their fractional-order α and their standard deviation σ (equivalent to the scale and the frequency). The dictionaries can be written as

D = (φ_{α,σ})_{(α,σ) ∈ (ℝ⁺)²},

where α and σ are the indexing fractional-orders and standard deviations. The fractional-order α of the waveform φ_{α,σ} is given by the fractional amount of differentiation α performed on the Gaussian:

∀α > 0, ∀x ∈ ℝ,    φ_{α,σ}(x) = D^α g_σ(x),    (5.1)

where g_σ is the centered Gaussian function with standard deviation σ, defined by g_σ(x) = e^{−x²/(2σ²)} for x ∈ ℝ. Our choice of a dictionary made with fractional derivatives of the Gaussian is motivated by several factors. First, dictionaries made with fractional derivatives of the Gaussian provide flexibility over the choice of both the fractional-orders and the frequencies of the atoms.
They hence provide adaptable dictionaries and, in particular, allow for the construction of bandwidth-limited dictionaries containing atoms with very localized frequencies. Bandwidth-limited dictionaries are particularly well suited for seismic data, as seismic data is itself bandwidth limited. Second, we noticed that fractional derivatives of the Gaussian show good correlation with seismic data, both in the time and frequency domains. In particular, their Fourier spectrum fits quite well the Fourier spectrum of seismic data. We show an example of the Fourier fitting in Figure 5.1.

Figure 5.1: Fourier spectrum fit: the top plot shows the average waveform in the data, found by fitting the Fourier spectrum of the data with the Fourier spectrum of a fractional derivative of the Gaussian. The bottom plot shows the Fourier transform of the data (in red) and of the average waveform (in blue).

The results displayed in Figure 5.1 were obtained on real seismic data gratefully received from ChevronTexaco. The top plot shows the average waveform of the data, whereas the bottom plot displays the superposition of the Fourier transform of the average waveform and the Fourier transform of the first trace of the data (see Appendix D).

5.1 Application to synthetic data

5.1.1 Simple 1D synthetic data

We start by presenting the results of our detection-estimation algorithm on simple synthetic signals. We consider synthetic signals s as superpositions of fractional derivatives of the Gaussian:

s = Σ_j Σ_i c_k φ_{α_i, σ_j},    where k = i + (j − 1) N_a,

and the waveforms φ_{α_i, σ_j} are taken from an overcomplete synthesis dictionary D_s = (φ_{α,σ})_{(α,σ) ∈ (ℝ⁺)²}. The synthesis dictionary refers to the dictionary we use to synthesize the signal s.
We apply our detection-estimation algorithm to two different synthetic signals: one with 20 events (N = 20), and one with 30 events (N = 30). We choose a synthesis dictionary D_s with 30 different fractional-orders α ranging from 1 to 4, and 25 standard deviations between 0.007 and 0.013. We choose λ_min = 10⁻³ for the iterative thresholding algorithm of the detection part, and λ_min = 10⁻⁹ for the iterative thresholding algorithm of the estimation part. The detection decomposition is performed with a dictionary of size 3, and the estimation decomposition with a dictionary of size 48: the estimation dictionary contains 16 different fractional-orders between 1 and 4, and 3 different standard deviations. The frequencies of the waveforms contained in the estimation dictionary can be chosen close to the frequency of the windowed event, which makes the estimation dictionary scale- or frequency-adaptable. This property of our algorithm enables us to reduce the number of standard deviations, and hence the size of the estimation dictionary. This reduction increases the speed of the estimation decompositions performed on each windowed event. We use this adaptability property in the analysis of the two synthetic signals. The results obtained with our detection-estimation algorithm are shown in Figures 5.2 and 5.3. Figure 5.2a shows the signal with 20 events, Figure 5.2b presents the original fractional exponents at their locations in the signal, and Figure 5.2c shows the fractional exponents found by the algorithm. In this simple example, the locations and fractional-orders of the different events in s are correctly found. Figure 5.3a shows the synthetic signal s containing 30 events, Figure 5.3b shows the original fractional exponents α at their locations in the signal, and Figure 5.3c shows the fractional exponents found by the algorithm. All the events in s are found and have the correct locations, and most of the fractional-orders are correct.
However, some of the fractional-orders found by the algorithm are far from the original ones. We attribute these few errors in the estimated fractional-orders to an erroneous windowing. An incorrect windowing will not capture the whole event and therefore not extract it entirely, providing the estimation algorithm with the wrong waveform. A poor windowing is more likely to happen when the signal s contains closely-spaced events relative to the scale of the seismic wavelet.

Figure 5.2: Analysis of a synthetic signal with 20 events.

5.1.2 Comparison with Basis Pursuit and Matching Pursuit

In this section, we compare the performance of our detection-estimation algorithm with the performance of Basis Pursuit (BP) and Matching Pursuit (MP). We consider a signal s with 25 events (N = 25) and synthesize it with a synthesis dictionary D_s made with 30 fractional-orders α between 1 and 4, and 17 standard deviations between 0.009 and 0.011. We set the parameters of our algorithm to λ_min = 10⁻³ for the iterative thresholding algorithm of the detection part, and λ_min = 10⁻⁹ for the iterative thresholding algorithm of the estimation part. We choose a detection dictionary of size 3 and an estimation dictionary of size 54. In this experiment we do not use the adaptability property of the algorithm, in order to have a fixed estimation dictionary for all windowed events. In that way, we can use the estimation dictionary as the analysis dictionary for BP and MP.
Figure 5.3: Analysis of a synthetic signal with 30 events.

Both BP and MP decompose the signal s in an analysis dictionary D_a, and try to approximate the sparsest representation c such that s = Φ_a c, where Φ_a is the matrix of the dictionary D_a. We compare the results of our algorithm with the results of BP and MP, and use the estimation dictionary of our method as the analysis dictionary D_a for BP and MP. Each algorithm gives a representation of the signal s in the dictionary Φ_a. For simple signals and small dictionaries, MP, BP and our algorithm (which we call SEEDE, for Scale Exponent Estimation) give the same representation. However, when the signal complexity is high and the dictionary large, each algorithm provides a different representation of the same signal s. The results of our comparison are shown in Figure 5.4. The top plot shows the original fractional exponents at their locations in the signal s, the second plot shows the fractional exponents found by our algorithm SEEDE, the third plot shows the fractional exponents found by BP, and the bottom plot shows the fractional exponents found by MP.

Figure 5.4: Comparison between the detection-estimation algorithm, BP and MP.

We can see that for this simple signal s, BP and MP globally find the correct events at the correct locations, but find other events as well. BP still finds more events than MP because of its super-resolution property [2].
However, neither BP nor MP finds the events located at abscissa 300, because too many events are closely spaced there. For each event in the original vector, BP and MP also find multiple events with different fractional-orders. BP is nevertheless better than MP, because for most of the events in the original vector, the correct fractional-order is often among the multiple fractional-orders found by BP. Unlike BP and MP, our method does not resolve closely-spaced events, and its resolution is lower than that of BP and MP. However, for each recovered event, our method finds only one fractional-order. We can see that most of the fractional-orders recovered by our method are correct. The incorrect fractional-orders are those belonging to events influenced by neighboring events. This is due to the windowing, which extracts a part of the signal and prevents the estimation analysis from "seeing" the complete signal. Information is therefore "cut" by the windowing. As long as the events are not too closely spaced, and no matter how many events the signal contains, our method will find and correctly estimate all the fractional-orders. However, as soon as the events start to interfere with each other, our method starts to have difficulties finding the correct fractional-orders, and eventually the events themselves. One great advantage of our method compared to MP and BP is that it gives only one estimated fractional-order for each event found. We feel that further work is required to obtain an optimal windowing and improve the resolution of our algorithm.

The numerical complexities of BP, MP and our method SEEDE are quite different. Table 5.1 displays the CPU times in seconds spent running the different decomposition techniques on signals of various lengths. All computations were done on a Macintosh G5 workstation. MP is the fastest decomposition.
MP has a quasi-linear complexity, depending on the number of chosen atoms [2, 26]. BP, on the other hand, is very slow. The complexity of BP depends on the complexity of the conjugate gradient solver and of the primal-dual logarithmic barrier interior-point algorithm. The complexity of BP is also affected by the size of the dictionary and the complexity of the signal.

CPU running time in seconds (problem size: 512)

Dictionary Size |       BP |      MP |   SEEDE
             18 | 22630.68 |  7.1450 | 10.4283
             30 | 66605.47 | 11.0123 | 33.4668
             42 |  5902.68 |  9.4068 | 56.5617
             54 | 44589.83 | 12.4648 | 71.5451

Table 5.1: CPU running times of BP, MP and our method SEEDE.

The numerical complexity of BP can nonetheless be modified by using different parameter settings. Our algorithm strikes a compromise between BP and MP. One should note that all our experiments were done in Matlab. The speed of these algorithms can therefore be largely increased by using C++ or Fortran.

5.1.3 Realistic synthetic data

We now present our results on more realistic synthetic data. Seismic data is usually stored in seismic cubes or seismic matrices, where each column is a seismic trace. We use synthetic data of size 512 × 100. Figures 5.5, 5.6, 5.7 and 5.8 present the results of our algorithm on a synthetic slice (or matrix) with 16 events. Figure 5.5 displays the synthetic slice, Figure 5.6 shows the detected reflectors, Figure 5.7 presents the original fractional-orders at their locations in the different traces, and Figure 5.8 shows the fractional-orders found by our algorithm. We used a synthesis dictionary made with 20 different fractional-orders ranging from 1 to 4, and 3 standard deviations (scales) between 0.005 and 0.04. In the algorithm, we used a detection dictionary of size 2 (2 standard deviations) made with the average waveform estimated from the synthetic data. We chose the adaptive estimation dictionary.
In this example, the events are quite well separated, so the algorithm can detect them.

Figure 5.5: Synthetic slice with 16 reflectors.

Figure 5.6: Detected reflectors.

Figure 5.7: Original fractional-orders.

Figure 5.8: Recovered fractional-orders.

This example shows that if the events are well separated, the algorithm detects all of them and correctly estimates most of them. The lateral continuity of the recovered fractional-orders along the different reflectors shown in Figure 5.8 is overall quite well preserved. This property is very important for seismic data, as fractional-orders should remain laterally continuous along reflectors. However, although most of the recovered fractional-orders are correct and laterally continuous along the reflectors, we can clearly see that three distinct reflectors have fluctuating fractional-orders. The fractional-orders oscillate between the correct value and the maximum fractional-order in the dictionary. This phenomenon is due to an incorrect estimation of the frequency in the detection part. The frequency of the analyzed event is indeed very important for a correct estimation in the estimation part. If the estimated frequency of the atom is incorrect, the estimation will be unable to find the correct fractional-order, as the largest inner product between the event and the atoms in the estimation dictionary will not correspond to the atom with the correct fractional-order.
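For a single windowed event, the estimation step reduces to picking the dictionary atom with the largest inner product. The following is a minimal sketch of that selection, assuming unit-norm atoms; the Gaussian-derivative atoms below are illustrative stand-ins for the fractional-order waveforms of the actual method, and all names are hypothetical:

```python
import numpy as np

def make_atoms(n, orders, sigma=0.03):
    """Build unit-norm atoms: successive derivatives of a Gaussian bump.
    Integer derivative orders stand in for the fractional-orders here."""
    t = np.linspace(-0.5, 0.5, n)
    g = np.exp(-t**2 / (2 * sigma**2))
    atoms = []
    for k in orders:
        a = g.copy()
        for _ in range(k):              # k-fold numerical differentiation
            a = np.gradient(a, t)
        atoms.append(a / np.linalg.norm(a))
    return np.array(atoms)              # shape (len(orders), n)

def estimate_order(event, atoms, orders):
    """Return the order whose atom has the largest |inner product| with the event."""
    scores = np.abs(atoms @ event)
    return orders[int(np.argmax(scores))]
```

With an event synthesized from one of the atoms, `estimate_order` returns that atom's order; an interfering neighbor inside the window biases the inner products, which is exactly the failure mode discussed above.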
In this example, we clearly see that the detection sometimes outputs an incorrect frequency. The reason for this is not well understood at present, and we feel that further work is required to fully understand the whole algorithm.

In Figures 5.9, 5.10, 5.11 and 5.12, we show the results of our algorithm on a different synthetic slice, containing 24 events. Figure 5.9 shows the synthetic data, Figure 5.10 shows the detected reflectors, Figure 5.11 displays the original fractional-orders of the different reflectors, and Figure 5.12 presents the fractional-orders found by our algorithm. As in the previous example, we used a synthesis dictionary made with 20 fractional-orders between 1 and 4, and 3 standard deviations between 0.005 and 0.04. In the algorithm, we used a detection dictionary of size 3 containing 3 different standard deviations, made with the average waveform. We chose the adaptive estimation dictionary.

Figure 5.9: Synthetic slice with 24 reflectors.

Figure 5.10: Detected reflectors.

In this example, the events start interfering with each other, although they are not extremely close. One can clearly notice that the second event was not found by the algorithm, because it was too close to the first one. The algorithm lacks the super-resolution property of BP and does not resolve
However, due  to numerical complexities, the current implementation of the algorithm does not yet resolve closely-spaced events. Incorporating the super-resolution property in our implementation of our algorithm is still ongoing research. Oscillating fractional-orders along certain reflectors can also clearly be seen in this example. One of the solution would be to increase the size of the detection dictionary and incorporate many frequencies and possibly different fractional-orders. However,the use of large detection dictionaries leads to denser coefficients that are hard to analyze. Despite these flaws, we can see that most of the fractional-orders found by our algorithm have the correct values, and remain overall laterally continuous along the different reflectors. Further work is still required to precisely understand the role of the different parameters involved in the algorithm. The algorithm can be largely improved by the adequate choice of dictionaries, minimization parameters and efficient minimization solvers. The choice of the iterative thresholding algorithm is a first step in the implementation of our method, and will be replaced by more efficient solvers in the future.  60  Original fractional orders  50  h  100h  150r  200 r  Q. CD  250  r  Q 300  F  350  F  400  1  450 •  500 1  °  20  30  40  50  60  70  80  90  100  X Locations Figure 5.11: Original fractional-orders.  Fractional orders found  50 •  100 •  150 •  200 •  CL 250 F <D Q 300 •  350 •  400 •  450  500 F10  20  30  40  50  60  70  80  90  X Locations Figure 5.12: Recovered fractional-orders.  61  100  5.2  Application to real data  We now present the results of our algorithm on real seismic data. We used a detection dictionary of size 1 made with the average waveform estimated through a Fourier spectrum fitting. The Fourier spectrum fitting was shown in Figure 5.1. 
We chose a detection dictionary of size 1 made with the average waveform in order to have a sparser and clearer detection coefficient c. As stated in section 5.1.3, the larger the detection dictionary, the denser the output coefficient of the detection minimization, which makes it hard to detect the events and therefore defeats the purpose of the detection. In theory, and for very clean data, the larger the dictionary the better, as larger dictionaries give more information than smaller ones. However, in practice and for real data, the output of the detection algorithm is a noisy coefficient that is difficult to analyze. We noticed from experiments on real data that the efficiency of the detection algorithm is increased by using a smaller dictionary. The optimal performance of the detection algorithm was obtained with a detection dictionary of size 1. We used an estimation dictionary of size 54, with 18 fractional-orders and 3 standard deviations. The fractional-orders and standard deviations of the estimation dictionary are chosen according to the fractional-order and standard deviation of the average waveform. In that sense, both the detection and estimation dictionaries are adapted to the data. In the detection minimization we used λ_min = 10^-4, and λ_min = 10^-10 for the estimation minimization.

We present the results of our algorithm in Figures 5.13, 5.14 and 5.15.

Figure 5.13: Real data.

Figure 5.13 shows the first slice of a real seismic data cube, Figure 5.14 presents the detected reflectors, and Figure 5.15 displays the results of our algorithm on the first slice of the real seismic data. In Figure 5.15, we can clearly distinguish most of the different reflectors contained in the data, which shows a good performance of the detection algorithm. Some reflectors remain hard to discern, in particular at the bottom of the seismic slice, which is also the case in the real data shown in Figure 5.13.
Overall, the results of our algorithm are coherent with the real data, and the detected reflectors follow the reflectors present in the real data. The lateral continuity of the fractional-orders along the different reflectors is quite well preserved for certain reflectors that are isolated and easy to detect. For a general setting of reflectors, we can notice some fluctuations in the fractional-orders, which can be due to a wrong estimation of the frequency of the detected event (the phenomenon mentioned earlier), and/or to the choice of fractional-orders in the estimation dictionary. The changes in the fractional-orders can also be due to actual changes in the nature of the transitions in the subsurface.

Figure 5.14: Detected reflectors.

Figure 5.15: Recovered fractional-orders.

Despite limitations in our current implementation of the method, we feel that our preliminary results are encouraging with respect to detecting the reflectors and estimating their fractional-orders. Figure 5.16 presents the superposition of the real data with the fractional-orders recovered by our algorithm.

Figure 5.16: The real data is presented in grey scale and the recovered fractional-orders are displayed in colors.

Chapter 6 Discussion

6.1 Summary and Conclusions

In this paper, we have presented a new method for characterizing and delineating transitions in the subsurface. Our method is based on optimal decompositions of the data [2], with respect to overcomplete dictionaries of waveforms parameterized by their fractional-orders. We propose a two-step approach that divides the initial ℓ¹ decomposition problem into a detection step followed by an estimation step.
The detection part locates the predominant events in the data, and partitions the data into its prevailing components. The estimation part approximates the fractional-orders of each detected component. The detection and estimation algorithms both exploit the sparsity of the representation of the data with respect to the dictionaries, by minimizing the ℓ¹ norm of the data representation. We use an iterative thresholding algorithm [4] with a Block Coordinate Relaxation method [29, 30] to solve the minimization. We show that by first partitioning the data, and then analyzing each detected event separately, we improve on the performance of BP with respect to finding the original coefficient. The probability of success in finding the original representation of the data with respect to a given dictionary D is increased by our new technique. In particular, the probability of detecting the events in the data with our new method is quite large. We show that our method achieves quite good performance in detecting events, provided the events are not too closely spaced. The method also gives encouraging results with respect to the estimation of the fractional-orders of the events, and provides a higher probability of correctly detecting and estimating the fractional-orders of the events than BP.

Unlike BP, where the use of large dictionaries is prohibitive, our new method can give good and accurate results with fairly large dictionaries. Besides, the computational complexity of our method can be reduced by changing the speed parameters of both the detection and estimation decomposition algorithms. However, the complexity of our algorithm is still significant, and increases with the size of the dictionaries. In comparison to BP, our method achieves lower computational cost, but is computationally more expensive than MP. In that respect, our method offers a compromise between MP and BP.
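Both minimization steps share the same ℓ¹-penalized form. The following is a minimal sketch of the iterative soft-thresholding idea behind such solvers (the actual implementation additionally uses Block Coordinate Relaxation [29, 30]; the dictionary and parameters here are illustrative, not those of the thesis):

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(Phi, s, lam, n_iter=200):
    """Minimize 0.5*||s - Phi c||_2^2 + lam*||c||_1 by iterated soft-thresholding."""
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2   # step <= 1/L guarantees descent
    c = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        # gradient step on the data misfit, then shrinkage on the l1 penalty
        c = soft(c + step * Phi.T @ (s - Phi @ c), step * lam)
    return c
```

For small λ the iterates converge to a sparse coefficient vector approximately fitting s = Φc; raising λ increases sparsity at the cost of data misfit.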
In order to improve the resolution of our method, we proposed to replace the iterative thresholding algorithm of the estimation step by BP. However, incorporating BP into the estimation part made it prohibitively expensive when dealing with large dictionaries. We therefore feel that further work is still required to achieve super-resolution of thin layers with our new method. We propose a fast and efficient detection algorithm by using a small dictionary. The goal of the detection algorithm is to both detect the major events in the data and estimate their frequency through an approximate ℓ¹ decomposition in a small dictionary. The estimated frequency of the events is indeed very important, as it constitutes the basis of the frequency content of the estimation dictionary. In the case where the estimation dictionary contains the wrong frequencies, the largest inner product between the atoms in the dictionary and the analyzed event will pertain to an atom with an incorrect fractional-order. It is therefore crucial that the detection algorithm gives a good approximation of the correct frequency of the events. There is a trade-off between the optimal performance of the detection, which requires a small detection dictionary (especially for real and noisy data), and a good approximation of the frequency of the detected events, which requires a larger detection dictionary with a wide frequency content. Contrary to the detection algorithm, the estimation algorithm uses very large dictionaries with a wide range of fractional-orders, which provides accuracy in the estimated fractional-orders. Unlike the detection decomposition, the estimation decomposition, due to the size of the dictionaries involved, is a slow and computationally intensive algorithm.
However, there are many possible ways to increase the speed of the estimation part, including changing the speed parameters in the iterative thresholding algorithm, and reducing the size of the estimation dictionary by adapting its frequency content to the frequency of the analyzed event. The estimation dictionary can therefore be adapted to the event by narrowing its frequency content and selecting only a few frequencies around the frequency estimated in the detection. However, in that case, if the detection provides an incorrect estimated frequency, the estimation will be unable to find the correct fractional-order. One should note that incorporating a very large frequency content in the estimation dictionary, in addition to its wide range of fractional-orders, leads to an extremely slow and computationally intensive algorithm. The experiments performed on both synthetic and real data nonetheless show promising outcomes. The performance of our algorithm on synthetic data gives very encouraging results, as the lateral continuity along the different reflectors is quite well preserved. The results on real data, although lacking some lateral continuity along reflectors, also seem quite promising.

6.2 Limitations of our method

The method suffers from a limitation in the resolution of two closely-spaced events in the data. At the current stage, the estimation algorithm does not resolve two closely-spaced events. The method therefore loses the super-resolution property of BP, but gains lower computational costs. Another limitation is the large dependence of the estimation algorithm on the detection algorithm. A correct approximation of the frequencies of the detected events is a very important element for the estimation algorithm, since it constitutes the baseline for the frequency content of the estimation dictionary.
Besides, there is a large trade-off between the size of the detection dictionary, which should contain a wide range of frequencies to ensure a correct approximation of the frequency of the detected events, and the efficiency of the detection algorithm. The wider the frequency content of the detection dictionary, the more accurate the frequency of the detected events, but the less efficient the detection of the events, in particular for real and noisy data. The analysis of the coefficients obtained in the detection algorithm therefore needs to be improved in order to correctly detect the events in large dictionaries with wide frequency content. Although our method achieves good performance with fairly large dictionaries, its efficiency is still limited by the size of the dictionaries, due to computational costs. The estimation dictionary cannot incorporate a wide range of frequencies, as its size would lead to a prohibitively slow algorithm. The windowing of the detected events is also a very sensitive part that plays an important role in the estimation algorithm, as it determines the efficiency of the estimation. A wrong windowing can lead to the extraction of an incorrect event, and therefore to an incorrect fractional-order. If the window is too small, the event will be "cut" and not extracted entirely. On the other hand, if the window is too large, the extracted event will contain remnants of other events. In those cases, depending on how much the window "cut" the event, or on how much it added to the event, the estimated fractional-order will be correct or incorrect. Further work is still required to build an optimal windowing. Even with these limitations, our method still performs quite well in the detection of the events, and relatively well in the estimation of their fractional-orders.
In particular, the results obtained on real data showed that most of the reflectors were detected by our algorithm, and that the lateral continuity of the fractional-orders along the reflectors was relatively well preserved, taking into account that actual changes in the fractional-orders of the reflectors also exist in the data. The goal of our algorithm is to detect these changes, and we feel that our algorithm performed quite well with respect to the smooth changes in the fractional-orders of the reflectors in the data.

6.3 Future work

In this paper, we developed a new method that demonstrated encouraging results, in particular in the detection part. It also showed that, no matter how many events were in the data, as long as they were not too closely spaced, our algorithm would detect them and estimate them quite accurately.

In the future, it would be beneficial to improve the algorithm by making it faster and more efficient. The possible improvements are listed and explained below:

1. The current method uses an iterative thresholding algorithm with a Block Coordinate Relaxation method to solve an ℓ¹ minimization, which enables us to use relatively large dictionaries. However, iterative minimizations lead to slow algorithms and therefore limit their performance. The numerical complexity of our algorithm, although inferior to that of BP, is still significant. It would therefore be good to increase the speed of the iterative thresholding solver, and to incorporate, in the future, more efficient ℓ¹ solvers involving global optimization tools.

2. Working with large datasets and large dictionaries involves numerous and costly computations, and therefore requires powerful programming languages. Our algorithm, currently written in Matlab, would gain in speed if it were programmed in another programming language like C++ or Fortran.
In the future, it will be necessary to convert our Matlab routines into one of these programming languages.

3. Resolving thin layers in the subsurface is also very important, and constitutes a major issue when analyzing seismic data. In order to achieve super-resolution, it would be interesting to incorporate BP in our algorithm, but in a way that would not lead to a prohibitively slow algorithm. BP is a very powerful solver with respect to its multiresolution property, and we feel that it could help our algorithm resolve closely-spaced events.

4. The efficiency of our method highly depends on the choice of the dictionaries, in particular on the choice of their fractional-orders and frequencies. Many questions therefore arise:

(a) What is the range of fractional-orders and frequencies?

(b) How finely sampled should the fractional-orders and frequencies be?

(c) Can we reduce the range of fractional-orders to [0,1] and build any other fractional-order α > 1 through a linear combination of fractional-orders in [0,1]?

5. Global multiscale analysis with wavelets performed on seismic data (see section 2.2.2) yields an estimate of the singularity spectrum f(α) of the data. The singularity spectrum f(α) gives the probability of occurrence of a particular singularity in the data, and therefore provides information on the different fractional-orders contained in the data. It would be interesting to use this information as a weighting function for the coefficients obtained in the ℓ¹ minimizations in the algorithm. The ℓ¹ norm of the decomposition of the data with respect to a dictionary associated with a rare singularity should be highly sparse, and therefore should be heavily penalized in the minimizations. On the other hand, the ℓ¹ norm of the decomposition of the data with respect to a dictionary associated with a frequent singularity should be less sparse, and therefore should not be penalized in the minimizations.
The weighting function w should behave like

    w(α) ~ 1 / f(α),

where f(α) is the singularity spectrum of the data. For a signal s of length n, and a dictionary D of cardinality p, the minimization should be:

    min_c ||s - Φc||₂² + λ||wc||₁,    (6.1)

where w is the weighting vector constructed according to the fractional-orders contained in the dictionary Φ, c ∈ R^m with m = np, and wc denotes the componentwise multiplication: for all 1 ≤ k ≤ m, (wc)(k) = w(k)c(k). If we write the dictionary D as:

    Φ = [Φ_{α₁}, ..., Φ_{α_p}],    (6.2)

w can be obtained by the following computation:

    w = [ (1/f(α₁)) 1_n, ..., (1/f(α_p)) 1_n ],

where 1_n = [1, ..., 1] is a vector of ones of length n, and w ∈ R^m with m = np.

6. Because the estimation of the frequencies of the detected events is very important, it would be interesting to include in the algorithm another estimation decomposition that would estimate the frequency of each detected event. The global algorithm would then consist of a detection, a first estimation for an approximation of the frequency of each detected event, and a second estimation directed at finding the fractional-orders of each detected event. The first estimation dictionary would contain a wide range of frequencies and a few fractional-orders, whereas the second estimation dictionary would contain a wide range of fractional-orders and a few frequencies centered around the estimated frequency. The proposed algorithm aims at improving the estimation of the fractional-orders, but at a higher computational cost. Performing two estimations with two large dictionaries can lead to a prohibitively slow algorithm with a high numerical complexity. Further work is therefore required to lower its computational cost. Figure 6.1 presents the modified algorithm.

Figure 6.1: Improved detection-estimation method.
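The weighting vector described above (w behaving like 1/f(α), replicated across the n translates of each fractional-order block of the dictionary) is a one-line construction. A sketch, with made-up singularity-spectrum values for illustration:

```python
import numpy as np

def build_weights(f_alpha, n):
    """Weighting vector w in R^(n*p): entry k gets 1/f(alpha_j), where j = k // n,
    matching the block structure Phi = [Phi_a1, ..., Phi_ap]."""
    f_alpha = np.asarray(f_alpha, dtype=float)
    return np.kron(1.0 / f_alpha, np.ones(n))

# Rare singularities (small f) are penalized heavily, frequent ones lightly:
f_alpha = [0.1, 0.5, 1.0]          # hypothetical spectrum values f(alpha_j)
w = build_weights(f_alpha, n=4)    # length n*p = 12
```

The weighted penalty in (6.1) is then λ||w ⊙ c||₁, i.e. `lam * np.sum(w * np.abs(c))` in the minimization.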
Part II

Capita Selecta

Chapter 7 Atomic decompositions with Redundant Dictionaries

7.1 Problem formulation

As stated in chapter 2 by equation (2.15), we model seismic signals s as

    s(z) = Σ_{i∈Λ} c_i D^{-α_i} ψ(z - z_i),    (7.1)

where z_i is the location of the i-th transition in the subsurface, c_i its reflection coefficient (defined in chapter 2), ψ is the seismic source function, and D^{-α_i} is the Liouville fractional derivative operator of order -α_i defined in Definition 2.1.2. The goal is to estimate the locations z_i of the reflectors in the subsurface, their amplitudes c_i, and their fractional exponents α_i. For that, we consider seismic signals s as a superposition of waveforms φ_α parameterized by their fractional-order α:

    s = Σ_α c_α φ_α,    (7.2)

and envision their decomposition:

    s = Σ_α c_α ψ_α,    (7.3)

with respect to an overcomplete dictionary D = (ψ_α)_{α∈S}, where the atoms ψ_α are waveforms parameterized by their fractional-order α, and S is the indexing set of the parameter α. In matrix form, equation (7.3) can be written as:

    s = Φc,    (7.4)

where Φ is the matrix of the dictionary D. The goal is to find the representation c of the signal s which satisfies equation (7.4). However, due to the overcompleteness of D, there is no unique solution for c; but the nonuniqueness of c provides a flexibility which enables us to choose the representation c best suited to our problem. In the case of seismic signals, assumed to be a superposition of a limited number of waveforms, it is natural to try to minimize the number of components of c. There have been several approaches to solve this problem, including the Method of Frames [3], Matching Pursuit [26], Basis Pursuit [2], and the Block Coordinate Relaxation Method [29, 30]. We describe these methods in the following sections.
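The nonuniqueness of c in (7.4) is easy to demonstrate numerically: for an overcomplete Φ, any null-space vector can be added to a particular solution. A small sketch with a random Φ for illustration; the minimum-ℓ² solution computed here via the generalized inverse is the one the Method of Frames of the next section returns:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 20                        # overcomplete: more atoms than samples
Phi = rng.standard_normal((n, m))
s = rng.standard_normal(n)

# Particular solution via the generalized inverse: c = Phi^T (Phi Phi^T)^{-1} s
c_mof = Phi.T @ np.linalg.solve(Phi @ Phi.T, s)

# Any vector in the null space of Phi yields another exact representation
r = rng.standard_normal(m)
null_vec = r - np.linalg.pinv(Phi) @ (Phi @ r)   # project r onto null(Phi)
c_alt = c_mof + null_vec
```

Both c_mof and c_alt satisfy s = Φc; since c_mof lies in the row space of Φ and null_vec is orthogonal to it, ||c_alt||₂² = ||c_mof||₂² + ||null_vec||₂², so c_mof is the minimum-ℓ²-norm representation.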
7.2 Method of Frames

7.2.1 The algorithm

The Method of Frames [3] is an algorithm that finds the representation c with minimum ℓ² norm, and solves

    min_c ||c||₂ subject to s = Φc.    (7.5)

(7.5) is a quadratic system whose unique solution c† is given by:

    c† = Φ† s = Φ^T (ΦΦ^T)^{-1} s.    (7.6)

The matrix Φ† is called the generalized inverse of Φ, and is given by:

    Φ† = Φ^T (ΦΦ^T)^{-1}.

In our case, the major disadvantage of this method is the non-sparsity of the solution c† [2]. The solution c† is not sparse because the ℓ² norm tends to prefer a large number of small nonzero coefficients, rather than a few larger ones. The non-sparsity of c† means that the decomposition of s is spread out over a large number of atoms in D, which translates into the fact that we need a large number of atoms to represent s, and contradicts our previous assumption. The solution given by the Method of Frames is therefore not suited to our application.

7.2.2 Concept of sparsity

The concept of sparsity is intrinsically linked to the dictionary we use, as the decomposition of a given signal f can be sparse in one dictionary and not in another.

Consider a simple example of a cosine function, denoted g. The representation of g in the Discrete Cosine dictionary (Discrete Cosine Transform) is very sparse, as we only need a few cosine basis functions to construct g. However, the representation of g in the Haar wavelet dictionary is full, as we need a large number of step functions to build g. The Haar wavelet basis is a basis made of translations and dilations of a step function. The results are shown in Figure 7.1.

Figure 7.1: Decomposition of a cosine using the Discrete Cosine Transform (DCT) and the Haar wavelets.
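The sparsity contrast of Figure 7.1 can be reproduced in a few lines. A sketch with an orthonormal DCT-II matrix and a hand-rolled orthonormal Haar transform, both built from scratch so nothing beyond NumPy is assumed:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k samples cos(pi*(2i+1)*k/(2n)), normalized."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0] *= np.sqrt(1.0 / n)
    C[1:] *= np.sqrt(2.0 / n)
    return C

def haar_transform(x):
    """Full orthonormal Haar decomposition of a length-2^J signal."""
    x = np.asarray(x, dtype=float)
    coeffs = []
    while x.size > 1:
        coeffs.append((x[0::2] - x[1::2]) / np.sqrt(2))  # detail coefficients
        x = (x[0::2] + x[1::2]) / np.sqrt(2)             # coarser approximation
    coeffs.append(x)
    return np.concatenate(coeffs[::-1])

n = 128
g = np.cos(np.pi * (2 * np.arange(n) + 1) * 5 / (2 * n))  # a single DCT-II mode
dct_coeffs = dct_matrix(n) @ g
haar_coeffs = haar_transform(g)
```

The DCT side needs a single coefficient while the Haar side spreads the energy over dozens of step functions, which is exactly the dictionary-dependence of sparsity described above.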
The top figure shows the cosine function g, the middle one plots the decomposition of g with respect to the Discrete Cosine Transform (DCT) dictionary, and the bottom one gives the decomposition of g with respect to the Haar basis.

7.3 Matching Pursuit

Contrary to the Method of Frames, Matching Pursuit (MP) [26] approximates the sparsest representation c₀, where the sparsest representation refers to the representation with minimum ℓ⁰ norm. MP tries to find the decomposition of the signal s in a dictionary D by iteratively finding the waveforms that best match the data s. MP is an iterative algorithm that selects, at each iteration, the waveform whose projection best approximates the current residue in the ℓ² sense.

Consider a dictionary D made of atoms g_λ. The Matching Pursuit algorithm begins by projecting s on a vector g_{λ₀} ∈ D and computing the residue Rs [25]:

    s = ⟨s, g_{λ₀}⟩ g_{λ₀} + Rs,    (7.7)

where ⟨s, g_{λ₀}⟩ is the inner product between s and g_{λ₀}. Since Rs is orthogonal to g_{λ₀}, we can write [25]

    ||s||² = |⟨s, g_{λ₀}⟩|² + ||Rs||².

To minimize ||Rs||, we need to choose g_{λ₀} ∈ D such that |⟨s, g_{λ₀}⟩| is maximum, which is computationally done by finding an almost optimal vector g_{λ₀} that satisfies [25]

    |⟨s, g_{λ₀}⟩| ≥ α sup_λ |⟨s, g_λ⟩|,

where α ∈ [0,1] is an optimality factor [25]. By setting R⁰s = s and subdecomposing the residue at each iteration with the update rule for the m-th iteration

    choose g_{λ_m} s.t. |⟨R^m s, g_{λ_m}⟩| ≥ α sup_λ |⟨R^m s, g_λ⟩|,    (7.8)

we obtain the following decomposition of the residue at iteration m:

    R^m s = ⟨R^m s, g_{λ_m}⟩ g_{λ_m} + R^{m+1} s.    (7.9)

If we stop after M terms, we obtain a sparse approximation s̃ of s:

    s̃(t) = Σ_{m=0}^{M-1} ⟨R^m s, g_{λ_m}⟩ g_{λ_m}(t).    (7.10)

As M → ∞, this decomposition becomes exact; it provides a sparse approximate representation if we set M ≪ ∞.
The adaptivity of the dictionary and the subsequent selection of the best-fitting waveforms concentrate the energy of the data in as few waveforms as possible, yielding a high non-linear approximation rate. However, this method often fails to locate two closely-spaced events in the coefficient vector. Due to the greediness of the algorithm, MP can also select the wrong basis function at the beginning (because two events are too close) and then spend the rest of the time correcting for its mistake [2].

7.4 Basis Pursuit

7.4.1 The formulation

Basis Pursuit (BP) is a method that decomposes a signal s into an optimal superposition of dictionary elements, where optimal means having the smallest ℓ¹ norm among all such decompositions [2]. BP requires that the decomposition of the signal s in the dictionary D be sparse, meaning that we only need a few waveforms from D to build s. Basis Pursuit solves the following minimization problem:

    min_c ||c||₁ subject to s = Φc,    (7.11)

where Φ is the matrix of the dictionary D. BP tries to find the sparsest solution c by replacing the NP-hard ℓ⁰ minimization by the ℓ¹ minimization stated above. The constraint imposed on the coefficient vector c, requiring a minimization of the number of nonzero components, is very well suited to our type of problem.

7.4.2 Linear Programming (LP)

BP is a convex, non-quadratic optimization problem that can be rewritten as a Linear Programming (LP) problem [2]:

    min_x d^T x subject to Ax = b, x ≥ 0,    (7.12)

where x ∈ R^p, b ∈ R^n, and x ≥ 0 requires all components of x to be non-negative. The reformulation of BP into the LP problem comes from the following change of variables [2]:

    p = 2m;  x = (u, v);  d = (1, ..., 1);  A = (Φ, -Φ);  b = s,    (7.13)

where u and v refer to the positive and negative components of c respectively ((u, v) ∈ (R₊^m)²), and d is a vector of ones in R^p. First, we can write

    Ax = (Φ, -Φ)(u, v) = Φ(u - v) = Φc.
    (7.14)

As Φc = s = b, (7.14) is equivalent to Ax = b. Secondly, we can write

    d^T x = (1, ..., 1)^T (u, v) = Σ_k u_k + Σ_k v_k = ||c||₁.    (7.15)

To clarify the notations stated above, consider the following example:

    c = (1, -2, 3, -4)^T, with u = (1, 0, 3, 0)^T, v = (0, 2, 0, 4)^T, and c = u - v.

We can easily check that d^T x = ||c||₁:

    d^T x = (1, ..., 1)^T (u, v) = Σ_k u_k + Σ_k v_k = 1 + 2 + 3 + 4 = ||c||₁.    (7.16)

The solution given by BP is obtained by an Interior-Point scheme [2] based on a Primal-Dual Log-Barrier method [2, 10]. The algorithm starts from an initial feasible solution yielded by the Method of Frames [3], and proceeds by applying the Log-Barrier algorithm. The speed of the algorithm depends on the data and on the size and implementation of the dictionary, whereas the accuracy depends on the Conjugate Gradient solver.

7.4.3 Basis Pursuit De-Noising

Like any real data, seismic signals contain noise. If we decompose the entire noisy data d = s + n with respect to a dictionary D, we also fit the noise. In order to avoid fitting the noise as well as the signal, we need to denoise the data and decompose the denoised data in the dictionary D. In this case, we do not want an exact decomposition of d, but an approximate one. Basis Pursuit De-Noising (BPDN) [2] adapts the original Basis Pursuit to the case of noisy data of the form d = s + n, where s is the signal and n the noise. The denoising problem can be written as

    min_c (1/2)||d - Φc||₂² + λ||c||₁,    (7.17)

where we minimize the ℓ¹ norm of the representation of the signal s, which should be sparse in the dictionary D, as well as the data misfit. BPDN can be rewritten as a perturbed LP [2, 10], and is solved using a Simplex and an Interior-Point method.

7.5 Basis Pursuit versus Matching Pursuit

7.5.1 Results

In this section, I present a comparison between Basis Pursuit and Matching Pursuit with respect to decomposing a signal s in a particular dictionary D.
The aim of the numerical experiments I made was to see whether the method involving the decomposition of the signal s using redundant dictionaries could find the right decomposition c, and hence accurately localize the right waveforms in s. For each dictionary D, I started with a known coefficient vector c, synthesized a signal x = Φc, where Φ was the matrix of D containing the different fractional-orders α, and then decomposed x using BP or MP. In my experiments I worked with the following parameters:

• The size of the dictionaries (number of different fractional-orders α)
• The number of nonzero components in the representation c (sparsity of c)
• The distance between two different fractional-orders α

In all my experiments, I considered a signal x of length n = 128. In this section, I only present the results for the decimated fractional spline wavelet dictionary, abbreviated MDWT, as the results are the same for the fractional spline and undecimated fractional spline wavelet dictionaries.

Decimated Fractional Spline Wavelets

In the eight following figures, the top two plots show the superposition of the original signal s and the recovered signal. The top left figure shows the result by BP and the top right figure the result by MP. The two middle plots show the coefficient c found by BP (left) and MP (right). The two bottom plots show the original coefficient c.

1. Coefficient c with only four nonzero components

• Comparison between two dictionaries (small with two α and large with ten α)

Figure 7.2: Analysis of a very sparse coefficient with an MDWT dictionary with 2 α.

Figure 7.2 compares the efficiency of BP with the efficiency of MP for a dictionary containing only two different α (α₁ = 0 and α₂ = 5).
The difference in α in D assures a certain dissimilarity between the atoms, and hence prevents D from being too badly conditioned. As we see in Figure 7.2, both BP and MP recover the right coefficient c, first because there are very few nonzero entries in the original c, and second because the dictionary D is small.

Figure 7.3: Analysis of a very sparse coefficient with an MDWT dictionary with 10 α.

Figure 7.3 compares the efficiency of BP and MP for a larger dictionary D containing ten different α between 0 and 5. Because the different fractional-orders are closer to each other than in the previous example, the atoms in D are more similar. As we see in Figure 7.3, BP doesn't recover exactly the right coefficient c, whereas MP does. The reason why BP doesn't is that D is now much larger and c not sparse enough for D. In fact, for this particular example, BP finds exactly the right coefficient c with a nonzero probability less than 1 (empirically found to be around 0.75), and we need fewer than four nonzero entries in c in order for BP to find exactly the right coefficient with probability 1.

• Comparison between two dictionaries (one with two very different α and one with two similar α)

Figure 7.4: Analysis of a very sparse coefficient with an MDWT dictionary with 2 very different α.

In Figure 7.4, the original coefficient vector c has only four nonzero entries, and both MP and BP find it exactly. In this case, the dictionary D is very small and contains dissimilar atoms. For this specific dictionary, c is sparse enough so BP and MP can find it.
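As a reference point for these comparisons, the greedy MP update itself is only a few lines. The following is a minimal numpy sketch of the generic MP iteration, not the implementation used in the experiments; the toy dictionary here is a random orthonormal basis (an assumption under which MP recovers a sparse coefficient exactly), whereas the redundant MDWT dictionaries above are far less well conditioned:

```python
import numpy as np

def matching_pursuit(s, Phi, n_iter=10, tol=1e-10):
    """Greedy MP: repeatedly pick the column of Phi (assumed unit-norm)
    most correlated with the residual, and subtract its contribution."""
    c = np.zeros(Phi.shape[1])
    r = s.astype(float).copy()
    for _ in range(n_iter):
        corr = Phi.T @ r
        k = int(np.argmax(np.abs(corr)))
        if abs(corr[k]) < tol:          # residual essentially explained
            break
        c[k] += corr[k]                 # update the selected coefficient
        r -= corr[k] * Phi[:, k]        # remove the selected waveform
    return c

# Toy dictionary: an orthonormal basis, where greedy selection is exact.
rng = np.random.default_rng(1)
Phi, _ = np.linalg.qr(rng.standard_normal((32, 32)))
c_true = np.zeros(32)
c_true[[3, 17, 25]] = [2.0, -1.5, 0.7]
s = Phi @ c_true
c_rec = matching_pursuit(s, Phi, n_iter=5)
assert np.allclose(c_rec, c_true)
```

With a redundant dictionary of similar atoms, the same loop can pick a wrong atom early and then spend its remaining iterations compensating, which is exactly the failure mode discussed at the start of this chapter.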
Figure 7.5: Analysis of a very sparse coefficient with an MDWT dictionary with 2 similar α.

In Figure 7.5 however, only MP finds the original coefficient exactly. BP finds the nonzero entries of the original c, but adds another nonzero one. BP has more trouble finding the right coefficient because the atoms in D are much more similar. In this case, the original c is not sparse enough for BP to find it with probability 1. As in Figure 7.3, BP finds the right coefficient c exactly with a nonzero probability less than 1 (empirically found to be around 0.75). The more similar the α, the lower the probability that BP finds the right coefficient c for a given original c with a fixed number of nonzero entries. MP has the same problem, but seems to be able to recover coefficients that are less sparse than the coefficients BP can recover.

2. Coefficient c with ten nonzero components

• Comparison between two dictionaries (small with two α and large with ten α)

Figure 7.6: Analysis of a less sparse coefficient with an MDWT dictionary with 2 α.

Figure 7.6 shows the efficiency of BP and MP in a small dictionary for a given original coefficient c. In this case, both BP and MP find the original coefficient exactly. The original coefficient, although containing 10 nonzero entries, is sparse enough for MP and BP to find it in this dictionary D.

Figure 7.7: Analysis of a less sparse coefficient with an MDWT dictionary with 10 α.

Unlike Figure 7.6, Figure 7.7 shows that neither BP nor MP exactly finds the right coefficient back, although MP seems a little bit better. In this case, the dictionary is too large for BP and MP to find the original coefficient back.
For this particular dictionary with ten different α, the original c is not sparse enough for BP and MP to find it exactly. This result is explained by the Spark of the dictionary: the larger D, the smaller its Spark, and hence the sparser c needs to be in order to be found by BP and MP. This condition is given in the equivalence theorem mentioned in section 3.2 of Chapter 3, where the definition of the Spark is also given.

• Comparison between two dictionaries (one with two very different α and one with two similar ones)

Figure 7.8: Analysis of a less sparse coefficient with an MDWT dictionary with 2 close α.

In Figure 7.8, s is decomposed in a small dictionary D with two very similar α (0.5 and 1). When the α are similar however, neither BP nor MP recovers exactly the right coefficient. In fact, both recover the initial coefficient with a nonzero probability strictly less than 1. The original coefficient c is not sparse enough.

Figure 7.9: Analysis of a less sparse coefficient with an MDWT dictionary with 2 different α.

In Figure 7.9, s is decomposed in a small dictionary D with two very different α (0 and 5). When the α are very different, BP and MP can recover the right coefficient; BP doesn't recover it exactly but comes very close.

3. Coefficients c with two very close components

Figure 7.10: Analysis of two close components in an MDWT dictionary with two different α.
Figures 7.10 and 7.11 compare the efficiency of BP and MP in resolving two close components in c. In Figure 7.10, the dictionary D is small and contains only two different α. In this case, both BP and MP recover exactly the right coefficient c, no matter how close the components are.

Figure 7.11: Analysis of two close components in an MDWT dictionary with ten different α.

In Figure 7.11, D is large and contains ten different α between 0 and 5. In this case, BP finds exactly the right coefficient, but MP cannot resolve the two close components. In small enough dictionaries MP can resolve two close components, but fails to do so in larger ones. BP is more sensitive to the number of nonzero entries than to how close the events in the original coefficient c are.

7.5.2  Discussion

I showed some results for dictionaries made with decimated fractional spline wavelets, and got similar results for fractional spline and undecimated fractional spline wavelet dictionaries. The results revealed the importance of the size of the dictionary (its redundancy), the orders α, and the number of nonzero entries in the coefficient c. There is a trade-off between the redundancy of the dictionary D and the sparsity of c. For a given dictionary D, there is a maximum number of nonzero components in c that guarantees its being the unique solution of (P₁) (and hence (P₀)). The definitions of the (P₀) and (P₁) problems are given in section 3.1.1 of Chapter 3. For our type of dictionaries (fractional B-splines, decimated and undecimated fractional spline wavelets), the results are as follows:

1. If D is small (e.g. made of only 2 different α) and the α very different, the maximum number of nonzero entries in c is around 10.

2. If D is small (e.g.
made of only 2 different α) and the α more similar, the maximum number of nonzero entries in c is close to 10. BP and MP find the correct coefficient back with a high probability.

3. If D is large (e.g. made of 10 different α or more), the maximum number of nonzero entries in c is very small (around 1 or 2 depending on D).

From our results, we can infer two major problems:

1. Seismic data is made of too many different events for our type of dictionaries.

2. Seismic data contains a wide variety of fractional-orders, which requires large dictionaries.

We therefore cannot rely on BP or MP to decompose and find the fractional-orders α of a seismic trace. Another way of tackling the problem is to consider s as a sum of different signals sᵢ, each having a different fractional-order αᵢ. Using the Block Coordinate Relaxation Method (BCR), which is an iterative algorithm, we hope to be able to separate the different signals.

7.6  Block Coordinate Relaxation Method

7.6.1  Problem statement

We consider the seismic signal s as

s = s_α₁ + ... + s_αₙ + r,

where s_αᵢ is an αᵢ-th order signal, and r is the residual containing the rest of the orders of s that we didn't include in our model s_α₁ + ... + s_αₙ. r can also contain the noise if we work with noisy data. Given s, we would like to find its different components s_αᵢ, and in particular its associated orders αᵢ. As with BP, we will try to find the representation c of the signal s with respect to an overcomplete dictionary D made of waveforms parameterized by their fractional-orders α. Examples of such dictionaries include fractional B-spline, and decimated and undecimated fractional spline wavelet dictionaries. We consider a dictionary D as

D = [D_α₁, ..., D_αₙ],

and its matrix Φ:

Φ = [Φ_α₁, ..., Φ_αₙ],

where Φ_αᵢ is the matrix of D_αᵢ containing the different translations of αᵢ-th order waveforms. A representation c in D is of the form c = [c_α₁, ..., c_αₙ]. Following these notations, we can write our model as

s = Φc + r.  (7.18)
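The block structure of Φ and c in (7.18) can be illustrated with stand-in matrices. In this sketch the blocks are random (in the thesis each block Φ_αᵢ would instead hold translated waveforms of one fractional-order αᵢ), and the residual r is taken to be zero:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 64

# Stand-ins for the per-order sub-dictionaries Phi_alpha_1 and Phi_alpha_2.
Phi_1 = rng.standard_normal((n, n))
Phi_2 = rng.standard_normal((n, n))
Phi = np.hstack([Phi_1, Phi_2])        # Phi = [Phi_1, Phi_2]

# Block representation c = [c_1, c_2]: one atom from each sub-dictionary.
c1 = np.zeros(n); c1[10] = 1.0
c2 = np.zeros(n); c2[40] = -2.0
c = np.concatenate([c1, c2])

s = Phi @ c                            # noise-free model (7.18) with r = 0

# Each content type s_i is synthesized by its own block:
assert np.allclose(s, Phi_1 @ c1 + Phi_2 @ c2)
```

Separating the content types then amounts to splitting c back into its blocks c₁, ..., cₙ, which is what the BCR algorithm below attempts.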
Following these notations, we can write Q  our model as s = * c + r.  (7.18)  In the rest of the paper, we will use the following notations: Sa  4  = §i ,  = Cj  and  $  Q i  Our main assumption here, is that the representation of each  =  is sparse in the dictionary T>i,  so that we only need a few atoms in T>i to build S j . This is the same idea as B P , because we would like each a\  h  order event of the signal s to be described with as few atoms as possible.  93  Ideally, each event would be described by only one atom and hence the order OJJ of this event would be read on its associated atom. We propose to solve the following minimization problem  min  (ftUcJIi + ... + ^ H c J l ! + ||s - *c||!),  (7.19)  where 6i > 0, for 0 < i < n. The parameters 3i are trade-off parameters between the different £ norms and the data misfit. 1  1. The first part of the functional minimizes the representation of each signal s_j in the corresponding dictionary XV 2. The second part minimizes the data misfit. This second part relaxes the constraint. If an additional content existing in the signal s is not represented sparsely by the dictionaries T>i, the above formulation will tend to allocate this content in the residual [30]. Remark: What makes the minimization (7.19) succeed in separating the different signals S j , is that each dictionary Vi is very good in representing s and bad in representing the other Sj, for 4  j j£ i. The better this assumption is verified, the more accurate the algorithm.  7.6.2  Algorithm  I solved the minimization (7.19) using a slightly modified version of the Block Coordinate Relaxation Method ( B C R ) implemented by Starck et al. [30]. If 3>*3>i = / , the previous minimization can be solved by Soft Thresholding [30]. 
If we fix every coefficient vector cᵢ except c_{i₀}, we can solve for c_{i₀} with

min β_{i₀}||c_{i₀}||₁ + ||S_{i₀} − Φ_{i₀}c_{i₀}||₂²,  (7.20)

where S_{i₀} = s − Σ_{i≠i₀} Φᵢcᵢ. If Φ*_{i₀}Φ_{i₀} = I, (7.20) is equivalent to

min β_{i₀}||c_{i₀}||₁ + ||Φ*_{i₀}S_{i₀} − c_{i₀}||₂²,  (7.21)

which can be written as

Σ_{k=1}^N ( λ|c_{i₀}(k)| + (c_{i₀}(k) − Z_{i₀}(k))² ),

where Z_{i₀}(k) is the k-th component of Φ*_{i₀}S_{i₀} and λ = β_{i₀}. This is equivalent to N scalar, convex and independent minimization problems that are solved by Soft Thresholding applied componentwise to c_{i₀} [4]. The minimization can still be solved by Soft Thresholding [29] when the matrices are a concatenation of matrices Aᵢ that satisfy Aᵢ*Aᵢ = I. See Appendix C for the derivation of the soft thresholding solution. The undecimated orthonormal wavelet transform consists of all shifted versions of the decimated orthonormal wavelet transform. The matrix of the decimated orthonormal wavelet transform is unitary, and the matrix of the undecimated wavelet transform is a concatenation (modulo a reordering) of unitary matrices. We can hence use Soft Thresholding to solve the minimization problem when using undecimated orthonormal wavelet transforms.

7.6.3  Numerical scheme

In Figure 7.12, I present the basic numerical scheme for separating two signals s₁ and s₂, assuming s is of the form s = s₁ + s₂ + r, where s₁ is an α₁-th order signal, s₂ is an α₂-th order signal, and r is the residual, which can contain other αⱼ-th order signals and/or noise. The dictionary we use can be written as Φ = [Φ₁, Φ₂], where Φ₁ is adapted to s₁, and Φ₂ to s₂. The algorithm given in Figure 7.12 is for two signals, but I programmed a general algorithm for any number of signals and any dictionaries.
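The componentwise soft-thresholding solution and the alternating scheme of Figure 7.12 can be sketched in numpy. The block below is a simplified stand-in: it uses random orthonormal matrices for Φ₁ and Φ₂ (so Φᵢ*Φᵢ = I holds exactly, and the update c^old + Φᵢ*(R − Φᵢc^old) reduces to Φᵢ*R), rather than the fractional spline wavelet dictionaries, and the thresholds L_max, β₁, β₂ are illustrative values, not the ones used in the experiments:

```python
import numpy as np

def soft(z, t):
    """Componentwise soft threshold; the minimizer of
    lam*|c| + (c - z)**2 is soft(z, lam/2)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# Check the scalar minimization against a brute-force grid search.
z, lam = 1.3, 0.8
grid = np.linspace(-3.0, 3.0, 60001)
brute = grid[np.argmin(lam * np.abs(grid) + (grid - z) ** 2)]
assert np.isclose(soft(z, lam / 2), brute, atol=1e-3)

# Two-dictionary alternation in the spirit of Figure 7.12.
rng = np.random.default_rng(3)
n = 64
Q1, _ = np.linalg.qr(rng.standard_normal((n, n)))   # stand-in for Phi_1
Q2, _ = np.linalg.qr(rng.standard_normal((n, n)))   # stand-in for Phi_2
c1 = np.zeros(n); c1[5] = 4.0
c2 = np.zeros(n); c2[50] = -3.0
s = Q1 @ c1 + Q2 @ c2                               # s = s_1 + s_2, r = 0

L_max, beta = 20, 0.5
d1 = d2 = beta * L_max              # initial thresholds delta_i
e1 = np.zeros(n); e2 = np.zeros(n)  # running estimates of c1, c2
for _ in range(L_max):
    # Part A: update c2 assuming c1 fixed
    R2 = s - Q1 @ e1                # residual after removing signal 1
    e2 = soft(Q2.T @ R2, d2)        # exact update since Q2 is orthonormal
    d2 -= beta                      # decrease the threshold
    # Part B: update c1 assuming c2 fixed
    R1 = s - Q2 @ e2
    e1 = soft(Q1.T @ R1, d1)
    d1 -= beta
```

Under these favorable assumptions the loop recovers the correct atom positions, with amplitudes slightly shrunk by the final threshold; with similar, redundant dictionaries the cross-terms Qⱼᵀsᵢ are no longer small and the separation degrades, as the results below show.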
Initialize L_max, the number of iterations, and the thresholds δ₁ = β₁·L_max and δ₂ = β₂·L_max.

Perform L_max times:

  Part A: Update c₂ assuming c₁ is fixed
    • Synthesize the signal s₁ = Φ₁c₁^old
    • Calculate the residual R₂ = s − s₁
    • Calculate c₂ = c₂^old + Φ₂*(R₂ − Φ₂c₂^old)
    • Soft threshold c₂ with δ₂, i.e. solve the following minimization problem:
        min β₂||c₂||₁ + ||R₂ − s₂||₂²
    • Update the threshold: δ₂ = δ₂ − β₂

  Part B: Update c₁ assuming c₂ is fixed
    • Synthesize the signal s₂ = Φ₂c₂^old
    • Calculate the residual R₁ = s − s₂
    • Calculate c₁ = c₁^old + Φ₁*(R₁ − Φ₁c₁^old)
    • Soft threshold c₁ with δ₁, i.e. solve the following minimization problem:
        min β₁||c₁||₁ + ||R₁ − s₁||₂²
    • Update the threshold: δ₁ = δ₁ − β₁

end

Figure 7.12: Block Coordinate Relaxation (BCR) Algorithm

7.6.4  Simple examples

I applied the BCR algorithm to two examples: one that I made myself and one taken from Starck et al. [31]. In both examples we are dealing with data d of the form d = f₁ + f₂ + n, where f₁ and f₂ are two different signals, and n is the noise.

1. In the first example, f₁ is a cosine function and f₂ is a step function. I used the Discrete Cosine Transform as the dictionary Φ₁ adapted to f₁ and the Haar wavelet basis as the dictionary Φ₂ adapted to f₂. The results are shown in Figure 7.13.

Figure 7.13: BCR algorithm: separating the noise, a cosine function and a step function.
In Figure 7.13, the top plot shows the original cosine function (blue line) and the recovered cosine function (red line), the second plot shows the original step function (blue line) and the recovered step function (red line), the third plot displays the sum of the original signals (blue line) superimposed with the sum of the recovered signals (red line), and the bottom plot shows the residual.

2. In Starck's example [30], f₁ is a cosine function and f₂ consists of bumps (Gaussians). I used the Discrete Cosine Transform as the dictionary Φ₁ and the à trous wavelet basis as the dictionary Φ₂. The results are shown in Figure 7.14.

Figure 7.14: BCR algorithm: separating the noise, a cosine function and bumps.

In Figure 7.14, the top plot shows the original cosine function (blue line) and the recovered cosine function (red line), the second plot shows the original bumps (blue line) and the recovered bumps (red line), the third plot displays the sum of the original signals (blue line) superimposed with the sum of the recovered signals (red line), and the bottom plot shows the residual.

7.6.5  Multiple Block Coordinate Relaxation Method

The multiple BCR iteratively performs a BCR decomposition on each of the separated signals resulting from a previous BCR decomposition. The idea is to run the BCR iteratively on the different signal contents until we separate them. I empirically noticed that running the BCR more than three times does not really improve the results. I therefore limited the iterations to three.

7.6.6  Results

I present here some results on the performance of the BCR algorithm for the decimated fractional spline wavelet dictionary.
The results are similar for fractional B-splines and undecimated fractional spline wavelet dictionaries. I tested the BCR algorithm for different settings and different synthetic seismic signals that I generated randomly in order to avoid the creation of patterns. The experiments test the following cases:

1. The analyzing dictionary D exactly contains the waveforms in s
2. The analyzing dictionary D contains the waveforms in s as well as other ones
3. The analyzing dictionary D doesn't contain the waveforms in s

The synthetic signal s was built like a real seismic signal by randomly building its different coefficients. By hypothesis, s is a summation of signals sᵢ with different fractional-orders αᵢ. In order to construct s, I created each coefficient cᵢ with a random position and number of nonzero entries, and synthesized sᵢ by computing sᵢ = Φᵢcᵢ, where Φᵢ is the matrix of the dictionary Dᵢ associated to the order αᵢ. Given s, the BCR algorithm tries to separate the different content types of s in the analyzing dictionary D.

The analyzing dictionary D exactly contains the waveforms in s

• Dictionary D made of 4 different α between 0 and 5

Figure 7.15: Original and recovered signals.

In this example, s is synthesized and analyzed with the same dictionary D, and the coefficient c of the signal s in D has 27 nonzero entries. Figure 7.15 shows that the different content types in s are successfully separated and the coefficients exactly recovered. With 27 nonzero entries in c, the BCR algorithm is able to separate the different content types in s and finds the right coefficient back. This is an improvement over BP and MP.
Figure 7.16: Original and recovered coefficients.

Figure 7.16 shows the superposition of the original coefficients and the coefficients cᵢ found by the BCR algorithm. The original coefficient has 27 nonzero entries. The different coefficients of the different content types are perfectly recovered by the algorithm.

• Dictionary made of 10 different α between 0 and 1

Figure 7.17: Five first original and recovered signals.

Figure 7.18: Five last original and recovered signals.

The original coefficient c has 33 nonzero entries. In Figures 7.17 and 7.18, we can see that the separation of the different signals is not very successful. Because the dictionary is large and the waveforms very similar (the fractional-orders are much closer), the dictionaries become more similar. In particular, one dictionary Φᵢ is not bad at representing the other signals sⱼ for j ≠ i, which is why the separation is not perfect. The more similar the dictionaries, the less the BCR algorithm will be able to separate the different content types of s.
Because the dictionary is large and the waveforms very similar (the fractional-orders are much closer), the dictionaries become more similar. In particular, one dictionary 3>i is not bad in representing the other signals s^- for j ^ i, which is why the separation is not perfect. The more similar the dictionaries, the less the B C R algorithm will be able to separate the different content types of s.  102  Original (in blue) and recovered (in red) coeffs  Original (in blue) and recovered (in red) coeffs  3 40  -n  r  20  40  i  60  80  100  120  140  160  180  200  0 201  100  120  140  160  180  200  0  20 ,  40 ,  60 p  80  100  120  140  160  180  200  20  40  60  80  100  120  140  160  180  200  120  140  160  180  200  jUll 60  r  —i J  -  _/A__LAI  r~  1  A  ^  L  100  JjJL  I  0.33333]  |  Q.44444 |  JUL _i  20  40  60  100  1 20  __l  1 40  L_  1 60  1 60  I  200  0  20  i_  40  100  60  80  120  1 00  140  1 20  160  1 40  0.888891  180  160  200  1 80  Figure 7.19: Five first original and recovered co- Figure 7.20: Five last original and recovered coefficients, efficients. Figures 7.19 and 7.20 illustrate the performance of the B C R algorithm on the coefficients.  103  200  T h e a n a l y z i n g d i c t i o n a r y V contains the waveforms i n s, as well as other ones  • Synthesizing dictionary made of two a between 0 and 1 and analyzing dictionary made of four a between 0 and 3 Recovered signals  n  -2  .j  0  1  1  I  i_  0.1  0.2  0.3  0.4  ~\  1  1  1  0.1  0.2  0.3  0.4  _i  0  ~  -1  .j  0  r  _J  j  I  i  i_  0.5  0.6  0.7  0.8  0.9  1  1  r  0.5  0.6  0.7  0.8  0.9  1  n  1  1  r  1—;  i_  r~  I  I  I  I  I  I  I  i_  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  n  1  1  1  1  1  r  0.1  0.2  0.3  0.4  _i  0  n  r  1 - 3  I  I  I  I  i_  0.5  0.6  0.7  0.8  0.9  1  Figure 7.21: Original and recovered signals. 
In this example, the synthetic signal s is only made of two different content types: one with order 0 and one with order 1, and the original coefficient has only five nonzero entries. The analyzing dictionary is made of four different α (0, 1, 2 and 3). s is decomposed into four signals with orders 0, 1, 2 and 3 respectively. Figure 7.21 shows the original signals in blue (top two plots) and the separated ones in dashed red (in the four plots). The separation is perfect.

Figure 7.22: Original and recovered coefficients.

Figure 7.22 shows the original and recovered coefficients of the different signals, in the same order as for the signals shown in Figure 7.21. They are perfectly recovered.

• Synthesizing dictionary made of two α between 0 and 1 and analyzing dictionary made of four α between 0 and 1

Figure 7.23: Original and recovered signals.

In this case, a similar synthetic signal s with four events is decomposed with respect to a different analyzing dictionary. The analyzing dictionary contains four different fractional-orders α ranging from 0 to 1 (0, 1/3, 2/3 and 1).
Figure 7.23 shows the original signals in blue (top and bottom plots) and the separated ones in dashed red (in the four plots). The separation is not perfect because the waveforms in D are too similar. However, the separation is still quite good, as the amplitudes of the signals with orders 1/3 and 2/3 are very small (these amplitudes should in theory be zero). The non-optimal choice of D for the BCR algorithm affects the performance of the separation.

Figure 7.24: Original and recovered coefficients.

Figure 7.24 shows the coefficients of the different signals.

3. The analyzing dictionary D doesn't contain the waveforms in s

• Synthesizing dictionary made of five α between 0 and 3 and analyzing dictionary made of thirty α between 0 and 3

Figure 7.25: Original signals.

Figure 7.26: Recovered signals.

Figures 7.25 and 7.26 illustrate the decomposition of a signal s containing 30 events, and show the original signals and the recovered ones. Figure 7.26 shows that s is separated into many more content types than it actually has, which implies that the events are not found appropriately. Not only can wrong events be selected, but new events can be created in the reconstructed signals.
Figure 7.27: Original coefficients.

Figure 7.28: Recovered coefficients.

Figures 7.27 and 7.28 show the same results for the coefficients.

The analyzing dictionary D doesn't contain the waveforms in s

Figure 7.29: BCR separation with decimated fractional spline wavelets.

Figure 7.29 shows the decomposition of the signal s with respect to a large dictionary whose fractional-orders are different from the fractional-orders contained in s. Figure 7.29 shows the superposition of the original content types of s and the reconstructed ones with the closest fractional-order. The four top plots show the original signals (in blue) and the recovered signals (in red), and the bottom plot shows the residual. Except for the zero-order signal (first plot), which is well recovered, and the third-order signal (fourth plot), which is partially recovered, the other content types of s aren't. The residual seems to contain most of the events. The failure of the decomposition is most probably due to the size of the analyzing dictionary I used, as well as the fact that the waveforms in s were not in the analyzing dictionary. The analyzing dictionary was fairly large and contained very similar waveforms, which reduced the efficiency of the BCR algorithm.
7.6.7  Discussion

The above results show the efficiency of the BCR algorithm in separating a signal s into different content types described by their singularity order αᵢ. As explained earlier in this paper, the BCR algorithm can be very efficient if each dictionary used for each content type is very good at representing one content type and bad at representing the other ones. In that sense, the more similar the atoms in the dictionaries, the less efficient the BCR algorithm. The decomposition we envision is badly conditioned, because the dictionaries we need to use for seismic signals contain similar atoms. However, in certain cases, the BCR algorithm succeeds in separating s into the correct content types.

First, the decrease of the parameters δᵢ in the BCR algorithm presented in Figure 7.12 is of great importance. The slower the decrease, the better the separation, because in this case the algorithm has time to discriminate between the various content types. The BCR algorithm looks at the successive inner products between the signal s and the waveforms φᵢ in Dᵢ, and puts the largest inner products (larger than the threshold δᵢ) into the content type of order αᵢ. If δᵢ decreases too fast, a large number of inner products corresponding to various orders will be larger than δᵢ, and thus will be put wrongly into the content type of order αᵢ.

There is also a concern about the dictionary. If the dictionary contains waveforms that are too similar, or if the dictionary is too large (which also triggers very similar waveforms), it will be more difficult to distinguish between the different inner products, because they will have similar values. One solution could be to make the parameters δᵢ decrease very slowly, but that would make the algorithm prohibitively slow, and moreover would only work optimally if the analyzing and synthesizing dictionaries are the same.
There is also a trade-off between the size of the dictionary and the fractional-orders $\alpha$ that we choose in $\mathcal{D}$. The number of nonzero entries in the original coefficient vector $c$ is not important here; it is the dissimilarity between the different content types of $s$, and the adequate choice of the dictionaries $\Phi_i$, that make the BCR algorithm succeed.

From these studies of the performance of the BCR algorithm, we can conclude that if the analyzing dictionary $\mathcal{D}$ consists of waveforms that are in $s$, the BCR algorithm will succeed, or partially succeed, in separating $s$ into the correct content types, provided $\mathcal{D}$ is not too large and the fractional-orders in $\mathcal{D}$ are not too close. However, when the dictionary contains atoms that are not in $s$, the BCR algorithm starts having trouble finding the correct decomposition. The last issue with BCR is its computational cost. Depending on the dictionary and its implementation, the normalization of the dictionary can become computationally intensive, and the BCR algorithm prohibitively slow. This is, for example, the case for the undecimated fractional spline wavelet dictionary.

7.7 Conclusion

The studies of MP, BP and the BCR algorithm show that it is very difficult to estimate the fractional-orders of seismic signals by computing their representation in an overcomplete dictionary parameterized by the fractional-orders $\alpha$. These algorithms work only for very particular coefficients and well-conditioned dictionaries. As soon as the dictionary gets too large, and/or the waveforms too similar, and/or the coefficient vector not sparse enough, none of the above algorithms is able to find the right representation $c$. Real data falls under these considerations, as we need a large enough dictionary (and thus one with similar waveforms) to find the fractional-orders contained in $s$.
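The obstruction described here can be quantified by the mutual coherence of the dictionary, i.e. the largest normalized inner product between distinct atoms, in the spirit of Donoho and Elad [8]. The following Python sketch builds hypothetical atoms as fractional derivatives of a Gaussian, constructed crudely in the Fourier domain (branch choices for the fractional power are glossed over), and compares the coherence of nearby versus well-separated fractional-orders. All names and parameter values are illustrative assumptions, not the thesis code.

```python
import numpy as np

def frac_gauss_atom(n, alpha, sigma=0.1):
    # Order-alpha "fractional derivative" of a Gaussian, obtained by
    # multiplying the Gaussian's spectrum by (2*pi*i*xi)**alpha.
    # Illustrative construction only.
    t = np.linspace(-1.0, 1.0, n)
    g = np.exp(-t**2 / (2.0 * sigma**2))
    xi = np.fft.fftfreq(n, d=t[1] - t[0])
    atom = np.fft.ifft((2j * np.pi * xi) ** alpha * np.fft.fft(g)).real
    return atom / np.linalg.norm(atom)

def mutual_coherence(atoms):
    # Largest off-diagonal entry of the Gram matrix of unit-norm atoms
    # (columns of `atoms`).
    G = np.abs(atoms.T @ atoms)
    np.fill_diagonal(G, 0.0)
    return G.max()
```

For orders such as 0.5 and 0.6 the coherence is close to 1, so the inner products the algorithms compare differ only marginally and the decomposition is badly conditioned; for well-separated orders such as 0.5 and 2.5 the coherence drops substantially, which is why small dictionaries with well-spaced fractional-orders behave better.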
The estimated fractional-orders are limited by the fractional-orders contained in the dictionary: the maximal error between a fractional-order $\alpha$ found and the real one in $s$ is given by the largest distance between two consecutive fractional-orders contained in $\mathcal{D}$. To avoid large errors, it is therefore necessary to decompose $s$ over a large dictionary containing many different fractional-orders, which triggers the failure of MP, BP and the BCR algorithm. The representation $c$ of seismic signals is not sparse enough for the type of dictionary we need.

Bibliography

[1] T. Blu and M. Unser. A complete family of scaling functions: the $(\alpha, \tau)$ fractional splines. In Proceedings of the Twenty-Eighth IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'03), volume 6, pages 505-508, Hong Kong SAR, People's Republic of China, April 6-10 2003.

[2] Scott Shaobing Chen, David L. Donoho, and Michael A. Saunders. Atomic decomposition by basis pursuit. SIAM Journal on Scientific Computing, 20(1):33-61, 1999.

[3] I. Daubechies. Time-frequency localization operators: a geometric phase approach. IEEE Transactions on Information Theory, 34:605-612, 1988.

[4] I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Comm. Pure Appl. Math., 57:1413-1541, 2004.

[5] M. V. de Hoop and N. Bleistein. Generalized Radon transform inversions for reflectivity in anisotropic elastic media. Inverse Probl., 13(3):669-690, 1997.

[6] F. J. Dessing, E. V. Hoekstra, F. J. Herrmann, and C. P. A. Wapenaar. Multiscale edge detection by means of multiscale migration, pages 459-462. Soc. of Expl. Geophys., 1996.

[7] F. J. Dessing. A Wavelet Transform Approach to Seismic Processing. PhD thesis, Delft University of Technology, 1997.

[8] D. L. Donoho and M. Elad. Optimally sparse representation in general (non-orthogonal) dictionaries via $\ell^1$ minimization. Proc.
Nat. Acad. Sci., 100:2197-2202, 2003.

[9] R. Ghose and J. Goudswaard. Integrating S-wave seismic-reflection data and cone-penetration-test data using a multiangle multiscale approach. Geophysics, 69:440-459, 2004.

[10] P. E. Gill, W. Murray, D. B. Ponceleon, and M. A. Saunders. Solving the KKT systems in barrier methods for linear and quadratic programming. Technical Report SOL 91-7, July 1991.

[11] J. Goudswaard, M. Dillen, and C. Wapenaar. Multiangle processing and multiscale characterization of walkaway VSP data, pages 178-181. Soc. of Expl. Geophys., 2000.

[12] J. C. M. Goudswaard and C. P. A. Wapenaar. Resolving well log singularities from seismic data, pages 150-153. Soc. of Expl. Geophys., 1997.

[13] J. C. Harms and P. Tackenberg. Seismic signatures of sedimentation models. Geophysics, 37(1):45-58, February 1972.

[14] Felix J. Herrmann. Multiscale analysis of well- and seismic data. Mathematical Methods in Geophysical Imaging V, SPIE, 3453:180-208, 1998.

[15] Felix J. Herrmann. Fractional Spline Matching Pursuit: a quantitative tool for seismic stratigraphy. In Soc. Expl. Geoph., 2001.

[16] Felix J. Herrmann. Multi- and monoscale attributes for well- and seismic data. Chapter in the annual report of the Borehole Acoustics and Logging and Reservoir Delineation Consortia at MIT, 2001.

[17] Felix J. Herrmann. Singularity Characterization by Monoscale Analysis: Applications to Seismic Imaging. Applied and Computational Harmonic Analysis, 11:64-88, 2001.

[18] Felix J. Herrmann. Seismic deconvolution by atomic decomposition: a parametric approach with sparseness constraints. Integr. Computer-Aided Eng., 2005.

[19] Felix J. Herrmann and Yves Bernabe. Seismic singularities at upper mantle discontinuities: a site percolation model. Geop. J. Int., 2004.

[20] Felix J. Herrmann, William J. Lyons, and Colin Stark. Seismic facies characterization by monoscale analysis.
Geophysical Research Letters, 28(19):3781-3784, October 2001.

[21] Felix J. Herrmann and Colin Stark. Monoscale analysis of edges/reflectors using fractional differentiations/integrations. In Soc. Expl. Geoph., 1999.

[22] F. J. Herrmann. A Scaling Medium Representation, a Discussion on Well-Logs, Fractals and Waves. PhD thesis, Delft University of Technology, 1997.

[23] C. Liner, C.-F. Li, A. Gersztenkorn, and J. Smythe. SPICE: A new general seismic attribute, pages 433-436. Soc. of Expl. Geophys., 2004.

[24] J. Liouville. Sur une formule pour les differentielles a indices quelconques a l'occasion d'un memoire de M. Tortolini. J. Math. Pures Appl., unnumbered (in French), 1855.

[25] S. Mallat. A wavelet tour of signal processing. Academic Press, 1997.

[26] S. Mallat and Z. Zhang. Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 41:3397-3415, 1993.

[27] J. Muller, I. Bokn, and J. L. McCauley. Multifractal analysis of petrophysical data. Ann. Geophysicae, 10:735-761, 1992.

[28] C. Payton. Seismic stratigraphy - applications to hydrocarbon exploration. AAPG, 1977.

[29] S. Sardy, A. G. Bruce, and P. Tseng. Block Coordinate Relaxation Methods for Nonparametric Wavelet Denoising. Journal of Computational and Graphical Statistics, 9, 2000.

[30] J. L. Starck, M. Elad, and D. L. Donoho. Image Decomposition Via the Combination of Sparse Representation and a Variational Approach. IEEE Transactions on Image Processing.

[31] J. L. Starck, M. Elad, and D. L. Donoho. Redundant Multiscale Transforms and Their Application for Morphological Component Analysis. Advances in Imaging and Electron Physics, 132, 2004.

[32] M. T. Taner, F. Koehler, and R. E. Sheriff. Complex seismic trace analysis. Geophysics, 44(6):1041-1063, June 1979.

[33] M. Unser and T. Blu. Fractional Splines and Wavelets. SIAM Review, 42:43-67, 2000.

[34] C. P. A. Wapenaar, J. Goudswaard, and A. J. van Wijngaarden.
Multiangle, multiscale inversion of migrated seismic data. The Leading Edge, 18(8):928-932, 1999.

[35] C. P. A. Wapenaar. Amplitude-variations-with-angle behavior of self-similar interfaces. Geophysics, 64:1928, 1999.

[36] Yves Meyer. Ondelettes et algorithmes concurrents, volume 1435 of Actualites Scientifiques et Industrielles [Current Scientific and Industrial Topics]. Hermann, Paris, 1992.

Appendix A

Wavelets

A.1 Overview

Definition A.1.1 (Wavelet, [25]) A wavelet is a function $\psi \in L^2(\mathbb{R})$ of zero average,
$$\int_{\mathbb{R}} \psi(t)\, dt = 0,$$
which is dilated with a scale parameter $\sigma > 0$ and translated by $a \in \mathbb{R}$:
$$\psi_{\sigma,a}(t) = \frac{1}{\sqrt{\sigma}}\, \psi\!\left(\frac{t-a}{\sigma}\right).$$

Definition A.1.2 (Continuous Wavelet Transform, [25]) The wavelet transform of a function $f \in L^2(\mathbb{R})$ at the scale $\sigma$ and position $a$, $Wf(\sigma, a)$, is computed by correlating $f$ with the wavelet at position $a$ and scale $\sigma$:
$$Wf(\sigma, a) = \int_{-\infty}^{+\infty} f(t)\, \psi^*_{\sigma,a}(t)\, dt.$$
Any signal $f \in L^2(\mathbb{R})$ can be written as a continuous summation of wavelets:
$$\forall\, t \in \mathbb{R}, \quad f(t) = K \int_0^{+\infty}\!\!\int_{-\infty}^{+\infty} Wf(\sigma, a)\, \psi_{\sigma,a}(t)\, da\, \frac{d\sigma}{\sigma^2},$$
where $K$ is a constant.

This representation is very redundant and the coefficients $Wf(\sigma, a)$ are highly correlated. In order to reduce this redundancy, we can limit ourselves to dyadic wavelets and hence obtain a basis of $L^2(\mathbb{R})$.

Definition A.1.3 (Orthonormal Wavelet Bases, [26]) Dyadic wavelets
$$\left\{ \psi_{(j,k)}(t) = \frac{1}{\sqrt{2^j}}\, \psi\!\left(\frac{t - 2^j k}{2^j}\right) \right\}_{(j,k)\in\mathbb{Z}^2}$$
generate an orthonormal basis of $L^2(\mathbb{R})$. Because the basis is orthonormal, the wavelet coefficients $Wf(j,k)$ of a function $f \in L^2(\mathbb{R})$ are simply computed by taking the inner product between $f$ and each wavelet:
$$Wf(j,k) = \langle f, \psi_{(j,k)} \rangle = \int_{-\infty}^{+\infty} f(t)\, \psi^*_{(j,k)}(t)\, dt.$$
Any signal $f$ can be represented uniquely as
$$\forall\, t \in \mathbb{R}, \quad f(t) = \sum_{j=-\infty}^{+\infty} \sum_{k=-\infty}^{+\infty} \langle f, \psi_{(j,k)} \rangle\, \psi_{(j,k)}(t).$$

Appendix B

Bases

B.1 Fractional Splines

Proposition B.1.1 Let $\alpha = 0$, and let $\varphi_0 = \varphi_{\alpha=0}$ be a vector of $\mathbb{R}^N$ such that
$$\varphi_0(n) = 1 \ \ \forall\, 1 \le n \le N, \quad \text{and} \quad \varphi_0(n) = 0 \ \ \forall\, n \le 0.$$
Define $\mathcal{D}_0$ as the set of the $N$ shifted vectors $v_k^0 \in \mathbb{R}^N$, $1 \le k \le N$, such that
$v_k^0(n) = \varphi_0(n - k + 1)$ for all $1 \le n \le N$. $\mathcal{D}_0$ is a basis of $\mathbb{R}^N$.

PROOF: $\mathcal{D}_0$ consists of $N$ vectors of $\mathbb{R}^N$, so we only need to show that the vectors of $\mathcal{D}_0$ are linearly independent (or, equivalently, that $\mathcal{D}_0$ can represent any vector of $\mathbb{R}^N$). We show that the vectors $v_k^0$ are linearly independent. Suppose there exist $N$ numbers $\beta_1, \ldots, \beta_N$ such that
$$\sum_{i=1}^{N} \beta_i v_i^0 = 0. \tag{B.1}$$
Equation (B.1) is equivalent to
$$\forall\, 1 \le n \le N, \quad \sum_{i=1}^{N} \beta_i v_i^0(n) = 0 \tag{B.2}$$
$$\Longleftrightarrow \quad \forall\, 1 \le n \le N, \quad \sum_{i=1}^{N} \beta_i \varphi_0(n - i + 1) = 0. \tag{B.3}$$
We prove recursively that all the $\beta_i$ are zero. Define the following property for an integer $p$, $1 \le p \le N$:
$$\forall\, 1 \le j \le p, \quad \beta_j = 0. \tag{B.4}$$

1. Let us show that this property is true for $p = 1$. If we evaluate equation (B.3) at $n = 1$, we obtain
$$\sum_{i=1}^{N} \beta_i \varphi_0(2 - i) = 0, \tag{B.5}$$
which is equivalent to
$$\beta_1 \varphi_0(1) = 0, \tag{B.6}$$
as $\varphi_0(2 - i)$ is only nonzero for $i = 1$. As $\varphi_0(1) = 1$, we deduce that $\beta_1 = 0$, which is property (B.4) for $p = 1$.

2. Now, we show that if property (B.4) is satisfied for an integer $p$ between $1$ and $N - 1$, it is also satisfied for $p + 1$. Let $p$ be such an integer and suppose that property (B.4) is true for $p$: $\beta_j = 0$ for all $1 \le j \le p$. We have
$$\sum_{i=p+1}^{N} \beta_i v_i^0 = 0 \quad \Longleftrightarrow \quad \forall\, 1 \le n \le N, \quad \sum_{i=p+1}^{N} \beta_i \varphi_0(n - i + 1) = 0. \tag{B.7}$$
If we evaluate equation (B.7) at $n = p + 1$, we obtain
$$\sum_{i=p+1}^{N} \beta_i \varphi_0(p - i + 2) = 0, \tag{B.8}$$
which is equivalent to
$$\beta_{p+1} \varphi_0(1) = 0, \tag{B.9}$$
as, within this sum, $\varphi_0(p - i + 2)$ is only nonzero for $i = p + 1$. As $\varphi_0(1) = 1$, we deduce that $\beta_{p+1} = 0$, which is property (B.4) for $p + 1$.

3. Property (B.4) is true for $p = 1$, and if it is true for $p$ it is also true for $p + 1$; it therefore holds for $p = N$, i.e. $\beta_n = 0$ for all $1 \le n \le N$.

Conclusion: the $v_k^0$, $1 \le k \le N$, are linearly independent and $\mathcal{D}_0$ is a basis of $\mathbb{R}^N$.

Proposition B.1.2 Let $\alpha > 0$, and let $\varphi_\alpha$ be a vector of $\mathbb{R}^N$ such that
$$\varphi_\alpha(n) = (n-1)^\alpha + 1 \ \ \forall\, 1 \le n \le N, \quad \text{and} \quad \varphi_\alpha(n) = 1 \ \ \forall\, n \le 0,$$
so that $\varphi_\alpha(1) = 1$ and $\varphi_\alpha(n) \ne 1$ for $2 \le n \le N$. Define
$$\mathcal{D}_\alpha = \{\, v_k^\alpha \in \mathbb{R}^N,\ 1 \le k \le N, \ \text{s.t. } v_k^\alpha(n) = \varphi_\alpha(n - k + 1) \ \forall\, 1 \le n \le N \,\}.$$
$\mathcal{D}_\alpha$ is a basis of $\mathbb{R}^N$.

PROOF: This proof is very similar to the previous one. $\mathcal{D}_\alpha$ has $N$ vectors in $\mathbb{R}^N$, so it is sufficient to prove that the vectors $v_k^\alpha$ are linearly independent. Suppose there exist $N$ numbers $\beta_1, \ldots, \beta_N$ such that
$$\sum_{i=1}^{N} \beta_i v_i^\alpha = 0 \tag{B.10}$$
$$\Longleftrightarrow \quad \forall\, 1 \le n \le N, \quad \sum_{i=1}^{N} \beta_i v_i^\alpha(n) = 0 \tag{B.11}$$
$$\Longleftrightarrow \quad \forall\, 1 \le n \le N, \quad \sum_{i=1}^{N} \beta_i \varphi_\alpha(n - i + 1) = 0. \tag{B.12}$$
If we evaluate equation (B.12) at $n = 1, n = 2, \ldots, n = N$, we obtain the following $N$ equations:
$$\sum_{i=1}^{N} \beta_i = 0 \tag{B.13a}$$
$$\varphi_\alpha(2)\beta_1 + \sum_{i=2}^{N} \beta_i = 0 \tag{B.13b}$$
$$\varphi_\alpha(3)\beta_1 + \varphi_\alpha(2)\beta_2 + \sum_{i=3}^{N} \beta_i = 0 \tag{B.13c}$$
$$\vdots$$
$$\varphi_\alpha(N)\beta_1 + \varphi_\alpha(N-1)\beta_2 + \cdots + \varphi_\alpha(2)\beta_{N-1} + \beta_N = 0. \tag{B.13d}$$
We again use a recursive argument based on property (B.4).

1. Let us show that property (B.4) is true for $p = 1$. Combining equations (B.13a) and (B.13b), we obtain $(\varphi_\alpha(2) - 1)\beta_1 = 0$, which implies $\beta_1 = 0$ because $\varphi_\alpha(2) \ne 1$. So property (B.4) is true for $p = 1$.

2. Now, let $p$ be an integer, $1 \le p < N$, and suppose that property (B.4) is true for $p$: $\beta_j = 0$ for all $1 \le j \le p$. With this property, the set of $N$ equations becomes
$$\sum_{i=p+1}^{N} \beta_i = 0 \tag{B.14a}$$
$$\varphi_\alpha(2)\beta_{p+1} + \sum_{i=p+2}^{N} \beta_i = 0 \tag{B.14b}$$
$$\varphi_\alpha(3)\beta_{p+1} + \varphi_\alpha(2)\beta_{p+2} + \sum_{i=p+3}^{N} \beta_i = 0 \tag{B.14c}$$
$$\vdots$$
$$\varphi_\alpha(N-p)\beta_{p+1} + \cdots + \varphi_\alpha(2)\beta_{N-1} + \beta_N = 0. \tag{B.14d}$$
As for $p = 1$, combining equations (B.14a) and (B.14b) gives $(\varphi_\alpha(2) - 1)\beta_{p+1} = 0$, which implies $\beta_{p+1} = 0$ because $\varphi_\alpha(2) \ne 1$. So property (B.4) is true for $p + 1$.

3. Hence $\beta_n = 0$ for all $1 \le n \le N$.

Conclusion: the $v_k^\alpha$, $1 \le k \le N$, are linearly independent and $\mathcal{D}_\alpha$ is a basis of $\mathbb{R}^N$.

B.2 Fractional Derivatives of the Gaussian

Let $\alpha$ be a positive fractional number, and let $g_\alpha$ be the $\alpha$-th derivative of the Gaussian. A Gaussian does not have compact support, but since it decays fast enough, we can approximate it by a function that has compact support. If we sample $g_\alpha$, we obtain a vector of length $N$ that we call $u_\alpha$. Now define $S$ as the set of indices $m$ where the vector $u_\alpha$ is less than a tolerance ($\mathrm{tol} = 10^{-3}$, but this can be changed depending on the application):
$$S = \{\, m \in \mathbb{N} \ \text{such that, for } 1 \le m \le N, \ |u_\alpha(m)| < \mathrm{tol} \,\},$$
and put these coefficients to zero: $u_\alpha(n) = 0$ for all $n \in S$.

Define the vector $\varphi_\alpha$ as the first circularly shifted version $v$ of $u_\alpha$ such that $v(1) \ne 0$ and $v(N) = 0$:
$$\varphi_\alpha = v \in \mathbb{R}^N \ \text{such that} \ \exists\, l \in \mathbb{N}, \ v = \mathcal{S}^l u_\alpha, \ v(1) \ne 0 \ \text{and} \ v(N) = 0,$$
where $\mathcal{S}$ is the circular shift operator. By construction, $\varphi_\alpha(n) = 0$ for $n > N$ and $n \le 0$.

Proposition B.2.1 Define
$$\mathcal{D}_\alpha = \{\, v_k^\alpha \in \mathbb{R}^N,\ 1 \le k \le N, \ \text{s.t. } v_k^\alpha(n) = \varphi_\alpha(n - k + 1) \,\}.$$
$\mathcal{D}_\alpha$ is a basis of $\mathbb{R}^N$.

PROOF: The proof is similar to the proof for the fractional splines with $\alpha = 0$ because, for $2 \le k \le N$, each vector $v_k^\alpha$ has one more zero component than the previous vector $v_{k-1}^\alpha$. As for the fractional splines, we only need to prove that the vectors $v_k^\alpha$ are linearly independent, and we prove it recursively. Suppose there exist $N$ numbers $\beta_1, \ldots, \beta_N$ such that
$$\sum_{i=1}^{N} \beta_i v_i^\alpha = 0 \tag{B.15}$$
$$\Longleftrightarrow \quad \forall\, 1 \le n \le N, \quad \sum_{i=1}^{N} \beta_i v_i^\alpha(n) = 0 \tag{B.16}$$
$$\Longleftrightarrow \quad \forall\, 1 \le n \le N, \quad \sum_{i=1}^{N} \beta_i \varphi_\alpha(n - i + 1) = 0. \tag{B.17}$$

1. Property (B.4) for $p = 1$. If we evaluate equation (B.17) at $n = 1$, we get
$$\sum_{i=1}^{N} \beta_i \varphi_\alpha(2 - i) = 0, \tag{B.18}$$
which is equivalent to
$$\beta_1 \varphi_\alpha(1) = 0, \tag{B.19}$$
as $\varphi_\alpha(2 - i)$ is only nonzero for $i = 1$. As $\varphi_\alpha(1) \ne 0$, we deduce that $\beta_1 = 0$, which is property (B.4) for $p = 1$.

2. Now, we show that if property (B.4) is satisfied for an integer $p$ between $1$ and $N - 1$, it is also satisfied for $p + 1$. Let $p$ be such an integer and suppose that property (B.4) is true for $p$: $\beta_j = 0$ for all $1 \le j \le p$. So we have
$$\sum_{i=p+1}^{N} \beta_i v_i^\alpha = 0 \tag{B.20}$$
$$\Longleftrightarrow \quad \forall\, 1 \le n \le N, \quad \sum_{i=p+1}^{N} \beta_i \varphi_\alpha(n - i + 1) = 0. \tag{B.21}$$
If we evaluate equation (B.21) at $n = p + 1$, we obtain
$$\sum_{i=p+1}^{N} \beta_i \varphi_\alpha(p - i + 2) = 0, \tag{B.22}$$
which is equivalent to
$$\beta_{p+1} \varphi_\alpha(1) = 0, \tag{B.23}$$
as, within this sum, $\varphi_\alpha(p - i + 2)$ is only nonzero for $i = p + 1$. As $\varphi_\alpha(1) \ne 0$, we deduce that $\beta_{p+1} = 0$, which is property (B.4) for $p + 1$.

3. Property (B.4) is true for $p = 1$, and if it is true for $p$ it is also true for $p + 1$, so $\beta_n = 0$ for all $1 \le n \le N$.

Conclusion: the $v_k^\alpha$, $1 \le k \le N$, are linearly independent and $\mathcal{D}_\alpha$ is a basis of $\mathbb{R}^N$.

Appendix C

Soft Thresholding

C.1 Derivation of the formula

Proposition C.1.1 (Soft Thresholding) Let $a \in \mathbb{R}$ and $\lambda \in \mathbb{R}_+$. Then
$$\operatorname{sign}(a)\left(|a| - \tfrac{1}{2\lambda}\right)_+ = \operatorname*{argmin}_{x} \big( \lambda (a - x)^2 + |x| \big),$$
where $y_+ = \max(0, y)$ for $y \in \mathbb{R}$.

PROOF: Let $f(x) = |x| + \lambda (a - x)^2$, and let $\hat{x} = \operatorname*{argmin}_x f(x)$. We have
$$f(x) = \begin{cases} x + \lambda(a - x)^2 & \text{if } x > 0 \\ \lambda a^2 & \text{if } x = 0 \\ -x + \lambda(a - x)^2 & \text{if } x < 0, \end{cases}
\qquad
f'(x) = \begin{cases} 1 - 2\lambda(a - x) & \text{if } x > 0 \\ -1 - 2\lambda(a - x) & \text{if } x < 0. \end{cases}$$
$f$ is not differentiable at $0$ because $f'(0_+) = 1 - 2\lambda a$ and $f'(0_-) = -1 - 2\lambda a$.

Let $x \ne 0$. Then
$$f'(x) = 0 \quad \Longleftrightarrow \quad \begin{cases} x = a - \frac{1}{2\lambda} \ \text{if } x > 0, & \text{i.e. } a > \frac{1}{2\lambda}, \\ x = a + \frac{1}{2\lambda} \ \text{if } x < 0, & \text{i.e. } a < -\frac{1}{2\lambda}. \end{cases} \tag{C.1}$$
Equation (C.1) can be rewritten as
$$\hat{x} = \operatorname{sign}(a)\left(|a| - \tfrac{1}{2\lambda}\right) \quad \text{if } |a| > \tfrac{1}{2\lambda}. \tag{C.2}$$

Now, if $|a| \le \frac{1}{2\lambda}$, $f'$ can never be zero, and hence the minimum of $f$ cannot occur where the derivative is zero. Consider $|a| \le \frac{1}{2\lambda}$, and note that
$$|a| \le \frac{1}{2\lambda} \ \Longleftrightarrow\ -\frac{1}{2\lambda} \le a \le \frac{1}{2\lambda} \ \Longleftrightarrow\ -1 \le 2\lambda a \le 1 \ \Longleftrightarrow\ 1 - 2\lambda a \ge 0 \ \text{and} \ -1 - 2\lambda a \le 0.$$
Recall the expression of $f'$:
$$f'(x) = \begin{cases} 1 - 2\lambda a + 2\lambda x & \text{if } x > 0 \\ -1 - 2\lambda a + 2\lambda x & \text{if } x < 0. \end{cases}$$
- On $\mathbb{R}_+$, $f'(x) = 1 - 2\lambda a + 2\lambda x$, where $1 - 2\lambda a \ge 0$ and $2\lambda x \ge 0$. In this case, the sign of $f'$ is given by the $y$-intercept $(1 - 2\lambda a)$, which is nonnegative; $f'$ is therefore nonnegative on $\mathbb{R}_+$.
- In the same way, on $\mathbb{R}_-$, $f'(x) = -1 - 2\lambda a + 2\lambda x$, where $-1 - 2\lambda a \le 0$ and $2\lambda x \le 0$. In this case, the sign of $f'$ is given by the $y$-intercept $(-1 - 2\lambda a)$, which is nonpositive; $f'$ is therefore nonpositive on $\mathbb{R}_-$.

We can then conclude that $f$, defined on $\mathbb{R}$, is decreasing on $\mathbb{R}_-$ and increasing on $\mathbb{R}_+$, which means that the minimum of $f$ is at zero.

Summarizing the results, we get
$$\hat{x} = \operatorname{sign}(a)\left(|a| - \tfrac{1}{2\lambda}\right) \ \text{if } |a| > \tfrac{1}{2\lambda}, \qquad \hat{x} = 0 \ \text{if } |a| \le \tfrac{1}{2\lambda},$$
which can be written as one equation:
$$\hat{x} = \operatorname{sign}(a)\left(|a| - \tfrac{1}{2\lambda}\right)_+. \tag{C.3}$$

Appendix D

Fourier Transform and Fast Fourier Transform

This appendix explains how to compute the Fourier Transform of a function via the Fast Fourier Transform (FFT).

D.1 Theoretical results

D.1.1 Definitions

Definition D.1.1 (Fourier Transform) Let $f \in L^1(\mathbb{R})$. The Fourier Transform (FT) of $f$ is the map defined on $\mathbb{R}$ by
$$\mathrm{FT}f(\xi) = \int_{-\infty}^{+\infty} f(t)\, e^{-2i\pi \xi t}\, dt.$$

Definition D.1.2 (Fourier Series) Let $f$ be a $T$-periodic function. The Fourier coefficients $(c_n(f))_{n\in\mathbb{Z}}$ of $f$ are defined by
$$c_n(f) = \frac{1}{T} \int_0^T f(t)\, e^{-2i\pi n \frac{t}{T}}\, dt.$$

Definition D.1.3 (Fast Fourier Transform (FFT)) Let $f$ be a $T$-periodic function. The Fourier coefficients of $f$, $(a_n(f))_{n=0}^{N-1}$, computed by the FFT are given by
$$a_n(f) = \frac{1}{N} \sum_{k=0}^{N-1} f\!\left(\frac{kT}{N}\right) \omega_N^{-nk}, \qquad \omega_N = e^{\frac{2i\pi}{N}}.$$

D.1.2 Approximation of the Fourier coefficients with the FFT

Let $f$ be a $T$-periodic function. The Fourier coefficients $c_n(f)$ are given by
$$\forall\, n \in \mathbb{Z}, \quad c_n(f) = \frac{1}{T} \int_0^T f(t)\, e^{-2i\pi n \frac{t}{T}}\, dt. \tag{D.1}$$
We can approximate (D.1) using the lower Riemann sum
$$c_n(f) \approx \frac{1}{T} \sum_{k=0}^{N-1} \frac{T}{N}\, f\!\left(\frac{kT}{N}\right) e^{-2i\pi n \frac{k}{N}} = \frac{1}{N} \sum_{k=0}^{N-1} f\!\left(\frac{kT}{N}\right) \omega_N^{-nk},$$
where $\omega_N = e^{\frac{2i\pi}{N}}$ is the $N$-th root of unity. An approximate value for the Fourier coefficients $c_n(f)$ is therefore given by $a_n(f)$; that is, $c_n(f) \approx a_n(f)$.

D.1.3 Fourier coefficients and Fourier Transform

Let $f \in L^1(\mathbb{R})$. We set
$$\varphi(t) = \sum_{n\in\mathbb{Z}} f(t + nT),$$
where $\varphi \in L^1(\mathbb{R}/T\mathbb{Z})$. Computing the Fourier coefficients of $\varphi$ gives
$$c_n(\varphi) = \frac{1}{T}\int_0^T \sum_{k\in\mathbb{Z}} f(t + kT)\, e^{-2i\pi n \frac{t}{T}}\, dt = \frac{1}{T}\int_{\mathbb{R}} f(t)\, e^{-2i\pi n \frac{t}{T}}\, dt = \frac{1}{T}\, \mathrm{FT}f\!\left(\frac{n}{T}\right).$$
This sampling is obtained by computing the Fourier coefficients of the periodized function $\varphi$. We can therefore interpret the sequence $(\mathrm{FT}f(\frac{n}{T}))_{n\in\mathbb{Z}}$ as a sampling of the Fourier Transform of $f$.

D.1.4 Fourier Transform and Fast Fourier Transform

To approximate the Fourier Transform of a function $f$ with the FFT, $f$ must be a continuous function with compact support [36]:
$$f \in C^0(\mathbb{R}) \quad \text{and} \quad f(x) = 0 \ \ \forall\, |x| > T_0.$$
Choose $T_1 > T_0$, and consider the function $f$ as the restriction to $[-T_1, T_1]$ of a $2T_1$-periodic function $\tilde{f}$:
$$f = \tilde{f}\big|_{[-T_1, T_1]}.$$
Since $\tilde{f}$ is $2T_1$-periodic, we can compute its Fourier coefficients [36]
$$c_k = \frac{1}{2T_1} \int_{-T_1}^{T_1} f(t)\, e^{-2i\pi k \frac{t}{2T_1}}\, dt = \frac{1}{2T_1}\, \mathrm{FT}f\!\left(\frac{k}{2T_1}\right). \tag{D.2}$$
When $T_1 \gg T_0$, the sampling defined above is sufficiently precise for our purposes. When $f$ does not have compact support, its Fourier Transform can be approximated by the Discrete Cosine Transform.

D.2 Fourier Transform and Fast Fourier Transform in Matlab

D.2.1 Approximation of the Fourier Transform in Matlab

In Matlab, the FFT of a vector $v$ of length $N$ (noted $u$) is defined as
$$\mathrm{FFT}(v)(k) = u(k) = \sum_{n=1}^{N} v(n)\, e^{-2i\pi (k-1)(n-1)/N}, \qquad \text{for } 1 \le k \le N.$$
The inverse FFT is given by
$$\mathrm{IFFT}(u)(n) = v(n) = \frac{1}{N} \sum_{k=1}^{N} u(k)\, e^{2i\pi (k-1)(n-1)/N}, \qquad \text{for } 1 \le n \le N.$$
Let $f$ be a continuous function with compact support $[-T_0, T_0]$, i.e. $f(x) = 0$ for $|x| > T_0$, and let $T_1 > T_0$. From equation (D.2) [36], we have
$$c_n(f) = \frac{1}{2T_1}\, \mathrm{FT}f\!\left(\frac{n}{2T_1}\right) \quad \text{for } n \in \mathbb{Z}.$$
We can also approximate the Fourier coefficients of $f$ by
$$c_n(f) \approx \frac{1}{N} \sum_{k=1}^{N} x_k\, e^{-2i\pi (k-1)(n-1)/N},$$
where the $x_k$, $1 \le k \le N$, are the samples of the $2T_1$-periodic extension of $f$ over one period. The Fourier coefficients are $N$-periodic (i.e. $c_{n+pN} = c_n$ for $p \in \mathbb{Z}$). If we take $N$ large enough, we obtain a good approximation of the Fourier coefficients, and can write
$$c_n(f) \approx \frac{1}{N}\, \mathrm{FFT}(f)(n).$$
Now, $c_n(f) \approx \frac{1}{N}\mathrm{FFT}(f)(n)$ and $c_n(f) = \frac{1}{2T_1}\mathrm{FT}f(\frac{n}{2T_1})$, which leads to
$$\mathrm{FT}f\!\left(\frac{n}{2T_1}\right) \approx \frac{2T_1}{N}\, \mathrm{FFT}(f)(n), \qquad \text{for } 1 \le n \le N \ \ (\text{or } -N/2 + 1 \le n \le N/2).$$
The above formula gives an approximation of the Fourier Transform of a function $f$ with the FFT. The Matlab code for finding the Fourier Transform using the FFT is given below.

D.2.2 Matlab code for the Fourier Transform

function [N, xi, ft, XI, F] = FT(f, T1_x, FLAG)
% function [N, xi, ft, XI, F] = FT(f, T1_x, FLAG)
% Compute the COMPLEX Fourier Transform of the sampled function f.
% BE CAREFUL! f has to have a COMPACT SUPPORT!
%
% INPUTS:
%  Mandatory
%   f:     Function f
%   T1_x:  Half period in x
%  Optional
%   FLAG:  if 1 -> normalizes, if 0 -> no normalization
%
% OUTPUTS:
%   N:   Length of the Fourier Transform F and of XI
%   xi:  Frequencies on which ft is evaluated
%   ft:  Fourier Transform of f
%
% (c) Catherine Dupuis, January 2005

if nargin < 3
    FLAG = 0;
end

n = length(f);
a = -1; b = 1;
stepx = (b - a)/(n - 1);
x = a:stepx:b;

% ---------------- ZERO PADDING ----------------
kpts = floor((T1_x - x(end))/stepx);   % points added on the right to zero-pad
                                       % the signal up to the sample closest to T1_x
T1_s = n/2 + kpts;                     % half length in samples
N = 2*T1_s;                            % always even

k = 1:N;
ind = -T1_x + (2*T1_x*(k - 1))/(N - 1);               % abscissae where sig is evaluated
sig = [zeros((N - n)/2, 1); f; zeros((N - n)/2, 1)];  % zero padding
xN = x(n/2) - (N/2 - 1)*stepx : stepx : x(n/2 + 1) + (N/2 - 1)*stepx;
                                       % whole x axis after zero padding

% ---------------- FOURIER TRANSFORM ----------------
F = fftshift(fft(fftshift(sig))); ind  = [(N-n)/2+l:(N-n)/2+n]; = F ( i n d ) ; 7 . r e s t r i c t i o n of F  ft  nn = [-(N-2)/2-l:(N-2)/2] ; XI = n n . / ( 2 * T l _ x ) ; x i = XI([(N-n)/2+l:(N-n)/2+n]); o/ y y y y y o/ y y y oy y y y y y y y y y y y y y y y y y y y y y y y yy y y oi y y y y y y ot y y y y y y y oi at y y y y y y y y y y y y y y y y y y QI  /o/o/o/o/o/o/o/a/o/.>/o/o/o/o/oA^  vut%%%%%^%%%%%%%%%vM%m.%%% if  NORMALIZATION  7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7,7.7.7.7.7.  FLAG==1 F = F./max(abs(F)); ft  = ft./max(abs(ft));  end y  y  °/w  w /v°/°/ °/wv 0  o y o / o / 0/ 0/ 0/ V W W W V V W W W W V W W W W W W W W W ' / ' / ' / W V W W V W V /o /o/o /o /o/o /o /o /o /o /o/o / o / o / o /o/o /o /«/o / o / o / o /o /o/o / o / o / o / o /o/o/o /o/o /o/o /o /o/o/o /o/o /o/o /o/o /o/o / o / o / o /o/o /o /o/o / o / o / o / o / o / o / o / o / o / o / o / o / o /o/o /o/o  131  Appendix E Explanation of the software  README is the program that locates the reflectors and estimates their fractional-orders. The program works for any size of the data (matrix or cube). S e i s C h a r a c l _ 4 .m  1. BCRDaubechiesl_2.m  Version 1.2  Iterative thresholding algorithm with Coordinate Relaxation Method f u n c t i o n [ s i g n o i s e , c o e f f r e c ] = BCRDaubechiesl_2(s,NameOfDiet,pari,par2,par3, beta,czero,Lmax,limit,expl,speedl,speed2)} S o l v e t h e f o l l o w i n g m i n i m i z a t i o n problem: min b e t a l l | c l | I 1 + ... + b e t a n l I c n I 11 + ( I l s - s l - . . . - s n l I 2)"2 using the Block Coordinate  R e l a x a t i o n Method w i t h r e s p e c t  to the c o e f f i c i e n t s Block Relaxation Coordinate  method  INPUTS: s:  S i g n a l we want t o decompose  NameOfDiet:  Name of t h e merged d i c t i o n a r y  parl,par2,par3:  Parameter of t h e merged d i c t i o n a r y  beta :  Parameter o f t h e m i n i m i z a t i o n  Lmax:  Max. 
# of i t e r a t i o n s  (vector)  czero:  I n i t a l guess f o r t h e c o e f f i c i e n t s  limit:  Stopping  expl:  0 -> l i n e a r  c r i t e r i o n f o r the algorithm decrease  1 -> e x p o n e n t i a l  decrease  speedl:  Speed of l i n e a r / e x p o n e n t i a l decrease  speed2:  Speed of t h e power decrease decrease  only)  OUTPUTS:  132  (for exponential  Matrix  whose columns c o n t a i n t h e d i f f e r e n t  p a r t s as w e l l as t h e r e s i d u a l Matrix  i n the l a s t  content column  whose column c o n t a i n t h e c o e f f i c i e n t s  different  content  133  parts  of t h e  B u i l d D i c l - 3 .m  Version 1.3  B u i l d the detection and estimation dictionaries  function[]  =  BuildDicl_3(whichdic,n,pardic_ave,savedetectest,nb_alpha,  nb_sigma,dist_alpha,percent_sigma) B u i l d the  d i c t i o n a r i e s f o r the  estimation  and  d e t e c t i o n and  saves i t  into: INPUTS: whichdic:  S t r i n g : values 1-  ->  i f whichdic =  ' d e t e c t i o n ' or ' d e t e c t i o n ' ->  'both' only  computes  the  detection dictionary 1- i f w h i c h d i c =  'both' ^>  computes the  detection  &  estimation d i c t i o n a r i e s n: pardic_ave:  Length of the  signal  Average parameters of the  different traces  1- p a r d i c _ a v e ( l )  the  average  the  average s t d d e v i a t i o n  order  of the  contains  2- p a r d i c _ a v e ( 2 ) c o n t a i n s of the  fractional  traces  traces  savedetectest  Name of the d e t e c t i o n or b o t h d i c t i o n a r i e s  nb_alpha:  Vector  of l e n g t h 2 c o n t a i n i n g the  i n the d e t e c t i o n and  nb_sigma:  orders  1- n b _ a l p h a ( l ) : # of f r a c o r d e r s  i n the  detection  2- n b _ a l p h a ( 2 ) : # of f r a c o r d e r s  i n the  estimation  Vector  of l e n g t h 2 c o n t a i n i n g the  i n the d e t e c t i o n and 1- n b _ s i g m a ( l ) : dist_alpha:  # of f r a c  estimation d i c t i o n a r i e s  # of s t d  die 
die  deviations  estimation d i c t i o n a r i e s  # of s t d d e v i a t i o n s  i n the  detection  2- nb_sigma(2): # of s t d d e v i a t i o n s  i n the  estimation  Vector the  of l e n g t h 2 c o n t a i n i n g the  f r a c t i o n a l orders  distances  f o r the d e t e c t i o n  betweent  and  estimation d i c t i o n a r i e s 1- d i s t _ a l p h a ( l ) : percent_sigma:  f o r the  detection  die  2- d i s t _ a l p h a ( 2 ) : f o r the  estimation  Maximum p e r c e n t a g e of the  average s t d d e v i a t i o n added  t o the  average s t d d e v i a t i o n .  percent_sigma) w i l l be the a l l o w e d i n the  dictionary  134  die  sigma_ave(1+/-  furthest std deviation  die die  BuildFracGauss_dicl_2  Version 1.2  B u i l d a dictionary with fractional derivatives of the Gaussian  f u n c t i o n [diet,par1,par2,par3,norm_dic,alpha_dic,sigma_dic,loopsigma]  =  BuildFracGauss_dicl_2(n,alpha_ave,sigma_ave,dist_alpha,percentage, nb_alpha,nb_sigma) B u i l d a merged d i c t i o n a r y w i t h f r a c t i o n a l d e r i v a t i v e s of g a u s s i a n s INPUTS: n:  Length of t h e s i g n a l  alpha_ave:  Average a l p h a i n the s i g n a l  sigma_ave:  Average sigma i n the s i g n a l  dist_alpha:  D i s t a n c e between t h e a l p h a s i n t h e d i c t i o n a r y  percentage:  Maximum percentage  of t h e average sigma ("  frequency)  added or s u b t r a c t e d from t h e average sigma nb_alpha:  Nb of alphas i n t h e d i c t i o n a r y  nb_sigma:  Nb of sigmas i n the d i c t i o n a r y  OUTPUTS: diet:  Name of t h e d i c t i o n a r y  pari,par2,par3:  Parameters of t h e d i c t i o n a r y  (merged  dictionary)  norm_dic:  Norm of the m a t r i x of t h e d i c t i o n a r y  alpha_dic:  V e c t o r of a l p h a s i n t h e d i c t i o n a r y  sigma_dic:  V e c t o r of sigmas i n t h e d i c t i o n a r y  loopsigma:  Flag: = 1 i f l o o p on sigma t o b u i l d t h e d i c t i o n a r y i s t h e 1st one = 0 
otherwise  See a l s o MakeFracGaussl_2.m,  MakeDic.m, NormFracGauss.m,  135  chunkl_l.m  Version 1.1  Process and analyze the output of the iterative thresholding algorithm  function[chunks_c,chunks_index,nbchunk,NbOfChunks,ind] = chunkl_l(c,tol) Chunks the coefficient INPUTS: c: tol:  OUTPUT: chunks_c: chunks_index: nbchunk:  NbOfChunks: ind  vector c into subgroups of events  Coefficient vector c Threshold l e v e l (keep moduli of c that are larger than t o l ) . t o l is c a l l e d mu i n the paper.  C e l l array containing the different chunks NbOfChunks c e l l s ) C e l l array containing the different chunks of c (has NbOfChunks c e l l s ) Vector that keeps track of which chunk the are put i n . Usage: nbchunk(find(ind==index)) gives the of the chunk index i s i n . # of chunks found Indices of the moduli of c larger than t o l  136  of c (has of  indices  indices nb  E s t i m a t e l _ l .m  Version 1.1  Estimate the fractional-orders of the detected events  function[v_alpha] = Estimatel_l(s,cmat,cfold,tolchunk,distblob, alpha.dic,sigma_dic,loopalpha,table) E s t i m a t e t h e f r a c t i o n a l o r d e r a l p h a of t h e d i f f e r e n t  events  INPUTS: s:  S i g n a l analyzed  cmat:  M a t r i x of t h e c o e f f i c i e n t  cfold:  V e c t o r of the f o l d e d c o e f f i c i e n t  tolchunk:  T o l e r a n c e f o r the f u n c t i o n chunk (keep t h e m o d u l i -  distblob:  Minimum d i s t a n c e between t h e chunks a l l o w e d  alpha_dic:  V e c t o r of t h e f r a c t i o n a l o r d e r s f o r t h e d i c t i o n a r y i n  sigma_dic:  V e c t o r of t h e f r e q u e n c i e s f o r t h e d i c t i o n a r y i n which  loopalpha:  Flag:  l a r g e r than  (output of BCRDaubechies) cmat  (length(cfold)=n)  tolchunk) ( i n GetBlob  which cmat i s decomposed i n cmat i s decomposed i n = 1 i f l o o p on a l p h a t o b u i l d t h e d i c t i o n a r y i s t h e 1st = 0 table:  otherwise  V e c t o r g i v i n g the l o 
c a t i o n s i n t h e s i g n a l of the c o e f f i c i e n t  corresponding  atom i n FracGauss  to a particular  dictionaries  OUTPUT: v_alpha:  V e c t o r whose nonzero e n t r i e s g i v e t h e f r a c t i o n a l o r d e r of t h e s i g n a l at t h i s p a r t i c u l a r l o c a t i o n . I t s length i s the l e n g t h of  the s i g n a l  See a l s o chunkl_l.m,  Getblobl.l.m, TableFracGaussl_2.m, w h i c h a l p h a l _ l  BCRDaubechiesl_2.m  137  Foldl-l.m  Version 1.1  Reshape the output of the iterative thresholding algorithm  function[cfold] =  Foldl_l(c,n,NameOfDiet,pari,par2,par3)  F o l d t h e v e c t o r c unto t h e v e c t o r c f o l d ' INPUTS: c:  V e c t o r c of l e n g t h NumberOfDicts*n  n:  Length  NameOfDiet:  Name of t h e merged d i c t i o n a r y  pari,par1,par3:  Parameters of t h e d i c t i o n a r y  of t h e s i g n a l  OUTPUT: cfold:  F o l d e d v e c t o r of l e n g t h n by 1 o b t a i n e d by summing t h e a b s o l u t e v a l u e of t h e d i f f e r e n t w i t h t h e d i f f e r e n t wavelets  See  also  SeisCharacl_4.m  138  coefficients  i n NameOfDict  obtained  Getblobl.l .m  Version 1.1  Process the output of chunkl_l .m for the analysis of the coefficient vector obtained from the iterative thresholding algorithm  function[blobs_c,blobs_index,nbblob,NbOfBlobs,indblob] = Getblobl_l(c,chunks_c,chunks_index,nbchunk,NbOfChunks,dist) Get the coefficients INPUTS: c: chunks_c: chunks_index: nbchunk:  NbOfChunks: dist: OUTPUTS: blobs_c: blobs_index: nbblob:  NbOfBlobs: indblob:  and put them into blobs  Coefficient vector C e l l array containing the different chunks of c (has NbOfChunks c e l l s ) C e l l array containing the different chunks of indices of c (has NbOfChunks c e l l s ) Vector that keeps track of which chunk the indices are put i n Usage: nbchunk(find(ind==index)) gives the nb of the chunk index i s i n . 
NbOfChunks:   # of chunks found
dist:         Minimum distance between the chunks allowed

OUTPUTS:
blobs_c:      Cell array containing the different blobs (merged chunks)
              of c (has NbOfBlobs cells)
blobs_index:  Cell array containing the different blobs of indices of c
              (has NbOfBlobs cells)
nbblob:       Vector that keeps track of which blob the indices are put
              in. Usage: nbblob(find(ind==index)) gives the nb of the
              blob index is in.
NbOfBlobs:    # of blobs found
indblob:      Indices of the entries of c which are in the different
              blobs

See also chunk1_1.m

InitSeisCharac.m

Give examples of parameters for the algorithm.

function[par_detection,par_window,par_estimation,nb_alpha,nb_sigma,
    dist_alpha,percent_sigma,par_bcr_d,par_bcr_e] = InitSeisCharac(n,ratio)

Initialize the default parameters for SeisCharac1_3.m.

INPUTS:
n:      Length of the signal
ratio:  Ratio between the initial length of the traces and the length
        of the traces currently analyzed

OUTPUTS:
par_detection:  Vector of length 2 containing the parameters for the
                detection
                1- par_detection(1): percentage of the norm of the
                   coefficients => par_detection(1)*maxval is the
                   threshold level for the detection
                2- par_detection(2): minimum width of the sets of
                   coefficients associated to one event
par_window:     Vector of length 2 containing the parameters for the
                windowing
                1- par_window(1): # of sigma we want for half the width
                   of the window
                2- par_window(2): # of sigma we want for the width of
                   the decay of the window
par_estimation: Vector of length 2 containing the parameters for the
                estimation (see par_detection)
par_bcr(_d;_e): Vector of length 3 containing the parameters for the
                BCR for detection and estimation (_d or _e)
                1- par_bcr(1): beta
                2- par_bcr(2): if =1 -> exponential decrease
                               if =0 -> linear decrease
                3- par_bcr(3): speed1 for Block Solver
                   (BCRDaubechies1_2.m)
                4- par_bcr(4): speed2 for Block Solver
                   (BCRDaubechies1_2.m)

See also SeisCharac1_4.m

9. NormFracGauss.m

Compute the norm of the dictionary made of fractional derivatives of
the Gaussian.

function[norme] = NormFracGauss(n,dict,par1,par2,par3)

Computes the norm of the FracGauss dictionary.

INPUTS:
n:              Length of the signal
dict:           Name of the dictionary
par1,par2,par3: Parameters of the dictionary

OUTPUT:
norme:          Norm of the matrix of the dictionary

See also SeisCharac1_4.m, SizeOfDict.m, FastS.m, FastA.m

10. PrecomputeDict1_2.m  Version 1.2

Precompute a table of estimation dictionaries.

function[dict_P,par1_P,par2_P,par3_P,no_P,alpha_dic_P,sigma_dic_P] =
    PrecomputeDict1_2(n,savedict,alpha_dic_P,sigma_P)

Computes the different dictionaries for the estimation.

INPUTS:
Mandatory
n:           Length of the signal
savedict:    Name of the mat file where the precomputed dictionary is
             stored
Optional
alpha_dic_P: Vector of the different fractional orders
sigma_P:     Vector of different standard deviations for the start of
             the computation of the estimation dictionaries

OUTPUTS:
dict_P:               Cell array containing the names of the different
                      estimation dictionaries corresponding to the
                      different sigmas in the vector sigma
par1_P,par2_P,par3_P: Cell arrays: parameters of the dictionaries
no_P:                 Vector of the norms of the different dictionaries
alpha_dic_P:          Vector of the fractional orders. All the
                      fractional orders are in each dictionary
sigma_dic_P:          Cell array of the different sigma vectors for
                      each dictionary

See also MakeFracGauss1_2.m, NormFracGauss.m, MakeDic.m,
SeisCharac1_4.m
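The thresholding-and-grouping bookkeeping documented above for chunk1_1.m and Getblob1_1.m can be sketched in a few lines. The thesis code is MATLAB; this Python sketch (the names `chunk` and `get_blobs` are ours, not the thesis's) only illustrates the logic: moduli above the threshold are grouped into runs of consecutive indices ("chunks"), and chunks closer than `dist` samples are merged into "blobs", one per event.

```python
def chunk(c, tol):
    """Group the indices where |c| > tol into runs of consecutive indices."""
    chunks = []
    for i, v in enumerate(c):
        if abs(v) > tol:
            if chunks and i == chunks[-1][-1] + 1:
                chunks[-1].append(i)      # extend the current run
            else:
                chunks.append([i])        # start a new run
    return chunks

def get_blobs(chunks, dist):
    """Merge chunks separated by a gap smaller than dist into 'blobs'."""
    blobs = []
    for ch in chunks:
        if blobs and ch[0] - blobs[-1][-1] < dist:
            blobs[-1].extend(ch)          # close enough: same event
        else:
            blobs.append(list(ch))        # far apart: new event
    return blobs

c = [0.0, 0.9, 0.8, 0.0, 0.0, 0.7, 0.0, 0.0, 0.0, 0.6]
chunks = chunk(c, 0.5)        # [[1, 2], [5], [9]]
blobs = get_blobs(chunks, 4)  # [[1, 2, 5], [9]]
```

With tol = 0.5, samples 1, 2, 5 and 9 survive; the first two runs are less than 4 samples apart and are merged into a single blob, while sample 9 stays a separate event.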
11. SeisCharac1_4.m  Version 1.4

Delineate and characterize transitions from seismic events.

function[v_detection,v_alpha] = SeisCharac1_3(data,ratio,whichdic,
    savedetectest,savedict,trace_nb,par_detection,par_window,
    par_estimation,par_bcr_d,par_bcr_e,FLAG)

Find the locations and fractional orders of seismic events.

INPUTS:
data:          Seismic data (matrix or cube)
ratio:         Ratio between the initial length of the traces and the
               length of the traces currently analyzed
whichdic:      String: values -> 'detection', 'estimation' or 'both'
               1- if whichdic = 'detection' -> only computes the
                  detection dictionary
               2- if whichdic = 'both' -> computes the detection &
                  estimation dictionaries
savedetectest: Name of the detection dictionary
savedict:      Name of the precomputed estimation dictionary
trace_nb:      Vector of length 2 containing the # of the 1st trace to
               be processed and the # of the last trace to be
               processed.
               1- trace_nb(1): # of the 1st trace
               2- trace_nb(2): # of the last trace
par_detection: Vector of length 2 containing the parameters for the
               detection
               1- par_detection(1): percentage of the norm of the
                  coefficients => par_detection(1)*maxval is the
                  threshold level for the detection
               2- par_detection(2): minimum width of the sets of
                  coefficients associated to one event
par_window:    Vector of length 2 containing the parameters for the
               windowing
               1- par_window(1): # of sigma we want for half the width
                  of the window
               2- par_window(2): # of sigma we want for the width of
                  the decay of the window
par_estimation: Vector of length 2 containing the parameters for the
               estimation (see par_detection)
par_bcr(_d;_e): Vector of length 3 containing the parameters for the
               BCR for detection and estimation (_d or _e)
               1- par_bcr(1): beta
               2- par_bcr(2): if =1 -> exponential decrease
                              if =0 -> linear decrease
               3- par_bcr(3): speed1 for Block Solver
                  (BCRDaubechies1_2.m)
               4- par_bcr(4): speed2 for Block Solver
                  (BCRDaubechies1_2.m)
FLAG:          = 1 -> only detection performed (default 0)

OUTPUTS:
v_detection:   Vector containing the locations of the events detected
               in the Detection part
v_alpha:       Vector containing the fractional orders of the different
               events at their location. v_alpha(k) = alpha1 means that
               at the location k, the fractional order is alpha1

See also BuildDic1_3.m, BCRDaubechies1_2.m, BP_Interior.m, Fold1_1.m,
Window1_3.m, Estimate1_1.m, TableFracGauss1_2.m
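As a rough illustration of how the two detection parameters interact, the sketch below (pure Python; `detect_locations` is a hypothetical helper, not the thesis's MATLAB implementation) keeps the folded coefficients above par_detection(1) times the maximum modulus, splits the survivors into runs of consecutive samples, and reports the centre of each run that is at least par_detection(2) samples wide:

```python
def detect_locations(cfold, percent, min_width):
    """Threshold |cfold| at percent*max and keep wide-enough runs."""
    level = percent * max(abs(v) for v in cfold)   # par_detection(1)*maxval
    runs, current = [], []
    for i, v in enumerate(cfold):
        if abs(v) > level:
            if current and i != current[-1] + 1:
                runs.append(current)               # gap: close previous run
                current = []
            current.append(i)
    if current:
        runs.append(current)
    # keep runs at least min_width samples wide; report their centres
    return [sum(r) // len(r) for r in runs if len(r) >= min_width]

cfold = [0.0, 0.1, 0.9, 1.0, 0.8, 0.1, 0.0, 0.4, 0.0]
print(detect_locations(cfold, 0.5, 2))   # [3]
```

With percent = 0.5 the threshold is 0.5, so only samples 2-4 survive; they form one run wide enough to count as an event, centred at sample 3, while the isolated 0.4 at sample 7 is rejected.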
12. whichalpha1_1.m  Version 1.1

Give the fractional-order of a specific atom in a dictionary made with
fractional derivatives of the Gaussian.

function[alpha_answer] = whichalpha1_1(ind,n,sigma,alpha,loopalpha)

ONLY FOR FRACGAUSS DICTIONARY!
Gives the alpha corresponding to the index ind in the long coefficient
vector in the merged dictionary made of the different sigmas in the
vector sigma and the different alphas in the vector alpha.

INPUTS:
ind:       Index in the coefficient vector
n:         Length of the signal
sigma:     Vector of the different sigmas for the merged dictionary
alpha:     Vector of the different alphas for the merged dictionary
loopalpha: Flag: = 1 if loop on alpha to build the dictionary is the
                     1st one
                 = 0 otherwise

OUTPUT:
alpha_answer: Alpha corresponding to ind

See also chunk1_1.m, Estimate1_1.m

13. whichsigma1_1.m  Version 1.1

Give the standard deviation of a specific atom in a dictionary made
with fractional derivatives of the Gaussian.

function[sigma_answer_sample,sigma_answer] = whichsigma1_1(ind,n,sigma,
    alpha,loopsigma)

ONLY FOR FRACGAUSS DICTIONARY!
Gives the sigma corresponding to the index ind in the long coefficient
vector in the merged dictionary made of the different sigmas in the
vector sigma and the different alphas in the vector alpha.

INPUTS:
ind:       Index in the coefficient vector
n:         Length of the signal
sigma:     Vector of the different sigmas for the merged dictionary
alpha:     Vector of the different alphas for the merged dictionary
loopsigma: Flag: = 1 if loop on sigma to build the dictionary is the
                     1st one
                 = 0 otherwise

OUTPUTS:
sigma_answer_sample: Sigma in # of samples
sigma_answer:        Sigma corresponding to ind

See also chunk1_1.m, Estimate1_1.m

14. Window1_3.m  Version 1.3

Window out the detected events.

function[M,sigma_blob_sample,alpha_blob,sigma_blob,v_detection] =
    Window1_3(s,cmat,cfold,tolchunk,distblob,alpha_dic,sigma_dic,
    loopsigma,table,halfnb_sigma_width,nb_sigma_decay)

Window out the signals.

INPUTS:
s:         Signal we analyze
cmat:      Matrix of the coefficients (output of BCRDaubechies)
cfold:     Vector of the folded coefficients of cmat (length(cfold)=n)
tolchunk:  Tolerance for the function chunk (keep the moduli larger
           than tolchunk)
distblob:  Minimum distance between the chunks allowed (in GetBlob.m)
alpha_dic: Vector of the fractional orders for the dictionary in which
           cmat is decomposed
sigma_dic: Vector of the frequencies for the dictionary in which cmat
           is decomposed
loopsigma: Flag: = 1 if loop on sigma to build the dictionary is the
                     1st one
                 = 0 otherwise
table:     Vector giving the locations in the signal of the coefficient
           corresponding to a particular atom in FracGauss dictionaries
halfnb_sigma_width: How many times sigma we want for half the width of
           the window
nb_sigma_decay:     How many times sigma we want for the width of the
           gaussian decay

OUTPUTS:
M:                 Matrix whose columns contain the different windowed
                   signals from s
sigma_blob_sample: Vector containing the std deviations of the atoms
                   correlating the best with each of the windowed
                   signals (in # of samples)
alpha_blob:        Vector containing the alphas of the atoms
                   correlating the best with each of the windowed
                   signals
sigma_blob:        Vector containing the std deviations of the atoms
                   correlating the best with each of the windowed
                   signals
v_detection:       Vector containing the locations of the events
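The whichalpha1_1.m / whichsigma1_1.m lookups invert the stacking of the merged dictionary: the long coefficient vector is a concatenation of length-n blocks, one per (alpha, sigma) pair, and the loop flag records which parameter varied fastest when the blocks were stacked. The arithmetic can be sketched as follows (Python; the block ordering assumed here is illustrative only, which is precisely the ambiguity the loopalpha/loopsigma flags resolve in the thesis code):

```python
def which_alpha_sigma(ind, n, sigma, alpha, loopalpha):
    """Map an index in the merged coefficient vector back to (alpha, sigma).

    The merged dictionary is assumed to stack len(sigma)*len(alpha)
    sub-dictionaries of n atoms each; loopalpha=True means alpha was the
    inner (fastest-varying) loop when the blocks were stacked.
    """
    block = ind // n                     # which n-atom sub-dictionary
    if loopalpha:
        return alpha[block % len(alpha)], sigma[block // len(alpha)]
    return alpha[block // len(sigma)], sigma[block % len(sigma)]

alpha = [0.0, 0.5, 1.0]
sigma = [2.0, 4.0]
# index 350 with n = 100 falls in block 3: alpha[3 % 3], sigma[3 // 3]
print(which_alpha_sigma(350, 100, sigma, alpha, loopalpha=True))  # (0.0, 4.0)
```

The same quotient/remainder split, applied to `block` with the roles of the two parameter vectors exchanged, gives the loopsigma case of whichsigma1_1.m.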
