A Weighted l1-Minimization for Distributed Compressive Sensing

by Xiaowei Li
B.Sc., Peking University, 2013

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in The Faculty of Graduate and Postdoctoral Studies (Mathematics)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)
September 2015
© Xiaowei Li 2015

Abstract

Distributed Compressive Sensing (DCS) studies the recovery of jointly sparse signals. Compared to separate recovery, the joint recovery algorithms in DCS are usually more effective, as they make use of the joint sparsity. In this thesis, we study a weighted l1-minimization algorithm for the joint sparsity model JSM-1 proposed by Baron et al. Our analysis gives a sufficient null space property for joint sparse recovery, and this property can be extended to stable and robust settings. We also present numerical experiments for this algorithm.

Preface

This thesis is original, unpublished, independent work by the author, X. Li, under the supervision of Dr. Ö. Yılmaz.

Table of Contents

Abstract
Preface
Table of Contents
List of Figures
Acknowledgements
1 Introduction
  1.1 Compressive Sensing with l1-Minimization
  1.2 Distributed Compressive Sensing
  1.3 Notations
  1.4 Main Contributions
2 A Weighted l1-Minimization
  2.1 An Equivalent Problem
  2.2 Geometric Intuition: The Unit Ball for Ψ-Norm
  2.3 Stability and Robustness
  2.4 A Null Space Property
3 The Recovery of Two Signals
  3.1 Ψ-Norm and Unit Ball Revisited
  3.2 A Null Space Property
4 Numerical Experiments
  4.1 Sparse Signals
  4.2 Compressible Signals
5 Summary
Bibliography

List of Figures

3.1 Unit ball for Ψ-norm when J = 2 and N = 1
3.2 Support model illustration
4.1 Comparison of joint and separate sparse recovery when J = 2 and N = 50
4.2 Joint sparse recovery under different γ when J = 2 and N = 50
4.3 Joint sparse recovery with A1 = A2 under different γ when J = 2 and N = 50
4.4 Joint stable recovery under different η when J = 2 and N = 50

Acknowledgements

I would like to thank Dr. Özgür Yılmaz for his guidance and support. I would also like to thank my parents for their love throughout the years.

Chapter 1
Introduction

1.1 Compressive Sensing with l1-Minimization

Compressive Sensing (CS) is a modern technique for recovering sparse signals from a seemingly incomplete set of linear measurements. It exploits the sparsity assumption, which leads to recovery algorithms that require far fewer samples than the ambient dimension of the signal to be recovered.
One such algorithm is Basis Pursuit, which relies on l1-minimization.

Mathematically, a vector (signal) x is sparse if most of its entries are zero, and we say it is s-sparse if |{i : x(i) ≠ 0}| ≤ s. The set of all s-sparse vectors in the ambient dimension is denoted by Σs.

In practice, most signals we encounter are not exactly sparse but can be well approximated by sparse signals. Such signals are called compressible signals and are usually modeled as signals whose k-th largest coefficient (in magnitude) x(k) obeys a power decay such as

    |x(k)| ≤ Ck^{−1/q}.   (1.1)

To measure how closely a signal can be approximated by sparse vectors, we define the best s-term approximation error to x in l_p by

    σ_s(x)_p = inf_{z∈Σs} ‖x − z‖_p,   p > 0.

Note that σ_s(x)_p = 0 if x is s-sparse. For compressible signals as in (1.1), one can show that for p > q > 0,

    σ_s(x)_p ≤ C(p/q − 1)^{−1/p} s^{1/p−1/q}.

Given a compressible signal x ∈ R^N, sparse recovery aims to recover a sparse approximation to x from only m measurements, where m ≪ N. Note that this problem is ill-posed if sparsity is not assumed. Ideally, each measurement is a linear function of x taking values in R, but in reality, noise is added each time we take a measurement. Writing this in matrix form, we have

    y = Ax + e,

where A ∈ R^{m×N} is the sampling matrix, e ∈ R^m is additive noise, and y ∈ R^m is the vector of measurements. Each row of this equation represents one measurement, and we typically assume that ‖e‖_2 ≤ ε.

The following l0-minimization is perhaps the most straightforward algorithm for finding the sparsest solution in this setting:

    min_{x∈R^N} ‖x‖_0   s.t. ‖y − Ax‖_2 ≤ ε.   (1.2)

In the noiseless case (i.e., ε = 0), (1.2) recovers every s-sparse vector if and only if spark(A) > 2s [15]. This means m ≥ 2s measurements are sufficient for perfect recovery (provided spark(A) > 2s).
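As a quick illustration of the best s-term approximation error defined above: since σ_s(x)_p depends only on the entries outside the s largest in magnitude, it can be computed by a sort. The sketch below is our own (the function name and sample values are illustrative, not from the thesis):

```python
def best_s_term_error(x, s, p):
    """sigma_s(x)_p: the l_p-norm of x minus its best s-sparse approximation,
    i.e. the l_p-norm of all but the s largest-magnitude entries."""
    tail = sorted(abs(v) for v in x)[:-s] if s > 0 else [abs(v) for v in x]
    return sum(v ** p for v in tail) ** (1.0 / p)

# An exactly 2-sparse vector has zero error:
assert best_s_term_error([0.0, 5.0, 0.0, -2.0], 2, 1) == 0.0

# A "compressible" vector: a few large entries, the rest small.
x = [3.0, -0.1, 0.02, 1.5, 0.0, -0.004]
err = best_s_term_error(x, 2, 1)   # discards 3.0 and 1.5, sums the rest
```

For signals obeying the decay (1.1), this error shrinks polynomially in s, which is what makes sparse approximation effective in practice.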
(Here the spark of a matrix A is the smallest number of columns of A that are linearly dependent.)

However, (1.2) is intractable in practice, as it is NP-hard for any given ε ≥ 0 [18]; furthermore, it is neither stable nor robust [15].

As the convex relaxation of l0-minimization, consider the l1-minimization

    min_{x∈R^N} ‖x‖_1   s.t. ‖y − Ax‖_2 ≤ ε.   (1.3)

This is also referred to as Basis Pursuit DeNoise (BPDN), and simply Basis Pursuit (BP) when ε = 0. It has been shown (e.g., [7], [11]) that BPDN recovers x stably and robustly for suitably chosen sampling matrices A with m ≥ Cs log(N/s). We now discuss this in more detail.

One important concept in CS is the restricted isometry property, first introduced by Candès and Tao [10].

Definition 1.1.1. A matrix A satisfies the restricted isometry property (RIP) of order s with restricted isometry constant (RIC) δ_s if

    (1 − δ_s)‖x‖_2² ≤ ‖Ax‖_2² ≤ (1 + δ_s)‖x‖_2²

holds for every s-sparse vector x.

Remark 1.1.1. Roughly speaking, a small δ_s implies that every s columns of A are nearly orthogonal. In fact, let A_S be the matrix formed from some s columns of A; then by this definition, A_S preserves squared Euclidean norms up to a relative error of δ_s. Hence a small δ_s means A_S almost preserves Euclidean distances and is therefore nearly orthogonal.

The RIP provides uniform guarantees for sparse recovery: a measurement matrix with a small RIC stably and robustly recovers every sparse vector. The following theorem is a precise statement of this.

Theorem 1.1.1 ([8]). Let A ∈ R^{m×N} be a measurement matrix satisfying δ_{2s} < √2 − 1. Then for any vector x∗ ∈ R^N and y = Ax∗ + e ∈ R^m with ‖e‖_2 ≤ ε, a solution x# of (1.3) satisfies

    ‖x# − x∗‖_2 ≤ (C/√s)σ_s(x∗)_1 + Dε,   (1.4)

where C, D are well-behaved constants depending only on δ_{2s}.

Remark 1.1.2. In the sparse and noiseless case, setting σ_s(x∗)_1 = 0 and ε = 0, we see that BP recovers every s-sparse vector when δ_{2s} < √2 − 1.

Remark 1.1.3.
It is also true that (see, e.g., [15])

    ‖x# − x∗‖_1 ≤ Cσ_s(x∗)_1 + D√s ε.   (1.5)

Remark 1.1.4. The bound δ_{2s} < √2 − 1 is not optimal. In fact, along with δ_{3s} + 3δ_{4s} < 2 [9], it is one of the earliest RIP-based guarantees for stable and robust s-sparse recovery, and various improved conditions have appeared since (e.g., [3, 4, 17]). As for optimal results, Cai and Zhang [6] recently showed that for any given t ≥ 4/3, δ_{ts} < √((t−1)/t) is a sharp bound for stable and robust s-sparse recovery (sharp means that for any ε′ > 0, there exists a matrix A with δ_{ts} < √((t−1)/t) + ε′ that cannot stably recover every s-sparse vector). Consequently, the sharp bound for δ_{2s} is δ_{2s} < 1/√2 ≈ 0.7071. They also proved that δ_s < 1/3 is another sharp bound [5].

Despite the simple statement of Theorem 1.1.1, verifying whether a given matrix satisfies the RIP condition is NP-hard [1]. It turns out, however, that with exponentially high probability, various types of random matrices, including Gaussian and Bernoulli matrices, satisfy the RIP with small RIC. This is why random matrices play an important role in CS.

One of the most common and well-studied random ensembles is the Gaussian ensemble. A matrix A ∈ R^{m×N} is Gaussian if its entries A_{ij} are independent Gaussian variables, all with distribution N(0, 1/m). For Gaussian matrices, RIP conditions can be satisfied even when m ≪ N.

Theorem 1.1.2 ([15]). Let A ∈ R^{m×N} be a Gaussian matrix. Then there exists a universal constant C > 0 such that A satisfies δ_s ≤ δ with probability at least 1 − 2exp(−δ²m/C), provided that

    m ≥ Cδ^{−2} s log(eN/s).   (1.6)

Remark 1.1.5. When (1.6) is satisfied, the failure probability is at most

    2exp(−δ²m/C) ≤ 2(s/(eN))^s = 2e^{−γN},

where γ = (s/N)log(eN/s) depends only on the ratio s/N. Therefore, with s/N ≥ α > 0 for some constant α, we have γ ≥ α log(e/α) > 0, so the failure probability decreases exponentially as N increases. Theorem 1.1.2 also holds for certain classes of subgaussian matrices [15].

Remark 1.1.6.
Putting Theorem 1.1.1 and Theorem 1.1.2 together, we know that Gaussian matrices guarantee stable and robust s-sparse recovery (with high probability) provided (1.6) is satisfied. On the other hand, [12] shows that if we want (1.5) or (1.4) to hold with x# being the output of any recovery algorithm, then we must have m ≥ cs log(eN/s) for some constant c > 0. So (1.6) is optimal up to a constant factor.

Remark 1.1.7. If we take an orthonormal basis Φ of R^N, it is easy to see that AΦ is still Gaussian, so Theorem 1.1.2 holds for AΦ. Furthermore, Theorem 1.1.2 also holds for AΦ when A is subgaussian; this is known as the universality of subgaussian matrices [15].

We end this section by noting the case when a signal x is sparse (or compressible) in some other domain Φ, namely x = Φu, where Φ is an orthonormal basis and u is a sparse (or compressible) vector. Considering A′ = AΦ, we know by Theorem 1.1.1 and Remark 1.1.7 that BPDN still offers stable and robust sparse recovery. This is a very useful fact, as it extends the basis for representing signals from the canonical one to any orthonormal one.

1.2 Distributed Compressive Sensing

In compressive sensing, we may sometimes have a collection of signals that are similar in certain respects. One way to recover all these signals is to apply the methods and theory of section 1.1 separately to each signal and its measurements. However, this is often unnecessary; by exploiting the similarities, much better methods can be designed.

As a toy example, in the noiseless case, suppose that we have n s-sparse signals x1, . . . , xn, all with the same support. Consider the following recovery method: first recover x1 through BP, and then recover each xi for i ≥ 2 by solving an s-by-s linear system with supp(xi) = supp(x1). Obviously, this method only requires s measurements each for x2, . . .
, xn, rather than the O(s log(eN/s)) measurements required when each signal is recovered separately.

Distributed Compressive Sensing (DCS) arises in similar circumstances, where multiple (sparse) signals share common information that we can exploit; it studies the joint recovery of such signals. The theory of DCS was introduced in [2], in which the authors use Joint Sparsity Models (JSM) to model the intra- and inter-signal correlations. Other types of signal models have also been studied in [22], [21], [20]. It is worth noting that a signal model is a prerequisite for DCS, and different models usually lead to different algorithms and analyses. In this thesis we are mainly concerned with JSM-1, the first signal model in [2].

JSM-1 assumes that each signal is made of two sparse components: a common part and an innovation part. The common part, modeling the shared information, is the same across all signals, while the innovation part, modeling additional individual information, varies from signal to signal. For such a model with J signals x1, . . . , xJ ∈ R^N, we write

    xi = z0 + zi,   i ∈ [J] := {1, . . . , J},   (1.7)

where z0 denotes the common part and zi the innovation parts.

Here we say {zi}_{i=0}^J is a JSM representation of {xi}_{i=1}^J. Notice from (1.7) that this representation is not unique. For example, given x1(n), . . . , xJ(n), we cannot uniquely determine z0(n), . . . , zJ(n), since there are J + 1 variables and only J equations. However, if we know that one of them is zero, the remaining zi(n) can be determined. We call this the reduced JSM representation.

Definition 1.2.1. {zi}_{i=0}^J is a reduced JSM representation of {xi}_{i=1}^J if

    #{0 ≤ i ≤ J : zi(n) ≠ 0} ≤ J,   ∀n ∈ [N].

It is easy to see that every JSM representation can be rewritten as a reduced one.

We assume each signal is sampled by yi = Ai xi, where Ai ∈ R^{mi×N} is the measurement matrix. To recover the sparse solutions, we can consider the following l0-minimization problem:

    min ∑_{k=0}^J ‖zk‖_0   s.t.
yi = Ai(z0 + zi), i = 1, . . . , J.   (1.8)

Baron et al. give conditions in [2] for recovering {xi} through (1.8). To state these, they first define the overlap size KC(Γ, {zi}) and the conditional sparsity Kcond(Γ, {zi}).

Definition 1.2.2 ([2]). Given a JSM representation {zi}_{i=0}^J, the overlap size for a set of signals Γ ⊆ [J], denoted KC(Γ, {zi}), is the number of indices at which the support of the common part overlaps the supports of the innovation parts of all signals j ∉ Γ:

    KC(Γ, {zi}) = |{n ∈ [N] : z0(n) ≠ 0 and ∀j ∉ Γ, zj(n) ≠ 0}|.

Also define KC([J], {zi}) = KC({zi}) and KC(∅, {zi}) = 0. The conditional sparsity for a set of signals Γ ⊆ [J] is defined as

    Kcond(Γ, {zi}) = ∑_{j∈Γ} |supp(zj)| + KC(Γ, {zi}).

The conditional sparsity is the number of entries from {zi}_{i∈Γ} that must be recovered from the measurements yj (j ∈ Γ). Roughly speaking, the support of each zj (j ∈ Γ) must be recovered from the yj (j ∈ Γ), since the other measurements provide no information about it. Every entry of z0 with z0(n) ≠ 0 and zj(n) ≠ 0 for all j ∉ Γ must also be recovered from the yj (j ∈ Γ): since z0(n)zj(n) ≠ 0 for all j ∉ Γ, z0(n) cannot be uniquely determined from the measurements outside Γ. For the number of required measurements, Baron et al. showed the following theorem.

Theorem 1.2.1 ([2]). Let {xi}_{i∈[J]} be signals in R^N from JSM-1, each measured by a Gaussian matrix Ai ∈ R^{mi×N} yielding yi = Ai xi. Suppose there exists a reduced JSM representation {zi} of {xi} such that

    ∑_{j∈Γ} mj ≥ Kcond(Γ, {zi}) + |Γ|   (1.9)

holds for all Γ ⊆ [J]; then {xi} can be uniquely recovered from (1.8) with probability one over {Ai}_{i∈[J]}.

Conversely, if there exists a reduced JSM representation {z′i} such that

    ∑_{j∈Γ} mj < Kcond(Γ, {z′i})   (1.10)

holds for some Γ ⊆ [J], then {xi} cannot be uniquely recovered.

Remark 1.2.1. The gap of |Γ| between (1.9) and (1.10) is due to identifying the support of the zi.
If all support locations of {zi} are known a priori, this gap can be removed [2].

Aside from JSM-1, another popular signal model in DCS is the Multiple Measurement Vector (MMV) model, which is also JSM-2 in [2]. MMV assumes we have measurements of multiple (sparse) signals with exactly the same support. This model is much better studied than JSM-1, as its support structure is simpler and more direct. Many algorithms have been proposed for MMV, e.g., Simultaneous Orthogonal Matching Pursuit (SOMP [23]), Reduce MMV and Boost (ReMBo [16]), and mixed-norm optimization ([13], [14]).

Even with Theorem 1.2.1, the l0-minimization (1.8) is again NP-hard, just like (1.2). This prompts us to look at its convex relaxation, the weighted l1-minimization (also called the γ-weighted l1-norm formulation in [2]):

    min ∑_{k=0}^J γk‖zk‖_1   s.t. yi = Ai(z0 + zi), i = 1, . . . , J,

where each weight γk > 0 is introduced to balance the magnitudes of the zk. By further assuming that {zi}_{i>0} have approximately the same magnitudes, we arrive at a relatively simplified algorithm which only imposes a weight on the common part z0:

    PZ(γ):  min γ‖z0‖_1 + ∑_{k=1}^J ‖zk‖_1   s.t. yi = Ai(z0 + zi), i = 1, . . . , J,

where γ > 0 is a given parameter. This is the algorithm we will study in the rest of this thesis.

1.3 Notations

Here we clarify the notation used in this thesis.

All vectors are column vectors unless stated otherwise. For simplicity, we write

    X = (x1^T, x2^T, . . . , xJ^T)^T,   Y = (y1^T, y2^T, . . . , yJ^T)^T,   Z = (γz0^T, z1^T, . . . , zJ^T)^T,   (1.11)

and

    A = diag(A1, A2, . . . , AJ),   (1.12)

the block diagonal matrix with A1, . . . , AJ on its diagonal, where X ∈ R^{JN}, Y ∈ R^{∑mi}, Z ∈ R^{(J+1)N} and Ai ∈ R^{mi×N}. Also write

    Ψ = [ γ^{−1}I_N   I_N                 ]
        [ γ^{−1}I_N          I_N          ]
        [     ⋮                   ⋱       ]
        [ γ^{−1}I_N                  I_N  ]   (1.13)

so that X = ΨZ and PZ(γ) becomes

    PZ(γ):  min_{Z∈R^{(J+1)N}} ‖Z‖_1   s.t. Y = AΨZ.

Note that a parameter γ is implied whenever we write Ψ or Z.

Moreover, we will write [M] = {1, 2, . . .
, M} for any positive integer M; |S| or #S for the cardinality of a set S; S^c for the complement of a set S; w(i) for the i-th coordinate of a vector w; supp(w) for the support of a vector w; N for the natural numbers and N+ for the positive integers; and ψ = ker(Ψ) for the kernel of Ψ. Note that ψ is an N-dimensional subspace and

    ψ = {(γu^T, −u^T, . . . , −u^T)^T : u ∈ R^N}.

We denote by wS the restriction of a vector w to the indices in S. In particular, for X ∈ R^{JN} and S ⊆ [JN], XS is defined by

    XS(i) = X(i)1_{i∈S},   i ∈ [JN],

where 1_{i∈S} is the indicator of whether i ∈ S. We will also use the notation X|T for X ∈ R^{JN} and T ⊆ [N], meaning the restriction to T of every xj in X, i.e.,

    X|T = ((x1)T^T, . . . , (xJ)T^T)^T.

In particular, we write X|i = X|{i} for i ∈ [N].

1.4 Main Contributions

The main contribution of this thesis is a null space property that guarantees the recovery of JSM-1 signals through PZ(γ). Furthermore, this property can be made stable and robust. To the best of the author's knowledge, this is the first sufficient condition of its kind. Aside from these recovery guarantees, we also give a geometric interpretation for understanding the effectiveness of PZ(γ).

Chapter 2 gives a detailed discussion of these contributions; the key observation is to rewrite PZ(γ) in an equivalent form, from which our null space property is derived. Chapter 3 is a case study of PZ(γ) with J = 2, namely the recovery of two signals. Chapter 4 presents several numerical experiments on PZ(γ) with J = 2, and Chapter 5 summarizes the thesis.

Chapter 2
A Weighted l1-Minimization

Recall that under the Joint Sparsity Model (JSM-1) we assume every signal xi is the sum of a common part z0 and an innovation part zi. We consider a weighted l1-minimization for recovering J such signals:

    PZ(γ):  min γ‖z0‖_1 + ∑_{k=1}^J ‖zk‖_1   s.t. yi = Ai(z0 + zi), i = 1, . . . , J,

where γ > 0 is a weight imposed on the common part.
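To make the stacked notation of section 1.3 concrete, here is a small pure-Python sketch (toy sizes and values of our own choosing, not from the thesis) that builds Ψ from (1.13) as a list of rows and checks X = ΨZ, i.e., xj = z0 + zj:

```python
def build_psi(J, N, gamma):
    # Psi is JN x (J+1)N: block row j is [gamma^{-1} I_N, 0, ..., I_N, ..., 0],
    # with the identity block in block position j + 1.
    rows = []
    for j in range(J):
        for i in range(N):
            row = [0.0] * ((J + 1) * N)
            row[i] = 1.0 / gamma           # gamma^{-1} I_N block
            row[(j + 1) * N + i] = 1.0     # I_N block for signal j
            rows.append(row)
    return rows

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

J, N, gamma = 2, 3, 0.5
z0 = [1.0, 0.0, 2.0]                       # common part
z1 = [0.0, -1.0, 0.0]                      # innovation parts
z2 = [0.5, 0.0, 0.0]
Z = [gamma * v for v in z0] + z1 + z2      # Z = (gamma*z0; z1; z2), as in (1.11)
X = matvec(build_psi(J, N, gamma), Z)      # X = Psi Z, so xj = z0 + zj
```

Here X recovers (z0 + z1; z0 + z2) stacked, independently of γ, since the factor γ stored in Z cancels the γ^{−1} block in Ψ.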
Using the matrix notation of section 1.3, we can write PZ(γ) in the more concise form

    PZ(γ):  min_{Z∈R^{(J+1)N}} ‖Z‖_1   s.t. Y = AΨZ.

The signals are then recovered via X = ΨZ. To study this algorithm, we first rewrite it in a different but equivalent form.

2.1 An Equivalent Problem

As will be shown in Theorem 2.1.3, the optimization problem PZ(γ) is equivalent to

    PX(γ):  min_{X∈R^{JN}} ‖X‖_Ψ   s.t. Y = AX.

Here the Ψ-norm ‖·‖_Ψ is defined as follows (with ψ = ker(Ψ)):

    ‖X‖_Ψ = inf_{U∈ψ} ‖[0; X] + U‖_1,   (2.1)

where [0; X] ∈ R^{(J+1)N} denotes the vector obtained by stacking the zero vector 0 ∈ R^N on top of X. In other words, the Ψ-norm of X ∈ R^{JN} is the smallest l1-norm of a Z ∈ R^{(J+1)N} such that X = ΨZ, so another way of writing this is

    ‖X‖_Ψ = min_{Z∈R^{(J+1)N}} ‖Z‖_1   s.t. X = ΨZ.

We first give an intuitive (but not rigorous) argument for the equivalence. Looking at PZ(γ), since adding any U ∈ ψ to Z does not change the constraint, we can rewrite it as

    min_{U∈ψ, Z} ‖Z + U‖_1   s.t. Y = AΨ(Z + U) = AΨZ.

Note that the minimization above is over both U and Z. If we minimize over U first, it becomes

    min_Z  inf_{U∈ψ} ‖Z + U‖_1   s.t. Y = AΨZ.

By definition, the Ψ-norm of ΨZ is the smallest l1-norm of a Ẑ such that ΨẐ = ΨZ, so ‖ΨZ‖_Ψ is the same as inf_{U∈ψ} ‖Z + U‖_1 (this can also be shown as in Lemma 2.1.2). Hence we have

    min_Z ‖ΨZ‖_Ψ   s.t. Y = AΨZ.

Lastly, rewriting this as a minimization over X using X = ΨZ, we arrive at PX(γ).

Although not a proof, the above argument does highlight a direct connection between PZ(γ) and PX(γ); we now prove their equivalence in detail. First we need to show that ‖·‖_Ψ is indeed a norm.

Lemma 2.1.1. ‖·‖_Ψ is a norm on R^{JN}.

Proof. Notice that the vector spaces R^{JN} and R^{(J+1)N}/ψ are isomorphic, with isomorphism

    f : R^{JN} → R^{(J+1)N}/ψ,   X ↦ [0; X] + ψ.

The claim then follows from the fact that inf_{U∈ψ} ‖[0; X] + U‖_1 is the canonical quotient norm on R^{(J+1)N}/ψ.

The following lemma is helpful in proving the equivalence of PZ(γ) and PX(γ).

Lemma 2.1.2. For Z ∈ R^{(J+1)N} as defined in (1.11) and Ψ as in (1.13), we have

    ‖ΨZ‖_Ψ = inf_{U∈ψ} ‖Z + U‖_1.

Proof.
Let

    Ẑ = (γz0^T, −z0^T, . . . , −z0^T)^T;

then Ẑ ∈ ψ (as ΨẐ = 0) and

    [0; ΨZ] = (0^T, (z0 + z1)^T, . . . , (z0 + zJ)^T)^T = Z − Ẑ.

By definition,

    ‖ΨZ‖_Ψ = inf_{U∈ψ} ‖[0; ΨZ] + U‖_1 = inf_{U∈ψ} ‖Z − Ẑ + U‖_1 = inf_{U∈ψ} ‖Z + U‖_1,

where the last equality holds because Ẑ ∈ ψ.

Now the main theorem of this section:

Theorem 2.1.3. The optimization problems PZ(γ) and PX(γ) are equivalent in the sense that, with the same Y and A, if Z and X are the solution sets of PZ(γ) and PX(γ) respectively, then

    ΨZ := {ΨZ : Z ∈ Z} = X.

Furthermore, for every Z ∈ Z and X ∈ X,

    ‖Z‖_1 = ‖X‖_Ψ.   (2.2)

Proof. If Z∗ is a minimizer of PZ(γ), it is easy to see that

    ‖Z∗‖_1 = inf_{V∈ker(A), U∈ψ} ‖Z∗ + [0; V] + U‖_1.   (2.3)

We also claim that

    ‖Z∗‖_1 = inf_{U∈ψ} ‖Z∗ + U‖_1.   (2.4)

Indeed, if this failed, we could let

    Û = arg min_{U∈ψ} ‖Z∗ + U‖_1,   Ẑ = Z∗ + Û;

then Ẑ would also be feasible for Y = AΨZ with ‖Ẑ‖_1 < ‖Z∗‖_1, contradicting the minimality of Z∗.

Now for V ∈ ker(A), by Lemma 2.1.2,

    ‖ΨZ∗ + V‖_Ψ = ‖Ψ(Z∗ + [0; V])‖_Ψ = inf_{U∈ψ} ‖Z∗ + [0; V] + U‖_1,   (2.5)

so by (2.3), (2.4) and (2.5) we have

    ‖ΨZ∗ + V‖_Ψ ≥ ‖Z∗‖_1 = inf_{U∈ψ} ‖Z∗ + U‖_1.

Using Lemma 2.1.2 again, this becomes

    ‖ΨZ∗ + V‖_Ψ ≥ ‖ΨZ∗‖_Ψ;

since this holds for every V ∈ ker(A), we conclude that ΨZ∗ is a minimizer of PX(γ), i.e., ΨZ∗ ∈ X.

Conversely, if X∗ is a minimizer of PX(γ), then

    ‖X∗‖_Ψ ≤ ‖X∗ + V‖_Ψ,   ∀V ∈ ker(A).

Take W∗ ∈ R^{(J+1)N} such that

    X∗ = ΨW∗,   ‖X∗‖_Ψ = ‖W∗‖_1.

For any V ∈ ker(A), we have

    ‖W∗‖_1 = ‖X∗‖_Ψ ≤ ‖X∗ + V‖_Ψ.

Also notice that

    ‖X∗ + V‖_Ψ = ‖Ψ(W∗ + [0; V])‖_Ψ = inf_{U∈ψ} ‖W∗ + [0; V] + U‖_1.

So we conclude that

    ‖W∗‖_1 ≤ inf_{U∈ψ} ‖W∗ + [0; V] + U‖_1

holds for every V ∈ ker(A). Therefore W∗ is a minimizer of PZ(γ), i.e., W∗ ∈ Z.

Finally, (2.2) holds because, as shown above, ‖X∗‖_Ψ = ‖W∗‖_1 with X∗ ∈ X and W∗ ∈ Z.

Remark 2.1.1. By Theorem 2.1.3, Ψ maps the solution set Z onto X. We point out that this mapping is not necessarily one-to-one. For example, take N = 1, J = 3, γ = 1 and assume that X0 = [x1 x2 x3]^T = [1 1 0]^T is in X. Notice that ‖X0‖_Ψ = 2 and

    X0 = (0, 0, 0)^T + (1, 1, 0)^T = (1, 1, 1)^T + (0, 0, −1)^T,

where the first term in each decomposition is the replicated common part. So we have Z0 = [0 1 1 0]^T and Z1 = [1 0 0 −1]^T, both in Z, with ΨZ0 = ΨZ1 = X0.
(In fact, Zα = [α 1−α 1−α −α]^T ∈ Z for all α ∈ [0, 1], and ΨZα = X0.)

However, for most choices of γ, the mapping Ψ is indeed one-to-one. This is stated in the following proposition.

Proposition 2.1.4. Let Z and X be as in Theorem 2.1.3. If

    γ ∉ {s − t : s, t ∈ N+, s + t = J},   (2.6)

then Ψ is a bijection between Z and X.

Proof. For X ∈ X, consider

    ZX = {Z ∈ Z : ΨZ = X}.

We need to show |ZX| = 1, so that Ψ is injective. First notice that for every Z ∈ ZX, once its first N coordinates γz0 are given, z1, . . . , zJ are uniquely determined through ΨZ = X. In other words, Z ∈ ZX is uniquely determined by γz0, or simply by z0.

Now look at z0. From Theorem 2.1.3 we have ‖Z‖_1 = ‖X‖_Ψ. Writing this out, we see that z0(i) ∈ Ui, where

    Ui = arg min_{u∈R} γ|u| + ∑_{j∈[J]} |xj(i) − u|,   i ∈ [N].

To study the cardinality of Ui, let

    fi(u) = γ|u| + ∑_{j∈[J]} |xj(i) − u|;

then fi is a convex piecewise linear function whose right derivative is

    fi′+(u) = γ − s + t   for u ≥ 0,      fi′+(u) = −γ − s + t   for u < 0,

where s = #{j : xj(i) > u} and t = #{j : xj(i) ≤ u}. By condition (2.6), fi′+(u) ≠ 0 for every u ∈ R.

Thus fi has a unique minimizer, i.e., |Ui| = 1. Indeed, if there were two minimizers u1 < u2, then by convexity fi would be constant on [u1, u2], implying fi′+(u1) = 0.

Since every Ui (i ∈ [N]) contains exactly one element, z0 can take only one possible value, so |ZX| = 1. Finally, Ψ is surjective by Theorem 2.1.3.

Theorem 2.1.3 suggests that PZ(γ) and PX(γ) are essentially the same, so we will not distinguish them from now on. In the rest of the thesis, a solution X∗ of PZ(γ) simply means that X∗ is a solution to the equivalent problem PX(γ); similarly, a solution Z∗ of PX(γ) means that Z∗ is a solution to the equivalent problem PZ(γ).

In practice, we are more interested in the case where X contains only one (sparse) element. The reason for this is that most algorithms for l1-minimization return only one solution.
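The argument in the proof of Proposition 2.1.4 can also be checked numerically: fi is convex and piecewise linear with breakpoints at 0 and the xj(i), so its minimum is attained at one of these breakpoints, and we can simply enumerate them. The helper below and its toy values are our own; it reproduces the non-uniqueness of Remark 2.1.1 when γ = 1 and J = 3 (which violates (2.6), since 1 = 2 − 1), while γ = 1/2 gives a unique minimizer:

```python
def f(xs, gamma, u):
    # The per-coordinate objective f_i(u) = gamma*|u| + sum_j |x_j - u|.
    return gamma * abs(u) + sum(abs(x - u) for x in xs)

def minimizing_breakpoints(xs, gamma, tol=1e-12):
    # Convex piecewise-linear: the minimum is attained at a breakpoint
    # in {0} union {x_j}.  Two minimizing breakpoints indicate a flat
    # segment, i.e. a non-unique minimizer.
    pts = sorted(set([0.0] + list(xs)))
    best = min(f(xs, gamma, p) for p in pts)
    return [p for p in pts if f(xs, gamma, p) <= best + tol]

print(minimizing_breakpoints([1.0, 1.0, 0.0], 1.0))   # -> [0.0, 1.0]
print(minimizing_breakpoints([1.0, 1.0, 0.0], 0.5))   # -> [1.0]
```

The first call corresponds exactly to the example X0 = [1 1 0]^T with γ = 1, where z0 could be any value in [0, 1].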
Besides, even if we were able to find more solutions, we might not be able to identify which one is the true signal. As a consequence of Theorem 2.1.3, the following corollary characterizes the unique minimizer of PZ(γ).

Corollary 2.1.5. X∗ is the unique minimizer of PZ(γ) if and only if

    ‖[0; X∗] + [0; V] + U‖_1 > ‖X∗‖_Ψ   (2.7)

holds for every V ∈ ker(A)\{0} and every U ∈ ψ.

Proof. This follows directly from the fact that X∗ is the unique minimizer if and only if

    ‖X∗ + V‖_Ψ > ‖X∗‖_Ψ,   ∀V ∈ ker(A)\{0}.

Corollary 2.1.5 may be used to derive further guarantees for sparse recovery. For now, we turn to a study of the unit ball of the Ψ-norm, as it provides a geometric interpretation of how and why PZ(γ) works.

2.2 Geometric Intuition: The Unit Ball for Ψ-Norm

The main result of this section (Theorem 2.2.3) is a description of B_{‖·‖Ψ}, the unit ball of the Ψ-norm. Roughly speaking, B_{‖·‖Ψ} is the convex hull of the l1 unit ball B_{‖·‖1} and a few "other points". To some degree, then, B_{‖·‖Ψ} and B_{‖·‖1} are similar to each other, which helps explain why PZ(γ) promotes sparsity. Meanwhile, where the two balls differ, the presence of those "other points" favors signals with a (large) common part as solutions of PZ(γ), which helps explain why PZ(γ) performs well for the joint model. The weight γ also plays a role in determining the shape of B_{‖·‖Ψ}.

We start with an easier scenario: each signal xj is a single value in R, i.e., N = 1 and xj = xj(1). We have the following lemma.

Lemma 2.2.1. For X and Ψ with N = 1, we have

    {X ∈ R^J : ‖X‖_Ψ ≤ 1} = Conv{±e^J_i, ±(1/γ)1_J : i ∈ [J]},

where e^J_i ∈ R^J is the vector whose i-th coordinate is one and all others are zero, and 1_J ∈ R^J is the all-ones vector.

Proof.
"⊆": By definition,

    ‖X‖_Ψ = inf_u |γu| + ∑_{j∈[J]} |xj(1) − u|.

Let

    u∗ = arg min_u |γu| + ∑_{j∈[J]} |xj(1) − u|;

we can then write X as

    X = γu∗ · (1/γ)1_J + ∑_{j∈[J]} (xj(1) − u∗)e^J_j.

For ‖X‖_Ψ = 1, notice that

    |γu∗| + ∑_{j∈[J]} |xj(1) − u∗| = 1,

so X ∈ Conv{±e^J_i, ±(1/γ)1_J : i ∈ [J]}.

"⊇": Notice that ‖e^J_i‖_Ψ = 1, ‖(1/γ)1_J‖_Ψ ≤ 1, and {X ∈ R^J : ‖X‖_Ψ ≤ 1} is convex.

One consequence of Lemma 2.2.1 is the following corollary, which follows from the convexity of the unit ball.

Corollary 2.2.2. For X and Ψ with N = 1, if ‖X‖_Ψ = 1, then there exist λk such that

    X = ∑_k λk pk,   λk ≥ 0,   ∑_k λk = 1,

where pk ∈ {±e^J_i, ±(1/γ)1_J : i ∈ [J]}.

With the preparations above, we can now describe the unit ball of the Ψ-norm. The key observation is to write X as a sum of projections, to each of which Lemma 2.2.1 and Corollary 2.2.2 can be applied.

Theorem 2.2.3. Let e^n_i be the vector in R^n whose i-th coordinate is one and all other coordinates are zero. Denote by B^{JN}_{‖·‖} the unit ball of a norm ‖·‖ on R^{JN}, i.e., B^{JN}_{‖·‖} := {X ∈ R^{JN} : ‖X‖ ≤ 1}. Then

    B^{JN}_{‖·‖Ψ} = Conv{±e_i, ±d_j : i ∈ [JN], j ∈ [N]} = Conv{B^{JN}_{‖·‖1}, ±d_j : j ∈ [N]},

where e_i = e^{JN}_i and d_j = (1/γ)((e^N_j)^T, . . . , (e^N_j)^T)^T ∈ R^{JN}.

Proof. "⊆": For every X ∈ ∂B^{JN}_{‖·‖Ψ} (∂S denotes the boundary of a set S; here ∂B^{JN}_{‖·‖Ψ} = {X ∈ R^{JN} : ‖X‖_Ψ = 1}), let

    X|i = ∑_{j=1}^J xj(i) e_{i+(j−1)N},   i = 1, . . . , N

be the projection of every xj onto its i-th coordinate; then X = ∑_{i∈[N]} X|i and, by definition,

    ‖X‖_Ψ = ∑_{i∈[N]} inf_{ui} (|γui| + ∑_{j∈[J]} |xj(i) − ui|) = ∑_{i∈[N]} ‖X|i‖_Ψ.

Let αi = ‖X|i‖_Ψ. By Corollary 2.2.2, there exist λik and pik such that

    X|i = αi ∑_k λik pik,   λik ≥ 0,   ∑_k λik = 1,   pik ∈ {±d_i, ±e_i, . . . , ±e_{i+(J−1)N}}.

So

    X = ∑_{i∈[N]} X|i = ∑_{i∈[N]} ∑_k αi λik pik,

and we also have

    ∑_{i∈[N]} ∑_k αi λik = ∑_{i∈[N]} αi = ‖X‖_Ψ = 1.

This shows X ∈ Conv{±e_i, ±d_j : i ∈ [JN], j ∈ [N]}.

"⊇": Notice that ‖e_i‖_Ψ = 1, ‖d_j‖_Ψ ≤ 1, and B^{JN}_{‖·‖Ψ} is convex.

Remark 2.2.1. B^{JN}_{‖·‖Ψ} is the convex hull of B^{JN}_{‖·‖1} and {±d_j : j ∈ [N]}. Apart from the ±d_j, the two unit balls B^{JN}_{‖·‖Ψ} and B^{JN}_{‖·‖1} are similar in shape.
Since the l1-norm effectively promotes sparsity, it should come as no surprise that the Ψ-norm is also capable of finding sparse solutions. On the other hand, divide d_j into consecutive blocks of length N: every block has only its j-th coordinate nonzero, with value 1/γ. This structure corresponds to the common part in JSM-1. For γ < J, the points ±d_j lie outside B^{JN}_{‖·‖1}, so B^{JN}_{‖·‖1} ⊂ B^{JN}_{‖·‖Ψ}. In this case, consider minimizing over X in both the Ψ-norm and the l1-norm under the same linear constraints, and let X^∗ and X_∗ be the respective unique minimizers. If ‖X^∗‖_Ψ = ‖X_∗‖_Ψ, then X^∗ = X_∗; otherwise ‖X^∗‖_Ψ < ‖X_∗‖_Ψ and ‖X^∗‖_1 > ‖X_∗‖_1, which implies that X^∗ has a larger common part than X_∗. Thus the Ψ-norm is more likely than the l1-norm to find solutions with large common parts. In all, we can see that PZ(γ) promotes both sparsity and the common part when γ < J. (When γ ≥ J, the Ψ-norm reduces to the l1-norm.)

Remark 2.2.2. The value of γ affects the performance of the algorithm through its impact on B_{‖·‖Ψ}.

1. For small γ (close to 0), the scale of the d_j, which is 1/γ, becomes very large, so the algorithm favors solutions with large shared components. This is only suitable for signals with a predominant common part.

2. As γ increases while remaining less than J, d_j moves towards (γ/J)d_j ∈ ∂B_{‖·‖1}. So B_{‖·‖Ψ} shrinks in all ±d_j directions and becomes more and more similar to B_{‖·‖1}. A good choice of γ in practice usually lies somewhere in this range: one that balances the common part and the innovation parts of the signals.

3. Once γ ≥ J, the Ψ-norm becomes the l1-norm, as ±d_j ∈ B_{‖·‖1} for all j. This can also be seen from the definition of the Ψ-norm.
Recall that

    ‖X‖_Ψ = ∑_{i∈[N]} min_u (γ|u| + ∑_{j∈[J]} |xj(i) − u|).

When γ ≥ J, for every i ∈ [N] the minimum is achieved at u = 0 with value ∑_{j∈[J]} |xj(i)|, and the right-hand side above becomes ‖X‖_1. There is no need for PZ(γ) in this scenario, since it is computationally heavier than l1-minimization on X.

Corollary 2.2.4. For X ∈ R^{JN} and p ≥ 1,

    min{γ/J^{1/p}, 1}‖X‖_p ≤ ‖X‖_Ψ ≤ (JN)^{1−1/p}‖X‖_p.

Proof. Since B^{JN}_{‖·‖1} ⊆ B^{JN}_{‖·‖Ψ}, we have

    ‖X‖_Ψ ≤ ‖X‖_1 ≤ (JN)^{1−1/p}‖X‖_p.   (2.8)

For the other inequality, consider the e_i and d_j from Theorem 2.2.3; we have

    ‖e_i‖_Ψ = ‖e_i‖_p = 1,   ‖d_j‖_Ψ = 1,   ‖d_j‖_p = J^{1/p}/γ.

Since B^{JN}_{‖·‖Ψ} is the convex hull of points whose l_p-norms are at most max{1, J^{1/p}/γ}, we get B^{JN}_{‖·‖Ψ} ⊆ max{1, J^{1/p}/γ} · B^{JN}_{‖·‖p}, which is exactly the first inequality.

Remark 2.2.3. Setting p = 1 gives

    min{γ/J, 1}‖X‖_1 ≤ ‖X‖_Ψ ≤ ‖X‖_1.   (2.9)

The equalities in (2.9) are achieved at the e_i or d_j.

For p > 1, the left equality in Corollary 2.2.4 can always be achieved (at e_i or d_j), but the right one cannot always be achieved; consequently the factor (JN)^{1−1/p} is not exactly sharp. It is sharp, however, when γ ≥ 1 or J is even. To see this, consider X̃ = (1_N^T, −1_N^T, . . . , (−1)^{J+1}1_N^T)^T, i.e., xj = (−1)^{j+1}1_N, where 1_N is the all-ones vector in R^N. With X = X̃, (2.8) becomes a chain of equalities. As for when γ ∈ (0, 1) and J is odd, (JN)^{1−1/p} is not sharp, because every nonzero X achieving equality on the right of (2.8) (i.e., |X(i)| = |X(1)| > 0 for all i ∈ [JN]) has strict inequality on the left.

2.3 Stability and Robustness

In section 2.1 we gave a condition (Corollary 2.1.5) for the exact recovery of X through the algorithm PZ(γ). Here we extend this condition so that PZ(γ) has guaranteed stability and robustness.

Stability

Stability of PZ(γ) enables the recovery not only of exactly sparse signals under JSM-1, but also of compressible ones. This is important in practice, as most real-life signals are of the latter kind. To study this, we
To study this, we first need to model the compressible signals under JSM-1. We now write the signals to be recovered as X̂ = [x̂_1^T, . . . , x̂_J^T]^T (where x̂_j ∈ R^N) and the measurements as Ŷ = AX̂. Other notations remain the same as before. We say that X̂ is JSM-1 compressible if it can be written as

X̂ = X + E, (2.10)

where X is a vector under JSM-1 and E is small compared to X. Note that this definition is rather loose and allows some freedom in choosing X and E. Unlike the definitions of compressible signals and best s-term approximation in compressive sensing, each x_j in (2.10) need not be a best ‖x_j‖_0-term approximation of x̂_j. This is because, if it were forced to be, the resulting X might no longer be in JSM-1. Under this new setting, PZ(γ) and PX(γ) become

P̂Z(γ) : min_{Z∈R^{(J+1)N}} ‖Z‖1 s.t. Ŷ = AΨZ.
P̂X(γ) : min_{X∈R^{JN}} ‖X‖Ψ s.t. Ŷ = AX.

To establish stability, one natural idea is to apply the stable null space property from compressive sensing to P̂X(γ), but with the Ψ-norm. An argument similar to that for the l1-norm [15] then yields the desired result. However, the stable condition derived this way is too strong and impractical. A brief explanation is given below.

Recall the definition of the stable null space property in compressive sensing.

Definition 2.3.1. A matrix A ∈ R^{m×N} satisfies the stable null space property with constant 0 < ρ < 1 relative to S (denoted as (S, ρ)-NSP) if

‖v_S‖1 ≤ ρ‖v_S̄‖1, ∀v ∈ ker(A)\{0}.

It is said to satisfy the stable null space property of order s (denoted as (s, ρ)-NSP) if the above holds for every S ⊆ [N] with |S| ≤ s.

If a matrix Φ is (S, ρ)-NSP with S = supp(x), then l1-minimization is stable due to the following result [15]:

‖x − ∆_{l1}(x)‖1 ≤ (2(1 + ρ)/(1 − ρ)) · σ_{|S|}(x)_1,

where

∆_{l1}(x) = arg min_w ‖w‖1 s.t.
Φw = Φx.

Note that in showing this, the only property required from the l1-norm is that ‖w_S‖1 + ‖w_S̄‖1 = ‖w‖1 for all w. To apply this approach to P̂X(γ), we first need to choose the support set T = ∪_{j∈[J]} supp(x_j) so that the property ‖W|_T‖Ψ + ‖W|_T̄‖Ψ = ‖W‖Ψ holds for all W ∈ R^{JN}. A stable null space property with the Ψ-norm then reads

‖V|_T‖Ψ ≤ ρ‖V|_T̄‖Ψ, ∀V ∈ ker(A)\{0}. (2.11)

Unfortunately, (2.11) is a little too strong. In fact, take any V = [v_1^T, 0]^T where v_1 ∈ ker(A_1)\{0}; then V ∈ ker(A)\{0}, and by (2.9) and (2.11) we have

‖(v_1)_T‖1 = ‖V|_T‖Ψ ≤ ρ‖V|_T̄‖Ψ ≤ ρ‖V|_T̄‖1 = ρ‖(v_1)_T̄‖1.

Thus we arrive at an even stronger condition than (supp(x_1), ρ)-NSP on A_1, which already implies the stable recovery of x_1 through l1-minimization. Similar results hold for x_2, . . . , x_J. Therefore, in the case of (2.11), there is no need to even consider JSM and P̂Z(γ), as we can simply recover through separate l1-minimizations.

For X̂ = X + E ∈ R^{JN}, we will instead use the following condition (2.12) to provide stability for P̂Z(γ). Here α > 0 is some constant.

‖X + V‖Ψ ≥ ‖X‖Ψ + α‖V‖Ψ, ∀V ∈ ker(A). (2.12)

Note that (2.12) depends on X rather than X̂ and, compared with the necessary and sufficient condition in Corollary 2.1.5, it is slightly stronger than (2.7) due to the extra term α‖V‖Ψ. Theorem 2.3.1 gives a stability result based on (2.12).

Theorem 2.3.1. Given JSM-1 compressible signals X̂ ∈ R^{JN} as in (2.10), let X# be a solution of P̂X(γ) with Ŷ = AX̂. If (2.12) holds for some α > 0, then

‖X̂ − X#‖Ψ ≤ (2/α + 2)‖E‖Ψ. (2.13)

Proof. Let

V = X# − X̂ = X# − X − E;

we will prove (2.13) by contradiction. If (2.13) does not hold, then

‖X# − X‖Ψ ≥ ‖V‖Ψ − ‖E‖Ψ > (2/α + 1)‖E‖Ψ.

Together with (2.12), we have

‖X# − E‖Ψ − ‖X‖Ψ = ‖X + V‖Ψ − ‖X‖Ψ
≥ α‖X# − X − E‖Ψ
≥ α(‖X# − X‖Ψ − ‖E‖Ψ)
> 2‖E‖Ψ.

On the other hand, since X# is a solution of P̂X(γ),

‖X# − E‖Ψ ≤ ‖X#‖Ψ + ‖E‖Ψ ≤ ‖X̂‖Ψ + ‖E‖Ψ ≤ ‖X‖Ψ + 2‖E‖Ψ.

This gives us a contradiction. Therefore (2.13) must hold.

Remark 2.3.1.
We may think of (2.12) as roughly another form of the stable null space property. In fact, for l1-minimization in compressive sensing, it is equivalent to the stable null space property, as stated in the following proposition.

Proposition 2.3.2. A matrix Φ ∈ R^{m×N} is (s, ρ)-NSP if and only if

‖x + v‖1 ≥ ‖x‖1 + α‖v‖1 (2.14)

holds for all v ∈ ker(Φ) and x ∈ R^N with ‖x‖0 ≤ s. Here the correspondence between ρ and α is

α = (1 − ρ)/(1 + ρ), ρ = (1 − α)/(1 + α).

Proof. "if": For S ⊆ [N] with |S| ≤ s and v ∈ ker(Φ), putting x = −v_S in (2.14), we have

‖v_S̄‖1 ≥ ‖v_S‖1 + α‖v‖1.

Simplifying this gives ‖v_S‖1 ≤ ρ‖v_S̄‖1 with ρ = (1 − α)/(1 + α).

"only if": For v ∈ ker(Φ) and x ∈ R^N with ‖x‖0 ≤ s, let S = supp(x). From (s, ρ)-NSP we have

‖v‖1 = ‖v_S‖1 + ‖v_S̄‖1 ≤ (1 + ρ)‖v_S̄‖1.

Hence

‖v_S̄‖1 = ρ‖v_S̄‖1 + (1 − ρ)‖v_S̄‖1 ≥ ‖v_S‖1 + ((1 − ρ)/(1 + ρ))‖v‖1.

Using supp(x) = S, (2.14) then follows from

‖x + v‖1 ≥ ‖x‖1 − ‖v_S‖1 + ‖v_S̄‖1 ≥ ‖x‖1 + ((1 − ρ)/(1 + ρ))‖v‖1.

Remark 2.3.2. A similar argument to the "only if" part of Proposition 2.3.2, with the Ψ-norm, shows that (2.11) implies (2.12). However, a similar "if" argument does not work, because we cannot simply put X = V|_T and invoke (2.12), as V|_T may well not be in JSM-1. This tells us that condition (2.12) is no stronger than (2.11). In fact, it is actually weaker, as demonstrated in the following example.

Consider the case J = 2, N = 2 and γ ≥ 1. Let

A_1 = [1 0], A_2 = [1 2],

and

x_1 = [1, 0]^T, x_2 = [0, 0]^T.

Then we have z_0 = 0, z_1 = x_1, z_2 = x_2, X = [1 0 0 0]^T and

V = [0 a −2b b]^T, a, b ∈ R.

Notice that for γ ≥ 1 and W ∈ R^{JN},

max{|W(1)|, |W(3)|} + max{|W(2)|, |W(4)|} ≤ ‖W‖Ψ ≤ ‖W‖1.

Hence

‖X + V‖Ψ ≥ 1 + max{|a|, |b|},

and

‖X‖Ψ + α‖V‖Ψ ≤ 1 + α(|a| + 3|b|) ≤ 1 + 3α · max{|a|, |b|}.

Therefore (2.12) holds with α = 1/3.

On the other hand, if we take V∗ = [0 1 −2 1]^T ∈ ker(A) and T = ∪supp(x_j) = {1}, then we have

‖V∗|_T‖Ψ = 2 > 1 = ‖V∗|_T̄‖Ψ.

So (2.11) does not hold for any ρ ∈ (0, 1).

Robustness

The robustness of PZ(γ) lets us deal with noisy measurements.
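The two Ψ-norm values in the example above can be verified numerically (a small sketch; psi_norm is our own helper, which evaluates the inner convex objective at its breakpoints):

```python
import numpy as np

def psi_norm(X, gamma):
    # ||X||_Psi = sum_i min_u gamma*|u| + sum_j |x_j(i) - u|; the inner
    # objective is piecewise-linear convex, so checking its breakpoints
    # {0} union {x_j(i)} suffices.
    total = 0.0
    for col in X.T:
        cands = np.concatenate(([0.0], col))
        total += min(gamma * abs(u) + np.abs(col - u).sum() for u in cands)
    return total

gamma = 1.0
V_star = np.array([[0.0, 1.0],      # v1 = (0, 1)
                   [-2.0, 1.0]])    # v2 = (-2, 1), i.e. V* = [0 1 -2 1]^T

V_T    = V_star * np.array([1.0, 0.0])   # restrict to T = {1}
V_Tbar = V_star * np.array([0.0, 1.0])   # restrict to the complement {2}

print(psi_norm(V_T, gamma), psi_norm(V_Tbar, gamma))   # 2.0 1.0
```

Since 2 can never be bounded by ρ · 1 with ρ ∈ (0, 1), this confirms that (2.11) fails for this V∗.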
Here we assumethe measurements Y′are corrupted with noise and writeY′= AXˆ + , ‖‖2 ≤ ε, (2.15)where = [T1 , . . . , TJ ]T and j ∈ Rmj is the noise when measuring xj .Problems PZ(γ) and PX(γ) then becomeP ′Z(γ) : minZ∈R(J+1)N‖Z‖1 s.t. ‖Y ′ −AΨZ‖2 ≤ ε.P ′X(γ) : minX∈RJN‖X‖Ψ s.t. ‖Y ′ −AX‖2 ≤ ε.The main result for robustness is the following theorem.262.3. Stability and RobustnessTheorem 2.3.3. Given JSM-1 compressible signals Xˆ ∈ RJN as in (2.10)and its measurement Y′as in (2.15). Let X# be a solution of P ′X(γ). If‖X + V ‖Ψ + τ‖AV ‖2 ≥ ‖X‖Ψ + α‖V ‖Ψ, ∀V ∈ RJN . (2.16)holds for some α > 0 and τ ≥ 0, then‖Xˆ −X#‖Ψ ≤ (2/α+ 2)‖E‖Ψ + (2τ/α)ε. (2.17)Proof. This proof is similar to that of Theorem 2.3.1. LetV = X# − Xˆ = X# −X − E,we have‖AV ‖2 ≤ ‖AX# − Y ′‖2 + ‖Y ′ −AXˆ‖2 ≤ 2ε.If (2.17) does not hold, then‖X# −X‖Ψ ≥ ‖V ‖Ψ − ‖E‖Ψ > (2/α+ 1)‖E‖Ψ + (2τ/α)ε.Hence‖X# − E‖Ψ − ‖X‖Ψ = ‖X + V ‖Ψ − ‖X‖Ψ≥ α‖X# −X − E‖Ψ − τ‖AV ‖2≥ α (‖X# −X‖Ψ − ‖E‖Ψ)− 2τε> 2‖E‖Ψ + 2τε− 2τε= 2‖E‖Ψ.On the other hand, X# is a solution of P ′X(γ) implies‖X# − E‖Ψ ≤ ‖X#‖Ψ + ‖E‖Ψ ≤ ‖Xˆ‖Ψ + ‖E‖Ψ ≤ ‖X‖Ψ + 2‖E‖Ψ.This gives a contradiction.Remark 2.3.3. The idea for condition (2.16) comes from the robust nullspace property in compressive sensing. In fact, for Φ ∈ Rm×N and a positiveinteger s, if we assume the robust null space property with constants ρ ∈(0, 1) and η > 0, i.e.,‖vS‖1 ≤ ρ‖vS‖1 + η‖Φv‖2, ∀v ∈ RN , S ⊆ [N ], |S| ≤ s.272.4. A Null Space Propertythen similar to the argument in Proposition 2.3.2, we have‖x+ v‖1 + 2η1 + ρ‖Φv‖2 ≥ ‖x‖1 + 1− ρ1 + ρ‖v‖1for all v ∈ RN and x ∈ RN with ‖x‖0 ≤ s.2.4 A Null Space PropertySo far we have derived a condition for the exact recovery of jointly sparsesignals (Corollary 2.1.5), as well as its stable and robust variants (2.12) and(2.16). However, these conditions are still primitive as they include X, thesignals to be recovered. In this section, we will try to drop X and findsome kind of null space property for A. 
First we study the recovery of exactjointly sparse signals with noiseless sampling.Definition 2.4.1. A matrix A ∈ RM×JN is said to satisfy the null spaceproperty for JSM with constant 0 ≤ ρ ≤ 1 relative to (T,S) if‖V |T ‖Ψ + ρ‖V˜S‖Ψ > ‖VS‖Ψ, ∀V ∈ ker(A)\{0} (2.18)where T ⊆ [N ], S ⊆ [JN ] andV˜S = V |T − VS .With this null space property, we have the result for exact recovery asTheorem 2.4.1 below.Theorem 2.4.1. For jointly sparse signals X ∈ RJN , let Z ∈ R(J+1)N suchthat ΦZ = X and ‖Z‖1 = ‖X‖Ψ. Define setsT = ∪supp(zj) = {i ∈ [N ] : ∃0 ≤ j ≤ J such that zj(i) 6= 0},S = {i+ (j − 1)N ∈ [JN ] : z0(i) 6= 0 or zj(i) 6= 0, i ∈ [N ], j ∈ [J ]}.If we further have|{j ∈ [J ] : i ∈ supp(zj)}| ≤ t, ∀i /∈ supp(z0) (2.19)for some t ≤ γ, then the null space property for JSM with constant γ−tγ+trelative to (T,S) guarantees the exact recovery of X through PZ(γ).282.4. A Null Space PropertyProof. By Corollary 2.1.5, it is enough to show‖X + V ‖Ψ > ‖X‖Ψ, ∀V ∈ ker(A)\{0}.Notice that ‖X + V ‖Ψ =∑i∈[N ] ‖(X + V )|i‖Ψ, so we will first estimate‖(X + V )|i‖Ψ for all i. There are three cases:(a) i ∈ supp(z0). By triangular inequality,‖(X + V )|i‖Ψ ≥ ‖X|i‖Ψ − ‖V |i‖Ψ.(b) i ∈ T . Notice that X|i = 0, so‖(X + V )|i‖Ψ = ‖V |i‖Ψ.(c) i ∈ T\supp(z0). Let 1ij = 1{i∈supp(zj)} be the indicator of whetheri ∈ supp(zj) and let 1¯ij be its negation. By definition,‖(X + V )|i‖Ψ = infuγ|u|+∑j∈[J ]|zj(i) + vj(i)− u|1ij + |vj(i)− u|1¯ij≥ infuγ|u|+∑j∈[J ]|zj(i)|1ij − |vj(i)|1ij − |u|1ij+|vj(i)− u|1¯ij .Notice that‖X|i‖Ψ = γ|z0(i)|+∑j∈[J ]|zj(i)| =∑j∈[J ]|zj(i)|1ij ,‖VS |i‖Ψ = infuγ|u|+∑j∈[J ]|vj(i)− u|1ij + |u|1¯ij =∑j∈[J ]|vj(i)|1ij .(The last equality is true because γ −∑j 1ij ≥ γ − t ≥ 0 and‖VS |i‖Ψ ≥∑j∈[J ]|vj(i)|1ij + infuγ −∑j∈[J ]1ij +∑j∈[J ]1¯ij |u|.Together they imply ‖VS |i‖Ψ ≥ ‖VS |i‖1. On the other hand, ‖VS |i‖Ψ ≤‖VS |i‖1 =∑j |vj(i)|1ij .)292.4. 
A Null Space PropertySubstitute ‖X|i‖Ψ, ‖VS |i‖Ψ into above and use the assumption that∑j 1ij ≤ t, we have‖(X + V )|i‖Ψ ≥ ‖X|i‖Ψ − ‖VS |i‖Ψ + infuhi(u),wherehi(u) = (γ − t)|u|+∑j∈[J ]|vj(i)− u|1¯ij .Next we show infu hi(u) ≥ γ−tγ+t‖V˜S |i‖Ψ where V˜S = V |T − VS . Letgi(u) = γ|u|+∑j∈[J ]|u|1ij + |vj(i)− u|1¯ijand by definition, ‖V˜S |i‖Ψ = infu gi(u). Applying∑j 1ij ≤ t ≤ γ wegetγ − tγ + tgi(u) ≤ (γ − t)|u|+ γ − tγ + t∑j∈[J ]|vj(i)− u|1¯ij ≤ hi(u).The inequality infu hi(u) ≥ γ−tγ+t‖V˜S |i‖Ψ then follows by taking theinfimum of u on both sides, and we have the final estimation‖(X + V )|i‖Ψ ≥ ‖X|i‖Ψ − ‖VS |i‖Ψ + γ − tγ + t‖V˜S |i‖Ψ.Putting (a) (b) (c) all together and by the Null Space Property for JSM,‖X + V ‖Ψ ≥ ‖X‖Ψ − ‖V |S‖Ψ + ‖V |T ‖Ψ +γ − tγ + t‖V˜S‖Ψ > ‖X‖Ψ.Remark 2.4.1. Although not always satisfiable, condition (2.19) in Theorem2.4.1 is reasonable, especially when J is large. It requires that, on a coor-dinate whose common part is zero, only a few (no more than the weight γ)signals can have a nonzero entry. This may be seen as an interpretation ofJSM-1 because if a coordinate is shared by many signals, then it is morelikely that a nonzero common part exists there. We also note that by in-creasing the value of γ, (2.19) can be easily satisfied. However, as studiedbefore, Ψ-norm changes towards l1-norm as γ increases to J , so a large γmay not be able to effectively promote the common part.302.4. A Null Space PropertyRemark 2.4.2. T can also be equivalently defined in terms of X by T =∪supp(xj). 
However, we define S only in terms of Z, rather than using {i + (j − 1)N ∈ [JN] : x_j(i) ≠ 0}, because in some rare cases, such as 0 = x_j(i) = z_0(i) + z_j(i) with z_0(i) ≠ 0, we would still want to include i + (j − 1)N in the set S.

Finally, we give the null space properties for stable and robust recovery. Based on the discussion in section 2.3, it is easy to see that we can define the stable null space property for JSM by replacing (2.18) in Definition 2.4.1 with

‖V|_T‖Ψ + ρ‖Ṽ_S‖Ψ ≥ ‖V_S‖Ψ + α‖V‖Ψ, ∀V ∈ ker(A) (2.20)

and similarly define the robust null space property for JSM by replacing (2.18) with

‖V|_T‖Ψ + ρ‖Ṽ_S‖Ψ + τ‖AV‖2 ≥ ‖V_S‖Ψ + α‖V‖Ψ, ∀V ∈ R^{JN}. (2.21)

As counterparts of Theorem 2.4.1, we have the following theorem.

Theorem 2.4.2. For X̂ = X + E ∈ R^{JN}, let Z ∈ R^{(J+1)N} be such that ΦZ = X and ‖Z‖1 = ‖X‖Ψ. Define the sets T and S as in Theorem 2.4.1. Also assume that

|{j ∈ [J] : i ∈ supp(z_j)}| ≤ t, ∀i ∉ supp(z_0)

for some t ≤ γ.

(i) If (2.20) holds for some α > 0 and ρ = (γ − t)/(γ + t), then

‖X̂ − X#‖Ψ ≤ (2/α + 2)‖E‖Ψ,

where X# is any solution to the problem

min_{W∈R^{JN}} ‖W‖Ψ s.t. AX̂ = AW.

(ii) If (2.21) holds for some α > 0, τ ≥ 0 and ρ = (γ − t)/(γ + t), then

‖X̂ − X#‖Ψ ≤ (2/α + 2)‖E‖Ψ + (2τ/α)ε,

where X# is any solution to the problem

min_{W∈R^{JN}} ‖W‖Ψ s.t. ‖AX̂ − AW‖2 ≤ ε.

Proof. This is readily seen from the proofs of Theorem 2.4.1 and Theorem 2.3.1 (for (i)) or Theorem 2.3.3 (for (ii)).

At the end of this chapter we note that the (stable/robust) null space property for JSM above is a non-uniform guarantee, since it relies on the support structure of the signals (i.e. S and T). However, by considering all possible support structures for the set of desired jointly sparse signals, we can turn it into a uniform guarantee; in this case, (2.18) (or (2.20), (2.21)) must hold for all possible pairs (T, S). On the other hand, we do not yet have a condition on the matrices A_i (such as an RIP-based sufficient condition) that would ensure that such a uniform null space property holds.
This is an important open problem weintend to work on as a follow up to this thesis.32Chapter 3The Recovery of Two SignalsAs a case study, in this chapter we focus on the recovery of two signals (i.e.J = 2). For the notations, here we haveX =[x1x2], Y =[y1y2], Z =γz0z1z2 ,andA =[A1A2], Ψ =[γ−1IN INγ−1IN IN],where X ∈ R2N , Y ∈ Rm1+m2 , Z ∈ R3N and Ai ∈ Rmi×N . We shall considerPZ(γ) and its equivalent form PX(γ) for recovery.PZ(γ) : minZ∈R3N‖Z‖1 s.t. Y = AΨZ.PX(γ) : minX∈R2N‖X‖Ψ s.t. Y = AX.3.1 Ψ-Norm and Unit Ball RevisitedRecall the definition of Ψ-norm, now we have‖X‖Ψ = infU∈ψ‖[0X]+ U‖1= infu∈RN‖γu‖1 + ‖x1 − u‖1 + ‖x2 − u‖1.Also by Proposition 2.1.4, for every X ∈ R2N and any choice of γ > 0, thereexists one and only one Z ∈ R3N such that X = ΨZ and ‖Z‖1 = ‖X‖Ψ. Sowhen considering PX(γ) to have a unique solution, it is the same as whenPZ(γ) has a unique solution.333.1. Ψ-Norm and Unit Ball Revisited11-1-1( 1γ ,1γ )(− 1γ ,− 1γ )(a) 0 < γ < 111-1-1( 1γ ,1γ )(− 1γ ,− 1γ )(b) 1 < γ < 2Figure 3.1: Unit ball for Ψ-norm when J = 2 and N = 1For the unit ball of Ψ-norm, we can easily apply the results in section 2.2.Figure 3.1 shows the unit ball with N = 1 and γ changing from 0 to 2 (theleft one is an example when 0 < γ < 1 and the right one an example when1 < γ < 2). As γ increases to 2, the ball ”shrinks” to l1 ball in directions±(1/γ, 1/γ). Particularly, it is the ”mixture” of the l1 and l∞ ball whenγ = 1. Ψ-norm also becomes l1-norm once γ ≥ 2. For an explicit formula,have the following proposition.Proposition 3.1.1. If N = 1, for x = (a, b)T‖x‖Ψ =|a|+ |b| ab ≤ 0|a|+ |b| ab > 0, γ ≥ 2M + (γ − 1)m ab > 0, γ ∈ (0, 2)where M = max{|a|, |b|}, m = min{|a|, |b|}.This proposition can be shown by dropping the absolute value sign of|γu|+ |a− u|+ |b− u| and finding its minimum for all a, b since‖x‖Ψ = infu|γu|+ |a− u|+ |b− u|.Based on Proposition 3.1.1, we have the inequalities below.343.2. A Null Space PropertyProposition 3.1.2. 
For γ > 0 and any u, a, b ∈ R, let θ = min{γ − 1, 1},then|γu|+ |a− u|+ |b− u| ≥{|a|+ θ|b||b|+ θ|a| .Proof. First, notice that|γu|+ |a− u|+ |b− u| ≥ ‖(a, b)T ‖Ψ.Then by Proposition 3.1.1, let M = max{|a|, |b|} and m = min{|a|, |b|} wehave‖(a, b)T ‖Ψ ≥M + θm ≥ m+ θM.3.2 A Null Space PropertyIn chapter 2, Theorem 2.4.1 states that the null space property for JSM canguarantee the exact recovery of jointly sparse X through PZ(γ). However,it is based on condition|{j ∈ [J ] : i ∈ supp(zj)}| ≤ γ, ∀i /∈ supp(z0),and in the case of J = 2, this condition may be too strong. In fact, if wehave a i ∈ [N ] with x1(i)x2(i) < 0, then it is easy to see that z0(i) = 0 andzj(i) = xj(i) for j = 1, 2. Hence we must have γ ≥ 2 for the above conditionto hold. On the other hand, for PZ(γ) to be useful (Ψ-norm to be differentform l1-norm), we should only focus on 0 < γ < 2.In this section we will derive a null space property just for J = 2. Firstwe consider jointly sparse signals X and define its support structure.Definition 3.2.1. For X ∈ R2N , TX = TX(T , T0, T1, . . . , T5) is the support353.2. A Null Space Propertymodel of X ifT3 = supp(z0) ∩ supp(z1)T4 = supp(z0) ∩ supp(z2)T5 = supp(z1) ∩ supp(z2)T0 = supp(z0)\(T3 ∪ T4)T1 = supp(z1)\(T3 ∪ T5)T2 = supp(z2)\(T5 ∪ T4)T = [N ]\(∪5i=0Ti)where Z = (γzT0 , zT1 , zT2 )T , ΨZ = X and ‖Z‖1 = ‖X‖Ψ.Figure 3.2 illustrates the supports T , T0, T1, . . . , T5. We also note thatthere is one and only one Z satisfying the conditions above for any X ∈ R2Nand γ > 0, so TX is well-defined. In such case we have∩2i=0supp(zi) = ∅z0(i)z1(i) ≥ 0, i ∈ T3z0(i)z2(i) ≥ 0, i ∈ T4z1(i)z2(i) ≤ 0, i ∈ T5Now we can state a null space property and its result. For simplicity,consider the special case when γ = 1.Theorem 3.2.1. For γ = 1 and X∗ with support model TX∗(T , T0, . . . , T5).If for every V =[v1v2]∈ ker(A)\{0} (here v1, v2 ∈ RN ),∑Tmaxj∈[2]|vj(i)| >∑T0minj∈[2]|vj(i)|+∑T1∪T3∪T5|v1(i)|+∑T2∪T4∪T5|v2(i)|,then X∗ is the unique minimizer of PZ(γ) with Y = AX∗.Proof. 
By Corollary 2.1.5, we only need to show that (2.7) is satisfied. LetX∗ = ΨZ such that ‖Z‖1 = ‖X‖Ψ. For U ∈ ψ, we haveU = u−u−u , u =u1...uN ,363.2. A Null Space Propertyz0z1 z2T0T1 T2T3 T4T5TFigure 3.2: Support model illustrationso for V ∈ ker(A)\{0},‖[0X∗]+[0V]+ U‖1 =N∑i=1|ui|+2∑j=1N∑i=1|z0(i) + zj(i) + vj(i) + ui|=∑i∈[N ]fi(U, V, Z)wherefi(U, V, Z) = fi = |ui|+2∑j=1|z0(i) + zj(i) + vj(i) + ui|.Next we estimate fi.(a) i ∈ T . We have z0(i) = z1(i) = z2(i) = 0,fi = |ui|+ |v1(i) + ui|+ |v2(i) + ui|≥ maxj∈[2]|vj(i)|The inequality here follows from Proposition 3.1.2.(b) i ∈ T0. We have z1(i) = z2(i) = 0,fi = |ui|+ |z0(i) + v1(i) + ui|+ |z0(i) + v2(i) + ui|≥ |ui|+ |z0(i) + ui|+ |z0(i) + v2(i) + ui| − |v1(i)|≥ |z0(i)| − |v1(i)|373.2. A Null Space PropertySimilarly we can show that fi ≥ |z0(i)| − |v2(i)|, thereforefi ≥ |z0(i)| −minj∈[2]|vj(i)|(c) i ∈ T1. We have z0(i) = z2(i) = 0,fi = |ui|+ |z1(i) + v1(i) + ui|+ |v2(i) + ui|≥ |ui|+ |z1(i) + ui|+ |v2(i) + ui| − |v1(i)|≥ |z1(i)| − |v1(i)|(d) i ∈ T2. Similar to (c), we havefi ≥ |z2(i)| − |v2(i)|(e) i ∈ T3. We have z2(i) = 0, z0(i)z1(i) ≥ 0,fi = |ui|+ |z0(i) + z1(i) + v1(i) + ui|+ |z0(i) + v2(i) + ui|≥ |ui|+ |z0(i) + z1(i) + ui|+ |z0(i) + v2(i) + ui| − |v1(i)|≥ |z0(i) + z1(i)| − |v1(i)|= |z0(i)|+ |z1(i)| − |v1(i)|(f) i ∈ T4. Similar to (e), we havefi ≥ |z0(i)|+ |z2(i)| − |v2(i)|(g) i ∈ T5. We have z0(i) = 0, z1(i)z2(i) ≤ 0,fi = |ui|+ |z1(i) + v1(i) + ui|+ |z2(i) + v2(i) + ui|≥ |ui|+ |z1(i) + ui|+ |z2(i) + ui| − |v1(i)| − |v2(i)|≥ |z1(i)|+ |z2(i)| − |v1(i)| − |v2(i)|Now sum up all above, we have∑i∈[N ]fi ≥ ‖Z‖1 +∑Tmaxj∈[2]|vj(i)|−∑T0minj∈[2]|vj(i)|+∑T1∪T3∪T5|v1(i)|+∑T2∪T4∪T5|v2(i)| .383.2. A Null Space PropertySo we can conclude that‖[0X∗]+[0V]+ U‖1 > ‖Z‖1 = ‖X‖Ψ.Lastly, we comment that although Theorem 3.2.1 is the special casefor γ = 1, general case for 0 < γ ≤ 2 can be similarly considered usingProposition 3.1.1 and 3.1.2. 
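As a sanity check, the closed form in Proposition 3.1.1 can be compared against direct minimization of γ|u| + |a − u| + |b − u| over its breakpoints (a sketch; both function names are our own):

```python
import numpy as np

def psi_closed_form(a, b, gamma):
    """||(a, b)||_Psi via the closed form of Proposition 3.1.1 (J=2, N=1)."""
    M, m = max(abs(a), abs(b)), min(abs(a), abs(b))
    if a * b <= 0 or gamma >= 2:
        return abs(a) + abs(b)
    return M + (gamma - 1) * m          # ab > 0 and gamma in (0, 2)

def psi_direct(a, b, gamma):
    """inf_u gamma|u| + |a-u| + |b-u|; the objective is piecewise-linear
    convex, so the infimum is attained at a breakpoint u in {0, a, b}."""
    return min(gamma * abs(u) + abs(a - u) + abs(b - u) for u in (0.0, a, b))

rng = np.random.default_rng(0)
for _ in range(1000):
    a, b = rng.normal(size=2)
    g = rng.uniform(0.1, 3.0)
    assert abs(psi_closed_form(a, b, g) - psi_direct(a, b, g)) < 1e-9
print("closed form agrees with direct minimization")
```

The agreement covers both branches, including the boundary cases ab = 0 and γ = 2, where the two expressions coincide.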
For stability and robustness, the method is the same as that in section 2.4.

Chapter 4

Numerical Experiments

In this chapter, we present some numerical experiments for the main algorithm studied in this thesis:

PZ(γ) : min γ‖z_0‖1 + ∑_{k=1}^{J} ‖z_k‖1 s.t. y_i = A_i(z_0 + z_i),

where

A_i ∈ R^{m_i×N}, y_i = A_i x_i, i = 1, . . . , J.

Continuing from chapter 3, we mainly focus on the two-signal case. Similar experiments can be found, e.g., in [2] and [19].

4.1 Sparse Signals

For the simple experiments here, we choose the number of signals J = 2, the signal dimension N = 50, and the same number of measurements for both signals (m_1 = m_2). We will calculate the recovery rate as m_i changes from 0 to N. The recovery rate is measured as the fraction of successful instances out of a thousand trials, where each trial is conducted by first generating signals x_i and Gaussian matrices A_i, and then running the joint recovery algorithm PZ(γ). A trial is considered successful if

‖X# − X‖2 / ‖X‖2 ≤ θ,

where X# is the numerical solution of the recovery algorithm and θ is the desired accuracy.

Consider the recovery of sparse signals. We will compare PZ(γ) with the separate recovery

min ‖w_i‖1 s.t. y_i = A_i w_i, i = 1, 2.

Note that PZ(γ) is useful in practice only if it outperforms separate recovery. We would also like to see how γ affects the joint recovery, particularly whether close values of γ achieve similar recovery rates (the stability of γ).

Here we take sparse vectors z_i ∈ R^N for i = 0, 1, 2, where each one has sparsity ‖z_i‖0 = k_i and the nonzero entries are drawn as Gaussians. The sparse signals are x_j = z_0 + z_j (j = 1, 2). We also take θ = 10^{−3}.

Figure 4.1 is a comparison of joint (solid lines) and separate (dotted lines) sparse recovery under various settings. Here we fix k1 = k2 and k0 + k1 = 10 while changing the values of k0 and γ. We make the following observations from this set of experiments.

First, depending on the signals and γ, joint recovery may perform better than, worse than, or similarly to separate recovery.
For example, (d), (e), (f) together show that, with fixed γ, the value of k0 can influence whether joint recovery outperforms separate recovery, while (b), (e), (h) together show that γ can have the same influence with fixed k0. Second, generally speaking, the larger the common part z0 (or k0), the better the performance of joint recovery with γ < 2. So we should use joint recovery only when there is a relatively large common part; otherwise it may be better to do separate recovery. Third, looking at (j), (k), (l), the recovery rates for both methods are almost the same regardless of k0. This verifies our earlier claim that the Ψ-norm becomes the l1-norm when γ = 2.

To further examine the stability of γ, Figure 4.2 shows the joint recovery under different γ when we fix k0 = 8 and k1 = k2 = 2. The dotted lines represent results for the corresponding separate recoveries. These results suggest that γ is stable in this case, i.e., close values of γ give similar recovery rates.

Figure 4.3 studies whether choosing the Gaussian measurement matrices A1 and A2 independently or identically affects the performance of PZ(γ). This question is studied more thoroughly in [19], which considers recoveries with the same, partly independent and completely independent A1, A2. From the figures we can see that it is better to choose the Ai independently. This is consistent with the results from [19].
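The joint recovery used in these trials is a linear program and can be solved with an off-the-shelf LP solver after the standard splitting Z = Z⁺ − Z⁻. A minimal sketch using scipy.optimize.linprog (the function solve_pz and its interface are our own; this is an illustration, not the exact code used for the experiments):

```python
import numpy as np
from scipy.optimize import linprog

def solve_pz(A_list, y_list, gamma):
    """min gamma*||z0||_1 + sum_k ||z_k||_1  s.t.  y_i = A_i (z0 + z_i).
    Cast as an LP via z = z_plus - z_minus with z_plus, z_minus >= 0."""
    J, N = len(A_list), A_list[0].shape[1]
    n = (J + 1) * N                              # unknowns z0, z1, ..., zJ
    c = np.concatenate([np.full(N, gamma)] + [np.ones(N)] * J)
    c = np.concatenate([c, c])                   # same cost on z_plus, z_minus
    rows = []
    for i, A_i in enumerate(A_list):
        # the i-th constraint block picks out z0 and z_i: A_i (z0 + z_i) = y_i
        blocks = [A_i if j in (0, i + 1) else np.zeros_like(A_i)
                  for j in range(J + 1)]
        row = np.hstack(blocks)
        rows.append(np.hstack([row, -row]))      # split into z_plus, z_minus
    res = linprog(c, A_eq=np.vstack(rows), b_eq=np.concatenate(y_list),
                  bounds=(0, None), method="highs")
    z = res.x[:n] - res.x[n:]
    return z[:N], [z[(k + 1) * N:(k + 2) * N] for k in range(J)]
```

With Gaussian A_i and undersampling, the recovered z_0 + z_j can then be compared against x_j to score a trial as in the success criterion above.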
[Figure 4.1 panels (a)–(l): recovery rate vs. number of measurements for k0 ∈ {8, 5, 2} and γ ∈ {0.5, 1, 1.5, 2}.]

Figure 4.1: Comparison of joint and separate sparse recovery when J = 2 and N = 50. Here we fix k1 = k2 and k0 + k1 = 10. The horizontal axis is the value of m1 (or m2) and the vertical axis is the recovery rate. Solid lines in the figures represent joint recovery PZ(γ) and the dotted lines represent separate recovery.

[Figure 4.2 panels: (a) γ = 0.6, 0.8, 1 and (b) γ = 1, 1.2, 1.4, each compared with separate recovery.]

Figure 4.2: Joint sparse recovery under different γ when J = 2 and N = 50. Here we fix k0 = 8 and k1 = k2 = 2. The dotted lines represent results for separate recovery.
The horizontal axis is the value of m1 (or m2) and the vertical axis is the recovery rate.

[Figure 4.3 panels: (a) γ = 0.5, (b) γ = 1, (c) γ = 1.5, each with curves for ξ = 1, ξ = 0 and separate recovery.]

Figure 4.3: Joint sparse recovery with A1 = A2 under different γ when J = 2 and N = 50. Here we fix k0 = 8 and k1 = k2 = 2. ξ denotes the overlap between A1 and A2, such that ξ = 1 means A1 = A2 while ξ = 0 means that they are independent. The dotted lines represent results for separate recovery. The horizontal axis is the value of m1 (or m2) and the vertical axis is the recovery rate.

4.2 Compressible Signals

We present a few experiments below to demonstrate that PZ(γ) also works for compressible signals. For experiments with noise as well, we refer to, e.g., [19]. Here we assume the signals are

x_j = z_0 + z_j + e_j, j = 1, 2,

where the z_i (0 ≤ i ≤ 2) are the same as in the previous sparse case and e_j has coefficients drawn from a zero-mean normal distribution. We also define

η = max_{j∈[J]} ‖e_j‖2 / ‖z_0 + z_j‖2

to quantify compressibility, and take accuracy θ = η.

Figure 4.4 shows the joint recovery for compressible signals (joint stable recovery) under different η when J = 2 and N = 50. Here we fix k0 = 8, k1 = k2 = 2 and γ = 1. The dotted lines represent separate stable recovery under the same settings, and the dashed lines represent joint sparse recovery (η = 0 and θ = 10^{−3}) as in Figure 4.1 (a). From these figures we can see that the recovery rates for compressible signals are similar to those for sparse signals, which verifies the stability of PZ(γ).
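The compressible test signals described above can be generated as follows (a sketch; the helper name and the exact scaling of e_j to hit a target η are our own choices):

```python
import numpy as np

def make_jsm1_compressible(N, k0, k1, eta_target, J=2, rng=None):
    """Generate x_j = z0 + z_j + e_j with ||e_j||_2 scaled so that
    eta = max_j ||e_j||_2 / ||z0 + z_j||_2 equals eta_target exactly."""
    rng = np.random.default_rng() if rng is None else rng
    def sparse(k):                       # k Gaussian nonzeros at random spots
        z = np.zeros(N)
        z[rng.choice(N, size=k, replace=False)] = rng.normal(size=k)
        return z
    z0 = sparse(k0)
    zs = [sparse(k1) for _ in range(J)]
    xs = []
    for zj in zs:
        e = rng.normal(size=N)
        e *= eta_target * np.linalg.norm(z0 + zj) / np.linalg.norm(e)
        xs.append(z0 + zj + e)
    return z0, zs, xs

z0, zs, xs = make_jsm1_compressible(N=50, k0=8, k1=2, eta_target=0.03,
                                    rng=np.random.default_rng(1))
eta = max(np.linalg.norm(x - z0 - zj) / np.linalg.norm(z0 + zj)
          for x, zj in zip(xs, zs))
print(round(eta, 4))                     # 0.03 by construction
```

Feeding these x_j into the recovery algorithm with θ = η reproduces the kind of trial scored in Figure 4.4.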
[Figure 4.4 panels: (a) η = 1%, (b) η = 3%, (c) η = 5%, (d) η = 10%.]

Figure 4.4: Joint stable recovery (solid lines) under different η when J = 2 and N = 50. Here we fix k0 = 8, k1 = k2 = 2 and γ = 1. The dotted lines are the results for separate stable recovery and the dashed lines are for joint sparse recovery (η = 0 and θ = 10^{−3}) as in Figure 4.1 (a). The horizontal axis is the value of m1 (or m2) and the vertical axis is the recovery rate.

Chapter 5

Summary

In this thesis we studied the weighted l1-minimization problem PZ(γ) for the recovery of jointly sparse signals in DCS. This algorithm was first put forth by Baron et al. in [2], which was followed by a probabilistic study of the two-signal case. Here, by recognizing and defining the Ψ-norm, we adopt another approach, parallel to the study of l1-minimization in the CS framework, to analyze PZ(γ). This analysis eventually leads to a null space guarantee for recovery.

Chapter 2 contains the main arguments of this thesis. In section 2.1 we introduce the Ψ-norm, rewrite PZ(γ) as PX(γ) and prove their equivalence. This is the basis of our derivation, as we can then analyze PX(γ) in a similar fashion to l1-minimization in the CS framework. Section 2.2 takes a look at the unit ball of the Ψ-norm, whose structure promotes both sparsity and similarity (all signals sharing a common part). This can be seen as a geometric interpretation that helps explain the effectiveness of PZ(γ). Motivated by the stable/robust null space property in CS, section 2.3 derives the stable and robust versions of Corollary 2.1.5 (a null space characterization of exact recovery). Section 2.4 further simplifies these conditions to sufficient null space properties.

Chapter 3 then serves as a case study for J = 2.
We first review the Ψ-norm for this special case in section 3.1 and then give a sufficient null space property in section 3.2. Note that this property is not the same as the one in section 2.4, as the latter requires assumption (2.19), which may not be feasible in the two-signal case.

Chapter 4 presents some numerical experiments for PZ(γ) and compares their results with those of separate recovery. From the results we know that PZ(γ) is only effective when the jointly sparse signals have a large enough common part; otherwise it may perform worse than separate recovery. We also test different values of the weight γ to see its impact on the performance of PZ(γ). The results show that γ can greatly influence PZ(γ), but close values of γ should not lead to drastic performance changes.

At the end of this thesis, we discuss several open problems and possible future work.

The first problem is to prove that for Gaussian matrices Ai with enough measurements, PZ(γ) recovers jointly sparse signals with overwhelming probability. To show this, we probably need to find another recovery guarantee (e.g. an RIP-based guarantee). Moreover, this guarantee should not be stronger than the ones for separate recovery, because we would also like to explain why PZ(γ) can work better than separate recovery.

The second problem is to determine the optimal choice of γ for recovery. Of course, as the numerical results from chapter 4 suggest, this should be based on some prior information on the joint sparsity model (e.g. {‖zi‖0 : 0 ≤ i ≤ J}). Further, if the optimum is impossible or too hard to find, the goal can be lowered to a suboptimal one.

The third problem is to explain why less repetition in the measurement matrices Ai leads to better performance for PZ(γ), at least when J = 2, as observed in chapter 4 and [19].
Moreover, an interesting phenomenon seen in [19] is also worth studying: in the two-signal case, when computing the difference between the two innovation parts by subtracting the x1, x2 recovered from PZ(γ), contrary to the previous result, more repetition in Ai now leads to better performance. The null space property derived in this thesis may be a way to study these questions.

The last problem is to generalize our analysis to other sparsity models in DCS. The main obstacle here is that our analysis relies heavily on Ψ and its structure, which is determined by the joint sparsity model.

Bibliography

[1] Afonso S Bandeira, Edgar Dobriban, Dustin G Mixon, and William F Sawin. Certifying the restricted isometry property is hard. arXiv preprint arXiv:1204.1580, 2012.
[2] Dror Baron, Michael B Wakin, Marco F Duarte, Shriram Sarvotham, and Richard G Baraniuk. Distributed compressed sensing, 2005.
[3] T Tony Cai, Lie Wang, and Guangwu Xu. New bounds for restricted isometry constants. Information Theory, IEEE Transactions on, 56(9):4388–4394, 2010.
[4] T Tony Cai, Lie Wang, and Guangwu Xu. Shifting inequality and recovery of sparse signals. Signal Processing, IEEE Transactions on, 58(3):1300–1308, 2010.
[5] T Tony Cai and Anru Zhang. Sharp RIP bound for sparse signal and low-rank matrix recovery. Applied and Computational Harmonic Analysis, 35(1):74–93, 2013.
[6] T Tony Cai and Anru Zhang. Sparse representation of a polytope and recovery of sparse signals and low-rank matrices. Information Theory, IEEE Transactions on, 60(1):122–132, 2014.
[7] Emmanuel J Candès. Compressive sampling. In Proceedings of the International Congress of Mathematicians: Madrid, August 22-30, 2006: invited lectures, pages 1433–1452, 2006.
[8] Emmanuel J Candes. The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathematique, 346(9):589–592, 2008.
[9] Emmanuel J Candes, Justin K Romberg, and Terence Tao.
Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics, 59(8):1207–1223, 2006.
[10] Emmanuel J Candes and Terence Tao. Decoding by linear programming. Information Theory, IEEE Transactions on, 51(12):4203–4215, 2005.
[11] Emmanuel J Candès and Michael B Wakin. An introduction to compressive sampling. Signal Processing Magazine, IEEE, 25(2):21–30, 2008.
[12] Albert Cohen, Wolfgang Dahmen, and Ronald DeVore. Compressed sensing and best k-term approximation. Journal of the American Mathematical Society, 22(1):211–231, 2009.
[13] Shane F Cotter, Bhaskar D Rao, Kjersti Engan, and Kenneth Kreutz-Delgado. Sparse solutions to linear inverse problems with multiple measurement vectors. Signal Processing, IEEE Transactions on, 53(7):2477–2488, 2005.
[14] Yonina C Eldar and Moshe Mishali. Robust recovery of signals from a structured union of subspaces. Information Theory, IEEE Transactions on, 55(11):5302–5316, 2009.
[15] Simon Foucart and Holger Rauhut. A mathematical introduction to compressive sensing. Springer, 2013.
[16] Moshe Mishali and Yonina C Eldar. Reduce and boost: Recovering arbitrary sets of jointly sparse vectors. Signal Processing, IEEE Transactions on, 56(10):4692–4702, 2008.
[17] Qun Mo and Song Li. New bounds on the restricted isometry constant δ2k. Applied and Computational Harmonic Analysis, 31(3):460–468, 2011.
[18] Balas Kausik Natarajan. Sparse approximate solutions to linear systems. SIAM Journal on Computing, 24(2):227–234, 1995.
[19] Felix Oghenekohwo, Ernie Esser, and FJ Herrmann. Time-lapse seismic without repetition: reaping the benefits from randomized sampling and joint recovery. In 76th EAGE Conference and Exhibition 2014, 2014.
[20] Jeonghun Park, Seunggye Hwang, Janghoon Yang, and Dongku Kim. Generalized distributed compressive sensing. arXiv preprint arXiv:1211.6522, 2012.
[21] Dennis Sundman, Saikat Chatterjee, and Mikael Skoglund. Methods for distributed compressed sensing.
Journal of Sensor and ActuatorNetworks, 3(1):1–25, 2013.[22] Dennis Sundman, Saikat Chatterjee, and Mikael Skoglund. Distributedgreedy pursuit algorithms. Signal Processing, 105:298–315, 2014.[23] Joel A Tropp, Anna C Gilbert, and Martin J Strauss. Simultaneoussparse approximation via greedy pursuit. In Acoustics, Speech, andSignal Processing, 2005. Proceedings.(ICASSP’05). IEEE InternationalConference on, volume 5, pages v–721. IEEE, 2005.50
Item Metadata

| Title | A weighted ℓ₁-minimization for distributed compressive sensing |
| Creator | Li, Xiaowei |
| Publisher | University of British Columbia |
| Date Issued | 2015 |
| Genre | Thesis/Dissertation |
| Type | Text |
| Language | eng |
| Date Available | 2015-10-24 |
| Provider | Vancouver : University of British Columbia Library |
| Rights | Attribution-NonCommercial-NoDerivs 2.5 Canada |
| DOI | 10.14288/1.0165800 |
| URI | http://hdl.handle.net/2429/54836 |
| Degree | Master of Science - MSc |
| Program | Mathematics |
| Affiliation | Science, Faculty of; Mathematics, Department of |
| Degree Grantor | University of British Columbia |
| Graduation Date | 2015-11 |
| Campus | UBCV |
| Scholarly Level | Graduate |
| Rights URI | http://creativecommons.org/licenses/by-nc-nd/2.5/ca/ |