A parameterized Douglas-Rachfordalgorithm: theory and applicationsbyDongying WangB.Sc., Xi’an Jiaotong-Liverpool University, China, 2015A THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE OFMASTER OF SCIENCEinTHE COLLEGE OF GRADUATE STUDIES(Mathematics)THE UNIVERSITY OF BRITISH COLUMBIA(Okanagan)August 2017c© Dongying Wang, 2017The undersigned certify that they have read, and recommend to the College ofGraduate Studies for acceptance, a thesis entitled: A PARAMETERIZED DOUGLAS-RACHFORD ALGORITHM: THEORY AND APPLICATIONS submitted by DONGY-ING WANG in partial fulfilment of the requirements of the degree of Master ofScienceDr. Shawn Wang, Irving K. Barber School of Arts and SciencesSupervisor, ProfessorDr. Heinz Bauschke, Irving K. Barber School of Arts and SciencesSupervisory Committee Member, ProfessorDr. Julian Cheng, School of EngineeringSupervisory Committee Member, ProfessorDr. Philip D. Loewen, University of British Columbia (Vancouver)University Examiner, ProfessorAugust 17, 2017(Date Submitted to Grad Studies)iiAbstractDouglas-Rachford algorithm is important due to its applications on the Heronproblem and on the image denoising. Mathematically, it can be considered asfinding a point such that the point belongs to a zero set of the sum of two maximallymonotone operators.In this thesis, previous work on Douglas-Rachford algorithm is presented and theDouglas-Rachford algorithm with a changed parameter is considered. I give it thename "α-Douglas-Rachford algorithm". The new algorithm which has the changedparameter is shown to have a convergent result and other conclusions similar tothose of the classic Douglas-Rachford algorithm. At the same time, it has beenshown that the application of the α-Douglas-Rachford algorithm is wider than theapplication of the classic one.Later on, the α-Douglas-Rachford algorithm is proved to converge to the solutionof the composited monotone inclusion problems, and in a special-limit case, it hassome other properties. The numerical experiments confirm that the α-Douglas-Rachford algorithm does have the properties that I proved theoretically.iiiTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiAcknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiiDedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ixChapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Inner product space . . . . . . . . . . . . . . . . . . . . . . . . . 11.1.1 Sets in vector spaces . . . . . . . . . . . . . . . . . . . . 61.2 Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91.2.1 Linear operators . . . . . . . . . . . . . . . . . . . . . . 111.2.2 Nonexpansive operators . . . . . . . . . . . . . . . . . . 141.2.3 Monotone operators . . . . . . . . . . . . . . . . . . . . 151.2.4 Resolvent . . . . . . . . . . . . . . . . . . . . . . . . . . 181.3 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191.3.1 Convex functions . . . . . . . . . . . . . . . . . . . . . . 191.3.2 Lower semicontinuous functions . . . . . . . . . . . . . . 211.3.3 Coercive and supercoercive functions . . . . . . . . . . . 231.3.4 Subgradient and subdifferential . . . . . . . . . . . . . . 241.3.5 Conjugation . . . . . . . . . . . . . . . . . . . . . . . . . 271.3.6 Infimal convolution . . . . . . . . . . . . . . . . . . . . . 281.3.7 Fenchel-Rockafellar duality . . . . . . . . . . . . . . . . 30Chapter 2: Classic Douglas-Rachford algorithm . . . . . . . . . . . . 312.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31ivTABLE OF CONTENTS2.2 Douglas-Rachford splitting problem and the brief history of Douglas-Rachford algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 312.3 The composited monotone inclusion problem . . . . . . . . . . . 332.4 Application to proper, lower-semicontinuous convex functions . . 40Chapter 3: Douglas-Rachford algorithm with a changed parameter . 483.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483.2 α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2) . . . . 483.2.1 Application to composited monotone inclusion problems . 573.2.2 The application to proper, lower-semicontinuous convexfunctions . . . . . . . . . . . . . . . . . . . . . . . . . . 60Chapter 4: Special cases of the composited monotone inclusion prob-lems and the double regularization . . . . . . . . . . . . . 644.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 644.2 The special cases . . . . . . . . . . . . . . . . . . . . . . . . . . 644.3 Double regularization: Moreau regularization and Tychonov regu-larization combined . . . . . . . . . . . . . . . . . . . . . . . . . 684.3.1 Subdifferential of infimal convolutions . . . . . . . . . . . 684.3.2 Main results . . . . . . . . . . . . . . . . . . . . . . . . . 69Chapter 5: The α-Douglas-Rachford algorithm with α→ 2 . . . . . . 735.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 735.2 Parameter α→ 2 . . . . . . . . . . . . . . . . . . . . . . . . . . 73Chapter 6: The application of the α-Douglas-Rachford algorithm . . 796.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 796.2 Least norm solution of the primal problem . . . . . . . . . . . . . 796.3 Least norm solution of the primal-dual problem . . . . . . . . . . 80Chapter 7: Numerical experiments . . . . . . . . . . . . . . . . . . . 837.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 837.2 A feasibility problem . . . . . . . . . . . . . . . . . . . . . . . . 847.3 A Heron problem . . . . . . . . . . . . . . . . . . . . . . . . . . 887.4 A feasibility problem solved by the primal-dual formulation . . . 967.4.1 The algorithm . . . . . . . . . . . . . . . . . . . . . . . . 977.4.2 Numerical results . . . . . . . . . . . . . . . . . . . . . . 98Chapter 8: Conclusions and future work . . . . . . . . . . . . . . . . 101Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103vList of TablesTable 7.1 Example 1: Six fixed αk where αk = 2 − 1/k, the corre-sponding optimization point y∗, and the norm of y∗. . . . 87Table 7.2 Example 1: starting point x0, the corresponding optimiza-tion point y∗, and the norm of y∗. . . . . . . . . . . . . . 87Table 7.3 Example 2: Seven fixed αk, where αk = 2 − 1/k, thecorresponding optimal point y∗1 and y∗2 , together with thecase α = 2. . . . . . . . . . . . . . . . . . . . . . . . . . 95Table 7.4 Example 2: Seven fixed αk, where αk = 2 − 1/k, thecorresponding optimal point y∗1 and y∗2 , together with thecase α = 2. . . . . . . . . . . . . . . . . . . . . . . . . . 95Table 7.5 Example 3: Six fixed αk, where αk = 2 − 1/k, the corre-sponding optimal point y∗1 and y∗2 , and√‖y∗1‖2 + ‖y∗2‖2,together with the case α = 2. . . . . . . . . . . . . . . . 99Table 7.6 Example 3: Six fixed αk, where αk = 2 − 1/k, the corre-sponding optimal point y∗1 and y∗2 , and√‖y∗1‖2 + ‖y∗2‖2,together with the case α = 2. . . . . . . . . . . . . . . . 99Table 7.7 Example 3: Six fixed αk, where αk = 2 − 1/k, the corre-sponding optimal point y∗1 and y∗2 , and√‖y∗1‖2 + ‖y∗2‖2,together with the case α = 2. . . . . . . . . . . . . . . . 100viList of FiguresFigure 1.1 Examples of convex sets . . . . . . . . . . . . . . . . . . 7Figure 1.2 Examples of nonconvex sets . . . . . . . . . . . . . . . . 7Figure 1.3 Figure of f(x) = |x| . . . . . . . . . . . . . . . . . . . . 25Figure 7.1 The plot of Example 7.2.1 . . . . . . . . . . . . . . . . . 85Figure 7.2 The plot of Example 7.1 . . . . . . . . . . . . . . . . . . 92Figure 7.3 The plot of Example 7.2 . . . . . . . . . . . . . . . . . . 97viiAcknowledgementsFirst and foremost, I thank my supervisor Dr. Shawn Wang for his patient in-struction in mathematics and his support for my study in UBC (Okanagan campus).His support and patience go beyond the work in this thesis. I would also like tothank Dr. Heinz Bauschke and Julian Cheng for agreeing to serve on my thesiscommittee and for their constructive comments.I thank Dr. Andrew Jirasek, Dr. John Braun, Dr. Qiduan Yang, and Dr. RebeccaTyson, for sharing their knowledge and experience in statistics, algebra, and math-ematical biology.I recognize the support of Unit 5 members and my colleagues in the COCANA labat UBC (Okanagan campus), who had helped me during my study. In particular, Ithank Chayne Planiden for his comments on my English.I thank UBC (Okanagan campus) for providing me with financial support over theyears of my master’s studies.Finally, I thank my parents and my grandparents, whose love and belief in me arethe ultimate encouragement I have.viiiDedicationTo my parents and my grandparents.ixChapter 1IntroductionIn this chapter, we will introduce some background materials on inner productspaces and some necessary convex analysis used later in the thesis.1.1 Inner product spaceDefinition 1.1.1. A vector space consists of a set V with elements called vectors,along with two operations such that the following properties hold:1 Vector addition: Let u, v ∈ V , then there is a vector u + v ∈ V and thefollowing are satisfied.i Commutativity: u+ v = v + u, ∀u, v, w ∈ V.ii Associativity: u+ (v + w) = (u+ v) + w,∀u, v, w ∈ V.iii Zero: there is a vector 0 ∈ V such that 0 + u = u = u+ 0,∀u ∈ V.iv Inverses, for each u ∈ V , there is a vector−u such that u+ (−u) = 0.2 Scalar multiplication: Let u, v ∈ V and r, s ∈ R, then the following aresatisfied.i Left distributivity: (r + s)v = rv + sv.ii Associativity: r(sv) = (rs)v.iii Right distributivity: r(u+ v) = ru+ rv.iv Neutral element: 1v = v.v Absorbing element: 0v = 0.vi Inverse neutral element:(−1)v = −v.Example 1.1.2. The space Rn consists of vectors v = (v1, . . . , vn) with vi ∈ Rfor 1 ≤ i ≤ n and operations defined by(u1, . . . , un) + (v1, . . . , vn) := (u1 + v1, . . . , un + vn);r(v1, . . . , vn) := (rv1, . . . , rvn).where r ∈ R.11.1. Inner product spaceDefinition 1.1.3. Let V be a vector space overR, and letW be a subset of V . ThenW is a subspace if:(1) The zero vector, 0, is in W .(2) If u and v are elements of W , then u+ v is an element of W .(3) If u is an element of W and c is a scalar from R, then the scalar multiple cuis an element of W .Definition 1.1.4. A norm ‖ · ‖ on a vector space V over the field R is a functionV → R with the following properties:(1) Positive definite: ‖x‖ ≥ 0 for all x ∈ V and ‖x‖ = 0 if and only if x = 0.(2) Homogeneous: ‖αx‖ = |α|‖x‖ for all α ∈ R and x ∈ V.(3) Triangle inequality: ‖x+ y‖ ≤ ‖x‖+ ‖y‖ for all x, y ∈ V.Here, (V, ‖ · ‖) is called a normed space.Definition 1.1.5. An inner product on a vector space V is a function 〈·, ·〉 : V ×V → R with the following properties.(1) Positive definite: 〈x, x〉 ≥ 0, and 〈x, x〉 = 0 if and only if x = 0.(2) Symmetry: 〈x, y〉 = 〈y, x〉.(3) Bilinearity: 〈αx+ βy, z〉 = α〈x, z〉+ β〈y, z〉 for α, β ∈ R, x, y, z ∈ V.We call a vector space paired with an inner product and norm induced by‖x‖ := √〈x, x〉 an inner product space.Example 1.1.6. In Rn, define(∀x ∈ Rn)(∀y ∈ Rn), 〈x, y〉 :=n∑i=1xiyi.Then Rn is an inner product space.Fact 1.1.7. Let x, y ∈ Rn, and 〈x, z〉 = 〈y, z〉 for all z ∈ Rn. Then x = y.Proof. For all z ∈ Rn, 〈x, z〉 = 〈y, z〉 implies that 〈x, x − y〉 = 〈y, x − y〉 (bysetting z = x− y). Moreover,0 = 〈x, x− y〉 − 〈y, x− y〉⇒ 0 = 〈x− y, x− y〉⇒ 0 = ‖x− y‖2.Thus, x = y.21.1. Inner product spaceIn this thesis, we use the Euclidean norm, which is given by‖x‖ = √xᵀx =√√√√ n∑i=1x2i .In the following thesis, ‖x‖ refers to the Euclidean norm of x.Fact 1.1.8. (Cauchy Schwarz Inequality) Let x and y be in Rn. Then|〈x, y〉| ≤ ‖x‖ · ‖y‖, i.e.∣∣∣∣∣n∑i=1xiyi∣∣∣∣∣ ≤√√√√ n∑i=1x2i√√√√ n∑i=1y2i .Moreover, 〈x, y〉 = ‖x‖ · ‖y‖ ⇔ ∃α ∈ [0,+∞) such that x = αy or y = αx.Definition 1.1.9. In a normed vector space (V, ‖ · ‖), a sequence (vn)+∞n=1 is saidto converge (or strongly converge) to a point v ∈ V if ∀ > 0, ∃N > 0 such that‖vn − v‖ < for all n ≥ N.In the following thesis, (xn)+∞n=1 → x denotes the sequence (xn)+∞n=1 converges(or strongly converges) to x. We also write it as xn → x.Definition 1.1.10. Let (xn)+∞n=1 be a sequence in a vector space (V, ‖ · ‖), let(nk)+∞k=1 be a strictly increasing sequence in N. Then the sequence (xnk)+∞k=1 iscalled a subsequence of (xn)+∞n=1.Definition 1.1.11. A sequence (xn)+∞n=1 is bounded if ∃M > 0 such that ‖xn‖ ≤M,∀n ≥ 1.Fact 1.1.12. (Bolzano-Weierstrass Theorem) Every bounded sequence (xn)+∞n=1 inRm has a convergent subsequence, i.e., there exists a subsequence (xnk)+∞k=1 of(xn)+∞n=1 such that xnk → x, for some x ∈ Rm.Definition 1.1.13. A sequence (xn)+∞n=1 is called a Cauchy sequence if for every > 0, there exists an integer N > 0 such that ‖xn − xk‖ < for all n, k > N.Fact 1.1.14. Let (xn)+∞n=1 be a Cauchy sequence in a normed vector space (V, ‖·‖).Let x ∈ V . Then (xn)+∞n=1 converges to x if and only if it has a subsequence thatconverges to x.Proof. We separate this proof into two parts:(1) “ ⇒ ” If the Cauchy sequence (xn)+∞n=1 → x, it follows that (xn)+∞n=1 is asubsequence of itself which converges to x.31.1. Inner product space(2) “ ⇐ ” Suppose (xnk)+∞k=1 is a subsequence of (xn)+∞n=1 and converges to x.Then for all > 0,∃N1 > 0(N1 ∈ N) such that ‖xnk − x‖ < 2 ,∀k > N1.Since (xn)+∞n=1 is a Cauchy sequence, ∀ > 0,∃N2 > 0(N2 ∈ N) such that‖xm − xnk‖ < 2 ,∀m, k > N2.Let M = max(N1, N2), we have nM > M. Then ∀ > 0,m > M :‖xm − x‖ ≤ ‖xm − xnM ‖+ ‖xnM − x‖<2+2= .That is, (xn)+∞n=1 converges to x.Definition 1.1.15. An inner product spaceH is called complete, or a Hilbert Space,if each Cauchy sequence inH converges to a point inH.In the following thesis,H denotes a Hilbert space.Example 1.1.16. Rm is complete. Thus, Rm is a Hilbert space.Proof. Suppose (xn)+∞n=1 is a Cauchy sequence in Rm, we want to prove its con-vergence. According to Fact 1.1.14, we only need to prove it has a subsequencewhich is convergent.Since (xn)+∞n=1 is a Cauchy sequence, let = 1. Then, there existsN ∈ N such thatfor all m, k > N, ‖xm − xk‖ < 1. Thus, for all k > N ,‖xk‖ = ‖xk − xN+1 + xN+1‖≤ ‖xk − xN+1‖+ ‖xN+1‖< 1 + ‖xN+1‖.Let M = max(‖x1‖, ‖x2‖, . . . , ‖xN‖, ‖xN+1‖, 1 + ‖xN+1‖), then for all k ≥ 1,‖xk‖ ≤M. According to the definition of the bounded sequence, we find (xn)+∞n=1is bounded. By using Bolzano-Weierstrass theorem, (xn)+∞n=1 has a convergentsubsequence, which completes the proof.Definition 1.1.17. A sequence (xn)+∞n=1 in a Hilbert space H is said to convergeweakly to a point v ∈ H if for all y ∈ H, 〈vn, y〉 → 〈v, y〉.Fact 1.1.18. [4, Lemma 2.51] Let (xn)+∞n=1 and (un)+∞n=1 be sequences in H, andlet x and u be points inH. Then the following hold:(1) Suppose thatH is finite-dimensional. Then xn ⇀ x⇔ xn → x.41.1. Inner product space(2) Suppose that xn ⇀ x and un → u. Then 〈xn, un〉 → 〈x, u〉.Fact 1.1.19. [4, Lemma 2.46] Let (xn)+∞n=1 be a sequence in Rm. Then (xn)+∞n=1converges if and only if it is bounded and possesses at most one sequential clusterpoint.Definition 1.1.20. Let l2 be a space such that each element in it is a sequencex = (ζj)+∞j=1 = (ζ1, ζ2, . . .) of numbers such that+∞∑j=1|ζj |2 < +∞,and its distance function is defined byd(x, y) =√√√√+∞∑j=1|ζj − µj |2,where y = (µj)+∞j=1 and+∞∑j=1|µj |2 < +∞.Remark 1.1. Let (xn)+∞n=1 and (un)+∞n=1 be weakly convergent sequences in l2,xn ⇀ x and un ⇀ u do not imply 〈xn, un〉 → 〈x, u〉. A counter example is:let (xn)+∞n=1 = (en)+∞n=1 and (un)+∞n=1 = (en)+∞n=1, where (en)+∞n=1 is an orthonormalsequence in l2. We have〈xn, un〉 = 〈en, en〉 = 1;whilexn ⇀ 0, un ⇀ 0 and 〈0, 0〉 = 0.Remark 1.2. We show en ⇀ 0. We need 〈en, x〉 → 〈0, x〉 ∀x ∈ l2 as n → +∞.Because x ∈ l2, let x = (ζn)+∞n=1, we have+∞∑n=1|ζn|2 < +∞⇒ ζ2n → 0⇒ ζn → 0 as n→ +∞.Therefore, limn→+∞ ζn = 0. Solimn→+∞〈en, x〉 = ζn = 0 = 〈0, x〉for all x ∈ l2. Hence en ⇀ 0.51.1. Inner product space1.1.1 Sets in vector spacesDefinition 1.1.21. A set C ⊆ Rm is closed if it contains all limit points, i.e.,whenever there exists a sequence (xk)∞k=1 in C and xk → x, then x ∈ C.Fact 1.1.22. For a Hilbert space H, let S be a subspace of H. Then S is closed ifand only if S is complete.Proof. (1) ⇒ Let (xn)+∞n=1 be a Cauchy sequence in S. Because the Hilbertspace H is complete, (xn)+∞n=1 must converge to some x ∈ H. However, asS is closed, x ∈ S. Thus, S is complete.(2) ⇐ Let x ∈ S¯. Then there exists a sequence (xn)+∞n=1 ∈ S that convergesto x. Since a convergent sequence must be a Cauchy sequence, moreover,since S is complete, (xn)+∞n=1 ∈ S must converge to a point included in S.Because a convergent sequence cannot converge to more than one point, wehave x ∈ S. Thus, S is closed.Definition 1.1.23. A set O ⊆ Rm is open if ∀x ∈ O,∃r > 0 such that the openball B(x; r) ⊆ O, whereB(x; r) = {y ∈ Rm : ‖y − x‖ < r}.Definition 1.1.24. The interior of a subset C ofH can be expressed asintC := {x ∈ C|(∃r ∈ R++) B(0; r) ⊂ C − x}.Definition 1.1.25. The orthogonal complement of a subset C of H is denoted byC⊥, i.e.,C⊥ := {u ∈ H|(∀x ∈ C) 〈x, u〉 = 0}.Example 1.1.26. For any Hilbert spaceH,H⊥ = {0}.Proof. (1) {0} ⊆ H⊥ is clear.(2) Suppose there exists a u 6= 0 such that u ∈ H⊥. According to the definitionof H⊥, for any x ∈ H, 〈x, u〉 = 0. However, since u ∈ H and u 6= 0, wehave 〈u, u〉 6= 0, which is a contradiction. Therefore, u must equal 0.Altogether, we haveH⊥ = {0}.Fact 1.1.27. Let C and D be two subsets ofH. Then D⊥ ⊆ C⊥ if C ⊆ D.Proof. Let u ∈ D⊥. Then u is a vector in H such that for all x ∈ D, 〈x, u〉 = 0.Since C ⊆ D, all y ∈ C also contained in D. Thus we have ∀y ∈ C, 〈y, u〉 = 0.Therefore, u ∈ C⊥ and so D⊥ ⊆ C⊥.61.1. Inner product spaceFact 1.1.28. [12, Lemma 3.3-6] If S is a closed subspace of a Hilbert space H,thenS = S⊥⊥.Definition 1.1.29. A set C ⊆ Rm is convex if for any x, y ∈ C and α ∈ (0, 1) wehaveαx+ (1− α)y ∈ C.Graphically, a set C is convex if the line segment between any two points in Cis also contained in C, see Figure 1.1.Figure 1.1: Examples of convex setsFigure 1.2 shows two nonconvex sets.Figure 1.2: Examples of nonconvex setsExample 1.1.30. Let r ∈ R++. Then the closed ballB(c; r) = {x ∈ Rm : ‖x− c‖ ≤ r}71.1. Inner product spaceis convex.Proof. Let x, y ∈ B(c; r) and a ∈ (0, 1). Then‖ax+ (1− a)y − c‖ =‖a(x− c) + (1− a)(y − c)‖≤a‖x− c‖+ (1− a)‖y − c‖≤ar + (1− a)r = r.Therefore, ax+ (1− a)y ∈ B(c; r), which implies B(c; r) is convex.Example 1.1.31. Let ai ≤ bi for all i ∈ {1, 2, . . . ,m}. Then the boxC = {x ∈ Rm : ai ≤ xi ≤ bi}is convex.Proof. Let x, y ∈ C and a ∈ (0, 1). Then for all i ∈ {1, 2, . . . ,m},aai + (1− a)ai ≤ axi + (1− a)yi ≤ abi + (1− a)bi⇔ai ≤ axi + (1− a)yi ≤ bi.Therefore, ax+ (1− a)y ∈ C, which implies C is convex.Definition 1.1.32. Let A ⊆ R. The infimum of A is the largest lower bound anddenoted by inf A; the supremum of A is the smallest upper bound and denoted bysupA.Remark 1.3. When A = ∅, inf A = +∞ and supA = −∞.Definition 1.1.33. Let C be a nonempty convex subset of H and let x ∈ H. Thenormal cone to C at x isNCx ={ {u ∈ H| sup〈C − x, u〉 ≤ 0}, if x ∈ C;∅ otherwise.Example 1.1.34. [4, Example 6.39] Let C = B(0; 1) and let x ∈ C. ThenNCx =R+x, if ‖x‖ = 1;{0} if ‖x‖ < 1;∅ if ‖x‖ > 1.Lemma 1.1.35. In Rm, we haveN{0}x ={Rm, if x = 0;∅ if x 6= 0.Then, domN{0} = {0} and ranN{0} = Rm.81.2. OperatorsProof. According to the definition of the normal cone, y ∈ N{0}x if and only ifsup〈0− x, y〉 ≤ 0. When x = 0, the inequalitysup〈0− 0, y〉 ≤ 0is satisfied by all y ∈ Rm. Therefore,N{0}x = Rm if x = 0.That is, domN{0} = {0} and ranN{0} = Rm.1.2 OperatorsDefinition 1.2.1. Let M : H → 2H be a set-valued operator, the domain of M isdomM := {x ∈ H : Mx 6= ∅};the range of M isranM := {u ∈ H : ∃x ∈ H, u ∈Mx};the graph of M isgraM := {(x, u) ∈ H ×H : u ∈Mx};the set of zeros of M is:zerM := {x ∈ H : 0 ∈Mx};the set of fixed points of M isFixM := {x ∈ H : x ∈Mx};the inverse of M isM−1 : H → 2H : u 7→ {x ∈ H : u ∈Mx}.Definition 1.2.2. Let M1,M2 : H → 2H,(1) The sum of M1,M2 is defined as (M1 +M2)(x) := M1(x) +M2(x) for allx ∈ H;(2) The parallel sum of M1,M2 is M1M2 : H → 2H, defined byM1M2 := (M−11 +M−12 )−1.91.2. OperatorsDefinition 1.2.3. The identity operator is denoted by Id : Rm → Rm, for whichwe have(∀x ∈ Rm) Idx = x.Definition 1.2.4. Let A : Rm → Rn. The induced norm or operator norm onRn×m is given by :‖A‖ := sup{‖Ax‖‖x‖ : x ∈ Rm with x 6= 0}.Example 1.2.5. On space Rm,‖ Id ‖ = sup{‖x‖‖x‖ : x ∈ Rm with x 6= 0}= 1.Definition 1.2.6. The distance to a set C ⊂ H is the functiondC : H → [0,+∞] : x 7→ infy∈C‖x− y‖.Note that if C = ∅ then dC ≡ +∞.Definition 1.2.7. Let C be a subset of H, let x ∈ H, and let p ∈ C. Then p is aprojection of x onto C if ‖x − p‖ equals to the distance between x and C, whichdenoted by dC . If every point in H has at least one projection onto C, then C isproximinal. If every point in H has exactly one projection onto C, then C is aChebyshev set. In this case, the projector onto C is the operator, denoted by PC ,that maps every point inH to its unique projection onto C.Definition 1.2.8. An operator M : Rm → Rm is called ρ-strongly positive (ρ ∈R+) if 〈Mx, x〉 ≥ ρ‖x‖2.Example 1.2.9. For all α ∈ R++, the following two operators are ρ-stronglypositive:(1) α Id is ρ-strongly positive for any 0 < ρ ≤ α.(2) Let V : R2 → R2 : (x1, x2)→ (x1+x2α , x2−x1α ). V is ρ-strongly positive forany 0 < ρ ≤ 1α .Proof. (1) For any x ∈ Rm,〈α Idx, x〉 = α‖x‖2.Therefore, α Id is ρ-strongly positive for any 0 < ρ ≤ α.101.2. Operators(2) For any x1 ∈ R, x2 ∈ R,〈V (x1, x2), (x1, x2)〉 = 〈(x1 + x2α,x2 − x1α), (x1, x2)〉=x21 + x22α=1α‖x‖2.Therefore, V is ρ-strongly positive for any 0 < ρ ≤ 1α .1.2.1 Linear operatorsDefinition 1.2.10. An operator L : Rn → Rm is said to be linear if and only ifL(αx+ βy) = αL(x) + βL(y)for all x, y ∈ Rn and α, β ∈ R.Definition 1.2.11. Let L : Rn → Rm be a linear operator. The adjoint of L is theunique linear operator L? : Rm → Rn that satisfies(∀x ∈ Rn)(∀y ∈ Rm) 〈Lx, y〉 = 〈x, L?y〉.Fact 1.2.12. [12, Theorem 3.9-2] The Hilbert-adjoint operator L? of L in Defini-tion 1.2.11 exists, is unique and is a bounded linear operator with norm‖L?‖ = ‖L‖.Fact 1.2.13. Let L : Rn → Rm be a linear operator, let λ ∈ R. Then(1) λL is a linear operator.(2) (λL)? = λL?.Proof. (1) Clear.(2)〈x, (λL)?y〉 = 〈λLx, y〉= λ〈x, L?y〉= 〈x, λL?y〉.Thus, (λL)? = λL?.111.2. OperatorsFact 1.2.14. Let L : Rn → Rm be a linear operator. Then L? = Lᵀ. In this thesis,we use Aᵀ to denote the transpose of a matrix A.Proof. Since L : Rn → Rm, we setL =a11 a12 . . . a1na21 a22 . . . a2n....... . ....am1 am2 . . . amn .Then, let x = (x1, x2, . . . , xn) ∈ Rn; let y = (y1, y2, . . . , ym) ∈ Rm. Wehave:〈Lx, y〉 =〈a11 a12 . . . a1na21 a22 . . . a2n....... . ....am1 am2 . . . amnx1x2...xn ,y1y2...ym〉=〈a11x1 + a12x2 + . . .+ a1nxna21x1 + a22x2 + . . .+ a2nxn...am1x1 + am2x2 + . . .+ amnxn ,y1y2...ym〉=m∑i=1(ai1x1 + ai2x2 + . . .+ ainxn)yi=n∑j=1(a1jy1 + a2jy2 + . . .+ amjym)xj=〈x1x2...xn ,a11y1 + a2122 + . . .+ am1yma12y1 + a22y2 + . . .+ am2ym...a1ny1 + a2ny2 + . . .+ amnym〉=〈x1x2...xn ,a11 a21 . . . am1a12 a22 . . . am2....... . ....a1n a2n . . . amny1y2...ym〉.121.2. OperatorsThus,L? =a11 a21 . . . am1a12 a22 . . . am2....... . ....a1n a2n . . . amn = Lᵀ.Fact 1.2.15. Let L : Rn → Rm be a linear operator. If L−1 exists, then L−1 isalso a linear operator.Proof. Let x, y ∈ Rn, and let α, β ∈ R. Since L is a linear operator and L−1L =Id, we haveL−1[αL(x) + βL(y)] = L−1[L(αx+ βy)]= αx+ βy. (1.1)Let x′ = L(x), y′ = L(y), equation (1.1) becomesL−1(αx′ + βy′) = αL−1(x′) + βL−1(y′),that is, L−1 is a linear operator.Fact 1.2.16. [12, Theorem 2.4-2] Every finite dimensional subspace of a normedspace is complete.Fact 1.2.17. Let T : Rm → Rm be a linear operator. Then ranT is a subspace, sois closed. In other words, ranT = ranT . Here we use ranT to denote the closureof ranT.Proof. We set this proof into two parts.(1) Prove ranT is a subspace of Rm.(a) Since T is a linear operator, T (0) = 0. Thus, 0 ∈ ranT.(b) For any u, v ∈ ranT, there must exists x, y ∈ Rm such that Tx =u, Ty = v. Meanwhile, T (x+ y) = Tx+ Ty = u+ v. Thus, u+ v ∈ranT.(c) For any u ∈ ranT, there must exists x ∈ Rm such that Tx = u.Meanwhile, for any scalar c, T (cx) = cT (x) = cu. Thus, cu ∈ ranT.Since ranT satisfies all the conditions of being a subspace, ranT is a sub-space of Rm.131.2. Operators(2) Since Rm is a finite dimensional complete space and ranT is a subspace ofit, by using Fact 1.2.16 and Fact 1.1.22, we get ranT is closed.Fact 1.2.18. [4, Fact 2.25] Let T : Rm → Rm be a linear operator. Then(ranT )⊥ = zerT ?.1.2.2 Nonexpansive operatorsDefinition 1.2.19. Let D be a nonempty subset of Rm. Let T : D → Rm. Then Tis(1) nonexpansive if∀x ∈ D,∀y ∈ D, ‖Tx− Ty‖ ≤ ‖x− y‖;(2) firmly nonexpansive if∀x ∈ D,∀y ∈ D, ‖Tx−Ty‖2 ≤ ‖x− y‖2−‖(Id−T )x− (Id−T )y‖2.Example 1.2.20. For any 0 ≤ α ≤ 1, α Id is firmly nonexpansive.Proof. Since 0 ≤ α ≤ 1, we have α2 − α ≤ 0, which also impliesα2 + 1 + α2 − 2α ≤ 1.That is, for any x, y ∈ Rm, and x 6= y,α2 + (1− α)2 ≤ 1⇔ α2‖x− y‖2 ≤ ‖x− y‖2 − (1− α)2‖x− y‖2⇔ ‖αx− αy‖2 ≤ ‖x− y‖2 − ‖(1− α)x− (1− α)y‖2.Thus, for any 0 < α ≤ 1, α Id is firmly nonexpansive.Fact 1.2.21. [4, Proposition 4.2] Let D be a nonempty subset of Rm, let T : D →H.Then the following are equivalent:(1) T is firmly nonexpansive.(2) Id−T is firmly nonexpansive.(3) 2T − Id is firmly nonexpansive.(4) (∀x ∈ D) (∀y ∈ D) ‖Tx− Ty‖2 ≤ 〈x− y, Tx− Ty〉.141.2. Operators(5) (∀x ∈ D) (∀y ∈ D) 0 ≤ 〈Tx− Ty, (Id−T )x− (Id−T )y〉.(6) (∀x ∈ D) (∀y ∈ D) (∀α ∈ [0, 1]) ‖Tx − Ty‖ ≤ ‖α(x − y) + (1 −α)(Tx− Ty)‖.Fact 1.2.22. [4, Theorem 5.15] Let D be a nonempty closed convex subset of Rm,let T : D → D be a nonexpansive operator such that FixT 6= ∅, where the fixedpoints setFixT = {x ∈ Rm : Tx = x}.Let (λn)+∞n=1 be a sequence in [0, 1] such that∑+∞n=1 λn(1 − λn) = +∞, and letx0 ∈ D. Set(∀n ∈ N) xn+1 = xn + λn(Txn − xn).Then the following hold:(1) (Txn − xn)+∞n=1 converges to 0.(2) (xn)+∞n=1 converges to a point in FixT.Definition 1.2.23. Let D be a nonempty subset of H, let T : D → H be nonex-pansive, and let γ ∈ (0, 1). Then T is averaged with constant γ, or γ − averaged,if there exists a nonexpansive operator R : D → H such thatT = (1− γ) Id +γR.1.2.3 Monotone operatorsDefinition 1.2.24. [4] An operatorM : H → 2H is monotone if 〈x−y, u−v〉 ≥ 0for all (x, u), (y, v) ∈ graM .M is a maximally monotone operator if there is no monotone operator whose graphproperly contains graM .M is a strictly monotone operator if(∀(x, u), (y, v) ∈ graM) x 6= y ⇒ 〈x− y, u− v〉 > 0 (1.2)M is a uniformly monotone operator if there exists an increasing function φM :R+ → [0,+∞] with φM (0) = 0, and 〈x − y, u − v〉 ≥ φM (‖x − y‖) for all(x, u), (y, v) ∈ graM . When φM (‖x − y‖) = ‖x − y‖2, M is called stronglymonotone.Remark 1.2.25. [4, Remark 22.3] The notions of strict, uniform, and strong mono-tonicity of A : H → 2H can naturally be localized to a subset C of domA.Proposition 1.2.26. If M is uniformly monotone on H, then A is strictly mono-tone.151.2. OperatorsProof. M is uniformly monotone onH implies there exists a continuous increasingfunction φM : R+ → [0,+∞] with φM (0) = 0, and 〈x−y, u−v〉 ≥ φM (‖x−y‖)for all (x, u), (y, v) ∈ graM . Thus, in the case x 6= y, for all (x, u), (y, v) ∈graM , we have〈x− y, u− v〉 ≥ φM (‖x− y‖) (1.3)According to the definition of φM , (1.3) means φM (‖x − y‖) > 0, so M is astrictly monotone operator.Fact 1.2.27. [4, Proposition 23.35] Let A : H → 2H be strictly monotone. ThenzerA is at most a singleton.Proof. Suppose x ∈ zerA, y ∈ zerA, x 6= y, i.e., 0 ∈ Ax, 0 ∈ Ay. Since A isstrictly monotone, we have〈x− y, 0− 0〉 > 0⇒ 0 > 0which is a contradiction. Thus, zerA is at most a singleton.Fact 1.2.28. [4, Corollary 25.5] Let A and B be maximally monotone operatorsfromH to 2H such that one of the following holds:(1) domA ∩ intdomB 6= ∅.(2) 0 ∈ int(domA− domB).ThenA+B is maximally monotone. In particular, (1) and (2) hold when domB =H.Fact 1.2.29. [4, Propositions 20.22] Let A : H → 2H be maximally monotone,let z ∈ H, u ∈ H, and γ ∈ R++. Then A−1, and C : x 7→ u + γA(x + z) aremaximally monotone.Fact 1.2.30. [4, Propositions 20.22, 20.23] Let A : H → 2H and B : G → 2Gbe maximally monotone. Then A × B : H × G → 2H×G :(x, y) 7→ Ax × By ismaximally monotone.Fact 1.2.31. [4, Example 20.35] Let A : H → H be a bounded linear operatorsuch that A? = −A. Then A is maximally monotone.Fact 1.2.32. [4, Minty’s Theorem] Let A : H → 2H be monotone. Then A ismaximally monotone if and only if ran(Id +A) = H.Example 1.2.33. The following are maximally monotone operators:161.2. Operators(1) Let C be a nonempty convex set in Rm. Then NC , N−1C , N−1C + Id, and(N−1C + Id)−1 are all maximally monotone.(2) For any γ ∈ R++, γ Id is maximally monotone.Proof. (1) C is closed and convex, so due to Example 1.3.13, ιC ∈ Γ0(Rm).As ∂ιC is maximally monotone by Fact 1.3.27 and ∂ιC = NC by Example1.3.25, we get NC is maximally monotone. Because dom Id = Rm and NCis maximally monotone, by Fact 1.2.28 and Fact 1.2.29, N−1C , N−1C + Id,and (N−1C + Id)−1 are all maximally monotone.(2) For any x ∈ Rm, we have γ Idx = γx. Therefore, for any x, y ∈ Rm,〈y − x, γ Id y − γ Idx〉 =〈y − x, γy − γx〉=γ‖y − x‖2≥0.Moreover, we haveran(Id +γ Id) = ran(1 + γ) Id = Rm.Therefore, by Fact 1.2.32, γ Id is maximally monotone.Lemma 1.2.34. Let A : Rm → 2Rm be maximally monotone. Then −A(−·) isalso maximally monotone.Proof. For any y1 ∈ −A(−x1) and y2 ∈ −A(−x2), we have{ −y1 ∈ A(−x1)−y2 ∈ A(−x2).Since A is maximally monotone,〈−y1 − (−y2),−x1 − (−x2)〉 ≥ 0,which is equivalent to〈y2 − y1, x2 − x1〉 ≥ 0.Therefore,−A(−·) is monotone. SinceA is maximally monotone, we have−A(−·)is maximally monotone.171.2. Operators1.2.4 ResolventDefinition 1.2.35. [4] Let A : H → 2H be maximally monotone. The resolvent ofA is defined asJA := (Id +A)−1.Fact 1.2.36. Let A : H → 2H be monotone and let γ ∈ R++. Then JγA issingle-valued.Proof. Suppose JγA is not single-valued. Then there exists x, y1, y2 ∈ H andy1 6= y2 such thaty1 ∈ JγA(x) and y2 ∈ JγA(x).That is, {x ∈ (Id +γA)y1x ∈ (Id +γA)y2,which impliesy2 − y1 ∈ γ(Ay1 −Ay2).Therefore, there exists (y1, u) ∈ graA and (y2, v) ∈ graA such that u − v =1γ (y2 − y1). That implies〈y1 − y2, u− v〉 = −1γ‖y1 − y2‖2,which is less than 0 as y1 6= y2. This contradicts the assumption that A is mono-tone. Therefore, JγA must be single-valued.Fact 1.2.37. [4, Proposition 23.20] Let A : H → 2H be maximally monotone andlet γ ∈ R++. ThenId = JγA + γJγ−1A−1 ◦ γ−1 Id .In particular,JA−1 = Id−JA.Fact 1.2.38. [4, Proposition 23.10] LetD be a nonempty subset ofH, let T : D →H, and set A = T−1 − Id. Then the following hold:(1) T = JA.(2) T is firmly nonexpansive if and only if A is monotone.(3) T is firmly nonexpansive and D = H if and only if A is maximally mono-tone.181.3. FunctionsDefinition 1.2.39. An operator T : Rn → Rm is Lipschitz continous with constantβ ∈ [0,∞) if(∀x ∈ Rn)(∀y ∈ Rn) ‖Tx− Ty‖ ≤ β‖x− y‖.The operator T is locally Lipschitz continuous near a point x0 ∈ Rn if there existsr ∈ R++ such that T |B(x0;r), which means the restriction of T to B(x0; r), isLipschitz continuous.Fact 1.2.40. [4, Corollary 23.11] Let A : H → 2H be maximally monotone andlet γ ∈ R++. Then the following hold:(1) JγA : H → H and Id−JγA : H → H are firmly nonexpansive and maxi-mally monotone.(2) The reflected resolvent RγA : H → H : x 7→ 2JγAx− x is nonexpansive.1.3 FunctionsDefinition 1.3.1. Let f : Rm → [−∞,+∞]. The domain of f isdom f := {x ∈ Rm : f(x) < +∞},the epigraph of f isepi f := {(x, ξ) ∈ Rm × R : f(x) ≤ ξ},and the reversal of f isf∨ := {x ∈ Rm : f∨(x) := f(−x)}.Definition 1.3.2. A function f : Rm → [−∞,+∞] is proper if its domain isnonempty and −∞ /∈ f(Rm).1.3.1 Convex functionsDefinition 1.3.3. Let f : H → [−∞,+∞]. Then f(x) is convex if its epigraph{(x, r) : f(x) ≤ r} is a convex subset of H× R. Moreover, f is concave if −f isconvex.Fact 1.3.4. [4, Proposition 8.4] Let f : H → [−∞,+∞]. Then f(x) is convexif and only if for all x ∈ dom f, for all y ∈ dom f, for all α ∈ (0, 1)f(αx+ (1− α)y) ≤ αf(x) + (1− α)f(y).191.3. FunctionsFact 1.3.5. Let f : Rm → [−∞,+∞] be convex. Then its domain dom f = {x ∈Rm : f(x) < +∞} is convex.Proof. For any x, y ∈ dom f, we have f(x) < +∞, f(y) < +∞. Thus, for allα ∈ (0, 1), αf(x) + (1− α)f(y) < +∞. By Fact 1.3.4, for all x ∈ dom f, for ally ∈ dom f, for all α ∈ (0, 1)f(αx+ (1− α)y) ≤ αf(x) + (1− α)f(y)< +∞.That is, αx+ (1−α)y ∈ dom f. According to the definition of the convex set, wefind dom f is a convex set.Definition 1.3.6. Let f : Rm → [−∞,+∞] be a proper function. Then f(x) isstrictly convex if ∀x ∈ dom f,∀y ∈ dom f, ∀α ∈ (0, 1), and for x 6= y, wehavef(αx+ (1− α)y) < αf(x) + (1− α)f(y).Now letC be a nonempty subset of dom f . Then f is convex onC if ∀x ∈ C,∀y ∈C,∀α ∈ (0, 1), and for x 6= y, we havef(αx+ (1− α)y) ≤ αf(x) + (1− α)f(y),and f is strictly convex on C if ∀x ∈ C,∀y ∈ C,∀α ∈ (0, 1), and for x 6= y, wehavef(αx+ (1− α)y) < αf(x) + (1− α)f(y).Example 1.3.7. The function ‖ · ‖2 is strictly convex.Proof. Let x, y ∈ Rm, and x 6= y. Let 0 < a < 1. Then‖ax+ (1− a)y‖2 − a‖x‖2 − (1− a)‖y‖2=a2x2 + (1− a)2y2 + 2a(1− a)〈x, y〉 − ax2 − (1− a)y2=− a(1− a)(x2 + y2 − 2〈x, y〉)=− a(1− a)(x− y)2.Because x 6= y and 0 < a < 1, we have −a(1− a)(x− y)2 < 0. That is,‖ax+ (1− a)y‖2 < a‖x‖2 + (1− a)‖y‖2.Therefore, the function ‖ · ‖2 is strictly convex.Fact 1.3.8. Let f : Rm → R be a convex function, A : Rm → Rm be a linearoperator. Then g = f ◦A is convex.201.3. FunctionsProof. For all x ∈ dom g, for all y ∈ dom g, for all α ∈ (0, 1). Since g = f ◦ A,we haveg(αx+ (1− α)y) = f(A[αx+ (1− α)y]).Because A is a linear operator, we havef(A[αx+ (1− α)y]) = f(αAx+ (1− α)Ay). (1.4)As f is convex, equation (1.4) impliesg(αx+ (1− αy)) ≤ αf(Ax) + (1− α)f(Ay)= αg(x) + (1− α)g(y).Therefore, g is a convex function.Fact 1.3.9. Let f : R+ → R be convex and increasing, g : Rm → R+ be convex.Then h = f ◦ g is convex.Proof. For all x ∈ domh, for all y ∈ domh, for all α ∈ (0, 1). Since h = f ◦ g,we haveh(αx+ (1− α)y) = f(g(αx+ (1− α)y)). (1.5)As f is convex and increasing, g is convex, equation (1.5) implies thath(αx+ (1− α)y) ≤ f(αg(x) + (1− α)y)≤ αf(g(x)) + (1− α)f(g(y))= αh(x) + (1− α)h(y).Therefore, h is a convex function.1.3.2 Lower semicontinuous functionsIn the following thesis, I shall use B(x; r) to denote the closed ball with centerat x and radius r ∈ R++.Definition 1.3.10. The lower limit of a function f : Rm → R at x¯ is the value inRm defined bylim infx→x¯ f(x) : = limδ↘0infx∈B(x¯;δ)f(x)= supδ>0infx∈B(x¯;δ)f(x).211.3. FunctionsDefinition 1.3.11. A function f is lower semicontinous at a point x0 iflim infx→x0f(x) ≥ f(x0).The function is said to be lower semicontinuous on H if it is lower semicontinousat every point x0 ∈ H.Example 1.3.12. The following functions are lower semicontinous:(1) All continuous functions are lower semicontinous.(2) The piecewise functionf(x) ={sin(x) if x ≤ pi2 ,sin(x) + 1 if x > pi2is lower semicontinuous.Example 1.3.13. [4, Example 1.25] The indicator function of a set C ∈ H, i.e.,the functionιC : H → [−∞,+∞] : x 7→{0 if x ∈ C,+∞ otherwiseis lower semicontinuous if and only if C is closed. Moreover, if C is closed andconvex, then ιC is a proper, lower semicontinuous, and convex function.In the following thesis, I use Γ(H) to denote the set of lower semicontinuousconvex functions fromH to [−∞,+∞], and use Γ0(H) to denote the set of properlower semicontinuous convex functions fromH to (−∞,+∞].Fact 1.3.14. [4, Corollary 9.4] Let (fi)i∈I be a family in Γ(H). If I is finite and−∞ /∈ ⋃i∈I fi(H). Then∑i∈I fi ∈ Γ(H).Lemma 1.4. Let f, g ∈ Γ0(H), and dom f ∩ dom g 6= ∅. Then f + g ∈ Γ0(H).Proof. Since f, g ∈ Γ0(H), that is,−∞ /∈ f(H)∪g(H),we have f+g > −∞.Asdom f ∩ dom g 6= ∅, there must exists at least an x such that f(x) + g(x) < +∞.Therefore, combining with Fact 1.3.14, f + g ∈ Γ0(H).Fact 1.3.15. [4, Theorem 9.20] Let f ∈ Γ0(H). Then for any x ∈ H, there existsa u ∈ H and an η ∈ R such that f(x) ≥ 〈x, u〉+ η.221.3. Functions1.3.3 Coercive and supercoercive functionsDefinition 1.3.16. Let f : H → [−∞,+∞]. Then f is coercive iflim‖x‖→+∞f(x) = +∞,and supercoercive iflim‖x‖→+∞f(x)‖x‖ = +∞.By convention, we say f is coercive and supercoercive ifH = {0}.Example 1.3.17. The function ‖ · ‖2 is supercoercive.Fact 1.3.18. Let f be in Γ0(H), and let g : H → (−∞,+∞] be supercoercive.Then f + g is supercoercive.Proof. According to Fact 1.3.15, there exists a u ∈ H and an η ∈ R such that forall x ∈ H,f(x) ≥ 〈x, u〉+ η.Then we havelim‖x‖→+∞f(x) + g(x)‖x‖ ≥ lim‖x‖→+∞〈x, u〉+ η + g(x)‖x‖≥ lim‖x‖→+∞−‖u‖‖x‖+ η + g(x)‖x‖= lim‖x‖→+∞(−‖u‖+ η + g(x)‖x‖ )≥ −‖u‖ − |η|+ lim‖x‖→+∞g(x)‖x‖→ +∞.Thus, f + g is supercoercive.Fact 1.3.19. [4, Corollary 11.16] Let f and g be in Γ0(H). Suppose that dom f ∩dom g 6= ∅ and f is supercoercive. Then f + g is coercive and it has a minimizeroverH. If f or g is strictly convex, then f + g has exactly one minimizer overH.231.3. Functions1.3.4 Subgradient and subdifferentialDefinition 1.3.20. Let f : Rm → [−∞,+∞], and let x be a point such that|f(x)| < +∞. We say that f is differentiable (or Fre´chet differentiable) at x if andonly if there exists a vector x∗ with the propertylimy→xf(y)− f(x)− 〈x∗, y − x〉‖y − x‖ = 0.If such x∗ exists, it is called the gradient of f at x and is denoted by∇f(x).Definition 1.3.21. A vector u ∈ Rm is said to be a subgradient of a convex functionf : Rm → R at the point x if we have∀y ∈ Rm 〈y − x, u〉+ f(x) ≤ f(y).The set of all subgradients of f at x is called the subdifferential of f at x and isdenoted by ∂f(x).Definition 1.3.22. (Fenchel Subdifferential) For a (not necessarily convex) f :Rm → R, define its Fenchel subdifferential at x∂f(x) := {v ∈ Rm : f(y) ≥ f(x) + 〈v, y − x〉 for all y ∈ Rm}.When f is convex, ∂f(x) is the usual subdifferential.Fact 1.3.23. [15, Proposition 2.36] Let f : Rm → (−∞,+∞] be proper andconvex, and let x ∈ dom f. Suppose that f is differentiable at x. Then∂f(x) = {∇f(x)}.Example 1.3.24. Let f(x) = 12‖x‖2. Then ∂f(x) = {∇f(x)} = {x}.Proof. We already proved ‖ · ‖2 is strictly convex in Example 1.3.7, therefore inthe case x, y ∈ Rm, x 6= y and 0 < a < 1,12‖ax+ (1− a)y‖2 <12(a‖x‖2 + (1− a)‖y‖2)=a2‖x‖2 + 1− a2‖y‖2.Therefore, f(x) = 12‖x‖2 is strictly convex. Since f(x) is proper, convex, anddifferentiable on Rm, applying Fact 1.3.23 here, we have∂f(x) = {∇f(x)} = {x}.241.3. FunctionsFigure 1.3: Figure of f(x) = |x|Example 1.3.25. [4, Example 16.13] Let C be a nonempty convex subset of H.Then ∂ιC = NC .Example 1.3.26. Let f : R→ R defined as f(x) = |x|.Through the graph of this fuction, we can easily see its global minimizer is atx¯ = 0. However, this function is not differentiable at x = 0. At points other than0, f is differentiable. According to the definition of the subgradient of a convexfunction, we can get the the subdifferential of f(x) = |x| is∂f(x) ={−1} if x < 0,[−1, 1] if x = 0,{1} if x > 0.Proof. Because f is differentiable when x < 0 (or x > 0). According to Fact1.3.23,∂f(x) = {∇f(x)} = {1} for x > 0;∂f(x) = {∇f(x)} = {−1} for x < 0.For x¯ = 0, let v ∈ ∂f(x¯). Then we have〈v, x− x¯〉 ≤ f(x)− f(x¯) ∀x ∈ R⇔ 〈v, x〉 ≤ f(x) for x¯ = 0, ∀x ∈ R⇔{v ≥ f(x)x = −xx = −1 if x < 0v ≤ f(x)x = xx = 1 if x > 0⇔ v ∈ [−1, 1].251.3. FunctionsThus, we have ∂f(0) = [−1, 1].Fact 1.3.27. [4, Theorem 20.25] Let f ∈ Γ0(H). Then ∂f is maximally monotone.Fact 1.3.28. Let f ∈ Γ0(Rm) Then ran(Id +∂f) = Rm.Proof. Since f ∈ Γ0(Rm), by Fact 1.3.27, ∂f is maximally monotone. Accordingto Fact 1.2.32, we have ran(Id +∂f) = Rm.Fact 1.3.29. [4, Theorem 16.47] Let K be a real Hilbert space, let f ∈ Γ0(H), letg ∈ Γ0(K), and let L : H → K be a nonzero bounded linear operator. SupposeL(dom f) ∩ intdom g 6= ∅. Then ∂(f + g ◦ L) = ∂f + L? ◦ (∂g) ◦ L.Fact 1.3.30. [10, Page 20] Let f : Rm → R be convex, and let L : H → Kbe a nonzero bounded linear operator. If h(x) = f(Lx + b), where b ∈ R, then∂h(x) = L? ◦ ∂f ◦ (Lx+ b).Fact 1.3.31. [4, Corollary 16.50] Let m be an integer such that m ≥ 2, set I ={1, . . . ,m}, and let (fi)i∈I be functions in Γ0(H) such thatdom fm ∩m−1i=1 intdom fi 6= ∅.Then ∂(∑mi=1 fi) =∑mi=1 ∂fi.Fact 1.3.32. Let f ∈ Γ0(H) and let γ ∈ R++. Then ∂(f+(γ/2)‖·‖2) = ∂f+γ Id.Proof. Since dom ‖ · ‖2 = H, and f ∈ Γ0(H), we havedom f ∩ intdom[(γ/2)‖ · ‖2] 6= ∅.Therefore, by using Fact 1.3.31,∂(f + (γ/2)‖ · ‖2) = ∂f + ∂[(γ/2)‖ · ‖2] = ∂f + γ Id .Definition 1.3.33. The set of global minimizers of a function f is denoted asArgmin f.Fact 1.3.34. Let f : Rm → (−∞,+∞] be proper. ThenArgmin f = zer ∂f = {x ∈ Rm : 0 ∈ ∂f(x)}.Proof. Let x ∈ Argmin f. Then for all y ∈ Rm, f(x) ≤ f(y). That impliesf(x) + 〈y − x, 0〉 ≤ f(y), which is equivalent to 0 ∈ ∂f(x). Thus,Argmin f = zer ∂f.261.3. Functions1.3.5 ConjugationDefinition 1.3.35. Let f : H → [−∞,+∞]. The conjugate of f isf∗ : H → [−∞,+∞] : u 7→ supx∈H(〈x, u〉 − f(x)),and the biconjugate of f is f∗∗ = (f∗)∗.Fact 1.3.36. [4, Proposition 13.19] Let f : H → [−∞,+∞]. Thenf =12‖ · ‖2 ⇔ f∗ = f.Fact 1.3.37. [4, Proposition 13.23] Let f : H → (−∞,+∞]. Then for anyα ∈ R++, (αf)∗ = αf∗(·/α).Example 1.3.38. Let x ∈ Rm, let λ ∈ R++. If f(x) = λ‖x‖2, then f∗(u) = ‖u‖24λ .Proof. By applying Fact 1.3.37 together with Fact 1.3.36, we have(λ‖ · ‖2)∗ =(2λ12‖ · ‖2)∗= 2λ(12‖ · ‖2)∗( ·2λ)= λ∥∥∥ ·2λ∥∥∥2=‖ · ‖24λ.Fact 1.3.39. [4, Proposition 13.13] Let f : H → [−∞,+∞]. Then f∗ ∈ Γ(H).Fact 1.3.40. [4, Theorem 13.37] Let f : H → (−∞,+∞] be proper. Then f ∈Γ0(H) if and only if f = f∗∗. In this case, f∗ is proper as well.Theorem 1.3.41. Let f, g ∈ Γ0(H) and dom f∗ ∩ dom g∗ 6= ∅. Then f∗ + g∗ ∈Γ0(H).Proof. Combine Fact 1.3.39 and Fact 1.3.40, f, g ∈ Γ0(H) implies f∗, g∗ ∈Γ0(H). Moreover, as dom f∗ ∩ dom g∗ 6= ∅, according to Lemma 1.4, f∗ + g∗ ∈Γ0(H).Fact 1.3.42. [4, Proposition 16.10] Let f be a proper function on H, let x ∈ H,and u ∈ H. Then u ∈ ∂f(x) if and only if f(x) + f∗(u) = 〈x, u〉 ⇒ x ∈ ∂f∗(u).271.3. FunctionsFact 1.3.43. [4, Proposition 16.49] Let f ∈ Γ0(H). Then intdom f ⊂ dom ∂f ⊂dom f.Example 1.3.44. Let f : R→ R such thatf(x) ={x lnx if x ≥ 0,+∞ if x < 0.We have dom ∂f = (0,+∞) because ∂f(0) = ∅. As dom f = [0,+∞), in thisexample dom ∂f ⊆ dom f.Example 1.3.45. Let f(x) = ι[2,3](x). Then 2 ∈ dom ∂f but 2 /∈ intdom f.Therefore, intdom f ⊆ dom ∂f.Fact 1.3.46. [4, Corollary 16.30] Let f ∈ Γ0(H). Then (∂f)−1 = ∂f∗.1.3.6 Infimal convolutionDefinition 1.3.47. Let f and g be functions from Rm to [−∞,+∞]. The infimalconvolution of f and g isfg : Rm → [−∞,+∞] : x 7→ infy∈Rm(f(y) + g(x− y)),and it is exact at a point x ∈ Rm if (fg)(x) = miny∈Rm{f(y) + g(x− y)}, i.e.,∃y ∈ H : (fg)(x) = f(y) + g(x− y) ∈ (−∞,+∞];fg is exact if it is exact at every point of its domain, in which case it is denotedby f g.Fact 1.3.48. [4, Proposition 12.6] Let f and g be functions fromH → (−∞,+∞].Then the following hold:(1) dom(fg) = dom f + dom g.(2) fg = gf.Fact 1.3.49. [4, Proposition 13.24] Let f and g be functions in fromH to (−∞,+∞].Then (fg)∗ = f∗ + g∗.Fact 1.3.50. [4, Proposition 15.2] Let f and g be functions in Γ0(H) such that0 ∈ int(dom f − dom g). Then (f + g)∗ = f∗ g∗.Fact 1.3.51. [4, Proposition 15.7] Let f and g be in Γ0(H). Suppose0 ∈ int(dom f∗ − dom g∗).Then fg = f g ∈ Γ0(H).281.3. FunctionsFact 1.3.52. [4, Proposition 16.61] Let f and g be in Γ0(H), let x ∈ dom(fg),and let y ∈ H. Then the following hold:(1) Suppose that (fg)(x) = f(y) + g(x− y). Then∂(fg)(x) = ∂f(y) ∩ ∂g(x− y).(2) Suppose that ∂f(y) ∩ ∂g(x− y) 6= ∅. Then (fg)(x) = f(y) + g(x− y).Definition 1.3.53. Let f : H → (−∞,+∞] be proper and let λ ∈ R++. TheMoreau envelope of f with parameter λ iseλf := f(12λ‖ · ‖2).Example 1.3.54. [4, Example 12.21] Let C ⊂ H and let λ ∈ R++. Then eλιC =(2λ)−1d2C .Proof. We haveeλιC(x) =[ιC(12λ‖ · ‖2)](x)= infy∈Rm(ιC(y) +12λ‖x− y‖2)= infy∈C(12λ‖x− y‖2)=12λd2C(x).Fact 1.3.55. Let f : Rm → (−∞,+∞] be proper and let λ ∈ R++. Then eλf isfull domain, i.e., dom eλf = Rm.Proof. Combining the definition of eλf with the Fact 1.3.48, we havedom eλf = dom f(12λ‖ · ‖2)= dom f + Rm= Rm.Thus, eλf is full domain.291.3. FunctionsDefinition 1.3.56. Let f ∈ Γ0(H) and let x ∈ H. Then Proxf x is the uniquepoint inH that satisfiese1f(x) = miny∈H(f(y) +12‖x− y‖2)= f(Proxf x) +12‖x− Proxf x‖2.Fact 1.3.57. [4, Proposition 16.44] Let f ∈ Γ0(H), and let x and p be inH. Thenp = Proxf x⇔ x− p ∈ ∂f(p).In other words,Proxf = (Id +∂f)−1 = J∂f .1.3.7 Fenchel-Rockafellar dualityDefinition 1.3.58. [4, Definition 15.19] Let f : H → (−∞,+∞], let g : K →(−∞,+∞], and let L : H → K be a nonezero bounded linear operator. The primalproblem associated with the composition function f + g ◦ L isminx∈H{f(x) + g(Lx)}, (1.6)its dual problem isminv∈K{f∗(L?v) + g∗(−v)}, (1.7)the primal optimal value is µ = inf(f + g ◦ L)(H), the dual optimal value isµ∗ = inf(f∗ ◦ L? + g∗∨)(K), and the duality gap is∆(f, g, L) ={0, if µ = −µ∗ ∈ {−∞,+∞}µ+ µ∗otherwise.Fact 1.3.59. [4, Proposition 15.21] Let f : H → (−∞,+∞] and g : K →(−∞,+∞] be proper, and let L : H → K be a nonezero bounded linear operator.Set µ = inf(f + g ◦ L)(H) and µ∗ = inf(f∗∨ ◦ L? + g∗)(K). Thenµ = −µ∗ ⇔ ∆(f, g, L) = 0.Remark 1.5. There exists a solution to problem (1.6) implies there must existsa solution to problem (1.7), and vice versa. Therefore, solving problem (1.6) isequivalent to solving problem (1.7).30Chapter 2Classic Douglas-Rachfordalgorithm2.1 OverviewIn this chapter, the history of the Douglas-Rachford algorithm is reviewed.There is a relation between the composited monotone inclusion problem and theDouglas-Rachford algorithm, and there is also a relation between the compositedmonotone inclusion problem and the optimization problems. Those two relationsare roughly given by Bot and Hendrich [6] in 2013. Here, I will show those rela-tions in details.2.2 Douglas-Rachford splitting problem and the briefhistory of Douglas-Rachford algorithmThe Douglas-Rachford splitting problem is the problem of finding a point x ∈H such that0 ∈ Ax+Bx,where A and B are maximally monotone operators. Naturally, this approach isnumerically viable only in those cases in which it is easy to compute Jγ(A+B),where γ ∈ R++. However, the Douglas-Rachford algorithm, in which the opera-tors A and B are employed in separate steps, can be seen as a widely applicablealternative.The Douglas-Rachford algorithm was first be proposed by J. Douglas and H.H. Rachford [9] in 1956 as a method for solving certain matrix equations. In 1969Lieutaud (see [13]) extended their method to deal with (possibly nonlinear) max-imally monotone operators that are defined everywhere. Lions and Mercier, intheir paper [14] from 1979, presented a broad and powerful generalization to itscurrent form, i.e., to handle the sum of any two maximally monotone operatorsthat are possibly nonlinear, possibly set-valued and not necessarily defined every-where. With the joint work of Eckstein and Bertsekas (see [11, Theorem 5]) from312.2. Douglas-Rachford splitting problem and the brief history of Douglas-Rachford algorithm1992, the inclusion JB(FixT ) ⊆ zer(A+B) has been proved. Later on, in 2004,Combettes (see [7, Lemma 2.6(iii)]) refined the results by Eckstein and Bertsekas.Together with the earlier results by Lions and Mercier [14], the work by Ecksteinand Bertsekas and later by Combettes complete the following Douglas-Rachfordalgorithm in the finite dimensional setting.Lemma 2.2.1. [4, Douglas-Rachford algorithm] LetA andB be maximally mono-tone operators from Rm to 2Rm such that zer(A + B) 6= ∅. Let (λn)+∞n=1 be a se-quence in [0, 2] such that∑+∞n=1 λn(2− λn) = +∞, let γ ∈ R++, and let x0 ∈ H.Set(∀n ∈ N)yn = JγBxnzn = JγA(2yn − xn)xn+1 = xn + λn(zn − yn).(DR)Then there exists x ∈ FixRγA ◦RγB such that the following hold:(1) JγBx ∈ zer(A+B).(2) (yn − zn)+∞n=1 converges to 0.(3) (xn)+∞n=1 converges to x.(4) (yn)+∞n=1 converges to JγBx.(5) (zn)+∞n=1 converges to JγBx.(6) Suppose that one of the following holds:(a) A is uniformly monotone on every nonempty bounded subset of domA.(b) B is uniformly monotone on every nonempty bounded subset of domB.Then (yn)+∞n=1 and (zn)+∞n=1 converge to the unique point in zer(A+B).Later on, in 2009, Combettes [8] proved that Douglas-Rachford algorithm iserror-tolerant. Relying on the work of Combettes, in 2013, the joint work of Botand Hendrich [6] showed that there are two different primal-dual iterative error-tolerant methods for solving inclusions with mixtures of composite and parallel-sum type monotone operators.Remark 2.1. In 2011, Svaiter (see [16]) demonstrated that A + B does not haveto be maximally monotone and (JB(Tnx))+∞n=1 converges weakly to a point inzer(A+B) in the general Hilbert space. In 2017, Bauschke and Moursi [5] gave asimpler proof of the weakly convergence of the sequence (JB(Tnx))+∞n=1. But thisis beyond the scope of this thesis.322.3. The composited monotone inclusion problem2.3 The composited monotone inclusion problemThe composited monotone inclusion problems in this thesis are all consideredin the Rm space. All of them contain two parts: the primal inclusion problem;and the dual inclusion problem. Here is the general case: Let A : Rm → 2Rm ,B : Rm → 2Rm and D : Rm → 2Rm be maximally monotone operators. Letr ∈ Rm, and let L : Rm → Rm be a nonzero linear invertible operator.Let z ∈ Rm, the primal inclusion problem is to find a point x¯ ∈ Rm such thatz ∈ Ax¯+ L?(BD)(Lx¯− r).(P)The dual inclusion problem is to find a point v¯ ∈ Rm such that(∃x ∈ Rm){z − L?v¯ ∈ Axv¯ ∈ (BD)(Lx− r)(D)Lemma 2.3.1. The primal inclusion problem (P) is equivalent to the dual inclusionproblem (D).Proof. Suppose x¯ ∈ Rm is the solution of the primal inclusion problem. That is,z ∈ Ax¯+ L?(BD)(Lx¯− r),which is equivalent to0 ∈ −z +Ax¯+ L?(BD)(Lx¯− r). (2.1)Equation (2.1) implies that some v¯ ∈ (BD)(Lx¯− r) obeys 0 ∈ −z+Ax¯+L?v¯.In other words, that means v¯ ∈ Rm obeys{0 ∈ −z +Ax¯+ L?v¯v¯ ∈ (BD)(Lx¯− r). (2.2)This v¯ solves (D), because x = x¯ satisfies the required conditions. Since L is anonzero linear operator, 0 ∈ −z + Ax + L?v¯ means z − L?v¯ ∈ Ax. Therefore,problem (2.2) is the dual problem. Thus, finding the solution of the primal inclu-sion problem is equivalent to finding the solution of the dual inclusion problem.In another words, if we can find a solution to the primal inclusion problem, theremust exists a solution to the dual inclusion problem. Conversely, if we can find asolution to the dual inclusion problem, there must exists a solution to the primalinclusion problem.332.3. The composited monotone inclusion problemRemark 2.2. We say (x¯, v¯) is a primal-dual solution to problem (P) together with(D), ifz − L?v¯ ∈ Ax¯ and v¯ ∈ (BD)(Lx¯− r).Here, x¯ is a solution to (P) and v¯ is a solution to (D), see Bot and Hendrich [6].Before we get a corollary of Lemma 2.3.1, we need the following lemma.Lemma 2.3.2. Let B : Rm → 2Rm and D = N{0}. ThenBD = B.Proof. Since D = N{0}, according to Lemma 1.1.35, N{0}−1y = 0, for any y ∈Rm. Suppose BD 6= ∅, then there exists a pair of x ∈ Rm and y ∈ Rm such thaty ∈ (BD)(x).That is,y ∈ (B−1 +N{0}−1)−1(x)⇔x ∈ (B−1 +N{0}−1)(y)⇔x ∈ B−1y⇔y ∈ Bx.Therefore,BN{0} = B.Corollary 2.3.3. Let A : Rm → 2Rm , B : Rm → 2Rm be maximally monotoneoperators, let D = N{0}. Let r = 0, z = 0, and let L = Id. Then the followingproblems are equivalent:(1) the primal inclusion problem:find a point x¯ ∈ Rm such that 0 ∈ Ax¯+Bx¯, (2.3)(2) the dual inclusion problem:find a point v¯ ∈ Rm such that (∃x ∈ Rm){ −v¯ ∈ Axv¯ ∈ Bx. (2.4)Therefore, in this case, the dual inclusion problem (D) becomes: find v′ suchthat0 ∈ A−1(v′)−B−1(−v′). (2.5)342.3. The composited monotone inclusion problemProof. (1) Plugging D = N{0}, r = 0, z = 0, and L = Id into the primalinclusion problem (P), we getfind a point x¯ ∈ Rm such that 0 ∈ Ax¯+ (BN{0})x¯.By Lemma 2.3.2, we get 0 ∈ Ax¯+ (BN{0})x¯ is equivalent to0 ∈ Ax¯+Bx¯.(2) Again, we plug D = N{0}, r = 0, z = 0, and L = Id into the dual inclusionproblem (D), we get (2.4). Since there exists x ∈ Rm such that −v¯ ∈Ax, v¯ ∈ Bx, x should be a solution of (2.3). Therefore, (2.4) is equivalentto (2.3). Since −v¯ ∈ Ax⇒ x ∈ A−1(−v¯) and v¯ ∈ Bx⇒ x ∈ B−1(v¯), theinclusion problem (2.4) is equivalent to find v¯ such that0 ∈ A−1(−v¯)−B−1(v¯).Now we let v′ = −v¯, then the inclusion problem (2.4) becomes: find v′ suchthat0 ∈ A−1(v′)−B−1(−v′).Remark 2.3. (2.5) is called the Attouch-Théra duality [1] of (2.3).Lemma 2.3.4. [6, Theorem 2.1] Let A : Rm → 2Rm , B : Rm → 2Rm andD : Rm → 2Rm be maximally monotone operators. Let z and r ∈ Rm, letL : Rm → Rm be a nonzero linear operator. Let K = Rm × Rm. If we definethree set-valued operators M,Q and S as follows:M : K → 2K : (x, v) 7→ (−z +Ax, r +B−1v);(M)Q : K → 2K : (x, v) 7→ (0, D−1v);(Q)S : K → K : (x, v) 7→ (L?v,−Lx).(S)Moreover, define an bounded linear operatorV : K → K : (x, v) 7→ (xτ− 12L?v,vσ− 12Lx),(V)where τ, σ ∈ R++, and τσ‖L‖2 < 4.Finally, define two operators on KV :A := V −1(12S +Q).(A)B := V −1(12S +M).(B)352.3. The composited monotone inclusion problemHere, the space KV is a vector space with inner product 〈x, y〉KV = 〈x, V y〉K andnorm ‖x‖KV =√〈x, V x〉K. Then any(x¯, v¯) ∈ zer(A+B).is a pair of primal-dual solution to problem (P) and (D) and vice versa, while x¯ isthe solution of the primal inclusion problem (P) and v¯ is the solution of the dualinclusion problem (D)For the completeness of the thesis, we show the proof of this lemma here.Proof. We split this proof into three steps.Step 1: Prove the set-valued operators M,Q, and S are maximally mono-tone.Since L is a nonzero linear operator, A,B, and D are maximally monotone oper-ators, and operators M and Q are maximally monotone on K by Fact 1.2.29 andFact 1.2.30.For operator S, let a = (x, v), b = (y, u), a ∈ K, b ∈ K. Then〈Sa, b〉 = 〈(L?v,−Lx), (y, u)〉= 〈L?v, y〉+ 〈−Lx, u〉= 〈v, Ly〉+ 〈x,−L?u〉= 〈(x, v), (−L?u, Ly)〉= 〈a,−Sb〉i.e., S? = −S. From Fact 1.2.31, it follows that S is maximally monotone.Step 2: Show V is maximally monotone, and prove V −1 exists.Let a = (x, v), b = (y, u), a ∈ K, b ∈ K. Then〈V a, b〉 =〈(xτ− 12L?v,vσ− 12Lx), (y, u)〉= 〈xτ, y〉+ 〈−12L?v, y〉+ 〈 vσ− 12Lx, u〉= 〈x, yτ〉+ 12〈v,−Ly〉+ 〈v, uσ〉 − 12〈x, L?u〉= 〈x, yτ− 12L?u〉+ 〈v,−12Ly +uσ〉362.3. The composited monotone inclusion problem=〈(x, v), (yτ− 12L?u,−12Ly +uσ)〉= 〈a, V b〉.That means V is self-adjoint, i.e., V ? = V .Let a = (x, v), a ∈ K,〈V a, a〉 = 〈(xτ− 12L?v,vσ− 12Lx), (x, v)〉=‖x‖2τ− 〈v, Lx〉+ ‖v‖2σ≥ ‖x‖2τ− ‖v‖‖L‖‖x‖+ ‖v‖2σ.For any λ ∈ R++,‖x‖2λ+ λ‖v‖2 − 2‖x‖‖v‖ ≥ 0.Then we get‖x‖2τ− ‖v‖‖L‖‖x‖+ ‖v‖2σ≥‖x‖2τ+‖v‖2σ− 12(σ‖L‖2√τσ‖L‖2 ‖x‖2 +√τσ‖L‖2σ‖v‖2)≥‖x‖2τ+‖v‖2σ− 12(√τσ‖L‖2τ‖x‖2 +√τσ‖L‖2σ‖v‖2)≥(1− 12√τσ‖L‖2)(‖x‖2τ+‖v‖2σ)≥(1− 12√τσ‖L‖2) min{1τ,1σ}‖a‖2.Let ρ = (1 − 12√τσ‖L‖2) min{ 1τ , 1σ}. Then 〈V a, a〉 ≥ ρ‖a‖2. Since τ and σsatisfy the condition τσ‖L‖2 < 4, we have ρ > 0. That means, V is ρ−stronglypositive.Let a = (x, v), b = (y, u), a ∈ K, b ∈ K. Since V is a bounded linear operator, wehave :〈V a− V b, a− b〉 = 〈V (a− b), a− b〉≥ ρ‖a− b‖2≥ 0.Thus, V is maximally monotone.To prove the existence of V −1, we only need to prove V is one-to-one (in other372.3. The composited monotone inclusion problemwords, zerV = {0}), and is onto (in other words, ranV = K).Because V is ρ-strongly positive, zerV = {0}. Otherwise, suppose we have x ∈zerV with x 6= 0. Then 〈x, 0〉 = 〈x, V x〉 ≥ ρ‖x‖2 > 0, which is impossible.According to Fact 1.2.17, since V is a bounded linear operator, ranV is a closedsubspace, that is, ranV = ranV. By Fact 1.2.18,(ranV )⊥ = zerV ?. (2.6)As V is self-adjoint, V ? = V. Thus,zerV ? = zerV = {0}. (2.7)Combining the result of (2.6) and (2.7), we get(ranV )⊥ = {0}.Because (ranV )⊥ = {0} ⊆ K⊥, by Fact 1.1.27,K⊥⊥ ⊆ (ranV )⊥⊥. (2.8)Since ranV and K are closed subspaces, by Fact 1.1.28, K⊥⊥ = K, (ranV )⊥⊥ =ranV. Thus, (2.8) implies that K ⊆ ranV. Therefore,ranV = K.That means, V −1 exists.Step 3: Show that (x¯, v¯) ∈ zer(A+B) if and only if x¯ is a primal solutionof (P) and v¯ is a dual solution of (D)Since S,M and Q are maximally monotone and domS = K, according to Fact1.2.28, 12S +Q and12S +M are maximally monotone on K.Take 12S +Q as an example. Let(x, u) ∈ gra(12S +Q); (y, v) ∈ gra(12S +Q).As 12S +Q is maximally monotone on space K, we have〈x− y, u− v〉K ≥ 0. (2.9)Because V −1 exists, (2.9) can be written as 〈x − y, V V −1(u − v)〉K ≥ 0, whichequals〈x− y, V −1(u− v)〉KV ≥ 0. (2.10)382.3. The composited monotone inclusion problemBy Fact 1.2.15,(2.10)⇔ 〈x− y, V −1u− V −1v〉KV ≥ 0.That means A := V −1(12S +Q) is a maximally monotone operator on space KV.By using the same method, we can prove B := V −1(12S + M) is a maximallymonotone operator on space KV too.Again, since V −1 is linear,zer(A+B) = zer(V −1(12S +M) + V −1(12S +Q))= zer(V −1(S +M +Q)).(1) On the one hand, let x ∈ zer(V −1(S +M +Q)). Then(V −1(S +M +Q))(x) = 0i.e., (S +M +Q)(x) = V (0)= 0.Thus, x ∈ zer(S +M +Q).(2) On the other hand, let x ∈ zer(S +M +Q). Then we have(V −1(S +M +Q))(x) = V −1((S +M +Q)(x))= V −1(0)= 0.Thus, x ∈ zer(V −1(S +M +Q)).Altogether, we havezer(V −1(S +M +Q)) = zer(S +M +Q).Consequently, one haszer(A+B) = zer(S +M +Q).If zer(M + S + Q) 6= ∅, then according to the definition of M,S,Q, there exists(x¯, v¯) ∈ K such that {0 ∈ −z +Ax¯+ L?v¯,0 ∈ r +B−1v¯ +D−1v¯ − Lx¯,392.4. Application to proper, lower-semicontinuous convex functionswhich is equivalent to {z − L?v¯ ∈ Ax¯,v¯ ∈ (BD)(Lx¯− r). (2.11)That is, v¯ is the solution of the dual inclusion problem. Moreover, since L is anonzero linear operator, z − L?v¯ ∈ Ax¯ means 0 ∈ −z + Ax¯ + L?v¯. Then, thosetwo inclusions in (2.11) impliesz ∈ Ax¯+ L?(BD)(Lx¯− r),i.e., x¯ is the solution of the primal inclusion problem. Altogether, we call (x¯, v¯)the primal-dual solution.Therefore, we deduce that (x¯, v¯) ∈ zer(A + B) if and only if x¯ is a solution ofthe primal inclusion problem (P) and v¯ is a solution of the dual inclusion problem(D).Remark 2.4. For the operator V : K → K : (x, v) 7→ (xτ − 12L?v, vσ − 12Lx),it is conjectured that we can consider it as a matrix and get detV = 1τσ − ‖L‖24 .Therefore, if we have τσ‖L‖2 < 4, V −1 exists.Remark 2.5. According to the definition of the Douglas-Rachford Splitting Prob-lem, Lemma 2.3.4 implies that the primal inclusion problem (P) and the dual in-clusion problem (D) can be solved by using the Douglas-Rachford algorithm.2.4 Application to proper, lower-semicontinuous convexfunctionsBefore we show the relationship between the primal dual inclusion problemsand the optimization problems, we must get the following results, which can befound in [4].Theorem 2.4.1. Let C be a convex subset of H, let K be a real Hilbert space,let L : H → K be linear and continuous, and let D be a convex subset of K. IfD ∩ intL(C) 6= ∅ or intD ∩ L(C) 6= ∅, then 0 ∈ int(D − L(C)).Proof. Suppose that y ∈ D ∩ intL(C). Then there exists an open ball B(y; r) ⊆L(C) for some r ∈ R++. ClearlyB(0; r) = y − B(y; r). (2.12)Since y ∈ D, B(y; r) ⊆ L(C), equation (2.12) implies B(0; r) ⊆ D − B(y; r).Therefore,0 ∈ int(D − L(C)).402.4. Application to proper, lower-semicontinuous convex functionsIn the case of intD ∩ L(C) 6= ∅, we let x ∈ intD ∩ L(C). Then there exists anopen ball B(x; r) ⊆ D for some r ∈ R++. ClearlyB(0; r) = x− B(x; r). (2.13)Since x ∈ L(C), B(x; r) ⊆ D, equation (2.13) implies B(0; r) ⊆ L(C) − D.Therefore,0 ∈ int(D − L(C)).Theorem 2.4.2. Let f ∈ Γ0(H), g ∈ Γ0(H). Then∂f∂g ⊆ ∂(fg).Proof. If (∂f∂g)(z) = ∅, clearly the inclusion holds. Assume (∂f∂g)(z) 6=∅. Let v ∈ (∂f∂g)(z). Since f ∈ Γ0(H), g ∈ Γ0(H), by Fact 1.3.27, ∂fand ∂g are maximally monotone. According to the definition of the parallel sumbetween operators (Definition 1.2.2), v ∈ (∂f∂g)(z) implies v ∈ ((∂f)−1 +(∂g)−1)−1(z), i.e., z ∈ ((∂f)−1 + (∂g)−1)(v). In other words, z ∈ (∂f)−1(v) +(∂g)−1(v).Let a1 ∈ (∂f)−1(v) and a2 ∈ (∂g)−1(v) such that z = a1 + a2. Then{v ∈ ∂f(a1),v ∈ ∂g(a2),sov ∈ ∂f(a1) ∩ ∂g(a2) (a1 + a2 = z).According to the Fact 1.3.52,v ∈ ∂f(a1) ∩ ∂g(a2)⇔ v ∈ ∂(fg)(z).Thus,∂f∂g ⊆ ∂(fg).Theorem 2.4.3. Let f ∈ Γ0(Rm), g ∈ Γ0(Rm). If dom f∗ ∩ int dom g∗ 6= ∅, then(1) 0 ∈ int(dom f∗ − dom g∗).(2) fg = f g ∈ Γ0(Rm).(3) ∂(fg) = ∂f∂g.412.4. Application to proper, lower-semicontinuous convex functionsProof. (1) Since f ∈ Γ0(Rm), g ∈ Γ0(Rm), by Fact 1.3.39 and Fact 1.3.40, f∗and g∗ are in Γ0(Rm). Thus, dom f∗ and dom g∗ are convex.Because dom f∗ ∩ int dom g∗ 6= ∅, due to Theorem 2.4.1,0 ∈ int(dom f∗ − dom g∗).(2) Using the result of (1) with the Fact 1.3.51 to complete the proof that fg =f g ∈ Γ0(Rm).(3) Since we proved fg = f g ∈ Γ0(Rm) above, according to Fact 1.3.40,fg = (fg)∗∗. Thus,∂(fg) = ∂(fg)∗∗= ∂[(fg)∗]∗. (2.14)Since fg ∈ Γ0(Rm), combine Fact 1.3.39 and Fact 1.3.40, we have(fg)∗ ∈ Γ0(Rm).Therefore, by Fact 1.3.46, ∂[(fg)∗]∗ = [∂(fg)∗]−1.By Fact 1.3.49, [∂(fg)∗]−1 = [∂(f∗+ g∗)]−1. As dom f∗ ∩ int dom g∗ 6=∅, the sum rule for subdifferentials (Fact 1.3.31) gives[∂(f∗ + g∗)]−1 = [∂f∗ + ∂g∗]−1. (2.15)Again, because f ∈ Γ0(Rm), g ∈ Γ0(Rm), by Fact 1.3.46,∂f∗ = (∂f)−1, ∂g∗ = (∂g)−1.Thogether with (2.15) yeilds[∂f∗ + ∂g∗]−1 = [(∂f)−1 + (∂g)−1]−1= ∂f∂g.Lemma 2.6. Let K be a real Hilbert space, let f ∈ Γ0(H), let g ∈ Γ0(K),and let L : H → K be a nonzero bounded linear invertible operator. Suppose[L(dom f) + b]∩ intdom g 6= ∅. Then ∂[f(x) + g(Lx+ b)] = ∂f(x) +L? ◦ ∂g ◦(Lx+ b).422.4. Application to proper, lower-semicontinuous convex functionsProof. Let h(x) = g(Lx+ b), which can impliesdomh = L−1(dom g − b).Since [L(dom f) + b] ∩ intdom g 6= ∅, there exists an x0 ∈ dom f such thatLx0 + b ∈ intdom g,i.e.,x0 ∈ L−1(intdom g − b) ⊂ domh.Because L−1(intdom g − b) is open, x0 ∈ intdomh. Therefore,x0 ∈ dom f ∩ intdomh.By Fact 1.3.31,∂(f + h) = ∂f + ∂h.Because h(x) = g(Lx+ b), we have∂h(x) = L? ◦ ∂g ◦ (Lx+ b)by Fact 1.3.30. Therefore,∂[f(x) + g(Lx+ b)] = ∂f(x) + L? ◦ ∂g ◦ (Lx+ b).Theorem 2.4.4. By the definition of the primal inclusion problem (P), with A =∂f,B = ∂g,D = ∂l, where f, g, l ∈ Γ0(Rm). Then(1) We obtain the primal inclusion problemfind x¯ ∈ Rm such that z ∈ ∂f(x¯) + L? ◦ (∂g∂l) ◦ (Lx¯− r) (2.16)(2) Every solution of (2.16) is also a solution of the optimization problemArgminx∈Rm{f(x) + ((gl)(Lx− r))− 〈z, x〉}. (2.17)(3) If in addition, [L(dom f)−r]∩intdom(gl) 6= ∅, and dom g∗∩int dom l∗ 6=∅, then (2.16) and (2.17) are equivalent.Proof. (1) By Fact 1.3.27, ∂f, ∂g and ∂l are maximally monotone. Thus, oncewe plug them into the dual problem (P) by lettingA = ∂f,B = ∂g,D = ∂l,we obtain equation (2.16).432.4. Application to proper, lower-semicontinuous convex functions(2) Now we show every solution x¯ of (2.16) is also a solution of (2.17).First, let’s move z from the left side of equation (2.16) to its right side. Weobtain0 ∈ ∂f(x¯) + L? ◦ (∂g∂l) ◦ (Lx¯− r)− ∂〈z, x¯〉. (2.18)Due to Theorem 2.4.2, since g, l ∈ Γ0(Rm), equation (2.18) implies0 ∈ ∂f(x¯) + L? ◦ ∂(gl) ◦ (Lx¯− r)− ∂〈z, x¯〉.Take v1 ∈ ∂f(x¯),v2 ∈ ∂(gl) ◦ (Lx¯− r),v3 ∈ ∂〈−z, x¯〉,such that v1 +L?v2 + v3 is a generic point in ∂f(x¯) +L? ◦ ∂(gl) ◦ (Lx¯−r) + ∂〈−z, x¯〉. By the definition of subdifferential, for all y ∈ Rm〈v1, y − x¯〉 ≤ f(y)− f(x¯),〈v2, Ly − r − (Lx¯− r)〉 ≤ (gl)(Ly − r)− (gl)(Lx¯− r),〈v3, y − x¯〉 ≤ 〈z, x¯〉 − 〈z, y〉.Due to 〈v2, Ly − r − (Lx¯− r)〉 = 〈L?v2, y − x¯〉, we have〈v1 + L?v2 + v3, y − x¯〉 ≤ f(y)− f(x¯) + [(gl)(Ly − r)− (gl)(Lx¯− r)]− 〈z, y〉+ 〈z, x¯〉= f(y) + (gl)(Ly − r)− 〈z, y〉− f(x¯)− (gl)(Lx¯− r) + 〈z, x¯〉.In turn,v1 + L?v2 + v3 ∈ ∂(f(x¯) + (gl)(Lx¯− r)− 〈z, x¯〉).Therefore,∂f(x¯)+L?◦∂(gl)◦(Lx¯−r)+∂〈−z, x¯〉 ⊆ ∂(f(x¯)+(gl)(Lx¯−r)−〈z, x¯〉).(2.19)Since 0 ∈ ∂f(x¯) + L? ◦ ∂(gl) ◦ (Lx¯− r)− ∂〈z, x¯〉, (2.19) implies that0 ∈ ∂(f(x¯) + (gl)(Lx¯− r)− 〈z, x¯〉). (2.20)By Fact 1.3.34, the x¯ which satisfies the inclusion (2.20) is also an elementof the setArgminx∈Rm{f(x) + ((gl)(Lx− r))− 〈z, x〉},and vice versa.442.4. Application to proper, lower-semicontinuous convex functions(3) It suffices prove∂f(x¯)+L?◦∂(gl)◦(Lx¯−r)+∂〈−z, x¯〉 = ∂(f(x¯)+∂(gl)(Lx¯−r)−〈z, x¯〉).(2.21)Because g, l ∈ Γ0(Rm), and dom g∗ ∩ int dom l∗ 6= ∅, using the Theorem2.4.3 we have∂f(x¯)+L?(∂g∂l)(Lx¯−r)−∂〈z, x¯〉 = ∂f(x¯)+L?∂(gl)(Lx¯−r)−∂〈z, x¯〉.Again, by using the same theorem, we have gl ∈ Γ0(Rm). Then we canapply the Lemma 2.6 to get the conclusion that∂f(x¯) + L? ◦ ∂(gl) ◦ (Lx¯− r) = ∂[f(x¯) + (gl)(Lx¯− r)]since we have the condition [L(dom f)− r] ∩ intdom(gl) 6= ∅. Asdom〈z, x〉 = Rm,we have[L(dom f)− r] ∩ intdom(gl) ∩ intdom〈z, x¯〉=[L(dom f)− r] ∩ intdom(gl) ∩ Rm=[L(dom f)− r] ∩ intdom(gl) 6= ∅.Thus, by Fact 1.3.31, we get equation (2.21).Theorem 2.4.5. By the definition of the dual problem (D), with A = ∂f,B =∂g,D = ∂l, where f, g, l ∈ Γ0(Rm). Then(1) We obtain the dual inclusion problemfind v¯ ∈ Rm such that (∃x¯ ∈ Rm){z − L?v¯ ∈ ∂f(x¯)v¯ ∈ (∂g∂l)(Lx¯− r). (2.22)(2) Every solution v¯ of (2.22) is also a solution of the the optimization problemArgminv∈Rm{(g∗ + l∗)(v) + f∗(z − L?v) + 〈r, v〉}. (2.23)(3) If in addition, [−L? dom(g∗ + l∗) + z] ∩ intdom(f∗) 6= ∅, and dom g∗ ∩int dom l∗ 6= ∅, then (2.22) and (2.23) are equivalent.452.4. Application to proper, lower-semicontinuous convex functionsProof. (1) By Fact 1.3.27, ∂f, ∂g and ∂l are maximally monotone. Thus, oncewe plug them into the dual problem (D) by lettingA = ∂f,B = ∂g,D = ∂l,we obtain equation (2.22).(2) Now we show every solution of (2.22) is also a solution of (2.23). Due toTheorem 2.4.2, since g, l ∈ Γ0(Rm),v¯ ∈ (∂g∂l)(Lx¯− r) implies v¯ ∈ [∂(gl)](Lx¯− r). (2.24)Moreover, by Fact 1.3.42, (2.24) implies Lx¯−r ∈ ∂((gl)∗)(v¯). In general,(2.22)⇔{x¯ ∈ ∂f∗ ◦ (z − L?v¯) (2.25a)Lx¯− r ∈ ∂((gl)∗)(v¯) (2.25b)Multiplying (2.25a) by L, we obtainLx¯ ∈ L ◦ ∂f∗ ◦ (z − L?v¯). (2.26)(2.25b)− (2.26)⇒ 0 ∈ ∂[(gl)∗](v¯) + ∂〈r, v¯〉 − L ◦ ∂f∗ ◦ (z − L?v¯).Since g ∈ Γ0(Rm) and l ∈ Γ0(Rm), by Fact 1.3.49, (gl)∗ = g∗ + l∗. Thatis,0 ∈ ∂(g∗ + l∗)(v¯)− L ◦ ∂f∗ ◦ (z − L?v¯) + ∂〈r, v¯〉.Take v1 ∈ ∂(g∗ + l∗)(v¯),v2 ∈ ∂f∗ ◦ (z − L?v¯),v3 ∈ ∂〈r, v¯〉,such that v1−Lv2 + v3 is a generic point in ∂(g∗+ l∗)(v¯)−L ◦ ∂f∗ ◦ (z−L?v¯) + ∂〈r, v¯〉. By the definition of subdifferential, for all y ∈ Rm〈v1, y − v¯〉 ≤ (g∗ + l∗)(y)− (g∗ + l∗)(v¯),〈v2, z − L?y − (z − L?v¯)〉 ≤ f∗(z − L?y)− f∗(z − L?v¯),〈v3, y − v¯〉 ≤ 〈r, y〉 − 〈r, v¯〉.Due to 〈v2, z − L?y − (z − L?v¯)〉 = 〈−Lv2, y − v¯〉, for all y ∈ Rm :〈v1 − Lv2 + v3, y − v¯〉 ≤ (g∗ + l∗)(y)− (g∗ + l∗)(v¯) + f∗(z − L?y)− f∗(z − L?v¯) + 〈r, y〉 − 〈r, v¯〉= (g∗ + l∗)(y) + f∗(z − L?y) + 〈r, y〉−((g∗ + l∗)(v¯) + f∗(z − L?v¯) + 〈r, v¯〉)462.4. Application to proper, lower-semicontinuous convex functionsin turn,v1 − Lv2 + v3 ∈ ∂((g∗ + l∗)(v¯) + f∗(z − L?v¯) + 〈r, v¯〉).Therefore,∂(g∗ + l∗)(v¯)− L ◦ ∂f∗ ◦ (z − L?v¯) + ∂〈r, v¯〉⊆ ∂((g∗ + l∗)(v¯) + f∗(z − L?v¯) + 〈r, v¯〉). (2.27)Since 0 ∈ ∂(g∗+ l∗)(v¯)−L ◦ ∂f∗ ◦ (z−L?v¯) + ∂〈r, v¯〉, (2.27) implies that0 ∈ ∂((g∗ + l∗)(v¯) + f∗(z − L?v¯) + 〈r, v¯〉). (2.28)By Fact 1.3.34, the v¯ which satisfies the inclusion (2.28) is also an elementof the setArgminv∈Rm{(g∗ + l∗)(v) + f∗(z − L?v) + 〈r, v〉},and vice versa.(3) It suffices to prove∂(g∗ + l∗)(v¯)− L ◦ ∂f∗ ◦ (z − L?v¯) + ∂〈r, v¯〉= ∂((g∗ + l∗)(v¯) + f∗(z − L?v¯) + 〈r, v¯〉). (2.29)Because g, l ∈ Γ0(Rm), and dom g∗ ∩ int dom l∗ 6= ∅, using the Theorem1.3.41 we have (g∗ + l∗) ∈ Γ0(Rm). Then we can apply the Lemma 2.6 toget the conclusion that∂(g∗ + l∗)(v¯)− L ◦ ∂f∗ ◦ (z − L?v¯) = ∂((g∗ + l∗)(v¯) + f∗(z − L?v¯))since we have the condition [−L? dom(g∗+ l∗) + z]∩ intdom(f∗) 6= ∅. Asdom〈r, v〉 = Rm, we have[−L? dom(g∗ + l∗) + z] ∩ intdom(f∗) ∩ intdom〈r, v¯〉=[−L? dom(g∗ + l∗) + z] ∩ intdom(f∗) ∩ Rm=[−L? dom(g∗ + l∗) + z] ∩ intdom(f∗) 6= ∅.Thus, by Fact 1.3.31, we get equation (2.29).47Chapter 3Douglas-Rachford algorithm witha changed parameter3.1 OverviewAs we saw in the previous chapter, the classic Douglas-Rachford algorithmhas a parameter 2 in its iteration. That parameter gives me an inspiration: what ifwe changed the value of that parameter? In this chapter, a new algorithm whichbased on the classic Douglas-Rachford algorithm is constructed. I will prove theproperties of this algorithm and then show that it can be applied on the compositedmonotone inclusion problems and on the optimization problems.3.2 α-Douglas-Rachford algorithm, with parameterα ∈ [1, 2)First, let’s see some theorems and facts.Theorem 3.2.1. If A and B are maximally monotone operators fromH to 2H, and0 ∈ int(domA− domB), then zer(A+B + γ Id) 6= ∅ when γ ∈ R++.Proof. As A and B are maximally monotone operators and 0 ∈ int(domA −domB), according to Fact 1.2.28, A + B is a maximally monotone operator. ByFact 1.2.29, 1γ (A+B) is also maximally monotone. Let A¯ =1γ (A+B).Accordingto Fact 1.2.32,ran(A¯+ Id) = H ⇒ 0 ∈ ran(A¯+ Id),Then, zer(A¯+ Id) 6= ∅. Becausezer(A¯+ Id) = zer[γ(A¯+ Id)],= zer(A+B + γ Id),we have zer(A+B + γ Id) 6= ∅.483.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)Fact 3.2.2. [4, Corollary 26.8](see also [2, Corollary 3]) Let m be an integer suchthat m ≥ 2, set I = {1, . . . ,m}, and let(Ai)i∈I : H → 2Hbe maximally monotone operators. For every i ∈ I , let (xi,n, ui,n)+∞n=1 be a se-quence in graAi and let (xi, ui) ∈ H ×H. Suppose that∑i∈Iui,n → 0 and (∀i ∈ I)xi,n ⇀ xiui,n ⇀ uimxi,n −∑j∈I xj,n → 0.Then there exists x ∈ zer∑i∈I Ai such that the following hold:(i) x = x1 = · · · = xm.(ii)∑i∈I ui = 0.(iii) (∀i ∈ I) (x, ui) ∈ graAi.(iv)∑i∈I〈xi,n, ui,n〉 → 〈x,∑i∈I ui〉 = 0.Theorem 3.2.3. Let A be a maximally monotone operator from K to 2K, let α ∈[1, 2). Define RαA = αJA − Id . Then RαA is nonexpansive.Proof. Let x, y ∈ K. We derive an equivalent criterion for nonexpansivity:RαA is nonexpansive⇔‖RαAx−RαAy‖ ≤ ‖x− y‖⇔‖(αJA − Id)x− (αJA − Id)y‖ ≤ ‖x− y‖⇔‖α(JAx− JAy)− (x− y)‖ ≤ ‖x− y‖⇔‖α(JAx− JAy)− (x− y)‖2 ≤ ‖x− y‖2⇔〈α(JAx− JAy)− (x− y), α(JAx− JAy)− (x− y)〉 ≤ ‖x− y‖2⇔α2‖(JAx− JAy)‖2 + ‖x− y‖2 − 2α〈JAx− JAy, x− y〉 ≤ ‖x− y‖2⇔α2‖(JAx− JAy)‖2 ≤ 2α〈JAx− JAy, x− y〉⇔α2‖(JAx− JAy)‖2 ≤ 〈JAx− JAy, x− y〉Since A is maximally monotone, it follows from Fact 1.2.38 that JA is firmlynonexpansive. According to Fact 1.2.21, we have(∀x ∈ H) (∀y ∈ H) ‖JAx− JAy‖2 ≤ 〈x− y, JAx− JAy〉.493.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)As α ∈ [1, 2], α2 ∈ [12 , 1],α2‖(JAx− JAy)‖2 ≤ ‖JAx− JAy‖2 ≤ 〈JAx− JAy, x− y〉.Thus, RαA is nonexpansive.Remark 3.1. Theorem 3.2.3 holds whenever 0 ≤ α ≤ 2.Similarly, if we let B be a maximally monotone operator from K to 2K, anddefine RαB = αJB − Id, RαB is nonexpansive.Theorem 3.2.4. Let α ∈ [1, 2), let A, B be maximally monotone operators fromK to 2K, and 0 ∈ int(domA− domB). Let T = RαA ◦RαB . Then(1) T is nonexpansive;(2) JB(FixT ) = zer(A+B + (2− α) Id);(3) FixT 6= ∅.Proof. (1) According to Theorem 3.2.3, both RαA and RαB are nonexpansive.Thus, for any x, y ∈ K,‖Tx− Ty‖ = ‖RαA ◦RαBx−RαA ◦RαBy‖≤ ‖RαBx−RαBy‖≤ ‖x− y‖,that is, T is nonexpansive.(2) Since A+B is maximally monotone, according to Theorem 3.2.1,zer(A+B + (2− α) Id) 6= ∅.Consider an arbitrary x ∈ K from this set, i.e.,0 ∈ Ax+Bx+ (2− α)x.Therefore, there exists y ∈ K such thatx− y ∈ Ax+ (2− α)x and y − x ∈ Bx,which is equivalent to(α− 1)x− y ∈ Ax and x = JBy.503.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)Thus,αJBy − y ∈ A ◦ JBy + JBy.which impliesJBy = JA(αJBy − y).Therefore,0 = αJA(αJBy − y)− αJBy⇔y = αJA(αJBy − y)− (αJBy − y)⇔y = (αJA − Id) ◦ (αJB − Id)y⇔y = RαA(RαBy).Note that x = JBy and x ∈ zer(A + B + (2 − α) Id). Consequently, wehaveJB(FixT ) ∈ zer(A+B + (2− α) Id).Since 2− α > 0, A+B + (2− α) Id is strictly monotone, we obtain thatzer(A+B + (2− α) Id)is a singleton by using Fact 1.2.27. Hence,JB(FixT ) = zer(A+B + (2− α) Id).(3) In the proof of (2), we have y ∈ FixT, so FixT 6= ∅.The following result is well-known.Lemma 3.2. Consider the Douglas-Rachford algorithm with λn = 1 for all n andγ = 1. Then (DR) becomesyn = JBxn,zn = JA(2yn − xn),xn+1 = xn + (zn − yn).This algorithm can also be written asxn+1 = xn +12(RA ◦RBxn − xn),in terms ofxn+1 = DA,B(xn), (3.1)whereDA,B =Id +RA ◦RB2=12Id +12RA ◦RB.513.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)Proof.xn+1 = xn + (zn − yn)= xn + JA(2JBxn − xn)− JBxn=xn + (xn − 2JBxn) + 2JA(2JBxn − xn)2=xn −RBxn + 2JA(RBxn)2=xn +RA ◦RBxn2=Id +RA ◦RB2xnThat is, xn+1 = xn + 12(RA ◦RBxn − xn). Hence, (3.1) holds.We now introduce α-Douglas-Rachford algorithm.Lemma 3.3. Changing the parameter 2 of the algorithm (DR) into α, where α ∈[1, 2), we propose the α-DR algorithmyn = JγBxnzn = JγA(αyn − xn)xn+1 = xn + λn(zn − yn).(α-DR)If we keep the assumption that λn = 1 for all n and γ = 1, then following holds:(1) (α-DR) can also be written asxn+1 = xn +1α(RαA ◦RαBxn − xn),in terms ofxn+1 = DαA,B(xn), (3.2)whereDαA,B = (1−1α) Id +1αRαA ◦RαB.(2) DαA,B is an averaged mapping.523.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)Proof. (1) According to the definition of the α-DR algorithm,xn+1 = xn + (JA(αyn − xn)− yn)= xn + (JA(αJBxn − xn)− JBxn)=αxn − αJBxn + αJA(αJBxn − xn)α=(α− 1)xn + xn − αJBxn + αJA(αJBxn − xn)α=(α− 1)xn + (αJA − Id) ◦ (αJB − Id)xnα=(α− 1)xn +RαA ◦RαBxnα= (1− 1α)xn +1αRαARαBxn.It follows thatxn+1 =[(1− 1α) Id +1αRαA ◦RαB]xn.So (3.2) holds.(2) Because RαA ◦ RαB is nonexpansive, as 1 ≤ α < 2, DαA,B is an averagedoperator.Remark 3.4. [4, Remark 4.34] Let D be a nonempty subset ofH, let T : D → H.(1) If T is averaged, then it is nonexpansive.(2) If T is nonexpansive, it is not necessarily averaged: consider T = − Id :H → H whenH 6= {0}.(3) T is firmly nonexpansive if and only if it is 1/2-averaged.Theorem 3.2.5. Let α ∈ [1, 2), let A and B be maximally monotone operatorsfrom K to 2K with 0 ∈ int(domA− domB). Let λn = 1 for all n, let γ = 1, andlet x0 ∈ Rm. Set yn = JBxnzn = JA(αyn − xn)xn+1 = xn + (zn − yn).(3.3)Then there exists x ∈ FixRαA ◦RαB such that the following hold:(1) JBx = zer(A+B + (2− α) Id). Moreover, the answer is unique.533.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)(2) (yn − zn)+∞n=1 converges to 0.(3) (xn)+∞n=1 converges to x.(4) (yn)+∞n=1 converges to JBx.(5) (zn)+∞n=1 converges to JBx.Proof. (1) Let T = RαA ◦ RαB. According to Theorem 3.2.4, FixT 6= ∅. Thenfor any x ∈ FixT , we have x = RαA(RαBx), and this together with RαA =αJA − Id and RαB = αJB − Id, yields that(α− 1)JBx− x ∈ AJBx,that is,JBx− x ∈ AJBx+ (2− α)JBx.Thus,0 ∈ AJBx+ (2− α)JBx+ (x− JBx). (3.4)By the definition of the resolvent, we havex ∈ (B + Id)JBx,and so,x− JBx ∈ B ◦ JBx.Combining with (3.4), one has0 ∈ AJBx+ (2− α)JBx+B ◦ JBx,that is0 ∈ [A+B + (2− α) Id] ◦ JBx.Consequently, we haveJBx ∈ zer(A+B + (2− α) Id).Since 2− α > 0, A+B + (2− α) Id is strictly monotone, we obtain thatzer(A+B + (2− α) Id)is a singleton by using Fact 1.2.27. ThereforeJBx = zer(A+B + (2− α) Id).543.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)(2) From (3.3), it follows thatzn − yn = JA(αyn − xn)− JBxn= JA(αJBxn − xn)− JBxn=αJA(αJBxn − xn)− αJBxnα=αJA(αJBxn − xn)− (αJBxn − xn)− xnα=αJA(RαBxn)−RαBxn − xnα=RαA ◦RαBxn − xnα=1α(Txn − xn).Thus, xn+1 = xn+ 1α(Txn−xn).By Fact 1.2.22(1), we have Txn−xn → 0.Therefore, zn − yn → 0.(3) Since 1 ≤ α < 2, we can apply Fact 1.2.22(2) to complete this proof.(4) It follows from (3.3) and JB = (Id +B)−1 that xn ∈ (B + Id)yn, i.e.,xn − yn ∈ Byn. Hence, (yn, xn − yn) ∈ graB.Similarly, (zn, αyn − xn − zn) ∈ graA. Set∀n ∈ N, vn := xn − yn, wn := αyn − xn − zn. (3.5)Then we have (zn, wn) ∈ graA(yn, vn) ∈ graBwn + vn = (α− 1)yn − zn(3.6)Since result (3) tells us that (xn)+∞n=1 converges to x, (xn)+∞n=1 is bounded.According to the algorithm (3.3), yn = JBxn. By Fact 1.2.40 (1), JA andJB are firmly nonexpansive. Thus,∀n ∈ N ‖yn − y0‖ = ‖JBxn − JBx0‖ ≤ ‖xn − x0‖,which implies (yn)+∞n=1 is bounded.Then there exists a subsequence (ynk)+∞k=1 of (yn)+∞n=1 such that (ynk)+∞k=1is a convergent sequence. Suppose ynk → y¯, as zn − yn → 0 has beenproved, there exists a corresponding subsequence (znk)+∞k=1 of (zn)+∞n=1 suchthat zkn → y¯. By (3.5), we have vnk → x− y¯ and wnk → (α− 1)y¯ − x.553.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)Let’s set A1 := A + (2 − α) Id, A2 := B. It follows from Fact 1.2.28 thatA1 and A2 are maximally monotone. Since(zn, wn) ∈ graA and (yn, vn) ∈ graBas showed in (3.6), we have{(znk , wnk + (2− α)znk) ∈ graA1(ynk , vnk) ∈ graA2,with znk → y¯, ynk → y¯, wnk + (2 − α)znk → y¯ − x, vnk → x − y¯, znk −ynk → 0, ynk − znk → 0. Again, from (3.5), we havewnk + (2− α)znk + vnk = αynk − xnk − znk + (2− α)znk + xnk − ynk= (α− 1)(ynk − znk)→ 0This combining with Fact 3.2.2, yields that there exists a ∈ zer(A1 + A2)such thata = y¯ and{(a, y¯ − x) ∈ graA1,(a, x− y¯) ∈ graA2.That is, {(y¯, y¯ − x) ∈ gra(A+ (2− α) Id),(y¯, x− y¯) ∈ graB.In view of (y¯, y¯ − x) ∈ gra(A+ (2− α) Id), one hasy¯ − x ∈ [A+ (2− α)]y¯.which is equivalent to(α− 1)y¯ − x ∈ Ay¯.In view of (y¯, x− y¯) ∈ graB, one hasy¯ = JBx.Together, we obtainy¯ = JBx and y¯ ∈ domA.Thus, JBx is a sequential cluster point of (yn)+∞n=1. Because JBx emergesas the limit of every convergent subsequence for (yn)+∞n=1, we conclude thatJBx is the unique sequential cluster point. Since yn is bounded and has onlyone sequential cluster point, by Fact 1.1.19, yn → JBx.563.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)(5) Combining result (2) and result (4), we have zn = (zn−yn)+yn → 0+JBx,i.e., zn → JBx.Remark 3.5. The proof of (4) works in a Hilbert space by replacing strong con-vergence with weak convergence in appropriate places. In particular, we haveyn ⇀ JBx.Remark 3.6. In Rm, the proof of (4) is simple by using that JB is Lipschitz con-tinuous:limn→+∞ yn = limn→+∞ JB(xn) = JBx.Remark 3.7. The application of the α-Douglas-Rachford algorithm is wider thanthe classic Douglas-Rachford algorithm since there is a requirement on the clas-sic one that zer(A + B) 6= ∅, while the α-Douglas-Rachford algorithm doesnot need a strict condition like that. Because according to Theorem 3.2.1, once0 ∈ int(domA− domB), zer[A+B + (2− α) Id] 6= ∅ for sure.3.2.1 Application to composited monotone inclusion problemsDefinition 3.2.6. Let B : H → 2H, D : H → 2H. Define B β D = (B−1 +D−1 + β Id)−1, where β ∈ R.Lemma 3.2.7. Let B : Rm → 2Rm and D = β2 Id with β1, β2 ∈ R++. Then(Bβ1 D) = B(β21 + β1β2Id).Proof. Since D = β2 Id,(Bβ1 D) =[B−1 + (β2 Id)−1 + β1 Id]−1= [B−1 +1 + β1β2β2Id]−1.Therefore,(Bβ1 D) = B(β21 + β1β2Id).Lemma 3.2.8. Let B : Rm → 2Rm , D = N{0}, and β ∈ R++. Then(Bβ D) = B(1βId).573.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)Proof. Since D = N{0},(Bβ D) = (B−1 +N{0}−1 + β Id)−1. (3.7)According to Lemma 1.1.35, N{0}−1y = 0, for any y ∈ Rm. Thus,(B−1 +N{0}−1 + β Id)−1 = (B−1 + β Id)−1Therefore, (3.7) is equivalent to(Bβ N{0}) = B(1βId).Now suppose A : Rm → 2Rm , B : Rm → 2Rm and D : Rm → 2Rm aremaximally monotone operators, and L : Rm → Rm is a nonzero linear invertibleoperator.Also recall that M,Q, S, V,A, and B : Let K = Rm × Rm, τ, σ ∈ R++, andτσ‖L‖2 < 4.M : K → 2K : (x, v) 7→ (−z +Ax, r +B−1v);(M)Q : K → 2K : (x, v) 7→ (0, D−1v);(Q)S : K → K : (x, v) 7→ (L?v,−Lx);(S)V : K → K : (x, v) 7→ (xτ− 12L?v,vσ− 12Lx);(V)A := V −1(12S +Q);(A)B := V −1(12S +M).(B)Theorem 3.2.9. Suppose M,Q, S, V,A, and B are constructed by (M), (Q), (S),(V), (A) and (B) respectively. Let α ∈ [1, 2). Then the following two inclusionproblems are equivalent:(1) Find (x, v) ∈ Rm × Rm such that (x, v) ∈ zer(A+B + (2− α) Id).(2) Solve the problem with primal inclusion: find x ∈ Rm such thatz ∈ Ax+ 2− ατx+α4− αL? ◦ (B2−ασ D) ◦ (Lx− r) (3.8)583.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)whereL = 4−α2 L, τ ∈ R++ and σ ∈ R++, together with the dual inclusion:find v such that there exists an x ∈ Rm that z −α4−αL?v ∈ Ax+ (2−α)τ xv ∈ (B2−ασ D) ◦ (Lx− r).(3.9)Proof. By the definitions of M,Q, S, V,A, and B, and step 3 of the proof ofLemma 2.3.4, we have B := V −1(12S + M) and A := V−1(12S + Q). Notethatzer(A+B + (2− α) Id) = zer(V −1(M + S +Q) + (2− α) Id)= zer(V −1(M + S +Q+ (2− α)V ))= zer(M + S +Q+ (2− α)V ))For all (x, v) ∈ zer(M + S +Q+ (2− α)V )), we have(0, 0) ∈(M + S +Q+ (2− α)V )(x, v)=(−z +Ax+ L?v + (2− α)xτ− 2− α2L?v,r +B−1v +D−1v − Lx+ (2− α) vσ− 2− α2Lx).That is {0 ∈ −z +Ax+ L?v + (2− α)xτ − 2−α2 L?v0 ∈ r +B−1v +D−1v − Lx+ (2− α) vσ − 2−α2 Lx.⇔{0 ∈ −z +Ax+ (2− α)xτ + α2L?v4−α2 Lx− r ∈ B−1v +D−1v + (2−α)σ v.According to Definition 3.2.6, 4−α2 Lx−r ∈ B−1v+D−1v+ (2−α)σ v can be writtenasv ∈ (B2−ασ D) ◦ (4− α2Lx− r).Thus, we have 0 ∈ −z +Ax+(2−α)τ x+α2L?vv ∈ (B2−ασ D) ◦ (4−α2 Lx− r).Since L = 4−α2 L, L? = 24−αL?. Then we havez ∈ Ax+ 2− ατx+α4− αL? ◦ (B2−ασ D) ◦ (Lx− r).593.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)Therefore, for any (x, v) ∈ zer(A+B + (2− α) Id), x is also the solution of theprimal inclusion problem (3.8), and v is the solution of the dual inclusion problem(3.9).Theorem 3.2.10. Suppose M,Q, S, V,A, andB are constructed by (M), (Q), (S),(V), (A) and (B) respectively. Let α ∈ [1, 2). The inclusion problem (3.8) togetherwith inclusion problem (3.9) can be solved by using α-Douglas-Rachford algo-rithm if domD−1 = Rm. In particular, domD−1 = Rm if one of the followingholds:(1) D = N{0}.(2) D = Id .Proof. Because domD−1 = Rm, we get domQ = K. Since domS = K, weconclude thatdom(12S +Q) = K.As Lemma 2.3.4 shows V is invertible (one-to-one, onto), one hasdomA = domV −1(12S +Q)= dom(12S +Q)=K.ThendomA− domB = K.Therefore, we get0 ∈ int(domA− domB).As Lemma 2.3.4 also showsA andB are maximally monotone, by Theorem 3.2.9and Theorem 3.2.5, the composited monotone inclusion problem (3.8) togetherwith inclusion problem (3.9) can be solved by using α-Douglas-Rachford algo-rithm.3.2.2 The application to proper, lower-semicontinuous convexfunctionsAs f, g, l ∈ Γ0(Rm), we consider the maximal monotone operators: A =∂f,B = ∂g,D = ∂l. Thus the primal inclusion problem:find x ∈ R such that z ∈ Ax+ 2− ατx+α4− αL? ◦ (B2−ασ D) ◦ (Lx− r)603.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)is equivalent to the following inclusion problem:find x ∈ R such that z ∈ ∂f(x) + 2− ατx+α4− αL? ◦ (∂g2−ασ ∂l) ◦ (Lx− r).Theorem 3.2.11. Let f, g, l ∈ Γ0(Rm), let L be a nonzero linear invertible op-erator, let z, r ∈ Rm, let τ ∈ R++ and σ ∈ R++, and α ∈ [1, 2). If dom g∗ ∩intdom l∗ 6= ∅, then the following primal inclusion problemFind x ∈ Rm such that z ∈ ∂f(x)+ 2− ατx+α4− αL? ◦ (∂g2−ασ ∂l)◦ (Lx−r)(3.10)can be characterized by the following optimization problemArgmin{f(·)+ 2− α2τ‖ ·‖2 + α4− α [(g l) (σ2(2− α)‖ ·‖2)](L ·−r)−〈z, ·〉}.(3.11)Proof. According to Definition 3.2.6,∂g2−ασ ∂l = [(∂g)−1 + (∂l)−1 +2− ασId]−1.Since g, l ∈ Γ0(Rm), by Fact 1.3.46,[(∂g)−1 + (∂l)−1 +2− ασId]−1 = (∂g∗ + ∂l∗ +2− ασId)−1.Because dom g∗ ∩ intdom l∗ 6= ∅, we can use the sum rule for subdifferentials(Fact 1.3.31) to get(∂g∗ + ∂l∗ +2− ασId)−1 = (∂(g∗ + l∗) +2− ασId)−1.Again, since g, l ∈ Γ0(Rm) and dom g∗ ∩ intdom l∗ 6= ∅, Theorem 1.3.41 impliesthat g∗ + l∗ ∈ Γ0(Rm). Therefore, we can use Fact 1.3.32 to get(∂(g∗ + l∗) +2− ασId)−1 = [∂(g∗ + l∗ +2− α2σ‖ · ‖2)]−1,which is equivalent to∂(g∗ + l∗ +2− α2σ‖ · ‖2)∗ (3.12)as g∗ + l∗ + 2−α2σ ‖ · ‖2 ∈ Γ0(Rm). By using the same method, as f ∈ Γ0(Rm), wehave∂f +2− ατId = ∂(f +2− α2τ‖ · ‖2). (3.13)613.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)Obviously, dom ‖·‖2 = Rm and ‖·‖2 ∈ Γ0(Rm). Thus, we have 0 ∈ int[dom(g∗+l∗)− dom 2−α2σ ‖ · ‖2].By Fact 1.3.50, equation (3.12) can be rewritten as ∂[(g∗ + l∗)∗ (2−α2σ ‖ · ‖2)∗].Again, by using the Theorem 2.4.1 and Fact 1.3.50 with the reason that dom g∗ ∩intdom l∗ 6= ∅, the set in line (3.12) can be written as∂[(g∗∗ l∗∗) (2− α2σ‖ · ‖2)∗].Because g, l ∈ Γ0(Rm), by Fact 1.3.40, g∗∗ = g and l∗∗ = l. Moreover, as Exam-ple 1.3.38 says (2−α2σ ‖ · ‖2)∗ = ( σ2(2−α)‖ · ‖2), expression (3.12) finally equals∂[(g l) ( σ2(2− α)‖ · ‖2)]. (3.14)Combining the result of (3.13) and (3.14), and moving z to the end of the righthand side, the inclusion problem showed by (3.10) becomes0 ∈ ∂(f+2− α2σ‖·‖2)+ α4− αL?◦∂[(gl)( σ2(2− α)‖·‖2)]◦(Lx−r)+∂〈−z, x〉(3.15)Since (g l) σ2(2−α)‖ · ‖2 can be considered as a Moreau envelope with thefunction g l and λ = 2−ασ . Thus, by using Fact 1.3.55, we havedom[(g l) σ2(2− α)‖ · ‖2] = Rm.For f + 2−α2τ ‖ · ‖2, we havedom(f +2− α2τ‖ · ‖2) = dom f 6= ∅.Since L is a nonempty linear operator and r ∈ Rm, once dom f 6= ∅,[Ldom(f +2− α2τ‖ · ‖2)]− r 6= ∅. (3.16)Combining the result that dom[(g l) σ2(2−α)‖ · ‖2] = Rm with (3.16), we have[Ldom(f +2− α2τ‖ · ‖2)− r] ∩ intdom[(g l) σ2(2− α)‖ · ‖2] 6= ∅. (3.17)Due to Lemma 2.6∂(f +2− α2τ‖ · ‖2) + α4− αL? ◦ ∂[(g l) ( σ2(2− α)‖ · ‖2)] ◦ (L · −r)=∂{f + 2− α2τ‖ · ‖2 + α4− α [(g l) (σ2(2− α)‖ · ‖2)] ◦ (L · −r)} (3.18)623.2. α-Douglas-Rachford algorithm, with parameter α ∈ [1, 2)Combining with the fact that intdom〈−z, x〉 = Rm, we get∂{f + 2− α2τ‖ · ‖2 + α4− α [(g l) (σ2(2− α)‖ · ‖2)] ◦ (L · −r)}+ ∂〈−z, x〉=∂{f + 2− α2τ‖ · ‖2 + α4− α [(g l) (σ2(2− α)‖ · ‖2)] ◦ (L · −r) + 〈−z, ·〉}(3.19)by Fact 1.3.31. Combining the result of (3.15), (3.18) and (3.19), we get0 ∈ ∂{f + 2− α2τ‖ · ‖2 + α4− α [(g l) (σ2(2− α)‖ · ‖2)]◦ (L ·−r)−〈z, ·〉}(x).(3.20)By Fact 1.3.34, the inclusion (3.20) is equivalent to the statement that x belongs toArgmin{f(·)+ 2− α2τ‖·‖2 + α4− α [(g l)(σ2(2− α)‖·‖2)]◦(L ·−r)−〈z, ·〉},which is exactly the optimization problem (3.11). Thus, the primal inclusion prob-lem (3.10) is equivalent to the optimization problem (3.11).Remark 3.8. Combining Theorem 3.2.11 and Theorem 3.2.10, the optimizationproblem (3.11) can be solved by using α-Douglas-Rachford algorithm if we havedom(∂l)−1 = Rm. This holds when l = ι{0} or l = 12‖ · ‖2.63Chapter 4Special cases of the compositedmonotone inclusion problems andthe double regularization4.1 OverviewIn this chapter, we still let A : Rm → 2Rm , B : Rm → 2Rm and D : Rm →2Rmbe maximally monotone operators. Let z, r ∈ Rm and α ∈ [1, 2), let L :Rm → Rm be a nonzero linear invertible operator, let τ, σ ∈ R++ and τσ‖L‖2 <4. Setting the α inclusion problem in the general case: find x ∈ Rm such thatz ∈ Ax+ 2− ατx+α4− αL?(B2−ασ D)(Lx− r), (4.1)where L = 4−α2 L. The general case of the α inclusion problem (4.1) is too com-plex to get some direct result. In this chapter, we will consider the α inclusionproblem (4.1) in some special cases, and then get some particular results. In thosespecial cases, operator D satisfies domD−1 = Rm, that means, all of the spe-cial cases that will be showed below can be solved by using α-Douglas-Rachfordalgorithm.4.2 The special casesTheorem 4.2.1. Let a ∈ R++,L = a Id, D = Id, z = 0, r = 0. Then(1) The α inclusion problem (4.1) becomes: find x ∈ Rm such that0 ∈ Ax+ 2− ατx+αa4− α [B(σ2− α+ σ Id)](ax). (4.2)(2) If in addition, A = ∂f,B = ∂g, where f, g ∈ Γ0(Rm), then the primalinclusion problem (4.2) can be characterized by the following optimization644.2. The special casesproblemArgminx∈Rm{f(x) +2− α2τ‖x‖2 + α4− α(gγ2‖ · ‖2)(ax)} (4.3)where γ = σ2−α+σ .Proof. (1) Since L = a Id, z = 0, r = 0, the α inclusion problem (4.1) be-comes: find x ∈ Rm such that0 ∈ Ax+ 2− ατx+αa4− α(B2−ασ D)(ax), (4.4)Since D = Id, according to Lemma 3.2.7,(B2−ασ D) = B(σ2− α+ σ Id).Thus, (4.1) is equivalent to0 ∈ Ax+ 2− ατx+αa4− α [B(σ2− α+ σ Id)](ax)(2) Let γ = σ2−α+σ , since ∂f = A, ∂g = B, (4.2) is equivalent to0 ∈ ∂f(x) + 2− ατx+αa4− α(∂gγ Id)(ax). (4.5)Because f ∈ Γ0(Rm), we have∂f(x) +2− ατx = ∂[f(x) +2− α2τ‖x‖2]by Example 1.3.32. Moreover, since ∂(γf) = γ∂f for any γ ∈ R++,combining this with the Example 1.3.24, we haveγ Id = γ∂(12‖x‖2) = ∂(γ2‖x‖2).Therefore, (4.5) becomes0 ∈ ∂[f(x) + 2− α2τ‖x‖2] + αa4− α [∂g∂(γ2‖ · ‖2)](ax). (4.6)654.2. The special casesAs shown in Example 1.3.38, (γ2‖x‖2)∗ = ‖x‖24γ/2 =‖x‖22γ . Because g ∈Γ0(Rm), by Fact 1.3.40, dom g∗ 6= ∅. Therefore,dom g∗ ∩ intdom(γ2‖ · ‖2)∗ = dom g∗ ∩ intdom(‖ · ‖22γ)= dom g∗ ∩ Rm= dom g∗ 6= ∅.Combing the above result with the fact that g ∈ Γ0(Rm), we get∂g∂(γ2‖ · ‖2) = ∂(gγ2‖ · ‖2)by Fact 2.4.3.Again, we use the fact ∂(γf) = γ∂f, (4.6) is equivalent to0 ∈ ∂[f(x) + 2− α2τ‖x‖2] + a∂[ α4− α(gγ2‖ · ‖2)](ax). (4.7)Because (gγ2‖ · ‖2) can be considered as a Moreau envelope with the func-tion g and λ = 1γ . Thus, by using Fact 1.3.55, we havedom(gγ2‖ · ‖2) = Rm.Thus,adom[f(·) + 2− α2τ‖ · ‖2] ∩ intdom(gγ2‖ · ‖2)=adom f ∩ Rm=adom f 6= ∅.Thus, by Fact 1.3.29, (4.7) is equivalent to0 ∈ ∂[f(x) + 2− α2τ‖x‖2 + α4− α(gγ2‖ · ‖2)(ax)]. (4.8)Due to Fact 1.3.34, finding x such that0 ∈ ∂[f(x) + 2− α2τ‖x‖2 + α4− α(gγ2‖ · ‖2)(ax)]is equivalent to x being a solution ofArgminx∈Rm{f(x) +2− α2τ‖x‖2 + α4− α(gγ2‖ · ‖2)(ax)}Therefore, we completed the proof.664.2. The special casesTheorem 4.2.2. Let a ∈ R++,L = a Id, D = N{0}, z = 0, r = 0. Then(1) the α inclusion problem (4.1) becomes: find x ∈ Rm such that0 ∈ Ax+ 2− ατx+αa4− α(Bσ2− α Id)(ax). (4.9)(2) If in addition, A = ∂f,B = ∂g, where f, g ∈ Γ0(Rm), then the primalinclusion problem (4.9) can be characterized by the following optimizationproblemArgminx∈Rm{f(x) +2− α2τ‖x‖2 + α4− α(gγ2‖ · ‖2)(ax)}, (4.10)where γ = σ2−α .Proof. (1) Since L = a Id, z = 0, r = 0, the α inclusion problem (4.1) be-comes: find x ∈ Rm such that0 ∈ Ax+ 2− ατx+αa4− α(B2−ασ D)(ax), (4.11)Since D = N{0}, by Lemma 3.2.8(B2−ασ N{0}) = Bσ2− α Id .Thus, (4.11) is equivalent to0 ∈ Ax+ 2− ατx+αa4− α(Bσ2− α Id)(ax).(2) Since the only difference between (4.2) and (4.9) is the parameter of theidentity function. Thus, as we proved in 4.2.1(2), if we let γ = σ2−α here,the primal inclusion problem (4.9) can be characterized by the followingoptimization problemArgminx∈Rm{f(x) +2− α2τ‖x‖2 + α4− α(gγ2‖ · ‖2)(ax)}.674.3. Double regularization: Moreau regularization and Tychonov regularization combined4.3 Double regularization: Moreau regularization andTychonov regularization combinedLet f, g : Rm → R be proper, lower-semicontinuous and convex. Often wemust find the minimizer of f + g. The functions f, g might not be differentiable.We can first make one of the function differentiable by using the Moreau envelope.In order to obtain the least norm solution, we can do Tychonov regularization.Therefore, we can consider minimization of the following function:f + β1 q +β2[g(β3 q)] (4.12)where βi > 0 (i ∈ {1, 2, 3}), and q = 12‖ · ‖2. Put h = f + β2[g(β3 q)]. Whileg(β3 q) is the Moreau regularization with the function being g and λ = 1β3 , andh + β1 q is the Tychonov regularization. Therefore, Problem (4.12) is a doubleregularization.Remark 4.1. In (4.12), we usually require β1 ↓ 0, β2 → 1, and β3 ↓ 0.4.3.1 Subdifferential of infimal convolutionsLet q(x) = 12‖x‖2.Corollary 4.3.1. Let f : Rm → (−∞,+∞] be proper, lower semicontinuous andconvex. Then for any β3 ∈ R++,∂[f(β3 q)] = ∂fβ3 Id = [1β3Id +(∂f)−1]−1.Proof. By Fact 1.3.39 and Fact 1.3.40, f ∈ Γ0(Rm) implies f∗ ∈ Γ0(Rm); inparticular, dom f∗ 6= ∅. According to Example 1.3.38, we have[β3 q(x)]∗ =(β32‖x‖2)∗=‖x‖22β3.Therefore, dom(β3 q)∗ = Rm = int dom(β3 q)∗. Thus, dom f∗∩int dom(β3 q)∗ =dom f∗ 6= ∅. Applying Fact 2.4.3, we get∂(fβ3 q) = ∂f∂(β3 q)= ∂fβ3 Id= [(∂f)−1 + (β3 Id)−1]−1= [(∂f)−1 +1β3Id]−1.684.3. Double regularization: Moreau regularization and Tychonov regularization combined4.3.2 Main resultsTheorem 4.2. Let f, g : Rm → (−∞,+∞] be proper, lower semicontinuous andconvex, let q(x) = 12‖x‖2. For every β1 > 0, β2 > 0, β3 > 0, considerminx∈Rm{f + β1 q +β2[g(β3 q)]}. (p)Then the following hold:(1)∂{f + β1 q +β2[g(β3 q)]} = ∂f + β1 Id +β2[∂g(β3 Id)].(2) (p) always has a unique solution.(3) The Fenchel Dual of (p) isminv∈Rm{(f∗ qβ1)(v) +1β2β3q(v) + β2g∗(−vβ2)}, (d)and it also has a unique solution.Proof. (1) Because f and g are proper, dom f 6= ∅ and dom g 6= ∅. Sinceg(β3 q) is actually a Moreau envelope with the function g and λ = 1β3 ,according to Fact 1.3.55,intdom[g(β3 q)] = intRm= Rm. (4.13)For β1 q, we haveintdom(β1 q) = intRm = Rm.Thus,dom f ∩ intdomβ2[g(β3 q)] ∩ intdom(β1 q)= dom f ∩ Rm ∩ Rm= dom f6=∅.Additionally, f ∈ Γ0(Rm), g ∈ Γ0(Rm). Thus, by Fact 1.3.31,∂{f + β1 q +β2[g(β3 q)]} = ∂f + ∂(β1 q) + ∂[β2(g(β3 q))]. (4.14)Again, since g ∈ Γ0(Rm), by using Corollary 4.3.1, we have ∂[g(β3 q)] =∂g(β3 Id). Therefore, (4.14) yields∂{f + β2[g(β3 q)] + β1 q} = ∂f + β1 Id +β2[∂g(β3 Id)].694.3. Double regularization: Moreau regularization and Tychonov regularization combined(2) Because g ∈ Γ0(Rm), by Fact 1.3.40, dom g∗ 6= ∅. As noted in the proof ofCorollary 4.3.1, dom(β3 q)∗ = Rm. Consequently,dom g∗ ∩ int dom(β3 q)∗ = dom g∗ 6= ∅.Thus, by Fact 2.4.3(2),[g(β3 q)] = [g (β3 q)] ∈ Γ0(Rm).Because f ∈ Γ0(Rm), dom f 6= ∅. According to (4.13),dom f ∩ dom[β2g(β3 q)] = dom f 6= ∅.Thus, by using Lemma 1.4, we havef + β2[g(β3 q)] ∈ Γ0(Rm).Since dom f ∩ dom{β2[g(β3 q)]} = dom f,dom(β1 q) ∩ {dom f ∩ dom[β2(g(β3 q))]}=Rm ∩ dom f= dom f6=∅.In addition, according to Example 1.3.17 and Example 1.3.7, q is supercoer-cive and strictly convex. By Fact 1.3.19, β1 q +f+β2[g(β3 q)] has exactlyone minimizer over Rm. In another words, (p) always has a unique solution.(3) According to Fenchel duality (see Definition 1.3.58), we haveminx∈Rm{f(x) + β1 q(x) + β2[g(β3 q)](x)}= minv∈Rm{(f + β1 q)∗(v) + [β2(g(β3 q))]∗(−v)}. (4.15)Since dom q = Rm and f ∈ Γ0(Rm), we havedom f ∩ intdom q = dom f 6= ∅.Thus, by Theorem 2.4.1 and Fact 1.3.50,(f + β1 q)∗ = f∗ (β1 q)∗. (4.16)As we showed in Example 1.3.38, for λ > 0, (λ‖x‖2)∗ = ‖x‖24λ . Thus,(β1 q)∗ = (β12‖x‖2)∗ = ‖x‖22β1=qβ1. (4.17)704.3. Double regularization: Moreau regularization and Tychonov regularization combinedThat is,f∗ (β1 q)∗ = f∗ qβ1. (4.18)For [β2(g(β3 q))]∗(−v), according to Fact 1.3.37, we have[β2(g(β3 q))]∗(−v) = β2[g(β3 q)]∗(−vβ2).Due to Fact 1.3.49 and equation (4.17)[g(β3 q)]∗(−vβ2)= [g∗ + (β3 q)∗](−vβ2)= g∗(−vβ2)+1β3q(−vβ2)= g∗(−vβ2)+1β22β3q(v). (4.19)Combining the result of (4.18) and (4.19), we haveminv∈Rm{(f + β1 q)∗(v) + [β2(g(β3 q))]∗(−v)}= minv∈Rm{(f∗ qβ1)(v) +1β2β3q(v) + β2g∗(−vβ2)}.Therefore, we complete the proof that the Fenchel Dual of (p) is (d). For thesimilar reason as (2), (d) has a unique solution.Remark 4.3. We are interested in considering: β1 = 2−ατ and β2 =α4−α , whereτ ∈ R++ and α ∈ [1, 2).If we apply Theorem 4.2 to the optimization problem (4.3), we can get thefollowing corollary.Corollary 4.3.2. Let f, g ∈ Γ0(Rm), let q = 12‖ · ‖2. Consider the optimizationproblem (4.3) where a = 1, γ = σ2−α+σ . Then the following hold:(1) For any x ∈ Rm,∂{f(x) + 2− α2τ‖x‖2 + α4− α(gγ2‖ · ‖2)(x)}=∂f(x) +2− ατx+α4− α [∂g(x)(γx)].714.3. Double regularization: Moreau regularization and Tychonov regularization combined(2) The following problem, where γ = σ2−α+σ , always has a unique solution:minx∈Rm{f(x) + 2− α2τ‖x‖2 + α4− α(gγ2‖ · ‖2)(x)}. (4.20)(3) The Fenchel Dual of (4.20) isminv∈Rm{[f∗ (τ2− α q)](v) +4− ααγq(v) +α4− αg∗(α− 4αv)}.Moreover, this Fenchel Dual has a unique solution.If we apply Theorem 4.2 to the optimization problem (4.10), we can get thefollowing corollary.Corollary 4.3.3. Let f, g ∈ Γ0(Rm), let q = 12‖ · ‖2. Consider the optimizationproblem (4.10) where a = 1, γ = σ2−α . Then the following hold:(1) For any x ∈ Rm,∂{f(x) + 2− α2τ‖x‖2 + α4− α(gγ2‖ · ‖2)(x)}=∂f(x) +2− ατx+α4− α [∂g(x)(γx)].(2) The following problem, where γ = σ2−α , always has a unique solution:minx∈Rm{f(x) + 2− α2τ‖x‖2 + α4− α(gγ2‖ · ‖2)(x)}. (4.21)(3) The Fenchel Dual of (4.21) isminv∈Rm{[f∗ (τ2− α q)](v) +4− ααγq(v) +α4− αg∗(α− 4αv)}.Moreover, this Fenchel Dual has a unique solution.72Chapter 5The α-Douglas-Rachfordalgorithm with α→ 25.1 OverviewAs we set α ∈ [1, 2) in the α-Douglas-Rachford algorithm, we want to considerthe properties of that algorithm when α is in its limit case. Therefore, the α-Douglas-Rachford algorithm is considered in a special-limit case in this chapter.5.2 Parameter α→ 2Fact 5.2.1. [4, Theorem 23.44] Let A : H → 2H be a maximally monotoneoperator and let x ∈ H. Then the inclusions(∀γ ∈ (0, 1)) 0 ∈ Axγ + γ(xγ − x)define a unique curve (xγ)γ∈(0,1). Moreover, exactly one of the following holds:(1) zerA 6= ∅ and xγ → PzerAx as γ ↓ 0.(2) zerA = ∅ and ‖xγ‖ → +∞ as γ ↓ 0.Fact 5.2.1 and Theorem 3.2.5 yield the following characterizations of the α-Douglas-Rachford algorithm. However, before we go to the theorem, we need toget the following lemma.Lemma 5.2.2. Let A be a maximally monotone operator from K to 2K, (αk)+∞k=1be an increasing sequence in [1, 2) such that limk→+∞αk = 2. Then,limk→+∞RαkA = R2A.Proof. As RαkA = αkJA − Id and R2A = 2JA − Id, we have for fixed x,‖RαkA x−RAx‖ = ‖αkJAx− 2JAx‖ ≤ |αk − 2|‖JAx‖.735.2. Parameter α→ 2Clearly the right side tends to 0 as k → +∞, so RαkA x → RAx. This holds forevery x.Theorem 5.2.3. Let A and B be maximally monotone operators from K to 2K,0 ∈ int(domA − domB) and zer(A + B) 6= ∅. Let (αk)+∞k=1 be an increasingsequence in [1, 2) such that limk→+∞αk = 2. for each k, consider the sequencesyn = JBxnzn = JA(αkyn − xn)xn+1 = xn + (zn − yn).(5.1)Then the sequence xn converges to some x∗k ∈ FixRαkA ◦RαkB such thatJBx∗k = zer(A+B + (2− αk) Id).For any resulting sequence (x∗k)+∞k=1,(1) limk→+∞JBx∗k = Pzer(A+B)(0).(2) Suppose (x∗k)+∞k=1 is a convergent sequence. Let limk→+∞x∗k = x∗. Then JBx∗is a solution to 0 ∈ Ax+Bx, and ‖JBx∗‖ ≤ ‖y‖ for any y ∈ zer(A+B).Proof. The existence of x∗k follows from Theorem 3.2.5.(1) Because A,B are maximally monotone and 0 ∈ int(domA− domB), Fact1.2.28 implies that A+B is also maximally monotone. According to Theo-rem 3.2.5 (1), we have JBx∗k ∈ zer(A+B + (2− αk) Id). That means:0 ∈ (A+B)JBx∗k + (2− αk)JBx∗k,which can also be written as0 ∈ (A+B)JBx∗k + (2− αk)(JBx∗k − 0).As zer(A+B) 6= ∅, according to Fact 5.2.1,JBx∗k → Pzer(A+B)(0) as (2− αk) ↓ 0.Since limk→+∞αk = 2, we can also write this aslimk→+∞JBx∗k = Pzer(A+B)(0).745.2. Parameter α→ 2(2) Here we want to prove limk→+∞JB(x∗k) = JB(x∗), where limk→+∞x∗k = x∗.For any > 0, there exists N ∈ N such that for any l ≥ N,‖x∗l − x∗‖ < .Because B is maximally monotone, by Fact 1.2.38, JB is firmly nonexpan-sive on Rm. Thus,‖JB(x∗l )− JB(x∗)‖ ≤ ‖x∗l − x∗‖ < ,that is,limk→+∞JB(x∗k) = JB(x∗).As we already proved limk→+∞JB(x∗k) = Pzer(A+B)(0), we haveJB(x∗) = Pzer(A+B)(0).Therefore, JBx∗ is a solution to 0 ∈ Ax + Bx, and ‖JBx∗‖ ≤ ‖y‖ for anyy ∈ zer(A+B).This theorem shows that when zer(A+B) 6= ∅,we can either use the Douglas-Rachford algorithm or the α-Douglas-Rachford algorithm to get a solution of it.Moreover, when zer(A+B) has multiple solutions, we can useα-Douglas-Rachfordalgorithm to get the one which has the shortest norm.Theorem 5.2.4. Let f, g, l ∈ Γ0(Rm), let L : Rm → Rm be a nonzero invert-ible linear operator where [L(dom f) − r] ∩ intdom(gl) 6= ∅, and dom g∗ ∩int dom l∗ 6= ∅. Let z and r ∈ Rm, let τ ∈ R++ and σ ∈ R++. We setA = ∂f,B = ∂g,D = ∂l which domD−1 = Rm, and let αk be a increasingconvergent sequence in [1, 2) such that limk→+∞αk = 2, the following holds:(1) The sequence of problems: find x ∈ Rm such thatz ∈ Ax+ 2− αkτx+αk4− αkL? ◦ (B2−αkσ D) ◦ (Lx− r) (5.2)where L = 4−αk2 L, together with the sequence of duals: find v such thatthere exists an x ∈ Rm that z −αk4−αkL?v ∈ Ax+ (2−αk)τ xv ∈ (B2−αkσ D) ◦ (Lx− r).(5.3)755.2. Parameter α→ 2have solution pairs (xk, vk) that converge to (x, v) satisfying the primal in-clusion:z ∈ Ax+ L?(BD)(Lx− r) (5.4)together with the dual inclusion:{z − L?v ∈ Axv ∈ (BD)(Lx− r). (5.5)(2) The sequence of optimization problemsArgminx∈Rm{f(x) +2− αk2τ‖x‖2+αk4− αk [(g l) (σ2(2− αk)‖x‖2)] ◦ (Lx− r)− 〈z, x〉},(5.6)where L = 4−αk2 L, has a sequence of solutions xk that converges to anelement x ofArgminx∈Rm{f(x) + (g l) ◦ (Lx− r)− 〈z, x〉}, (5.7)when αk → 2.Proof. (1) Because f, g, l ∈ Γ0(Rm), by Fact 1.3.27, A,B,D are maximallymonotone. Suppose x¯k is the solution of (5.2), and v¯k is the solution of (5.3).Then by Theorem 3.2.9, (x¯k, v¯k) is the solution of the inclusion problem:find (x, v) ∈ Rm×Rm such that (x, v) ∈ zer(A+B+(2−αk) Id), (5.8)and vice versa. HereA := V −1(12S+Q), andB := V−1(12S+M), whereM : K → 2K : (x, v) 7→ (−z +Ax, r +B−1v);Q : K → 2K : (x, v) 7→ (0, D−1v);S : K → K : (x, v) 7→ (L?v,−Lx);V : K → K : (x, v) 7→ (xτ− 12L?v,vσ− 12Lx).We proved that A and B are maximally monotone in Lemma 2.3.4. SincedomD−1 = Rm, by Theorem 3.2.10 and Theorem 3.2.5 (1), there exists anx∗k ∈ FixRαkA ◦RαkB such that JBx∗k ∈ zer(A+B+ (2−αk) Id). Supposezer(A+B) 6= ∅. By Theorem 5.2.3, one haslimk→+∞JBx∗k = Pzer(A+B)(0).765.2. Parameter α→ 2In other words, if we set (x∗, v∗) = limk→+∞JBx∗k, which x∗, v∗ ∈ Rm,from Lemma 2.3.4, it follows that x∗ is the solution of the primal inclusionproblem: find x ∈ Rm such thatz ∈ Ax+ L?(BD)(Lx− r),and v∗ is the solution of the dual inclusion problem: find v such that thereexists an x ∈ Rm that {z − L?v ∈ Axv ∈ (BD)(Lx− r).Therefore, we complete the proof.(2) According to Theorem 3.2.11, we get that the optimization problem (5.6)is equivalent to the primal inclusion problem (5.2). Moreover, by using theTheorem 2.4.4, the inclusion problem (5.7) is equivalent to the optimizationproblem (5.4). Therefore, by using the result of (1), we complete the proof.Remark 5.1. If we let A := V −1(12S +Q), and B := V−1(12S +M), whereM : K → 2K : (x, v) 7→ (−z +Ax, r +B−1v);Q : K → 2K : (x, v) 7→ (0, D−1v);S : K → K : (x, v) 7→ (L?v,−Lx);V : K → K : (x, v) 7→ (xτ− 12L?v,vσ− 12Lx).According to Lemma 2.3.4, (x, v) ∈ zer(A+B) if and only if (x, v) ∈ zer(M +Q+ S), i.e.,(0, 0) ∈ (−z +Ax+ L?v, r +B−1v +D−1v − Lx). (5.9)(5.9) is the exactly: find (x, v) such that{z ∈ Ax+ L?vv ∈ (BD)(Lx− r).From Theorem 5.2.3, we know if we let (x∗, v∗) = limαk→2JBx∗k,√‖x∗‖2 + ‖v∗‖2is the shortest norm for all (x, v) ∈ zer(A+B), since limαk→2JBx∗k = Pzer(A+B)(0).775.2. Parameter α→ 2Remark 5.2. In 2011, Wang [17] gave two self-dual regularizations of maximalmonotone operators on H, which can be effectively used to find the least normsolution to maximally monotone operators.Remark 5.3. Dykstra method can also be used to find the least norm solution. SeeBauschke and Borwein’s paper [3].78Chapter 6The application of theα-Douglas-Rachford algorithm6.1 OverviewIn this chapter, we are going to use the α-Douglas-Rachford algorithm to solvethe inclusion problem0 ∈ Ax+Bx,where A : Rm → 2Rm and B : Rm → 2Rm are maximally monotone operators, intwo different ways.6.2 Least norm solution of the primal problemTheorem 6.2.1. Let A : Rm → 2Rm and B : Rm → 2Rm be maximally monotoneoperators, 0 ∈ int(domA− domB) and zer(A+B) 6= ∅. In order to solve0 ∈ Ax+Bx, (6.1)we can let αk be an increasing convergent sequence in [1, 2) such that limk→+∞αk =2. Then we use the α-Douglas-Rachford algorithm to solve the problem0 ∈ Ax+Bx+ (2− αk) Id . (6.2)When αk → 2, the answers of problem (6.2) converge to the shortest norm solutionof problem (6.1).Proof. Since A,B are maximally monotone operators, 0 ∈ int(domA− domB)and zer(A + B) 6= ∅, by Theorem 5.2.3, if we use the α-Douglas-Rachford al-gorithm to solve problem (6.2), for any fixed αk, there exists a correspondingx∗k ∈ FixRαkA ◦ RαkB such that JBx∗k ∈ zer(A + B + (2 − αk) Id). Moreover,we have limk→+∞JBx∗k = Pzer(A+B)(0). That is,‖ limk→+∞JBx∗k‖ ≤ ‖y‖for any y ∈ zer(A+B).796.3. Least norm solution of the primal-dual problem6.3 Least norm solution of the primal-dual problemTheorem 6.3.1. Let A : Rm → 2Rm and B : Rm → 2Rm be maximally monotoneoperators, and zer(A + B) 6= ∅, let L = Id . In order to solve the problem withprimal inclusion: find x ∈ Rm such that0 ∈ Ax+Bx, (6.3)together with the dual inclusion: find v ∈ Rm such that for some x,{ −v ∈ Axv ∈ Bx, (6.4)we can let αk be an increasing convergent sequence in [1, 2) such thatlimk→+∞αk = 2.Let L = 4−αk2 Id . Then we use the α-Douglas-Rachford algorithm to solve theproblem with primal inclusion: find x ∈ Rm such that0 ∈ Ax+ 2− αkτx+αk4− αkL?(B σ2− αk Id)(Lx), (6.5)where τ ∈ R++, σ ∈ R++, and τσ < 4, together with the dual inclusion: find vsuch that there exists an x ∈ Rm that{− αk4−αkL?v ∈ Ax+2−αkτ xv ∈ (B σ2−αk Id)(Lx).(6.6)When αk → 2, the sequence of solutions of the primal-dual problem (6.5) togetherwith (6.6) converge to the primal-dual shortest norm solution of problem (6.3)together with (6.4).Proof. LetM : K → 2K : (x, v) 7→ (Ax,B−1v);Q : K → 2K : (x, v) 7→ (0, D−1v) where D = N{0};S : K → K : (x, v) 7→ (v,−x);V : K → K : (x, v) 7→ (xτ− 12v,vσ− 12x).A := V −1(12S + Q), and B := V−1(12S + M). According to Lemma 2.3.4, weget A and B are maximally monotone andzer(A+B) = zer(M + S +Q) = zer(M + S),806.3. Least norm solution of the primal-dual problemas for any v ∈ Rm, D−1v = 0 by the definition of N{0}. That means for all(x, v) ∈ zer(A+B), we have(0, 0) ∈(M + S)(x, v)=(Ax+ v,B−1v − x).That is {0 ∈ Ax+ v0 ∈ B−1v − x,which is equivalent to0 ∈ Ax+Bxtogether with { −v ∈ Axv ∈ Bx.Therefore, solving (x, v) ∈ zer(A+B) is equivalent to solving the problem withprimal inclusion (6.3) together with dual inclusion (6.4). Since we have zer(A +B) 6= ∅, according to Lemma 2.3.1, the primal inclusion (6.3) is equivalent to thedual inclusion (6.4). Therefore zer(A+B) 6= ∅.In the proof of Theorem 3.2.10, we showed that once domD−1 = Rm, we have0 ∈ int(domA− domB). Since D = N{0}, by Lemma 1.1.35, we havedomD−1 = ranD = Rm.Therefore, we know A,B are maximally monotone, 0 ∈ int(domA − domB)and zer(A + B) 6= ∅. By Theorem 6.2.1, we can use the α-Douglas-Rachfordalgorithm to solve each problemzer(A+B + (2− αk) Id). (6.7)When αk → 2, the answers from problem (6.7) converge to the shortest normsolution of problem zer(A + B). By Theorem 3.2.9, the solution of problemzer(A + B + (2 − αk) Id) is also the primal-dual solution of problem (6.5) to-gether with (6.6). Therefore, when αk → 2, the sequence of solutions of theprimal-dual problem (6.5) together with (6.6) converge to the primal-dual shortestnorm solution of problem (6.3) together with (6.4).Remark 6.1. The operatorM + S : K → 2K : (x, v) 7→ (Ax+ v,B−1v − x)816.3. Least norm solution of the primal-dual problemis maximally monotone. Because M = (A,B−1) : K → 2K is maximally mono-tone according to Fact 1.2.30. And because S is skew and linear, by Fact 1.2.31,S is maximally monotone. Moreover, since domS = K, by Fact 1.2.28, we getM + S is maximally monotone.Remark 6.2. Note that Theorem 6.2.1 gives the primal shortest norm solution, butTheorem 6.3.1 gives the primal-dual shortest norm solution.Remark 6.3. According to Corollary 2.3.3, the dual inclusion (6.4) is equivalent tothe problem: find v′ such that0 ∈ A−1(v′)−B−1(−v′), (6.8)which is the Attouch-Théra dual [1] of (6.3).82Chapter 7Numerical experiments7.1 OverviewIn this chapter, we describe numerical experiments whose results confirms theproperties of the α-Douglas-Rachford algorithm derived above.In the following three numerical experiments, the operator D satisfiesdomD−1 = Rm.Before we go to the numerical examples, the following formulas for proximalpoints are necessary.Lemma 7.1.1. Let C be a closed convex set in Rm, let τ ∈ R++ and f = ιC .Then we have:(1) Proxτf (x) = PC(x).(2) Proxτf∗(x) = x− τ PC(xτ ).Proof. (1) Since C is a closed convex set, by by Example 1.3.13, f ∈ Γ0(Rm).Then, for any x ∈ Rm, we haveProxτf (x) = ProxτιC (x)= Argminu∈Rm{τιC(u) +12‖u− x‖2}= Argminu∈C12‖u− x‖2= PC(x).(2) Since f ∈ Γ0(Rm), by Fact 1.3.40, f = f∗∗. Therefore, we use Fact 1.2.37837.2. A feasibility problemto getProxτf∗(x) =[Id−τ Proxτ−1f ◦(τ−1 Id)](x)=x− τ Proxτ−1ιC (xτ)=x− τ Argminu∈C12‖u− xτ‖2=x− τ PC(xτ).Lemma 7.1.2. Let g = ‖ · ‖, let τ ∈ R++. Then for any x ∈ Rm,Proxτg∗(x) = PB(0;1)(x).Proof. Asg∗(u) = supx∈Rm{〈u, x〉 − ‖x‖}=ιB(0;1)(u),by Lemma 7.1.1, we haveProxτg∗(x) = PB(0;1)(x).7.2 A feasibility problemIn this part, we consider solving the inclusion problemz ∈ Ax+B(x− r), (7.1)where A,B are maximally monotone operators and z, r ∈ Rm are given.Example 7.2.1. Let f = ιC1 , g = ιC2 , where C1 is a circle centred at (5, 0) withradius 2, and C2 is a box centred at (3, 1.5) with radius 1. Let z = 0, r = 0. If welet A = ∂f,B = ∂g, the problem (7.1) becomes0 ∈ NC1(x) +NC2(x). (7.2)847.2. A feasibility problemFigure 7.1: The plot of Example 7.2.1Let αk be a increasing convergent sequence in [1, 2) such that limk→+∞αk = 2.Then the following holds:(1) The inclusion problem: The solution of problem: finding x ∈ R2 such that0 ∈ NC1(x) +NC2(x) + (2− αk)(x) (7.3)is the shortest norm solution of problem (7.2) when αk → 2.(2) The problem (7.3) can be solved by the α-Douglas-Rachford algorithm.Moreover, as αk → 2, the optimization result which is gotten by the α-Douglas-Rachford algorithm converges to the shortest norm solution of (7.2).Proof. (1) We apply Theorem 6.2.1 to complete this proof.Since C1 and C2 are closed, bounded and convex sets, according to Example1.2.33, NC1 and NC2 are maximally monotone operators. Because we alsohave int(C1 ∩C2) 6= ∅, according to Theorem 3.2.5, the inclusion problems(7.3) can be solved by the α-Douglas-Rachford algorithm (3.3) by lettingA = NC1 and B = NC2 . That is:yn = JNC2 (xn)zn = JNC1 (αkyn − xn)xn+1 = xn + (zn − yn).(7.4)857.2. A feasibility problemAccording to Example 1.3.25,NC1 = ∂ιC1 , NC2 = ∂ιC2 .Therefore, algorithm (7.4) becomesyn = J∂ιC2 (xn)zn = J∂ιC1 (αkyn − xn)xn+1 = xn + (zn − yn).(7.5)By Fact 1.3.57, we haveJ∂ιC2 = ProxιC2 , and J∂ιC1 = ProxιC1 .Let’s plug ProxιC2 ,ProxιC1 , into (7.5) instead of J∂ιC2 , and J∂ιC1 respec-tively, we getyn = ProxιC2 (xn)zn = ProxιC1 (αkyn − xn)xn+1 = xn + (zn − yn).(7.6)According to Lemma 7.1.1(1), we haveProxιC1 (x) = PC1(x)andProxιC2 (x) = PC2(x).As C1 is a circle centred at (5, 0) with radius 2, we can also writeProxιC1 (x) = (5, 0) + PB(0;2)(x− (5, 0)).Hence, when we choosing x0 = (5, 1) as starting values, for any fixed k, letαk = 2− 1k , the iterative scheme algorithm (7.6) becomes:yn = PC2(xn)zn = (5, 0) + PB(0;2)(αkyn − xn)xn+1 = xn + (zn − yn).The stopping criteria of this algorithm is: Let = 10−5, continue runningthis iteration until ‖xn+1 − xn‖ < .867.2. A feasibility problemTable 7.1: Example 1: Six fixed αk where αk = 2 − 1/k, the correspondingoptimization point y∗, and the norm of y∗.k y∗ ‖y∗‖1 (3.0635,0.5) 3.10410 (3.0635,0.5) 3.10450 (3.0635,0.5) 3.104100 (3.0635,0.5) 3.1041000 (3.0635,0.5) 3.10410000 (3.0635,0.5) 3.104As we can see, the optimization result y∗ doesn’t change its value when weuse different value of k. It is clear that y∗ locate at the boundary of C1 andalso locate at the boundary of C2. With the help of Figure 7.2.1, we get y∗ isthe smallest norm solution of problem (7.2).We also tried to run this algorithm from different starting point, and the resultshows once we fix the value of k, the result we get does not change if wechange its starting point.However, when we use the classic Douglas-Rachford algorithm to solve(7.2), the answer changes if we choose different starting point. Here is theresult:Table 7.2: Example 1: starting point x0, the corresponding optimization point y∗,and the norm of y∗.x0 y∗ ‖y∗‖(5,1) (4,0.8944) 4.0988(-3,1) (3.0785,0.5548) 3.1281(-4,-6) (4,0.5) 4.0311(10,-20) (4,0.5) 4.0311That means, in this example, if we directly use the classic Douglas-Rachfordalgorithm to solve problem (7.2), the answer we get may not have the short-est norm. However, if we use the α-Douglas-Rachford algorithm to solveproblem (7.3) and let αk → 2, the answer we get has the shortest norm.877.3. A Heron problem7.3 A Heron problemIn this part, we are going to consider solving a primal-dual inclusion problemby using α-Douglas-Rachford algorithm.Theorem 7.3.1. Let A : Rm → 2Rm , B : Rm → 2Rm and D : Rm → 2Rmbe maximally monotone operators and domD−1 = Rm. Let z, r ∈ Rm, and letL : Rm → Rm be a nonzero linear invertible operator. Let αk be an increasingconvergent sequence in [1, 2) such that limk→+∞αk = 2.(1) The problem with primal inclusion: find x ∈ Rm such thatz ∈ Ax+ 2− αkτx+αk4− αkL?(B2−αkσ D)(Lx− r), (7.7)where L = 4−αk2 L, τ ∈ R++, σ ∈ R++, and τσ‖L‖2 < 4, together withthe dual inclusion: find v such that there exists an x ∈ Rm that z −αk4−αkL?v ∈ Ax+ (2−αk)τ xv ∈ (B2−αkσ D) ◦ (Lx− r)(7.8)can be solved by using the α-Douglas-Rachford algorithm:yn = JBxnzn = JA(αkyn − xn)xn+1 = xn + (zn − yn).(7.9)Here, A := V −1(12S +Q), and B := V−1(12S +M), whereM : K → 2K : (x, v) 7→ (−z +Ax, r +B−1v);Q : K → 2K : (x, v) 7→ (0, D−1v);S : K → K : (x, v) 7→ (L?v,−Lx);V : K → K : (x, v) 7→ (xτ− 12L?v,vσ− 12Lx).(2) The algorithm (7.9) can be rewritten asy1n = JτA(x1n − τ2L?x2n + τz)y2n = JσB−1(x2n − σ2Lx1n + σLy1n − σr)w1n = αky1n − x1nw2n = αky2n − x2nz1n = w1n − τ2L?w2nz2n = JσD−1(w2n − σ2Lw1n + σLz1n)x1n+1 = x1n + (z1n − y1n)x2n+1 = x2n + (z2n − y2n),(7.10)887.3. A Heron problem(3) Let f, g, l ∈ Γ0(Rm), and A = ∂f,B = ∂g,D = ∂l. Then algorithm (7.10)implies the following algorithmy1n = Proxτf (x1n − τ2L?x2n + τz)y2n = Proxσg∗(x2n − σ2Lx1n + σLy1n − σr)w1n = αky1n − x1nw2n = αky2n − x2nz1n = w1n − τ2L?w2nz2n = Proxσl∗(w2n − σ2Lw1n + σLz1n)x1n+1 = x1n + (z1n − y1n)x2n+1 = x2n + (z2n − y2n).(7.11)Proof. (1) For any fixed αk, we can apply Theorem 3.2.10 to complete thisproof.(2) According to the definition of A and B, (7.9) can be rewritten asxn = [V−1(12S +M) + Id](yn),αkyn − xn = [V −1(12S +Q) + Id](zn),xn+1 = xn + (zn − yn).Set wn = αkyn − xn, we haveV (xn − yn) = (12S +M)(yn),wn = αkyn − xn,V (wn − zn) = (12S +Q)(zn),xn+1 = xn + (zn − yn).(7.12)Here, we letxn = (x1n, x2n), yn = (y1n, y2n), wn = (w1n, w2n), zn = (x1n, z2n).Since M,Q, S, V,A, and B are constructed by (M), (Q), (S), (V), (A) and(B) respectively, (7.12) is equivalent tox1n−y1nτ − 12L?(x2n − y2n) = 12L?y2n − z +Ay1nx2n−y2nσ − 12L(x1n − y1n) = −12Ly1n + r +B−1y2nw1n = αky1n − x1nw2n = αky2n − x2nw1n−z1nτ − 12L?(w2n − z2n) = 12L?z2n + 0w2n−z2nσ − 12L(w1n − z1n) = −12Lz1n +D−1z2nx1n+1 = x1n + (z1n − y1n)x2n+1 = x2n + (z2n − y2n).897.3. A Heron problemFinally, we simplify each line of the above algorithm and get:y1n = JτA(x1n − τ2L?x2n + τz)y2n = JσB−1(x2n − σ2Lx1n + σLy1n − σr)w1n = αky1n − x1nw2n = αky2n − x2nz1n = w1n − τ2L?w2nz2n = JσD−1(w2n − σ2Lw1n + σLz1n)x1n+1 = x1n + (z1n − y1n)x2n+1 = x2n + (z2n − y2n),which is algorithm (7.10). Therefore, algorithm (7.9) is equivalent to algo-rithm (7.10)(3) Since A = ∂f,B = ∂g,D = ∂l, we can write (7.10) asy1n = Jτ∂f (x1n − τ2L?x2n + τz)y2n = Jσ(∂g)−1(x2n − σ2Lx1n + σLy1n − σr)w1n = αky1n − x1nw2n = αky2n − x2nz1n = w1n − τ2L?w2nz2n = Jσ(∂l)−1(w2n − σ2Lw1n + σLz1n)x1n+1 = x1n + (z1n − y1n)x2n+1 = x2n + (z2n − y2n).(7.13)Because g, l ∈ Γ0(Rm), by Fact 1.3.46,(∂g)−1 = ∂g∗ and (∂l)−1 = ∂l∗.For any λ ∈ R++, we have λ∂f = ∂(λf). By Fact 1.3.57, we haveJτ∂f = Proxτf ; Jσ(∂g)−1 = Proxσg∗ ; and Jσ(∂l)−1 = Proxσl∗ .Let’s plug Proxτf ,Proxσg∗ , and Proxσl∗ into (7.13) instead of Jτ∂f , Jσ(∂g)−1 ,and Jσ(∂l)−1 respectively, we gety1n = Proxτf (x1n − τ2L?x2n + τz)y2n = Proxσg∗(x2n − σ2Lx1n + σLy1n − σr)w1n = αky1n − x1nw2n = αky2n − x2nz1n = w1n − τ2L?w2nz2n = Proxσl∗(w2n − σ2Lw1n + σLz1n)x1n+1 = x1n + (z1n − y1n)x2n+1 = x2n + (z2n − y2n),907.3. A Heron problemwhich is exactly the alorithm (7.11). Therefore, algorithm (7.10) implies(7.11) in this case.Lemma 7.3.2. Let C be a closed convex set in Rm. Then we havedC = ‖ · ‖ιC ,and∂dC = ∂‖ · ‖NC .Proof.dC(x) = infy∈C‖x− y‖= infy∈Rm{‖x− y‖+ ιC(y)}=(ιC‖ · ‖)(x).Since C is a closed bounded convex set, for any u ∈ Rm,ι∗C(u) = supx∈Rm{〈x, u〉 − ιC(x)}= supx∈C〈x, u〉 < +∞.Therefore, dom ι∗C = Rm. According to Fact 1.3.40, dom ‖ · ‖∗ 6= ∅, that meansdom ‖ ·‖∗∩ intdom ι∗C 6= ∅.Moreover, from Example 1.3.25, we haveNC = ∂ιC .By using Theorem 2.4.3 , we have∂dC =∂(‖ · ‖ιC)=∂‖ · ‖∂ιC=∂‖ · ‖NC .Example 7.1. Let f = ιC1 , l = ιC2 , where C1 is a circle centred at (5, 0) withradius 2, and C2 is a box centred at (−2, 4) with radius 0.5. A simple Heronproblem is to solveminx∈C1dC2(x) = minx∈R2(ιC1(x) + dC2(x)).917.3. A Heron problemFigure 7.2: The plot of Example 7.1Let g = ‖ · ‖, z = 0, r = 0. Then, we aim to solve the problem with the primalinclusion: find x ∈ Rm such that0 ∈ NC1(x) + (∂‖ · ‖NC2)(x) (7.14)together with the dual inclusion: find v such that there exists an x ∈ Rm that{ −v ∈ NC1(x)v ∈ (∂‖ · ‖NC2)(x).(7.15)Here, L = Id . Let αk be an increasing convergent sequence in [1, 2) such thatlimk→+∞αk = 2. For each αk, let L = 4−αk2 Id . Then the following holds:(1) When αk → 2, the sequence of problems: find x ∈ Rm such that0 ∈ NC1(x) +2− αkτx+αk4− αkL?(∂‖ · ‖2−αkσ NC2)(Lx), (7.16)where τ ∈ R++, σ ∈ R++, and τσ‖L‖2 < 4, together with the sequence ofduals: find v such that there exists an x ∈ Rm that −αk4−αkL?v ∈ NC1(x) + (2−αk)τ xv ∈ (∂‖ · ‖2−αkσ NC2)(Lx)(7.17)927.3. A Heron problemhave solution pairs (xk, vk) that converge to (x, v) satisfying the primal in-clusion (7.14) together with the dual inclusion (7.15).(2) The problem with primal inclusion (7.16) and dual inclusion (7.17) can besolved by the α-Douglas-Rachford algorithm. Moreover, the optimizationresult which is obtained by the classic Douglas-Rachford algorithm is the so-lution of the problem with primal inclusion (7.14) and dual inclusion (7.15).Proof. (1) Since C1 and C2 are closed, bounded and convex sets, according toExample 1.2.33, NC1 and NC2 are maximally monotone operators. Because‖ · ‖ ∈ Γ0(R2), by Fact 1.3.27, ∂‖ · ‖ is maximally monotone. According toExample 1.3.25,NC1 = ∂ιC1 , NC2 = ∂ιC2 ,and according to Example 1.3.13, ιC1 , ιC2 ∈ Γ0(R2).As g = ‖ · ‖, and dom(gl) = dom g + dom l, we have L(dom f) ∩intdom(gl) 6= ∅. As l = ιC2 where C2 is a closed bounded convex set, forany u ∈ R2,l∗(u) = supx∈R2{〈x, u〉 − l(x)}= supx∈C2〈x, u〉 < +∞.Therefore, dom l∗ = R2. Consequently, dom g∗ ∩ intdom l∗ 6= ∅.By Fact 1.3.46,D−1 = (∂l)−1 = ∂l∗.As dom l∗ = R2, we can use Fact 1.3.43 to getintdom l∗ = dom ∂l∗ = dom l∗ = R2.That is, domD−1 = R2.Now, we can apply Theorem 5.2.4 to complete the proof.(2) AsdomD−1 = R2,by Theorem 3.2.10, the problem with primal inclusion: For any fixed αk,find x ∈ R2 such that0 ∈ NC1x+2− αkτx+αk4− αkL?(∂‖ · ‖2−αkσ NC2)(Lx). (7.18)937.3. A Heron problemtogether with the dual inclusion: find v such that there exists an x ∈ Rm that −αk4−αkL?v ∈ NC1x+ (2−αk)τ x,v ∈ (∂‖ · ‖2−αkσ NC2)(Lx).(7.19)can be solved by the α-Douglas-Rachford algorithm.Because ιC1 , ιC2 ∈ Γ0(R2), according to Theorem 7.3.1, we can use al-gorithm (7.11) to solve the problem with primal inclusion (7.18) and dualinclusion (7.19).When solving that problem with algorithm (7.11), by Lemma 7.1.1(1),Proxτf (x) = PC1(x).As C1 is a circle centred at (5, 0) with radius 2, we can also writeProxτf (x) = (5, 0) + PB(0;2)(x− (5, 0)).We use Lemma 7.1.1(2) to getProxσl∗(x) = x− σPC2(xσ).As g = ‖ · ‖, by Lemma 7.1.2,Proxσg∗(x) = PB(0;1)(x).Because τ, σ, ‖L‖ must satisfy the relation τσ‖L‖2 < 4, and L = Id . Tobe on the safe side, we let σ = 2, and τ = 3/2. Hence, when we choosingx0 = (5,−2), v0 = (0, 0) as starting values, for any fixed k, let αk = 2− 1k ,the iterative scheme algorithm (7.11) becomes:y1n = (5, 0) + PB(0;2)(x1n − τ2x2n − (5, 0))y2n = PB(0;1)(x2n − σ2x1n + σy1n)w1n = αky1n − x1nw2n = αky2n − x2nz1n = w1n − τ2w2nz2n = w2n − σ2w1n + σz1n − σPC2((w2n − σ2w1n + σz1n)/σ)x1n+1 = x1n + (z1n − y1n)x2n+1 = x2n + (z2n − y2n).(7.20)947.3. A Heron problemThe stopping criteria of this algorithm is: Let = 10−5, continue runningthis iteration until ‖x1n+1 − x1n‖ < , and ‖x2n+1 − x2n‖ < .Table 7.3: Example 2: Seven fixed αk, where αk = 2 − 1/k, the correspondingoptimal point y∗1 and y∗2 , together with the case α = 2.αk y∗1 y∗21 (3.0041,0.1277) (0.8759,-0.4825)2− 110 (3.1449,0.7475) (0.8705,-0.4922)2− 150 (3.2156,0.9033) (0.8781,-0.4786)2− 1100 (3.2270,0.9254) (0.8792,-0.4764)2− 11000 (3.2378,0.9459) (0.8803,-0.4743)2− 110000 (3.2389,0.9480) (0.8805,-0.4741)2− 1106(3.2391,0.9482) (0.8805,-0.4741)α = 2 (3.2391,0.9482) (0.8805,-0.4741)As we can see, the numerical result shows that as k gets larger, or we cansay as αk gets closer to 2, the optimal result y∗1 and y∗2 which are gotten bythe α-Douglas-Rachford algorithm gets closer to the optimal result which isgotten by the classic Douglas-Rachford algorithm.Then we run the algorithm (7.20) again with the same starting point andsame stoping criteria, but in this time, we set τ = 1, σ = 1. Here is theresult:Table 7.4: Example 2: Seven fixed αk, where αk = 2 − 1/k, the correspondingoptimal point y∗1 and y∗2 , together with the case α = 2.αk y∗1 y∗21 (3.0020,0.0899) (0.8723,-0.4890)2− 110 (3.1196,0.6813) (0.8639,-0.5037)2− 150 (3.2063,0.8847) (0.8762,-0.4820)2− 1100 (3.2219,0.9157) (0.8783,-0.4782)2− 11000 (3.2373,0.9449) (0.8802,-0.4745)2− 110000 (3.2389,0.9479) (0.8804,-0.4741)2− 1106(3.2391,0.9482) (0.8805,-0.4741)α = 2 (3.2391,0.9482) (0.8805,-0.4741)Again, the numerical result shows that as k gets larger, or we can say as957.4. A feasibility problem solved by the primal-dual formulationαk gets closer to 2, the optimal result y∗1 and y∗2 which are gotten by the α-Douglas-Rachford algorithm gets closer to the optimal result which is gottenby the classic Douglas-Rachford algorithm.We also tried to run this algorithm with different starting point with fixed τand σ, and the result shows that once we fix the value of k, the result weget does not change if we change its starting point. This is because when αkfixed, 0 ∈ (A +B + (2 − αk) Id)(x) has a unique solution. So it does notmatter when we change the starting point of the algorithm.7.4 A feasibility problem solved by the primal-dualformulationExample 7.2. Let f = ιC1 , g = ιC2 , where C1 is a circle centred at (5, 0) withradius 2, and C2 is a box centred at (3, 1.5) with radius 1. Let z = 0, r = 0. If welet A = ∂f,B = ∂g, then, we aim to solve the problem with the primal inclusion:find x ∈ Rm such that0 ∈ NC1(x) +NC2(x). (7.21)together with the dual inclusion: find v ∈ Rm such that{ −v ∈ NC1(x)v ∈ NC2(x),(7.22)967.4. A feasibility problem solved by the primal-dual formulationFigure 7.3: The plot of Example 7.2We can solve (7.22) by our α-Douglas-Rachford method.Let αk be a increasing convergent sequence in [1, 2) such that limk→+∞αk = 2.For each αk, let L = 4−αk2 Id . Then we use the α-Douglas-Rachford algorithm tosolve the problem with primal inclusion: find x ∈ Rm such that0 ∈ NC1(x) +2− αkτx+αk4− αkL?(NC2σ2− αk Id)(Lx), (7.23)where τ ∈ R++, σ ∈ R++, and τσ < 4, together with the dual inclusion: find vsuch that there exists an x ∈ Rm that{− αk4−αkL?v ∈ NC1(x) +2−αkτ xv ∈ (NC2 σ2−αk Id)(Lx).(7.24)When αk → 2, the sequence of solutions of the primal-dual problem (7.23)together with (7.24) converge to the primal-dual shortest norm solution of problem(7.21) together with (7.22).7.4.1 The algorithmSince C1 and C2 are closed, bounded and convex sets, according to Example1.2.33, NC1 andNC2 are maximally monotone operators. Because int(C1∩C2) 6=977.4. A feasibility problem solved by the primal-dual formulation∅. we have zer(NC1 + NC2) 6= ∅. Due to Theorem 6.3.1, we can use α-Douglas-Rachford algorithm to solve the problem with primal inclusion (7.23) together withthe dual inclusion (7.24).According to Example 1.3.25,NC1 = ∂ιC1 , NC2 = ∂ιC2 ,and according to Example 1.3.13, ιC1 , ιC2 ∈ Γ0(R2).Because ιC1 , ιC2 ∈ Γ0(R2), Theorem 7.3.1 shows that we can use algorithm(7.11) to solve the problem with primal inclusion (7.23) and dual inclusion (7.24).When solving that problem with algorithm (7.11), according to Lemma 7.1.1(1),Proxτf (x) = PC1(x).As C1 is a circle centred at (5, 0) with radius 2, we can also writeProxτf (x) = (5, 0) + PB(0;2)(x− (5, 0)).We use Lemma 7.1.1 (2) to getProxσg∗(x) = x− σPC2(xσ).Because τ, σ must satisfy the relation τσ < 4, to be on the safe side, we letσ = 2, and τ = 3/2. Hence, when we choose x0 = (5, 1), v0 = (0, 0) as startingvalues, for any fixed k, let αk = 2− 1k , the iterative scheme (7.11) becomes:y1n = (5, 0) + PB(0;2)(x1n − τ2x2n − (5, 0))y2n = (x2n − σ2x1n + σy1n)− σPC2((x2n − σ2x1n + σy1n)/σ)w1n = αky1n − x1nw2n = αky2n − x2nz1n = w1n − τ2w2nz2n = w2n − σ2w1n + σz1nx1n+1 = x1n + (z1n − y1n)x2n+1 = x2n + (z2n − y2n).(7.25)The stopping criteria of this algorithm is: Let = 10−5, continue running thisiteration until ‖x1n+1 − x1n‖ < , and ‖x2n+1 − x2n‖ < .7.4.2 Numerical resultsWe summarize our numerical implementation in three tables.987.4. A feasibility problem solved by the primal-dual formulationTable 7.5: Example 3: Six fixed αk, where αk = 2 − 1/k, the correspondingoptimal point y∗1 and y∗2 , and√‖y∗1‖2 + ‖y∗2‖2, together with the case α = 2.αk y∗1 y∗2√‖y∗1‖2 + ‖y∗2‖21 (3.0053,0.1460) (1.0160,-0.5621) 3.22512− 110 (3.0565,0.4721) (0,-0.0852) 3.09392− 150 (3.0622,0.4949) (0,-0.0172) 3.10202− 1100 (3.0629,0.4975) (0,-0.0086) 3.10302− 11000 (3.0634,0.4997) 1.0e-03 *(0,-0.8606) 3.10392− 110000 (3.0635,0.5000) 1.0e-04 *(0,-0.8607) 3.1040α = 2 (3.6259,0.6339) (0,0) 3.6809Then we run the algorithm (7.25) again with the same starting point and samestoping criteria, but in this time, we set τ = 1, σ = 1. Here is the result:Table 7.6: Example 3: Six fixed αk, where αk = 2 − 1/k, the correspondingoptimal point y∗1 and y∗2 , and√‖y∗1‖2 + ‖y∗2‖2, together with the case α = 2.αk y∗1 y∗2√‖y∗1‖2 + ‖y∗2‖21 (3.0014,0.0740) (0.5021,-0.3890) 3.06872− 110 (3.0546,0.4642) (0,-0.1256) 3.09222− 150 (3.0621,0.4945) (0,-0.0258) 3.10192− 1100 (3.0628,0.4974) (0,-0.0129) 3.10302− 11000 (3.0634,0.4997) (0,-0.0013) 3.10392− 110000 (3.0635,0.5000) 1.0e-03 *(0,-0.1291) 3.1040α = 2 (3.7500,0.7500) (0,0) 3.8243We also tried to run this algorithm with the starting point x0 = (−4,−6), v0 =(0, 0), and fix τ = 1 and σ = 1, Here is the result:997.4. A feasibility problem solved by the primal-dual formulationTable 7.7: Example 3: Six fixed αk, where αk = 2 − 1/k, the correspondingoptimal point y∗1 and y∗2 , and√‖y∗1‖2 + ‖y∗2‖2, together with the case α = 2.αk y∗1 y∗2√‖y∗1‖2 + ‖y∗2‖21 (3.0014,0.0740) (0.5021,-0.3890) 3.06872− 110 (3.0546,0.4642) (0,-0.1256) 3.09222− 150 (3.0621,0.4945) (0,-0.0258) 3.10192− 1100 (3.0628,0.4974) (0,-0.0129) 3.10302− 11000 (3.0634,0.4997) (0,-0.0013) 3.10392− 110000 (3.0635,0.5000) 1.0e-03 *(0,-0.1291) 3.1040α = 2 (3.3945,0.6448) (0,0) 3.4552According to the numerical result above, we can get the following conclusions:(1) If we let y∗ = (3.0635, 0.5000) and w∗ = (0, 0),tables 7.5, 7.6, and 7.7 allshow that when αk → 2, we have the smallest norm primal-dual solution(y∗, w∗). y∗ solves the primal and w∗ solves the dual.(2) Once we fix the value of k with fixed τ and σ, the result we get by using α-Douglas-Rachford algorithm does not change if we change its starting point.(3) In three tables 7.5, 7.6, and 7.7, α = 2 gives different y∗1 because0 ∈ NC1(x) +NC2(x)has multiple solutions.Remark 7.3. In Bot and Hendrich’s paper (see [6]), they have Douglas-Rachfordalgorithm applications on denoising problems in image processing. We believesimilar work can be done for the α-Douglas-Rachford algorithm.100Chapter 8Conclusions and future workIn this thesis, the Douglas-Rachford algorithm is studied. This is an algorithmfor solving the split problem: find x ∈ Rm such that0 ∈ Ax+Bx,where A and B are maximally monotone operators. The Douglas-Rachford algo-rithm can be written as(∀n ∈ N)yn = JBxnzn = JA(2yn − xn)xn+1 = xn + (zn − yn).In this thesis, I built a new algorithm based on Douglas-Rachford algorithm andcalled it the α-Douglas-Rachford algorithm. This algorithm solves the split prob-lem: find x ∈ Rm such that0 ∈ Ax+Bx+ (2− α)xwhere α ∈ [1, 2), A and B are maximally monotone operators. The new algorithmcan be written as(∀n ∈ N)yn = JBxnzn = JA(αyn − xn)xn+1 = xn + (zn − yn).I proved that the α-Douglas-Rachford algorithm has very similar properties to theDouglas-Rachford algorithm, and also showed the connection between those twoalgorithms when α→ 2. One distinctive feature of α-Douglas-Rachford algorithmis that it can be used to find the least norm solution.Possible future work:(1) Is the α-Douglas-Rachford algorithm error torlerant?(2) In the primal-dual problems, what are the optimal parameters τ, σ to imple-ment the α-Douglas-Rachford algorithms?101Chapter 8. Conclusions and future work(3) If we change the space from Rm to a more general space, like H, a generalHilbert space, does the α-Douglas-Rachford algorithm have the same resultsand properties?(4) Comparing with the Douglas-Rachford algorithm, does theα-Douglas-Rachfordalgorithm converge faster?(5) More numerical experiments on theα-Douglas-Rachford algorithm for higherdimensions and practical applications are required.102Bibliography[1] ATTOUCH, H., AND THÉRA, M. A general duality principle for the sumof two operators. Journal of Convex Analysis 3, (1996), 1-24. → pages 35,82[2] BAUSCHKE, H. H. A note on the paper by Eckstein and Svaiter on "Gen-eral projective splitting methods for sums of maximal monotone operators” .SIAM Journal on Control and Optimtimization 48, 4 (2009), 2513-2515. →pages 49[3] BAUSCHKE, H. H., AND BORWEIN, J. M. Dykstra’s alternating projectionalgorithm for two sets. Journal of Approximation Theory 79, 3 (1994), 418-443. → pages 78[4] BAUSCHKE, H. H., AND COMBETTES, P. L. Convex analysis and mono-tone operator theory in Hilbert spaces. Springer, 2017. → pages 4, 5, 8, 14,15, 16, 18, 19, 22, 23, 25, 26, 27, 28, 29, 30, 32, 40, 49, 53, 73[5] BAUSCHKE, H. H., AND MOURSI, W. M. On the Douglas-Rachford al-gorithm. Mathematical Programming, 164, 1-2 (2017), 263-284. → pages32[6] BOT, R. I., AND HENDRICH, C. A Douglas-Rachford type primal-dualmethod for solving inclusions with mixtures of composite and parallel-sumtype monotone operators. SIAM Journal on Optimization 23, 4 (2013), 2541-2565. → pages 31, 32, 34, 35, 100[7] COMBETTES, P. L. Solving monotone inclusions via compositions of non-expansive averaged operators. Optimization 53, 5-6 (2004), 475-504. →pages 32[8] COMBETTES, P. L. Iterative construction of the resolvent of a sum of max-imal monotone operators. Journal of Convex Analysis 16, 4 (2009), 727-748.→ pages 32[9] DOUGLAS, J., AND RACHFORD, H. H. On the numerical solution of heatconduction problems in two and three space variables. Transactions of theAmerican Mathematical Society 82, 2 (1956), 421-439. → pages 31103Bibliography[10] DUCHI, J. Introductory lectures on stochastic optimization. Manuscript,Stanford University, (2016). → pages 26[11] ECKSTEIN, J., AND BERTSEKAS, D. P. On the Douglas-Rachford split-ting method and the proximal point algorithm for maximal monotone opera-tors. Mathematical Programming 55, 1 (1992), 293-318. → pages 31[12] KREYSZIG, E. Introductory functional analysis with applications (Vol. 1).Wiley New York, 1989. → pages 7, 11, 13[13] LIEUTAUD, J. Approximations d’opérateurs monotones par des méthodesde splitting. Doctoral thesis, University of Paris, 1969. → pages 31[14] LIONS, P. L., AND MERCIER, B. Splitting algorithms for the sum of twononlinear operators. SIAM Journal on Numerical Analysis 16, 6 (1979), 964-979. → pages 31, 32[15] MORDUKHOVICH, B. S., AND NAM, N. M. An easy path to convex anal-ysis and applications. Synthesis Lectures on Mathematics and Statistics, 6:1-218, 2013. → pages 24[16] SVAITER, B. F. On weak convergence of the Douglas-Rachford method.SIAM Journal on Control and Optimization 49, 1 (2011), 280-287. → pages32[17] WANG, X. Self-dual regularization of monotone operators via the resolventaverage. SIAM Journal on Optimization 21, 2 (2011), 438-462. → pages 78104
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- A parameterized Douglas-Rachford algorithm : theory...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
A parameterized Douglas-Rachford algorithm : theory and applications Wang, Dongying 2017
pdf
Page Metadata
Item Metadata
Title | A parameterized Douglas-Rachford algorithm : theory and applications |
Creator |
Wang, Dongying |
Publisher | University of British Columbia |
Date Issued | 2017 |
Description | Douglas-Rachford algorithm is important due to its applications on the Heron problem and on the image denoising. Mathematically, it can be considered as finding a point such that the point belongs to a zero set of the sum of two maximally monotone operators. In this thesis, previous work on Douglas-Rachford algorithm is presented and the Douglas-Rachford algorithm with a changed parameter is considered. I give it the name "α-Douglas-Rachford algorithm". The new algorithm which has the changed parameter is shown to have a convergent result and other conclusions similar to those of the classic Douglas-Rachford algorithm. At the same time, it has been shown that the application of the α-Douglas-Rachford algorithm is wider than the application of the classic one. Later on, the α-Douglas-Rachford algorithm is proved to converge to the solution of the composited monotone inclusion problems, and in a special-limit case, it has some other properties. The numerical experiments confirm that the α-Douglas-Rachford algorithm does have the properties that I proved theoretically. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2017-08-18 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0354491 |
URI | http://hdl.handle.net/2429/62715 |
Degree |
Master of Science - MSc |
Program |
Mathematics |
Affiliation |
Irving K. Barber School of Arts and Sciences (Okanagan) Mathematics, Department of (Okanagan) |
Degree Grantor | University of British Columbia |
Graduation Date | 2017-09 |
Campus |
UBCO |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
Aggregated Source Repository | DSpace |
Download
- Media
- 24-ubc_2017_september_Wang_Dongying.pdf [ 758.6kB ]
- Metadata
- JSON: 24-1.0354491.json
- JSON-LD: 24-1.0354491-ld.json
- RDF/XML (Pretty): 24-1.0354491-rdf.xml
- RDF/JSON: 24-1.0354491-rdf.json
- Turtle: 24-1.0354491-turtle.txt
- N-Triples: 24-1.0354491-rdf-ntriples.txt
- Original Record: 24-1.0354491-source.json
- Full Text
- 24-1.0354491-fulltext.txt
- Citation
- 24-1.0354491.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.24.1-0354491/manifest