UBC Theses and Dissertations
The Douglas–Rachford operator in the possibly inconsistent case: static properties and dynamic behaviour. Moursi, Walaa M. 2016


The Douglas–Rachford Operator in the Possibly Inconsistent Case: Static Properties and Dynamic Behaviour

by

Walaa M. Moursi

B.Sc. (with distinction), Mansoura University, Egypt, 1999
M.Sc., Mansoura University, Egypt, 2004

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE COLLEGE OF GRADUATE STUDIES (Mathematics)

THE UNIVERSITY OF BRITISH COLUMBIA (Okanagan)

December 2016

© Walaa M. Moursi, 2016

Abstract

The problem of finding a minimizer of the sum of two convex functions — or, more generally, that of finding a zero of the sum of two maximally monotone operators — is of central importance in variational analysis and optimization. Perhaps the most popular method of solving this problem is the Douglas–Rachford splitting method. Surprisingly, little is known about the behaviour of the algorithm in the general inconsistent case, i.e., when the sum problem has no zeros.

This thesis provides a comprehensive study of the Douglas–Rachford operator and the corresponding algorithm. First, a novel framework of the normal problem that copes with the possibly inconsistent situation is introduced and studied. We present a systematic study of the range of the operator and displacement map, which provides precise information on the minimal displacement vector needed to define the normal problem. A new Fejér monotonicity principle as well as new identities for the Douglas–Rachford operator are developed. A systematic and detailed study of affine nonexpansive operators and their asymptotic behaviour is presented. In the light of the new analysis, we significantly advance the understanding of the inconsistent case by providing a complete proof of the full weak convergence of the shadow sequence to a best approximation solution in the convex feasibility setting. In fact, a more general sufficient condition for weak convergence in the general case is presented. Under some extra assumptions, we were able to prove strong and linear convergence. Our results are illustrated through numerous examples. We conclude with a list of open problems which serve as promising directions of future research.

Preface

The research work presented in this thesis is based on the nine published papers [13], [14], [15] by Heinz Bauschke and Walaa Moursi; [26] by Heinz Bauschke, Radu Boţ, Warren Hare and Walaa Moursi; [31], [38] by Heinz Bauschke, Warren Hare and Walaa Moursi; [36], [33] by Heinz Bauschke, Minh Dao and Walaa Moursi; [107] by Sarah Moffat, Walaa Moursi and Xianfu Wang; and the two submitted manuscripts [39] by Heinz Bauschke and Walaa Moursi and [110] by Walaa Moursi.

The results in these papers appear in the thesis as follows: Chapter 4 is based on the work in [39] and [26]; Chapter 5 is based on the work in [15], [39], [26] and [31]; Chapter 6 is based on the work in [36], [13] and [33]; Chapter 7 is based on the work in [31], [15], and [39]; Chapter 8 is based on the work in [15], [39] and [110]; Chapter 9 is based on the work in [15], [39] and [26]; Chapter 11 is based on the work in [31] and [38]; Chapter 12 and Chapter 13 are based on the work in [38]; Chapter 14 is based on the work in [13]; Chapter 15 is based on the work in [15]; and Chapter 16 is based on the work in [36].

For all co-authored papers, each author contributed equally.

Table of Contents

Abstract
Preface
Table of Contents
List of Figures
Glossary of Notation and Symbols
Acknowledgements
Dedication

Chapter 1: Introduction
  1.1 The goal of the thesis
  1.2 Contributions in this thesis

Chapter 2: Convex analysis: Overview
  2.1 Convex sets and projection operators
  2.2 Convex functions
  2.3 Subdifferentiability and conjugation

Chapter 3: (Firmly) nonexpansive mappings and monotone operators
  3.1 Nonexpansive and firmly nonexpansive operators
  3.2 Monotone operators: Basic definitions and facts
  3.3 Resolvents of monotone operators
  3.4 Sums of monotone operators: maximality and range of the sum
  3.5 Examples of linear monotone operators

Chapter 4: Attouch–Théra duality and paramonotonicity
  4.1 Overview
  4.2 Duality for monotone operators
  4.3 Solution mappings K and Z
  4.4 Paramonotonicity
  4.5 Projection operators and solution sets
  4.6 Subdifferential operators

Chapter 5: Zeros of the sum and the Douglas–Rachford operator
  5.1 Overview
  5.2 Basic properties and facts
  5.3 Useful identities for the Douglas–Rachford operator
  5.4 Douglas–Rachford operator and Attouch–Théra duality
  5.5 The Douglas–Rachford operator and solution sets
  5.6 From PDEs to maximally monotone operators
  5.7 Why do we need to work with monotone operators and not just subdifferential operators?

Chapter 6: On Fejér monotone sequences and nonexpansive mappings
  6.1 Overview
  6.2 Fejér monotonicity: New principles

Chapter 7: Nonexpansive mappings and the minimal displacement vector
  7.1 Overview
  7.2 Auxiliary results
  7.3 The displacement map and the minimal displacement vector

Chapter 8: Asymptotic behaviour of nonexpansive mappings
  8.1 Overview
  8.2 Iterating nonexpansive operators
  8.3 Affine nonexpansive operators
    8.3.1 Iterating an affine nonexpansive operator
    8.3.2 Strong and linear convergence of affine nonexpansive operators
    8.3.3 Some algorithmic consequences
  8.4 Further results

Chapter 9: The Douglas–Rachford algorithm: convergence analysis
  9.1 Overview
  9.2 Convergence of Douglas–Rachford algorithm
  9.3 A new proof of the Lions–Mercier–Svaiter theorem
  9.4 The Douglas–Rachford algorithm in the affine case
  9.5 Eckstein–Ferris–Pennanen–Robinson duality and algorithms
  9.6 Brief history

Chapter 10: On the order of the operators
  10.1 Overview
  10.2 Connection between Fix T(A,B) and Fix T(B,A)
  10.3 Iterates of T(A,B) vs. iterates of T(B,A)

Chapter 11: Generalized solutions for the sum of two maximally monotone operators
  11.1 Overview
  11.2 A motivation from Linear Algebra
  11.3 Perturbation calculus
  11.4 Perturbations of the Douglas–Rachford operator
  11.5 The w-perturbed problem
  11.6 The normal problem
  11.7 Examples

Chapter 12: On the range of the Douglas–Rachford operator: Theory
  12.1 Overview
  12.2 Near convexity and near equality
  12.3 Attouch–Théra duality and the normal problem
  12.4 Main results
  12.5 Some infinite-dimensional observations

Chapter 13: On the range of the Douglas–Rachford operator: Applications
  13.1 Overview
  13.2 On the minimal displacement vector v(A,B)
  13.3 Applications to subdifferential operators
  13.4 Application to firmly nonexpansive mappings

Chapter 14: The Douglas–Rachford algorithm in the possibly inconsistent case
  14.1 Overview
  14.2 The convex feasibility setting
  14.3 Convergence analysis in the inconsistent case
Chapter 15: The Douglas–Rachford algorithm in the affine–affine feasibility case
  15.1 Overview
  15.2 The Douglas–Rachford operator for two affine subspaces

Chapter 16: The Douglas–Rachford algorithm in the affine-convex case
  16.1 Overview
  16.2 Convergence results
  16.3 Spingarn's method

Chapter 17: Conclusion
  17.1 Main results
  17.2 Side results
  17.3 Future work

Bibliography

Index

List of Figures

Figure 8.1  The solid curve represents lim_{n→∞} (T − v)^n x, the dashed dotted curve represents lim_{n→∞} (v + T)^n x, and the dashed curve represents lim_{n→∞} T^n x + nv, when α = 0.5 and β = 1.
Figure 8.2  A GeoGebra [78] snapshot that illustrates Example 8.39. The first few terms of the sequence (T^n x + nv)_{n∈N} (blue points) are depicted.
Figure 10.1  A GeoGebra [78] snapshot. Two closed convex sets in R²: U is a linear subspace (green line) and V is the ball. Shown are also the first five terms of the sequences (T^n_{(A,B)} R_A x₀)_{n∈N} (red points) and (T^n_{(B,A)} x₀)_{n∈N} (blue points) in each case.
Figure 10.2  A GeoGebra [78] snapshot. Two closed convex sets in R²: U is the halfspace (cyan region) and V is the ball. Shown are also the first five terms of the sequences (T^n_{(A,B)} R_A x₀)_{n∈N} (red points) and (T^n_{(B,A)} x₀)_{n∈N} (blue points) in each case.
Figure 11.1  A diagram that illustrates Remark 11.12.
Figure 11.2  A GeoGebra [78] snapshot that illustrates Example 11.24.
Figure 11.3  A GeoGebra [78] snapshot that illustrates Example 11.25.
Figure 12.1  A GeoGebra [78] snapshot that illustrates Example 12.15.
Figure 14.1  A GeoGebra [78] snapshot that illustrates Fact 14.6(ii). Two intersecting lines in R³, U the blue line and V the red line. The first few iterates of the sequences (T^n x)_{n∈N} (red points) and (P_U T^n x)_{n∈N} (blue points) are also depicted.
Figure 14.2  A GeoGebra [78] snapshot that illustrates Theorem 14.11. Two nonintersecting polyhedral sets in R², U and V. The first few iterates of the sequences (T^n x)_{n∈N} (red points) and (P_U T^n x)_{n∈N} (blue points) are also depicted. Shown is the minimal displacement vector v as well.
Figure 15.1  Two nonintersecting affine subspaces U (blue line) and V (purple line) in R³. Shown are also the first few iterates of (T^n x₀)_{n∈N} (red points) and (P_U T^n x₀)_{n∈N} (blue points).
Figure 16.1  A GeoGebra [78] snapshot that illustrates Example 16.2.
Figure 16.2  A GeoGebra [78] snapshot that illustrates Corollary 16.6. Three nonintersecting closed convex sets, C₁ (the blue triangle), C₂ (the red polygon) and C₃ (the green circle), are shown along with their translations forming the generalized intersection. The first few terms of the sequence (e(P_U T^n x))_{n∈N} (yellow points) are also depicted. Here e : U → R² : (x, x, x) ↦ x.
Glossary of Notation and Symbols

Real line:
N  The set of natural numbers {0, 1, 2, 3, . . .}
Z  The set of integers {. . . , −2, −1, 0, 1, 2, . . .}
R  The set of real numbers
R+  The set of positive real numbers [0, +∞[
R++  The set of strictly positive real numbers ]0, +∞[
R−  The set of negative real numbers ]−∞, 0]
R−−  The set of strictly negative real numbers ]−∞, 0[

Sets:
C̄  Closure of a set C
int C  Interior of a set C
ri C  Relative interior of a set C
sri C  Strong relative interior of a set C
conv C  Convex hull of a set C
aff C  Affine hull of a set C
cone C  Conic hull of a set C
C⊥  Orthogonal complement of a set C
par C  Linear space parallel to a set C
C⊖  Polar cone of a set C
C⊕  Dual cone of a set C
rec C  Recession cone of a set C
N_C  Normal cone operator of a set C
P_C  Orthogonal projection onto a set C
ball(x; r)  Closed ball with centre x and radius r

Functions:
Γ(X)  The set of all proper convex lower semicontinuous functions defined from X to ]−∞, +∞]
argmin f  The set of minimizers of a function f
epi f  Epigraph of a function f
ι_C  Indicator function of a set C
dom f  Domain of a function f
Prox_f  Proximal mapping of a function f
∂f  Subdifferential operator of a function f
f*  Fenchel conjugate of a function f
f∨  Reversal of a function f

Set-valued operators:
N_C  Normal cone operator of a set C
A : X ⇒ X  A is a set-valued operator on X
A⁻¹  Inverse of an operator A
A⊤  The reversal of an operator A
A⁻⊤  The reversal of the inverse of an operator A
dom A  Domain of an operator A
ran A  Range of an operator A
gra A  Graph of an operator A
zer A  Set of zeros of an operator A
J_A  Resolvent of an operator A
R_A  Reflected resolvent of an operator A
ker L  Kernel of a linear operator L
A □ B  Parallel sum of the operators A and B

Single-valued operators:
M⊤  Transpose of a matrix M
T : X → X  T is a single-valued operator on X defined everywhere
T(C)  Image of a set C by an operator T
T⁻¹(C)  Inverse image of a set C by an operator T
T|_S  The restriction of the map T to the set S
Fix T  The set of fixed points of an operator T
ker L  Kernel of a linear operator L
‖L‖  Norm of a linear operator L

Acknowledgements

First and foremost I thank my supervisors, Professor Heinz Bauschke and Professor Warren Hare. Their broad knowledge, remarkable supervision and incredible research abilities make this acknowledgment a challenge to write. I do not have enough good words to express my gratitude for their brilliant mentorship and unconditional guidance. Their support, encouragement and patience go beyond the work in this thesis. They never hesitate to put time and effort towards enhancing and polishing my academic skills, as a researcher, as a doctoral student, as a mentor, and as a teacher.

I deeply thank Professor Shawn Wang, the Graduate Program Coordinator at UBC (Okanagan campus), for his continuous support and for the remarkable research experience I gained from our collaboration. I also thank Professor Yong Gao for agreeing to serve on my thesis committee and for his constructive comments.

I thank Dr. Hung Phan and Dr. Minh Dao for sharing their knowledge and experience in computational techniques.

I recognize the support of Unit 5 members and my colleagues in the COCANA lab at UBC (Okanagan campus), who made this journey possible and cheerful. In particular, I thank Sarah Moffat, Julie Nutini, Chayne Planiden and Liangjin Yao.
Special thanks to Mohamed Yafia.

I thank UBC (Okanagan campus) for providing me with financial support over the years of my doctoral studies.

Finally, I thank my husband Ahmad, whose support always leaves me speechless; my daughter Hala, whose love and belief in me are the ultimate encouragement I have; my son Yaseen, who witnessed this journey since he was a day old with love and patience; and my parents, who have been and will always be my role models.

Dedication

To Ahmad, Hala, Yaseen and my parents.

Chapter 1: Introduction

The problem of finding a zero of the sum of two maximally monotone operators is of fundamental importance in optimization and variational analysis. Indeed, a classical problem in optimization is to find a minimizer of the sum of two proper convex lower semicontinuous functions. This problem can be modelled as

  find x ∈ X such that 0 ∈ (A + B)x,  (1.1)

where A and B are maximally monotone operators on X, namely the subdifferential operators of the functions under consideration. (For detailed discussions on problem (1.1) and the connection to optimization problems we refer the reader to [12], [42], [46], [50], [55], [58], [123], [125], [127], [128], [137], [140], [141], [142], and the references therein.)

Due to its general convergence results, the Douglas–Rachford algorithm has become a very popular splitting technique for solving the sum problem (1.1) (see, e.g., [64], [65] and [67] for applications of the method), provided that a solution exists.

As we shall explain below, the goal of this thesis is to explore the behaviour of the algorithm when the sum problem has no solution.

1.1 The goal of the thesis

The Douglas–Rachford algorithm was first introduced in [70] to numerically solve certain types of heat equations. Let T = T(A,B) be the Douglas–Rachford operator associated with the ordered pair (A, B), let J_A be the resolvent of A (see Definitions 5.2 and 3.13), and let zer(A + B) denote the set of zeros of A + B.
In their seminal work [96] (see also [95]), Lions and Mercier extended the algorithm to be able to find a zero of the sum of two, not necessarily linear and possibly multivalued, maximally monotone operators. Let X be a real Hilbert space and let x ∈ X. Lions and Mercier proved that the governing sequence (T^n x)_{n∈N} converges weakly to a fixed point of T, and that if A + B is maximally monotone, then the weak cluster points of the shadow sequence (J_A T^n x)_{n∈N} are solutions of (1.1). In [132], weak convergence of the shadow sequence, regardless of the maximal monotonicity of A + B, was established. The convergence proofs critically rely on the assumption that the sum problem is consistent, i.e., zer(A + B) ≠ ∅. Nonetheless, not every sum problem admits a solution: Suppose that A and B are normal cone operators of nonempty closed convex subsets of X. It is clear that zer(A + B) is equal to the intersection of the two sets — however, this intersection may be empty, in which case the sum problem does not have any solution.

In this context, the behaviour of the algorithm in the inconsistent setting, i.e., when the set of zeros of the sum is empty, was not fully understood, and our main motivation in this thesis is to better understand what the algorithm does in the absence of zeros of the sum. In fact, it is a fundamental problem in optimization to understand the problem in the case when there is no solution and the corresponding behaviour of associated optimization algorithms — perhaps the most famous occurrence is the birth of the least-squares method for solving inconsistent linear systems due to Gauss.
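The consistent/inconsistent dichotomy can be observed numerically. The sketch below is an illustration only, not material from the thesis: it iterates the Douglas–Rachford operator in its standard form T = Id − J_A + J_B(2J_A − Id) for an inconsistent feasibility problem, with A = N_U and B = N_V for U = ]−∞, 0] and V = [1, +∞[ in X = R, so that the resolvents reduce to the projections P_U and P_V. The sets and the starting point are illustrative choices.

```python
# Douglas-Rachford iteration for an inconsistent feasibility problem on the
# real line: U = ]-oo, 0] and V = [1, +oo[, so U ∩ V = ∅ and zer(A + B) = ∅.
# Here A = N_U and B = N_V, hence J_A = P_U and J_B = P_V.

def P_U(x):          # projection onto ]-oo, 0]
    return min(x, 0.0)

def P_V(x):          # projection onto [1, +oo[
    return max(x, 1.0)

def T(x):            # Douglas-Rachford operator T = Id - P_U + P_V(2 P_U - Id)
    return x - P_U(x) + P_V(2.0 * P_U(x) - x)

x = 0.25             # illustrative starting point
governing, shadow = [x], [P_U(x)]
for _ in range(20):
    x = T(x)
    governing.append(x)
    shadow.append(P_U(x))

# The governing sequence (T^n x) has no fixed point to approach: the
# increments T^{n+1}x - T^n x settle at the gap between the sets, v = 1.
increments = [b - a for a, b in zip(governing, governing[1:])]
print(increments[-1])   # -> 1.0
# The shadow sequence (P_U T^n x) converges to 0, the point of U nearest to V.
print(shadow[-1])       # -> 0.0
```

In this one-dimensional example the constant drift of the governing sequence equals the gap between the two sets, which is an instance of the minimal displacement vector studied later in the thesis, while the shadow sequence converges to a best approximation point.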
The inconsistent case often results in considerable technical difficulties: For instance, Newton's method for finding zeros, applied to a quadratic function without zeros, may lead to chaotic behaviour; see [104, Problem 7-a on page 72].

The goal of the work in this thesis is to understand the behaviour of the Douglas–Rachford algorithm in the possibly inconsistent case, to introduce new concepts that cope with the new situation, to develop a novel analysis that, unlike the analysis used in the consistent setting, does not rely on the existence of solutions of the sum problem, and to explore the possible algorithmic consequences.

1.2 Contributions in this thesis

In this section, we summarize the main contributions of the research that constitutes this thesis, in the same order they appear in the sequel.

Chapter 2 and Chapter 3 present standard material and basic facts from convex analysis and monotone operator theory.

Our new results start in Chapter 4. We systematically study Attouch–Théra duality for the sum problem. We provide new results related to Passty's parallel sum, to Eckstein and Svaiter's extended solution set, and to Combettes' fixed point description of the set of primal solutions. Furthermore, paramonotonicity is revealed to be a key property, as it allows for the recovery of all primal solutions given just one arbitrary dual solution.

In Chapter 5, we start by reviewing the bridge between the Douglas–Rachford splitting operator and the zeros of the sum. We provide a new description of the fixed point set of the Douglas–Rachford splitting operator (see Theorem 5.14). We also use the Douglas–Rachford operator to formulate new results on the relative geometry of the primal and dual solutions (in the sense of Attouch–Théra duality).

Chapter 6 presents new Fejér monotonicity principles. In particular, we point out that Lemma 6.3 is of critical importance in our main proofs.
It also presents a powerful generalization of the classical Fejér monotonicity principle (see [12, Theorem 5.5]).

Chapter 7 and Chapter 8 are dedicated to providing a detailed study of the behaviour of (firmly) nonexpansive operators whose fixed point set is possibly empty. We provide a precise description of the connection between the fixed point sets of the inner and outer shifts of nonexpansive operators. Under the additional assumption that the operator is affine, we are able to get stronger conclusions. As a consequence, we are able to provide new characterizations of strong and linear convergence of iterates of affine nonexpansive operators that were previously known only in the case of linear operators.

Chapter 9 focuses on the convergence analysis of the Douglas–Rachford algorithm in the consistent case. We provide a new proof of the weak convergence of the shadow sequence. We apply the new asymptotic results of Chapter 8 to the Douglas–Rachford algorithm when used to find a zero of the sum of two affine relations. This allows for stronger conclusions, including strong and linear convergence and more precise information about the limits of the governing and shadow sequences.

Chapter 10 contains a collection of results that explore the connection between the two Douglas–Rachford operators associated with the sum problem. We show that the reflectors of the operators involved in the sum act as bijections between the sets of fixed points of the two Douglas–Rachford operators. Moreover, in some special cases, we are able to relate the two sequences obtained by iterating the two operators. This allows for new conclusions regarding linear rates of convergence, as well as new sufficient conditions for finite convergence.

In Chapter 11, we introduce the concept of the normal problem associated with finding a zero of the sum of two maximally monotone operators. This new concept is a milestone in our analysis. If the original problem admits solutions, then the normal problem returns this same set of solutions. The normal problem may yield solutions when the original problem does not admit any; furthermore, it has attractive variational and duality properties. The normal problem proves to be useful in studying the behaviour of not only the Douglas–Rachford algorithm, but also other splitting techniques, namely the forward–backward algorithm, in the possibly inconsistent case (see [110]).

In Chapter 12, we focus on the ranges of the Douglas–Rachford operator and the corresponding displacement map. Under mild assumptions, we are able to provide explicit formulae for both ranges in terms of the domains and the ranges of the operators considered in the sum problem. Our formulae use near equality to describe the ranges, and we tighten our conclusions by providing a counterexample that shows that near equality is optimal. These results are important because the problem of finding a point in zer(A + B) has a solution if and only if the Douglas–Rachford operator has a fixed point; equivalently, if 0 is in the range of the displacement mapping. They also provide information on finding the minimal displacement vector that defines the normal problem introduced in Chapter 11.

Applications of the results in Chapter 12 are given in Chapter 13. We emphasize the case when the two operators in the sum problem are subdifferential operators — a typical setting in optimization.

Algorithmic consequences of our analysis start to appear in Chapter 14. We advance the understanding of the inconsistent case significantly by providing a complete proof of the full weak convergence in the convex feasibility setting, when A and B are normal cone operators of nonempty closed convex subsets of X. This is an important instance of the Douglas–Rachford algorithm, as it has often been applied in the context of feasibility problems in both convex and nonconvex settings (see, e.g., [1], [2], [43], [45], [58], [76], [77], [87], [88], and [94]). We also provide some sufficient conditions that guarantee the convergence of the shadow sequence in more general settings.

In Chapter 15, we prove the strong convergence of the shadow sequence when the sets are affine subspaces that do not necessarily intersect. We precisely identify the limit to be the best approximation solution closest to the starting point. Furthermore, we prove the rate of convergence to be linear when the sum of the two subspaces is closed.

In Chapter 16, we explore what happens when only one set is an affine subspace. Whether or not the two shadow sequences converge depends on the order of the operators used in the definition of the Douglas–Rachford operator. Using the Pierra product space technique, we also apply our results to solve a least squares problem involving three or more sets.

Finally, in Chapter 17, we give a brief summary of our results and provide a list of possible avenues of future research.

Chapter 2: Convex analysis: Overview

The underlying space for most of the results in this thesis is a real Hilbert space. Nonetheless, a few results hold more generally in Banach spaces. Unless otherwise specified, throughout this thesis

  X is a real Hilbert space,

with an inner product 〈·, ·〉 : X × X → R. The induced norm is denoted by ‖·‖. We start by recalling the basic definitions and facts we need from convex analysis.

2.1 Convex sets and projection operators

Definition 2.1 (convex set). A subset C of X is convex if (∀α ∈ ]0, 1[) αC + (1 − α)C ⊆ C, or equivalently (∀(x, y) ∈ C × C) ]x, y[ ⊆ C.

Definition 2.2 (cone). A subset C of X is a cone if R+C = C.

Example 2.3 (normal cone operator). Let C be a nonempty convex subset of X and let x ∈ X.
The normal cone operator of C is

  N_C : X ⇒ X : x ↦ { u ∈ X | sup〈C − x, u〉 ≤ 0 }, if x ∈ C; ∅, otherwise.  (2.1)

Let C be a subset of X. The convex hull of C, denoted by conv C, is the smallest convex set containing C. The affine hull of C, denoted by aff C, is the smallest affine set containing C. The conic hull of C, denoted by cone C, is the smallest cone containing C. The closure of C is denoted by C̄. The interior of C is denoted by int C. The relative interior of C, denoted ri C, is defined by ri C = { x ∈ aff C | ∃ε > 0 such that ball(x; ε) ∩ aff C ⊆ C }. The strong relative interior of C, denoted by sri C, is the interior with respect to the closed affine hull of C. The recession cone of C is rec C = { x ∈ X | x + C ⊆ C }. The polar cone of C is C⊖ = { u ∈ X | sup_{c∈C} 〈c, u〉 ≤ 0 }. The dual cone of C is C⊕ = −C⊖. The orthogonal complement of C is C⊥ = { y ∈ X | (∀x ∈ C) 〈x, y〉 = 0 }. When C is an affine subspace, the linear space parallel to C is par C = C − C. Let u ∈ X and let r > 0. We use ball(u; r) to denote the closed ball in X centred at u with radius r. Further notation is developed as necessary during the course of this thesis.

Definition 2.4 (projection). Let C be a nonempty closed convex subset of X and let x ∈ X. The projection (also known as the closest point mapping) of x onto C is the unique point in C, denoted by P_C x, that satisfies

  ‖x − P_C x‖ = inf_{c∈C} ‖x − c‖.  (2.2)

Fact 2.5 (characterization of the projection). Let C be a nonempty closed convex subset of X, let x ∈ X and let p ∈ X. Then

  p = P_C x ⇔ p ∈ C and (∀y ∈ C) 〈y − p, x − p〉 ≤ 0.  (2.3)

Proof. See [12, Theorem 3.14].

The following useful translation formula can be readily verified.

Fact 2.6. Let S be a nonempty subset of X, and let y ∈ X. Then

  (∀x ∈ X) P_{y+S} x = y + P_S(x − y).  (2.4)

Proof. See [12, Proposition 3.17].

Fact 2.7. Let C be a closed affine subspace of X and let x ∈ X. Then the following hold:

(i) P_C is affine.

If C is a closed linear subspace of X, then we additionally have:

(ii) P_C is linear.
(iii) P_{C⊥} = Id − P_C.

Proof. (i): See [12, Corollary 3.20(ii)]. (ii)&(iii): See [12, Corollary 3.22(iii)&(v)], respectively.

Lemma 2.8. Suppose that C is an affine subspace and that w ∈ (par C)⊥. Then

  (∀α ∈ R)(∀x ∈ X) P_C(x + αw) = P_C x.  (2.5)

Proof. Let a ∈ C. Then C = a + par C. Applying Fact 2.6 and Fact 2.7(ii), we have (∀α ∈ R)(∀x ∈ X) P_C(x + αw) = P_{a + par C}(x + αw) = a + P_{par C}(x + αw − a) = a + P_{par C}(x − a) + α P_{par C} w = a + P_{par C}(x − a) = P_{a + par C} x = P_C x.

The following facts regarding projection operators will be used in the sequel.

Fact 2.9 (Zarantonello 1971). Let C be a nonempty closed convex subset of X. Then ran(Id − P_C) = (rec C)⊖.

Proof. See [139, Theorem 3.1].

Fact 2.10. Let U and V be nonempty closed convex subsets of X such that U ⊥ V. Then U + V is convex and closed, and P_{U+V} = P_U + P_V.

Proof. See [21, Proposition 2.6].

2.2 Convex functions

Convex functions are major tools in optimization. The pleasant properties of convex functions are usually associated with the concept of lower semicontinuity. We start by recalling the following definitions.

Definition 2.11. Let f : X → [−∞, +∞]. Then f is lower semicontinuous if for every sequence (x_n)_{n∈N} in X, we have

  x_n → x implies that f(x) ≤ lim inf_{n→∞} f(x_n).  (2.6)

Definition 2.12. Let f : X → [−∞, +∞]. Then f is convex if (∀x ∈ X) (∀y ∈ X) (∀α ∈ ]0, 1[)

  f(αx + (1 − α)y) ≤ α f(x) + (1 − α) f(y).  (2.7)

Geometrically, a convex function is characterized by having its epigraph

  epi f = { (x, α) ∈ X × R | f(x) ≤ α }  (2.8)

be a convex subset of X × R.

Let f : X → [−∞, +∞]. Then f is proper if −∞ ∉ f(X) and (∃x ∈ X) such that f(x) < +∞. The domain of f is dom f = { x ∈ X | f(x) < +∞ }.

Definition 2.13. Let f : X → ]−∞, +∞] be proper and let x ∈ X. Then x is a minimizer of f if f(x) = inf f(X).
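Definition 2.4 together with Facts 2.5 and 2.6 can be illustrated with a small computation. The sketch below is an illustration, not part of the thesis: it projects onto the box C = [0, 1] × [0, 1] in R², where the closest-point mapping is componentwise clipping; the box, the translation y and the test point are arbitrary choices. Since the inequality in (2.3) is linear in y, checking it on the four corners of the box suffices for all y ∈ C.

```python
# Projection onto the box C = [0, 1] x [0, 1] in R^2 (closed and convex):
# componentwise clipping realizes the closest-point mapping of Definition 2.4.

def P_C(x):
    return tuple(min(max(t, 0.0), 1.0) for t in x)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x = (2.0, -0.5)
p = P_C(x)                      # -> (1.0, 0.0)

# Fact 2.5: p = P_C x  iff  p ∈ C and <y - p, x - p> <= 0 for all y ∈ C.
# The left side is linear in y, so the corners of the box suffice.
corners = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
assert all(dot((y1 - p[0], y2 - p[1]), (x[0] - p[0], x[1] - p[1])) <= 0
           for (y1, y2) in corners)

# Fact 2.6: P_{y+S} x = y + P_S(x - y) for any translation y.
y = (3.0, 3.0)
def P_yC(z):                    # projection onto y + C = [3, 4] x [3, 4]
    return tuple(min(max(t, 3.0), 4.0) for t in z)
lhs = P_yC(x)
rhs = tuple(yi + pi for yi, pi in zip(y, P_C((x[0] - y[0], x[1] - y[1]))))
assert lhs == rhs               # both equal (3.0, 3.0)
```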
The set of minimizers of f,

{x ∈ X | f (x) = inf f (X)}, (2.9)

is denoted by argmin f. Note that if argmin f ≠ ∅, then (2.9) becomes {x ∈ X | f (x) = min f (X)}. The set of proper convex lower semicontinuous functions defined on X shall be denoted by Γ(X).

Example 2.14. Let C be a subset of X and let ιC be the indicator function associated with C, defined by

ιC : X → ]−∞,+∞] : x ↦ { 0, if x ∈ C; +∞, if x ∉ C. (2.10)

Then ιC is proper if and only if C is nonempty, and ιC is lower semicontinuous if and only if C is closed. Moreover, epi ιC = C × R+ and hence ιC is convex if and only if C is convex.

Proof. See [12, Examples 1.25 and 8.3].

Definition 2.15. Let f ∈ Γ(X) and let x ∈ X. Then Prox_f x is the unique point in X that satisfies

min_{y∈X} ( f (y) + (1/2)‖x − y‖² ) = f (Prox_f x) + (1/2)‖x − Prox_f x‖². (2.11)

The operator Prox_f : X → X is the proximity operator or the proximal mapping of f.

2.3 Subdifferentiability and conjugation

Definition 2.16 (subdifferential operator). Let f : X → ]−∞,+∞] be proper. The subdifferential of f is the set-valued operator

∂f : X ⇒ X : x ↦ {u ∈ X | (∀y ∈ X) 〈y − x, u〉 + f (x) ≤ f (y)}. (2.12)

Let x ∈ X. Then f is subdifferentiable at x if ∂f (x) ≠ ∅. Moreover, if a proper convex function f is differentiable at x, then ∂f (x) = {∇f (x)}, by [137, Theorem 2.4.4(i)].

Example 2.17. Let C be a nonempty convex subset of X. Then ∂ιC = NC.

Proof. See [12, Example 16.12].

Fact 2.18 (Fermat's rule). Let f : X → ]−∞,+∞] be proper. Then

argmin f = zer ∂f = {x ∈ X | 0 ∈ ∂f (x)}. (2.13)

Proof. See [12, Theorem 16.2].

Definition 2.19 (conjugate function). Let f : X → [−∞,+∞]. The conjugate (also known as the Fenchel conjugate) of f is

f* : X → [−∞,+∞] : u ↦ sup_{x∈X} (〈x, u〉 − f (x)) (2.14)

and the biconjugate of f is f** = (f*)*.

Fact 2.20. Let f ∈ Γ(X), let x ∈ X and let u ∈ X. Then the following are equivalent:
(i) u ∈ ∂f (x).
(ii) f (x) + f*(u) = 〈x, u〉.
(iii) x ∈ ∂f*(u).

Proof. See [123, Theorem 23.5] or [12, Theorem 16.23].

Fact 2.21.
Let f ∈ Γ(X). Then the following hold:(i) dom ∂ f is a dense subset of dom f .(ii) Prox f = (Id+∂ f )−1.Proof. See [12, Corollary 16.29 & Proposition 16.34]. Let f : X → ]−∞,+∞]. Then f ∨ : X → ]−∞,+∞] : x 7→ f (−x).Fact 2.22 (Fenchel duality). Let f and g be functions in Γ(X) such that 0 ∈sri(dom f − dom g). Theninf( f + g)(X) = −min( f ∗ + g∗∨)(X). (2.15)Proof. See [12, Proposition 15.13]. 9Chapter 3(Firmly) nonexpansivemappings and monotoneoperatorsMost of the material in this chapter is standard. Facts without explicitreferences may be found in [12], [81], or [82]. However, we point out thatExample 3.36 and the results in Section 3.5 are new and appear in [38] and [39]respectively.3.1 Nonexpansive and firmly nonexpansive operatorsDefinition 3.1. Let T : X → X. Then(i) T is nonexpansive if (∀x ∈ X)(∀y ∈ X)‖Tx− Ty‖ ≤ ‖x− y‖. (3.1)(ii) T is firmly nonexpansive if (∀x ∈ X)(∀y ∈ X)‖Tx− Ty‖2 + ‖(Id−T)x− (Id−T)y‖2 ≤ ‖x− y‖2. (3.2)Clearly,T is firmly nonexpansive ⇒ T is nonexpansive. (3.3)Fact 3.2. Let T : X → X. Then the following are equivalent:(i) T is firmly nonexpansive.(ii) Id−T is firmly nonexpansive.(iii) 2T − Id is nonexpansive.(iv) (∀x ∈ X) (∀y ∈ X) ‖Tx− Ty‖2 ≤ 〈x− y, Tx− Ty〉.103.2. Monotone operators: Basic definitions and facts(v) (∀x ∈ X) (∀y ∈ X) 〈Tx− Ty, (Id−T)x− (Id−T)y〉 ≥ 0.Proof. See [81, Theorem 12.1] or [82, Section 11 in Chapter 1]. 3.2 Monotone operators: Basic definitions and factsLet A : X ⇒ X be an arbitrary possibly set-valued operator, i.e., (∀x ∈ X)Ax ⊆ X. The graph of A, denoted gra A, is defined asgra A ={(x, u) ∈ X× X ∣∣ u ∈ Ax}. (3.4)The domain of A, denoted dom A, is defined asdom A ={x ∈ X ∣∣ Ax 6= ∅}. (3.5)The range of A, denoted ran A, is defined asran A ={u ∈ X ∣∣ ∃x ∈ X such that u ∈ Ax}. (3.6)The set of zeros of A is written aszer A = A−1(0) ={x ∈ X ∣∣ 0 ∈ Ax}. (3.7)The inverse operator, denoted A−1 : X ⇒ X is defined in terms of its graphby gra A−1 ={(u, x)∣∣ (x, u) ∈ gra A}. 
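The equivalences in Fact 3.2 lend themselves to a quick numerical sanity check. The sketch below is a minimal illustration assuming NumPy; the operator T is the projector onto a box, which is firmly nonexpansive (cf. Example 3.14(ii) and Fact 3.13(i) below), and we verify inequality (3.2) together with the nonexpansiveness of the reflection 2T − Id from Fact 3.2(iii) on random pairs of points:

```python
import numpy as np

rng = np.random.default_rng(1)
T = lambda x: np.clip(x, 0.0, 1.0)  # projector onto the box [0,1]^3: firmly nonexpansive
R = lambda x: 2.0 * T(x) - x        # the corresponding reflection 2T - Id

for _ in range(100):
    x, y = rng.normal(size=3), rng.normal(size=3)
    # (3.2): ||Tx - Ty||^2 + ||(Id-T)x - (Id-T)y||^2 <= ||x - y||^2.
    lhs = (np.linalg.norm(T(x) - T(y)) ** 2
           + np.linalg.norm((x - T(x)) - (y - T(y))) ** 2)
    assert lhs <= np.linalg.norm(x - y) ** 2 + 1e-12
    # Fact 3.2(iii): 2T - Id is nonexpansive.
    assert np.linalg.norm(R(x) - R(y)) <= np.linalg.norm(x - y) + 1e-12
```

Replacing T by another firmly nonexpansive map, say x ↦ x/2, leaves both assertions intact.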
We say that A is at most single-valued if (∀x ∈ X) Ax is at most a singleton. We say that A : X ⇒ Xis a linear relation (respectively affine relation) if gra A is a linear subspace(respectively affine subspace) of X × X. For further information of affinerelations we refer the reader to [25]. It will be quite convenient to introducethe notationA> = (− Id) ◦ A ◦ (− Id). (3.8)An easy calculation shows that (A−1)> = (A>)−1, which motivates thenotationA−> = (A−1)> = (A>)−1. (3.9)(This is similar to the linear-algebraic notation A−T for invertible squarematrices.)Definition 3.3. Let A : X ⇒ X. Then113.2. Monotone operators: Basic definitions and facts(i) A is monotone if(∀(x, u) ∈ gr A)(∀(y, v) ∈ gr A) 〈x− y, u− v〉 ≥ 0. (3.10)(ii) A is strictly monotone if(∀(x, u) ∈ gr A)(∀(y, v) ∈ gr A) x 6= y⇒ 〈x− y, u− v〉 > 0.(3.11)(iii) A is uniformly monotone with modulus φ : R+ → [0,+∞] if φ isincreasing, vanishes only at 0, and(∀(x, u) ∈ gr A)(∀(y, v) ∈ gr A) 〈x− y, u− v〉 ≥ φ(‖x− y‖).(3.12)(iv) A is maximally monotone if it is monotone and it is impossible toproperly enlarge the graph of A while keeping monotonicity.The following fact is easy to verify.Fact 3.4. Let T : X → X be nonexpansive and let α ∈ [−1, 1]. Then Id+αT ismaximally monotone.Proof. See [12, Example 20.26]. The following deep facts are known in the literature of monotoneoperators. The proofs are beyond the scope of the thesis.Fact 3.5. Let f ∈ Γ(X). Then the following hold:(i) ∂ f is maximally monotone.(ii) (∂ f )−1 = ∂ f ∗.Proof. (i): See [122, Theorem A] or [12, Theorem 21.2]. (ii): See [122,Remark on page 216], [83, The´ore`me 3.1] or [12, Corollary 16.24]. Fact 3.6. Let A : X ⇒ X be maximally monotone such that gra A is convex.Then gra A is actually affine.Proof. See [23, Theorem 4.2]. Fact 3.7. Let A : R ⇒ R. Then A is maximally monotone if and only if (∃ f ∈Γ(X)) such that A = ∂ f .123.3. Resolvents of monotone operatorsProof. See [125, Exercise 12.26] or [12, Corollary 22.19]. Fact 3.8. 
Let A : X ⇒ X be maximally monotone with bounded domain. Then Ais surjective.Proof. See [12, Corollary 21.21]. The following fact can be readily verified.Fact 3.9. Let A : X ⇒ X be maximally monotone and let x ∈ X. Then Ax isclosed and convex.Proof. See [12, Proposition 20.31]. Example 3.10. Suppose that C is nonempty closed convex subset of X.Then, in view of Example 2.17, Example 2.14 and Fact 3.5(i), NC is maxi-mally monotone.3.3 Resolvents of monotone operatorsDefinition 3.11 (resolvent and reflected resolvent). Let A : X ⇒ X. Theresolvent of A is the operatorJA = (Id+A)−1, (3.13)and the reflected resolvent isRA = 2JA − Id . (3.14)Fact 3.12 (monotonicity and firm nonexpansiveness). Let D be a nonemptysubset of X, let T : D → X, let A : X ⇒ X, and suppose that A = T−1 − Id.Then the following hold:(i) T = JA.(ii) A is monotone if and only if T is firmly nonexpansive.(iii) A is maximally monotone if and only if T is firmly nonexpansive and D =X.Proof. See [72, Theorem 2]. Let T : X → X. The set of fixed points of T is Fix T = {x ∈ X ∣∣ Tx = x}.133.3. Resolvents of monotone operatorsFact 3.13. Suppose that A is maximally monotone. Then the following hold:(i) JA is firmly nonexpansive and dom JA = X.(ii) RA is nonexpansive.(iii) We have the following inverse resolvent identityJA + JA−1 = Id . (3.15)(iv) zer A = Fix JA ={x ∈ X ∣∣ JAx = x}.Proof. (i): This follows from [105, Corollary on page 344] and [124,Proposition 1(c)]. Alternatively use Fact 3.12. (ii): Combine (i) and Fact 3.2.(iii): This follows from [125, Lemma 12.14]. (iv): Clear. Example 3.14. The following are examples of resolvents of monotone op-erators.(i) Let f ∈ Γ(X) and suppose that A = ∂ f . Then JA = Prox f , theproximal point mapping of the function f (see Definition 2.15).(ii) Let C be a nonempty closed convex subset of X and suppose thatA = NC. Then JA = ProxιC = PC, the projection onto the set C.Proof. See [12, Examples 23.3 & 23.4]. We obtain the following useful lemma.Lemma 3.15. 
Let C be a nonempty closed convex subset of X and let c ∈ C satisfythat ‖c‖ = ‖PC0‖. Then c = PC0.Proof. In view of Example 3.14(ii) and Fact 3.13(i) we learn that PC is firmlynonexpansive. Therefore we have‖c− PC0‖2 = ‖PCc− PC0‖2 (3.16a)≤ ‖c− 0‖2 − ‖(Id−PC)c− (Id−PC)0‖2 (3.16b)= ‖c‖2 − ‖c− PCc + PC0‖2 = ‖c‖2 − ‖PC0‖2 = 0. (3.16c)143.3. Resolvents of monotone operatorsProposition 3.16. Suppose that A : X → X is continuous, linear, and single-valued such that A and −A are monotone, and A2 = −α Id, where α ∈ R+.ThenJA =11+ α(Id−A) and RA = 1− α1+ α Id− 21+ αA. (3.17)Proof. We haveJA J−A = (Id+A)−1(Id−A)−1 =((Id−A)(Id+A))−1 (3.18a)= (Id−A2)−1 = (Id+α Id)−1 (3.18b)=11+ αId . (3.18c)It follows that JA = (1+ α)−1 (J−A)−1 = (1+ α)−1(Id−A) and hence thatRA = 2JA − Id = 21+ α (Id−A)− Id =1− α1+ αId− 21+ αA, (3.19)as claimed. Example 3.17. Suppose that X = R2 and that A : R2 → R2 : (x, y) 7→(−y, x) is the rotator by pi/2. Then A2 = − Id; consequently, by Propo-sition 3.16, JA = (1/2)(Id−A) and RA = −A.We recall the following very useful Minty parametrization.Fact 3.18 (Minty parametrization). Let A : X ⇒ X be maximally monotone.Then M : X → gra A : x 7→ (JAx, JA−1 x) is a continuous bijection, withcontinuous inverse M−1 : gra A→ X : (x, u) 7→ x + u; consequently,gra A = M(X) ={(JAx, x− JAx)∣∣ x ∈ X}. (3.20)Proof. See [105]. Lemma 3.19. Let A : X ⇒ X. Then dom A = ran JA.Proof. Indeed, dom A = X ∩ dom A = dom Id∩dom A = dom(Id+A) =ran(Id+A)−1 = ran JA. Applying Lemma 3.19 to A−1 and using (3.15), we obtainran A = dom A−1 = ran JA−1 = ran(Id−JA). (3.21)The following fact could be directly verified.153.4. Sums of monotone operators: maximality and range of the sumFact 3.20 (resolvent calculus). Let A : X ⇒ X be maximally monotone and letw ∈ X. Then the following hold:(i) Suppose that B = w + A. Then JB = JA(· − w).(ii) Suppose that B = A(·+ w). Then JB = −w + JA(·+ w)Proof. See [12, Proposition 23.15(ii)&(iii)]. The following lemma shall be useful in our work.Lemma 3.21. 
Let C be a nonempty closed convex subset of X and let w ∈ X.Then the following hold:(i) NC(· − w) = Nw+C.(ii) J−w+NC = JNC(·+ w) = PC(·+ w).(iii) JNC(·−w) = w + JNC(· − w) = w + PC(· − w).Proof. (i): One can easily verify that (∀w ∈ X) we have ιC(· − w) = ιw+C.Therefore NC(· − w) = ∂ιC(· − w) = ∂ιw+C = Nw+C. (ii)&(iii): ApplyFact 3.20(i)&(ii) with w replaced by −w and use Example 3.14(ii). 3.4 Sums of monotone operators: maximality andrange of the sumIn this section we review further notions of monotonicity that allow forstronger conclusions, as we shall demonstrate.Definition 3.22 (paramonotone operator). An operator C : X ⇒ X isparamonotone, if it is monotone and we have(x, u) ∈ gra C(y, v) ∈ gra C〈x− y, u− v〉 = 0 ⇒ {(x, v), (y, u)} ⊆ gra C. (3.22)Remark 3.23. Paramonotonicity has proven to be a very useful property forfinding solution of variational inequalities by iteration; see, e.g., [89], [57], [54],[112], and [84].163.4. Sums of monotone operators: maximality and range of the sumLet L : X → X be linear. Recall that the kernel of L is ker L ={x ∈ X ∣∣ Lx = 0}. We now provide examples of paramonotone operatorsbelow.Example 3.24. Each of the following is paramonotone.(i) ∂ f , where f ∈ Γ(X) (see [89, Proposition 2.2] or [12, 22.3(i)]).(ii) C : X ⇒ X, where C is strictly monotone.(iii) Rn → Rn : x 7→ Lx + b, where L ∈ Rn×n, b ∈ Rn, L+ = 12 L + 12 LT,ker L+ ⊆ ker L, and L+ is positive semidefinite [89, Proposition 3.1].Example 3.25. Suppose that X = R2 and that A : R2 → R2 : (x, y) 7→(y,−x). Then one can easily verify that A and−A are maximally monotonebut not paramonotone by [89, Section 3] (or [32, Theorem 4.9]).For further examples, and detailed discussion on paramonotone opera-tors see [89]. It is straightforward to check that for C : X ⇒ X, we haveC is paramonotone⇔ C−1 is paramonotone⇔ C> is paramonotone (3.23)⇔ C−> is paramonotone.Definition 3.26 (3∗ monotone operator). (See [51, page 166].) Let A : X ⇒X be monotone. 
Then A is 3∗ monotone (this is also known as rectangular) if(∀x ∈ dom A)(∀v ∈ ran A) inf(z,w)∈gra A〈x− z, v− w〉 > −∞. (3.24)When C is a continuous linear monotone operator, then C is paramono-tone if and only if C is 3* monotone; see [22, Section 4].Fact 3.27. Let f ∈ Γ(X). Then ∂ f is 3∗ monotone.Proof. See [51, page 167]. Corollary 3.28. Let A : R ⇒ R be monotone. Then A is 3∗ monotone andparamonotone.Proof. In view of Fact 3.7, there exists f ∈ Γ(R) such that gra A ⊆ gra ∂ f .The result now follows from Fact 3.27 and Example 3.24(i). It is clear that sum of two maximally monotone operators is monotone.This sum is maximally monotone if an appropriate constraint qualificationis imposed, as we see now.173.4. Sums of monotone operators: maximality and range of the sumFact 3.29. Let A : X ⇒ X and B : X ⇒ X be maximally monotone such that oneof the following conditions holds:(i) dom B = X.(ii) dom A ∩ int dom B 6= ∅.(iii) 0 ∈ int(dom A− dom B).(iv) dom A and dom B are convex and 0 ∈ sri(dom A− dom B).Then A + B is maximally monotone.Proof. See [12, Corollary 24.4]. Before we proceed further, we need to recall the following definitionsand facts.Definition 3.30 (near convexity). Let X be finite-dimensional and let D bea subset of X. Then D is nearly convex if there exists a convex subset C of Xsuch that C ⊆ D ⊆ C.Fact 3.31. Let X be finite-dimensional and let A : X ⇒ X be maximallymonotone. Then dom A and ran A are nearly convex.Proof. See [125, Theorem 12.41]. Definition 3.32 (near equality). (See [29, Definition 2.3].) Let X be finite-dimensional and let C and D be subsets of X. We say that C and D arenearly equal ifC ' D :⇔ C = D and ri C = ri D. (3.25)Fact 3.33. Let X be finite-dimensional and let D be a nonempty nearly convexsubset of X, say C ⊆ D ⊆ C, where C is a convex subset of X. ThenD ' D ' ri D ' conv D ' ri conv D ' C. (3.26)In particular, D and ri D are convex and nonempty.Proof. See [29, Lemma 2.7]. Fact 3.34. Suppose that X is finite-dimensional. 
Let A : X ⇒ X and B : X ⇒ Xbe maximally monotone such that ri dom A ∩ ri dom B 6= ∅. Then A + B ismaximally monotone.183.4. Sums of monotone operators: maximality and range of the sumProof. See [125, Corollary 12.44]. Fact 3.35 (Brezis–Haraux). Let A : X ⇒ X and B : X ⇒ X be monotoneoperators such that A + B is maximally monotone and one of the followingconditions holds:(i) A and B are 3∗ monotone.(ii) dom A ⊆ dom B and B is 3∗ monotone.Thenran(A + B) = ran A + ran B and int ran(A + B) = int(ran A + ran B).(3.27)If X is finite-dimensional, thenran(A + B) is nearly convex and ran(A + B) ' ran A + ran B. (3.28)Proof. See [51, Theorems 3 and 4] for the proof of (3.27) and [29,Theorem 3.13] for the proof of (3.28). Example 3.36 (and later Proposition 12.17 below) illustrate that theresults of Fact 3.35 are sharp in the sense that actual equality fails.Example 3.36. Suppose that X = R2 and let f : R2 →]−∞,+∞] : (ξ1, ξ2) 7→max {g(ξ1), |ξ2|}, where g(ξ1) = 1−√ξ1 if ξ1 ≥ 0, g(ξ1) = +∞ if ξ1 < 0.Set A = ∂ f ∗. Then A is 3∗ monotone and 2A = A + A is maximally monotone,yet2 ran A = ran 2A = ran(A + A) $ ran A + ran A. (3.29)Proof. First notice that by Fact 3.27 A is 3∗ monotone. Moreover, since A ismaximally monotone, it follows from Fact 3.33 that ri dom A is nonemptyand convex. Since ri dom A ∩ ri dom A = ri dom A 6= ∅, Fact 3.34 impliesthat A + A = 2A is maximally monotone. Using [12, Proposition 16.24]and [123, example on page 218] we know thatran ∂ f ∗ = ran(∂ f )−1 = dom ∂ f (3.30a)={(ξ1, ξ2)∣∣ ξ1 > 0} ∪ {(0, ξ2) ∣∣ |ξ2| ≥ 1} (3.30b)andran(A + A) ={(ξ1, ξ2)∣∣ ξ1 > 0} ∪ {(0, ξ2) ∣∣ |ξ2| ≥ 2} (3.31a)6= {(ξ1, ξ2) ∣∣ ξ1 ≥ 0} = ran A + ran A. (3.31b)193.5. Examples of linear monotone operators3.5 Examples of linear monotone operatorsIn this section we present numerous examples of linear monotone op-erators that are partly motivated by applications in partial differentialequations; see, e.g., [80] and [133]. 
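Many of the matrix examples in this section can be explored numerically. The following minimal sketch (assuming NumPy; the helper name `is_monotone` is ours, not the thesis's notation) tests monotonicity of a matrix through positive semidefiniteness of its symmetric part, the criterion recorded in (3.32) below, on instances matching Lemma 3.37 and Example 3.39:

```python
import numpy as np

def is_monotone(M, tol=1e-12):
    # (3.32): M is monotone iff (M + M^T)/2 is positive semidefinite,
    # i.e. iff all eigenvalues of the symmetric part are nonnegative.
    return float(np.min(np.linalg.eigvalsh(0.5 * (M + M.T)))) >= -tol

# Lemma 3.37 with alpha = delta = 1, beta = 3, gamma = -1:
# 4*alpha*delta = 4 >= (beta + gamma)^2 = 4, hence monotone.
assert is_monotone(np.array([[1.0, 3.0], [-1.0, 1.0]]))
# A rotator is monotone: its symmetric part is the zero matrix.
assert is_monotone(np.array([[0.0, -1.0], [1.0, 0.0]]))
# Example 3.39 with xi = 3: both eigenvalues equal 1, yet M is not monotone.
assert not is_monotone(np.array([[1.0, 3.0], [0.0, 1.0]]))
```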
We also pay attention to tridiagonal Toeplitz matrices and Kronecker products. These examples are not only helpful in understanding the connection between the classical Douglas–Rachford setting (see Definition 5.1 below) and modern monotone operator theory, but they also admit explicit forms which should turn out to be useful in applications.

Let M ∈ R^{n×n}. Then we have the following equivalences:

M is monotone ⇔ (M + M^T)/2 is positive semidefinite (3.32a)
⇔ the eigenvalues of (M + M^T)/2 lie in R+. (3.32b)

Lemma 3.37. Let

M = [ α β
      γ δ ] ∈ R^{2×2}. (3.33)

Then M is monotone if and only if α ≥ 0, δ ≥ 0 and 4αδ ≥ (β + γ)².

Proof. Indeed, the principal minors of M + M^T are 2α, 2δ and 4αδ − (β + γ)², by [102, (7.6.12) on page 566].

Note that if M = M^T, then M is monotone if and only if the eigenvalues of M lie in R+. If M ≠ M^T, then some information about the location of the (possibly complex) eigenvalues of M is available:

Lemma 3.38. Let M ∈ R^{n×n} be monotone, and let {λk}_{k=1}^n denote the set of eigenvalues of M. Then Re(λk) ≥ 0 for every k ∈ {1, . . . , n}, where Re(z) denotes the real part of the complex number z.

Proof. Write λ = α + iβ, where α and β belong to R and i = √−1, and assume that λ is an eigenvalue of M with (nonzero) eigenvector w = u + iv, where u and v are in R^n. Then (M − λ Id)w = 0 ⇔ ((M − α Id) − iβ Id)(u + iv) = 0 ⇔ (M − α Id)u + βv = 0 and (M − α Id)v − βu = 0. Hence

〈u, (M − α Id)u〉 + β〈u, v〉 = 0, (3.34a)
〈v, (M − α Id)v〉 − β〈v, u〉 = 0. (3.34b)

Adding (3.34) yields 〈u, (M − α Id)u〉 + 〈v, (M − α Id)v〉 = 0; equivalently, 〈u, Mu〉 + 〈v, Mv〉 − α‖u‖² − α‖v‖² = 0. Solving for α yields

Re(λ) = α = (〈u, Mu〉 + 〈v, Mv〉)/(‖u‖² + ‖v‖²) ≥ 0, (3.35)

as claimed.

The converse of Lemma 3.38 is not true in general, as we demonstrate in the following example.

Example 3.39. Let ξ ∈ R ∖ [−2, 2], and set

M = [ 1 ξ
      0 1 ]. (3.36)

Then M has 1 as its only eigenvalue (with multiplicity 2), M is not monotone by Lemma 3.37, and M is not symmetric.

Proposition 3.40.
Consider the tridiagonal Toeplitz matrixM =β γ 0α. . . . . .. . . . . . γ0 α β ∈ Rn×n. (3.37)Then M is monotone if and only if β ≥ |α+ γ| cos(pi/(n + 1)).Proof. Note that12 (M + MT) =β 12 (α+ γ) 012 (α+ γ). . . . . .. . . . . . 12 (α+ γ)0 12 (α+ γ) β . (3.38)By (3.32a), M is monotone ⇔ 12 (M + MT) is positive semidefinite. Ifα + γ = 0, then 12 (M + MT) = β Id and therefore 12 (M + MT) is positivesemidefinite⇔ β ≥ 0 = |α+ γ|. Now suppose that α+ γ 6= 0. It followsfrom [102, Example 7.2.5] that the eigenvalues of 12 (M + MT) areλk = β+ (α+ γ) cos( kpin+1), (3.39)213.5. Examples of linear monotone operatorswhere k ∈ {1, . . . , n}. Consequently,{λk}nk=1 ⊆ R+ ⇔ β ≥∣∣∣(α+ γ) cos ( pi(n+1)) ∣∣∣. (3.40)The characterization of monotonicity of M now follows from (3.32b). Proposition 3.41. LetM =β γ 0α. . . . . .. . . . . . γ0 α β ∈ Rn×n. (3.41)Then exactly one of the following holds:(i) αγ = 0 and det(M) = βn. Consequently M is invertible ⇔ β 6= 0, inwhich case[M−1]i,j = (−α)max{i−j,0}(−γ)max{j−i,0}βmin{j−i,i−j}−1. (3.42)(ii) αγ 6= 0. Set r = 12α (−β+√β2 − 4αγ), s = 12α (−β−√β2 − 4αγ) andΛ ={β+ 2γ√α/γ cos(kpi/(n + 1))∣∣ k ∈ {1, 2, . . . , n}}. Then rs 6= 0.Moreover, M is invertible2 ⇔ 0 6∈ Λ, in which caser 6= s⇒ [M−1]i,j= −γj−1(rmin{i,j} − smin{i,j})(rn+1smax{i,j} − rmax{i,j}sn+1)αj(r− s)(rn+1 − sn+1) ,(3.43a)r = s⇒ [M−1]i,j= −γj−1 min {i, j} (n + 1−max {i, j})ri+j−1αj(n + 1). (3.43b)Alternatively, define the recurrence relationsu0 = 0, u1 = 1, uk = − 1γ (αuk−2 + βuk−1), k ≥ 2; (3.44a)vn+1 = 0, vn = 1, vk = − 1α (βvk+1 + γvk+2), k ≤ n− 1. (3.44b)Then[M−1]i,j = −umin{i,j}vmax{i,j}αv0(γα)j−1. (3.45)2In the special case, when β = 0, this is equivalent to saying that M is invertible⇔ n iseven.223.5. Examples of linear monotone operatorsProof. (i): αγ = 0 ⇔ α = 0 or γ = 0, in which case M is a (lower orupper) triangular matrix. Hence det(M) = βn, and the characterizationfollows. The formula in (3.42) is easily verified. 
(ii): Note that 0 ∈ {r, s} ⇔β ∈ {±√β2 − 4αγ} ⇔ β2 = β2 − 4αγ⇔ αγ = 0. Hence rs 6= 0. Moreover,it follows from [102, Example 7.2.5] that Λ is the set of eigenvalues of M;therefore, M is invertible⇔ 0 6∈ Λ. The formulae (3.43) follow from [136,Remark 2 on page 110]. The recurrence formulae defined in (3.44) and [136,Theorem 2] yield (3.45). Remark 3.42. Concerning Proposition 3.41, using [134, Section 2 on page 44]we also have the alternative formulaer 6= s⇒ [M−1]i,j =− 1γs−i − r−is−1 − r−1s−n+j−1 − r−n+j−1s−(n+1) − r−(n+1) , j ≥ i;− 1αsj − rjs− rsn−i+1 − rn−i+1sn+1 − rn+1 , j ≤ i,(3.46a)r = s⇒ [M−1]i,j =− iγ(1− jn + 1)rj−i+1, j ≥ i;− jα(1− in + 1)rj−i−1, j ≤ i.(3.46b)Using the binomial expansion, (3.46a), and a somewhat tedious calculation whichwe omit here and provided that r 6= s, one can show that [M−1]i,j is equal to2S1S2(2α)min{0,j−i}(2γ)min{0,i−j}S3, (3.47)whereS1 =⌈min{i,j}2 −1⌉∑m=0(min {i, j}2m + 1)(−β)min{i,j}−(2m+1)(β2 − 4αγ)m, (3.48a)S2 =⌈(n+1−max{i,j})2⌉−1∑m=0(n−max {i, j}+ 12m + 1)(−β)n−max{i,j}−2m(β2 − 4αγ)m,(3.48b)S3 =⌈(n+1)2⌉−1∑m=0(n + 12m + 1)(−β)n−2m(β2 − 4αγ)m. (3.48c)233.5. Examples of linear monotone operatorsExample 3.43. Let β ≥ 2, setM =β −1 0−1 . . . . . .. . . . . . −10 −1 β ∈ Rn×n, (3.49)set r = 12 (β+√β2 − 4) and set s = 12 (β−√β2 − 4). Then M is monotoneand invertible. Moreover,r 6= s⇒ [M−1]i,j = (rmin{i,j} − smin{i,j})(rn+1smax{i,j} − rmax{i,j}sn+1)(r− s)(rn+1 − sn+1) ,(3.50a)r = s⇒ [M−1]i,j = min {i, j} (n + 1−max {i, j})n + 1 . (3.50b)Alternatively, define the recurrence relationsu0 = 0, u1 = 1, uk = βuk−1 − uk−2, k ≥ 2, (3.51a)vn+1 = 0, vn = 1, vk = βvk+1 − vk+2, k ≤ n− 1. (3.51b)Then[M−1]i,j = −umin{i,j}vmax{i,j}v0. (3.52)Proof. The monotonicity of M follows from Proposition 3.40 by noting thatβ ≥ 2 > 2 cos(pi/(n + 1)). The same argument implies that0 6∈ Λ ={β− 2 cos ( kpin+1) ∣∣∣ k ∈ {1, 2, . . . , n}} . (3.53)Hence M is invertible by Proposition 3.41(ii). Note that β = 2 ⇔ β2 − 4 =0⇔ r = s = 1. 
Now apply Proposition 3.41(ii).

Let M1 = [αi,j]_{i,j=1}^n ∈ R^{n×n} and M2 = [βi,j]_{i,j=1}^n ∈ R^{n×n}. Recall that the Kronecker product of M1 and M2 (see [93, page 407] or [102, Exercise 7.6.10]) is defined by the block matrix

M1 ⊗ M2 = [αi,j M2] ∈ R^{n²×n²}. (3.54)

Lemma 3.44. Let M1 and M2 be symmetric matrices in R^{n×n}. Then M1 ⊗ M2 ∈ R^{n²×n²} is symmetric.

Proof. Using [102, Exercise 7.8.11(a)] or [93, Proposition 1(e) on page 408] we have (M1 ⊗ M2)^T = M1^T ⊗ M2^T = M1 ⊗ M2.

The following fact is very useful in the proofs of the upcoming results.

Fact 3.45. Let M1 and M2 be in R^{n×n}, with eigenvalues {λk | k ∈ {1, . . . , n}} and {µk | k ∈ {1, . . . , n}}, respectively. Then the eigenvalues of M1 ⊗ M2 are {λjµk | j, k ∈ {1, . . . , n}}.

Proof. See [93, Corollary 1 on page 412] or [102, Exercise 7.8.11(b)].

Corollary 3.46. Let M1 and M2 in R^{n×n} be monotone such that M1 or M2 is symmetric. Then M1 ⊗ M2 is monotone.

Proof. According to (3.32), it suffices to show that all the eigenvalues of M1 ⊗ M2 + (M1 ⊗ M2)^T are nonnegative. Suppose first that M1 is symmetric. Then using [93, Proposition 1(e)&(c)] we have M1 ⊗ M2 + (M1 ⊗ M2)^T = M1 ⊗ M2 + M1 ⊗ M2^T = M1 ⊗ (M2 + M2^T). Since M2 is monotone, it follows from (3.32) that all the eigenvalues of M2 + M2^T are nonnegative. Now apply Fact 3.45 to M1 and M2 + M2^T to learn that all the eigenvalues of M1 ⊗ M2 + (M1 ⊗ M2)^T are nonnegative; hence M1 ⊗ M2 is monotone by (3.32b). A similar argument applies if M2 is symmetric.

Note that the assumption that at least one matrix is symmetric is critical in Corollary 3.46, as we show in the next example.

Example 3.47. Suppose that

M = [ 0 −1
      1  0 ]. (3.55)

Then M is monotone, with eigenvalues {±i}, but not symmetric. However,

M ⊗ M = [ 0  0  0  1
          0  0 −1  0
          0 −1  0  0
          1  0  0  0 ] (3.56)

is a symmetric matrix with eigenvalues {±1} by Fact 3.45. Therefore M ⊗ M is not monotone by (3.32).

Proposition 3.48. Let M ∈ R^{n×n} be symmetric.
Then Id⊗M is monotone ⇔M⊗ Id is monotone⇔ M is monotone, in which case we haveJIdn ⊗M = Idn⊗JM and JM⊗Idn = JM ⊗ Idn . (3.57)Proof. In view of Fact 3.45 the sets of eigenvalues of Id⊗M, M ⊗ Id, andM coincide. It follows from Lemma 3.44 that Id⊗M and M⊗ Id are sym-metric. Now apply (3.32b) and use the monotonicity of M. To prove (3.57),we use [93, Proposition 1(c) on page 408] to learn that Idn2 + Idn⊗M =Idn⊗ Idn + Idn⊗M = Idn⊗(Idn +M). Therefore, we learn from [93,Corollary 1(b) on page 408] thatJIdn ⊗M = (Idn2 + Idn⊗M)−1 = (Idn⊗(Idn +M))−1 = Idn⊗(Idn +M)−1= Idn⊗JM. (3.58)The other identity in (3.57) is proved similarly. Corollary 3.49. Let β ∈ R. SetM[β] =β −1 0−1 . . . . . .. . . . . . −10 −1 β ∈ Rn×n, (3.59)and let M→ and M↑ be block matrices in Rn2×n2 defined byM→ =M[β] 0n 0n0n. . . . . .. . . . . . 0n0n 0n M[β] (3.60)andM↑ =β Idn − Idn 0n− Idn . . . . . .. . . . . . − Idn0n − Idn β Idn . (3.61)Then M→ = Idn⊗M[β] and M↑ = M[β] ⊗ Idn. Moreover, M→ is monotone⇔M↑ is monotone⇔ M[β] is monotone⇔ β ≥ 2 cos(pi/(n + 1)), in which caseJM→ = Idn⊗M−1[β+1] and JM↑ = M−1[β+1] ⊗ Idn . (3.62)263.5. Examples of linear monotone operatorsProof. It is straightforward to verify that M→ = Idn⊗M[β] andM↑ = M[β] ⊗ Idn . It follows from Proposition 3.40 that M[β] is monotone⇔ β ≥ 2 cos ( pin+1). Now combine with Proposition 3.48. To prove (3.62)note that Idn +M[β] = M[β+1], and therefore JM[β] = M−1[β+1]. The conclusionfollows by applying (3.57). The above matrices play a key role in the original design of theDouglas–Rachford algorithm — see the Section 5.6 for details.Proposition 3.50. Let n ∈ {2, 3, . . .}, let M ∈ Rn×n and consider the blockmatrixM =M − Idn 0n− Idn . . . . . .. . . . . . − Idn0n − Idn M . (3.63)Let x = (x1, x2, . . . , xn) ∈ Rn2 , where xi ∈ Rn, i ∈ {1, 2, . . . n}. Then〈x, Mx〉 = 〈x1, (M− Id)x1〉+n−1∑k=2〈xk, (M− 2 Id)xk〉+〈xn, (M− Id)xn〉+n−1∑i=1‖xi − xi+1‖2.(3.64)Moreover the following hold:(i) Suppose that n = 2. 
Then M − Id is monotone ⇔ the block matrix M is monotone.
(ii) If M − 2 Id is monotone, then the block matrix M is monotone.
(iii) If the block matrix M is monotone, then M − 2(1 − 1/n) Id is monotone; in turn, this implies that M is monotone.

Proof. Writing M for the block matrix (3.63) and expanding blockwise, we have

〈x, Mx〉 = 〈(x1, x2, . . . , xn), (Mx1 − x2, −x1 + Mx2 − x3, . . . , −xn−1 + Mxn)〉
= 〈x1, Mx1〉 + 〈x2, Mx2〉 + · · · + 〈xn, Mxn〉 − 2〈x1, x2〉 − 2〈x2, x3〉 − · · · − 2〈xn−1, xn〉.

Since −2〈xi, xi+1〉 = ‖xi − xi+1‖² − ‖xi‖² − ‖xi+1‖², this becomes

〈x, Mx〉 = 〈x1, (M − Id)x1〉 + ( Σ_{k=2}^{n−1} 〈xk, (M − 2 Id)xk〉 ) + 〈xn, (M − Id)xn〉 + Σ_{i=1}^{n−1} ‖xi − xi+1‖²,

which is (3.64).

(i): "⇒": Apply (3.64) with n = 2. "⇐": Let y ∈ R². Applying (3.64) to the point x = (y, y) ∈ R⁴, we get 0 ≤ 〈x, Mx〉 = 2〈y, (M − Id)y〉.
(ii): This is clear from (3.64).
(iii): Let y ∈ Rⁿ. Applying (3.64) to the point x = (y, y, . . . , y) ∈ R^{n²} yields 0 ≤ 〈x, Mx〉 = 2〈y, (M − Id)y〉 + (n − 2)〈y, (M − 2 Id)y〉 = 〈y, (nM − 2(n − 1) Id)y〉. Therefore, M − 2(1 − 1/n) Id is monotone; since M = (M − 2(1 − 1/n) Id) + 2(1 − 1/n) Id is then a sum of monotone operators, M is monotone as well.

The converse of Proposition 3.50(ii) is not true in general, as we illustrate now.

Example 3.51. Set n = 2, set

M = [ 1 −1
      1  1 ], (3.65)

and let the block matrix M be as defined in Proposition 3.50. Then one verifies easily that the block matrix M is monotone while M − 2 Id is not.

We now show that the converses of the implications in Proposition 3.50(iii) are not true in general.

Example 3.52. Set n = 2, set M = (1/2) Id ∈ R^{2×2}, and let the block matrix M be as defined in Proposition 3.50. Then M is monotone but M − 2(1 − 1/2) Id = −(1/2) Id is not monotone, and the block matrix is

M = (1/2) [ 1  0 −2  0
            0  1  0 −2
           −2  0  1  0
            0 −2  0  1 ].
(3.66)

Note that M is symmetric and has eigenvalues {−1/2, 3/2}; hence M is not monotone by (3.32).

Chapter 4
Attouch–Théra duality and paramonotonicity

4.1 Overview

In this chapter, we systematically study Attouch–Théra duality for the problem of finding a zero of the sum of two maximally monotone operators. Let us now summarize our main results.

• We observe a curious convexity property of the intersection of two sets involving the graphs of A and B (see Theorem 4.17). This relates to Passty's work on the parallel sum as well as to Eckstein and Svaiter's work on the Kuhn–Tucker set (a.k.a. the extended solution set; see [63] and [74, Section 2.1]).
• We reveal the importance of paramonotonicity: in this case, it is possible to recover all primal solutions from one dual solution (see Theorem 4.28).
• We generalize the best approximation results by Bauschke–Combettes–Luke from normal cone operators to paramonotone operators with a common zero (see Corollary 4.41).

Except for facts with explicit references, this chapter is based on results that appear in [26] and [39].

4.2 Duality for monotone operators

In this section, we review and slightly refine the basic results on Attouch–Théra duality. Throughout this chapter we assume that

A and B are maximally monotone operators on X. (4.1)

Definition 4.1 (primal problem). The primal problem, for the ordered pair (A, B), is to find a zero of A + B.

At first, it looks strange to define the primal problem with respect to the (ordered) pair (A, B). The reason we must do this is to associate a unique dual problem. (The ambiguity arises because addition is commutative.) Now since A and B form a pair of maximally monotone operators, so do A−1 and B−>; we thus define the dual pair

(A, B)∗ = (A−1, B−>). (4.2)

The biduality

(A, B)∗∗ = (A, B) (4.3)

holds, since (A−1)−1 = A, (B>)> = B, and (B>)−1 = (B−1)>. We are now in a position to formulate the dual problem.

Definition 4.2 ((Attouch–Théra) dual problem).
The (Attouch–Théra) dual problem, for the ordered pair (A, B), is to find a zero of A−1 + B−>.

This duality was systematically studied by Attouch and Théra [3], although it is worth noting that it was also touched upon earlier by Mercier [101, page 40]. (See [108] for the special case of variational inequalities, and also [56] for work on Toland duality.) Additional relevant work can be found in [52] and [66]. In view of (4.3), it is then clear that the primal problem is precisely the dual of the dual problem, as expected. One central aim of this chapter is to understand the interplay between the primal and dual solutions that we formally define next.

Definition 4.3 (primal and dual solutions). The primal solutions are the solutions to the primal problem, and analogously for the dual solutions. We shall abbreviate these sets by

Z(A,B) = Z = (A + B)−1(0), (4.4)

and

K(A,B) = K = (A−1 + B−>)−1(0), (4.5)

respectively.

As observed by Attouch and Théra in [3, Corollary 3.2], one has:

Z ≠ ∅ ⇔ K ≠ ∅. (4.6)

Let us make this simple but important equivalence a little more precise. In order to do so, we define

(∀z ∈ X) Kz = (Az) ∩ (−Bz) (4.7)

and

(∀k ∈ X) Zk = (A−1k) ∩ (−B−>k) = (A−1k) ∩ (B−1(−k)). (4.8)

As the next proposition illustrates, these objects are intimately tied to the primal and dual solutions defined in Definition 4.3. This result is elementary and implicitly contained in [3] and [101].

Proposition 4.4. Let z ∈ X and let k ∈ X. Then the following hold.
(i) Kz and Zk are closed convex (possibly empty) subsets of X.
(ii) k ∈ Kz ⇔ z ∈ Zk.
(iii) z ∈ Z ⇔ Kz ≠ ∅.
(iv) ⋃_{z∈Z} Kz = K.
(v) Z ≠ ∅ ⇔ K ≠ ∅.
(vi) k ∈ K ⇔ Zk ≠ ∅.
(vii) ⋃_{k∈K} Zk = Z.

Proof. (i): Because A and B are maximally monotone, Fact 3.9 implies that the sets Az and Bz are closed and convex. Hence Kz is also closed and convex.
We see analogously that Zk is closed and convex as well.(ii): This is easily verified from the definitions.(iii): Indeed, z ∈ Z⇔ 0 ∈ (A + B)z⇔ (∃ a∗ ∈ Az ∩ (−Bz))⇔ (∃ a∗ ∈Kz)⇔ Kz 6= ∅.(iv): Take k ∈ ⋃z∈Z Kz. Then there exists z ∈ Z such that k ∈ Kz = Az ∩(−Bz). Hence z ∈ A−1k and z ∈ (−B)−1k = B−1(− Id)−1k = B−1(−k).Thus z ∈ A−1k and −z ∈ B−>k. Hence 0 ∈ (A−1 + B−>)k and so k ∈ K.The reverse inclusion is proved analogously.(v): Combine (iii) and (iv).(vi)&(vii): The proofs are analogous to the ones of (iii)&(iv). Let us provide some examples illustrating these notions.Example 4.5. Suppose that X = R2, and that we consider the rotators by∓pi/2, i.e.,A : R2 → R2 : (x1, x2) 7→ (x2,−x1) (4.9)andB : R2 → R2 : (x1, x2) 7→ (−x2, x1). (4.10)314.2. Duality for monotone operatorsNote that B = −A = A−1 = A∗, where A∗ denote the adjoint operator.Hence A + B ≡ 0, Z = X, and (∀z ∈ Z) Kz = {Az} = {−Bz}.Furthermore, A−1 = B while the linearity of B implies that B−> = B−1 =−B = A. Therefore, (A, B)∗ = (B, A). Hence K = Z, while (∀k ∈ K)Zk = {A−1k} = {Bk}.Example 4.6. Suppose that X = R2, that A is the normal cone operator ofR2+, and that B : X → X : (x1, x2) 7→ (−x2, x1) is the rotator by pi/2. Asalready observed in Example 4.5, we have B−1 = −B and B−> = B−1 =−B. A routine calculation yieldsZ = R+ × {0}; (4.11)thus, since B is single-valued,(∀z = (z1, 0) ∈ Z) Kz ={− Bz} = {(0,−z1)}. (4.12)Thus,K =⋃z∈ZKz = {0} ×R− (4.13)and so(∀k = (0, k2) ∈ K) Zk ={− B−>k} = {Bk} = {(−k2, 0)}. (4.14)The dual problem is to find a zero of A−1 + B−>, i.e., a zero of the sum ofthe normal cone operator of the negative orthant and the rotator by −pi/2.Example 4.7 (convex feasibility). Suppose that A = NU and B = NV ,where U and V are closed convex subsets of X such that U ∩V 6= ∅. Thenclearly Z = U ∩ V. Using Fact 14.2(i) below (see also [18, Fact A1]), wededuce that (∀z ∈ Z) Kz = NU−V(0) = K. Note that we do know at leastone dual solution: 0 ∈ K. 
Thus, by Proposition 4.4(ii)&(vii), (∀k ∈ K) Zk = Z.

Remark 4.8. The preceding examples give some credence to the conjecture that

z1 ∈ Z, z2 ∈ Z, z1 ≠ z2 ⇒ either Kz1 = Kz2 or Kz1 ∩ Kz2 = ∅. (4.15)

Note that (4.15) is trivially true whenever A or B is at most single-valued. While this conjecture fails in general (see Example 4.9 below), it does, however, hold true for the large class of paramonotone operators (see Theorem 4.28).

Example 4.9. Suppose that X = R², and set U = R × R₊, V = R × {0}, and R : X → X : (x1, x2) ↦ (−x2, x1). Now suppose that A = NU + R and that B = NV. Then dom A = U and dom B = V; hence, dom(A + B) = U ∩ V = V. Let x = (ξ, 0) ∈ V. Then Ax = {0} × ]−∞, ξ] and Bx = {0} × R. Hence Ax ⊂ ±Bx, (A + B)x = {0} × R, and therefore Z = V. Furthermore, Kx = Ax ∩ (−Bx) = Ax. Now take y = (η, 0) ∈ V = Z with ξ < η. Then Kx = Ax ⫋ Ay = Ky and thus (4.15) fails.

Proposition 4.10 (common zeros). zer A ∩ zer B ≠ ∅ ⇔ 0 ∈ K.

Proof. Suppose first that z ∈ zer A ∩ zer B. Then 0 ∈ Az and 0 ∈ Bz, so 0 ∈ Az ∩ (−Bz) = Kz ⊆ K. Now assume that 0 ∈ K. Then 0 ∈ Kz for some z ∈ Z, and so 0 ∈ Az ∩ (−Bz). Therefore, 0 ∈ zer A ∩ zer B.

Example 4.11. Suppose that B = A. Then Z = zer A, and zer A ≠ ∅ ⇔ 0 ∈ K.

Proof. Since 2A is maximally monotone and A + A is a monotone extension of 2A, we deduce that A + A = 2A. Hence Z = zer(2A) = zer A and the result follows from Proposition 4.10.

The following result, observed first by Passty, is very useful. For the sake of completeness, we include its short proof.

Proposition 4.12 (Passty 1986). Let x ∈ X and suppose that, for every i ∈ {0, 1}, wi ∈ Ayi ∩ B(x − yi). Then ⟨y0 − y1, w0 − w1⟩ = 0.

Proof. (See [115, Lemma 14].) Since A is monotone, 0 ≤ ⟨y0 − y1, w0 − w1⟩. On the other hand, since B is monotone, 0 ≤ ⟨(x − y0) − (x − y1), w0 − w1⟩ = ⟨y1 − y0, w0 − w1⟩. Altogether, ⟨y0 − y1, w0 − w1⟩ = 0.

Corollary 4.13. Suppose that z1 and z2 belong to Z, that k1 ∈ Kz1, and that k2 ∈ Kz2. Then ⟨k1 − k2, z1 − z2⟩ = 0.

Proof.
Apply Proposition 4.12 with (x, B) replaced by (0, B⊤).

4.3 Solution mappings K and Z

In this section we focus on the solution mappings between primal and dual solutions. We now interpret the families of sets (Kz)_{z∈X} and (Zk)_{k∈X} as set-valued operators by setting

K : X ⇒ X : z ↦ Kz and Z : X ⇒ X : k ↦ Zk. (4.16)

Let us record some basic properties of these fundamental operators.

Proposition 4.14. The following hold.

(i) gr K = gr A ∩ gr(−B) and gr Z = gr A⁻¹ ∩ gr(−B⁻⊤).
(ii) dom K = Z, ran K = K, dom Z = K, and ran Z = Z.
(iii) gr K and gr Z are closed sets.
(iv) The operators K, −K, Z, −Z are monotone.
(v) K⁻¹ = Z.

Proof. (i): This is clear from the definitions. (ii): This follows from Proposition 4.4. (iii): Since A and B are maximally monotone, the sets gr A and gr B are closed. Hence, by (i), gr K is closed, and similarly for gr Z. (iv): Since gr K ⊆ gr A and A is monotone, we see that K is monotone. Similarly, since B is monotone and gr(−K) ⊆ gr B, we obtain the monotonicity of −K. The proofs for ±Z are analogous. (v): Clear from Proposition 4.4(ii).

In Proposition 4.4(i), we observed the closedness and convexity of Kz and Zk. In view of Proposition 4.4(iv)&(vii), the sets of primal and dual solutions are both unions of closed convex sets. It would seem that we cannot a priori deduce convexity of these solution sets because unions of convex sets need not be convex. However, not only are Z and K indeed convex, but so are gr Z and gr K. This surprising result, which is basically contained in works by Passty [115] and by Eckstein and Svaiter [74, 75], is best stated by using the parallel sum, a notion systematically explored by Passty in [115]. See also [97].

Definition 4.15 (parallel sum). (See [97], [115], or [12, Section 24.4].) The parallel sum of A and B is

A □ B = (A⁻¹ + B⁻¹)⁻¹. (4.17)

Let us recall that the infimal convolution of f and g, denoted by f □ g, is defined by

f □ g : X → R : x ↦ inf_{y∈X} ( f(y) + g(x − y) ).
(4.18)The notation we use for the parallel sum (see [12, Section 24.4]) is nonstan-dard but highly convenient in view of the following fact.Fact 4.16. Let f and g be in Γ(X) such that 0 ∈ sri(dom f ∗ − dom g∗). Thenf  g ∈ Γ(X) and∂( f  g) = (∂ f ) (∂g). (4.19)Proof. See [115, Theorem 28], [109, Proposition 4.2.2], or [12,Proposition 15.7(i) and Proposition 24.27]. The proof of the following result is contained in the proof of [115,Theorem 21], although Passty stated a much weaker conclusion. For thesake of completeness, we present his proof.Theorem 4.17. For every x ∈ X, the set(gr A) ∩ ((x, 0)− gr(−B)) = {(y, w) ∈ gr A ∣∣ (x− y, w) ∈ gr B} (4.20)is convex.Proof. (See also [115, Proof of Theorem 21].) The identity (4.20) is easilyverified. To tackle convexity, for every i ∈ {0, 1} take (yi, wi) from theintersection (4.20); equivalently,(∀i ∈ {0, 1}) wi ∈ Ayi ∩ B(x− yi). (4.21)By Proposition 4.12,〈y0 − y1, w0 − w1〉 = 0. (4.22)Now let t ∈ [0, 1], set (yt, wt) = (1 − t)(y0, w0) + t(y1, w1), and take(a, a∗) ∈ gr A. Using (4.21) and the monotonicity of A in (4.23c), we obtain〈yt − a, wt − a∗〉= 〈(1− t)(y0 − a) + t(y1 − a), (1− t)(w0 − a∗) + t(w1 − a∗)〉 (4.23a)= (1− t)2 〈y0 − a, w0 − a∗〉+ t2 〈y1 − a, w1 − a∗〉+ (1− t)t( 〈y0 − a, w1 − a∗〉+ 〈y1 − a, w0 − a∗〉 ) (4.23b)≥ (1− t)t( 〈y0 − a, w1 − a∗〉+ 〈y1 − a, w0 − a∗〉 ). (4.23c)354.3. Solution mappings K and ZThus, using again monotonicity of A and recalling (4.22), we obtain〈y0 − a, w1 − a∗〉+ 〈y1 − a, w0 − a∗〉= 〈y0 − a, w1 − w0〉+ 〈y0 − a, w0 − a∗〉+ 〈y1 − a, w0 − w1〉+ 〈y1 − a, w1 − a∗〉 (4.24a)= 〈y1 − y0, w0 − w1〉+ 〈y0 − a, w0 − a∗〉+ 〈y1 − a, w1 − a∗〉 (4.24b)≥ 〈y1 − y0, w0 − w1〉 (4.24c)= 0. (4.24d)Combining (4.23) and (4.24), we obtain 〈yt − a, wt − a∗〉 ≥ 0. Since (a, a∗) isan arbitrary element of gr A and A is maximally monotone, we deduce that(yt, wt) ∈ gr A. A similar argument yields (x − yt, wt) ∈ gr B. Therefore,(yt, wt) is an element of the intersection (4.20). 
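To make the parallel sum of Definition 4.15 concrete, here is a minimal numeric sketch (an illustration, not from the thesis) on X = R, with A and B taken to be multiplication by positive scalars a and b; under this assumption A □ B acts as multiplication by ab/(a + b), and the pointwise description (A □ B)x = ⋃_y (Ay ∩ B(x − y)) appearing in the proof of Corollary 4.18 can be checked directly. The helper names and parameter values are hypothetical.

```python
# Toy check of the parallel sum on X = R.  Here A: x -> a*x and B: x -> b*x
# with a, b > 0, so both operators are maximally monotone and single-valued,
# and A [] B = (A^{-1} + B^{-1})^{-1} is multiplication by a*b/(a + b).

def parallel_sum_from_definition(a, b, x):
    # Solve (A^{-1} + B^{-1}) w = x for w, i.e. w/a + w/b = x.
    return x / (1.0 / a + 1.0 / b)

def parallel_sum_from_passty(a, b, x):
    # Pointwise description (A [] B)x = union over y of (Ay intersect B(x-y)):
    # find y with a*y = b*(x - y), i.e. y = b*x/(a + b); the common value
    # a*y is then the (unique) element of (A [] B)x.
    y = b * x / (a + b)
    return a * y

for a, b, x in [(1.0, 1.0, 3.0), (2.0, 5.0, -1.5), (0.5, 4.0, 7.0)]:
    assert abs(parallel_sum_from_definition(a, b, x)
               - parallel_sum_from_passty(a, b, x)) < 1e-12
    # Sanity check: A [] A acts as multiplication by a/2.
    assert abs(parallel_sum_from_definition(a, a, x) - 0.5 * a * x) < 1e-12
```

Both routes agree, and the scalar case also exhibits the harmonic-mean behaviour that motivates the name "parallel sum" (resistors in parallel).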
Before returning to the objects of interest, we record Passty's [115, Theorem 21] as a simple corollary.

Corollary 4.18 (Passty 1986). For every x ∈ X, the set (A □ B)x is convex.

Proof. Let x ∈ X. Since the mapping (y, w) ↦ w is linear and the set {(y, w) ∈ gr A | (x − y, w) ∈ gr B} is convex (Theorem 4.17), we deduce that

{w ∈ X | (∃ y ∈ X) w ∈ Ay ∩ B(x − y)} is convex. (4.25)

On the other hand, a direct computation or [115, Lemma 2] implies that (A □ B)x = ⋃_{y∈X} (Ay ∩ B(x − y)). Altogether, (A □ B)x is convex.

Corollary 4.19. For every x ∈ X, the set (gr A) ∩ ((x, 0) + gr(−B)) is convex.

Proof. On the one hand, −gr(−B⊤) = gr(−B). On the other hand, B⊤ is maximally monotone. Altogether, Theorem 4.17 (applied with B⊤ instead of B) implies that (gr A) ∩ ((x, 0) − gr(−B⊤)) = (gr A) ∩ ((x, 0) + gr(−B)) is convex.

Remark 4.20. Using Theorem 4.17 and Corollary 4.19, we learn that the intersections (gr A) ∩ ±(gr(−B)) are convex. This somewhat resembles works by Martínez-Legaz (see [100, Theorem 2.1]) and by Zălinescu [138], who encountered convexity when studying the sum/difference (gr A) ± (gr(−B)).

Corollary 4.21 (convexity). The sets gr Z and gr K are convex; consequently, Z and K are convex.

Proof. Combining Proposition 4.14(i) and Corollary 4.19 (with x = 0), we obtain the convexity of gr K. Hence gr Z is convex by Proposition 4.14(v). It thus follows that Z and K are convex as images of convex sets under linear transformations.

Remark 4.22. Since Z = (A⁻¹ □ B⁻¹)(0) and K = (A □ B⊤)(0), the convexity of Z and K also follows from Corollary 4.18.

Lemma 4.23. Suppose that B : X ⇒ X is linear. Then the following hold:

(i) B⊤ = B and B⁻⊤ = B⁻¹.
(ii) (A⁻¹, B⁻⊤) = (A⁻¹, B⁻¹).
(iii) K = (A □ B)(0).

Proof. This is straightforward from the definitions.

Remark 4.24 (Connection to Eckstein and Svaiter's "extended solution set"). In [74, Section 2.1], Eckstein and Svaiter considered the Kuhn–Tucker set (a.k.a.
extended solution set) (for the primal problem), defined by

S(A,B) = {(z, w) ∈ X × X | −w ∈ Bz, w ∈ Az}. (4.26)

It is clear that gr Z⁻¹ = gr K = S(A,B). Unaware of Passty's work, they proved in [74, Lemma 1 and Lemma 2] (in the present notation) that Z = ran Z, and that gr Z is closed and convex. Their proof is very elegant and completely different from the above Passty-like proof. In their 2009 follow-up paper [75], Eckstein and Svaiter generalize the notion of the Kuhn–Tucker set to three or more operators; their corresponding proof of convexity in [75, Proposition 2.2] is more direct and along Passty's lines. Note that the extension to three or more operators in [75] is a product space reformulation of the corresponding result in [74] and that such results are significantly extended in [52] and [66].

Remark 4.25 (Convexity of Z and K). If Z is nonempty and a constraint qualification holds, then A + B is maximally monotone (see [12, Section 24.1]) and therefore Z = zer(A + B) is convex. It is somewhat surprising that Z is always convex even without the maximal monotonicity of A + B.

One may inquire whether or not Z is also closed, which is another standard property of zeros of maximally monotone operators. The next example illustrates that Z may fail to be closed.

Example 4.26 (Z need not be closed). Suppose that X = ℓ², the real Hilbert space of square-summable sequences. In [24, Example 3.17], the authors provide a discontinuous, monotone, at most single-valued linear operator S on X such that S is maximally monotone and its adjoint S* is a maximally monotone single-valued extension of −S. Hence dom S is not closed. Now assume that A = S and B = S*. Then A + B is the operator that is zero on the dense proper subspace Z = dom(A + B) = dom S of X. Thus Z fails to be closed. Furthermore, in the language of Passty's parallel sums (see Remark 4.22), this also illustrates that the parallel sum need not map a point to a closed set.

Remark 4.27.
We finish this section with some concluding remarks.(i) We do not know whether or not such counterexamples can reside in finite-dimensional Hilbert spaces when dom A ∩ dom B 6= ∅. On the one hand,in view of the forthcoming Corollary 4.30(i), any counterexample mustfeature at least one operator that is not paramonotone, which means thatthe operators cannot be simultaneously subdifferential operators of functionsin Γ(X). On the other hand, one has to avoid the situation when A + Bis maximally monotone, which happens when ri dom A ∩ ri dom B 6= ∅.This means that neither is one of the operators allowed to have full domain,nor can they simultaneously have relatively open domains, which excludesthe situation when both operators are maximally monotone linear relations(i.e., maximally monotone operators with graphs that are linear subspaces,see [23]).(ii) We note that K and Z are in general not maximally monotone. Indeed ifZ, say, is maximally monotone, then Corollary 4.21 and Fact 3.6 imply thatgr Z is actually affine (i.e., a translate of a subspace) and so are Z and K (asrange and domain of Z). However, the set Z of Example 4.7 need not be anaffine subspace (e.g., when U, V and Z coincide with the closed unit ball inX).4.4 ParamonotonicityIn this section we provide new results that underline the importance ofparamonotonicity in the understanding of the zeros of the sum.384.4. ParamonotonicityTheorem 4.28. Suppose that A and B are paramonotone. Then (∀z ∈ Z) Kz = Kand (∀k ∈ K) Zk = Z.Proof. Suppose that z1 and z2 belong to Z and that z1 6= z2. Take k1 ∈ Kz1 =Az1 ∩ (−Bz1) and k2 ∈ Kz2 = Az2 ∩ (−Bz2). By Corollary 4.13,〈k1 − k2, z1 − z2〉 = 0. (4.27)Since A and B are paramonotone, we have k2 ∈ Az1 and −k2 ∈ Bz1;equivalently, k2 ∈ Kz1 . It follows that Kz2 ⊆ Kz1 . Since the reverseinclusion follows in the same fashion, we see that Kz1 = Kz2 . In view ofProposition 4.4(iv), Kz1 = K, which proves the first conclusion. Since Aand B are paramonotone so are A−1 and B−> by (3.23). 
Therefore, thesecond conclusion follows from what we already proved (applied to A−1and B−>). Remark 4.29 (recovering all primal solutions from one dual solution).Suppose that A and B are paramonotone and we know one (arbitrary) dualsolution, say k0 ∈ K. ThenZk0 = A−1k0 ∩(B−1(−k0))(4.28)recovers the set Z of all primal solutions, by Theorem 4.28. If A = ∂ f and B = ∂g,where f and g belong to Γ(X), then, since (∂ f )−1 = ∂ f ∗ and (∂g)−1 = ∂g∗, weobtain a formula well-known in Fenchel duality, namely,Z = ∂ f ∗(k0) ∩ ∂g∗(−k0). (4.29)We shall revisit this setting in more detail in Section 4.6. In striking contrast,the complete recovery of all primal solutions from one dual solution is generallyimpossible when at least one of the operators is no longer paramonotone — see,e.g., Example 4.6 where one of the operators is a normal cone operator.Corollary 4.30. Suppose A and B are paramonotone. Then the following hold.(i) Z and K are closed.(ii) gr K and gr Z are the rectangles Z× K and K× Z, respectively.(iii) (Z− Z) ⊥ (K− K).(iv) span (K− K) = X⇒ Z is a singleton.394.4. Paramonotonicity(v) span (Z− Z) = X⇒ K is a singleton.Proof. (i): Combine Theorem 4.28 and Proposition 4.4(i).(ii): Clear from Theorem 4.28.(iii): Combine Corollary 4.13 with Theorem 4.28.(iv): In view of (iii), we have that0 = 〈Z− Z, K− K〉 = 〈Z− Z, span (K− K)〉 = 〈Z− Z, X〉⇒ Z− Z = {0} ⇔ Z is a singleton.(v): This is verified analogously to the proof of (iv). Corollary 4.31. Let U be a closed affine subspace of X, suppose that A = NU andthat B is paramonotone. Then the following hold:(i) Z = U ∩ (B−1((par U)⊥)) ⊆ U.(ii) (∀z ∈ Z) K = (−Bz) ∩ (par U)⊥ ⊆ (par U)⊥.(iii) K ⊥ (Z− Z).Proof. Since A = NC = ∂ιC, it is paramonotone by Example 3.24(i).(i): Let x ∈ X. Then x ∈ Z ⇔ 0 ∈ Ax + Bx = (par U)⊥ + Bx ⇔[x ∈ U and there exists y ∈ X such that y ∈ (par U)⊥ and y ∈ Bx] ⇔[x ∈ U and there exists y ∈ X such that x ∈ B−1y and y ∈ (par U)⊥]⇔ x ∈ U ∩ B−1((par U)⊥). (ii): Let z ∈ Z. 
Applying Remark 4.29 to(A−1, B−>) yields K = (−Bz) ∩ (Az) = (−Bz) ∩ (par U)⊥. (iii): By (i)Z− Z ⊆ U −U = par U. Now use (ii). Theorem 4.32. Suppose that A and B are paramonotone and that zer A ∩zer B 6= ∅. Then the following hold:(i) Z = (zer A) ∩ (zer B) and 0 ∈ K.(ii) K ⊥ (Z− Z).If, in addition, A or B is single-valued, then we also have:(iii) K = {0}.404.5. Projection operators and solution setsProof. (i): Since zer A ∩ zer B 6= ∅, it follows from Proposition 4.10 that 0 ∈K. Now apply Remark 4.29 to get Z = A−1(0)∩ B−1(0) = (zer A)∩ (zer B).(ii): Combine (i) and Corollary 4.30(iii). (iii): Let C ∈ {A, B} be single-valued. Using (i) we have Z ⊆ zer C. Suppose that C = A and let z ∈ Z.We use Remark 4.29 applied to (A−1, B−>) to learn that K = (Az)∩ (−Bz).Therefore {0} ⊆ K ⊆ Az ⊆ A(zer A) = {0}. A similar argument applies ifC = B. Remark 4.33. The conclusion of Theorem 4.32(i) generalizes the setting of convexfeasibility problems. Indeed, suppose that A = NU and B = NV , where U andV are nonempty closed convex subsets of X such that U ∩ V 6= ∅. Then Z =U ∩V = zer A ∩ zer B.The assumptions that A and B are paramonotone are critical in theconclusion of Theorem 4.32(i) as we illustrate now.Example 4.34. Suppose that X = R2, that U = R × {0}, that A = NUand that B : R2 → R2 : (x, y) 7→ (−y, x), is the counterclockwise rotatorin the plane by pi/2. Then one verifies that zer A = U, zer B = {(0, 0)},Z = zer(A+ B) = U; however (zer A)∩ (zer B) = {(0, 0)} 6= U = Z. Notethat A is paramonotone by Example 3.24(i) while B is not paramonotone byExample 3.25.In the light of Example 4.7 we learn that if neither A nor B is single-valued, then the conclusion of Theorem 4.32(iii) may fail.Remark 4.35 (paramonotonicity is critical). Various results in this section —e.g., Theorem 4.28, Remark 4.29, Corollary 4.30(ii)–(v)— fail if the assumption ofparamonotonicity is omitted. 
To generate these counterexamples, assume that Aand B are as in Example 4.5 or Example 4.6.4.5 Projection operators and solution setsTheorem 4.36. Suppose that A and B are paramonotone, that (z0, k0) ∈ Z× K,and that x ∈ X. Then the following hold.(i) Z + K is convex and closed.(ii) PZ+K(x) = PZ(x− k0) + PK(x− z0).(iii) If (Z− Z) ⊥ K, then PZ+K(x) = PZ(x) + PK(x− z0).414.5. Projection operators and solution sets(iv) If Z ⊥ (K− K), then PZ+K(x) = PZ(x− k0) + PK(x).Proof. (i): The convexity and closedness of Z and K follows from Corol-lary 4.21 and Corollary 4.30(i). By Corollary 4.30(iii),(Z− z0) ⊥ (K− k0). (4.30)Using Fact 2.10,Z + K− z0 − k0 is convex and closed,and PZ+K−z0−k0 = PZ−z0 + PK−k0 .(4.31)Hence Z + K is convex and closed. (ii): Using (4.31), Fact 2.10, and Fact 2.6,we obtainPZ+Kx = P(z0+k0)+(Z+K−z0−k0)x (4.32a)= z0 + k0 + P(Z−z0)+(K−k0)(x− (z0 + k0))(4.32b)= z0 + PZ−z0((x− k0)− z0)+ k0 + PK−k0((x− z0)− k0)(4.32c)= PZ(x− k0) + PK(x− z0). (4.32d)(iii): Using Fact 2.10 and Fact 2.6, we havePZ+Kx = Pz0+(Z+K−z0)x (4.33a)= z0 + P(Z−z0)+K(x− z0) (4.33b)= z0 + PZ−z0(x− z0) + PK(x− z0) (4.33c)= PZx + PK(x− z0). (4.33d)(iv): Argue analogously to the proof of (iii). Remark 4.37. Suppose that A and B are paramonotone and that 0 ∈ K. ThenCorollary 4.30(iii) implies that (Z−Z) ⊥ K−{0} = K and we thus may employTheorem 4.36(ii) (with k0 = 0) or Theorem 4.36(iii) to obtain the formula forPZ+K.However, if (Z − Z) ⊥ K, then the next two examples show—instrikingly different ways since Z is either large or small—that we cannotconclude that 0 ∈ K:Example 4.38. Fix u ∈ X and suppose that (∀x ∈ X) Ax = u and B =−A. Then A and B are paramonotone, A + B ≡ 0, and hence Z = X.Furthermore, K = {u}. Thus if u 6= 0, then K 6⊥ X = (Z− Z).424.6. Subdifferential operatorsExample 4.39. Let U and V be closed convex subsets of X such that0 /∈ U ∩V and U −V = X. (4.34)(For example, suppose that X = R and set U = V = [1,+∞[.) Now assumethat (A, B) = (NU , NV)∗. 
In view of Example 4.7, K = U ∩ V and Z =NU−V(0) = NX(0) = {0}. Hence Z is a singleton and thus Z− Z = {0} ⊥K while 0 /∈ K.Theorem 4.40. Suppose that A and B are paramonotone, let k0 ∈ K, and letx ∈ X. Then the following hold.(i) JAPZ+K(x) = PZ(x− k0).(ii) If (Z− Z) ⊥ K, then JA ◦ PZ+K = PZ.Proof. Take an arbitrary z0 ∈ Z. (i): Set z = PZ(x − k0). Using Theo-rem 4.36(ii) and Theorem 4.28, we havePZ+Kx− z = PZ+Kx− PZ(x− k0) = PK(x− z0) ∈ K = Kz ⊆ Az. (4.35)Hence PZ+Kx ∈ (Id+A)z⇔ z = JAPZ+Kx⇔ PZ(x− k0) = JAPZ+Kx.(ii): This time, let us set z = PZx. Using Theorem 4.36(iii) andTheorem 4.28, we havePZ+Kx− z = PZ+Kx− PZx = PK(x− z0) ∈ K = Kz ⊆ Az. (4.36)Hence PZ+Kx ∈ (Id+A)z⇔ z = JAPZ+Kx⇔ PZx = JAPZ+Kx. Corollary 4.41. Suppose that A and B are paramonotone, and that 0 ∈ K. ThenPZ = JAPZ+K. (4.37)4.6 Subdifferential operatorsIn this section, we assume thatA = ∂ f and B = ∂g, (4.38)where f and g belong to Γ(X). Recall that (see Fact 3.5(ii))(∂ f )−1 = ∂ f ∗. (4.39)Let f ∈ Γ(X) and let g ∈ Γ(X).434.6. Subdifferential operatorsCorollary 4.42 (subdifferential operators). The following hold:(i) Z = (∂ f ∗ ∂g∗)(0).(ii) K = (∂ f  ∂g∨)(0).(iii) Suppose that argmin f ∩ argmin g 6= ∅. Then Z = ∂ f ∗(0) ∩ ∂g∗(0).(iv) Suppose that 0 ∈ sri(dom f − dom g). Then Z = ∂( f ∗ g∗)(0).(v) Suppose that 0 ∈ sri(dom f ∗ + dom g∗). Then K = ∂( f  g∨)(0).Proof. Note that A and B are paramonotone by Example 3.24(i). (i): Using(4.39) and Remark 4.22 we have Z = (A + B)−1(0) = (((∂ f )−1)−1 +((∂g)−1)−1)−1(0) = ((∂ f ∗)−1 + (∂g∗)−1)−1(0) = (∂ f ∗ ∂g∗)(0). (ii):Observe that (∂g)−> = ((∂g)>)−1 = (∂g∨)−1. Therefore using Remark 4.22we have K = ((∂ f )−1 + ((∂g)>)−1)−1(0) = ((∂ f )−1 + (∂g∨)−1)−1(0) =(∂ f  ∂g∨)(0). (iii): Using Theorem 4.32(i), Fermat’s rule (see Fact 2.18)and (4.39) we have Z = (zer A) ∩ (zer B) = argmin f ∩ argmin g =(∂ f )−1(0) ∩ (∂g)−1(0) = ∂ f ∗(0) ∩ ∂g∗(0). (iv): Combine (i) and [12,Proposition 24.27] applied to the functions f ∗ and g∗. 
(v): Combine (ii) and[12, Proposition 24.27] applied to the functions f and g∨. We now turn to applications to best approximation. Consider the primalproblemminimizex∈Xf (x) + g(x), (4.40)the associated Fenchel dual problemminimizex∗∈Xf ∗(x∗) + g∗(−x∗), (4.41)the primal and dual optimal valuesµ = inf( f + g)(X) and µ∗ = inf( f ∗ + g∗∨)(X). (4.42)Note thatµ ≥ −µ∗. (4.43)Following [47] and [48], we say that total duality holds if µ = −µ∗ ∈ R,the primal problem (4.40) has a solution, and the dual problem (4.41) has asolution.Theorem 4.43 (total duality). Suppose that A = ∂ f and B = ∂g, where f andg belong to Γ. Then Z 6= ∅⇔ total duality holds, in which case Z coincides withthe set of solutions to the primal problem (4.40).444.6. Subdifferential operatorsProof. Observe that (∂ f )−1 = ∂ f ∗ and that (∂g)−> = (∂g∗)> = ∂(g∗∨).“⇒”: Suppose that Z 6= ∅, and let z ∈ Z. Then 0 ∈ ∂ f (z) + ∂g(z) ⊆∂( f + g)(z). Hence z solves the primal problem (4.40), andµ = f (z) + g(z). (4.44)Take k ∈ K = Kz = (∂ f )(z) ∩ (−∂g)(z). First, we note that 0 ∈ (∂ f )−1(k) +(∂g)−>(k) = ∂ f ∗(k) + ∂g∗∨(k) ⊆ ∂( f ∗+ g∗∨)(k) and so k solves the Fencheldual problem (4.41). Thus,µ∗ = f ∗(k) + g∗∨(k). (4.45)Moreover, k ∈ ∂ f (z) and −k ∈ ∂g(z), i.e., f (z) + f ∗(k) = 〈z, k〉 and g(z) +g∗(−k) = 〈z,−k〉. Adding these equations gives 0 = f (z) + f ∗(k) + g(z) +g∗∨(k) = µ+ µ∗. This verifies total duality.“⇐”: Suppose we have total duality. Then there exists x ∈ dom f ∩dom g and x∗ ∈ dom f ∗ ∩ dom g∗∨ such thatf (x) + g(x) = µ = −µ∗ = − f ∗(x∗)− g∗∨(x∗) ∈ R. (4.46)Hence 0 = ( f (x) + f ∗(x∗)) + (g(x) + g∗(−x∗)) ≥ 〈x, x∗〉+ 〈x,−x∗〉 = 0.Therefore, using convex analysis and Proposition 4.4,(x∗ ∈ ∂ f (x) and − x∗ ∈ ∂g(x)) ⇔ x∗ ∈ Kx ⇔ x ∈ Zx∗ . (4.47)Hence x ∈ Z.Note that Z = zer(∂ f + ∂g) ⊆ zer ∂( f + g) since gr(∂ f + ∂g) ⊆gr ∂( f + g). Hence Z is a subset of the set of primal solutions. 
Conversely,if x is a primal solution and x∗ is a dual solution, then (4.46) holds and therest of the proof of “⇐” shows that x ∈ Z. Altogether, Z coincides with theset of primal solutions. Remark 4.44 (sufficient conditions). On the one hand,the primal problem has at least one solution (4.48)if dom f ∩ dom g 6= ∅ and one of the following holds (see [12, Corollary 11.15]):(i) f is supercoercive; (ii) f is coercive and g is bounded below; (iii) 0 ∈sri(dom f ∗ + dom g∗) (by Fact 2.22 and since (4.40) is the Fenchel dualproblem of (4.41)). On the other hand,the sum rule ∂( f + g) = ∂ f + ∂g holds (4.49)454.6. Subdifferential operatorswhenever one of the following is satisfied (see [12, Corollary 16.38]): (i) (Attouch-Brezis condition)R++(dom f − dom g) is a closed linear subspace; (ii) dom f ∩int dom g 6= ∅; (iii) dom g = X; (iv) X is finite-dimensional and ri dom f ∩ri dom g 6= ∅. If both (4.48) and (4.49) hold, then Z 6= ∅ and Z coincides withthe set of primal solutions.46Chapter 5Zeros of the sum and theDouglas–Rachford operator5.1 OverviewRecall that the sum problem for two maximally monotone operators A andB is tofind x ∈ X such that 0 ∈ Ax + Bx. (5.1)When zer(A + B) 6= ∅, one approach to solve the problem is the Douglas–Rachford splitting technique. In this chapter we focus on the basic proper-ties of the Douglas–Rachford splitting operator and the connection to theprimal and dual sets of solutions. 
Our main results in this chapter are:

• We provide a new description of the fixed point set of the Douglas–Rachford splitting operator (see Theorem 5.14), which refines Combettes' description of Z = (A + B)⁻¹(0).
• Using the Douglas–Rachford operator, we formulate new results on the relative geometry of the primal and dual solutions (in the sense of Attouch–Théra duality) (see Theorem 5.19).
• We sketch the connection between the historical Douglas–Rachford algorithm and the powerful extension provided by Lions and Mercier in [96] (see Section 5.6).

Except for facts with explicit references, this chapter is based on results that appear in [13], [26], [31] and [39].

5.2 Basic properties and facts

Definition 5.1. The Douglas–Rachford splitting operator [96] for the ordered pair of operators (A, B) is defined by

T(A,B) = (1/2)(Id + RB RA) = Id − JA + JB RA. (5.2)

Let us record some useful properties of T(A,B).

Fact 5.2. The following hold:

(i) (Lions and Mercier). T(A,B) is firmly nonexpansive.
(ii) (Eckstein). T(A,B) = T(A⁻¹,B⁻⊤).
(iii) (Combettes). Z = JA(Fix T(A,B)).
(iv) K = J_{A⁻¹}(Fix T(A,B)).
(v) Fix T(A,B) = Fix RB RA.

Proof. (i): See [96, Lemma 1], [71, Corollary 4.2.1 on page 139], [72, Corollary 4.1], or Theorem 5.9(i). (ii): See [71, Lemma 3.6 on page 133] or [26, Proposition 2.16]. (iii): See [61, Lemma 2.6(iii)]. (iv): Apply (iii) to the dual pair (A⁻¹, B⁻⊤) and use (ii) or Corollary 5.12 below. (v): Clear from the definition.

It is clear from the definition and Fact 5.2(i) that Id − T(A,B) is also firmly nonexpansive. In fact, we note in passing that Id − T(A,B) is itself a Douglas–Rachford splitting operator:

Proposition 5.3. Id − T(A,B) = T(A,B⁻¹).

Proof. Using (3.15), we obtain

T(A,B) + T(A,B⁻¹) = Id − JA + JB RA + Id − JA + J_{B⁻¹} RA (5.3a)
= 2 Id − 2 JA + (JB + J_{B⁻¹}) RA (5.3b)
= 2 Id − 2 JA + RA (5.3c)
= Id, (5.3d)

and the conclusion follows.
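As a small numeric illustration of Definition 5.1 and of Fact 5.2(i)&(iii) (a toy sketch, not part of the thesis), take A = N_U and B = N_V for the intervals U = [0, 2] and V = [1, 3] in X = R, so that the resolvents reduce to the projections onto U and V. The interval endpoints and helper names are illustrative assumptions.

```python
def proj(lo, hi, x):
    """Projection onto [lo, hi]; this is the resolvent of the normal cone N_[lo,hi]."""
    return max(lo, min(hi, x))

def T(x):
    """Douglas-Rachford operator T = Id - J_A + J_B R_A for A = N_[0,2], B = N_[1,3]."""
    ja = proj(0.0, 2.0, x)              # J_A x
    ra = 2.0 * ja - x                   # reflected resolvent R_A x = 2 J_A x - x
    return x - ja + proj(1.0, 3.0, ra)  # (Id - J_A + J_B R_A) x

# Firm nonexpansiveness (Fact 5.2(i)): |Tx - Ty|^2 <= <Tx - Ty, x - y> on samples.
samples = [-4.0, -1.0, 0.5, 1.5, 2.5, 6.0, 10.0]
for x in samples:
    for y in samples:
        assert (T(x) - T(y)) ** 2 <= (T(x) - T(y)) * (x - y) + 1e-12

# Iterating T reaches a fixed point whose shadow J_A x lies in Z = U cap V = [1, 2],
# in line with Fact 5.2(iii): Z = J_A(Fix T).
x = 10.0
for _ in range(200):
    x = T(x)
assert abs(T(x) - x) < 1e-9
assert 1.0 - 1e-9 <= proj(0.0, 2.0, x) <= 2.0 + 1e-9
```

The same scaffold, with `proj` swapped for other resolvents, is how the splitting is deployed in practice: only J_A and J_B are ever evaluated, never A + B itself.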
We will simply use T instead of T(A,B) provided there is no cause forconfusion.Fact 5.4 (Eckstein 1989).gr(T) ={(a + a∗, a∗ + b)∣∣ (a, a∗) ∈ gr A, (b, b∗) ∈ gr B, a− b = a∗ + b∗}.(5.4)485.3. Useful identities for the Douglas–Rachford operatorProof. See [71, Proposition 4.1]. Corollary 5.5. We havegr(Id−T)={(a + a∗, a− b) ∣∣ (a, a∗) ∈ gr A, (b, b∗) ∈ gr B, a− b = a∗ + b∗}; (5.5)consequently,ran(Id−T)={a− b ∣∣ (a, a∗) ∈ gr A, (b, b∗) ∈ gr B, a− b = a∗ + b∗} (5.6a)⊆ (dom A− dom B) ∩ (ran A + ran B). (5.6b)5.3 Useful identities for the Douglas–RachfordoperatorWe start with some elementary identities which are easily verified directly.Lemma 5.6. Let (a, b, z) ∈ X3. Then the following hold:(i) 〈z, z− a + b〉 = ‖z− a + b‖2 + 〈a, z− a〉+ 〈b, 2a− z− b〉.(ii) 〈z, a− b〉 = ‖a− b‖2 + 〈a, z− a〉+ 〈b, 2a− z− b〉.(iii) ‖z‖2 = ‖z− a + b‖2 + ‖b− a‖2 + 2〈a, z− a〉+ 2〈b, 2a− z− b〉.Lemma 5.7. Let (a, b, x, y, a∗, b∗, u, v) ∈ X8. Then〈(a, b)− (x, y), (a∗, b∗)− (u, v)〉 = 〈a− b, a∗〉+ 〈x, u〉 − 〈x, a∗〉−〈a− b, u〉+ 〈b, a∗ + b∗〉+ 〈y, v〉 − 〈y, b∗〉 − 〈b, u + v〉. (5.7)The following result will be useful.Lemma 5.8. Let x ∈ X. Then the following hold:(i) x− Tx = JAx− JBRAx = JA−1 x + JB−1 RAx.(ii) (JAx, JBRAx, JA−1 x, JB−1 RAx) lies in gra(A× B).495.3. Useful identities for the Douglas–Rachford operatorProof. (i): The first identity is a direct consequence of (5.2). In view of (3.15)JAx − JBRAx − JA−1 x − JB−1 RAx = JAx − (x − JAx) − (JB + JB−1)RAx =RAx − RAx = 0, which proves the second identity. (ii): Use (5.31) belowand Fact 3.18 applied to A× B at (x, RAx) ∈ X× X. The next theorem is a direct consequence of the key identities presentedin Lemma 5.6.Theorem 5.9. Let x ∈ X and let y ∈ X. 
Then the following hold:(i) 〈Tx− Ty, x− y〉 = ‖Tx− Ty‖2 + 〈JAx− JAy, JA−1 x− JA−1 y〉+ 〈JBRAx− JBRAy, JB−1 RAx− JB−1 RAy〉.(ii) 〈(Id−T)x− (Id−T)y, x− y〉= ‖(Id−T)x− (Id−T)y‖2+ 〈JAx− JAy, JA−1 x− JA−1 y〉+ 〈JBRAx− JBRAy, JB−1 RAx− JB−1 RAy〉.(iii) ‖x− y‖2 = ‖Tx− Ty‖2 + ‖(Id−T)x− (Id−T)y‖2+ 2〈JAx− JAy, JA−1 x− JA−1 y〉+ 2〈JBRAx− JBRAy, JB−1 RAx− JB−1 RAy〉.(iv) ‖JAx− JAy‖2 + ‖JA−1 x− JA−1 y‖2 − ‖JATx− JATy‖2− ‖JA−1 Tx− JA−1 Ty‖2= ‖(Id−T)x− (Id−T)y‖2 + 2〈JATx− JATy, JA−1 Tx− JA−1 Ty〉+ 2〈JBRAx− JBRAy, JB−1 RAx− JB−1 RAy〉.(v) ‖JATx− JATy‖2 + ‖JA−1 Tx− JA−1 Ty‖2≤ ‖JAx− JAy‖2 + ‖JA−1 x− JA−1 y‖2.Proof. (i)–(iii): Use Lemma 5.6(i)–(iii) respectively, with z = x − y, a =JAx− JAy and b = JBRAx− JBRAy, (3.20) and (5.2). (iv): It follows from the(3.15) that‖x− y‖2 = ‖JAx− JAy + JA−1 x− JA−1 y‖2 (5.8a)= ‖JAx− JAy‖2 + ‖JA−1 x− JA−1 y‖2+ 2〈JAx− JAy, JA−1 x− JA−1 y〉. (5.8b)505.4. Douglas–Rachford operator and Attouch–The´ra dualityApplying (5.8) to (Tx, Ty) instead of (x, y) yields‖Tx− Ty‖2 = ‖JATx− JATy‖2 + ‖JA−1 Tx− JA−1 Ty‖2+ 2〈JATx− JATy, JA−1 Tx− JA−1 Ty〉. (5.9)Now combine (5.8), (5.9) and (iii) to obtain (iv). (v): In view of (3.20), themonotonicity of A and B implies 〈JATx− JATy, JA−1 Tx− JA−1 Ty〉 ≥ 0 and〈JBRAx− JBRAy, JB−1 RAx− JB−1 RAy〉 ≥ 0. Now use (iv). 5.4 Douglas–Rachford operator and Attouch–The´radualityWe start with some useful identities involving resolvents and reflectedresolvents.Proposition 5.10. Let C : X ⇒ X be maximally monotone. Then the followinghold.(i) RC−1 = −RC.(ii) JC> = J>C .(iii) RC−> = Id−2J>C .Proof. (i): By (3.15), we have RC−1 = 2JC−1 − Id = 2(Id−JC) − Id =Id−2JC = −(2JC − Id) = −RC.(ii): Indeed,JC> = ( Id+(− Id) ◦ C ◦ (− Id))−1 (5.10a)=((− Id) ◦ (Id+C) ◦ (− Id))−1 (5.10b)= (− Id)−1 ◦ (Id+C)−1 ◦ (− Id)−1 (5.10c)= (− Id) ◦ JC ◦ (− Id) (5.10d)= J>C . (5.10e)(iii): Using (3.15) and (ii), we have that RC−> = 2JC−> − Id =2(Id−JC>)− Id = Id−2J>C . Recall that the Peaceman–Rachford operator for the ordered pair (A, B)is RBRA.515.4. 
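Proposition 5.10(i)&(ii), together with the Moreau-type identity J_C + J_{C⁻¹} = Id used via (3.15), can be sanity-checked numerically. The sketch below (an illustration, not from the thesis) works on X = R with the affine monotone operator C : x ↦ c·x + d, c > 0; the parameters and helper names are assumptions of this toy setup.

```python
# Toy check of Proposition 5.10 on X = R for C: x -> c*x + d with c > 0:
#   (i)  R_{C^{-1}} = -R_C
#   (ii) J_{C^T} = (-Id) o J_C o (-Id), where C^T: x -> -C(-x) = c*x - d

def J_C(x, c, d):
    # Resolvent of C: solve y + c*y + d = x.
    return (x - d) / (1.0 + c)

def J_Cinv(x, c, d):
    # Resolvent of C^{-1}: w -> (w - d)/c; solve y + (y - d)/c = x.
    return (c * x + d) / (1.0 + c)

def J_Ctop(x, c, d):
    # Resolvent of C^T: x -> c*x - d; solve y + c*y - d = x.
    return (x + d) / (1.0 + c)

for c, d, x in [(2.0, 1.0, 3.0), (0.3, -2.0, -5.0), (7.0, 0.0, 1.25)]:
    R_C = 2.0 * J_C(x, c, d) - x
    R_Cinv = 2.0 * J_Cinv(x, c, d) - x
    assert abs(R_Cinv + R_C) < 1e-12                      # (i)  R_{C^{-1}} = -R_C
    assert abs(J_Ctop(x, c, d) + J_C(-x, c, d)) < 1e-12   # (ii) J_{C^T} x = -J_C(-x)
    assert abs(J_C(x, c, d) + J_Cinv(x, c, d) - x) < 1e-12  # Moreau: J_C + J_{C^{-1}} = Id
```

In particular, (i) is just the Moreau identity in disguise, which is exactly how the proof of Proposition 5.10(i) proceeds.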
Douglas–Rachford operator and Attouch–The´ra dualityCorollary 5.11 (Peaceman–Rachford operator is self-dual). (See Eck-stein’s [71, Lemma 3.5 on page 125].) The Peaceman–Rachford operatorsfor (A, B) and (A, B)∗ = (A−1, B−>) coincide, i.e., we have self-duality in thesense thatRBRA = RB−>RA−1 . (5.11)Consequently,(∀λ ∈ [0, 1]) (1− λ) Id+λRBRA = (1− λ) Id+λRB−>RA−1 . (5.12)Proof. Using Proposition 5.10(i)&(iii), we obtainRB−>RA−1 = (Id−2J>B )(−RA) = −RA + 2JBRA= (2JB − Id)RA = RBRA. (5.13)which proves (5.11). Now (5.12) follows immediately from (5.11). Corollary 5.12 (Douglas–Rachford operator is self-dual). (See Eckstein’s[71, Lemma 3.6 on page 133].) For the Douglas–Rachford operatorT(A,B) = 12 Id+12 RBRA (5.14)we haveT(A,B) = JBRA + Id−JA = T(A−1,B−>). (5.15)Proof. The left equality is a simple expansion while self-duality is (5.12)with λ = 12 . Remark 5.13 (backward-backward operator is not self-dual). In contrastto Corollary 5.11, the backward–backward operator is not self-dual: indeed, using(3.15) and Proposition 5.10(iii), we deduce thatJB−> JA−1 = (Id−J>B )(Id−JA) = Id−JA + JB(JA − Id)= (JB − Id)(JA − Id). (5.16)Thus if A ≡ 0 and dom B is not a singleton (equivalently, JA = Id and ran JB isnot a singleton), then JB−> JA−1 ≡ (JB − Id)0 ≡ JB0 6= JB = JB JA.We now provide a new description of the fixed point set of the Douglas–Rachford splitting operator (see Theorem 5.14); this refines Combettes’description of (A + B)−1(0).525.4. Douglas–Rachford operator and Attouch–The´ra dualityTheorem 5.14. The mappingΨ : gr K→ Fix T : (z, k) 7→ z + k (5.17)is a well-defined bijection that is continuous in both directions, with Ψ−1 : x 7→(JAx, x− JAx).Proof. Take (z, k) ∈ gr K. Then k ∈ Kz = (Az) ∩ (−Bz). Now k ∈ Az ⇔z + k ∈ (Id+A)z ⇔ z = JA(z + k). Similarly, k ∈ (−Bz) ⇔ −k ∈ Bz ⇔z− k ∈ (Id+B)z⇔ z = JB(z− k). Set x = z + k. Then JAx = JA(z + k) = zand hence RAx = 2JAx− x = 2z− (z + k) = z− k. 
Thus,Tx = x− JAx + JBRAx = z + k− z + JB(z− k) = k + z = x, (5.18)i.e., x ∈ Fix T. It follows that Ψ is well-defined.Let us now show that Ψ is surjective. To this end, take x ∈ Fix T. Setz = JAx as well as k = (Id−JA)x = x− z. Clearly,x = z + k. (5.19)Now z = JAx⇔ x ∈ (Id+A)z = z + Az⇔ k = x− z ∈ Az. Thus,k ∈ Az. (5.20)We also have RAx = 2JAx− x = 2z− (z+ k) = z− k; hence, x = Tx = x−JAx+ JBRAx⇔ JAx = JBRAx⇔ z = JB(z− k)⇔ z− k ∈ (Id+B)z = z+ Bz⇔k ∈ −Bz. (5.21)Altogether, k ∈ (Az) ∩ (−Bz) = Kz ⇔ (z, k) ∈ gr K. Hence Ψ is surjective.In view of Fact 3.18 and since gr K ⊆ gr A, it is clear that Ψ is injectiveand the result follows. The following result is a straightforward consequence of Theorem 5.14.Corollary 5.15. We have(∀z ∈ Z) Kz = JA−1(J−1A z ∩ Fix T)(5.22)and(∀k ∈ K) Zk = JA(J−1A−1 k ∩ Fix T). (5.23)535.4. Douglas–Rachford operator and Attouch–The´ra dualityProof. Let (z, k) ∈ Z × X. Then k ∈ Kz ⇔ (z, k) ∈ gra K ⇔ (∃ f ∈ Fix T)such that z = JA f and k = JA−1 f ⇔ (∃ f ∈ Fix T) such that f ∈ J−1A z andk = JA−1 f ⇔ k ∈ JA−1(J−1A z ∩ Fix T). Similarly, one can prove (5.23). Recall that the so-called Kuhn–Tucker set (see Section 4.3) is defined byS = S(A,B) = {(z, k) ∈ X× X | − k ∈ Bz, k ∈ Az} ⊆ Z× K. (5.24)In the following result we reveal the importance of paramonotonicity. In-deed we prove in Corollary 5.16(v) that the fixed point set of the Douglas–Rachford splitting operator is a rectangle, see Corollary 5.16(v) below.Corollary 5.16. Recalling Fact 3.18, let M : X → gra A : x 7→ (JAx, JA−1 x).Then the following hold:(i) S = M(Fix T) = {(JA × JA−1)(y, y) ∣∣ y ∈ Fix T}.(ii) Fix T = M−1(S) = {z + k ∣∣ (z, k) ∈ S}.(iii) (Eckstein and Svaiter). S is closed and convex.If A and B are paramonotone, then we additionally have:(iv) S = Z× K.(v) Fix T = Z + K.(vi) Z = JA(Z + K),(vii) K = JA−1(Z + K) = (Id−JA)(Z + K).Proof. (i)&(ii): This is Theorem 5.14. (iii): See [74, Lemma 2]. 
Alternatively, since Fix T is closed, and M and M^{-1} are continuous, we deduce the closedness from (i). The convexity was proved in Proposition 4.14(i) and Remark 4.24. (iv): Combine Remark 4.24 and Corollary 4.30(ii). (v)–(vii): Combine Corollary 4.30(ii) with Theorem 5.14. ∎

Specializing the result of Corollary 4.41 to normal cone operators, in view of Corollary 5.16(v), we recover the consistent case of [20, Corollary 3.9].

Example 5.17. Suppose that A = N_U and B = N_V, where U and V are closed convex subsets of X such that U ∩ V ≠ ∅. Then Z = U ∩ V, K = N_{U−V}(0), and

P_Z = P_U P_{Z+K} = P_U P_{Fix T}. (5.25)

Proof. Combine Example 4.7, Corollary 5.16(v), and Corollary 4.41. ∎

5.5 The Douglas–Rachford operator and solution sets

Lemma 5.18. Suppose that A and B are paramonotone. Let k ∈ K be such that (∀z ∈ Z) J_A(z + k) = P_Z(z + k). Then k ∈ (Z − Z)^⊥.

Proof. By Corollary 5.16(v), Fix T = Z + K. Let z_1 and z_2 be in Z. It follows from Theorem 5.14 that (∀z ∈ Z) J_A(z + k) = z. Therefore,

(∀i ∈ {1, 2}) z_i + k ∈ Fix T and z_i = J_A(z_i + k) = P_Z(z_i + k). (5.26)

Furthermore, the Projection Theorem (see Fact 2.5) yields

⟨k, z_1 − z_2⟩ = ⟨z_1 + k − z_1, z_1 − z_2⟩ (5.27a)
= ⟨z_1 + k − P_Z(z_1 + k), P_Z(z_1 + k) − z_2⟩ ≥ 0. (5.27b)

On the other hand, interchanging the roles of z_1 and z_2 yields ⟨k, z_2 − z_1⟩ ≥ 0. Altogether, ⟨k, z_1 − z_2⟩ = 0. ∎

The next results relate the Douglas–Rachford operator to orthogonality properties of primal and dual solutions. These results extend and sharpen our results in Chapter 4.

Let S be a subset of X and let T : X → X. We use T|_S to refer to the restriction of the map T to the set S.

Theorem 5.19. Suppose that A and B are paramonotone. Then the following are equivalent:

(i) J_A P_{Fix T} = P_Z.
(ii) J_A|_{Fix T} = P_Z|_{Fix T}.
(iii) K ⊥ (Z − Z).

Proof. "(i)⇒(ii)": This is obvious. "(ii)⇒(iii)": Let k ∈ K and let z ∈ Z. Then Fix T = Z + K by Corollary 5.16(v); hence, z + k ∈ Fix T. Therefore J_A(z + k) = P_Z(z + k). Now apply Lemma 5.18.
"(iii)⇒(i)": This follows from Theorem 4.40(ii) and Corollary 5.16(v). ∎

As a consequence of Theorem 5.19, we now refine Example 5.17.

Example 5.20. Let U be a closed affine subspace of X, suppose that A = N_U and that B is paramonotone such that Z ≠ ∅. Then

P_Z = J_A P_{Fix T} = P_U P_{Fix T}. (5.28)

Proof. Combine Corollary 4.31(iii) and Theorem 5.19. ∎

Recall that Proposition 4.10 implies

zer A ∩ zer B ≠ ∅ ⇔ 0 ∈ K. (5.29)

Theorem 5.21. Suppose that A and B are paramonotone and that zer A ∩ zer B ≠ ∅. Then the following hold:

(i) J_A P_{Fix T} = P_Z.
(ii) We have the implication

A or B is single-valued ⇒ Fix T = (zer A) ∩ (zer B). (5.30)

Proof. (i): Combine Corollary 4.41 and Corollary 5.16(v). (ii): Combine Corollary 5.16(v) with Theorem 4.32(i)&(iii). ∎

In view of (5.29) and Theorem 5.21(i), when A and B are paramonotone we have the implication 0 ∈ K ⇒ J_A P_{Fix T} = P_Z. However, the converse implication is not true, as we show in the next example.

Example 5.22. Suppose that a ∈ X \ {0}, that A = Id − 2a and that B = Id. Then Z = {a} and (A^{-1}, B^{-∨}) = (Id + 2a, Id); hence K = {−a}, Z − Z = {0}, and therefore K ⊥ (Z − Z), which implies that J_A P_{Fix T} = P_Z by Theorem 5.19; yet 0 ∉ K.

If neither A nor B is single-valued, then the conclusion of Theorem 5.21(ii) may fail, as we now illustrate.

Example 5.23. Suppose that X = R², that U = R × {0}, that V = ball((0, 1); 1), that A = N_U and that B = N_V. By Example 4.7, Z = U ∩ V = {(0, 0)} and K = N_{U−V}(0) = R₊·(0, 1) ≠ {(0, 0)}. Therefore Fix T = R₊·(0, 1) ≠ {(0, 0)} = U ∩ V = zer A ∩ zer B.

Working in X × X, we recall that (see [12, Proposition 23.16])

A × B is maximally monotone and J_{A×B} = J_A × J_B. (5.31)

Corollary 5.24. Let x ∈ X and let (z, k) ∈ S. Then

‖(J_A Tx, J_{A^{-1}} Tx) − (z, k)‖² = ‖J_A Tx − z‖² + ‖J_{A^{-1}} Tx − k‖² (5.32a)
≤ ‖J_A x − z‖² + ‖J_{A^{-1}} x − k‖² (5.32b)
= ‖(J_A x, J_{A^{-1}} x) − (z, k)‖². (5.32c)

Proof.
It follows from Theorem 5.14 that z + k ∈ Fix T, J_A(z + k) = z and J_{A^{-1}}(z + k) = k. Now combine with Theorem 5.9(v), with y replaced by z + k. ∎

We recall, as a consequence of Fact 3.7 and Example 3.24(i), that when X = R the operators A and B are paramonotone. Using Corollary 5.16(iv), we then have S = Z × K.

Lemma 5.25. Suppose that X = R. Let x ∈ X and let (z, k) ∈ Z × K. Then the following hold:

(i) |J_A Tx − z|² ≤ |J_A x − z|².
(ii) |J_{A^{-1}} Tx − k|² ≤ |J_{A^{-1}} x − k|².

Proof. (i): Set

q(x, z) = |J_A Tx − z|² − |J_A x − z|². (5.33)

If x ∈ Fix T, we get q(x, z) = 0. Suppose that x ∈ R \ Fix T. Since T is firmly nonexpansive, Id − T is firmly nonexpansive (see Fact 3.2(ii)), hence monotone by Fact 3.4. Therefore (∀x ∈ R \ Fix T)(∀f ∈ Fix T) we have

(x − Tx)(x − f) = ((Id − T)x − (Id − T)f)(x − f) > 0. (5.34)

Notice that (5.33) can be rewritten as

q(x, z) = (J_A Tx − J_A x)((J_A Tx − z) + (J_A x − z)). (5.35)

We argue by cases.

Case 1: x < Tx. It follows from (5.34) that

(∀f ∈ Fix T) x < f. (5.36)

On the one hand, since J_A is firmly nonexpansive, J_A is monotone and therefore J_A Tx − J_A x ≥ 0. On the other hand, it follows from Fact 5.2(iii) that (∃f ∈ Fix T) such that z = J_A f = J_A T f. Using (5.36) and the monotonicity of J_A, we conclude that J_A x − z = J_A x − J_A f ≤ 0. Moreover, since J_A and T are firmly nonexpansive operators on R, the composition J_A ∘ T is firmly nonexpansive, hence monotone, and therefore (5.36) implies that J_A Tx − z = J_A Tx − J_A T f ≤ 0. Combining with (5.35), we conclude that (i) holds.

Case 2: x > Tx. The proof is similar to Case 1.

(ii): Apply the result of (i) to A^{-1} and use (3.15). ∎

In view of (5.24), one might conjecture that Corollary 5.24 remains true when S is replaced by Z × K. The following example gives a negative answer to this conjecture. It also illustrates that when X ≠ R, the conclusion of Lemma 5.25 can fail.

Example 5.26.
Suppose that X = R², that A is the normal cone operator of R²₊, and that B : X → X : (x_1, x_2) ↦ (−x_2, x_1) is the rotator by π/2. Then Fix T = R₊·(1, −1), Z = R₊ × {0} and K = {0} × R₋. Moreover, (∃x ∈ R²) (∃(z, k) ∈ Z × K) such that

‖(J_A Tx, J_{A^{-1}} Tx) − (z, k)‖² − ‖(J_A x, J_{A^{-1}} x) − (z, k)‖² > 0

and ‖J_A Tx − z‖² − ‖J_A x − z‖² > 0.

Proof. Let (x_1, x_2) ∈ R². Using Example 3.17, we have

J_B(x_1, x_2) = (½(x_1 + x_2), ½(−x_1 + x_2)) (5.37)

and

R_B(x_1, x_2) = (x_2, −x_1) = −B(x_1, x_2). (5.38)

Hence R_B^{-1} = (−B)^{-1} = B. Using (5.2), we conclude that (x_1, x_2) ∈ Fix T ⇔ (x_1, x_2) ∈ Fix R_B R_A. Hence

Fix T = {(x_1, x_2) ∈ R² | (x_1, x_2) = R_B R_A(x_1, x_2)} (5.39a)
= {(x_1, x_2) ∈ R² | R_B^{-1}(x_1, x_2) = 2J_A(x_1, x_2) − (x_1, x_2)} (5.39b)
= {(x_1, x_2) ∈ R² | B(x_1, x_2) + (x_1, x_2) = 2P_{R²₊}(x_1, x_2)} (5.39c)
= {(x_1, x_2) ∈ R² | (x_1 − x_2, x_1 + x_2) = 2P_{R²₊}(x_1, x_2)}. (5.39d)

We argue by cases.

Case 1: x_1 ≥ 0 and x_2 ≥ 0. Then (x_1, x_2) ∈ Fix T ⇔ (x_1 − x_2, x_1 + x_2) = 2P_{R²₊}(x_1, x_2) = (2x_1, 2x_2) ⇔ x_1 = −x_2 and x_1 = x_2 ⇔ x_1 = x_2 = 0.

Case 2: x_1 < 0 and x_2 < 0. Then (x_1, x_2) ∈ Fix T ⇔ (x_1 − x_2, x_1 + x_2) = 2P_{R²₊}(x_1, x_2) = (0, 0) ⇔ x_1 = x_2 and x_1 = −x_2 ⇔ x_1 = x_2 = 0, which contradicts x_1 < 0 and x_2 < 0.

Case 3: x_1 ≥ 0 and x_2 < 0. Then (x_1, x_2) ∈ Fix T ⇔ (x_1 − x_2, x_1 + x_2) = 2P_{R²₊}(x_1, x_2) = (2x_1, 0) ⇔ x_1 = −x_2.

Case 4: x_1 < 0 and x_2 ≥ 0. Then (x_1, x_2) ∈ Fix T ⇔ (x_1 − x_2, x_1 + x_2) = 2P_{R²₊}(x_1, x_2) = (0, 2x_2) ⇔ x_1 = x_2, which never occurs since x_1 < 0 and x_2 ≥ 0.

Altogether, we conclude that Fix T = R₊·(1, −1), as claimed. Using Fact 5.2(iii)&(iv), we have Z = J_A(Fix T) = R₊ × {0} and K = J_{A^{-1}}(Fix T) = (Id − J_A)(Fix T) = P_{R²₋}(Fix T) = {0} × R₋.

Now let a > 0, let x = (a, 0), set z = (2a, 0) ∈ Z and set k = (0, −a) ∈ K. Notice that Tx = x − P_{R²₊}x + ½(Id − B)x (since R_A x = x here and J_B = ½(Id − B)); hence Tx = (a, 0) − (a, 0) + ½((a, 0) − (0, a)) = (½a, −½a). Therefore J_A x = P_{R²₊}(a, 0) = (a, 0), J_{A^{-1}} x = P_{R²₋}(a, 0) = (0, 0), J_A Tx = P_{R²₊}(½a, −½a) = (½a, 0), and J_{A^{-1}} Tx = P_{R²₋}(½a, −½a) = (0, −½a).
Therefore

‖(J_A Tx, J_{A^{-1}} Tx) − (z, k)‖² − ‖(J_A x, J_{A^{-1}} x) − (z, k)‖²
= ‖J_A Tx − z‖² + ‖J_{A^{-1}} Tx − k‖² − ‖J_A x − z‖² − ‖J_{A^{-1}} x − k‖²
= ‖(½a, 0) − (2a, 0)‖² + ‖(0, −½a) − (0, −a)‖² − ‖(a, 0) − (2a, 0)‖² − ‖(0, 0) − (0, −a)‖²
= (9/4)a² + (1/4)a² − a² − a² = ½a² > 0.

Similarly, one can verify that ‖J_A Tx − z‖² − ‖J_A x − z‖² = (5/4)a² > 0. ∎

Throughout the rest of this section, we assume that

A : X ⇉ X and B : X ⇉ X are maximally monotone linear relations;

equivalently, by [27, Theorem 2.1(xviii)], that

J_A and J_B are linear operators from X to X. (5.40)

This additional assumption leads to stronger conclusions.

Lemma 5.27. Id − T = J_A − 2J_B J_A + J_B.

Proof. Let x ∈ X. Then indeed x − Tx = J_A x − J_B R_A x = J_A x − J_B(2J_A x − x) = J_A x − 2J_B J_A x + J_B x. ∎

Lemma 5.28. Suppose that U is a linear subspace of X and that A = P_U. Then A is maximally monotone,

J_A = J_{P_U} = ½(Id + P_{U^⊥}), and R_A = P_{U^⊥} = Id − A. (5.41)

Proof. Let (x, y) ∈ X × X. Then

y = J_A x ⇔ x = y + P_U y. (5.42)

Now assume y = J_A x. Since P_U is linear, (5.42) implies that P_{U^⊥} x = P_{U^⊥} y. Moreover, y = x − P_U y = ½(x + x − 2P_U y) = ½(x + y + P_U y − 2P_U y) = ½(x + y − P_U y) = ½(x + P_{U^⊥} y) = ½(x + P_{U^⊥} x). Next, R_A = 2J_A − Id = (Id + P_{U^⊥}) − Id = P_{U^⊥}. ∎

We say that a linear relation A is skew (see [25]) if (∀(a, a*) ∈ gra A) ⟨a, a*⟩ = 0.

Lemma 5.29. Suppose that A : X → X and B : X → X are both skew and A² = B² = −Id. Then Id − T = ½(Id − BA).

Proof. It follows from Proposition 3.16 that R_A = −A and R_B = −B (compare (5.38)). Therefore (5.2) implies that Id − T = ½(Id − R_B R_A) = ½(Id − BA). ∎

Example 5.30. Suppose that A and B are skew. Let x ∈ X and let y ∈ X. Then the following hold:

(i) ⟨Tx − Ty, x − y⟩ = ‖Tx − Ty‖².
(ii) ⟨(Id − T)x − (Id − T)y, x − y⟩ = ‖(Id − T)x − (Id − T)y‖².
(iii) ‖x − y‖² = ‖Tx − Ty‖² + ‖(Id − T)x − (Id − T)y‖².
(iv) ‖J_A x − J_A y‖² + ‖J_{A^{-1}} x − J_{A^{-1}} y‖² − ‖J_A Tx − J_A Ty‖² − ‖J_{A^{-1}} Tx − J_{A^{-1}} Ty‖² = ‖(Id − T)x − (Id − T)y‖².
(v) ‖x‖² = ‖Tx‖² + ‖x − Tx‖².
(vi) ⟨Tx, x − Tx⟩ = 0.

Proof. (i)–(iv): Apply Theorem 5.9, and use (3.20) as well as the skewness of A and B. (v): Apply (iii) with y = 0.
(vi): We have 2⟨Tx, x − Tx⟩ = ‖x‖² − ‖Tx‖² − ‖x − Tx‖². Now apply (v). ∎

Suppose that U is a closed affine subspace of X. One easily verifies that

(∀x ∈ X)(∀y ∈ X) ⟨P_U x − P_U y, (Id − P_U)x − (Id − P_U)y⟩ = 0. (5.43)

Example 5.31. Suppose that U and V are closed affine subspaces of X such that U ∩ V ≠ ∅, that A = N_U, and that B = N_V. Let x ∈ X, and let (z, k) ∈ Z × K. Then

‖(P_U x, (Id − P_U)x) − (z, k)‖² − ‖(P_U Tx, (Id − P_U)Tx) − (z, k)‖² (5.44a)
= ‖x − (z + k)‖² − ‖Tx − (z + k)‖² (5.44b)
= ‖x − Tx‖² (5.44c)
= ‖P_U x − P_V x‖². (5.44d)

Proof. As subdifferential operators, A and B are paramonotone (by Example 3.24(i)). It follows from Corollary 5.16(v) and Theorem 5.14 that

z + k ∈ Fix T, P_U(z + k) = z and (Id − P_U)(z + k) = k. (5.45)

Hence, in view of (5.43), we have

‖(P_U x, (Id − P_U)x) − (z, k)‖² (5.46a)
= ‖P_U x − z‖² + ‖(Id − P_U)x − k‖² (5.46b)
= ‖P_U x − P_U(z + k)‖² + ‖(Id − P_U)x − (Id − P_U)(z + k)‖² + 2⟨P_U x − P_U(z + k), (Id − P_U)x − (Id − P_U)(z + k)⟩ (5.46c)
= ‖P_U x − P_U(z + k) + (Id − P_U)x − (Id − P_U)(z + k)‖² (5.46d)
= ‖x − (z + k)‖². (5.46e)

Applying (5.46) with x replaced by Tx yields

‖(P_U Tx, (Id − P_U)Tx) − (z, k)‖² = ‖Tx − (z + k)‖². (5.47)

Combining (5.46) and (5.47) yields (5.44b). It follows from (5.43) and Theorem 5.9(iii), applied with (A, B, y) replaced by (N_U, N_V, z + k), that

‖x − (z + k)‖² − ‖Tx − T(z + k)‖² = ‖x − Tx − ((z + k) − T(z + k))‖², (5.48)

which, in view of (5.45), proves (5.44c).

Now we turn to (5.44d). Let w ∈ U ∩ V. Then U = w + par U and V = w + par V. Suppose momentarily that w = 0. In this case, par U = U and par V = V. Using [30, Proposition 3.4(i)], we have

T = T_{(U,V)} = P_V P_U + P_{V^⊥} P_{U^⊥}. (5.49)

Therefore

x − Tx = P_U x + P_{U^⊥} x − P_V P_U x − P_{V^⊥} P_{U^⊥} x (5.50a)
= (Id − P_V)P_U x + (Id − P_{V^⊥})P_{U^⊥} x (5.50b)
= P_{V^⊥} P_U x + P_V P_{U^⊥} x.
(5.50c)

Using (5.50c), we have

‖x − Tx‖² = ‖P_{V^⊥} P_U x + P_V P_{U^⊥} x‖² (5.51a)
= ‖P_U x − P_V P_U x + P_V x − P_V P_U x‖² (5.51b)
= ‖P_U x − 2P_V P_U x + P_V x‖² (5.51c)
= ‖P_U x‖² + ‖P_V x‖² + 4‖P_V P_U x‖² + 2⟨P_U x, P_V x⟩ − 4⟨P_U x, P_V P_U x⟩ − 4⟨P_V x, P_V P_U x⟩ (5.51d)
= ‖P_U x‖² + ‖P_V x‖² − 2⟨P_U x, P_V x⟩ = ‖P_U x − P_V x‖². (5.51e)

Now, if w ≠ 0, by Proposition 9.11 below we have Tx = w + T_{(par U, par V)}(x − w). Therefore, (5.51) yields

‖x − Tx‖² = ‖(x − w) − T_{(par U, par V)}(x − w)‖² (5.52a)
= ‖P_{par U}(x − w) − P_{par V}(x − w)‖² (5.52b)
= ‖w + P_{par U}(x − w) − (w + P_{par V}(x − w))‖² (5.52c)
= ‖P_U x − P_V x‖², (5.52d)

where the last equality follows from Fact 2.6. ∎

5.6 From PDEs to maximally monotone operators

In this section we briefly show the connection between the original Douglas–Rachford algorithm, introduced in [70] (see also [69], [103] and [117] for variations of this method) to solve certain types of heat equations, and the general algorithm introduced by Lions and Mercier in [96]. At times, our presentation follows [61], which also contains additional historical comments.

Suppose that Ω is a bounded square region in R². Consider the Dirichlet problem for the Poisson equation: given f and g, find u : Ω → R such that

∆u = f on Ω and u = g on bdry Ω, (5.53)

where ∆ = ∇² = ∂²/∂x² + ∂²/∂y² is the Laplace operator and bdry Ω denotes the boundary of Ω. Discretizing u and then converting it into a long vector y (see [102, Example 7.6.2 & Problem 7.6.9]), we obtain the system of linear equations

L→ y + L↑ y = −b. (5.54)

Here L→ and L↑ denote the horizontal (respectively, vertical) positive definite discretization of the negative Laplacian over a square mesh with n² points at equally spaced intervals (see [102, Problem 7.6.10]). We have

L→ = Id ⊗ M and L↑ = M ⊗ Id, (5.55)

where M ∈ R^{n×n} is the tridiagonal matrix with 2 on the main diagonal and −1 on the first sub- and superdiagonals:

M = tridiag(−1, 2, −1) ∈ R^{n×n}. (5.56)

To see the connection to monotone operators, set A = L→ and B = L↑ + b : y ↦ L↑ y + b. Then A and B are affine and strictly monotone.
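The resolvent formulas (5.64)–(5.65) given below can be checked in exact rational arithmetic. The following sketch (our own helper functions, not from the thesis) builds M, forms J_M = (Id + M)^{-1} by Gauss–Jordan elimination for n = 3, and verifies that the resolvent of L→ = Id ⊗ M is Id ⊗ J_M.

```python
from fractions import Fraction

def eye(n):
    # n-by-n identity matrix with exact rational entries
    return [[Fraction(1) if i == j else Fraction(0) for j in range(n)] for i in range(n)]

def tridiag(n):
    # the matrix M of (5.56): 2 on the diagonal, -1 on the two adjacent diagonals
    return [[Fraction(2) if i == j else Fraction(-1) if abs(i - j) == 1 else Fraction(0)
             for j in range(n)] for i in range(n)]

def kron(A, B):
    # Kronecker product A (x) B
    p, q = len(B), len(B[0])
    return [[A[i // p][j // q] * B[i % p][j % q]
             for j in range(len(A[0]) * q)] for i in range(len(A) * p)]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def inverse(A):
    # Gauss-Jordan elimination in exact rational arithmetic
    n = len(A)
    aug = [list(A[i]) + [Fraction(1) if i == j else Fraction(0) for j in range(n)]
           for i in range(n)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

JM = inverse(add(eye(3), tridiag(3)))      # J_M = (Id + M)^{-1} for n = 3
assert JM[0] == [Fraction(8, 21), Fraction(1, 7), Fraction(1, 21)]   # first row of (5.65)
assert JM[1] == [Fraction(1, 7), Fraction(3, 7), Fraction(1, 7)]

# the resolvent of L_> = Id (x) M equals Id (x) J_M, as in (5.64)
assert inverse(add(eye(9), kron(eye(3), tridiag(3)))) == kron(eye(3), JM)
```

The last assertion is the matrix identity (Id + Id ⊗ M)^{-1} = Id ⊗ (Id + M)^{-1} underlying (5.64); exact fractions avoid any floating-point ambiguity in comparing the entries of (5.65).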
The problem then reduces to

find y ∈ R^{n²} such that Ay + By = 0, (5.57)

and the algorithm proposed by Douglas and Rachford in [70] becomes

y_{n+1/2} + Ay_n + By_{n+1/2} − y_n = 0, (5.58a)
y_{n+1} − y_{n+1/2} − Ay_n + Ay_{n+1} = 0. (5.58b)

Consequently,

(5.58a) ⇔ (Id + B)y_{n+1/2} = (Id − A)y_n ⇔ y_{n+1/2} = J_B(Id − A)y_n, (5.59a)
(5.58b) ⇔ (Id + A)y_{n+1} = Ay_n + y_{n+1/2} ⇔ y_{n+1} = J_A(Ay_n + y_{n+1/2}). (5.59b)

Substituting (5.59a) into (5.59b) to eliminate y_{n+1/2} yields

y_{n+1} = J_A(Ay_n + J_B(Id − A)y_n). (5.60)

To proceed further, we must show that

(Id − A)J_A = R_A (5.61a)
AJ_A = Id − J_A. (5.61b)

Indeed, note that Id − A = 2 Id − (Id + A); therefore, multiplying by J_A = (Id + A)^{-1} from the right yields (Id − A)J_A = (2 Id − (Id + A))J_A = 2J_A − Id = R_A. Hence J_A − AJ_A = J_A − (Id − J_A); equivalently, AJ_A = Id − J_A.

Now consider the change of variable

(∀n ∈ N) x_n = (Id + A)y_n, (5.62)

which is equivalent to y_n = J_A x_n. Substituting (5.60) into (5.62) and using (5.61) yields

x_{n+1} = (Id + A)y_{n+1} = (Id + A)J_A(Ay_n + J_B(Id − A)y_n) (5.63a)
= Ay_n + J_B(Id − A)y_n (5.63b)
= AJ_A x_n + J_B(Id − A)J_A x_n (5.63c)
= x_n − J_A x_n + J_B R_A x_n (5.63d)
= (Id − J_A + J_B R_A)x_n, (5.63e)

which is the Douglas–Rachford update formula (5.2).

We point out that J_A = J_{L→}, and, using [12, Proposition 23.15(ii)], we have J_B = J_{L↑+b} = J_{L↑} − J_{L↑}b. To calculate J_A and J_B, apply Corollary 3.49 to get

J_A = Id_n ⊗ J_M and J_B = J_M ⊗ Id_n − (J_M ⊗ Id_n)(b). (5.64)

For instance, when n = 3, the above calculations yield

J_M = [ 8/21  1/7   1/21 ]
      [ 1/7   3/7   1/7  ]
      [ 1/21  1/7   8/21 ], (5.65)

Id_3 ⊗ J_M = the 9 × 9 block diagonal matrix diag(J_M, J_M, J_M), (5.66)

and

J_M ⊗ Id_3 = [ (8/21) Id_3  (1/7) Id_3   (1/21) Id_3 ]
             [ (1/7) Id_3   (3/7) Id_3   (1/7) Id_3  ]
             [ (1/21) Id_3  (1/7) Id_3   (8/21) Id_3 ].
(5.67)

5.7 Why do we need to work with monotone operators and not just subdifferential operators?

As shown earlier, the Douglas–Rachford operator T is firmly nonexpansive (see Fact 5.2(i)), which, in view of Fact 3.12, asserts that T = J_C, i.e., T is the resolvent of some maximally monotone operator C. Hence the Douglas–Rachford algorithm is a particular instance of the proximal point method; in the latter, C is a subdifferential operator of some f ∈ Γ(X). (The results by Rockafellar [124] also allow for additional parameters and summable errors.) Interestingly, in [71] Eckstein demonstrated that even if A and B are subdifferential operators, whose resolvents in this case are proximal mappings (see Example 3.14), the corresponding Douglas–Rachford operator may be a resolvent that is not a proximal mapping (see also [40] for a detailed discussion and examples of this situation). For the sake of completeness, we include the following example, which appears in [126].

Example 5.32. Suppose that X = R², that A = N_{R(1,0)} and that B = N_{R(1,1)}. Simple calculations yield

T = ½ [ 1  −1 ]
      [ 1   1 ], (5.68)

which implies that T = J_C, where

C = [  0  1 ]
    [ −1  0 ]. (5.69)

Note that T is not symmetric, and therefore is not a proximal mapping by [40, Corollary 2.5].

Therefore, even if we are interested in solving only optimization problems, where the resolvents of the operators (which in this case are subdifferential operators) are proximal mappings, the corresponding Douglas–Rachford splitting operator may be merely the resolvent of a monotone operator that is not a proximal mapping. Hence, in view of Fact 3.12, we do actually need to work with general monotone operators and not just subdifferential operators.

Chapter 6

On Fejér monotone sequences and nonexpansive mappings

6.1 Overview

Let C be a nonempty closed convex subset of X.
A sequence (x_n)_{n∈N} in X is called Fejér monotone (see [11], [59] and [60]) with respect to C if

(∀c ∈ C)(∀n ∈ N) ‖x_{n+1} − c‖ ≤ ‖x_n − c‖. (6.1)

In other words, each point in a Fejér monotone sequence is no further from any point of C than its predecessor.

The notion of Fejér monotonicity has proven to be a fruitful concept in fixed point theory and optimization. It has shown itself to be an efficient tool for analyzing various iterative algorithms in convex optimization.

We summarize our main results in this chapter as follows:

• We present new conditions sufficient for the convergence of Fejér monotone sequences (see Lemma 6.3 and Theorem 6.11) and for the convergence of the associated projected sequence (see Lemma 6.4 and Proposition 6.5).
• We provide a mild generalization of Ostrowski's Theorem [113, Theorem 26.1] (see Lemma 6.8).

Except for facts with explicit references, this chapter is based on results that appear in [36], [13], and [33].

6.2 Fejér monotonicity: New principles

We start by recalling some pleasant properties of Fejér monotone sequences.

Fact 6.1. Let (x_n)_{n∈N} be a sequence in X that is Fejér monotone with respect to a nonempty closed convex subset C of X. Then the following hold:

(i) The sequence (x_n)_{n∈N} is bounded.
(ii) For every c ∈ C, the sequence (‖x_n − c‖)_{n∈N} converges.
(iii) The set of strong cluster points of (x_n)_{n∈N} lies in a sphere of X.
(iv) The sequence (P_C x_n)_{n∈N} converges strongly to a point in C.
(v) If int C ≠ ∅, then (x_n)_{n∈N} converges strongly to a point in X.
(vi) If C is a closed affine subspace of X, then (∀n ∈ N) P_C x_n = P_C x_0.
(vii) Every weak cluster point of (x_n)_{n∈N} that belongs to C must be lim_{n→∞} P_C x_n.
(viii) The sequence (x_n)_{n∈N} converges weakly to some point in C if and only if all weak cluster points of (x_n)_{n∈N} lie in C.
(ix) If all weak cluster points of (x_n)_{n∈N} lie in C, then (x_n)_{n∈N} converges weakly to lim_{n→∞} P_C x_n.

Proof. (i)&(ii): [12, Proposition 5.4]. (iii): Clear from (ii). (iv): [12, Proposition 5.7].
(v): [12, Proposition 5.10]. (vi): [12, Proposition 5.9(i)]. (vii): This follows from [12, Corollary 5.11]. (viii): This follows from [12, Theorem 5.5]. (ix): Combine (viii) with (vii). ∎

The following result was first presented in [7, Theorem 6.2.2(ii)]; for completeness, we include its short proof.

Lemma 6.2. Let (x_n)_{n∈N} be a sequence in X that is Fejér monotone with respect to a nonempty closed convex subset C of X. Let w_1 and w_2 be weak cluster points of (x_n)_{n∈N}. Then w_1 − w_2 ∈ (C − C)^⊥.

Proof. Let (c_1, c_2) ∈ C × C. Using Fact 6.1(ii), set L_i = lim_{n→∞} ‖x_n − c_i‖² for i ∈ {1, 2}. Note that

‖x_n − c_1‖² = ‖x_n − c_2‖² + ‖c_1 − c_2‖² + 2⟨x_n − c_2, c_2 − c_1⟩. (6.2)

Now suppose that x_{k_n} ⇀ w_1 and x_{l_n} ⇀ w_2. Taking the limit in (6.2) along the two subsequences (k_n)_{n∈N} and (l_n)_{n∈N} yields L_1 = L_2 + ‖c_2 − c_1‖² + 2⟨w_1 − c_2, c_2 − c_1⟩ and L_1 = L_2 + ‖c_2 − c_1‖² + 2⟨w_2 − c_2, c_2 − c_1⟩. Subtracting the last two equations yields 2⟨c_2 − c_1, w_1 − w_2⟩ = 0. ∎

The next novel result on Fejér monotone sequences is of critical importance in our analysis. (When (u_n)_{n∈N} = (x_n)_{n∈N} one obtains a well-known result; see [12, Theorem 5.5].)

Lemma 6.3 (new Fejér monotonicity principle). Suppose that E is a nonempty closed convex subset of X, that (x_n)_{n∈N} is a sequence in X that is Fejér monotone with respect to E, that (u_n)_{n∈N} is a bounded sequence in X whose weak cluster points lie in E, and that

(∀e ∈ E) ⟨u_n − e, u_n − x_n⟩ → 0. (6.3)

Then (u_n)_{n∈N} converges weakly to some point in E.

Proof. It follows from (6.3) that (∀(e_1, e_2) ∈ E × E)

⟨e_2 − e_1, u_n − x_n⟩ = ⟨u_n − e_1, u_n − x_n⟩ − ⟨u_n − e_2, u_n − x_n⟩ → 0. (6.4)

Next, let us assume that e_1 and e_2 are two (possibly different) weak cluster points of (u_n)_{n∈N}. Obtain four subsequences (x_{k_n})_{n∈N}, (x_{l_n})_{n∈N}, (u_{k_n})_{n∈N} and (u_{l_n})_{n∈N} such that x_{k_n} ⇀ y_1, x_{l_n} ⇀ y_2, u_{k_n} ⇀ e_1 and u_{l_n} ⇀ e_2. Taking the limit in (6.4) along these subsequences, we obtain ⟨e_2 − e_1, e_1 − y_1⟩ = 0 = ⟨e_2 − e_1, e_2 − y_2⟩; hence

‖e_2 − e_1‖² = ⟨e_2 − e_1, y_2 − y_1⟩.
(6.5)

Since {e_1, e_2} ⊆ E, we conclude, in view of Lemma 6.2, that ⟨e_2 − e_1, y_2 − y_1⟩ = 0. By (6.5), e_1 = e_2. ∎

Lemma 6.4. Let A be a closed linear subspace of X, let C be a nonempty closed convex subset of A, and let (x_n)_{n∈N} be a sequence in X. Suppose that (x_n)_{n∈N} is Fejér monotone with respect to C, i.e., (∀n ∈ N)(∀c ∈ C) ‖x_{n+1} − c‖ ≤ ‖x_n − c‖, and that all the weak cluster points of (P_A x_n)_{n∈N} lie in C. Then (P_A x_n)_{n∈N} converges weakly to some point in C.

Proof. Since (x_n)_{n∈N} is bounded (by Fact 6.1(i)) and P_A is (firmly) nonexpansive, we learn that (P_A x_n)_{n∈N} is bounded and, by assumption, its weak cluster points lie in C ⊆ A. Now let c_1 and c_2 be in C. On the one hand, the Fejér monotonicity of (x_n)_{n∈N} implies the convergence of the sequences (‖x_n − c_1‖²)_{n∈N} and (‖x_n − c_2‖²)_{n∈N} by Fact 6.1(ii). On the other hand, expanding and simplifying yield ‖x_n − c_1‖² − ‖x_n − c_2‖² = ‖x_n‖² + ‖c_1‖² − 2⟨x_n, c_1⟩ − ‖x_n‖² − ‖c_2‖² + 2⟨x_n, c_2⟩ = ‖c_1‖² − 2⟨x_n, c_1 − c_2⟩ − ‖c_2‖², which in turn implies that (⟨x_n, c_1 − c_2⟩)_{n∈N} converges. Since c_1 ∈ A and c_2 ∈ A, we have

⟨x_n, c_1 − c_2⟩ = ⟨x_n, P_A c_1 − P_A c_2⟩ = ⟨x_n, P_A(c_1 − c_2)⟩ = ⟨P_A x_n, c_1 − c_2⟩. (6.6)
By Fact 6.1(i), (xn)n∈N isbounded, hence so is (PAxn)n∈N because PA is nonexpansive. Note that byassumption, all cluster points of (PAxn)n∈N lie in C. Let c be an arbitrarycluster point of (PAxn)n∈N. Then there exist a subsequence (xkn)n∈N of(xn)n∈N and a point x ∈ X such that xkn → x and PAxkn → PAx = c ∈ C.It follows from [12, Proposition 28.5] that PCxkn → PCx = PAx = c. On theother hand, by assumption we have PCxkn → c∗. Altogether, c = c∗ and theresult follows. Our next result decouples Feje´r monotonicity into two properties in thecase when the underlying set can be written as the sum of a set and a cone.Proposition 6.6. Let (xn)n∈N be a sequence in X, let E be a nonempty subset ofX and let K be a nonempty convex cone of X. Then the following are equivalent:(i) (xn)n∈N is Feje´r monotone with respect to E + K.(ii) (xn)n∈N is Feje´r monotone with respect to E and (∀n ∈N) xn+1 ∈ xn +K⊕, where K⊕ ={u ∈ X ∣∣ inf 〈u, K〉 = 0}.Proof. Set(∀x ∈ X)(∀n ∈N) ∆n(x) = ‖xn − x‖2 − ‖xn+1 − x‖2. (6.8)Then for every e ∈ E and k ∈ K, we have∆n(e + k) = ‖xn − e‖2 + ‖k‖2 − 2 〈xn − e, k〉706.2. Feje´r monotonicity: New principles− (‖xn+1 − e‖2 + ‖k‖2 − 2 〈xn+1 − e, k〉 ) (6.9a)= ∆n(e) + 2 〈xn+1 − xn, k〉 . (6.9b)Assume first that (i) holds. Then (xn)n∈N is Feje´r monotone with respect toE because E ⊆ E + K. Let (e, k) ∈ E× K and n ∈N. Using (6.8),0 ≤ ∆n(e + k) = ∆n(e) + 2 〈xn+1 − xn, k〉 . (6.10)Since K is a cone, this shows that 2 inf 〈xn+1 − xn,R++k〉 ≥ −∆n(e) > −∞.Hence 〈xn+1 − xn, k〉 ≥ 0. It follows that xn+1 − xn ∈ K⊕. Conversely, if (ii)holds, then (6.9) immediately yields (i). The following consequence of Proposition 6.6 shows that Proposi-tion 6.6 is a generalization of Fact 6.1(vi).Corollary 6.7. 
Let (x_n)_{n∈N} be a sequence in X, and let C be a closed affine subspace of X, say C = c + Y, where Y is a closed linear subspace of X. Then (x_n)_{n∈N} is Fejér monotone with respect to C if and only if (∀n ∈ N) ‖x_{n+1} − c‖ ≤ ‖x_n − c‖ and x_{n+1} ∈ x_n + Y^⊥, in which case (P_C x_n)_{n∈N} is a constant sequence.

We continue with the following lemma, which is a minor generalization of a theorem of Ostrowski (see [113, Theorem 26.1]), whose proof we follow.

Lemma 6.8. Let (Y, d) be a metric space, and let (x_n)_{n∈N} be a sequence in a compact subset C of Y such that d(x_n, x_{n+1}) → 0. Then the set of cluster points of (x_n)_{n∈N} is a compact connected subset of C.

Proof. Denote the set of cluster points of (x_n)_{n∈N} by S and assume to the contrary that S = A ∪ B, where A and B are nonempty closed subsets of Y and A ∩ B = ∅. Then

δ = inf_{(a,b)∈A×B} d(a, b) > 0. (6.11)

By the assumption on (x_n)_{n∈N}, there exists n_0 ∈ N such that (∀n ≥ n_0) d(x_n, x_{n+1}) ≤ δ/3. Let a ∈ A. Then there exists m > n_0 such that d(x_m, a) < δ/3. Because (x_n)_{n>m} has a cluster point in B, there exists a smallest integer k > m such that d(x_k, B) < 2δ/3. Then d(x_{k−1}, B) ≥ 2δ/3 and hence d(x_k, B) ≥ d(x_{k−1}, B) − d(x_{k−1}, x_k) ≥ 2δ/3 − δ/3 = δ/3. Thus δ/3 ≤ d(x_k, B) < 2δ/3. Repeating this argument yields a subsequence (x_{k_n})_{n∈N} of (x_n)_{n∈N} such that (∀n ∈ N) δ/3 ≤ d(x_{k_n}, B) < 2δ/3. Let x be a cluster point of (x_{k_n})_{n∈N}. It follows that

δ/3 ≤ d(x, B) ≤ 2δ/3. (6.12)

Obviously, x ∉ B. Hence x ∈ A, and therefore (recall (6.11)) δ ≤ d(x, B) ≤ 2δ/3 < δ, which is absurd. ∎

To proceed to the next results, we need to recall the following definition.

Definition 6.9 (asymptotically regular sequence). Let (x_n)_{n∈N} be a sequence in X. Then (x_n)_{n∈N} is asymptotically regular if x_n − x_{n+1} → 0.

An immediate consequence of Lemma 6.8 is the classical Ostrowski result.

Corollary 6.10 (Ostrowski).
Suppose that X is finite-dimensional and let (x_n)_{n∈N} be a bounded sequence in X that is asymptotically regular. Then the set of cluster points of (x_n)_{n∈N} is compact and connected.

We are now in a position to prove the following key result, which can be seen as a variant of Fact 6.1(v).

Theorem 6.11 (a new sufficient condition for convergence). Suppose that X is finite-dimensional and that C is a nonempty closed convex subset of X of co-dimension 1, i.e.,

codim C = codim(aff C − aff C) = 1. (6.13)

Let (x_n)_{n∈N} be a sequence that is Fejér monotone with respect to C and asymptotically regular. Then (x_n)_{n∈N} is actually convergent.

Proof. By Fact 6.1(i), (x_n)_{n∈N} is bounded. Denote by S the set of cluster points of (x_n)_{n∈N}. Since x_n − x_{n+1} → 0, Corollary 6.10 implies that S is connected. Moreover, S lies in a sphere of X by Fact 6.1(iii). On the other hand, by combining Lemma 6.2 and (6.13), S lies in a line of X. Altogether, S is a connected subset of a sphere that lies on a line. We deduce that S is a singleton. ∎

We conclude with two examples illustrating that the assumptions of asymptotic regularity and co-dimension 1 are important.

Example 6.12. Suppose that X = R², set C = {0} × R, and (∀n ∈ N) x_n = ((−1)^n, 0). Then codim C = 1 and (∀c ∈ C)(∀n ∈ N) ‖x_n − c‖ = ‖x_{n+1} − c‖; hence (x_n)_{n∈N} is Fejér monotone with respect to C. However, (x_n)_{n∈N} does not converge. This does not contradict Theorem 6.11 because ‖x_n − x_{n+1}‖ = 2 ↛ 0.

Example 6.13. Suppose that X = R², set C = {(0, 0)} ⊆ X, and (∀n ∈ N) θ_n = ∑_{k=1}^{n} (1/k) and x_n = cos(θ_n)(1, 0) + sin(θ_n)(0, 1). Then (x_n)_{n∈N} is asymptotically regular and Fejér monotone with respect to C. However, the set of cluster points of (x_n)_{n∈N} is the whole unit sphere because the harmonic series diverges.
Again, this does not contradict Theorem 6.11 because codim C = 2 ≠ 1.

Chapter 7

Nonexpansive mappings and the minimal displacement vector

7.1 Overview

The goal of this chapter is to provide information on the connection between the inner and outer shifts of an operator. These results are heavily used in subsequent chapters. We summarize the main results of this chapter as follows:

• We provide a characterization of the existence of fixed points of affine operators (see Theorem 7.3).
• We explore the connection between the inner and outer shifts by the minimal displacement vector of a nonexpansive operator and the corresponding sets of fixed points (see Proposition 7.8).

Except for facts with explicit references, this chapter is based on results that appear in [31], [15] and [39].

7.2 Auxiliary results

We start with the following definition and auxiliary results.

Definition 7.1. Let w ∈ X. For a single-valued operator T, we define the inner shift and the outer shift by w at x ∈ X by

T_w x = T(x − w) and _wT x = −w + Tx, (7.1)

respectively.

Lemma 7.2. Let T : X → X and let w ∈ X. Then the following hold:

(i) Fix(T_{−w}) = −w + Fix(w + T) = −w + Fix(_{−w}T).
(ii) w ∈ ran(Id − T) ⇔ Fix(w + T) ≠ ∅ ⇔ Fix(T_{−w}) ≠ ∅.

Proof. (i): Let x ∈ X. Then x ∈ Fix(T_{−w}) ⇔ x = T(x + w) ⇔ x + w = w + T(x + w) ⇔ x + w ∈ Fix(w + T) ⇔ x ∈ −w + Fix(w + T). (ii): w ∈ ran(Id − T) ⇔ (∃x ∈ X) such that w = x − Tx ⇔ (∃x ∈ X) such that x = w + Tx ⇔ Fix(w + T) ≠ ∅. Now combine with (i). ∎

Working with affine operators, we arrive at the following very useful result.

Theorem 7.3. Let L : X → X be linear, let b ∈ X, and set T : X → X : x ↦ Lx + b. Then the following are equivalent:

(i) Fix T ≠ ∅.
(ii) b ∈ ran(Id − L).
(iii) There exists a point a ∈ X such that b = a − La and

(∀x ∈ X) Tx = L(x − a) + a. (7.2)

(iv) There exists a point a ∈ X such that b = a − La and

Fix T = a + Fix L.
(7.3)

Moreover, we have the implication:

Fix T ≠ ∅ ⇒ (∃a ∈ X) such that b = a − La and

(∀x ∈ X) P_{Fix T} x = a + P_{Fix L}(x − a) = P_{(Fix L)^⊥} a + P_{Fix L} x. (7.4)

Proof. "(i)⇔(ii)": Fix T ≠ ∅ ⇔ (∃y ∈ X) y = Ly + b ⇔ b ∈ ran(Id − L). "(ii)⇒(iii)": The existence of a is a direct consequence of (ii), whereas (7.2) follows from the linearity of L. "(iii)⇒(iv)": Let y ∈ X. Then y ∈ Fix T ⇔ y − a ∈ Fix L ⇔ y ∈ a + Fix L. "(iv)⇒(i)": This is clear since 0 ∈ Fix L ≠ ∅. Now we turn to the implication (7.4). It follows from the equivalence of (i) and (iv) that (7.3) holds, which also proves the existence of a. The first identity follows from combining (7.3) and Fact 2.6, applied with (y, S) replaced by (a, Fix L). It follows from the linearity of P_{Fix L} (see Fact 2.7(ii)) that a + P_{Fix L}(x − a) = a + P_{Fix L} x − P_{Fix L} a = P_{(Fix L)^⊥} a + P_{Fix L} x. ∎

7.3 The displacement map and the minimal displacement vector

In this section, we collect new results concerning nonexpansive and firmly nonexpansive operators whose fixed point sets may be empty.

Fact 7.4 (minimal displacement vector). Let T : X → X be nonexpansive. Then cl ran(Id − T) is convex; consequently, the minimal displacement vector v = P_{cl ran(Id−T)} 0 is the unique element of cl ran(Id − T) such that

‖v‖ = inf_{x∈X} ‖x − Tx‖. (7.5)

Proof. See [6], [53] or [116]. ∎

The next example is readily verified.

Example 7.5 (Fix(v + T) vs. Fix(T_{−v})). Let C be a nonempty closed convex subset of X and suppose that T = Id − P_C. Then T is firmly nonexpansive and v = P_C 0. Let x ∈ X. Then x ∈ Fix(v + T) ⇔ P_C x = v, while x ∈ Fix(T_{−v}) ⇔ P_C(x + v) = v.

Lemma 7.6. Let T_1 : X → X and T_2 : X → X be nonexpansive. Set v_1 = P_{cl ran(Id−T_1T_2)} 0 and v_2 = P_{cl ran(Id−T_2T_1)} 0. Then ‖v_1‖ = ‖v_2‖.

Proof. By the definition of v_1, there exists a sequence (x_n)_{n∈N} in X such that ‖x_n − T_1T_2 x_n‖ → ‖v_1‖. Hence (∀n ∈ N) ‖v_2‖ ≤ ‖(T_2 x_n) − T_2T_1(T_2 x_n)‖ ≤ ‖x_n − T_1T_2 x_n‖, and thus ‖v_2‖ ≤ ‖v_1‖. Analogously, ‖v_1‖ ≤ ‖v_2‖. ∎
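Example 7.5 is easy to test numerically. In the sketch below (our own toy instance, not from the thesis), C is the closed ball of radius 1 centred at (3, 0) in R², so that v = P_C 0 = (2, 0); the displacement x − Tx = P_C x always has norm at least ‖v‖ = 2, and the points x with P_C x = v are exactly the fixed points of the outer shift v + T.

```python
import math

def proj_C(x):
    # exact projection onto C = closed ball of radius 1 centred at (3, 0)
    dx, dy = x[0] - 3.0, x[1]
    d = math.hypot(dx, dy)
    if d <= 1.0:
        return x
    return (3.0 + dx / d, dy / d)

def T(x):
    # T = Id - P_C, which is firmly nonexpansive (Example 7.5)
    p = proj_C(x)
    return (x[0] - p[0], x[1] - p[1])

v = proj_C((0.0, 0.0))              # v = P_C 0 = (2, 0)
assert v == (2.0, 0.0)

# the displacement x - Tx equals P_C x, so ||x - Tx|| >= ||v|| = 2 everywhere
for x in [(0.0, 0.0), (1.0, 0.0), (0.0, 5.0), (-4.0, 2.5)]:
    p = proj_C(x)
    assert math.hypot(p[0], p[1]) >= 2.0 - 1e-12

# x = (1, 0) satisfies P_C x = v, hence x belongs to Fix(v + T)
x = (1.0, 0.0)
y = T(x)
assert abs(v[0] + y[0] - x[0]) < 1e-12 and abs(v[1] + y[1] - x[1]) < 1e-12
```

Since Fix T itself is empty here (0 ∉ C forces x − Tx = P_C x ≠ 0), this also illustrates why the shifted fixed point sets Fix(v + T) and Fix(T_{−v}) are the natural substitutes in the inconsistent case.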
Unless stated otherwise, throughout the rest of this chapter we assume that

T is a nonexpansive operator on X,  (7.6)

and that

v = P_{cl ran(Id−T)} 0.  (7.7)

In view of (7.7) and Lemma 7.2(ii) we have

v ∈ ran(Id − T) ⇔ Fix(T_{−v}) ≠ ∅ ⇔ Fix(v + T) ≠ ∅.  (7.8)

The following example provides scenarios where v ∈ ran(Id − T).

Example 7.7. Suppose that one of the following holds:
(i) X is finite-dimensional and T : X → X is an affine operator.
(ii) U and V are nonempty closed convex subsets of X, T is defined as in (14.8) below, and one of the following conditions holds:
(a) V is bounded.
(b) U and V are polyhedral subsets of X (a subset of X is polyhedral if it is a finite intersection of closed half-spaces).
Then v ∈ ran(Id − T).

Proof. (i): ran(Id − T) is a finite-dimensional affine subspace, hence closed. Therefore, cl ran(Id − T) = ran(Id − T), and consequently v ∈ ran(Id − T) by (7.7). (ii): It follows from the Browder–Göhde–Kirk fixed point theorem (see [12, Theorem 4.19]) applied in case (ii)(a), and from [17, Theorem 5.6.1] applied in case (ii)(b), that Fix P_V P_U ≠ ∅. By Example 11.24 and Proposition 11.10 below, v ∈ ran(Id − T) as claimed. ∎

Proposition 7.8. Suppose that v ∈ ran(Id − T) and let y₀ ∈ Fix(v + T). Then the following hold (here R₊x = {rx | r ∈ [0, +∞[} for x ∈ X):
(i) y₀ − R₊v ⊆ Fix(v + T).
(ii) Fix(v + T) − R₊v = Fix(v + T).
(iii) −R₊v ⊆ rec(Fix(v + T)).
(iv) ]−∞, 1] · v + Fix(T_{−v}) ⊆ Fix(v + T). In particular, Fix(T_{−v}) ⊆ Fix(v + T).

Proof. (i): First we use induction to show that

(∀n ∈ N) y₀ − nv ∈ Fix(v + T).  (7.9)

Clearly the base case n = 0 holds. Now suppose that for some n ∈ N it holds that y₀ − nv ∈ Fix(v + T), i.e.,

y₀ − nv = v + T(y₀ − nv).  (7.10)

Using (7.7) and (7.10) we have

‖v‖ ≤ ‖(Id − T)(y₀ − (n + 1)v)‖ = ‖y₀ − (n + 1)v − T(y₀ − (n + 1)v)‖
= ‖y₀ − nv − v − T(y₀ − (n + 1)v)‖
= ‖T(y₀ − nv) − T(y₀ − (n + 1)v)‖ ≤ ‖v‖.

Consequently, all the inequalities above are equalities, and we conclude that ‖v‖ = ‖y₀ − (n + 1)v − T(y₀ − (n + 1)v)‖. It follows from (7.7) and Lemma 3.15 that

y₀ − (n + 1)v − T(y₀ − (n + 1)v) = v.  (7.11)

That is, y₀ − (n + 1)v = v + T(y₀ − (n + 1)v), which proves (7.9). Now using [12, Corollary 4.15] we learn that Fix(v + T) is convex, which when combined with (7.9) yields (i).
(ii): On the one hand, using (i) one concludes that Fix(v + T) − R₊v ⊆ Fix(v + T). On the other hand, Fix(v + T) = Fix(v + T) − 0 · v ⊆ Fix(v + T) − R₊v.
(iii): This follows directly from (ii).
(iv): Using Lemma 7.2(i) and (ii) we have ]−∞, 1] · v + Fix(T_{−v}) = v − R₊v + Fix(T_{−v}) = Fix(v + T) − R₊v = Fix(v + T). In particular, Fix(T_{−v}) = 0 · v + Fix(T_{−v}) ⊆ Fix(v + T). ∎

Chapter 8
Asymptotic behaviour of nonexpansive mappings

8.1 Overview

This chapter is devoted to applications in fixed point theory. Throughout this chapter, we assume that

T : X → X is nonexpansive

and that

v = P_{cl ran(Id−T)} 0.

Our main results in this chapter are:

• We prove the Fejér monotonicity of (T^n x + nv)_{n∈N} with respect to Fix(T_{−v}) and Fix(v + T) (see Proposition 8.4).
• We compare the sequences ((_{−v}T)^n x)_{n∈N}, ((T_{−v})^n x)_{n∈N} and (T^n x + nv)_{n∈N} when T is an affine nonexpansive operator and v ∈ ran(Id − T). We prove that the three sequences coincide (see Theorem 8.8). Surprisingly, when we drop the assumption that T is affine, the sequences can be dramatically different (see Example 8.9).
• We extend the well-known result in Fact 8.12 from linear to affine operators (see Theorem 8.20).
• A new characterization of strongly convergent iterations of affine nonexpansive operators is presented (see Theorem 8.20).
We also discuss when the convergence is linear in Section 8.3.2.

Except for facts with explicit references, this chapter is based on results that appear in [15], [39] and [110]. In light of Fact 7.4, the vector v is unique and well defined.

8.2 Iterating nonexpansive operators

In this section, we explore the asymptotic behaviour of nonexpansive mappings. The following two results are well known.

Fact 8.1 (Pazy 1971). Exactly one of the following holds:
(i) Fix T = ∅ and (∀x ∈ X) ‖T^n x‖ → +∞.
(ii) Fix T ≠ ∅ and (∀x ∈ X) (T^n x)_{n∈N} is bounded.

Proof. See [116, Corollary 6]. ∎

Fact 8.2. Let T : X → X be firmly nonexpansive. Then

(∀x ∈ X) v = lim_{n→∞} (T^n x − T^{n+1} x) = − lim_{n→∞} T^n x / n.  (8.1)

Proof. See [53, Corollary 1.5], [6, Corollary 2.3], and [116]. ∎

Before we proceed, we state the following lemma, which relates the iterates of the inner and outer shifts of a single-valued operator.

Lemma 8.3. Let S : X → X, let w ∈ X, let x ∈ X and let n ∈ N. Then

(S_{−w})^n x = −w + (w + S)^n (x + w) = −w + (_{−w}S)^n (x + w).  (8.2)

Proof. We proceed by induction. The conclusion is clear when n = 0. Now assume that for some n ∈ N it holds that (S_{−w})^n x = −w + (w + S)^n (x + w). Then (S_{−w})^{n+1} x = S((S_{−w})^n x + w) = S(−w + (w + S)^n (x + w) + w) = −w + w + S((w + S)^n (x + w)) = −w + (w + S)^{n+1}(x + w), as claimed. ∎

Proposition 8.4. Suppose that v ∈ ran(Id − T) and let y₀ ∈ Fix(v + T). Then the following hold:
(i) (∀n ∈ N) T^n y₀ = y₀ − nv.
(ii) For every x ∈ X, the sequence (T^n x + nv)_{n∈N} is Fejér monotone with respect to both Fix(v + T) and Fix(T_{−v}).
(iii) Suppose that x₀ ∈ Fix(T_{−v}) and set (∀n ∈ N) x_n = T^n x₀. Then x_n = x₀ − nv and (x_n)_{n∈N} lies in Fix(T_{−v}).

Proof. (i): We use induction. Clearly y₀ − 0v = y₀ = T⁰y₀. Now suppose that for some n ∈ N it holds that T^n y₀ = y₀ − nv. Using Proposition 7.8(i) we have T^{n+1} y₀ = T(y₀ − nv) = −v + y₀ − nv = y₀ − (n + 1)v.
(ii): Let x ∈ X and let y ∈ Fix(v + T).
Then using (i) we have, for every n ∈ N,

‖T^{n+1} x + (n + 1)v − y‖ = ‖T^{n+1} x − (y − (n + 1)v)‖ = ‖T^{n+1} x − T^{n+1} y‖
≤ ‖T^n x − T^n y‖ = ‖T^n x − (y − nv)‖
= ‖T^n x + nv − y‖.  (8.3)

The statement for Fix(T_{−v}) follows from Proposition 7.8(iv).
(iii): Combine Proposition 7.8(iv) and (i) to get x_n = x₀ − nv. Now, by Lemma 7.2(i), x₀ + v ∈ Fix(v + T). Using Proposition 7.8(i) we have (∀n ∈ N) x₀ + v − nv ∈ Fix(v + T), or equivalently, by Lemma 7.2(i), x₀ − nv ∈ −v + Fix(v + T) = Fix(T_{−v}). ∎

Proposition 8.5. Suppose that X = R, that Fix T = ∅ and that v ∈ ran(Id − T). Let x ∈ R and set (∀n ∈ N) y_n = T^n x + nv. Then the following hold:
(i) (y_n)_{n∈N} converges.
(ii) R → R : x ↦ lim_{n→∞}(T^n x + nv) is nonexpansive.
(iii) Suppose that T is firmly nonexpansive. Then R → R : x ↦ lim_{n→∞}(T^n x + nv) is firmly nonexpansive.

Proof. (i): In view of Proposition 8.4(ii), the sequence (y_n)_{n∈N} is Fejér monotone with respect to Fix(v + T). Now, by Proposition 7.8(i), we know that Fix(v + T) contains an unbounded interval. Since X = R, we conclude that int Fix(v + T) ≠ ∅. It follows from [12, Proposition 5.10] that (y_n)_{n∈N} converges.
(ii): Let y ∈ R. Then

|lim_{n→∞}(T^n x + nv) − lim_{n→∞}(T^n y + nv)| = |lim_{n→∞}(T^n x + nv − T^n y − nv)|
= lim_{n→∞} |T^n x − T^n y| ≤ lim_{n→∞} |x − y| = |x − y|.  (8.4)

(iii): It follows from [12, Proposition 4.2(iv)] that an operator is firmly nonexpansive if and only if it is nonexpansive and monotone. Therefore, in view of (ii), we need only check monotonicity. Without loss of generality, let y ∈ R be such that x ≤ y. Since T is firmly nonexpansive, hence monotone, one can verify that (∀n ∈ N) T^n x ≤ T^n y, and therefore (∀n ∈ N) T^n x + nv ≤ T^n y + nv. Now take the limit as n → ∞. ∎

Example 8.6 below shows that the assumptions that Fix T = ∅ and v ∈ ran(Id − T) (equivalently, v ≠ 0) in Proposition 8.5 are important.

Example 8.6. Suppose that X = R, that T = −Id and that x ≠ 0. Then the sequence (T^n x)_{n∈N} = (x, −x, x, −x, . . .)
is not convergent.

When X = R, it follows from Proposition 8.5(i) that the sequence (T^n x + nv)_{n∈N} converges. In view of Proposition 8.4(ii), the sequence (T^n x + nv)_{n∈N} is Fejér monotone with respect to Fix(v + T), which might suggest that the limit lies in Fix(v + T). We show in the following example that this is not true in general.

Example 8.7. Suppose that X = R and that

T : R → R : x ↦
  x − α, if x ≤ α;
  0, if α < x ≤ β;
  x − β, if x > β,  (8.5)

where 0 < α < β. Then T is firmly nonexpansive but not affine, v = α, Fix(v + T) = ]−∞, α], Fix(T_{−v}) = ]−∞, 0], and

T^n + nv : R → R : x ↦
  x, if x ≤ α;
  α, if α < x ≤ β;
  x − n(β − α), if x > β and n ≤ ⌊x/β⌋;
  min{α, x − ⌊x/β⌋β} + ⌊x/β⌋α, if x > β and n > ⌊x/β⌋.  (8.6)

Consequently,

lim_{n→∞}(T^n + nv) : R → R : x ↦
  x, if x ≤ α;
  α, if α < x ≤ β;
  min{α, x − ⌊x/β⌋β} + α⌊x/β⌋, if x > β.  (8.7)

Therefore, for every x₀ ∈ R the sequence (T^n x₀ + nv)_{n∈N} is eventually constant. However, if the starting point x₀ lies in the interval ]β, +∞[, then lim_{n→∞}(T^n x₀ + nv) = min{α, x₀ − ⌊x₀/β⌋β} + α⌊x₀/β⌋ ∉ Fix(v + T).

Proof. Clearly

Id − T = P_{[α,β]} : R → R : x ↦
  α, if x ≤ α;
  x, if α < x ≤ β;
  β, if x > β.  (8.8)

Therefore, ran(Id − T) = [α, β], and consequently v = α. Moreover,

(∀x ∈ R) x ≥ Tx + α ≥ T²x + 2α ≥ · · · ≥ T^n x + nα ≥ · · · .  (8.9)

It is clear from Example 7.5 that

Fix(v + T) = ]−∞, α].  (8.10)

The statement for Fix(T_{−v}) then follows from combining (8.10) and Lemma 7.2(i). The convergence of the sequence follows from Example 7.5 or Proposition 8.5(i). Now we prove (8.6). We claim that

T^n : R → R : x ↦
  x − nα, if x ≤ α;
  (1 − n)α, if α < x ≤ β;
  x − nβ, if x > β and n ≤ ⌊x/β⌋;
  min{α, x − ⌊x/β⌋β} + (⌊x/β⌋ − n)α, if x > β and n > ⌊x/β⌋.  (8.11)

Using induction, it is easy to verify the cases x ≤ α and α < x ≤ β. Now we focus on the case x > β. Set

K = ⌊x/β⌋ and r = x − Kβ,  (8.12)

and note that x = Kβ + r, K ∈ {1, 2, 3, . . .} and 0 ≤ r < β. In view of (8.9), if n ∈ {0, 1, 2, . . . , K} we get T^n x = x − nβ = (K − n)β + r.
In particular,

T^K x = x − ⌊x/β⌋β = r.  (8.13)

If n > K we examine two cases. Case 1: 0 ≤ r ≤ α. It follows from (8.13) and (8.5) that (∀n ≥ K) T^n x = r + (K − n)α. Case 2: α < r < β. Note that T^{K+1} x = 0; therefore, using (8.13) and (8.5), we have (∀n > K) T^n x = (K + 1 − n)α = α + (K − n)α, which proves (8.11). Now (8.6) follows from (8.11) because v = α. Letting n → ∞ in (8.6) yields (8.7). Note that min{α, x − ⌊x/β⌋β} ≥ 0 and ⌊x/β⌋ ≥ 1. By considering the cases K = 1 and K ≥ 2, (8.7) implies that lim_{n→∞}(T^n x₀ + nv) = min{α, x₀ − ⌊x₀/β⌋β} + ⌊x₀/β⌋α > α, hence the limit does not lie in ]−∞, α] = Fix(v + T). ∎

8.3 Affine nonexpansive operators

In this section, we investigate properties of affine nonexpansive operators and their corresponding inner and outer normal shifts. This additional assumption allows for stronger results than those obtained in the previous section. Various examples that illustrate our theory are provided.

8.3.1 Iterating an affine nonexpansive operator

In this section, we compare the sequences ((_{−v}T)^n x)_{n∈N}, ((T_{−v})^n x)_{n∈N} and (T^n x + nv)_{n∈N} when T is an affine nonexpansive operator and v = P_{cl ran(Id−T)} 0 ∈ ran(Id − T). We prove that the three sequences coincide (see Theorem 8.8). Surprisingly, when we drop the assumption that T is affine, the sequences can be dramatically different (see Example 8.9).

Theorem 8.8. Let L : X → X be linear and nonexpansive, let b ∈ X, and set T : X → X : x ↦ Lx + b. Suppose also that v ∈ ran(Id − T), and let x ∈ X. Then the following hold:
(i) v = P_{Fix L}(−b) ∈ Fix L = (ran(Id − L))^⊥, and v ≠ 0 ⇔ b ∉ cl ran(Id − L).
(ii) (∀n ∈ N) T^n x = L^n x + ∑_{k=0}^{n−1} L^k b.
(iii) (∀n ∈ N) T^n x + nv = L^n x + ∑_{k=0}^{n−1} L^k P_{cl ran(Id−L)} b.
(iv) (∀n ∈ N) (T_{−v})^n x = T^n x + nv.
(v) (∀n ∈ N) (T_{−v})^n x = (v + T)^n x.
(vi) Fix(T_{−v}) = −v + Fix(T_{−v}) = −v + Fix(_{−v}T) = −v + Fix(v + T) = Fix(v + T).
(vii) Fix(T_{−v}) = Fix(v + T) = Rv + Fix(v + T) = Rv + Fix(T_{−v}). Consequently, v lies in the lineality space of Fix(T_{−v}) = Fix(v + T) (for the definition and a detailed discussion of the lineality space, we refer the reader to [123, page 65]).

Proof.
(i): Note that ran(Id − T) = ran(Id − L) − b and hence cl ran(Id − T) = cl ran(Id − L) − b. Therefore, using Fact 2.6, we have

v = P_{cl ran(Id−T)} 0 = P_{−b + cl ran(Id−L)} 0 = −b + P_{cl ran(Id−L)}(0 − (−b))  (8.14a)
  = −b + P_{cl ran(Id−L)} b.  (8.14b)

Using [12, Fact 2.18(iv)] and [19, Lemma 2.1], we learn that (ran(Id − L))^⊥ = ker(Id − L*) = Fix L* = Fix L, and hence

v = (Id − P_{cl ran(Id−L)})(−b) = P_{(ran(Id−L))^⊥}(−b) = P_{Fix L}(−b).  (8.15)

Note that v ≠ 0 ⇔ b ∉ cl ran(Id − L).
(ii): We prove this by induction. When n = 0 the conclusion is obviously true. Now suppose that for some n ∈ N it holds that

T^n x = L^n x + ∑_{k=0}^{n−1} L^k b.  (8.16)

Then T^{n+1} x = T(T^n x) = T(L^n x + ∑_{k=0}^{n−1} L^k b) = L(L^n x + ∑_{k=0}^{n−1} L^k b) + b = L^{n+1} x + ∑_{k=0}^{n} L^k b, as claimed.
(iii): Note that b = P_{cl ran(Id−L)} b + P_{Fix L} b. Using (i) and (ii) yields

T^n x + nv = L^n x + ∑_{k=0}^{n−1}(L^k b + v) = L^n x + ∑_{k=0}^{n−1}(L^k b + L^k v)
  = L^n x + ∑_{k=0}^{n−1}(L^k b − L^k P_{Fix L} b)
  = L^n x + ∑_{k=0}^{n−1} L^k (Id − P_{Fix L}) b
  = L^n x + ∑_{k=0}^{n−1} L^k P_{cl ran(Id−L)} b.

(iv): We prove this by induction. Note that, by (i), v ∈ Fix L, hence Lv = v. When n = 0 we have (T_{−v})⁰ x = x = T⁰x + 0 · v. Now suppose that for some n ∈ N it holds that (T_{−v})^n x = T^n x + nv. Then (T_{−v})^{n+1} x = T_{−v}(T^n x + nv) = T(T^n x + nv + v) = L(T^n x) + L((n + 1)v) + b = T^{n+1} x + (n + 1)v.
(v): We use induction again. The base case is obviously true. Now suppose that for some n ∈ N it holds that (v + T)^n x = T^n x + nv. Then (v + T)^{n+1} x = v + T(v + T)^n x = v + T(T^n x + nv) = v + L(T^n x + nv) + b = v + LT^n x + nv + b = LT^n x + b + (n + 1)v = T^{n+1} x + (n + 1)v. Now combine with (iv).
(vi): It follows from (v) applied with n = 1 that T_{−v} = v + T. Now apply Lemma 7.2(i).
(vii): Using (vi) and the assumption that T is an affine operator, Fix(T_{−v}) = Fix(v + T) is an affine subspace. Now let y₀ ∈ Fix(T_{−v}) = Fix(v + T). Using Proposition 7.8(i) we have −R₊v ⊆ Fix(v + T) − y₀ = par Fix(v + T), and therefore Rv ⊆ par Fix(v + T). Hence y₀ + Rv ⊆ Fix(v + T), which yields Fix(v + T) + Rv ⊆ Fix(v + T).
Since the opposite inclusion is obviously true, we conclude that (vii) holds. ∎

Suppose that T is nonexpansive but not affine. Theorem 8.8 might suggest that, for every x ∈ X, the sequences (T^n x + nv)_{n∈N}, ((T_{−v})^n x)_{n∈N} and ((v + T)^n x)_{n∈N} coincide, and consequently that (T^n x + nv)_{n∈N} is a sequence of iterates of a nonexpansive operator. Interestingly, this is not the case, as we illustrate now.

Example 8.9. Suppose that X = R and let β > 0. Suppose that

T : R → R : x ↦
  x − β, if x ≤ β;
  α(x − β), if x > β,  (8.17)

where 0 < α < 1. Then Fix T = ∅, v = β, for every n ∈ N

(T_{−v})^n : R → R : x ↦ α^n max{x, 0} + min{x, 0},  (8.18)
(v + T)^n : R → R : x ↦ α^n max{x − β, 0} + min{x, β},  (8.19)

and

T^n + nv : R → R : x ↦
  x, if x ≤ β;
  α^n x − (α(1 − α^n)/(1 − α))β + nβ, if x > β and n < q(x);
  α^{q(x)} x − (α(1 − α^{q(x)})/(1 − α))β + q(x)β, if x > β and n ≥ q(x),  (8.20)

where q : R → N : x ↦ ⌈log_α(β/(αβ + (1 − α)x))⌉.

Consequently,

(∀x ∈ R) lim_{n→∞}(T_{−v})^n x = min{x, 0},  (8.21)
(∀x ∈ R) lim_{n→∞}(v + T)^n x = min{x, β},  (8.22)

and

(∀x ∈ R) lim_{n→∞}(T^n x + nv) =
  x, if x ≤ β;
  α^{q(x)} x − (α(1 − α^{q(x)})/(1 − α))β + q(x)β, if x > β.  (8.23)

Moreover, there is no operator S : R → R such that for every x ∈ R and every n ∈ N we have S^n x = T^n x + nv.

Proof. Considering cases, we easily check that

Id − T : R → R : x ↦ (1 − α)max{x, β} + αβ ≥ β > 0.  (8.24)

Hence Fix T = ∅ and v = β, as claimed. Moreover, using (8.24), one can verify that

(∀x ∈ X) x ≥ Tx + β > · · · ≥ T^n x + nβ ≥ T^{n+1} x + (n + 1)β ≥ · · · .  (8.25)

We also verify that

T_{−v} : R → R : x ↦ α max{x, 0} + min{x, 0}.  (8.26)

We now prove (8.18) by induction. Let x ∈ R. Clearly the base case n = 0 holds. Now suppose that (8.18) holds for some n ∈ N. If x ≤ 0, then (T_{−v})^n x = x ≤ 0, and therefore (8.26) implies that (T_{−v})^{n+1} x = T_{−v}((T_{−v})^n x) = T_{−v} x = x. Similarly, x > 0 ⇒ α^n x = (T_{−v})^n x > 0, and consequently (8.26) implies that (T_{−v})^{n+1} x = T_{−v}((T_{−v})^n x) = T_{−v}(α^n x) = α^{n+1} x. The proof of (8.19) follows from combining (8.18) and Lemma 8.3. Now we turn to (8.20).
We consider two cases.
Case 1: x ≤ β. It is obvious from the definition of T that (∀n ∈ N) T^n x = x − nβ.
Case 2: x > β. Let n ∈ N be such that T^n x > β. By (8.25) and (8.17) we have

T^{n+1} x = α^{n+1} x − (α^{n+1} + α^n + · · · + α)β = α^{n+1} x − (α(1 − α^{n+1})/(1 − α))β
  = α^{n+1}(((1 − α)x + αβ)/(1 − α)) − (α/(1 − α))β.  (8.27)

In view of (8.25), there exists a unique integer, say q(x) ∈ {1, 2, . . .}, that satisfies T^{q(x)−1} x > β and T^{q(x)} x ≤ β. Since 0 < α < 1, using (8.27) we have

T^{q(x)} x ≤ β ⇔ α^{q(x)}(((1 − α)x + αβ)/(1 − α)) − (α/(1 − α))β ≤ β
  ⇔ α^{q(x)}(((1 − α)x + αβ)/(1 − α)) ≤ β/(1 − α)
  ⇔ α^{q(x)}((1 − α)x + αβ) ≤ β
  ⇔ α^{q(x)} ≤ β/((1 − α)x + αβ)
  ⇔ q(x) ≥ log_α(β/(αβ + (1 − α)x)).

Consequently, q(x) = ⌈log_α(β/(αβ + (1 − α)x))⌉. At this point, since T^{q(x)} x ≤ β, we must have (∀n ≥ q(x)) T^n x = T^{q(x)} x − (n − q(x))β, which proves (8.20). The formulae (8.21), (8.22) and (8.23) are direct consequences of (8.18), (8.19) and (8.20), respectively. To prove the last claim, note that if S : R → R is such that for every n ∈ N we have S^n = T^n + nv, then setting n = 1 must yield

S = v + T : R → R : x ↦
  x, if x ≤ β;
  α(x − β) + β, if x > β.  (8.28)

Now compare (8.19) and (8.20). ∎

Figure 8.1: The solid curve represents lim_{n→∞}(T_{−v})^n x, the dash-dotted curve represents lim_{n→∞}(v + T)^n x, and the dashed curve represents lim_{n→∞}(T^n x + nv), for α = 0.5 and β = 1.

Figure 8.1 provides a Maple [99] plot of the functions defined by (8.21), (8.22) and (8.23), illustrating that they are pairwise distinct.

8.3.2 Strong and linear convergence of affine nonexpansive operators

In this section, we focus on strong convergence of iterates of affine nonexpansive operators (see Theorem 8.20 and Corollary 8.21). We also present mild conditions sufficient to obtain a linear rate of convergence (see Theorem 8.23). Let Y be a Banach space. We shall use B(Y) to denote the set of bounded linear operators on Y. Let L ∈ B(Y). The operator norm of L is ‖L‖ = sup_{‖y‖≤1} ‖Ly‖.

Definition 8.10 (asymptotic regularity of operators). Let T : X → X.
Then T is asymptotically regular if (∀x ∈ X) T^n x − T^{n+1} x → 0, i.e., (∀x ∈ X) the sequence (T^n x)_{n∈N} is asymptotically regular.

We start by collecting various results that will be useful in the sequel.

Fact 8.11. Let T : X → X. Then

T firmly nonexpansive and Fix T ≠ ∅ ⇒ T asymptotically regular.  (8.29)

Proof. See [53, Corollary 1.1] or [12, Corollary 5.16(ii)]. ∎

Fact 8.12. Let L : X → X be linear and nonexpansive, and let x ∈ X. Then

L^n x → P_{Fix L} x ⇔ L^n x − L^{n+1} x → 0.  (8.30)

Proof. See [5, Proposition 4], [6, Theorem 1.1], [19, Theorem 2.2] or [12, Proposition 5.27]. (We mention in passing that in [5, Proposition 4] the author proved the result for general odd nonexpansive mappings in Hilbert spaces, and in [6, Theorem 1.1] the authors generalized the result to Banach spaces.) ∎

Definition 8.13. Let Y be a real Banach space, let (y_n)_{n∈N} be a sequence in Y and let y_∞ ∈ Y. Then (y_n)_{n∈N} converges to y_∞, denoted y_n → y_∞, if ‖y_n − y_∞‖ → 0; (y_n)_{n∈N} converges µ-linearly to y_∞ if µ ∈ [0, 1[ and there exists M ≥ 0 such that

(∀n ∈ N) ‖y_n − y_∞‖ ≤ Mµ^n;  (8.31)

and finally (y_n)_{n∈N} converges linearly to y_∞ if there exist µ ∈ [0, 1[ and M ≥ 0 such that (8.31) holds. (µ-linear convergence is also known as R-linear, or root-linear, convergence; see [111, Appendix 2, page 620]. We use this more concise notation for convenience, to expose the rate of convergence directly. By [28, Remark 3.7], it is equivalent to (∃M > 0)(∃N ∈ N)(∀n ≥ N) ‖y_n − y_∞‖ ≤ Mµ^n.)

Example 8.14 (convergence vs. pointwise convergence of continuous linear operators). Let Y be a real Banach space, let (L_n)_{n∈N} be a sequence in B(Y), and let L_∞ ∈ B(Y). Then:
(i) (L_n)_{n∈N} converges, or converges uniformly, to L_∞ in B(Y) if L_n → L_∞ (in B(Y)).
(ii) (L_n)_{n∈N} converges pointwise to L_∞ if (∀y ∈ Y) L_n y → L_∞ y (in Y).

Remark 8.15. It is easy to see that convergence of a sequence of continuous linear operators implies pointwise convergence; however, the converse is not true (see [92, Example 4.9-2]).
The following two results are part of the folklore; however, we have not been able to find precise references and thus include the short proofs for the reader's convenience.

Lemma 8.16. Let Y be a real Banach space, let (L_n)_{n∈N} be a sequence in B(Y), let L_∞ ∈ B(Y), and let µ ∈ ]0, 1[. Then

(∀y ∈ Y) L_n y → L_∞ y µ-linearly (in Y) ⇔ L_n → L_∞ µ-linearly (in B(Y)).  (8.32)

Proof. Let y ∈ Y. "⇒": Because L_n y → L_∞ y µ-linearly, there exists M_y ≥ 0 such that (∀n ∈ N) ‖(L_n − L_∞)y‖ ≤ µ^n M_y; equivalently,

‖((L_n − L_∞)/µ^n) y‖ = ‖(L_n − L_∞)y‖/µ^n ≤ M_y.  (8.33)

It follows from the Uniform Boundedness Principle (see [92, 4.7-3]) applied to the sequence ((L_n − L_∞)/µ^n)_{n∈N} that (∃M ≥ 0)(∀n ∈ N) ‖(L_n − L_∞)/µ^n‖ ≤ M; equivalently, ‖L_n − L_∞‖ ≤ Mµ^n, as required. "⇐": Since L_n → L_∞ µ-linearly, we have (∃M ≥ 0)(∀n ∈ N) ‖L_n − L_∞‖ ≤ Mµ^n. Therefore, (∀n ∈ N) ‖L_n y − L_∞ y‖ ≤ ‖L_n − L_∞‖ ‖y‖ ≤ M‖y‖µ^n. ∎

Lemma 8.17. Suppose that X is finite-dimensional, let (L_n)_{n∈N} be a sequence of linear nonexpansive operators on X and let L_∞ : X → X. Then the following are equivalent:
(i) (∀x ∈ X) L_n x → L_∞ x.
(ii) L_n → L_∞ pointwise (in X), and L_∞ is linear and nonexpansive.
(iii) L_n → L_∞ (in B(X)).

Proof. The implications "(i)⇒(ii)" and "(iii)⇒(i)" are easy to verify. We now prove "(ii)⇒(iii)": Suppose that (x_n)_{n∈N} is a sequence in X such that (∀n ∈ N) ‖x_n‖ = 1 and

‖L_n − L_∞‖ − ‖L_n x_n − L_∞ x_n‖ → 0.  (8.34)

After passing to a subsequence and relabeling if necessary, we can and do assume that x_n → x_∞, because X is finite-dimensional. Since L_∞ and (L_n)_{n∈N} are linear and nonexpansive, we have ‖L_∞‖ ≤ 1 and (∀n ∈ N) ‖L_n‖ ≤ 1. Using the triangle inequality, we have ‖L_n x_n − L_∞ x_n‖ = ‖(L_n − L_∞)(x_n − x_∞) + (L_n − L_∞)x_∞‖ ≤ ‖L_n − L_∞‖ ‖x_n − x_∞‖ + ‖(L_n − L_∞)x_∞‖ ≤ 2‖x_n − x_∞‖ + ‖(L_n − L_∞)x_∞‖ → 0 + 0 = 0. Now combine with (8.34). ∎

Corollary 8.18. Suppose that X is finite-dimensional, let L : X → X be linear, and let L_∞ : X → X be such that L^n → L_∞ pointwise. Then L^n → L_∞ linearly.

Proof. Combine Lemma 8.17 and [35, Theorem 2.12(i)]. ∎
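The finite-dimensional situation of Corollary 8.18 can be checked numerically. In the following sketch (an assumed toy instance, not part of the original text) we take L = diag(1, 1/2) on R², which is linear, nonexpansive and asymptotically regular, with Fix L the first coordinate axis; the powers L^n converge to P_{Fix L} = diag(1, 0), and the operator norm ‖L^n − P_{Fix L}‖ decays like µ^n with µ = 1/2:

```python
import numpy as np

# Assumed example on X = R^2: L = diag(1, 1/2), Fix L = R x {0},
# P = P_{Fix L} = diag(1, 0).  Pointwise convergence L^n x -> P x
# here comes with the linear rate mu = 1/2 in operator norm.
L = np.diag([1.0, 0.5])
P = np.diag([1.0, 0.0])

Ln = np.eye(2)
for n in range(1, 11):
    Ln = Ln @ L
    # spectral norm of L^n - P equals 0.5**n for this diagonal L
    err = np.linalg.norm(Ln - P, 2)
    assert abs(err - 0.5**n) < 1e-12
print(np.linalg.norm(Ln - P, 2))  # 0.5**10
```

The same L is reused below in spirit: adding a translation b turns it into an affine nonexpansive operator of the kind treated in Lemma 8.19 and Theorem 8.20.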
Lemma 8.19. Let L : X → X be linear, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that Fix T ≠ ∅. Then there exists a point a ∈ X such that b = a − La and

(∀n ∈ N)(∀x ∈ X) T^n x = a + L^n(x − a).  (8.35)

Proof. The existence of a follows from Theorem 7.3. By telescoping, we have

∑_{k=0}^{n−1} L^k b = ∑_{k=0}^{n−1} L^k(a − La) = a − L^n a.  (8.36)

Consequently, Theorem 8.8(ii) and (8.36) yield T^n x = L^n x + a − L^n a = a + L^n(x − a). ∎

The following result extends Fact 8.12 from the linear to the affine case.

Theorem 8.20. Let L : X → X be linear and nonexpansive, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that Fix T ≠ ∅. Then the following are equivalent:
(i) L is asymptotically regular.
(ii) L^n → P_{Fix L} pointwise.
(iii) T^n → P_{Fix T} pointwise.
(iv) T is asymptotically regular.

Proof. Let x ∈ X and note that, in view of Theorem 7.3, there exists a ∈ X such that b = a − La. "(i)⇔(ii)": This is Fact 8.12. "(ii)⇒(iii)": In view of Lemma 8.19 and (7.4), we have T^n x = L^n(x − a) + a → P_{Fix L}(x − a) + a = P_{Fix T} x. "(iii)⇒(iv)": T^n x − T^{n+1} x → P_{Fix T} x − P_{Fix T} x = 0. "(iv)⇒(i)": Using Lemma 8.19, we have L^n x − L^{n+1} x = T^n(x + a) − T^{n+1}(x + a) → 0, which finishes the proof. ∎

Corollary 8.21. Let L : X → X be linear and nonexpansive, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that v ∈ ran(Id − T). Let x ∈ X. Then Fix(v + T) ≠ ∅. Moreover, the following are equivalent:
(i) L is asymptotically regular.
(ii) L^n x → P_{Fix L} x.
(iii) T^n x + nv = (v + T)^n x = (T_{−v})^n x → P_{Fix(v+T)} x.
(iv) T_{−v} = v + T is asymptotically regular.
(v) (T^n x + nv)_{n∈N} is asymptotically regular.

Proof. The fact that Fix(v + T) ≠ ∅ follows from (7.8). Note that v + T = L + b + v. Now apply Theorem 8.20 with (T, b) replaced by (v + T, v + b) and use Theorem 8.8(iv)&(v). Note that, in view of Theorem 8.8(iv)&(v), the asymptotic regularity of v + T = T_{−v} is equivalent to the asymptotic regularity of the sequence (T^n x + nv)_{n∈N}. ∎

We now turn to linear convergence.

Lemma 8.22.
Suppose that X is finite-dimensional, and let L : X → X be linear and nonexpansive. Then the following are equivalent:
(i) L is asymptotically regular.
(ii) L^n → P_{Fix L} pointwise (in X).
(iii) L^n → P_{Fix L} (in B(X)).
(iv) L^n → P_{Fix L} linearly pointwise (in X).
(v) L^n → P_{Fix L} linearly (in B(X)).

Proof. "(i)⇔(ii)": This follows from Fact 8.12. "(ii)⇔(iii)": Combine Fact 8.12 and Lemma 8.17. "(iii)⇒(v)": Apply Corollary 8.18 with L_∞ replaced by P_{Fix L}. "(v)⇒(iii)": This is obvious. "(iv)⇔(v)": Apply Lemma 8.16 to the sequence (L^n)_{n∈N} and use Fact 8.12. ∎

Theorem 8.23. Let L : X → X be linear and nonexpansive, let b ∈ X, set T : X → X : x ↦ Lx + b, suppose that Fix T ≠ ∅, and let µ ∈ ]0, 1[. Then the following are equivalent:
(i) T^n → P_{Fix T} µ-linearly pointwise (in X).
(ii) L^n → P_{Fix L} µ-linearly pointwise (in X).
(iii) L^n → P_{Fix L} µ-linearly (in B(X)).

Proof. In view of Theorem 7.3, there exists a ∈ X such that b = a − La. "(i)⇔(ii)": It follows from Lemma 8.19 and (7.4) that T^n x − P_{Fix T} x = a + L^n(x − a) − (a + P_{Fix L}(x − a)) = L^n(x − a) − P_{Fix L}(x − a). Now use Fact 8.12. "(ii)⇔(iii)": Combine Lemma 8.16 and Fact 8.12. ∎

Corollary 8.24. Let L : X → X be linear and nonexpansive, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that v ∈ ran(Id − T). Let x ∈ X and let µ ∈ ]0, 1[. Then the following are equivalent:
(i) T^n x + nv = (v + T)^n x = (T_{−v})^n x → P_{Fix(v+T)} x µ-linearly.
(ii) L^n x → P_{Fix L} x µ-linearly.
(iii) L^n → P_{Fix L} µ-linearly (in B(X)).

Proof. Note that Fix(v + T) ≠ ∅ by (7.8). Now apply Theorem 8.23 with (T, b) replaced by (v + T, v + b) and use Theorem 8.8(iv)&(v). ∎

Corollary 8.25. Suppose that X is finite-dimensional. Let L : X → X be linear, nonexpansive and asymptotically regular, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that Fix T ≠ ∅. Then T^n → P_{Fix T} pointwise linearly.

Proof. It follows from Fact 8.12 that L^n → P_{Fix L} pointwise. Consequently, by Corollary 8.18, L^n → P_{Fix L} linearly. Now apply Theorem 8.23. ∎

Corollary 8.26.
Suppose that X is finite-dimensional. Let L : X → X be linear, nonexpansive and asymptotically regular, let b ∈ X, and set T : X → X : x ↦ Lx + b. Let x ∈ X. Then v ∈ ran(Id − T) and

T^n x + nv = (v + T)^n x = (T_{−v})^n x → P_{Fix(v+T)} x linearly.  (8.37)

Proof. Since X is finite-dimensional, ran(Id − T) is a closed affine subspace of X; hence v ∈ ran(Id − T), or equivalently Fix(v + T) ≠ ∅ by (7.8). Now apply Theorem 8.23 with (T, b) replaced by (v + T, v + b) and use Theorem 8.8(iv)&(v). ∎

8.3.3 Some algorithmic consequences

In this section, we make use of the following fact, which is well known in analysis.

Fact 8.27. Suppose that (a_n)_{n∈N} is a decreasing sequence of nonnegative real numbers such that ∑_{n=0}^∞ a_n < +∞. Then

n a_n → 0.  (8.38)

Proof. See [90, Section 3.3, Theorem 1]. ∎

To make further progress, we now impose additional assumptions on T.

Definition 8.28 (averaged operator). Let T : X → X. Then T is averaged if there exist a nonexpansive operator R : X → X and a constant α ∈ ]0, 1[ such that T = (1 − α)Id + αR; equivalently (see [12, Proposition 4.25]), (∀x ∈ X)(∀y ∈ X)

‖Tx − Ty‖² + ((1 − α)/α)‖(Id − T)x − (Id − T)y‖² ≤ ‖x − y‖².  (8.39)

Clearly,

T is averaged with α = 1/2 ⇔ T is firmly nonexpansive.  (8.40)

Averaged operators have proven to be a useful class in fixed point theory and optimization; see [6] and [61]. The following result is well known when T is firmly nonexpansive. We include a simple proof, for T averaged, for the sake of completeness.

Proposition 8.29. Suppose that T is averaged and that v ∈ ran(Id − T). Let x ∈ X. Then the following hold:
(i) ∑_{n=0}^∞ ‖T^n x − T^{n+1} x − v‖² < +∞.
(ii) T^n x − T^{n+1} x → v; equivalently, the sequence (T^n x + nv)_{n∈N} is asymptotically regular.

Proof. It follows from [61, Lemma 2.1] that there exists α ∈ ]0, 1[ such that (∀x ∈ X)(∀y ∈ X)

‖(Id − T)x − (Id − T)y‖² ≤ (α/(1 − α))(‖x − y‖² − ‖Tx − Ty‖²).  (8.41)

Moreover, Proposition 8.4(ii) implies that (T^n x + nv)_{n∈N} is Fejér monotone with respect to Fix(v + T). Now let n ∈ N and let y₀ ∈ Fix(v + T). Using
Proposition 8.4(i) we learn that T^n y₀ = y₀ − nv. It follows from (8.41), applied with (x, y) replaced by (T^n x, T^n y₀), that

‖T^n x − T^{n+1} x − v‖² = ‖(Id − T)T^n x − (Id − T)T^n y₀‖²  (8.42a)
  ≤ (α/(1 − α))(‖T^n x − T^n y₀‖² − ‖T^{n+1} x − T^{n+1} y₀‖²).  (8.42b)

(i): This follows from (8.42) by telescoping. (ii): This is a direct consequence of (i). ∎

Lemma 8.30. Let L : X → X be linear, nonexpansive and asymptotically regular, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that v ∈ ran(Id − T). Let x ∈ X. Then the sequence (‖T^n x − T^{n+1} x − v‖)_{n∈N} is a decreasing sequence of nonnegative real numbers that converges to 0.

Proof. Let n ∈ N. It follows from Theorem 8.8(iv)&(v) that T^n x + nv = (v + T)^n x. Moreover, since L is nonexpansive, so is v + T. Now

‖T^n x − T^{n+1} x − v‖ = ‖T^n x + nv − (T^{n+1} x + (n + 1)v)‖  (8.43a)
  = ‖(v + T)^n x − (v + T)^{n+1} x‖  (8.43b)
  ≤ ‖(v + T)^{n−1} x − (v + T)^n x‖  (8.43c)
  = ‖T^{n−1} x + (n − 1)v − (T^n x + nv)‖  (8.43d)
  = ‖T^{n−1} x − T^n x − v‖.  (8.43e)

The claim about convergence follows from Proposition 8.29(ii). ∎

Theorem 8.31. Let L : X → X be linear and averaged, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that v ∈ ran(Id − T). Let x ∈ X and set

(∀n ∈ N) x_n = T^n x + n(T^{n²} x − T^{n²+1} x).  (8.44)

Then x_n → P_{Fix(v+T)} x.

Proof. We have

‖x_n − (v + T)^n x‖ = ‖T^n x + n(T^{n²} x − T^{n²+1} x) − (T^n x + nv)‖  (8.45a)
  = n‖T^{n²} x − T^{n²+1} x − v‖  (8.45b)
  = √(n² ‖T^{n²} x − T^{n²+1} x − v‖²) → 0,  (8.45c)

where the limit follows from combining Proposition 8.29(i), Lemma 8.30 and Fact 8.27 applied with a_n replaced by ‖T^n x − T^{n+1} x − v‖². It follows from Corollary 8.21 that (v + T)^n x → P_{Fix(v+T)} x, hence the conclusion follows. ∎

8.4 Further results

Let x and y be in X. It is clear that (∀n ∈ N) ‖T^{n+1} x − T^{n+1} y‖ ≤ ‖T^n x − T^n y‖; hence (‖T^n x − T^n y‖)_{n∈N} is bounded. The following question is thus natural:

Under which conditions on T must (T^n x − T^n y)_{n∈N} converge weakly?  (8.46)

In this section we present some sufficient conditions that guarantee that (8.46) holds. We first note that, in view of Example 8.6, (8.46) will impose some restriction on T:

Fact 8.32.
Suppose that Fix T ≠ ∅ and let x ∈ X. Then (T^n x)_{n∈N} is weakly convergent if and only if T^n x − T^{n+1} x ⇀ 0; if this is the case, then (T^n x)_{n∈N} converges weakly to a point in Fix T.

Proof. See [6, Theorem 1.2]. ∎

Lemma 8.33. Suppose that Fix(v + T) ≠ ∅. Then

(∀x ∈ X)(∀y ∈ X) (T^n x − T^n y)_{n∈N} is weakly convergent  (8.47a)

if and only if

(∀x ∈ X) (T^n x + nv)_{n∈N} is weakly convergent.  (8.47b)

Proof. Indeed, if (8.47a) holds, then (8.47b) follows by choosing y ∈ Fix(v + T) and recalling Proposition 8.4(i). Conversely, assume that (8.47b) holds. Then (T^n x + nv)_{n∈N} and (T^n y + nv)_{n∈N} are weakly convergent, and so is their difference, which yields (8.47a). ∎

Remark 8.34. Suppose that X = R, that v ≠ 0, and that Fix(v + T) ≠ ∅. Then by Proposition 8.5(i) the sequence (T^n x + nv)_{n∈N} is convergent, which gives a mild sufficient condition for (8.46).

Recall that T is asymptotically regular at x if T^n x − T^{n+1} x → 0, and that T is asymptotically regular if it is asymptotically regular at every point.

Theorem 8.35. Suppose that T is affine, say T : x ↦ Lx + b, where L is linear and nonexpansive and b ∈ X. Suppose furthermore that L is asymptotically regular, and let x and y be points in X. Then

T^n x − T^n y = L^n(x − y) → P_{Fix L}(x − y).  (8.48)

Proof. Using Theorem 8.8(ii), we have (∀n ∈ N) T^n x − T^n y = L^n x − L^n y = L^n(x − y). The asymptotic regularity assumption yields L^n(x − y) − L^{n+1}(x − y) → 0. Using Fact 8.12, we see that altogether T^n x − T^n y = L^n(x − y) → P_{Fix L}(x − y). ∎

Amazingly, on the real line, averagedness is another sufficient condition (see Remark 8.34) for (8.46):

Theorem 8.36. Suppose that X = R and that T is averaged. Let x and y be in R. Then the sequence (T^n x − T^n y)_{n∈N} is convergent.

Proof. Set (∀n ∈ N) a_n = T^n x − T^n y. We must show that (a_n)_{n∈N} is convergent. From (8.39), there exists α ∈ ]0, 1[ such that

(∀n ∈ N) a_{n+1}² + ((1 − α)/α)(a_n − a_{n+1})² ≤ a_n².  (8.49)

Set β = 1 − 2α and note that 0 ≤ |β| < 1.
By viewing (8.49) as a quadratic inequality in a_{n+1}, we learn that

(∀n ∈ N) |a_{n+1}| ≤ |a_n| and a_{n+1} lies between a_n and βa_n.  (8.50)

If some a_{n₀} = 0, then a_n → 0 and we are done. So assume that a_n ≠ 0 for every n ∈ N. If (a_n)_{n∈N} changes sign only finitely many times, then (a_n)_{n∈N} is eventually always positive or always negative. Since (|a_n|)_{n∈N} is decreasing, we deduce that (a_n)_{n∈N} is convergent. Finally, we assume that (a_n)_{n∈N} changes sign infinitely often. If n ∈ N and sgn(a_{n+1}) = −sgn(a_n), then |a_{n+1}| ≤ |β||a_n|; since this occurs infinitely many times, it follows that a_n → 0. ∎

Theorem 8.37. Suppose that X is finite-dimensional, that T is averaged, that Fix(v + T) ≠ ∅, and that codim Fix(v + T) ≤ 1. Then for every (x, y) ∈ X × X, the sequence (T^n x − T^n y)_{n∈N} is convergent.

Proof. In view of Lemma 8.33, we let x ∈ X and must show that (T^n x + nv)_{n∈N} is convergent. Set C = Fix(v + T) and (∀n ∈ N) x_n = T^n x + nv. By Proposition 8.4(ii), (x_n)_{n∈N} is Fejér monotone with respect to C. Suppose first that codim C = 0. Then int C ≠ ∅ and we are done by Fact 6.1(v). Now assume that codim C = 1. By Proposition 8.29(ii), (x_n)_{n∈N} is asymptotically regular. Altogether, by Theorem 6.11, (x_n)_{n∈N} is convergent. ∎

Corollary 8.38. Suppose that X = R², that T is averaged, that v ≠ 0, and that Fix(v + T) ≠ ∅. Then for every (x, y) ∈ X × X, the sequence (T^n x − T^n y)_{n∈N} is convergent.

Proof. Because v ≠ 0, Proposition 7.8(ii) implies that dim Fix(v + T) ≥ 1, i.e., codim Fix(v + T) ≤ dim X − 1 = 1. The result now follows from Theorem 8.37. ∎

Although it is not clear whether Corollary 8.38 remains true when dim X ≥ 3, we conclude with an example which numerically suggests that the answer to this question may be positive.

Example 8.39. Suppose that X = R³ and let A and B be two closed balls in X such that A ∩ B = ∅. Set T = ½(Id + R_B R_A). Then T is firmly nonexpansive and hence averaged. (In fact, T is the Douglas–Rachford operator (see Definition 5.1) associated with the sets A and B.)
It follows from Fact 14.4 below that A ∩ (v + B) + N_{A−B} v ⊆ Fix(v + T) ⊆ v + A ∩ (v + B) + N_{A−B} v. Furthermore, Example 12.15 below implies that A − B is a closed ball and that v lies on the boundary of A − B. Consequently, A ∩ (v + B) is a singleton and N_{A−B} v is a ray. Hence Fix(v + T) is a ray, so dim Fix(v + T) = 1 and therefore codim Fix(v + T) = 2. Even though Theorem 8.37 is not applicable here, we still conjecture that (T^n x + nv)_{n∈N} converges (see Figure 8.2 below).

Figure 8.2: A GeoGebra [78] snapshot that illustrates Example 8.39. The first few terms of the sequence (T^n x + nv)_{n∈N} (blue points) are depicted.

Chapter 9

The Douglas–Rachford algorithm: convergence analysis

9.1 Overview

The results in this chapter concern the convergence of the Douglas–Rachford algorithm in the consistent case, i.e., when zer(A + B) ≠ ∅. In this chapter we assume that

A and B are maximally monotone operators on X

and we set

T = T_{(A,B)} = Id − J_A + J_B R_A.

We also assume that

Z = zer(A + B) ≠ ∅ and Fix T ≠ ∅.

Let x ∈ X. The governing sequence and the shadow sequence generated by the Douglas–Rachford algorithm are (T^n x)_{n∈N} and (J_A T^n x)_{n∈N}, respectively. It is known that the governing sequence (T^n x)_{n∈N} converges weakly to a point in Fix T and that the shadow sequence (J_A T^n x)_{n∈N} converges weakly to a point in zer(A + B) (see Fact 9.2 below).

We summarize our main results in this chapter as follows:

• We present a new proof of Svaiter's result [132] concerning the weak convergence of the shadow sequence (see Theorem 9.4) in the consistent case. Our proof is in the spirit of the techniques used in the original paper by Lions and Mercier [96].

• The main algorithmic result (Theorem 9.7) is derived in Section 9.4. We provide precise information on the behaviour of the Douglas–Rachford algorithm in the affine case.

• We also provide an application to the parallel splitting method, which can be used to find a zero of the sum of more than two operators (see Proposition 9.14).

• We sketch some algorithmic consequences and conclude by commenting on the applicability of our work to a more general duality framework.

Except for facts with explicit references, this chapter is based on results that appear in [13], [26], and [39].

9.2 Averaged operators, Krasnosel'skiĭ–Mann iteration and convergence of the Douglas–Rachford algorithm

Fact 9.1 (Krasnosel'skiĭ–Mann). Suppose that T : X → X is averaged and that Fix T ≠ ∅. Let x ∈ X. Then the following hold:

(i) (T^n x)_{n∈N} is Fejér monotone with respect to Fix T.
(ii) (T^n x − T^{n+1} x)_{n∈N} converges strongly to 0.
(iii) (T^n x)_{n∈N} converges weakly to a point in Fix T.

Proof. See [91], [98] or [12, Theorem 5.14].

Fact 9.2 (convergence of the Douglas–Rachford algorithm). Let A : X ⇒ X and B : X ⇒ X be maximally monotone such that zer(A + B) ≠ ∅, and let x ∈ X. Then Fix T ≠ ∅ and there exists x̄ ∈ Fix T such that the following hold:

(i) J_A x̄ ∈ Z.
(ii) (T^n x − T^{n+1} x)_{n∈N} = (J_A T^n x − J_B R_A T^n x)_{n∈N} converges strongly to 0.
(iii) (T^n x)_{n∈N} converges weakly to x̄.
(iv) (J_A T^n x)_{n∈N} converges weakly to J_A x̄.
(v) (J_B R_A T^n x)_{n∈N} converges weakly to J_A x̄.
(vi) Suppose that B = N_V, where V is a closed affine subspace of X. Then (P_V T^n x)_{n∈N} converges weakly to J_A x̄.
(vii) Suppose that int Fix T ≠ ∅. Then (J_A T^n x)_{n∈N} and (J_B R_A T^n x)_{n∈N} converge strongly to J_A x̄.
(viii) Suppose that one of the following holds:
(a) A is uniformly monotone on every nonempty bounded subset of dom A.
(b) B is uniformly monotone on every nonempty bounded subset of dom B.
Then zer(A + B) is a singleton. Moreover, (J_A T^n x)_{n∈N} and (J_B R_A T^n x)_{n∈N} converge strongly to the unique point in Z.

Proof. See [96], [62, Theorem 2.1] or [12, Section 25.2].
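The behaviour described in Fact 9.2 is easy to observe numerically. The following is a minimal sketch with assumed data (it is not code from the thesis): it takes A = N_U and B = N_V for two lines U and V in R², so that the resolvents are the projectors P_U and P_V, runs the Douglas–Rachford iteration, and checks that the shadow sequence approaches Z = U ∩ V = {0}.

```python
import numpy as np

def dr_step(x, JA, JB):
    """One Douglas-Rachford step: T x = x - J_A x + J_B(2 J_A x - x)."""
    return x - JA(x) + JB(2 * JA(x) - x)

# A = N_U and B = N_V for two lines through the origin, so J_A = P_U and J_B = P_V.
PU = lambda x: np.array([x[0], 0.0])            # U = R x {0}
v = np.array([1.0, 1.0]) / np.sqrt(2.0)
PV = lambda x: np.dot(x, v) * v                 # V = R * (1, 1)

x = np.array([3.0, -2.0])
for _ in range(200):
    x = dr_step(x, PU, PV)

shadow = PU(x)   # the shadow J_A T^n x; by Fact 9.2(iv) it converges to a point of Z
```

In this two-subspace setting the governing sequence converges to the unique fixed point 0, and the shadow sequence converges to the unique zero of N_U + N_V, with linear rate governed by the Friedrichs angle between U and V (see Example 9.12 below).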
9.3 A new proof of the Lions–Mercier–Svaiter theorem

Parts of the following two results are implicit in [132]; however, our proofs are different.

Proposition 9.3. Let x ∈ X. Then the following hold:

(i) T^n x − T^{n+1} x = J_A T^n x − J_B R_A T^n x = J_{A^{-1}} T^n x + J_{B^{-1}} R_A T^n x → 0.
(ii) The sequence (J_A T^n x, J_B R_A T^n x, J_{A^{-1}} T^n x, J_{B^{-1}} R_A T^n x)_{n∈N} is bounded and lies in gra(A × B).

Suppose that (a, b, a*, b*) is a weak cluster point of the sequence

(J_A T^n x, J_B R_A T^n x, J_{A^{-1}} T^n x, J_{B^{-1}} R_A T^n x)_{n∈N}.  (9.1)

Then:

(iii) a − b = a* + b* = 0.
(iv) ⟨a, a*⟩ + ⟨b, b*⟩ = 0.
(v) (a, a*) ∈ gra A and (b, b*) ∈ gra B.
(vi) For every x ∈ X, the sequence (J_A T^n x, J_{A^{-1}} T^n x)_{n∈N} is bounded and its weak cluster points lie in S.

Proof. (i): Apply Lemma 5.8(i) with x replaced by T^n x. The claim about the strong limit follows from combining Fact 5.2(i) and [6, Corollary 2.3] or [12, Theorem 5.14(ii)].
(ii): The boundedness of the sequence follows from the weak convergence of (T^n x)_{n∈N} (see Fact 9.2(iii)) and the nonexpansiveness of the resolvents and reflected resolvents of monotone operators (see [12, Corollary 23.10(i) and (ii)]). Now apply Lemma 5.8(ii) with x replaced by T^n x.
(iii): This follows from taking the weak limit along the subsequences in (i).
(iv): In view of (iii), we have ⟨a, a*⟩ + ⟨b, b*⟩ = ⟨a, a* + b*⟩ = ⟨a, 0⟩ = 0.
(v): Let ((x, y), (u, v)) ∈ gra(A × B) and set

a_n = J_A T^n x,  a*_n = J_{A^{-1}} T^n x,  b_n = J_B R_A T^n x,  b*_n = J_{B^{-1}} R_A T^n x.  (9.2)

Applying Lemma 5.7 with (a, b, a*, b*) replaced by (a_n, b_n, a*_n, b*_n) yields

⟨(a_n, b_n) − (x, y), (a*_n, b*_n) − (u, v)⟩ = ⟨a_n − b_n, a*_n⟩ + ⟨x, u⟩ − ⟨x, a*_n⟩ − ⟨a_n − b_n, u⟩ + ⟨b_n, a*_n + b*_n⟩ + ⟨y, v⟩ − ⟨y, b*_n⟩ − ⟨b_n, u + v⟩.  (9.3)

By (5.31), A × B is monotone. In view of (9.2), (9.3) and Proposition 9.3(ii), we deduce that

⟨a_n − b_n, a*_n⟩ + ⟨x, u⟩ − ⟨x, a*_n⟩ − ⟨a_n − b_n, u⟩ + ⟨b_n, a*_n + b*_n⟩ + ⟨y, v⟩ − ⟨y, b*_n⟩ − ⟨b_n, u + v⟩ ≥ 0.  (9.4)

It follows from taking the limit in (9.4) along a subsequence, and using (9.2), Proposition 9.3(i), (iii) and (iv), that

0 ≤ ⟨x, u⟩ − ⟨x, a*⟩ + ⟨y, v⟩ − ⟨y, b*⟩ − ⟨b, u + v⟩  (9.5a)
= ⟨x, u⟩ − ⟨x, a*⟩ + ⟨y, v⟩ − ⟨y, b*⟩ − ⟨a, u⟩ − ⟨b, v⟩ + ⟨a, a*⟩ + ⟨b, b*⟩  (9.5b)
= ⟨a − x, a* − u⟩ + ⟨b − y, b* − v⟩  (9.5c)
= ⟨(a, b) − (x, y), (a*, b*) − (u, v)⟩.  (9.5d)

By the maximality of A × B (see (5.31)), we deduce that ((a, b), (a*, b*)) lies in gra(A × B). Therefore, (a, a*) ∈ gra A and (b, b*) ∈ gra B.
(vi): The boundedness of the sequence follows from (ii). Now let (a, b, a*, b*) be a weak cluster point of (J_A T^n x, J_B R_A T^n x, J_{A^{-1}} T^n x, J_{B^{-1}} R_A T^n x)_{n∈N}. By (v), we know that (a, a*) ∈ gra A and (b, b*) = (a, b*) ∈ gra B, which in view of (iii) implies a* ∈ Aa and −a* = b* ∈ Bb = Ba; hence (a, a*) ∈ S, as claimed (see (5.24)).

Theorem 9.4. Let x ∈ X and let (z, k) ∈ S. Then the following hold:

(i) For every n ∈ N,

‖(J_A T^{n+1} x, J_{A^{-1}} T^{n+1} x) − (z, k)‖² = ‖J_A T^{n+1} x − z‖² + ‖J_{A^{-1}} T^{n+1} x − k‖²  (9.6a)
≤ ‖J_A T^n x − z‖² + ‖J_{A^{-1}} T^n x − k‖²  (9.6b)
= ‖(J_A T^n x, J_{A^{-1}} T^n x) − (z, k)‖².  (9.6c)

(ii) The sequence (J_A T^n x, J_{A^{-1}} T^n x)_{n∈N} is Fejér monotone with respect to S.
(iii) The sequence (J_A T^n x, J_{A^{-1}} T^n x)_{n∈N} converges weakly to some point in S.

Proof. (i): Apply Corollary 5.24 with x replaced by T^n x. (ii): This follows directly from (i). (iii): Combine Proposition 9.3(vi), (ii), Corollary 5.16(iii) and Fact 6.1(viii).

We are now ready for the main result of this section, namely an alternative, shorter proof of the Lions–Mercier–Svaiter result.

Corollary 9.5 (Lions–Mercier–Svaiter). (J_A T^n x)_{n∈N} converges weakly to some point in Z.

Proof. This follows from Theorem 9.4(iii) and (5.24); see also Lions and Mercier's [96, Theorem 1] and Svaiter's [132, Theorem 1].
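The Fejér monotonicity in Theorem 9.4(i) can also be observed numerically. The sketch below is an illustration with assumed data (not code from the thesis): it takes A = N_U and B = N_V for two lines in R², so that J_A = P_U and, by the inverse resolvent identity, J_{A^{-1}} = Id − P_U; the pair (z, k) = (0, 0) then lies in S.

```python
import numpy as np

PU = lambda x: np.array([x[0], 0.0])          # J_A for A = N_U, U = R x {0}
v = np.array([1.0, 2.0]) / np.sqrt(5.0)
PV = lambda x: np.dot(x, v) * v               # J_B for B = N_V, V = R * (1, 2)
T = lambda x: x - PU(x) + PV(2 * PU(x) - x)   # Douglas-Rachford operator

x = np.array([5.0, 4.0])
dist2 = []   # squared distance of (J_A T^n x, J_{A^-1} T^n x) to (z, k) = (0, 0)
for _ in range(30):
    dist2.append(np.linalg.norm(PU(x)) ** 2 + np.linalg.norm(x - PU(x)) ** 2)
    x = T(x)
```

As (9.6) predicts, the recorded squared distances decrease monotonically along the iteration.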
In the final result of this section, we show that when X = R, the Fejér monotonicity of the sequence (J_A T^n x, J_{A^{-1}} T^n x)_{n∈N} with respect to S can be decoupled to yield Fejér monotonicity of (J_A T^n x)_{n∈N} and (J_{A^{-1}} T^n x)_{n∈N} with respect to Z and K, respectively.

Lemma 9.6. Suppose that X = R. Let x ∈ X and let (z, k) ∈ Z × K. Then the following hold:

(i) The sequence (J_A T^n x)_{n∈N} is Fejér monotone with respect to Z.
(ii) The sequence (J_{A^{-1}} T^n x)_{n∈N} is Fejér monotone with respect to K.

Proof. Apply Lemma 5.25 with x replaced by T^n x.

We point out that the conclusion of Lemma 9.6 does not hold when dim X ≥ 2; see [30, Section 5 & Figure 1].

9.4 The Douglas–Rachford algorithm in the affine case

In this section we assume that

A : X ⇒ X and B : X ⇒ X are maximally monotone and affine,

and that

Z = {x ∈ X | 0 ∈ Ax + Bx} ≠ ∅.  (9.7)

Since the resolvents J_A and J_B are affine (see [27, Theorem 2.1(xix)]), so is T. The results of this section are of historical interest, and they extend the original setting of the Douglas–Rachford method from affine operators defined on a finite-dimensional space to possibly infinite-dimensional settings.

Theorem 9.7. Let x ∈ X. Then the following hold:

(i) T^n x → P_{Fix T} x and J_A T^n x → J_A P_{Fix T} x ∈ Z.
(ii) Suppose that A and B are paramonotone and that K ⊥ (Z − Z) (as is the case when⁹ A and B are paramonotone and (zer A) ∩ (zer B) ≠ ∅). Then J_A T^n x → P_Z x.
(iii) Suppose that X is finite-dimensional. Then T^n x → P_{Fix T} x linearly and J_A T^n x → J_A P_{Fix T} x linearly.

Proof. (i): In view of Fact 5.2(iii) and (9.7), we have Fix T ≠ ∅. Moreover, Fact 5.2(i) and Fact 8.11 imply that T is asymptotically regular. It follows from Theorem 8.20 that (i) holds. (ii): Use (i) and Theorem 5.19. (iii): The linear convergence of (T^n x)_{n∈N} follows from Corollary 8.25. The linear convergence of (J_A T^n x)_{n∈N} is a direct consequence of the linear convergence of (T^n x)_{n∈N} and the fact that J_A is (firmly) nonexpansive.

Remark 9.8. Theorem 9.7 generalizes the convergence results for the original Douglas–Rachford algorithm [70] from particular symmetric matrices/affine operators on a finite-dimensional space to general affine relations defined on possibly infinite-dimensional spaces, while retaining strong and linear convergence of the iterates of the governing sequence (T^n x)_{n∈N} and identifying the limit to be P_{Fix T} x. Paramonotonicity coupled with common zeros yields convergence of the shadow sequence (J_A T^n x)_{n∈N} to P_Z x.

⁹See Theorem 4.32(ii).

The assumption that both operators are paramonotone is critical for the conclusion of Theorem 9.7(ii), as shown below.

Example 9.9. Suppose that X = R², that

A = [[0, −1], [1, 0]] and that B = N_{{0}×R}.  (9.8)

Then T : R² → R² : (x, y) ↦ ½(x − y)·(1, −1), Fix T = R·(1, −1), Z = {0} × R, K = {0}, hence K ⊥ (Z − Z), and (∀(x, y) ∈ R²)(∀n ≥ 1) T^n(x, y) = T(x, y) = ½(x − y, y − x) ∈ Fix T; however, whenever x + y ≠ 0 we have (∀n ≥ 1)

(0, (y − x)/2) = J_A T^n(x, y) ≠ P_Z(x, y) = (0, y).  (9.9)

Note that A is not paramonotone by Example 3.25.

Proof. We have

J_A = (Id + A)^{-1} = [[1, −1], [1, 1]]^{-1} = ½[[1, 1], [−1, 1]],  (9.10)

and

R_A = 2J_A − Id = [[0, 1], [−1, 0]].  (9.11)

Moreover, by [12, Example 23.4],

J_B = P_{{0}×R} = [[0, 0], [0, 1]].  (9.12)

Consequently,

T = Id − J_A + J_B R_A = [[1, 0], [0, 1]] − ½[[1, 1], [−1, 1]] + [[0, 0], [0, 1]] [[0, 1], [−1, 0]] = ½[[1, −1], [−1, 1]],  (9.13)

i.e.,

T : R² → R² : (x, y) ↦ ((x − y)/2)(1, −1).  (9.14)

Now let (x, y) ∈ R². Then (x, y) ∈ Fix T ⇔ (x, y) = ((x − y)/2, −(x − y)/2) ⇔ x = (x − y)/2 and y = −(x − y)/2 ⇔ x + y = 0; hence Fix T = R·(1, −1), as claimed. It follows from Fact 5.2(iii) that Z = J_A(Fix T) = R·J_A(1, −1) = R·½(0, −2) = {0} × R, as claimed. Now let (x, y) ∈ R². By (9.14), T(x, y) = ((x − y)/2)(1, −1) ∈ Fix T; hence (∀n ≥ 1) T^n(x, y) = T(x, y) = ((x − y)/2)(1, −1). Therefore, (∀n ≥ 1) J_A T^n(x, y) = J_A T(x, y) = ((x − y)/2) J_A(1, −1) = (0, (y − x)/2) ≠ (0, y) = P_Z(x, y) whenever x + y ≠ 0.
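The matrix computations (9.10)–(9.13) in Example 9.9 can be checked mechanically. The sketch below (an illustration, not code from the thesis) rebuilds J_A, R_A, J_B and T and verifies that the shadow J_A T^n(x, y) misses P_Z(x, y) at a sample point off the line x + y = 0.

```python
import numpy as np

A  = np.array([[0., -1.], [1., 0.]])     # the rotator from (9.8)
I  = np.eye(2)
JA = np.linalg.inv(I + A)                # resolvent, (9.10)
RA = 2 * JA - I                          # reflected resolvent, (9.11)
JB = np.array([[0., 0.], [0., 1.]])      # projector onto {0} x R, (9.12)
T  = I - JA + JB @ RA                    # Douglas-Rachford operator, (9.13)

assert np.allclose(T, 0.5 * np.array([[1., -1.], [-1., 1.]]))
assert np.allclose(T @ T, T)             # T is the projector onto Fix T = R*(1, -1)

p = np.array([3.0, 4.0])                 # sample point with p[0] + p[1] != 0
shadow = JA @ (T @ p)                    # J_A T^n p for every n >= 1
PZp = np.array([0.0, p[1]])              # P_Z p for Z = {0} x R
assert not np.allclose(shadow, PZp)      # the shadow does not reach P_Z p
```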
In Example 9.10 below, we show that the assumption that A and B have common zeros (equivalently, in view of Theorem 4.32, that K ⊥ (Z − Z)) is critical in Theorem 9.7(ii): it cannot be relaxed to assuming merely (9.7).

Example 9.10 (when K ⊥̸ (Z − Z)). Let u ∈ X ∖ {0}. Suppose that A : X → X : x ↦ u and B : X → X : x ↦ −u. Then A and B are paramonotone, A + B ≡ 0, and therefore Z = X. Moreover, by Remark 4.29, (∀z ∈ Z = X) K = (Az) ∩ (−Bz) = {u} ⊥̸ (Z − Z) = X. Note that Fix T = Z + K = X + {u} = X and J_A : X → X : x ↦ x − u. Consequently,

(∀x ∈ X)(∀n ∈ N)  J_A T^n x = J_A P_{Fix T} x = J_A x = x − u ≠ x = P_Z x.  (9.15)

Suppose that U and V are nonempty closed convex subsets of X. Then

T_{U,V} = T_{(N_U, N_V)} = Id − P_U + P_V(2P_U − Id).  (9.16)

Proposition 9.11. Suppose that U and V are closed linear subspaces of X. Let w ∈ X. Then w + U and w + V are closed affine subspaces of X, (w + U) ∩ (w + V) ≠ ∅, and (∀n ∈ N)

T^n_{w+U, w+V} = T^n_{U,V}(· − w) + w.  (9.17)

Proof. Let x ∈ X. We proceed by induction. The case n = 0 is clear. We now prove the case n = 1, i.e.,

T_{w+U, w+V} = T_{U,V}(· − w) + w.  (9.18)

Indeed, using Fact 2.6, T_{w+U, w+V} x = (Id − P_{w+U} + P_{w+V}(2P_{w+U} − Id))x = x − w − P_U(x − w) + w + P_V(2P_{w+U} x − x − w) = x − w − P_U(x − w) + w + P_V(2w + 2P_U(x − w) − x − w) = (x − w) − P_U(x − w) + P_V(2P_U(x − w) − (x − w)) + w = (Id − P_U + P_V R_U)(x − w) + w = T_{U,V}(x − w) + w. We now assume that (9.17) holds for some n ∈ N. Applying (9.18) with x replaced by T^n_{w+U, w+V} x yields

T^{n+1}_{w+U, w+V} x = T_{w+U, w+V}(T^n_{w+U, w+V} x) = T_{U,V}(T^n_{w+U, w+V} x − w) + w = T_{U,V}(T^n_{U,V}(x − w) + w − w) + w = T^{n+1}_{U,V}(x − w) + w;  (9.19)

hence (9.17) holds for all n ∈ N.

Example 9.12 (Douglas–Rachford in the affine feasibility case; see also [30, Corollary 4.5]). Suppose that U and V are closed linear subspaces of X. Let w ∈ X and let x ∈ X. Suppose that A = N_{w+U} and that B = N_{w+V}. Then T_{w+U, w+V} x = Lx + b, where L = T_{U,V} and b = w − T_{U,V} w. Moreover,

T^n_{w+U, w+V} x → P_{Fix T_{w+U, w+V}} x  (9.20)

and

J_A T^n_{w+U, w+V} x = P_{w+U} T^n_{w+U, w+V} x → P_Z x = P_{(w+V)∩(w+U)} x.  (9.21)

Finally, if U + V is closed (as is the case when X is finite-dimensional), then the convergence is linear with rate c_F(U, V) < 1, where c_F(U, V) is the cosine of the Friedrichs angle¹⁰ between U and V.

Proof. Using (9.17) with n = 1 and the linearity of T_{U,V}, we have

T_{w+U, w+V} = T_{U,V}(· − w) + w = T_{U,V} + w − T_{U,V} w.  (9.22)

Hence L = T_{U,V} and b = w − T_{U,V} w, as claimed. To obtain (9.20) and (9.21), use Theorem 9.7(i) and Theorem 9.7(ii), respectively. The claim about the linear rate follows by combining [30, Corollary 4.4] and Theorem 8.23 with T replaced by T_{w+U, w+V} and (L, b) replaced by (T_{U,V}, w − T_{U,V} w).

Remark 9.13. When X is infinite-dimensional, it is possible to construct an example (see [30, Section 6]) of two linear subspaces U and V with c_F(U, V) = 1 for which the rate of convergence of T is not linear.

The following result complements Combettes's work [62, Section 2.2], which deals with more general operators; here, we obtain a strong convergence result in the affine setting.

¹⁰The cosine of the Friedrichs angle between U and V is c_F(U, V) = sup{ |⟨u, v⟩| : u ∈ par U ∩ W^⊥ ∩ ball(0; 1), v ∈ par V ∩ W^⊥ ∩ ball(0; 1) }, where W = par U ∩ par V.

Proposition 9.14 (parallel splitting). Let m ∈ {2, 3, …}, and let B_i : X ⇒ X be maximally monotone and affine, i ∈ {1, 2, …, m}, such that zer(∑_{i=1}^m B_i) ≠ ∅. Set ∆ = {(x, …, x) ∈ X^m | x ∈ X}, set A = N_∆, set B = ×_{i=1}^m B_i, set T = T_{(A,B)}, let j : X → X^m : x ↦ (x, x, …, x), and let e : X^m → X : (x_1, x_2, …, x_m) ↦ (1/m) ∑_{i=1}^m x_i. Let x ∈ X^m. Then

∆^⊥ = { (u_1, …, u_m) ∈ X^m | ∑_{i=1}^m u_i = 0 },  (9.23)

Z = Z_{(A,B)} = j(zer(∑_{i=1}^m B_i)) ⊆ ∆  (9.24)

and

K = K_{(A,B)} = (−B(Z)) ∩ ∆^⊥ ⊆ ∆^⊥.  (9.25)

Moreover, the following hold:

(i) T^n x → P_{Fix T} x and P_∆ T^n x → P_∆ P_{Fix T} x.
(ii) Suppose that X is finite-dimensional. Then T^n x → P_{Fix T} x linearly and J_A T^n x = P_∆ T^n x → P_∆ P_{Fix T} x linearly.
(iii) Suppose that B_i : X ⇒ X, i ∈ {1, 2, …, m}, are paramonotone. Then B is paramonotone and J_A T^n x = P_∆ T^n x → P_Z x. Consequently, e(J_A T^n x) = e(P_∆ T^n x) → e(P_Z x) ∈ zer(∑_{i=1}^m B_i).

Proof. Both (9.23) and (9.24) follow from [12, Proposition 25.5(i)&(vi)]. On the other hand, (9.25) follows from Corollary 4.31(iii) applied to (A, B). (i): Apply Theorem 9.7(i) to (A, B). (ii): Apply Theorem 9.7(iii) to (A, B). (iii): Let (x, u) and (y, v) be in gra B, with (x_i, u_i), (y_i, v_i) ∈ gra B_i for i ∈ {1, …, m}. On the one hand, ⟨x − y, u − v⟩ = 0 ⇔ ∑_{i=1}^m ⟨x_i − y_i, u_i − v_i⟩ = 0. On the other hand, since each B_i is monotone, we have (∀i ∈ {1, …, m}) ⟨x_i − y_i, u_i − v_i⟩ ≥ 0. Altogether, (∀i ∈ {1, …, m}) ⟨x_i − y_i, u_i − v_i⟩ = 0. Now use the paramonotonicity of B_i to deduce that (x_i, v_i), (y_i, u_i) ∈ gra B_i, i ∈ {1, …, m}; equivalently, (x, v), (y, u) ∈ gra B. Finally, apply Example 5.20.
9.5 Eckstein–Ferris–Pennanen–Robinson duality and algorithms

In this last section, we sketch some algorithmic consequences and then conclude by commenting on the applicability of our work to a more general duality framework. Recall that

T = ½ Id + ½ R_B R_A = Id − J_A + J_B R_A,  (9.26)

and that the set of primal solutions Z coincides with J_A(Fix T) (see Fact 5.2(iii)). This explains the interest in finding fixed points of T. Moreover, if the nearest primal solution is of interest (e.g., in the problem of finding the projection onto the intersection of two nonempty closed convex sets), then the following result may be helpful:

Theorem 9.15 (abstract algorithm). Suppose that A and B are paramonotone. Let (x_n)_{n∈N} be a sequence such that (x_n)_{n∈N} converges (weakly or strongly) to x ∈ Fix T and (J_A x_n)_{n∈N} converges (weakly or strongly) to J_A x. Then the following hold:

(i) (∀k ∈ K) J_A x = P_Z(x − k).
(ii) If (Z − Z) ⊥ K, then J_A x = P_Z x.

Proof. Combine Corollary 5.16(v) with Theorem 4.40.

We provide three examples.

Example 9.16 (Douglas–Rachford algorithm). Suppose that A and B are paramonotone and that the sequence (x_n)_{n∈N} is generated by (∀n ∈ N) x_{n+1} = T x_n. The hypothesis of Theorem 9.15 is satisfied, and the convergence of the sequences is with respect to the weak topology [132]. See also [9] for a much simpler proof and [12, Theorem 25.6] for a powerful generalization.

Example 9.17 (Halpern-type algorithm). Suppose that A and B are paramonotone and that the sequence (x_n)_{n∈N} is generated by (∀n ∈ N) x_{n+1} = (1 − λ_n)T x_n + λ_n y, where (λ_n)_{n∈N} is a sequence of parameters in ]0, 1[ and y ∈ X is given. Under suitable assumptions on (λ_n)_{n∈N}, it is known (see [85], [135]) that x_n → x = P_{Fix T} y with respect to the norm topology. Since J_A is (firmly) nonexpansive, it is clear that the hypothesis of Theorem 9.15 holds. Furthermore, J_A x_n → J_A x = J_A P_{Fix T} y. Thus, if k₀ ∈ K, then J_A x_n → P_Z(y − k₀) by Theorem 4.40(i); and if (Z − Z) ⊥ K, then J_A x_n → P_Z y by Theorem 4.40(ii).

Example 9.18 (Haugazeau-type algorithm). This is similar to Example 9.17 in that x_n → x = P_{Fix T} y with respect to the norm topology, where y ∈ X is given. For the precise description of the (somewhat complicated) update formula for (x_n)_{n∈N}, we refer the reader to [12, Section 29.2] or [11]; see also [86]. Once again, we have J_A x_n → J_A x = J_A P_{Fix T} y; thus, if k₀ ∈ K, then J_A x_n → P_Z(y − k₀) by Theorem 4.40(i), and if (Z − Z) ⊥ K, then J_A x_n → P_Z y by Theorem 4.40(ii). Consequently, in the context of Example 5.17, we obtain P_U x_n → P_{U∩V} y; in fact, this is [21, Theorem 3.3], which is the main result of [21].

Turning to Eckstein–Ferris–Pennanen–Robinson duality, let us assume the following:

• Y is a real Hilbert space (possibly different from X);
• C is a maximally monotone operator on Y;
• L : X → Y is continuous and linear.

Around the turn of the millennium, Eckstein and Ferris [73], Pennanen [118], as well as Robinson [121], considered the problem of finding zeros of

A + L*CL.  (9.27)

This framework is more flexible than the Attouch–Théra framework, which corresponds to the case when Y = X and L = Id. (For an even more general framework, see [66].) Note that, just as Attouch–Théra duality relates to classical Fenchel duality in the subdifferential case (see Section 4.6), the Eckstein–Ferris–Pennanen–Robinson duality pertains to the classical Fenchel–Rockafellar duality for the problem of minimizing f + h ∘ L when f ∈ Γ_X and h ∈ Γ_Y, and A = ∂f and C = ∂h.

The results in the previous sections can be used in the Eckstein–Ferris–Pennanen–Robinson framework thanks to items (ii) and (iii) of the following result, which allow us to set B = L*CL.

Proposition 9.19. The following hold:

(i) If C is paramonotone, then L*CL is paramonotone.
(ii) (Pennanen) If R₊₊(ran L − dom C) is a closed subspace of Y, then L*CL is maximally monotone.
(iii) If C is paramonotone and R₊₊(ran L − dom C) is a closed subspace of Y, then L*CL is maximally monotone and paramonotone.

Proof. (i): Take x₁ and x₂ in X, and suppose that x*₁ ∈ L*CLx₁ and x*₂ ∈ L*CLx₂. Then there exist y*₁ ∈ CLx₁ and y*₂ ∈ CLx₂ such that x*₁ = L*y*₁ and x*₂ = L*y*₂. Thus,

⟨x₁ − x₂, x*₁ − x*₂⟩ = ⟨x₁ − x₂, L*y*₁ − L*y*₂⟩ = ⟨Lx₁ − Lx₂, y*₁ − y*₂⟩ ≥ 0  (9.28)

because C is monotone. Hence L*CL is monotone. Now suppose furthermore that ⟨x₁ − x₂, x*₁ − x*₂⟩ = 0. Then ⟨Lx₁ − Lx₂, y*₁ − y*₂⟩ = 0, and the paramonotonicity of C yields y*₂ ∈ C(Lx₁) and y*₁ ∈ C(Lx₂). Therefore, x*₂ = L*y*₂ ∈ L*CLx₁ and x*₁ = L*y*₁ ∈ L*CLx₂.
(ii): See [118, Corollary 4.4(c)].
(iii): Combine (i) and (ii).

9.6 Brief history

The Douglas–Rachford algorithm has its roots in the 1956 paper [70] as a method for solving a system of linear equations where the matrices are symmetric and positive semidefinite. In 1969, Lieutaud (see [95]) extended the method to deal with (possibly nonlinear) maximally monotone operators that are defined everywhere. Lions and Mercier, in their paper [96] from 1979, presented a broad and powerful generalization to its current form, i.e., to handle the sum of any two maximally monotone operators that are possibly nonlinear, possibly set-valued and not necessarily defined everywhere. (For details on this connection we refer the reader to [95] and [61].) In their seminal work, they showed that, for every x ∈ X, (T^n x)_{n∈N} converges weakly to a point in Fix T and that the bounded shadow sequence (J_A T^n x)_{n∈N} has all its weak cluster points in zer(A + B), provided that A + B is maximally monotone. (Note that resolvents are not weakly continuous in general; see [139] or [12, Example 4.12].) In their joint work from 1992, Eckstein and Bertsekas proved that J_A(Fix T) ⊆ zer(A + B) (see [72, Theorem 5]). Later on, in 2006, Combettes refined the results of Eckstein and Bertsekas by providing the first characterization of the set of zeros of the sum as precisely the shadows of the fixed points of T, namely J_A(Fix T) (see [61, Lemma 2.6(iii)]). In the finite-dimensional setting, together with the earlier results by Lions and Mercier [96], the work by Eckstein and Bertsekas, and later by Combettes, asserts the convergence of the shadow sequence to a solution of the sum problem (without assuming maximality). Concerning the shadow sequence, we point out that explicit proofs of weak convergence under additional assumptions, and strong convergence results in special cases, can be found in [62]. The first proof that the weak cluster points of the shadow sequence are zeros of A + B (without assuming the maximality of the sum) in the convex feasibility setting appeared in [18, Fact 5.9] in 2002. Building on [8] and [74], Svaiter provided a complete answer in 2011 (see [132]), demonstrating that A + B does not have to be maximally monotone and that the shadow sequence (J_A T^n x)_{n∈N} in fact does converge weakly to a point in zer(A + B). (He used Theorem 9.4; however, his proof differs from ours, which is more in the style of the original paper by Lions and Mercier [96].) Nonetheless, when Z = ∅, the complete understanding of (J_A T^n x)_{n∈N} remains open; to the best of our knowledge, Theorem 14.10 below is currently the most powerful result available.

Chapter 10

On the order of the operators

10.1 Overview

By definition, the Douglas–Rachford splitting operator associated with the ordered pair of operators (A, B) depends on the order of the operators A and B, even though the sum problem remains unchanged when A and B are interchanged. The goal of this chapter is to investigate the connection between the operators T_{(A,B)} and T_{(B,A)}. Our main results can be summarized as follows:

• We show that R_A is an isometric bijection from the fixed point set of T_{(A,B)} to that of T_{(B,A)}, with inverse R_B : Fix T_{(B,A)} → Fix T_{(A,B)} (see Theorem 10.2).

• When A is an affine relation, we have (∀n ∈ N) R_A T^n_{(A,B)} = T^n_{(B,A)} R_A. In particular, when A = N_U, where U is a closed affine subspace of X, we have (∀n ∈ N) T^n_{(A,B)} = R_A T^n_{(B,A)} R_A and T^n_{(B,A)} = R_A T^n_{(A,B)} R_A (see Proposition 10.5(i) and Theorem 10.8(i)).

• Our results connect to recent linear and finite convergence results (see Remark 10.11) for the Douglas–Rachford algorithm (see [1], [2], [37], [34], [87] and [88]).

The results in this chapter are mainly based on the work published in [14].

10.2 Connection between Fix T_{(A,B)} and Fix T_{(B,A)}

In view of Definition 4.3, one easily verifies that

Z_{(B,A)} = (B + A)^{-1}(0) = Z  (10.1)

and

K_{(B,A)} = (B^{-1} + A^{−∨})^{-1}(0) = −K.  (10.2)

We further recall (see ?? and ??) that

Z = J_A(Fix T_{(A,B)}) and K = (Id − J_A)(Fix T_{(A,B)}),  (10.3)

and we will make use of the following useful lemma.

Lemma 10.1. We have

R_A T_{(A,B)} − T_{(B,A)} R_A = 2 J_A T_{(A,B)} − J_A − J_A R_B R_A.  (10.4)

Proof. Using (5.2), we have R_A T_{(A,B)} − T_{(B,A)} R_A = 2 J_A T_{(A,B)} − T_{(A,B)} − R_A + J_B R_A − J_A R_B R_A = 2 J_A T_{(A,B)} − Id + J_A − J_B R_A − 2 J_A + Id + J_B R_A − J_A R_B R_A = 2 J_A T_{(A,B)} − J_A − J_A R_B R_A.
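Identity (10.4) is a purely formal consequence of the definitions, so it can be sanity-checked with linear operators. The sketch below (an illustration, not code from the thesis) builds two maximally monotone linear operators on R³ from random matrices whose symmetric part is positive semidefinite, and verifies (10.4) as a matrix identity.

```python
import numpy as np

rng = np.random.default_rng(0)
I = np.eye(3)

def monotone_matrix():
    """Random 3x3 matrix with PSD symmetric part; as a linear operator it is maximally monotone."""
    M = rng.standard_normal((3, 3))
    return M @ M.T + (M - M.T)   # PSD symmetric part plus a skew-symmetric part

A, B = monotone_matrix(), monotone_matrix()
JA, JB = np.linalg.inv(I + A), np.linalg.inv(I + B)   # resolvents
RA, RB = 2 * JA - I, 2 * JB - I                       # reflected resolvents
TAB = I - JA + JB @ RA                                # T_(A,B)
TBA = I - JB + JA @ RB                                # T_(B,A)

lhs = RA @ TAB - TBA @ RA
rhs = 2 * JA @ TAB - JA - JA @ RB @ RA
assert np.allclose(lhs, rhs)                          # identity (10.4)
```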
We are now ready for the first main result in this chapter.

Theorem 10.2. R_A is an isometric bijection from Fix T_{(A,B)} to Fix T_{(B,A)}, with isometric inverse R_B. Moreover, we have the following commutative diagram, in which the top horizontal arrows are R_A (left to right) and R_B (right to left), the bottom horizontal arrow is Id × (−Id), and the vertical arrows are (J_A, Id − J_A) ∘ ∆ (on the left) and (J_B, Id − J_B) ∘ ∆ (on the right):

    Fix T_{(A,B)}  ⇄  Fix T_{(B,A)}
         |                  |
         v                  v
      S_{(A,B)}   →    S_{(B,A)}

Here S_{(A,B)} = {(z, −w) ∈ X × X | −w ∈ Bz, w ∈ Az} is the Kuhn–Tucker set¹¹ for the pair (A, B), and ∆ : X → X × X : x ↦ (x, x). In particular, we have

R_A : Fix T_{(A,B)} → Fix T_{(B,A)} : z + k ↦ z − k,  (10.5)

where (z, k) ∈ S_{(A,B)}.

¹¹For further information on the Kuhn–Tucker set, we refer the reader to [74, Section 2.1].

Proof. Let x ∈ X, and note that (5.2) implies that Fix T_{(A,B)} = Fix R_B R_A and Fix T_{(B,A)} = Fix R_A R_B. Now x ∈ Fix T_{(A,B)} ⇔ x = R_B R_A x ⇒ R_A x = R_A R_B R_A x ⇔ R_A x ∈ Fix R_A R_B = Fix T_{(B,A)}, which proves that R_A maps Fix T_{(A,B)} into Fix T_{(B,A)}. Interchanging A and B, one sees that R_B maps Fix T_{(B,A)} into Fix T_{(A,B)}. We next prove that R_A maps Fix T_{(A,B)} onto Fix T_{(B,A)}: given y ∈ Fix T_{(B,A)}, we have R_B y ∈ Fix T_{(A,B)} and R_A R_B y = y, as required; the same argument applies to R_B. Finally, since (∀x ∈ Fix T_{(A,B)}) R_B R_A x = x, R_A is a bijection from Fix T_{(A,B)} to Fix T_{(B,A)} with the desired inverse. To see that R_A : Fix T_{(A,B)} → Fix T_{(B,A)} is an isometry, note that for all x and y in Fix T_{(A,B)} we have ‖x − y‖ = ‖R_B R_A x − R_B R_A y‖ ≤ ‖R_A x − R_A y‖ ≤ ‖x − y‖.

We now turn to the diagram. The correspondence between Fix T_{(A,B)} and Fix T_{(B,A)} follows from the argument above. On the other hand, the correspondences between Fix T_{(A,B)} and S_{(A,B)}, and between Fix T_{(B,A)} and S_{(B,A)}, follow from combining Remark 4.24 and Theorem 5.14 applied to T_{(A,B)} and T_{(B,A)}, respectively. The fourth correspondence is immediate from the definitions of S_{(A,B)} and S_{(B,A)}. To prove (10.5), let y ∈ Fix T_{(A,B)} and recall, in view of Theorem 5.14 and Remark 4.24, that y = z + k where (z, k) ∈ S_{(A,B)}; then R_A(z + k) = (J_A − (Id − J_A))(z + k) = J_A(z + k) − (Id − J_A)(z + k) = z − k, which completes the proof of the theorem.

Remark 10.3. In view of Remark 4.24, Theorem 5.14 and Corollary 5.16(v), when A and B are paramonotone (as is always the case when A and B are subdifferential operators of functions in Γ(X)), we may replace S_{(A,B)} and S_{(B,A)} by Z × K and Z × (−K), respectively.

10.3 Iterates of T_{(A,B)} vs. iterates of T_{(B,A)}

Lemma 10.4. Suppose that A is an affine relation. Then:

(i) J_A is affine and J_A R_A = 2J_A² − J_A = R_A J_A.

If A = N_U, where U is a closed affine subspace of X, then we additionally have:

(ii) P_U = J_A = J_A R_A = R_A J_A and (Id − J_A)R_A = J_A − Id.
(iii) R_A² = Id, R_A = R_A^{-1}, and R_A : X → X is an isometric bijection.

Proof. (i): The fact that J_A is affine follows from [27, Theorem 2.1(xix)]. Hence J_A R_A = J_A(2J_A − Id) = 2J_A² − J_A = R_A J_A.
(ii): It follows from Example 3.14(ii) that P_U = J_A. Now, using (i), we have R_A J_A = J_A R_A = 2P_U² − P_U = 2P_U − P_U = P_U = J_A. To prove the last identity, note that by (i) we have (Id − J_A)R_A = R_A − J_A R_A = 2P_U − Id − P_U = P_U − Id.
(iii): Because R_A is affine, it follows from (ii) that R_A² = R_A(2J_A − Id) = 2R_A J_A − R_A = 2P_U − (2P_U − Id) = Id. Finally, let x, y ∈ X. Since R_A is nonexpansive, we have ‖x − y‖ = ‖R_A² x − R_A² y‖ ≤ ‖R_A x − R_A y‖ ≤ ‖x − y‖; hence all the inequalities become equalities, which completes the proof.

We now turn to the iterates of the Douglas–Rachford algorithm.

Proposition 10.5. Suppose that A is an affine relation. Then the following hold:

(i) (∀n ∈ N) R_A T^n_{(A,B)} = T^n_{(B,A)} R_A.
(ii) R_A Z = J_A(Fix T_{(B,A)}) and R_A K = (J_A − Id)(−Fix T_{(B,A)}).

If B is also an affine relation, then we additionally have:

(iii) T_{(A,B)} R_B R_A = R_B R_A T_{(A,B)}.
(iv) 4(T_{(A,B)} T_{(B,A)} − T_{(B,A)} T_{(A,B)}) = R_B R_A² R_B − R_A R_B² R_A. Consequently,

T_{(A,B)} T_{(B,A)} = T_{(B,A)} T_{(A,B)} ⇔ R_B R_A² R_B = R_A R_B² R_A.  (10.6)

(v) If R_A² = R_B² = Id, then T_{(A,B)} T_{(B,A)} = T_{(B,A)} T_{(A,B)}.

Proof. (i): It follows from (10.4), Lemma 10.4(i) and (5.2) that

R_A T_{(A,B)} − T_{(B,A)} R_A = 2 J_A T_{(A,B)} − J_A − J_A R_B R_A  (10.7a)
= J_A(2T_{(A,B)} − Id) − J_A R_B R_A  (10.7b)
= J_A(2(½(Id + R_B R_A)) − Id) − J_A R_B R_A  (10.7c)
= J_A R_B R_A − J_A R_B R_A = 0,  (10.7d)

which proves the claim for n = 1. The general case follows by induction.
(ii): Using (10.3), Lemma 10.4(i) and Theorem 10.2, we have

R_A Z = R_A J_A(Fix T_{(A,B)}) = J_A R_A(Fix T_{(A,B)}) = J_A(Fix T_{(B,A)}).  (10.8)

By the inverse resolvent identity (3.15), we easily deduce that R_{A^{-1}} = −R_A. Therefore, using Lemma 10.4(i) applied to A^{-1} and Theorem 10.2, we obtain

R_A K = −R_{A^{-1}} J_{A^{-1}}(Fix T_{(A,B)}) = −J_{A^{-1}} R_{A^{-1}}(Fix T_{(A,B)})  (10.9a)
= −J_{A^{-1}}(−R_A Fix T_{(A,B)}) = (J_A − Id)(−Fix T_{(B,A)}).  (10.9b)

(iii): Note that T_{(A,B)} and T_{(B,A)} are affine. It follows from (5.2) that

T_{(A,B)} R_B R_A = T_{(A,B)}(2T_{(A,B)} − Id) = 2T²_{(A,B)} − T_{(A,B)}  (10.10a)
= (2T_{(A,B)} − Id)T_{(A,B)} = R_B R_A T_{(A,B)}.  (10.10b)

(iv): We have

4(T_{(A,B)} T_{(B,A)} − T_{(B,A)} T_{(A,B)}) = 4(½(Id + R_B R_A) ½(Id + R_A R_B) − ½(Id + R_A R_B) ½(Id + R_B R_A))  (10.11a)
= Id + R_B R_A + R_A R_B + R_B R_A² R_B − (Id + R_A R_B + R_B R_A + R_A R_B² R_A) = R_B R_A² R_B − R_A R_B² R_A.  (10.11b)

(v): This is a direct consequence of (iv).

Remark 10.6. In passing, we point out that the assumption in Proposition 10.5(v) is equivalent to saying that A = N_U and B = N_V, where U and V are closed affine subspaces of X. Indeed, R_A² = Id ⇔ J_A = J_A², and therefore ran J_A = Fix J_A. Combining this with [139, Theorem 1.2] yields that J_A is a projection; hence A is an affine normal cone operator by Example 3.14(ii).

With regard to Proposition 10.5(i), one may wonder whether the conclusion still holds when R_A is replaced by R_B. We now give an example showing that the answer is negative.

Example 10.7. Suppose that X = R², that U = R × {0}, that V = {0} × R₊, that A = N_U, and that B = N_V. Then A is linear, hence R_A T_{(A,B)} = T_{(B,A)} R_A; however, R_B T_{(A,B)} ≠ T_{(B,A)} R_B and R_B T_{(B,A)} ≠ T_{(A,B)} R_B.

Proof.
To prove the identity RAT(A,B) = T(B,A)RA apply Proposition 10.5(i)with n = 1. Now let (x, y) ∈ R2 and set for every x ∈ R, x+ = max{x, 0}and x− = min{x, 0}. Elementary calculations show that RA(x, y) = (x,−y)and RB(x, y) = (−x, |y|). Consequently, (5.2) implies that T(A,B)(x, y) =(0, y+) and T(B,A)(x, y) = (0, y−) Therefore,RBT(A,B)(x, y) = (0, y+), (10.12a)11910.3. Iterates of T(A,B) vs. iterates of T(B,A)T(B,A)RB(x, y) = (0, 0), (10.12b)RBT(B,A)(x, y) = (0,−y+), (10.12c)T(A,B)RB(x, y) = (0, |y|). (10.12d)The conclusion follows from comparing equations (10.12a)–(10.12d). We are now ready for our second main result.Theorem 10.8 (When A is normal cone of closed affine subspace). Supposethat U is a closed affine subspace and that A = NU . Then the following hold:(i) (∀n ∈N) RATn(B,A) = Tn(A,B)RA, Tn(B,A) = RATn(A,B)RA and Tn(A,B) =RATn(B,A)RA.(ii) RA : Fix T(B,A) → Fix T(A,B), Z = JA(Fix T(B,A)) andK = (JA − Id)(Fix T(B,A)).(iii) Suppose that V is a closed affine subspace of X and that B = NV . ThenT(A,B)RARB = RARBT(A,B) and T(A,B)T(B,A) = T(B,A)T(A,B).Proof. (i): Let n ∈N. It follows from Proposition 10.5(i) andLemma 10.4(iii) that Tn(A,B) = RARATn(A,B) = RATn(B,A)RA. HenceTn(A,B)RA = RATn(B,A)RARA = RATn(B,A).(ii): The statement for RA follows from combining Theorem 10.2 andLemma 10.4(iii). In view of (10.3), Lemma 10.4(ii) and Theorem 10.2 onelearns that Z = JA(Fix T(A,B)) = JARA(Fix T(A,B)) = JA(Fix T(B,A)). Finally,(10.3), Lemma 10.4(iii) and (ii), and Theorem 10.2 imply thatK = (Id−JA)(Fix T(A,B)) = (Id−JA)RA(RA Fix T(A,B)) (10.13a)= (JA − Id) Fix T(B,A). (10.13b)(iii): In view of (i) applied to A and B we have T(A,B)RARB =RAT(B,A)RB = RARBT(A,B). The second identity now follows fromcombining Proposition 10.5(v) and Lemma 10.4(iii) applied to both A andB. 12010.3. Iterates of T(A,B) vs. iterates of T(B,A)Figure 10.1: A GeoGebra [78] snapshot. Two closed convex sets in R2, Uis a linear subspace (green line) and V (the ball). 
Shown are also the first five terms of the sequences (T^n(A,B) RA x0)n∈N (red points) and (T^n(B,A) x0)n∈N (blue points) in each case.

Figure 10.2: A GeoGebra [78] snapshot. Two closed convex sets in R^2: U is the halfspace (cyan region) and V is the ball. Shown are also the first five terms of the sequences (T^n(A,B) RA x0)n∈N (red points) and (T^n(B,A) x0)n∈N (blue points) in each case.

Figure 10.1 illustrates Theorem 10.8(i), while Figure 10.2 illustrates the failure of this result when the subspace is replaced by a cone.

The conclusion of Theorem 10.8(iii) may fail when A or B is affine but not a normal cone operator, as we illustrate next.

Example 10.9. Suppose that X = R^2, that U = R × {0}, that

A = NU and B = [ 1 1 ; 1 1 ]. (10.14)

Then B is linear and maximally monotone but not a normal cone operator, and

(1/9)[ 5 −1 ; −1 2 ] = T(A,B)T(B,A) ≠ T(B,A)T(A,B) = (1/9)[ 5 1 ; 1 2 ]. (10.15)

Proof. Maximal monotonicity of B follows from [12, Example 20.19 and Example 20.29]. Moreover, since B is nonzero and single-valued, it cannot be a normal cone operator. Now, on the one hand, using Lemma 10.4(ii) we have

JA = PU = [ 1 0 ; 0 0 ], hence RA = [ 1 0 ; 0 −1 ]. (10.16)

On the other hand, one readily verifies that

JB = (1/3)[ 2 −1 ; −1 2 ], hence RB = (1/3)[ 1 −2 ; −2 1 ]. (10.17)

Therefore,

T(A,B) = ½(Id + RB RA) = (1/3)[ 2 1 ; −1 1 ] (10.18)

and

T(B,A) = ½(Id + RA RB) = (1/3)[ 2 −1 ; 1 1 ], (10.19)

which completes the proof. ∎

Corollary 10.10. Suppose that U is an affine subspace, that A = NU and that Z ≠ ∅. Let x and y be in X. Then the following hold:

(i) (∀n ∈ N) JA T^n(B,A) x = JA T^n(A,B) RA x. Furthermore, (JA T^n(B,A) x)n∈N converges weakly to a point in Z.
(ii) ‖T(A,B) x − T(A,B) y‖ = ‖T(B,A) RA x − T(B,A) RA y‖ ≤ ‖RA x − RA y‖.

Proof. (i): It follows from applying Lemma 10.4(iii), Proposition 10.5(i) and Lemma 10.4(ii) that

JA T^n(B,A) x = JA T^n(B,A) RA RA x = JA RA T^n(A,B) RA x = JA T^n(A,B) RA x, (10.20)

as claimed.
The convergence of the sequence (JA T^n(B,A) x)n∈N follows from [12, Theorem 25.6].

(ii): Apply Lemma 10.4(iii) with (x, y) replaced by (T(A,B) x, T(A,B) y), Proposition 10.5(i) with n = 1, and use the nonexpansiveness of T(B,A). ∎

Remark 10.11.

(i) The results of Theorem 10.8 and Corollary 10.10 are of interest when the Douglas–Rachford method is applied to find a zero of the sum of more than two operators, in which case one can use a parallel splitting method (see [12, Proposition 25.7]), where one operator is the normal cone operator of the diagonal subspace in a product space.

(ii) A second glance at the proof of Theorem 10.8(i) reveals that the result remains true if JB is replaced by any operator QB : X → X (and RB is replaced by 2QB − Id, of course). This is interesting because in [1], [2], [87] and [88], QB is chosen to be a selection of the (set-valued) projector onto a set V that is not convex. Hence the generalized variant of Theorem 10.8(i) then guarantees that the orbits of the two Douglas–Rachford operators are related via

(∀n ∈ N) T^n(B,A) = RA T^n(A,B) RA. (10.21)

(iii) As a consequence of (ii) and Lemma 10.4(iii), we see that if linear convergence is guaranteed for the iterates of T(A,B), then the same holds true for the iterates of T(B,A), provided that U is a closed affine subspace, V is a nonempty closed set, A = NU and JB is a selection of the projection onto V. This is not particularly striking when we compare to sufficient conditions that are already symmetric in A and B (such as, e.g., ri U ∩ ri V ≠ ∅ in [34] and [119]); however, this is a new insight when the sufficient conditions are not symmetric (as in, e.g., [1], [43], [87] and [88]).

(iv) A comment similar to (iii) can be made for finite convergence results; see [130] and [37] for nonsymmetric Slater-type (U ∩ int V ≠ ∅) sufficient conditions.

We now turn to the Borwein–Tam method [44].

Proposition 10.12.
Suppose that U is an affine subspace of X, that A = NU, and set

T[A,B] = T(A,B)T(B,A). (10.22)

Then the following hold:

(i) T[A,B] = RA T[B,A] RA = (T(A,B) RA)^2 = (RA T(B,A))^2.
(ii) Suppose that V is an affine subspace and that B = NV. Then T[A,B] = T[B,A] (see [30, Proposition 3.5] for the case when U and V are linear subspaces). Consequently, T[A,B] = (RB T(A,B))^2 = (T(B,A) RB)^2 = ½(T(A,B) + T(B,A)), and T[A,B] is firmly nonexpansive.

Proof. (i): Using (10.22) and Theorem 10.8(i) with n = 1, we obtain

T[A,B] = RA T(B,A) RA T(B,A) = RA T(B,A) T(A,B) RA = RA T[B,A] RA
= T(A,B) RA T(A,B) RA = (T(A,B) RA)^2 = (RA T(B,A))^2. (10.23)

(ii): The identity T[A,B] = T[B,A] follows from Theorem 10.8(iii) and (10.22). Now combine with (i), with A and B switched, and use [44, Remark 4.1]. The fact that T[A,B] (hence T[B,A]) is firmly nonexpansive follows from the firm nonexpansiveness of T(A,B) and T(B,A) and the fact that the class of firmly nonexpansive operators is closed under convex combinations (see [12, Example 4.31]). ∎

Following [44], the Borwein–Tam method, specialized to two nonempty closed convex subsets U and V of X, iterates the operator T[A,B] of (10.22), where A = NU and B = NV. We conclude with an example showing that if A or B is not an affine normal cone operator, then T[A,B] and T[B,A] need not be firmly nonexpansive.

Example 10.13. Suppose that X = R^2, that U = R+ · (1, 1), that V = R × {0}, that A = NU and that B = NV. Then neither T[A,B] nor T[B,A] is firmly nonexpansive.

Proof. Applying Example 3.14(ii) to A and B respectively, we verify that JA = PU : R^2 → R^2 : (x, y) ↦ ½ max{0, x + y} · (1, 1) and JB = PV : R^2 → R^2 : (x, y) ↦ (x, 0). Consequently,

RA : R^2 → R^2 : (x, y) ↦ (max{−x, y}, max{x, −y}) (10.24)

and

RB : R^2 → R^2 : (x, y) ↦ (x, −y). (10.25)

This implies that

RA RB : R^2 → R^2 : (x, y) ↦ RA(x, −y) = (−x, y) if x − y ≤ 0, and (−y, x) if x − y > 0, (10.26)

and

RB RA : R^2 → R^2 : (x, y) ↦ (−x, y) if x + y ≤ 0, and (y, −x) if x + y > 0. (10.27)

Let (x, y) ∈ R^2.
Using (5.2) we verify that T(A,B)(x, y) = (½(x + y)+, y − ½(x + y)+) and T(B,A)(x, y) = (½(x − y)+, y + ½(x − y)+).

Now let α > 0, let x = (−2α, 2α) and let y = (0, 0). A routine calculation shows that T[A,B] x = T(A,B)T(B,A)(−2α, 2α) = (α, α) and T[A,B] y = T(A,B)T(B,A)(0, 0) = (0, 0); hence

⟨T[A,B] x − T[A,B] y, (Id − T[A,B]) x − (Id − T[A,B]) y⟩ = ⟨(α, α), (−3α, α)⟩ = −2α^2 < 0. (10.28)

Applying a similar argument to T[B,A] with x = (−2α, −2α) and y = (0, 0) shows that

⟨T[B,A] x − T[B,A] y, (Id − T[B,A]) x − (Id − T[B,A]) y⟩ = ⟨(α, −α), (−3α, −α)⟩ = −2α^2 < 0. (10.29)

It then follows from Fact 3.2(v) that neither T[A,B] nor T[B,A] is firmly nonexpansive. ∎

Chapter 11

Generalized solutions for the sum of two maximally monotone operators

11.1 Overview

A common theme in mathematics is to define generalized solutions to deal with problems that potentially do not have solutions. A classical example is the introduction of least squares solutions via the normal equations associated with a possibly infeasible system of linear equations.

Our goal in this chapter is to define a normal problem associated with the original problem of finding a zero of the sum of two maximally monotone operators that has attractive and useful properties. Similarly to the complete extension of classical linear equations via normal equations (see Section 11.2), our proposed approach achieves the following:

• If the original problem has a solution, then so does the normal problem, and the sets of solutions to these problems coincide.
• The normal problem may have a solution even if the original problem does not have any.
• The solutions of the normal problem have a variational interpretation as minimal displacement solutions related to the Douglas–Rachford splitting operator.
• The normal problem interacts well with Attouch–Théra duality.

New results presented in this chapter are published in [31] and [38].
11.2 A motivation from Linear Algebra

A classical problem rooted in Linear Algebra and of central importance in the natural sciences is to solve a system of linear equations, say

Ax = b. (11.1)

However, it may occur (due to noisy data, for instance) that (11.1) does not have a solution. An ingenious approach to cope with this situation, dating back to Carl Friedrich Gauss and his famous prediction of the asteroid Ceres in 1801 (see [41, Subsection 1.1.1] and [102, Epilogue in Section 4.6]), is to consider the normal equation associated with (11.1), namely

A*Ax = A*b, (11.2)

where A* denotes the transpose of A. The normal equation (11.2) has extremely useful properties:

• If the original system (11.1) has a solution, then so does the associated system (11.2); furthermore, the sets of solutions of these two systems coincide in this case.
• The associated system (11.2) always has a solution.
• The solutions of the normal equation have a variational interpretation as least squares solutions: they are the minimizers of the function x ↦ ‖Ax − b‖^2.

From now on, throughout the rest of this chapter, we assume that

A and B are maximally monotone operators on X. (11.3)

Recall that

(A, B)* = (A−1, B−>), (11.4)

that (see Definition 4.1) the primal problem associated with the (ordered) pair (A, B) is to determine the set of zeros of the sum,

Z = Z(A,B) = (A + B)−1(0), (11.5)

and that the (Attouch–Théra) dual problem (see Definition 4.2) associated with the pair (A, B) is to determine the set of zeros of the sum,

K = K(A,B) = (A−1 + B−>)−1(0), (11.6)

where A> = (−Id) ◦ A ◦ (−Id).

11.3 Perturbation calculus

Recall that (see Section 4.3) the dual and primal solution mappings associated with (A, B) are

K : X ⇒ X : x ↦ (Ax) ∩ (−Bx) (11.7)

and

Z : X ⇒ X : x ↦ (A−1x) ∩ (−B−>x), (11.8)

respectively.
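Before developing the perturbation calculus, the normal-equation motivation of Section 11.2 can be checked numerically on a small inconsistent system. The following is a minimal sketch; the 3×2 matrix and right-hand side are made-up illustrative data, not taken from the text.

```python
import numpy as np

# A deliberately inconsistent 3x2 system Ax = b: the first two rows force
# x1 = 0 and x1 = 2 simultaneously, so (11.1) has no solution.
A = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
b = np.array([0.0, 2.0, 1.0])

# The normal equation (11.2), A^T A x = A^T b, is always solvable.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Variational interpretation: the same x minimizes ||Ax - b||^2.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

assert np.allclose(x_normal, x_lstsq)
print(x_normal)  # the least squares solution (1, 1)
```

Here x_normal is unique because A*A happens to be invertible; when A*A is singular, the normal equation still has (infinitely many) solutions, in line with the second bullet point above.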
Note that the primal solution mapping Z of (A, B) is the dual solution mapping of (A, B)*, and analogously for K. The importance of these mappings stems from the following result (see [3] or Proposition 4.14):

dom K = Z, ran K = K, dom Z = K, ran Z = Z, and Z = K−1, (11.9)

which shows that the solution mappings relate the sets of solutions Z and K to each other.

Definition 11.1 (shift operator and corresponding inner/outer perturbations). Let w ∈ X. We define the associated shift operator

Sw : X → X : x ↦ x − w, (11.10)

and we extend Sw to deal with subsets of X by setting (∀C ⊆ X) Sw(C) = ∪c∈C {Sw(c)}. We define the corresponding inner and outer perturbations of A by

Aw = A ◦ Sw : X ⇒ X : x ↦ A(x − w) (11.11)

and

w A = Sw ◦ A : X ⇒ X : x ↦ Ax − w. (11.12)

(Definition 11.1 extends our earlier definition of inner and outer shifts of single-valued operators, Definition 7.1, to possibly set-valued operators.) Observe that if w ∈ X, then the operators Aw and w A are maximally monotone, with domains S−w(dom A) = w + dom A and dom A, respectively.

Lemma 11.2 (perturbation calculus). Let w ∈ X. Then the following hold:

(i) (Aw)−1 = −w(A−1).
(ii) (w A)−1 = (A−1)−w.
(iii) (Aw)> = (A>)−w.
(iv) (w A)> = −w(A>).
(v) (Aw)−> = w(A−>).
(vi) (w A)−> = (A−>)w.

Proof. Let (x, y) ∈ X^2. (i): y ∈ (Aw)−1 x ⇔ x ∈ Aw y = A(y − w) ⇔ y − w ∈ A−1 x ⇔ y ∈ A−1 x + w = −w(A−1) x. (ii): y ∈ (w A)−1 x ⇔ x ∈ w A y ⇔ x ∈ Ay − w ⇔ x + w ∈ Ay ⇔ y ∈ A−1(x + w) = (A−1)−w x. (iii): (Aw)> x = −Aw(−x) = −A(−x − w) = A>(x + w) = (A>)−w x. (iv): (w A)> x = −w A(−x) = −(A(−x) − w) = A> x − (−w) = −w(A>) x. (v): Using (i) and (iv), we see that (Aw)−> = ((Aw)−1)> = (−w(A−1))> = w(A−>). (vi): Using (ii) and (iii), we see that (w A)−> = ((w A)−1)> = ((A−1)−w)> = (A−>)w. ∎

As an application, we record the following result, which will be useful later.

Corollary 11.3 (dual of inner-outer perturbation). Let w ∈ X. Then

(w A, Bw)* = ((w A)−1, (Bw)−>) = ((A−1)−w, w(B−>)). (11.13)

Proof.
Combine (11.4) with Lemma 11.2(ii)&(v). ∎

11.4 Perturbations of the Douglas–Rachford operator

We now turn to the Douglas–Rachford operator (see Definition 5.1), defined by

T = T(A,B) = JB RA + Id − JA. (11.14)

Proposition 11.4. Let w ∈ X. Then the following hold:

(i) If x ∈ Fix(−w T), then x − w − JA x ∈ w A JA x ∩ (−Bw JA x).
(ii) If y ∈ w A z ∩ (−Bw z), then x = w + y + z ∈ Fix(−w T) and z = JA x.

Proof. Note first that if x ∈ X, then x − w − JA x ∈ w A JA x.

(i): Since x ∈ Fix(−w T), we have x − Tx = w; equivalently, JA x − w = JB RA x. Hence 2JA x − x = RA x ∈ (B + Id)(JA x − w) = Bw JA x + JA x − w, and thus −(x − w − JA x) ∈ Bw JA x.

(ii): Since y ∈ w A z ∩ (−Bw z) = (Az − w) ∩ (−B(z − w)), we have z = JA x and z − w = JB(−y + z − w). Hence RA x = 2JA x − x = 2z − (w + y + z) = z − w − y, and so JB RA x = JB(z − w − y) = z − w. Thus x − Tx = JA x − JB RA x = z − (z − w) = w. ∎

Corollary 11.5. Let w ∈ X. Then

Fix(−w T) = w + ∪z∈X (z + (w A z ∩ (−Bw z))). (11.15)

Proposition 11.6. Let w ∈ X. Then

T(w A,Bw) = T−w (11.16)

and

Fix(T−w) = −w + Fix(−w T) = ∪z∈X (z + ((Az − w) ∩ (−B(z − w)))). (11.17)

Proof. Let x ∈ X. Using Fact 3.20, we obtain JBw x = JB(x − w) + w and Jw A x = JA(x + w). Consequently, Rw A x = 2JA(x + w) − x. It thus follows with (11.14) that

T(w A,Bw) x = x − Jw A x + JBw Rw A x (11.18a)
= x − JA(x + w) + JB(2JA(x + w) − x − w) + w (11.18b)
= (x + w) − JA(x + w) + JB(RA(x + w)) (11.18c)
= T(x + w) = (T−w) x, (11.18d)

and so (11.16) holds. Next, x ∈ Fix(T−w) ⇔ x = T(x + w) ⇔ x + w = w + T(x + w) ⇔ x + w ∈ Fix(−w T), and we have thus verified the left identity in (11.17). Finally, to see the right identity in (11.17), use Corollary 11.5. ∎

We now obtain a generalization of Theorem 5.14, which corresponds to the case when w = 0.

Proposition 11.7. Let w ∈ X and define

Kw : X ⇒ X : x ↦ (Ax − w) ∩ (−B(x − w)). (11.19)

Then

Ψw : gr Kw → Fix(−w T) : (z, k) ↦ z + k + w (11.20)

is a well-defined bijection that is continuous in both directions, with Ψw−1 : x ↦ (JA x, x − JA x − w).

Proof.
For the pair (w A, Bw), the dual solution mapping is Kw, and by (11.16) the Douglas–Rachford operator is T−w. Applying Theorem 5.14 in this context, we obtain that

Φ : gr Kw → Fix T−w : (z, k) ↦ z + k (11.21)

is continuous in both directions, with Φ−1 : x ↦ (Jw A x, x − Jw A x) = (JA(x + w), x − JA(x + w)). Furthermore, S−w is a bijection from Fix(T−w) to Fix(−w T) by (11.17). This shows that Ψw = S−w ◦ Φ, as claimed. ∎

11.5 The w-perturbed problem

Definition 11.8 (w-perturbed problem). Let w ∈ X. The w-perturbation of (A, B) is (w A, Bw). The w-perturbed problem associated with the pair (A, B) is to determine the set of zeros

Zw = (w A + Bw)−1(0) = {x ∈ X | w ∈ Ax + B(x − w)}. (11.22)

Note that the w-perturbed problem of (A, B) is precisely the primal problem of (w A, Bw), i.e., of the w-perturbation of (A, B).

Proposition 11.9 (Douglas–Rachford operator of the w-perturbation). Let w ∈ X. Then the Douglas–Rachford operator of the w-perturbation (w A, Bw) of (A, B) is

T(w A,Bw) = T−w. (11.23)

Proof. This follows from (11.16) of Proposition 11.6. ∎

Proposition 11.10. Let w ∈ X. Then

Zw = Jw A(Fix(T−w)) = JA(w + {x ∈ X | x = T(x + w)}). (11.24)

Furthermore, the following are equivalent:

(i) Zw ≠ ∅.
(ii) Fix(T−w) ≠ ∅.
(iii) w ∈ ran(Id − T).
(iv) w ∈ ran(A + Bw).

Proof. The identity (11.24) follows by combining (10.3) with Proposition 11.9. This also yields the equivalence of (i) and (ii). Let x ∈ X. Then x ∈ Zw ⇔ 0 ∈ w A x + Bw x ⇔ w ∈ Ax + Bw x, and we deduce the equivalence of (i) and (iv). Finally, x ∈ Fix(T−w) ⇔ x = T(x + w) ⇔ w ∈ (Id − T)(x + w), which yields the equivalence of (ii) and (iii). ∎

The equivalence of (i) and (iii) yields the following key result on which w-perturbations have nonempty solution sets.

Corollary 11.11. {w ∈ X | Zw ≠ ∅} = ran(Id − T).

Remark 11.12 (Attouch–Théra dual of the perturbed problem). Consider the given pair of monotone operators (A, B).
We could either first perturb and then take the Attouch–Théra dual, or start with the Attouch–Théra dual and then perturb. It turns out that the order of these operations does not matter — up to a horizontal shift of the graphs. Indeed, for every x ∈ X, we have

(w(A−1) + (B−>)w) x = A−1 x − w + B−>(x − w) (11.25a)
= A−1((x − w) + w) + B−>(x − w) − w (11.25b)
= (A−1)−w(x − w) + w(B−>)(x − w) (11.25c)
= ((A−1)−w + w(B−>))(x − w). (11.25d)

Hence gr(w(A−1) + (B−>)w) = (w, 0) + gr((A−1)−w + w(B−>)), which gives rise to the following diagram:

[Diagram: (A, B) maps to (w A, Bw) under "perturb by w" and to (A−1, B−>) under "Attouch–Théra dual"; taking the Attouch–Théra dual of (w A, Bw) yields (w(A−1), (B−>)w), perturbing (A−1, B−>) by w yields ((A−1)−w, w(B−>)), and these two pairs are related by horizontal shifts by −w and w.]

Figure 11.1: A diagram that illustrates Remark 11.12.

11.6 The normal problem

We are now in a position to define the normal problem.

Definition 11.13 (minimal displacement vector and the normal problem). The vector

v(A,B) = v = Pcl ran(Id−T)(0) (11.26)

is the minimal displacement vector of (A, B). The normal problem associated with (A, B) is the v(A,B)-perturbed problem of (A, B). The set of primal normal solutions (normal solutions for simplicity) is

Zv = Z(v A,Bv) (11.27)

and the set of dual normal solutions is

Kv = K(v A,Bv) = Z((v A)−1,(Bv)−>). (11.28)

Lemma 11.14. The following hold:

(i) Zv = Jv A(Fix(T−v)) = J(−v+A)(Fix(T−v)) = JA(Fix(T−v) + v) = JA(Fix(v + T)).
(ii) Kv = (Id − Jv A)(Fix(T−v)) = (Id − J(−v+A))(Fix(T−v)).
(iii) Kv ≠ ∅ ⇔ Zv ≠ ∅ ⇔ v ∈ ran(Id − T).

Proof. (i): Apply (10.3) to the normal pair (v A, Bv) and use (11.23) with w replaced by v, together with (7.1). Now apply Fact 3.20(i). The last equality follows from Lemma 7.2(i). (ii): Apply (10.3) to the normal pair (v A, Bv), then use (11.23) with w replaced by v and Fact 3.20(ii). (iii): The first equivalence follows from applying Proposition 4.4(v) to the normal pair (v A, Bv). Now apply Corollary 11.11. ∎

Remark 11.15 (new notions are well-defined).
The notions presented in Definition 11.13 are well-defined: indeed, since T is firmly nonexpansive (Fact 5.2(i)), it is also nonexpansive, and the existence and uniqueness of v(A,B) follow from Fact 7.4.

Remark 11.16 (new notions extend original notions). Suppose that for the original problem (A, B) we have Z = Z0 = (A + B)−1(0) ≠ ∅. By Corollary 11.11, 0 ∈ ran(Id − T) and so v(A,B) = 0. Hence the normal problem coincides with the original problem, as do the associated sets of solutions.

Remark 11.17 (normal problem may or may not have solutions). If the set of original solutions Z is empty, then the set of normal solutions may be either nonempty (see Example 11.24) or empty (see Example 11.25).

The original problem of finding a zero of A + B is clearly symmetric in A and B. We now present a statement about the magnitude of the corresponding minimal displacement vectors.

Proposition 11.18. ‖v(A,B)‖ = ‖v(B,A)‖.

Proof. It follows from Fact 5.2(i) that

Id − T(A,B) = ½(Id − RB RA) and Id − T(B,A) = ½(Id − RA RB). (11.29)

Thus, using Lemma 7.6, we see that ‖v(A,B)‖ = ½‖Pcl ran(Id−RB RA)(0)‖ = ½‖Pcl ran(Id−RA RB)(0)‖ = ‖v(B,A)‖. ∎

In view of Definition 11.13 and Proposition 11.18, the magnitude of the vector v(A,B) is a measure of how far the original problem is from its normal problem, and this magnitude is the same for the pairs (A, B) and (B, A).

Remark 11.19 (v(A,B) ≠ v(B,A) may occur). We will see in the sequel examples where v(A,B) ≠ 0 but (i) v(B,A) = −v(A,B) (see Remark 11.23); or (ii) v(B,A) = v(A,B) (see Example 11.27); or (iii) the enclosed angle between v(B,A) and v(A,B) takes any value (see Example 11.26 and Example 11.28).

Remark 11.20 (self-duality: v(A,B) = v(A−1,B−>)). Since, by Corollary 5.12, we have T(A,B) = T(A−1,B−>), it is clear that v(A,B) = v(A−1,B−>). It follows from Remark 11.12 that the operations of perturbing by v(A,B) and taking the Attouch–Théra dual commute, up to a shift.

11.7 Examples

Proposition 11.21. (Id − JB)A−1(0) ⊆ ran(Id − T) ⊆ dom A − dom B.

Proof.
The right inclusion follows from (12.21). To tackle the left inclusion, suppose that z ∈ A−1(0) and set w = z − JB z. Then w = z − JB z ∈ B(JB z) = B(z − w), and since 0 ∈ Az, we deduce that w ∈ Az + Bw(z). Hence, by Proposition 11.10, w ∈ ran(Id − T). ∎

Proposition 11.22 (normal cone operators). Suppose that A = NU and B = NV, where U and V are nonempty closed convex subsets of X. Then

v(A,B) = Pcl(U−V)(0) (11.30)

and the set of normal solutions is

U ∩ (v(A,B) + V) = Fix(PU PV). (11.31)

Proof. Since A−1(0) = U and JB = PV, Proposition 11.21 yields C = {u − PV u | u ∈ U} ⊆ ran(Id − T) ⊆ U − V; hence

cl C ⊆ cl ran(Id − T) ⊆ cl(U − V). (11.32)

Set g = Pcl(U−V)(0). By [10, Theorem 4.1], there exists a sequence (un)n∈N in U such that un − PV un → g. It follows that (un − PV un)n∈N lies in C, and hence that g ∈ cl C. Therefore Pcl C(0) = Pcl ran(Id−T)(0) = Pcl(U−V)(0), and we obtain (11.30). (For an alternative proof, see [20, Theorem 3.5].)

Let x ∈ X. Then x is a normal solution if and only if

g ∈ NU x + NV(x − g). (11.33)

Assume first that (11.33) holds. Then x ∈ U and x − g ∈ V. Hence x ∈ U ∩ (g + V) = Fix(PU PV) by [10, Lemma 2.2]. Conversely, assume x ∈ U ∩ (g + V) = Fix(PU PV). Then x − g ∈ V, x ∈ U, PV x = x − g and PU(x − g) = x. Hence NU(x) ⊇ R− g and NV(x − g) ⊇ R+ g; consequently, NU(x) + NV(x − g) ⊇ R− g + R+ g = R g ∋ g, and (11.33) holds. ∎

Remark 11.23. Proposition 11.22 is consistent with the theory dealing with inconsistent feasibility problems (see [10]). Note that it also yields the formula

v(A,B) = −v(B,A) (11.34)

in this particular context.

Example 11.24 (no original solutions but normal solutions exist). Suppose that A and B are as in Proposition 11.22, that U ∩ V = ∅, and that V is also bounded. Then U ∩ (v(A,B) + V) = Fix(PU PV) ≠ ∅ by the Browder–Göhde–Kirk fixed point theorem (see [12, Theorem 4.19]). So the original problem has no solution, but there exist normal solutions.

Figure 11.2: A GeoGebra [78] snapshot that illustrates Example 11.24.

Example 11.25 (neither original nor normal solutions exist).
Suppose that X = R^2, that A and B are as in Proposition 11.22, that

U = {(x, y) ∈ R^2 | β + exp(x) ≤ y}, (11.35)

where β ∈ R+, and that V = R × {0}. Then v(A,B) = (0, β), yet Fix(PU PV) = ∅.

Figure 11.3: A GeoGebra [78] snapshot that illustrates Example 11.25.

Example 11.26. Suppose that X = R^2, let L : R^2 → R^2 : (ξ, η) ↦ (−η, ξ) be the rotator by π/2, and let a* ∈ R^2 and b* ∈ R^2. Suppose that (∀x ∈ R^2) Ax = Lx + a* and Bx = −Lx − b*. Now let x ∈ X and let w ∈ X. Then 0 = Ax + B(x − w) − w = Lx + a* − L(x − w) − b* − w, and so (Id − L)w = a* − b*, i.e., w = J−L(a* − b*) = ½(Id + L)(a* − b*) by Example 3.17. It follows that

v(A,B) = ½(Id + L)(a* − b*). (11.36)

An analogous argument yields

v(B,A) = ½(Id − L)(b* − a*). (11.37)

Setting d* = a* − b*, we have 4⟨v(A,B), v(B,A)⟩ = ⟨Ld* + d*, Ld* − d*⟩ = ‖Ld*‖^2 − ‖d*‖^2 = 0 and v(A,B) + v(B,A) = Ld*. Thus, if d* ≠ 0, i.e., a* ≠ b*, then

v(A,B) ≠ 0 and v(A,B) ⊥ v(B,A). (11.38)

Example 11.27. Suppose that there exist a* and b* in X such that gr A = X × {a*} and gr B = X × {b*}. By (12.21), ∅ ≠ ran(Id − T) ⊆ {a* + b*}. Hence v(A,B) = a* + b*, and analogously v(B,A) = a* + b*. Thus, if a* + b* ≠ 0, we have

v(A,B) ≠ 0 and v(A,B) = v(B,A). (11.39)

In the previous examples we constructed cases where

⟨v(A,B), v(B,A)⟩ / (‖v(A,B)‖ ‖v(B,A)‖) ∈ {−1, 0, 1}. (11.40)

We now show that this quotient can take on any value in [−1, 1].

Example 11.28 (angle between v(A,B) and v(B,A)). Suppose that S is a linear subspace of X such that {0} ⫋ S ⫋ X. Let θ ∈ R, let u ∈ S, and let v ∈ S⊥ be such that ‖u‖ = ‖v‖ = 1. Set a = sin(θ) v and b = cos(θ) u. Suppose that A = N(S+a) and that B = NS + b. Then D = dom A − dom B = S + a − S = S + a and R = ran A + ran B = S⊥ + S⊥ + b = S⊥ + b. Consequently, −D = S − a. Clearly, D ∩ R = cl D ∩ cl R = {b + a}, whereas (−D) ∩ R = cl(−D) ∩ cl R = {b − a}. Therefore v(A,B) = b + a and v(B,A) = b − a. By Proposition 11.18, ‖v(A,B)‖ = ‖v(B,A)‖ = 1. Moreover, since a ⊥ b,

⟨v(A,B), v(B,A)⟩ = ⟨b + a, b − a⟩ = ‖b‖^2 − ‖a‖^2 (11.41a)
= cos^2(θ) − sin^2(θ) = cos(2θ).
(11.41b)

Proposition 11.29. Suppose that there exist continuous linear monotone operators L and M on X, and vectors a* and b* in X, such that (∀x ∈ X) Ax = Lx + a* and Bx = Mx + b*. Consider the problem

minimize ‖w‖^2 subject to (w, x) ∈ X × X and (Id + M)w − (L + M)x = a* + b*. (11.42)

Let (w, x) ∈ X × X. Then (w, x) solves (11.42) ⇔ w = v(A,B) and x is a normal solution ⇔ w = PJM(ran(A+B))(0) and x ∈ (A + B)−1((Id + M)w).

Proof. Let (w, x) ∈ X × X. Then w = Ax + Bw x ⇔ (Id + M)w − (L + M)x = a* + b* ⇔ (Id + M)w = (L + M)x + a* + b* ⇔ w = JM((L + M)x + a* + b*) = JM(A + B)x. The conclusion thus follows from Proposition 11.10. ∎

It is nice to recover a special case of our original motivation given in Section 11.2:

Example 11.30 (classical least squares solutions). Suppose that X = R^n, let M ∈ R^{n×n} be such that M + M^T is positive semidefinite, and let b ∈ R^n. Suppose that (∀x ∈ R^n) Ax = Mx and Bx = −b, so that the original problem is to find x ∈ R^n such that Mx = b. Then v(A,B) = Pran M(b) − b, and the normal solutions are precisely the least squares solutions.

Proof. We will use Proposition 11.29. The constraint in (11.42) turns into (Id + 0)w − (M + 0)x = 0 + (−b), i.e., w = Mx − b, so that the optimization problem in (11.42) is

minimize ‖Mx − b‖^2. (11.43)

Hence the normal solutions in our sense are precisely the classical least squares solutions. Furthermore, v = Pran(A+B)(0) = P−b+ran M(0) = Pran M(b) − b. ∎

Chapter 12

On the range of the Douglas–Rachford operator: Theory

12.1 Overview

As seen previously, the Douglas–Rachford splitting method is one of the most popular methods for solving the problem of finding a zero of the sum of two maximally monotone operators. Surprisingly, little is known about the range of the Douglas–Rachford operator.

The main goal of this chapter is to analyze the range of T(A,B) and of the displacement mapping Id − T(A,B). Recall that (see Corollary 5.5)

ran(Id − T(A,B)) ⊆ (dom A − dom B) ∩ (ran A + ran B). (12.1)

It is natural to inquire whether this is a mere inclusion or an equality.
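Before answering this, it is instructive to observe the displacement mapping numerically. The following minimal sketch revisits Example 11.30 with a concrete singular matrix M and vector b (our own illustrative choices, not from the text): iterating the operator T of (11.14) and recording x − Tx recovers the minimal displacement vector v = Pran M(b) − b.

```python
import numpy as np

# Example 11.30 with made-up data: A = M, B = the constant operator
# x -> -b, so the original problem Mx = b is infeasible (b not in ran M).
M = np.array([[1.0, 0.0],
              [0.0, 0.0]])          # ran M is the first coordinate axis
b = np.array([1.0, 1.0])

J_A = np.linalg.inv(np.eye(2) + M)  # resolvent of A = M
R_A = 2.0 * J_A - np.eye(2)

def T(x):
    # Douglas-Rachford operator (11.14); J_B y = y + b since B is constant.
    return (R_A @ x + b) + x - J_A @ x

# Predicted minimal displacement vector: v = P_{ran M}(b) - b = (0, -1).
v = np.array([0.0, -1.0])

x = np.array([5.0, -3.0])           # arbitrary starting point
for _ in range(60):
    x = T(x)
assert np.allclose(x - T(x), v, atol=1e-8)
```

Consistent with Example 11.30, the least squares solutions of Mx = b here are the points (1, t) with t ∈ R, and the first coordinate of the shadow J_A x tends to the least squares value 1 along the iteration.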
In general, this inclusion is strict — sometimes even extremely so, in the sense that ran(Id − T(A,B)) may be a singleton while (dom A − dom B) ∩ (ran A + ran B) may be the entire space; see Example 12.8. This likely has discouraged efforts to obtain a better description of these ranges. However, and somewhat surprisingly, we are able to obtain — under fairly mild assumptions on A and B — simple yet elegant formulae for the ranges of Id − T(A,B) and T(A,B).

Our main findings are summarized as follows:

• We prove that for 3* monotone operators a very pleasing formula can be found that reveals the range to be nearly equal to a simple set involving the domains and ranges of the underlying operators. More precisely, we obtain the simple and elegant formulae (see Theorem 12.10 and Corollary 12.11)

ran(Id − T(A,B)) ≈ (dom A − dom B) ∩ (ran A + ran B) (12.2)

and

ran T(A,B) ≈ (dom A − ran B) ∩ (ran A + dom B), (12.3)

where "≈" denotes the near equality discussed in Definition 3.32.

• Using the correspondence between maximally monotone operators and firmly nonexpansive mappings (see Fact 3.12), we are able to reformulate our results for firmly nonexpansive mappings (see Corollary 13.7).

• We provide some partial results when X is possibly infinite-dimensional (see Section 12.5).

Except for facts with explicit references, all the new results in this chapter are published in [38].

12.2 Near convexity and near equality

In this section we review and extend the notions of near convexity (see Definition 3.30) and near equality (see Definition 3.32), and we also present some new results. Unless otherwise stated, throughout this chapter,

X is a finite-dimensional real Hilbert space.

Remark 12.1. In fact, it is the calculus of relative interiors of nearly convex sets, along with the finite-dimensional version of the Brezis–Haraux theorem (see [4, Section 6.5]), that prompted us to work in a finite-dimensional space.
Moreover, we are unaware of any generalization of the relative interior that would be always nonempty for nonempty closed convex subsets of a general infinite-dimensional Hilbert space.

Fact 12.2. Let C and D be nearly convex subsets of X. Then

C ≈ D ⇔ cl C = cl D. (12.4)

Proof. See [29, Proposition 2.12(i)&(ii)]. ∎

Fact 12.3. Let (Ci)i∈I be a finite family of nearly convex subsets of X, and let (λi)i∈I be a finite family of real numbers. Then Σi∈I λi Ci is nearly convex and ri(Σi∈I λi Ci) = Σi∈I λi ri Ci.

Proof. See [29, Lemma 2.13]. ∎

Fact 12.4. Let (Ci)i∈I be a finite family of nearly convex subsets of X, and let (Di)i∈I be a family of subsets of X such that Ci ≈ Di for every i ∈ I. Then Σi∈I Ci ≈ Σi∈I Di.

Proof. See [29, Theorem 2.14]. ∎

Fact 12.5. Let (Ci)i∈I be a finite family of convex subsets of X. Suppose that ∩i∈I ri Ci ≠ ∅. Then cl(∩i∈I Ci) = ∩i∈I cl Ci and ri(∩i∈I Ci) = ∩i∈I ri Ci.

Proof. See [123, Theorem 6.5]. ∎

Most of the following results are known. For different proofs, see also [49, Theorem 2.1], [106] and [107].

Lemma 12.6. Let C and D be nearly convex subsets of X such that ri C ∩ ri D ≠ ∅. Then the following hold:

(i) C ∩ D is nearly convex.
(ii) C ∩ D ≈ ri C ∩ ri D.
(iii) ri(C ∩ D) = ri C ∩ ri D.
(iv) cl(C ∩ D) = cl C ∩ cl D.

Proof. (i): Since C and D are nearly convex, by Fact 3.33, ri C and ri D are convex. Consequently,

ri C ∩ ri D is convex, (12.5)

and clearly

ri C ∩ ri D ⊆ C ∩ D. (12.6)

By Fact 3.33 we have ri C ≈ C and ri D ≈ D. Hence ri(ri C) = ri C and ri(ri D) = ri D. Therefore,

ri(ri C) ∩ ri(ri D) = ri C ∩ ri D ≠ ∅. (12.7)

Using (12.7) and Fact 12.5 applied to the convex sets ri C and ri D yields cl(ri C ∩ ri D) = cl(ri C) ∩ cl(ri D); hence

cl(ri C) ∩ cl(ri D) = cl(ri C ∩ ri D). (12.8)

Since ri C ≈ C and ri D ≈ D by Fact 3.33, we have cl(ri C) = cl C and cl(ri D) = cl D. Combining with (12.6) and (12.8), we obtain

ri C ∩ ri D ⊆ C ∩ D ⊆ cl C ∩ cl D = cl(ri C) ∩ cl(ri D) = cl(ri C ∩ ri D), (12.9)

which in turn yields (i) in view of (12.5).
(ii): Use (i) and Fact 3.33 applied to the convex set ri C ∩ ri D and the nearly convex set C ∩ D. (iii): Using (ii), Fact 12.5 applied to the convex sets ri C and ri D, and (12.7), we have

ri(C ∩ D) = ri(ri C ∩ ri D) = ri(ri C) ∩ ri(ri D) = ri C ∩ ri D, (12.10)

as required. (iv): Since C and D are nearly convex, it follows from Fact 3.33 that cl(ri C) = cl C and cl(ri D) = cl D. Combining with (ii) and (12.8), we have

cl(C ∩ D) = cl(ri C ∩ ri D) = cl(ri C) ∩ cl(ri D) = cl C ∩ cl D, (12.11)

as claimed. ∎

Corollary 12.7. Let C1 and C2 be nearly convex subsets of X, and let D1 and D2 be subsets of X such that C1 ≈ D1 and C2 ≈ D2. Suppose that ri C1 ∩ ri C2 ≠ ∅. Then

C1 ∩ C2 ≈ D1 ∩ D2. (12.12)

Proof. Let i ∈ {1, 2}. Since Ci ≈ Di, by Definition 3.32 we have

ri Ci = ri Di and cl Ci = cl Di. (12.13)

Hence

ri D1 ∩ ri D2 = ri C1 ∩ ri C2 ≠ ∅. (12.14)

Moreover, since Ci is nearly convex, it follows from Fact 3.33 that cl(ri Ci) = cl Ci and that ri Ci is convex. Therefore,

ri Ci = ri Di ⊆ Di ⊆ cl Di = cl Ci = cl(ri Ci). (12.15)

Hence Di is nearly convex. Applying Lemma 12.6(iii) to the two sets C1 and C2 implies that

ri(C1 ∩ C2) = ri C1 ∩ ri C2. (12.16)

Similarly, we have

ri(D1 ∩ D2) = ri D1 ∩ ri D2. (12.17)

Using (12.14) and Lemma 12.6(i) applied to the sets C1 and C2, we see that C1 ∩ C2 is nearly convex. Similarly, D1 ∩ D2 is nearly convex. By Fact 3.33 we have

C1 ∩ C2 ≈ ri(C1 ∩ C2) and D1 ∩ D2 ≈ ri(D1 ∩ D2). (12.18)

Hence cl(C1 ∩ C2) = cl ri(C1 ∩ C2) and cl(D1 ∩ D2) = cl ri(D1 ∩ D2). Combining with (12.16), (12.14) and (12.17) yields

cl(C1 ∩ C2) = cl ri(C1 ∩ C2) = cl(ri C1 ∩ ri C2) = cl(ri D1 ∩ ri D2) = cl ri(D1 ∩ D2) = cl(D1 ∩ D2). (12.19)

Now Fact 12.2, applied to the nearly convex sets C1 ∩ C2 and D1 ∩ D2, implies that C1 ∩ C2 ≈ D1 ∩ D2. ∎

12.3 Attouch–Théra duality and the normal problem

This section provides a review of the Attouch–Théra duality [3] (see also the discussion in Chapter 4 as well as [66], [73], [118], and [121]) and the associated normal problem. From now on, throughout the rest of this chapter, we assume that
From now on, throughout the rest of thischapter, we assume thatA : X ⇒ X and B : X ⇒ X are maximally monotone,and that T = T(A,B) refers to the Douglas–Rachford splitting operator for twooperators A and B, defined in (5.2).For convenience of the reader, let us recall that (see Corollary 5.12) T isself-dual, i.e.,T(A−1,B−>) = T(A,B). (12.20)Also recall (see Corollary 5.5) thatran(Id−T(A,B)) ={a− b ∣∣ (a, a∗) ∈ gr A, (b, b∗) ∈ gr B, a− b = a∗ + b∗}(12.21)⊆ (dom A− dom B) ∩ (ran A + ran B). (12.22)Let w ∈ X be fixed. For the operator A, the inner and outer shiftsassociated with A are defined by (see Definition 11.1)Aw : X ⇒ X : x 7→ A(x− w), (12.23)14 See the discussion in Chapter 4 as well as [66], [73], [118], and [121].14412.4. Main resultsw A : X ⇒ X : x 7→ Ax− w. (12.24)Notice that Aw and w A are maximally monotone, with dom Aw = dom A+w and dom w A = dom A.Finally, recall that Proposition 11.18 implies‖v(A,B)‖ = ‖v(B,A)‖. (12.25)We now explore how ran(Id−T) is related to the set (dom A −dom B) ∩ (ran A + ran B). We will prove that when the operators A and Bare “sufficiently nice”, we haveran(Id−T(A,B)) ' (dom A− dom B) ∩ (ran A + ran B). (12.26)In general, (12.26) may fail spectacularly as we will now illustrate.Example 12.8. Suppose that X = R2, and thatA =(0 −11 0)and B = −A =(0 1−1 0). (12.27)Thenran(Id−T(A,B)) = {0} $ R2 (12.28)= (dom A− dom B) ∩ (ran A + ran B). (12.29)Proof. Recall that dom A = dom B = ran A = ran B = R2, consequently(dom A− dom B) ∩ (ran A + ran B) = R2. On the other hand, one checksthat RA : (x, y) 7→ (y,−x) = B and RB : (x, y) 7→ (−y, x) = A. HenceRBRA = Id and therefore Id−T(A,B) = 12 (Id−RBRA) ≡ 0. 12.4 Main resultsIn this section we present the main results of this chapter. Keeping thenotation of Section 12.3, we set additionallyD = D(A,B) = dom A− dom B and R = R(A,B) = ran A + ran B.(12.30)We start by proving some auxiliary results.Lemma 12.9. The following hold:14512.4. 
Main results(i) The sets D and R are nearly convex.(ii) ri D ∩ ri R 6= ∅.(iii) D ∩ R is nearly convex.(iv) ri(D ∩ R) = ri D ∩ ri R .(v) D ∩ R = ri D ∩ ri R.(vi) D ∩ R = D ∩ R.Proof. (i): Combine Fact 3.31 and Fact 12.3. (ii): Since B is maximallymonotone, the Minty parametrization (3.20) implies that X = dom B +ran B. Hence by (i) and Fact 12.30 ∈ X = ri X = ri(ran A + ran B− (dom A− dom B)) = ri R− ri D.(12.31)Hence, ri D ∩ ri R 6= ∅, as claimed. (Note that we did not use the maximalmonotonicity of A in this proof.) (iii): Combine (i), (ii) and Lemma 12.6(i).(iv): Combine (i), (ii) and Lemma 12.6(iii). (v): Combine Lemma 12.6(ii), (i)and (ii). (vi): Combine (i), (ii) and Lemma 12.6(iv). Theorem 12.10. Suppose that A and B satisfy one of the following:(i) (∀w ∈ ri D ∩ ri R) ri(ran A + ran B) ⊆ ri ran (A + Bw) .(ii) A and B are 3∗ monotone.(iii) dom B + ri D ∩ ri R ⊆ dom A and A is 3∗ monotone.Thenran(Id−T(A,B)) ' D ∩ R. (12.32)Furthermore, the following implications hold:(∃C ∈ {A, B}) dom C = X and C is 3∗ monotone⇒ ran(Id−T(A,B)) ' R,(12.33)(∃C ∈ {A, B}) ran C = X and C is 3∗ monotone⇒ ran(Id−T(A,B)) ' D,(12.34)andri(D ∩ R) = D ∩ R ⇒ ran(Id−T(A,B)) = D ∩ R. (12.35)14612.4. Main resultsProof. First we show that(∀w ∈ ri D) A + Bw is maximally monotone. (12.36)Notice that (∀w ∈ X)dom Bw = dom B + w. Let w ∈ ri D = ri(dom A−dom B). Then ri dom A ∩ ri dom Bw 6= ∅. Using Fact 3.34, we concludethat A + Bw is maximally monotone, which proves (12.36). Now, supposethat (i) holds. Then (∀w ∈ ri D ∩ ri R) w ∈ ri ran(A + Bw) ⊆ ran(A + Bw).Combining with Proposition 11.10 we conclude that (∀w ∈ ri D ∩ ri R) w ∈ran(Id−T(A,B)). Henceri D ∩ ri R ⊆ ran(Id−T(A,B)). (12.37)It follows from Lemma 12.9(v) that ri D ∩ ri R = D ∩ R. Altogether,D ∩ R ⊆ ran(Id−T(A,B)). (12.38)It follows from (12.22) that ran(Id−T(A,B)) ⊆ D ∩ R. Therefore,D ∩ R = ran(Id−T(A,B)). 
(12.39)Since T(A,B) is firmly nonexpansive, hence nonexpansive, it follows from[12, Example 20.26] that Id−T(A,B) is maximally monotone, and thereforeran(Id−T(A,B)) is nearly convex by Fact 3.31. Using Lemma 12.9(ii)&(iii),we know that ri D ∩ ri R 6= ∅ and D ∩ R is nearly convex. Therefore,using (12.39) and Fact 12.2 applied to the nearly convex sets D ∩ R andran(Id−T(A,B)), we get ran(Id−T(A,B)) ' D ∩ R. Now we show thateach of the conditions (ii) and (iii) imply (i). Let w ∈ ri D ∩ ri R, andnotice that (ii) implies that Bw is 3∗ monotone, whereas (iii) implies thatdom Bw = dom B + w ⊆ dom A. Using (12.36) and Fact 3.35 applied to Aand Bw we have (∀w ∈ ri D ∩ ri R)w ∈ ri R = ri(ran A + ran B)= ri(ran A + ran Bw) = ri ran(A + Bw).(12.40)That is, (i) holds, and consequently (12.32) holds.We now turn to the implication (12.33). Observe first that D = X.If A is 3∗ monotone and dom A = X, then clearly (iii) holds. Thus, itremains to consider the case when B is 3∗ monotone and dom B = X.Then Bw is 3∗ monotone and dom A ⊆ X = dom Bw. As before, we obtainw ∈ ri R = ri(ran A+ ran B) = ri(ran A+ ran Bw) = ri ran(A+ Bw). Hence14712.4. Main results(i) holds, which completes the proof of (12.33). To prove the implication(12.34), first notice that (∃C ∈ {A, B}) ran C = X and C is 3∗ monotone⇔ (∃C ∈ {A−1, B−>}) dom C = X and C is 3∗ monotone. Thereforeusing Remark 11.20 and (12.33) applied to the operators A−1 and B−>(∃C ∈ {A, B}) ran C = X and C is 3∗ monotone ⇒ ran(Id−T(A,B)) =ran(Id−T(A−1,B−>) = R(A−1,B−>) = ran A−1 + ran B−> = dom A −dom B = D, which proves (12.34).Now suppose that ri(D∩R) = D∩R. It follows from (12.32) and (12.22)thatD ∩ R = ri(D ∩ R) = ri ran(Id−T(A,B)) ⊆ ran(Id−T(A,B)) ⊆ D ∩ R.(12.41)Hence all the inclusions become equalities, which proves (12.35). Corollary 12.11 (range of the Douglas–Rachford operator). 
Suppose that A and B satisfy one of the following:

(i) (∀w ∈ ri D(A,B^{-1}) ∩ ri R(A,B^{-1})) ri(ran A + dom B) ⊆ ri ran(A + (B^{-1})_w).

(ii) A and B are 3∗ monotone.

(iii) ran B + ri D(A,B^{-1}) ∩ ri R(A,B^{-1}) ⊆ dom A and A is 3∗ monotone.

Then

ran T(A,B) ≃ (dom A − ran B) ∩ (ran A + dom B). (12.42)

Furthermore, the following implications hold:

(∃C ∈ {A, B^{-1}}) dom C = X and C is 3∗ monotone ⇒ ran T(A,B) ≃ ran A + dom B, (12.43)

and

ri(D(A,B^{-1}) ∩ R(A,B^{-1})) = D(A,B^{-1}) ∩ R(A,B^{-1}) ⇒ ran T(A,B) = (dom A − ran B) ∩ (ran A + dom B). (12.44)

Proof. Using Proposition 5.3, we know that T(A,B) = Id − T(A,B^{-1}). The result thus follows by applying Theorem 12.10 to (A, B^{-1}). ∎

The assumptions in Theorem 12.10 are critical. Example 12.8 shows that when neither A nor B is 3∗ monotone, the conclusion of the theorem fails. Now we show that the conclusion of Theorem 12.10 fails even if one of the operators is a subdifferential operator.

Example 12.12. Suppose that X = R^2, set C = R × {0}, and suppose that

A = (0 −1; 1 0) and B = N_C. (12.45)

Then Id − T(A,B) = J_A − P_C R_A. Notice that P_C : (x, y) ↦ (x, 0), J_A : (x, y) ↦ ½(x + y, −x + y) and consequently R_A : (x, y) ↦ (y, −x). Hence

ran(Id − T(A,B)) = R · (1, −1) ⊊ R^2 = (dom A − dom B) ∩ (ran A + ran B). (12.46)

Corollary 12.13. Suppose that A and B satisfy one of the following:

(i) (∀w ∈ ri D ∩ ri R) ri(ran A + ran B) ⊆ ri ran(A + B_w).

(ii) A and B are 3∗ monotone.

(iii) dom B + ri D ∩ ri R ⊆ dom A and A is 3∗ monotone.

(iv) (∃C ∈ {A, B}) dom C = X and C is 3∗ monotone.

Furthermore, suppose that D and R are affine subspaces. Then ran(Id − T(A,B)) = D ∩ R.

Proof. Since ri D = D and ri R = R, Lemma 12.9(iv) yields D ∩ R = ri D ∩ ri R = ri(D ∩ R). Now apply (12.35). ∎

Corollary 12.14. Suppose that X = R. Then ran(Id − T(A,B)) ≃ D ∩ R.

Proof. Indeed, it follows from e.g. [12, Corollary 22.19] and Fact 3.27 that A and B are 3∗ monotone. Now apply Theorem 12.10(ii). ∎

We now construct an example where ran(Id − T(A,B)) properly lies between ri(D ∩ R) and D ∩ R. This illustrates that Theorem 12.10 is optimal in the sense that near equality cannot be replaced by actual equality.

Example 12.15. Suppose that dim X ≥ 2, let u and v be in X with u ≠ v, let r and s be in ]0, +∞[, set U = ball(u; r) and V = ball(v; s), and suppose that A = N_U and B = N_V. Then D ∩ R = ball(u − v; r + s) and

ran(Id − T(A,B)) = int ball(u − v; r + s) ∪ {(1 − (r+s)/‖u−v‖)(u − v), (1 + (r+s)/‖u−v‖)(u − v)}; (12.47)

consequently,

ri(D ∩ R) ⊊ ran(Id − T(A,B)) ⊊ D ∩ R. (12.48)

Moreover,

v(A,B) = max{‖u − v‖ − (r + s), 0} · (u − v)/‖u − v‖. (12.49)

Proof. It follows from Fact 3.27 that A and B are 3∗ monotone. Using Fact 3.8, we have ran A = ran B = X, hence R = X and D ∩ R = D = U − V. First notice that

D ∩ R = D = U − V = ball(u − v; r + s). (12.50)

We claim that

(∀w ∈ D \ ri D) U ∩ (V + w) is a singleton. (12.51)

Since D = U − V, we have (∀w ∈ D) U ∩ (V + w) ≠ ∅. Now let w ∈ D \ ri D and assume to the contrary that {y, z} ⊆ U ∩ (V + w) with y ≠ z. Then {y − w, z − w} ⊆ V, and (∀λ ∈ ]0, 1[)

λy + (1 − λ)z ∈ int U and λy + (1 − λ)z − w ∈ int V. (12.52)

It follows from Fact 12.3 and the above inclusions that w ∈ int U − int V = ri U − ri V = ri D, which is absurd. Therefore (12.51) holds. Now, let w ∈ D \ ri D and notice that V + w = ball(v + w; s). Using (12.51) we see that U ∩ (V + w) = dom(A + B_w) is a singleton. Consequently, ran(A + B_w) is the line passing through the origin parallel to the line passing through u and v + w, and by Proposition 11.10, we have w ∈ ran(Id − T(A,B)) ⇔ w ∈ ran(A + B_w) ⇔ w = λ(u − v − w) for some λ ∈ R \ {−1} ⇔ w = (λ/(1+λ))(u − v) with λ ∈ R \ {−1} (since u ≠ v), or equivalently,

w = α(u − v), where α ∈ R \ {1}. (12.53)

Finally notice that w is on the boundary of U − V. Therefore, using (12.50) and (12.53) we must have ‖w − (u − v)‖ = r + s ⇔ |α − 1| ‖u − v‖ = r + s ⇔ α = 1 ± (r+s)/‖u−v‖, which means that only two points on the boundary of D are included in ran(Id − T(A,B)). Moreover, if ‖u − v‖ ≤ r + s, then 0 ∈ ball(u − v; r + s), hence v(A,B) = 0. Otherwise ‖u − v‖ > r + s and, using [12, Proposition 28.10], we get v(A,B) = (1 − (r+s)/‖u−v‖)(u − v), which completes the proof. ∎

Figure 12.1: A GeoGebra [78] snapshot that illustrates Example 12.15.

12.5 Some infinite-dimensional observations

In this section, we provide some results that remain true in a possibly infinite-dimensional setting. We also provide various examples and counterexamples. We assume henceforth that

H is a (possibly infinite-dimensional) real Hilbert space. (12.54)

A pleasing identity arises when we are dealing with normal cone operators of closed subspaces.

Proposition 12.16. Let U and V be closed linear subspaces of H, and suppose that A = N_U and B = N_V. Then ran(Id − T(A,B)) = (U + V) ∩ (U^⊥ + V^⊥).

Proof. Since gr N_U = U × U^⊥ and gr N_V = V × V^⊥, the result follows from (12.21). ∎

Proposition 12.17. Let U and V be closed linear subspaces of H such that

U^⊥ ∩ V = {0}, (12.55)

and suppose that A = N_U and B = P_V. Then the following hold:

(i) U^⊥ ∩ P_V^{-1}(P_V(U^⊥) \ P_V(U)) ⊆ (D ∩ R) \ ran(Id − T(A,B)).

(ii) (U^⊥ + V) ∩ (V + P_V^{-1}(P_V(U^⊥) \ P_V(U))) ⊆ (ran A + ran B) \ ran(A + B).

Consequently, if P_V(U^⊥) \ P_V(U) ≠ ∅, then

ran(Id − T(A,B)) ⊊ D ∩ R (12.56)

and

ran(A + B) ⊊ ran A + ran B. (12.57)

Proof. It is clear that D ∩ R = (U − H) ∩ (U^⊥ + V) = U^⊥ + V. Notice that (i) and (ii) trivially hold when P_V(U^⊥) \ P_V(U) = ∅. Now suppose that P_V(U^⊥) \ P_V(U) ≠ ∅. It follows from (11.22) and (12.55) that (∀w ∈ U^⊥ ⊆ U^⊥ + V)

Z_w ≠ ∅ ⇔ (∃u ∈ U) such that w ∈ N_U u + P_V(u − w)
⇔ (∃u ∈ U) P_V u − P_V w ∈ U^⊥ − w = U^⊥
⇔ (∃u ∈ U) P_V u − P_V w = 0
⇔ P_V w ∈ P_V(U). (12.58)

Now let w ∈ U^⊥ ⊆ U^⊥ + V = D ∩ R be such that P_V w ∉ P_V(U). Then (12.58) implies that Z_w = ∅; hence, by Corollary 11.11, w ∉ ran(Id − T(A,B)), which proves (i) and consequently (12.56). To complete the proof we need to show that (ii) holds. Notice that (∀u^⊥ ∈ U^⊥) u^⊥ + P_V u^⊥ ∈ U^⊥ + V = ran A + ran B.
It follows from (12.55) thatu⊥ + PV u⊥ ∈ ran(A + B)⇔(∃u ∈ U = dom(A + B)) u⊥ + PV u⊥∈ U⊥ + PV u⇔(∃u ∈ U)PV u⊥ − PV u ∈ U⊥ − u⊥ = U⊥⇔(∃u ∈ U)PV u⊥ = PV u. (12.59)Now, let u⊥ ∈ U⊥ such that PV u⊥ 6∈ PV(U). Then using (12.59)w = u⊥ + PV u⊥ 6∈ ran(A + B). Notice that by constructionw ∈ U⊥ +V = ran A + ran B. Remark 12.18. Notice that in Proposition 12.17 both A and B are linear relations,maximally monotone and 3∗ monotone operators. Consequently, the sets D and Rare linear subspaces of H. When H is finite-dimensional, Corollary 12.13 and [51,footnote on page 174] imply that ran(Id−T) = D ∩ R and ran(A + B) =ran A + ran B. Thus, if (12.56) or (12.57) holds, then H is necessarily infinite-dimensional.We now provide a concrete example in `2(N) where both (12.56)and (12.57) hold. This illustrates again the requirement of the closure inFact 3.35.15212.5. Some infinite-dimensional observationsProposition 12.19. Suppose that H = `2(N), let p ∈ R++, and let (αn)n∈N bea sequence in R++ such thatαn → 0,∞∑n=0α2p−2n < +∞ and∞∑n=0α2p−4n = +∞. (12.60)SetU ={x = (xn)n∈N ∈ H∣∣ x2n+1 = −αnx2n} (12.61)andV ={x = (xn)n∈N ∈ H∣∣ x2n = 0}, (12.62)and suppose that A = NU and B = PV . Then PV(U⊥)r PV(U) 6= ∅ and henceran(Id−T(A,B)) $ D ∩ R and ran(A + B) $ ran A + ran B. (12.63)Proof. It is easy to check that U⊥ ={x = (xn) ∈ H∣∣ x2n+1 = α−1n x2n}.Hence U⊥ ∩ V = {0}. Let w ∈ H be defined as (∀n ∈ N) w2n = αpnand w2n+1 = αp−1n . Clearly w ∈ U⊥. We claim that PV w 6∈ PV U.Suppose this is not true. Then (∃u ∈ U) such that PV w = PV u.Hence (∀n ∈ N) u2n+1 = (PV u)2n+1 = (PV w)2n+1 = w2n+1 = αp−1n .Consequently, (∀n ∈ N) u2n = −αp−2n , which is absurd since itimplies that ∑∞n=0 u22n = ∑∞n=0 α2p−4n = +∞, by (12.60). Therefore,PV(U⊥)r PV(U) 6= ∅. Using Proposition 12.17 we conclude that (12.63)holds. The next example is a special case of Proposition 12.19.Example 12.20. Suppose that H = `2(N), let (αn)n∈N = (1/(n + 1))n∈N,let p ∈ ] 32 , 52] and let U, V, A and B be as defined in Proposition 12.19. 
Since2p− 2 > 1 and 2p− 4 ≤ 1, we see that (12.60) holds. From Proposition 12.19we conclude that ran(Id−T(A,B)) $ D ∩ R and ran(A + B) $ ran A + ran B.When A or B has additional structure, it may be possible to traversebetween ran(A + B) and ran(Id−T(A,B)) as we illustrate now.Proposition 12.21. Let A : H ⇒ H and B : H ⇒ H be maximally monotone.Then the following hold:(i) If B : H → H is linear, then ran(Id−T(A,B)) = JB(ran(A + B)) andran(A + B) = (Id+B) ran(Id−T(A,B)).15312.5. Some infinite-dimensional observations(ii) If A : H → H is linear and Id− A is invertible, then ran(Id−T(A,B)) =(Id−A)−1(ran(A + B)) and ran(A + B) = (Id−A) ran(Id−T(A,B)).(iii) If A : H → H and B : H → H are linear and A∗ = −A, then (∀λ ∈[0, 1]) ran(Id−T(A,B)) = JλA∗+(1−λ)B(ran(A + B)).Proof. Let w ∈ X. It follows from Proposition 11.10 thatw ∈ ran(Id−T(A,B))⇔ (∃x ∈ H) such that w ∈ Ax + B(x− w).(12.64)(i): It follows from (12.64) that w ∈ ran(Id−T(A,B)) ⇔ (∃x ∈ H) suchthat w ∈ Ax + Bx − Bw ⇔ (∃x ∈ H) (Id+B)w = w + Bw ∈ (A + B)x⇔ (∃x ∈ H) w ∈ JB((A + B)x) ⇔ w ∈ JB(ran(A + B)). Using [27,Theorem 2.1(ii)&(iv)] we learn that JB is a bijection, hence invertible, andran(A + B) = J−1B ran(Id−T(A,B)) = (Id+B) ran(Id−T(A,B)), as claimed.(ii): It follows from (12.64) that w ∈ ran(Id−T(A,B)) ⇔ (∃x ∈ H) suchthat w − Aw ∈ A(x − w) + B(x − w) = (A + B)(x − w) ⇔ (∃x ∈H) (Id−A)w ∈ (A + B)(x− w)⇔ (∃x ∈ H) w ∈ (Id−A)−1((A + B)(x−w))⇔ w ∈ (Id−A)−1(ran(A+ B)). Since Id−A is invertible, we learn thatId−A is a bijection and ran(A+ B) = (Id−A) ran(Id−T(A,B)), as claimed.(iii): It follows from (12.64) thatw ∈ ran(Id−T(A,B))⇔(∃x ∈ H) such that (∀λ ∈ [0, 1]) w− λAw + (1− λ)Bw= A(x− λw) + B(x− w + (1− λ)w) ∈ (A + B)(x− λw) (12.65a)⇔(Id+λA∗ + (1− λ)B)w ∈ ran(A + B) (12.65b)⇔w ∈ JλA∗+(1−λ)B(ran(A + B)). 
(12.65c)

∎

Chapter 13

On the range of the Douglas–Rachford operator: Applications

13.1 Overview

In this chapter we present some applications of the results in Chapter 12. As in Chapter 12, we assume that

X is a finite-dimensional real Hilbert space.

We also follow the same notation as in Chapter 12. Our main results can be summarized as follows:

• We apply the results of Chapter 12 to obtain more accurate formulae for the minimal displacement vector (see Proposition 13.1).

• When A = ∂f and B = ∂g are subdifferential operators, which is the key setting in convex optimization, formulae (12.2) and (12.3) can be written as (see Corollary 13.4)

ran(Id − T(∂f,∂g)) ≃ (dom f − dom g) ∩ (dom f* + dom g*) (13.1)

and

ran T(∂f,∂g) ≃ (dom f − dom g*) ∩ (dom f* + dom g). (13.2)

• Using the correspondence between maximally monotone operators and firmly nonexpansive mappings (see Fact 3.12), we are able to reformulate our results for firmly nonexpansive mappings (see Corollary 13.7).

Except for facts with explicit references, all the new results in this chapter are published in [38].

13.2 On the minimal displacement vector v(A,B)

In this section, we focus on v(A,B).

Proposition 13.1. Suppose that A and B satisfy one of the following:

(i) (∀w ∈ ri D ∩ ri R) ri(ran A + ran B) ⊆ ri ran(A + B_w).

(ii) A and B are 3∗ monotone.

(iii) dom B + ri D ∩ ri R ⊆ dom A and A is 3∗ monotone.

(iv) (∃C ∈ {A, B}) dom C = X and C is 3∗ monotone.

Then cl ran(Id − T(A,B)) = cl(D ∩ R) = cl D ∩ cl R and v(A,B) = P_{cl D ∩ cl R}(0).

Proof. Combine Theorem 12.10, Lemma 12.9(vi), and (11.26). ∎

Using the symmetric hypotheses of Theorem 12.10, we obtain the following result:

Lemma 13.2. Suppose that both A and B are 3∗ monotone, or that (∃C ∈ {A, B}) such that dom C = X and C is 3∗ monotone.
Then the following hold:(i) If D is a linear subspace of X, then ran(Id−T(A,B)) ' ran(Id−T(B,A))and v(A,B) = v(B,A).(ii) If R is a linear subspace of X, then ran(Id−T(A,B)) ' − ran(Id−T(B,A))and v(A,B) = −v(B,A).(iii) If dom A = X or dom B = X, then ran(Id−T(A,B)) 'ran(Id−T(B,A)) ' R, and v(A,B) = v(B,A) = PR(0).(iv) If dom A or dom B is bounded, then ran(Id−T(A,B)) ' − ran(Id−T(B,A))' D, and v(A,B) = −v(B,A) = PD(0).Proof. Observe first thatran(Id−T(B,A)) ' (−D) ∩ R (13.3)by (12.30). (i): Since D = −D, Theorem 12.10 and (13.3) yieldran(Id−T(A,B)) ' ran(Id−T(B,A)), (13.4)and the conclusion follows from (11.26). (ii): Let u ∈ X. Since R = −R,we obtain the equivalences u ∈ D ∩ R ⇔ −u ∈ −D and −u ∈ R15613.3. Applications to subdifferential operators⇔ −u ∈ (−D) ∩ R⇔ u ∈ −((−D) ∩ R). Hence, D ∩ R = −((−D) ∩ R).Consequently, D ∩ R = −((−D) ∩ R) and ri(D ∩ R) = − ri ((−D) ∩ R).Applying Theorem 12.10, in view of (13.3), to the pair (A, B) and thepair (B, A), we conclude ran(Id−T(A,B)) ' − ran(Id−T(B,A)). Thusran(Id−T(A,B)) = −ran(Id−T(B,A)), and the result follows from (11.26).(iii): The hypothesis implies D = X = X = D. Now combine with (i)and Proposition 13.1(ii). (iv): The assumption that either dom A or dom Bis bounded implies that ran A = X (respectively ran B = X) (see [12,Corollary 21.21]). Hence R = X = X = R. Now combine with (ii) andProposition 13.1(ii). Example 13.3. Suppose that X = R. It follows from Proposition 11.18 thatv(A,B) = ±v(B,A).13.3 Applications to subdifferential operatorsWe now turn to subdifferential operators.Corollary 13.4. Let f : X → ]−∞,+∞] and g : X → ]−∞,+∞] be properlower semicontinuous convex functions. Then the following hold:(i) ran(Id−T(∂ f ,∂g)) ' (dom f − dom g) ∩ (dom f ∗ + dom g∗).(ii) ran T(∂ f ,∂g) ' (dom f − dom g∗) ∩ (dom f ∗ + dom g).Proof. It is well known that (see [12, Corollary 16.29]) dom f = dom ∂ f .Since f is convex, so is dom f . Moreover, by Fact 3.31 dom ∂ f is nearlyconvex. 
Therefore, applying Fact 12.2 to the sets dom f and dom ∂ f weconclude that dom ∂ f ' dom f . Using [12, Proposition 16.24], and the pre-vious conclusion applied to f ∗, we have ran ∂ f = dom(∂ f )−1 = dom ∂ f ∗ 'dom f ∗. Altogether,dom ∂ f ' dom f and ran ∂ f ' dom f ∗. (13.5)Applying Fact 12.4 with C1 = dom f , C2 = −dom g, D1 = dom ∂ f , D2 =−dom ∂g, we conclude thatdom ∂ f − dom ∂g ' dom f − dom g. (13.6)One shows similarly thatran ∂ f + ran ∂g ' dom f ∗ + dom g∗. (13.7)15713.3. Applications to subdifferential operatorsIt follows from the maximal monotonicity of ∂ f and ∂g and Lemma 12.9(ii)that (dom ∂ f − dom ∂g) ∩ (ran ∂ f + ran ∂g) 6= ∅. Applying Corollary 12.7with C1 = dom ∂ f − dom ∂g, C2 = ran ∂ f + ran ∂g, D1 = dom f − dom g,and D2 = dom f ∗ + dom g∗, we conclude that(dom ∂ f − dom ∂g) ∩ (ran ∂ f + ran ∂g)' (dom f − dom g) ∩ (dom f ∗ + dom g∗). (13.8)To complete the proof, notice that by Fact 3.27 ∂ f and ∂g are 3∗ monotoneoperators. Therefore, by Theorem 12.10, we haveran(Id−T(∂ f ,∂g)) ' (dom ∂ f − dom ∂g) ∩ (ran ∂ f + ran ∂g). (13.9)Combining (13.8) and (13.9) we conclude that (i) holds true. To prove (ii),combine Corollary 12.11, (13.5) and Corollary 12.7. Corollary 13.5. Let f ∈ Γ(X) and suppose that V is a nonempty closed convexsubset of X. Suppose that A = ∂ f and B = NV . Then the following hold:(i) T(A,B) = JNV R∂ f + Id−J∂ f = PV(2 Prox f − Id) + Id−Prox f .(ii) ran(Id−T(A,B)) ' (dom f −V) ∩ (dom f ∗ + (rec V)	).(iii) ran T(A,B) ' (dom f − (rec V)	) ∩ (dom f ∗ +V).Consequently, if V is a linear subspace we may add to this list the following items:(iv) ran(Id−T(A,B)) ' (dom f +V) ∩ (dom f ∗ +V⊥).(v) ran T(A,B) ' (dom f +V⊥) ∩ (dom f ∗ +V).Proof. Since ran NV is nearly convex and (rec V)	 is convex, it follows from(3.21), Fact 2.9 and Fact 12.2 thatran NV ' (rec V)	. 
(13.10)

(i): This follows from (12.20) and the fact that J_{N_V} = P_V and J_{∂f} = Prox_f. (ii): Combine (13.5), (13.10), Theorem 12.10, Fact 12.4 and Corollary 12.7. (iii): Combine (13.5), (13.10), Corollary 12.11, Fact 12.4 and Corollary 12.7. (iv)&(v): It follows from [12, Proposition 6.22 and Corollary 6.49] that rec V = V and (rec V)⊖ = V^⊥. Combining this with (ii) and (iii), we obtain (iv) and (v), respectively. ∎

Corollary 13.6 (two normal cone operators). Let U and V be two nonempty closed convex subsets of X, and suppose that A = N_U and that B = N_V. Then the following hold:

(i) ran(Id − T(A,B)) ≃ (U − V) ∩ ((rec U)⊖ + (rec V)⊖).

(ii) ran T(A,B) ≃ (U − (rec V)⊖) ∩ ((rec U)⊖ + V).

(iii) v(A,B) = P_{cl(U−V)}(0).

Proof. Clearly dom A = U and dom B = V. It follows from (13.10) that ran N_U ≃ (rec U)⊖ and ran N_V ≃ (rec V)⊖. Therefore, Fact 12.4 implies that

R ≃ (rec U)⊖ + (rec V)⊖. (13.11)

Now (i) follows from combining (13.11) and Theorem 12.10, and (ii) follows from combining (13.10) applied to the sets U and V, Fact 12.4 and Corollary 12.11. It remains to show that (iii) is true. Set v = P_{cl(U−V)}(0) = P_{cl D}(0). On the one hand, by definition of v and Proposition 13.1(ii), we have v(A,B) ∈ cl D ∩ cl R ⊆ cl D and hence

‖v‖ ≤ ‖v(A,B)‖. (13.12)

On the other hand, using [20, Corollary 2.7] we have v ∈ (P_U − Id)(V) ∩ (Id − P_V)(U) ⊆ (rec U)⊕ ∩ (rec V)⊖. Therefore, using (13.10) and the fact that 0 ∈ (rec U)⊖, we have v ∈ (rec U)⊕ ∩ (rec V)⊖ ⊆ (rec V)⊖ ⊆ (rec U)⊖ + (rec V)⊖ ⊆ cl R. Hence,

v ∈ cl D ∩ cl R. (13.13)

Combining (13.12), (13.13) and Proposition 13.1(ii) yields v = v(A,B). ∎

13.4 Application to firmly nonexpansive mappings

We now restate the main result from the perspective of fixed point theory.

Corollary 13.7. Let T1 : X → X and T2 : X → X be firmly nonexpansive such that each Ti satisfies

(∀x ∈ X)(∀y ∈ X) inf_{z∈X} ⟨T_i x − T_i z, (y − T_i y) − (z − T_i z)⟩ > −∞, (13.14)

and set T = T2(2T1 − Id) + Id − T1. Then

ran T ≃ (ran T1 − ran(Id − T2)) ∩ (ran(Id − T1) + ran T2), (13.15)
and

ran(Id − T) ≃ (ran T1 − ran T2) ∩ (ran(Id − T1) + ran(Id − T2)). (13.16)

Proof. Using Fact 3.12, we conclude that there exist maximally monotone operators A : X ⇒ X and B : X ⇒ X such that

T1 = J_A and T2 = J_B. (13.17)

Moreover, it follows from [27, Theorem 2.1(xvii)] and (13.14) that A and B are 3∗ monotone. By (5.2), we conclude that T = T(A,B). Using Corollary 12.11, Lemma 3.19 and (3.21) we have

ran T ≃ (dom A − ran B) ∩ (ran A + dom B) = (ran T1 − ran(Id − T2)) ∩ (ran(Id − T1) + ran T2).

That is, (13.15) holds true. Similarly, one can prove (13.16) by combining Theorem 12.10, Lemma 3.19 and (3.21). ∎

Chapter 14

The Douglas–Rachford algorithm in the possibly inconsistent case

14.1 Overview

Throughout this chapter, we assume that

A and B are maximally monotone operators on X. (14.1)

Let T = T(A,B) be the Douglas–Rachford operator associated with (A, B) and let x0 ∈ X. Recall that when Z = (A + B)^{-1}(0) ≠ ∅, the governing sequence (T^n x0)_{n∈N} produced by the Douglas–Rachford operator converges weakly to a point in Fix T and the shadow sequence (J_A T^n x0)_{n∈N} converges weakly to a point in Z = (A + B)^{-1}(0) (see Fact 9.2). In this chapter we consider the inconsistent case, i.e., when the set Z = (A + B)^{-1}(0) is possibly empty.
Our main results in this chapter can be summarized as follows:

• We advance the understanding of the inconsistent case significantly by providing a complete proof of the full weak convergence in the convex feasibility setting, when (A, B) = (N_U, N_V), where U and V are nonempty closed convex subsets of X (see Theorem 14.11). This is remarkable because no prior knowledge of the gap vector v is required; the shadow sequence is simply (J_A T^n x0)_{n∈N} = (P_U T^n x0)_{n∈N}.

• In fact, while the general case remains open (especially because of Remark 14.16(ii) and Example 14.17), a more general sufficient condition for weak convergence is presented (see Theorem 14.10 and Corollary 14.14).

Except for facts with explicit references, all the new results in this chapter appear in [13].

We recall that the minimal displacement vector (see Fact 7.4)

v = P_{cl ran(Id−T)}(0) (14.2)

is unique and well defined. Recall also that the normal problem (see Chapter 11) associated with the ordered pair (A, B) is to

find x ∈ X such that 0 ∈ (vA)x + B_v x = Ax − v + B(x − v), (14.3)

and that

Z_v = Z(vA, B_v) and K_v = K((vA)^{-1}, (B_v)^{-⊤}) (14.4)

denote the sets of primal normal and dual normal solutions, respectively. It follows from Corollary 11.11 that

Z_v ≠ ∅ ⇔ v ∈ ran(Id − T). (14.5)

14.2 The convex feasibility setting

We now turn to the convex feasibility problem. Throughout the rest of this section we assume that

U and V are nonempty closed convex subsets of X (14.6)

and that

A = N_U, B = N_V and v = P_{cl(U−V)}(0). (14.7)

Note that, in view of Proposition 11.22, the above formula for v coincides with (14.2). We shall use R_U (respectively R_V) to denote the reflector associated with U (respectively V), defined by R_U = 2P_U − Id. The convex feasibility problem is a special case of the sum problem: indeed, when A = N_U and B = N_V, the sum problem is equivalent to the convex feasibility problem of finding x ∈ U ∩ V. In this case, using Example 3.14(ii),

T = T(N_U, N_V) = Id − P_U + P_V R_U.
(14.8)Definition 14.1 (best approximation pair). Suppose that U and V arenonempty closed convex subsets of X and let (u¯, v¯) ∈ U×V. Then (u¯, v¯) isa best approximation pair relative to U and V if‖u¯− v¯‖ = inf {‖u− v‖ ∣∣ u ∈ U, v ∈ V}. (14.9)16214.2. The convex feasibility settingIn the following we collect some facts whose detailed proofs appear in[20].Fact 14.2. Suppose that e ∈ U ∩ (v + V) and that y ∈ NU−Vv and set f =e− v ∈ (U − v) ∩V. Then the following hold:(i) NU−Vv = (NUe) ∩ (−NV f ).(ii) PU(e + y) = e.(iii) PV( f − y) = f .Proof. See [20, Proposition 2.4]. Fact 14.3. The following hold:(i) v ∈ (PU − Id)(V) ∩ (Id−PV)(U) ⊆ (rec U)⊕ ∩ (rec V)	.(ii) Suppose that S ∈ {U, V} is a closed affine subspace. Thenv ∈ par(S− S)⊥. (14.10)Proof. See [20, Corollary 2.7 and Remark 2.8(ii)]. Fact 14.4. The set Fix(v + T) is closed and convex. MoreoverU ∩ (v +V) + NU−Vv ⊆ Fix(v + T) ⊆ v +U ∩ (v +V) + NU−Vv. (14.11)Proof. See [20, Theorem 3.5]. Fact 14.5. Suppose that U ∩V 6= ∅. Then v = 0,Fix T = U ∩V + NU−V0 and PU(Fix T) = U ∩V. (14.12)Proof. See [20, Corollary 3.9]. Fact 14.6. Let x ∈ X. Then the following hold:(i) Tnx− Tn+1x = PUTnx− PV RUTnx → v and PUTnx− PV PUTnx → v.(ii) If U ∩ V 6= ∅ then (Tnx)n∈N converges weakly to a point in Fix T =U ∩V + NU−Vv; otherwise ‖Tnx‖ → ∞.16314.2. The convex feasibility setting(iii) Exactly one of the following alternatives hold:(a) U ∩ (v +V) = ∅, ‖PUTnx‖ → ∞ and ‖PV PUTnx‖ → ∞.(b) U ∩ (v + V) 6= ∅, the sequences (PUTnx)n∈N and (PV PUTnx)n∈Nare bounded and their weak cluster points belong to U ∩ (v + V)and (U − v) ∩ V, respectively; in fact, the weak cluster points of thesequences ((PV RUTnx, PUTnx))n∈N and ((PV PUTnx, PUTnx))n∈Nare best approximation pairs relative to (U, V).Proof. See [20, Theorem 13.3]. Figure 14.1: A GeoGebra [78] snapshot that illustrates Fact 14.6(ii). Twointersecting lines in R3, U the blue line and V the red line. 
The first few iterates of the sequences (T^n x)_{n∈N} (red points) and (P_U T^n x)_{n∈N} (blue points) are also depicted.

Fact 14.7. Suppose that V is a closed affine subspace and let x ∈ X. Then P_U T^n x − P_V T^n x → v.

Proof. See [20, Theorem 3.17]. ∎

In passing we recall that, in the inconsistent convex feasibility setting, when U ∩ V = ∅ (equivalently Fix T = ∅ by Fact 5.2(iii)), the governing sequence is known to satisfy ‖T^n x0‖ → +∞ (see Fact 8.1(i)), while the shadow sequence (P_U T^n x0)_{n∈N} remains bounded, with weak cluster points yielding best approximation pairs relative to U and V, provided such pairs exist (see Fact 14.6(iii)(b)). Unlike the method of alternating projections, which employs the operator P_V P_U, the Douglas–Rachford method is not fully understood in the inconsistent case.

14.3 Convergence analysis in the inconsistent case

Recall the following important fact.

Fact 14.8 (Eckstein and Bertsekas (1992)). Suppose that zer(A + B) = ∅ and let x ∈ X. Then ‖T^n x‖ → +∞.

Proof. See [72, Corollary 6.2]. Alternatively, note that zer(A + B) = ∅ ⇔ Fix T = ∅, and apply Fact 8.1(i). ∎

We start with the following useful consequence of Theorem 5.9(iii).

Corollary 14.9. Let x ∈ X and let y ∈ X. Then the following hold:

∑_{n=0}^{∞} ‖(Id − T)T^n x − (Id − T)T^n y‖^2 < +∞, (14.13a)

∑_{n=0}^{∞} ⟨J_A T^n x − J_A T^n y, J_{A^{-1}} T^n x − J_{A^{-1}} T^n y⟩ < +∞, where each summand is ≥ 0, (14.13b)

∑_{n=0}^{∞} ⟨J_B R_A T^n x − J_B R_A T^n y, J_{B^{-1}} R_A T^n x − J_{B^{-1}} R_A T^n y⟩ < +∞, where each summand is ≥ 0. (14.13c)

Consequently,

(Id − T)T^n x − (Id − T)T^n y → 0, (14.14a)

⟨J_A T^n x − J_A T^n y, J_{A^{-1}} T^n x − J_{A^{-1}} T^n y⟩ → 0, (14.14b)

⟨J_B R_A T^n x − J_B R_A T^n y, J_{B^{-1}} R_A T^n x − J_{B^{-1}} R_A T^n y⟩ → 0. (14.14c)

Proof. Let n ∈ N. Applying (3.20) to the points T^n x and T^n y, we learn that {(J_A T^n x, J_{A^{-1}} T^n x), (J_A T^n y, J_{A^{-1}} T^n y)} ⊆ gra A; hence, by the monotonicity of A, we have ⟨J_A T^n x − J_A T^n y, J_{A^{-1}} T^n x − J_{A^{-1}} T^n y⟩ ≥ 0. Similarly, ⟨J_B R_A T^n x − J_B R_A T^n y, J_{B^{-1}} R_A T^n x − J_{B^{-1}} R_A T^n y⟩ ≥ 0. Now (14.13) and (14.14) follow from Theorem 5.9(iii) by telescoping. ∎

We are now ready for our main result.
Theorem 14.10 (shadow convergence). Suppose that x ∈ X, that the sequence (J_A T^n x)_{n∈N} is bounded with its weak cluster points in Z_v, that Z_v ⊆ Fix(v + T), and that (∀n ∈ N)(∀y ∈ Z_v) J_A T^n y = y. Then the shadow sequence (J_A T^n x)_{n∈N} converges weakly to some point in Z_v.

Proof. Let y ∈ Z_v ⊆ Fix(v + T). Using (14.14b) and Proposition 8.4(i) we have

⟨J_A T^n x − y, T^n x + nv − J_A T^n x⟩
= ⟨J_A T^n x − y, T^n x − J_A T^n x − (y − nv − y)⟩ (14.15a)
= ⟨J_A T^n x − J_A T^n y, (Id − J_A)T^n x − (Id − J_A)T^n y⟩ (14.15b)
→ 0. (14.15c)

Note that Proposition 8.4(ii) implies that (T^n x + nv)_{n∈N} is Fejér monotone with respect to Fix(v + T), and consequently with respect to Z_v. Now apply Lemma 6.3 with E replaced by Z_v, (u_n)_{n∈N} replaced by (J_A T^n x)_{n∈N}, and (x_n)_{n∈N} replaced by (T^n x + nv)_{n∈N}. ∎

As a powerful application of Theorem 14.10, we obtain the following striking result on the convergence of the Douglas–Rachford algorithm for feasibility problems. It illustrates that, even in the inconsistent case, the shadow sequence (P_U T^n x)_{n∈N} behaves extremely well: it converges to a normal solution without prior knowledge of the minimal displacement vector.

Theorem 14.11 (possibly inconsistent feasibility problem). Suppose that U and V are nonempty closed convex subsets of X, that A = N_U, that B = N_V, that v = P_{cl ran(Id−T)}(0) and that U ∩ (v + V) ≠ ∅. Let x ∈ X. Then (P_U T^n x)_{n∈N} converges weakly to some point in Z_v = U ∩ (v + V).

Proof. It follows from Fact 14.6(iii)(b) that (P_U T^n x)_{n∈N} is bounded and its weak cluster points lie in U ∩ (v + V). Moreover, Fact 14.4 implies that Z_v = U ∩ (v + V) ⊆ U ∩ (v + V) + N_{cl(U−V)}(v) ⊆ Fix(v + T). Finally, Proposition 8.4(i) and Fact 14.2(ii) imply that (∀y ∈ Z_v)(∀n ∈ N) P_U T^n y = P_U(y − nv) = y; hence all the assumptions of Theorem 14.10 are satisfied and the result follows. ∎

Remark 14.12. Suppose that v ∈ ran(Id − T).
More than a decade ago, it wasshown in [20] that when A = NU and B = NV , where U and V are nonemptyclosed convex subsets of X, that (PUTnx)n∈N is bounded and its weak cluster16614.3. Convergence analysis in the inconsistent casepoints lie in U ∩ (v + V). Theorem 14.11 yields the much stronger result that(PUTnx)n∈N actually converges weakly to a point in U ∩ (v +V). We point outthat the feasibility setting is actually of practical importance and occurs in real-world applications. We also stress the fact that Theorem 14.11 is entirely neweven in finite-dimensional spaces.Figure 14.2: A GeoGebra [78] snapshot that illustrates Theorem 14.11. Twononintersecting polyhedral sets in R2, U and V. The first few iterates ofthe sequences (Tnx)n∈N (red points) and (PUTnx)n∈N (blue points) are alsodepicted. Shown is the minimal displacement vector v as well.In proving Corollary 14.14, which is another variant of Theorem 14.10,we require the following auxiliary result.Proposition 14.13. Suppose that U is a closed affine subspace of X, that A = NU ,that B is paramonotone and that v = Pran(Id−T)0 ∈ ran(Id−T). Suppose alsothat zer(v A) ∩ zer(Bv) 6= ∅ and let x ∈ X. Then the following hold:(i) Zv = zer(v A) ∩ zer(Bv) = zer(−v + NU) ∩ zer(B(· − v)).(ii) 0 ∈ Kv.16714.3. Convergence analysis in the inconsistent case(iii) v ∈ (par U)⊥.(iv) Zv ⊆ U.(v) Zv ⊆ Zv + Kv = Fix(T−v) ⊆ Fix(v + T).(vi) (∀n ∈N)(∀y ∈ Zv) JATny = PUTny = y.(vii) PUTnx− JBRUTnx = Tnx− Tn+1x → v.(viii) (JATnx)n∈N = (PUTnx)n∈N is bounded.(ix) (JBRUTnx)n∈N is bounded.Proof. (i)&(ii): Apply Theorem 4.32(i) to (v A, Bv) = (−v + NU , B(· − v)).(iii): In view of (14.5) we learn that Zv 6= ∅. Therefore (i) implies thatzer(v A) = zer(−v + NU) 6= ∅. Hence (∃x ∈ X) 0 ∈ −v + NUx =−v + (par U)⊥; equivalently, v ∈ (par U)⊥. (iv): Using (i) we see thatZv ⊆ zer(−v + NU). We claim that zer(−v + NU) = U. Indeed, lety ∈ X. In view of (iii) we have 0 ∈ −v + NUy ⇔ [y ∈ U and 0 ∈−v + (par U)⊥ = (par U)⊥]. 
(v): The left-hand inclusion follows from (ii).Note that, as a subdifferential operator, A is paramonotone. To prove theidentity, apply Corollary 5.16(v) to (v A, Bv) and use Proposition 11.6. Theright-hand inclusion follows from Proposition 7.8(iv). (vi): Let y ∈ Zv andnote that y ∈ U ∩ Fix(T−v) by (iv) and (v). Therefore by Proposition 8.4(i)Tny = y− nv. Now use Lemma 2.8 and (iii). (vii): By definition of T wehave PUTnx− JBRUTnx = (Id−T)Tnx, which proves the first identity. Thelimit follows from Fact 8.2. (viii): It follows from Proposition 8.4(ii) thatthe sequence (Tnx + nv)n∈N is Feje´r monotone with respect to Fix(T−v),hence bounded. In view of Lemma 2.8 and (iii) we learn that (∀n ∈N)JATnx = PUTnx = PU(Tnx+ nv) and therefore (JATnx)n∈N = (PUTnx)n∈Nis bounded. (ix): Combine (vii) and (viii). Here is another instance of Theorem 14.10.Corollary 14.14. Let x ∈ X. Suppose that U is a closed affine subspace of X,that A = NU , that B is paramonotone, that v = Pran(Id−T)0 ∈ ran(Id−T),that zer(v A) ∩ zer(Bv) 6= ∅ and that all weak cluster points of (JATnx)n∈N =(PUTnx)n∈N lie in Zv. Then (JATnx)n∈N = (PUTnx)n∈N converges weakly tosome point in Zv.Proof. This follows from combining Theorem 14.10 and Proposi-tion 14.13(v), (vi), and (viii). 16814.3. Convergence analysis in the inconsistent caseBefore we proceed, we need to recall that by Theorem 12.10(ii) we haveran(Id−T) = (dom A− dom B) ∩ (ran A + ran B). (14.16)The following provides an example of Corollary 14.14.Example 14.15. Suppose that U is a closed linear subspace of X, that b ∈U⊥ r {0}, that A = NU and that B = Id+N(−b+U). Then Z = ∅,v = b ∈ ran(Id−T), Zv = {0} and Kv = U⊥. Moreover, (∀x ∈ X) (∀n ∈ N)PUTnx = 12n PUx → 0 and ‖PU⊥Tnx‖ → +∞, and 0 ∈ zer(v A) ∩ zer(Bv).Proof. By the Brezis–Haraux theorem (see Fact 3.35) we have X = int X ⊆int ran B = int(ran Id+ ran N(−b+U)) ⊆ X, hence ran B = X. Using (14.16)ran(Id−T) = (U+ b−U)∩ (U⊥+X) = b+U. 
Consequently, using (14.2) and [12, Proposition 3.17] we have

v = P_{ran(Id−T)} 0 = P_{b+U} 0 = b + P_U(−b) = b ∈ U^⊥ \ {0}. (14.17)

Note that dom vA = dom A = U and dom B_v = v + dom B = b − b + U = U, hence dom(vA + B_v) = U ∩ U = U. Let x ∈ U. Using (14.17) we have

x ∈ Z_v ⇔ 0 ∈ N_U x − b + x − b + N_{−b+U}(x − b) (14.18a)
= N_U x − b + x − b + N_U x (14.18b)
⇔ 0 ∈ U^⊥ − b + x − b + U^⊥ = x + U^⊥ ⇔ x ∈ U^⊥, (14.18c)

hence Z_v = {0}, as claimed. As subdifferential operators, both A and B are paramonotone, and so are the translated operators vA and B_v. Since Z_v = {0}, in view of Remark 4.29 and (14.17) we learn that

K_v = (N_U 0 − b) ∩ (0 − b + N_{−b+U}(0 − b)) (14.19a)
= (U^⊥ − b) ∩ (−b + U^⊥) = U^⊥. (14.19b)

Next we claim that

(∀x ∈ X) P_U T x = (1/2) P_U x. (14.20)

Indeed, note that J_B = (Id + B)^{−1} = (2 Id + N_{−b+U})^{−1} = (2 Id + 2N_{−b+U})^{−1} = (Id + N_{−b+U})^{−1} ∘ ((1/2) Id) = P_{−b+U} ∘ ((1/2) Id) = −b + (1/2) P_U, where the last identity follows from [12, Proposition 3.17] and (14.17). Now, using that P_U R_U = P_U (see Lemma 10.4(ii)) and (14.17), we have P_U T = P_U(P_{U^⊥} + J_B R_U) = P_U J_B R_U = P_U(−b + (1/2) P_U) R_U = P_U(−b + (1/2) P_U) = (1/2) P_U. To show that (∀x ∈ X) (∀n ∈ N) P_U T^n x = (1/2^n) P_U x, we use induction. Let x ∈ X. Clearly, when n = 0, the base case holds. Now suppose that for some n ∈ N we have, for every x ∈ X, P_U T^n x = (1/2^n) P_U x. Applying the inductive hypothesis with x replaced by Tx, and using (14.20), we obtain P_U T^{n+1} x = P_U T^n(Tx) = (1/2^n) P_U Tx = (1/2^n)((1/2) P_U x) = (1/2^{n+1}) P_U x, as claimed. Finally, using (14.17) and [116, Corollary 6(a)] we have ‖T^n x‖ → +∞, hence

‖P_{U^⊥} T^n x‖^2 = ‖T^n x‖^2 − ‖P_U T^n x‖^2 = ‖T^n x‖^2 − (1/4^n)‖P_U x‖^2 → +∞. (14.21)

□

In fact, as we shall now see, the shadow sequence may be unbounded in the general case, even when one of the operators is a normal cone operator.

Remark 14.16 (shadows in the presence of normal solutions).

(i) Example 14.15 illustrates that even when normal solutions exist, the shadows need not converge.
Indeed, we have K_v = U^⊥ ≠ ∅, but the dual shadows satisfy ‖P_{U^⊥} T^n x‖ → +∞.

(ii) Suppose that A and B are as defined in Example 14.15. Set Ã = A^{−1}, B̃ = B^{−⊤} and Z̃ = Z_{(Ã,B̃)}. By Proposition 4.4(v), Z̃ ≠ ∅ ⇔ K_{(Ã,B̃)} = Z_{(A,B)} ≠ ∅, hence Z̃ = ∅. Moreover, Remark 11.20 and Remark 11.12 imply that v = b ∈ ran(Id−T) and Z̃_v = U^⊥ + b = U^⊥ ≠ ∅. However, in the light of (i), the primal shadows satisfy ‖J_Ã T^n x‖ = ‖J_{A^{−1}} T^n x‖ = ‖P_{U^⊥} T^n x‖ → +∞.

(iii) Concerning Theorem 14.10, it would be interesting to find other conditions sufficient for weak convergence of the shadow sequence, or even to characterize this behaviour.

The assumption that zer(vA) ∩ zer(B_v) ≠ ∅ is critical in the conclusions of Proposition 14.13 and Corollary 14.14, as we illustrate in Example 14.17. It also provides another example where the shadow sequence diverges to infinity.

Example 14.17. Suppose that U is a closed linear subspace of X, that w ∈ X \ U^⊥, that A = N_U and that (∀x ∈ X) Bx = w. Let x ∈ X. Then Z = ∅, v = P_U w ∉ U^⊥, Z_v = U, T = P_U − w, and (∀n ∈ N) P_U T^n x = P_U x − nP_U w, hence ‖P_U T^n x‖ → +∞. Note that zer(vA) = zer(B_v) = zer(vA) ∩ zer(B_v) = ∅.

Proof. One can readily verify that (∀y ∈ X) N_U y + By ⊆ U^⊥ + w = U^⊥ + P_U w ∌ 0, and therefore Z = ∅. Using (14.16) we have ran(Id−T) = (U − X) ∩ (U^⊥ + w) = w + U^⊥. Therefore, using (14.2) and Fact 2.6, we have v = P_{w+U^⊥} 0 = w − P_{U^⊥} w = P_U w ∉ U^⊥. Now let y ∈ X. Then y ∈ Z_v ⇔ [y ∈ U and 0 ∈ −v + N_U y + B(y − v) = U^⊥ − P_U w + w = U^⊥ + P_{U^⊥} w = U^⊥]. Hence Z_v = U. Clearly J_B x = (Id + B)^{−1} x = x − w, therefore T = Id − P_U + J_B(2P_U − Id) = Id − P_U + 2P_U − Id − w = P_U − w. A simple induction yields (∀n ≥ 1) T^n x = P_U x − (n − 1)P_U w − w; now apply P_U and note that P_U is linear. □

We point out that the assumption Z_v ⊆ Fix(v + T) in Theorem 14.10 does not hold true in general, even in the consistent case when v = 0, as we illustrate in the next two examples.

Example 14.18 (v ≠ 0 and Z_v ⊈ Fix(v + T)).
Suppose that A and B are as defined in Example 14.17 and that w ∉ U. Then

U = Z_v ⊈ Fix(v + T) = −w + U. (14.22)

Proof. Let x ∈ X. Then x ∈ Fix(v + T) = Fix(P_U w + P_U − w) = Fix(P_U − P_{U^⊥} w) ⇔ x = P_U x − P_{U^⊥} w ⇔ P_{U^⊥} x = −P_{U^⊥} w ⇔ P_{U^⊥}(x + w) = 0 ⇔ x + w ∈ U ⇔ x ∈ −w + U. Using Example 14.17, we have U = Z_v ⊆ Fix(v + T) = −w + U ⇔ w ∈ U, which is not true. □

Example 14.19 (v = 0 and Z ⊈ Fix T). Suppose that (a, b) ∈ X × (X \ {0}) and that (∀x ∈ X) Ax = x − a and Bx = x − b. Then

{(1/2)(a + b)} = Z ⊈ Fix T = {b} (14.23)

whenever a ≠ b.

Proof. Clearly Z = {(1/2)(a + b)}. One can readily verify that (∀x ∈ X) J_A x = (1/2)(x + a) and J_B x = (1/2)(x + b), hence R_A x = a, R_B x = b and therefore R_B R_A x = b. Using (5.2) we learn that Fix T = Fix(R_B R_A) = {b}, and the conclusion follows. □

We conclude this chapter with the following application of Theorem 8.31 to the Douglas–Rachford algorithm when A and B are affine relations (possibly not normal cone operators) whose sum does not necessarily have a zero.

Proposition 14.20. Suppose that A and B are affine such that v ∈ ran(Id−T). Let x ∈ X and set

(∀n ∈ N) x_n = T^n x + n(T^{n²} x − T^{n²+1} x). (14.24)

Then the following hold:

(i) x_n → P_{Fix(v+T)} x = P_{Fix(T_{−v})} x = P_{Fix(T_{(vA,B_v)})} x.
(ii) J_A x_n → some point in Z_v.

Proof. Note that J_A and J_B are affine and so is T. (i): The limit follows from Theorem 8.31, whereas the identities follow from Theorem 8.8(v) applied with n = 1. (ii): In view of (i) we learn that J_A x_n → J_A(P_{Fix(T_{(vA,B_v)})} x) ∈ Z_v, where the inclusion follows from applying Lemma 11.14(i). □

Chapter 15

The Douglas–Rachford algorithm for two (not necessarily intersecting) affine subspaces

15.1 Overview

In this chapter, we carefully study the case when U and V are closed affine subspaces that do not necessarily intersect. This case has applications, e.g., in image restoration of a spatially limited image from partial knowledge of its Fourier transform (see, e.g., [58]).
The problem reduces to a convex feasibility problem for two affine subspaces, and one approach to solving it is the Gerchberg–Papoulis algorithm (see [79] and [114]), which in this case reduces to the method of alternating projections. Furthermore, the Douglas–Rachford method for two closed affine subspaces has recently been shown to be useful for solving the nonconvex sparse affine feasibility problem (see [87] and [88]) and the basis pursuit problem (see [68]). Other possible applications arise via the parallel splitting method (see [12, Proposition 25.7]), which adapts the Douglas–Rachford method to handle a finite sum of monotone operators. When the operators are specialized further to normal cone operators of affine subspaces, the problem reduces to a (possibly inconsistent) convex feasibility problem involving an affine subspace and the diagonal subspace in the product space.

The main results of this chapter appear in Theorem 15.4 and are summarized as follows:

• Let x ∈ X. We prove the strong convergence of the shadow sequence (P_U T^n x)_{n∈N} when U and V are affine subspaces that do not have to intersect (see Theorem 15.4).
• We identify the limit to be the closest best approximation solution to the starting point; moreover, the rate of convergence is linear when U + V is closed.

Our proofs critically rely on the new results developed in Theorem 8.8 and Proposition 15.3, the well-developed results in the consistent case in [30], and those of the normal problem studied in Chapter 11.

All the new results in this chapter are published in [15].

15.2 The Douglas–Rachford operator for two affine subspaces

In the following we assume that

v = P_{ran(Id−T)} 0 ∈ ran(Id−T), (15.1)

that

U and V are nonempty closed convex subsets of X (15.2)

and that

A = N_U and B = N_V. (15.3)

Using Example 3.14(ii), (5.2) becomes

T_{U,V} = T_{(N_U,N_V)} = Id − P_U + P_V R_U, (15.4)

where R_U = 2P_U − Id.
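The operator (15.4) is easy to experiment with numerically. The following minimal sketch is not from the thesis: the intervals, starting point, and iteration count are our illustrative choices. It applies T_{U,V} = Id − P_U + P_V R_U to two disjoint intervals in R and exhibits both the minimal displacement vector v (as the limit of T^n x − T^{n+1} x) and the convergence of the shadows P_U T^n x to U ∩ (v + V).

```python
# Sketch (data chosen here, not from the thesis): U = [2, 3] and
# V = [0, 1] are disjoint closed convex subsets of R, so U - V = [1, 3],
# the minimal displacement vector is v = P_{U-V}(0) = 1, and
# U ∩ (v + V) = [2, 3] ∩ [1, 2] = {2}.

def projector(lo, hi):
    """Projection onto the interval [lo, hi]."""
    return lambda x: min(max(x, lo), hi)

P_U, P_V = projector(2.0, 3.0), projector(0.0, 1.0)

def T(x):
    # Douglas-Rachford operator T = Id - P_U + P_V R_U, R_U = 2 P_U - Id.
    return x - P_U(x) + P_V(2.0 * P_U(x) - x)

x = 10.0
for n in range(30):
    x = T(x)

# The displacement T^n x - T^{n+1} x approaches v = 1 ...
print(x - T(x))   # -> 1.0
# ... while the shadow P_U T^n x approaches U ∩ (v + V) = {2}.
print(P_U(x))     # -> 2.0
```

Here the iterates T^n x themselves drift off to −∞ in steps of v, which is exactly the behaviour the normal problem is designed to capture.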
In this case (see Proposition 11.22)

v = P_{U−V} 0, (15.5)

or equivalently

−v ∈ N_{U−V} v. (15.6)

The normal problem now is to find x ∈ X such that

0 ∈ N_U x − v + N_V(x − v). (15.7)

We start with the following result:

Proposition 15.1. Suppose that U is a closed affine subspace of X. Then the following hold:

(i) K_v = N_{U−V} v.
(ii) T_{(−v+N_U, N_V(·−v))} = T_{(N_U, N_V(·−v))} = T_{U,v+V}.

Proof. (i): Let z ∈ U ∩ (v + V) = Z_v and note that, as subdifferential operators, N_U and N_V are paramonotone (see Example 3.24(i)) and so are the translated operators −v + N_U and N_V(· − v). Therefore, in view of Remark 4.29, Fact 14.3(ii) and Fact 14.2(i), we have

K_v = (−v + N_U z) ∩ (−N_V(z − v)) (15.8a)
= (−v + (par U)^⊥) ∩ (−N_V(z − v)) (15.8b)
= (par U)^⊥ ∩ (−N_V(z − v)) (15.8c)
= (N_U z) ∩ (−N_V(z − v)) = N_{U−V} v. (15.8d)

(ii): It follows from Lemma 3.21(i), Fact 14.3(ii) and Lemma 2.8 that J_{−v+N_U} = P_U(· + v) = P_U. Moreover, Lemma 3.21(ii) implies that N_V(· − v) = N_{v+V}. Altogether, we have

T_{(−v+N_U, N_V(·−v))} = T_{(−v+N_U, N_{v+V})} (15.9a)
= Id − J_{−v+N_U} + J_{N_V(·−v)}(2J_{−v+N_U} − Id) (15.9b)
= Id − P_U + J_{N_V(·−v)}(2P_U − Id) = T_{(N_U, N_V(·−v))} (15.9c)
= T_{(N_U, N_{v+V})} = T_{U,v+V}. (15.9d)

□

The conclusions of Proposition 15.1 may fail if we drop the assumption that U is an affine subspace, as we illustrate in the next example, which was motivated by [20, Example 3.7].

Example 15.2. Let X = R, U = [1, +∞[, and V = {0}. Then the following hold:

(i) v = 1.
(ii) Z_v = {1}.
(iii) ]−∞, −1] = K_v ≠ N_{U−V} v = ]−∞, 0].
(iv) (∀x ∈ ]0, +∞[) 0 = T_{(−v+N_U, N_V(·−v))} x ≠ T_{U,v+V} x = min{1, x}.

Proof. (i): Clear in view of (15.5). (ii): Combine (i) and Proposition 11.22. (iii): It follows from (15.8a) and (ii) that K_v = (−1 + N_U(1)) ∩ (−N_V(0)) = (−1 + ]−∞, 0]) ∩ R = ]−∞, −1]. On the other hand, N_{U−V} v = N_U(1) = ]−∞, 0]. (iv): By Lemma 3.21(ii) we learn that N_V(· − v) = N_{v+V}. Let x ∈ X and note that v + V = {1}, hence (∀x ∈ X) P_{v+V} x = 1.
In view of Lemma 2.8 and Lemma 3.21(ii), we have T_{(−v+N_U, N_V(·−v))} x = T_{(U,v+V)} x ⇔ T_{(−v+N_U, N_{v+V})} x = T_{(U,v+V)} x ⇔ x − P_U(x + v) + P_{v+V}(2P_U(x + v) − x) = x − P_U x + P_{v+V}(2P_U x − x) ⇔ P_U(x + v) = P_U x ⇔ x ≤ 0. Now suppose that x > 0. Using the same tools as above, we obtain T_{(−v+N_U, N_V(·−v))} x = x − P_U(x + v) + P_{v+V}(2P_U(x + v) − x) = x − P_U(x + 1) + 1 = x − (x + 1) + 1 = 0. On the other hand, T_{(U,v+V)} x = x − P_U x + P_{v+V}(2P_U x − x) = x − P_U x + 1 = min{1, x} > 0, which completes the proof. □

Proposition 15.3. Suppose that U and V are closed affine subspaces of X and that T = T_{U,V}. Then the following hold:

(i) T is affine and T = Id − P_U − P_V + 2P_V P_U.
(ii) v ∈ (par U)^⊥ ∩ (par V)^⊥.
(iii) (∀x ∈ X) (∀α ∈ R) P_U x = P_U(x + αv).
(iv) (∀x ∈ X) (∀α ∈ R) P_V x = P_V(x + αv).
(v) T_{−v} = v + T = T_{(N_U, N_V(·−v))} = T_{U,v+V}.
(vi) Z_v = U ∩ (v + V).
(vii) K_v = N_{U−V} v = (par U)^⊥ ∩ (par V)^⊥.
(viii) Fix(T_{−v}) = Fix(v + T) = Z_v + K_v = (U ∩ (v + V)) + ((par U)^⊥ ∩ (par V)^⊥).

Proof. (i): Note that P_U and P_V are affine (see Fact 2.7(i)). Using (15.4) we have T = Id − P_U + P_V(2P_U − Id) = Id − P_U + 2P_V P_U − P_V. Since the class of affine operators is closed under addition, subtraction and composition, we deduce that T is affine. (ii): This follows from Fact 14.3(ii). (iii)&(iv): Combine (ii) with Lemma 2.8. (v): Combine Theorem 8.8(v) (applied with n = 1) and Proposition 15.1(ii). (vi): See Proposition 11.22. (vii): The first identity follows from Proposition 15.1(i). To prove the second identity, note that N_{U−V} v = (par(U − V))^⊥ = (par U + par V)^⊥ = (par U)^⊥ ∩ (par V)^⊥. (viii): Since −v + N_U and N_V(· − v) are paramonotone, it follows from (v), (vii) and Corollary 5.16(v) applied to the normal pair (vA, B_v) that Fix(T_{−v}) = Fix(v + T) = Z_v + K_v. Now combine with (vi) and (vii) to get the desired result. □

We are now ready for our main result.
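Before turning to it, we note that the identity in Proposition 15.3(i) lends itself to a quick numerical sanity check. The sketch below uses two randomly generated lines in R^3 (an illustrative choice of ours, not data from the thesis) and verifies pointwise that Id − P_U + P_V R_U agrees with Id − P_U − P_V + 2P_V P_U.

```python
# Numerical check (sketch with randomly chosen data) of
# Proposition 15.3(i): for two affine subspaces U and V, the operator
# T = Id - P_U + P_V R_U equals the affine map Id - P_U - P_V + 2 P_V P_U.
import numpy as np

rng = np.random.default_rng(0)

def affine_projector(anchor, direction):
    """Projector onto the line  anchor + R*direction  in R^3."""
    d = direction / np.linalg.norm(direction)
    return lambda x: anchor + d * np.dot(d, x - anchor)

P_U = affine_projector(rng.standard_normal(3), rng.standard_normal(3))
P_V = affine_projector(rng.standard_normal(3), rng.standard_normal(3))

for _ in range(5):
    x = rng.standard_normal(3)
    lhs = x - P_U(x) + P_V(2.0 * P_U(x) - x)        # Id - P_U + P_V R_U
    rhs = x - P_U(x) - P_V(x) + 2.0 * P_V(P_U(x))   # Id - P_U - P_V + 2 P_V P_U
    assert np.allclose(lhs, rhs)
```

The agreement is exact because P_V is affine, so P_V(2P_U x − x) = 2P_V P_U x − P_V x, which is the one-line proof hiding behind item (i).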
The proof of Theorem 15.4 relies on the work leading up to this point as well as on the convergence analysis of the consistent case in [30].

Theorem 15.4 (Douglas–Rachford algorithm for two affine subspaces). Suppose that U and V are closed affine subspaces of X and let x ∈ X. Then (∀n ∈ N) we have

P_U T^n x = P_U(T^n x + nv) = P_U((T_{−v})^n x) (15.10a)
= P_U T^n_{U,v+V} x = J_{−v+N_U}((T_{−v})^n x), (15.10b)

and

P_U T^n x → P_{Z_v} x = P_{U∩(v+V)} x. (15.11)

Moreover, if par U + par V is closed (as is always the case when X is finite-dimensional), then the convergence is linear with rate the cosine of the Friedrichs angle

c_F(par U, par V) = sup { |⟨u, w⟩| : u ∈ par U ∩ W^⊥ ∩ ball(0; 1), w ∈ par V ∩ W^⊥ ∩ ball(0; 1) } < 1, (15.12)

where W = par U ∩ par V and ball(0; 1) is the closed unit ball.

Proof. Let n ∈ N. Using Proposition 15.3(iii) with (x, α) replaced by (T^n x, n), we learn that P_U T^n x = P_U(T^n x + nv). Now combine with Theorem 8.8(iv) to get the second identity. The third identity follows from applying Proposition 15.3(v). Finally, using the first identity, Proposition 15.3(iii) with (x, α) replaced by ((T_{−v})^n x, 1), and Lemma 3.21(ii), we learn that P_U T^n x = P_U((T_{−v})^n x + v) = J_{−v+N_U}((T_{−v})^n x). Now we prove (15.11). It follows from (15.1), Lemma 11.14(iii) and Proposition 15.3(vi) that Z_v = U ∩ (v + V) ≠ ∅. Now apply [30, Corollary 4.5]. □

Figure 15.1: Two nonintersecting affine subspaces U (blue line) and V (purple line) in R^3. Shown are also the first few iterates of (T^n x_0)_{n∈N} (red points) and (P_U T^n x_0)_{n∈N} (blue points).

Figure 15.1 shows a GeoGebra [78] snapshot of the Douglas–Rachford iterates and their shadows for two nonintersecting, nonparallel lines U and V in R^3.

Let us now comment on the comparison of the Douglas–Rachford algorithm to the method of alternating projections.

Remark 15.5. Let x ∈ X and n ∈ {1, 2, ...}. A straightforward induction yields

(P_V P_U)^n x = −v + (P_{v+V} P_U)^n x (15.13)

while Theorem 15.4 results in

P_U T^n x = P_U T^n_{U,v+V} x.
(15.14)

We conclude that executing the Douglas–Rachford algorithm or the method of alternating projections (MAP) to solve the best approximation problem for U and V is essentially the same (up to a shift by v in the case of MAP) as executing the two algorithms on the (consistent) normal problem. Consequently, we can use the numerical results established in [30, Sections 7 and 8] to illustrate the efficiency of our algorithm. In passing, we recall that the numerical experiments presented in [30, Section 7] show that the Douglas–Rachford algorithm reveals a faster rate of convergence than MAP when the angle between the subspaces is small.

In the light of the refined conclusions of Theorem 15.4, obtained under the additional assumption that the sets are affine subspaces, in comparison to those of Theorem 14.11, it is natural to ask whether or not we can obtain refined or better conclusions in the general inconsistent case when the operators are affine relations. The following simple example shows that for two affine (but not normal cone) operators, the shadow sequence may fail to converge.

Example 15.6 (both primal and dual shadows are unbounded). Suppose that X = R^2 and let S : R^2 → R^2 : (x_1, x_2) ↦ (−x_2, x_1) be the counterclockwise rotator by π/2. Let b ∈ R^2 \ {(0, 0)}. Suppose that A = S and set B = −S + b. Then zer A ≠ ∅ and zer B ≠ ∅, yet zer(A + B) = ∅. Moreover, v = (1/2)(Id + S)b, the set of normal solutions is Z_v = R^2, and for every x ∈ R^2 we have ‖J_A T^n x‖ → +∞ and ‖J_{A^{−1}} T^n x‖ → +∞.

Proof. Let x ∈ R^2 and note that S and −S = S^{−1} are both linear, continuous, single-valued and monotone, and S^2 = (−S)^2 = −Id. It follows from Proposition 3.16 that J_A x = J_S x = (1/2)(Id − S)x = (1/2)(x − Sx) and J_{A^{−1}} x = J_{−S} x = (1/2)(Id + S)x = (1/2)(x + Sx). Similarly, using Fact 3.20(i), we see that J_B x = (1/2)(x − b + Sx − Sb). Therefore R_A x = −Sx and R_B x = −b + Sx − Sb. Hence R_B R_A x = S(−Sx) − Sb − b = −S^2 x − Sb − b = x − Sb − b = x − (Id + S)b.
Consequently, we have

(∀x ∈ R^2) Tx = (1/2)(Id + R_B R_A)x = x − (1/2)(Id + S)b. (15.15)

It follows from (15.15) that ran(Id − T) = {(1/2)(Id + S)b}, hence v = (1/2)(Id + S)b and Tx = x − v. Therefore, using Theorem 8.8(v), Fix(v + T) = Fix(T_{−v}) = R^2. Moreover, using Lemma 11.14(i), we learn that Z_v = J_{vA}(Fix(T_{−v})) = R^2. In view of Proposition 7.8(i) we have T^n x = x − nv. Hence, using that J_A is linear, we get J_A T^n x = J_A(x − nv) = J_A x − nJ_A v. Now J_A v = (1/2)(Id − S)((1/2)(Id + S)b) = (1/2)b ≠ (0, 0). Similarly, J_{A^{−1}} T^n x = J_{A^{−1}}(x − nv) = J_{A^{−1}} x − nJ_{A^{−1}} v and J_{A^{−1}} v = (1/2)(Id + S)((1/2)(Id + S)b) = (1/2)Sb ≠ (0, 0). □

With some more work, we can construct an example of the same type where the involved operators are even subdifferential operators:

Example 15.7 (the shadows may fail to converge to a normal solution). Suppose that U and V are affine subspaces of X such that U ∩ V = ∅, and set Ã = N_U^{−1} and B̃ = N_V^{−⊤}. Then Z = K = ∅,

Z_v = v + ((par U)^⊥ ∩ (par V)^⊥), K_v = −v + (U ∩ (v + V)), (15.16)

‖J_Ã T^n x‖ → +∞, and J_{Ã^{−1}} T^n x = P_U T^n x → P_{U∩(v+V)} x. Consequently, even though Z_v ≠ ∅, the sequence of primal shadows has no convergent subsequence.

Proof. Note that Ã^{−1} = N_U and B̃^{−⊤} = N_V; therefore, using Fact 5.2(ii), we have T_{(Ã,B̃)} = T_{(Ã^{−1},B̃^{−⊤})} = T_{U,V}. Clearly, K = (N_U + N_V)^{−1}(0) = U ∩ V = ∅, hence by (10.3) and Fact 5.2(ii), Z = ∅ and Fix T_{(Ã,B̃)} = ∅. Using Remark 11.12 and Proposition 15.3(vii)&(vi), we conclude that Z_v = v + ((par U)^⊥ ∩ (par V)^⊥) and K_v = −v + (U ∩ (v + V)). Let x ∈ X. It follows from Theorem 15.4, with A and B replaced by Ã^{−1} and B̃^{−⊤}, that J_{Ã^{−1}} T^n x = J_{N_U} T^n x = P_U T^n x → P_{U∩(v+V)} x. By Fact 8.1(i), ‖T^n x‖ → +∞. Moreover, the inverse resolvent identity (see (3.15)) implies J_Ã T^n x = (Id − P_U)T^n x = T^n x − P_U T^n x. Finally, ‖J_Ã T^n x‖ = ‖T^n x − P_U T^n x‖ ≥ ‖T^n x‖ − ‖P_U T^n x‖ → +∞. □
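The mechanism in Example 15.7 can be observed with the simplest possible data: two disjoint parallel lines in R^2. The sets and the starting point below are our illustrative choices, not data from the thesis. With U = R × {0} and V = R × {1}, one computes T(x_1, x_2) = (x_1, x_2 + 1), so the shadows P_U T^n x stay fixed at P_{U∩(v+V)} x (here v + V = U) while (Id − P_U)T^n x = J_Ã T^n x is unbounded.

```python
# Concrete sketch of the phenomenon in Example 15.7 (data chosen here):
# U = R x {0} and V = R x {1} are disjoint parallel lines in R^2.
import numpy as np

P_U = lambda x: np.array([x[0], 0.0])             # projection onto U
P_V = lambda x: np.array([x[0], 1.0])             # projection onto V
T   = lambda x: x - P_U(x) + P_V(2*P_U(x) - x)    # Douglas-Rachford operator

x = np.array([3.0, -2.0])
z = x.copy()
for n in range(50):
    z = T(z)

print(P_U(z))        # -> [3. 0.]    (constant shadow P_U T^n x)
print(z - T(z))      # -> [ 0. -1.]  (minimal displacement vector v)
print(np.linalg.norm(z - P_U(z)))   # -> 48.0  (J_A~ T^n x is unbounded)
```

Here v = P_{U−V} 0 = (0, −1), so v + V = U and the shadow is already the best approximation solution, even though the governing sequence T^n x drifts off to infinity along v.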
Example 14.15 provides another instance where the shadow sequence is unbounded with both operators being affine subdifferential operators.

Chapter 16

The Douglas–Rachford algorithm in the affine-convex case

16.1 Overview

In this chapter we provide convergence results when one constraint set is an affine subspace. As a consequence, we extend a result by Spingarn from halfspaces to general closed convex sets admitting least-squares solutions. Let x ∈ X and suppose that (A, B) = (N_U, N_V), where U and V are nonempty closed convex subsets of X. Our main results in this chapter can be summarized as follows:

• When U is a closed affine subspace, we provide a simpler proof that the shadow sequence (P_U T^n x)_{n∈N} will always converge to a normal solution (see Theorem 16.1). We tighten our conclusion by showing that the other shadow sequence (P_V T^n x)_{n∈N} may or may not be bounded (see Example 16.2).
• In contrast, when V is a closed affine subspace, we obtain the convergence of both shadow sequences (P_U T^n x)_{n∈N} and (P_V T^n x)_{n∈N} (see Theorem 16.5).
• As a consequence, we obtain a far-reaching refinement of Spingarn's splitting method introduced in [131] (see Corollary 16.6).

All the new results in this chapter are published in [36].

16.2 Convergence results

In this chapter we work under the assumptions that

U and V are nonempty closed convex subsets of X, (16.1)

that

A = N_U, B = N_V and v = P_{U−V} 0, (16.2)

and that

v = v_{(U,V)} = P_{U−V} 0 ∈ U − V. (16.3)

In view of (16.3) we have

U ∩ (v + V) ≠ ∅ and (−v + U) ∩ V ≠ ∅. (16.4)

For sufficient conditions on when v ∈ U − V (or, equivalently, on when the sets U ∩ (v + V) and (−v + U) ∩ V are nonempty), we refer the reader to [10, Facts 5.1].

Here is our first main result in this chapter.

Theorem 16.1 (convergence of the Douglas–Rachford algorithm when U is a closed affine subspace). Suppose that U is a closed affine subspace of X, and let x ∈ X. Then

(∀n ∈ N) P_U T^n x = P_U(T^n x + nv).
(16.5)

Moreover, the following hold:

(i) The shadow sequence (P_U T^n x)_{n∈N} converges weakly to some point in U ∩ (v + V).
(ii) No general conclusion can be drawn about the boundedness of the sequence (P_V T^n x)_{n∈N}.

Proof. After translating the sets U and V by a vector if necessary, we can and do assume that U is a closed linear subspace of X. Using Fact 14.3(ii) and Lemma 2.8, we learn that (∀n ∈ N) P_U T^n x = P_U(T^n x + nv). (i): Note that U ∩ (v + V) ⊆ U. Now combine Proposition 8.4(ii), Fact 14.6(iii)(b) and Lemma 6.4 with C = U ∩ (v + V) and (x_n)_{n∈N} replaced by (T^n x + nv)_{n∈N}. (ii): In fact, (P_V T^n x)_{n∈N} can be unbounded (see Example 16.2 below) or bounded (e.g., when U = V = X). □

Example 16.2. Suppose that X = R^2, that U = R × {0} and that V = epi(|·| + 1). Then U ∩ V = ∅, and for any starting point x ∈ [−1, 1] × {0} we have (∀n ∈ {1, 2, ...}) T^n x = (0, n) ∈ V and therefore ‖P_V T^n x‖ = ‖T^n x‖ = n → +∞.

Proof. Let x = (α, 0) with α ∈ [−1, 1]. We proceed by induction. When n = 1, we have T(α, 0) = P_{U^⊥}(α, 0) + P_V R_U(α, 0) = P_V(α, 0) = (0, 1). Now suppose that for some n ∈ {1, 2, ...} we have T^n x = (0, n). Then T^{n+1} x = T(0, n) = P_{U^⊥}(0, n) + P_V R_U(0, n) = (0, n) + P_V(0, −n) = (0, n + 1) ∈ V, and the conclusion follows. □

Figure 16.1: A GeoGebra [78] snapshot that illustrates Example 16.2.

Remark 16.3. We point out that even though the conclusion of Theorem 16.1(i) follows from Theorem 14.11, in Theorem 16.1(i) we provide an alternative, simpler proof that requires fewer tools than what we need to prove the general case presented in Theorem 14.11. The simpler proof as well as (16.5) are both products of the additional assumption that one set is an affine subspace.

It is tempting to conjecture that (16.5) holds true when U is just convex and not necessarily a subspace. We now show that, even though we still get the convergence of the shadow sequence (P_U T^n x)_{n∈N} in the light of Theorem 14.11, this conjecture fails in general.

Example 16.4.
Suppose that X = R, that U = [1, 2] and that V = {0}. Then v = 1 and U ∩ (v + V) = {1}. Let x = 4. We have (T^n x)_{n∈N} = (4, 2, 0, −1, −2, −3, ...), P_U T^n x → 1 ∈ U ∩ (v + V), and (∀n ∈ {2, 3, 4, ...}) T^n x + nv = −(n − 2) + n(1) = 2 ∈ U and P_U(T^n x + nv) = 2 ∈ U \ (U ∩ (v + V)). In the proof of Theorem 16.1(i) we had (P_U T^n x)_{n∈N} = (P_U(T^n x + nv))_{n∈N}, which is strikingly false here.

When V is an affine subspace, the convergence theory is even more satisfying:

Theorem 16.5 (convergence of the Douglas–Rachford algorithm when V is a closed affine subspace). Suppose that V is a closed affine subspace of X, and let x ∈ X. Then the following hold:

(i) The shadow sequence (P_U T^n x)_{n∈N} converges weakly to some point in U ∩ (v + V).
(ii) The sequence (P_V T^n x)_{n∈N} converges weakly to some point in (U − v) ∩ V.

Proof. (i): Apply Theorem 16.1(i) with (U, V) replaced by (V, U) and use Corollary 10.10. (ii): Combine (i) and Fact 14.7. □

16.3 Spingarn's method

In this section we discuss the problem of finding a least-squares solution of ⋂_{i=1}^M C_i, i.e., to

find a minimizer of ∑_{i=1}^M d²_{C_i}, (16.6)

where C_1, ..., C_M are nonempty closed convex (possibly nonintersecting) subsets of X with corresponding distance functions d_{C_1}, ..., d_{C_M}. Following Pierra [120], we now consider the product Hilbert space X = X^M with the inner product ((x_1, ..., x_M), (y_1, ..., y_M)) ↦ ∑_{i=1}^M ⟨x_i, y_i⟩. We set

U = {(x, ..., x) ∈ X | x ∈ X} and V = C_1 × ··· × C_M. (16.7)

Then the projections of x = (x_1, ..., x_M) ∈ X onto U and V are given by, respectively, P_U x = ((1/M)∑_{i=1}^M x_i, ..., (1/M)∑_{i=1}^M x_i) and P_V x = (P_{C_1} x_1, ..., P_{C_M} x_M). Now assume that

v = (v_1, ..., v_M) = P_{U−V} 0 ∈ U − V. (16.8)

Then we have U ∩ (v + V) ≠ ∅ and

(x, ..., x) ∈ U ∩ (v + V) ⇔ x ∈ ⋂_{i=1}^M (v_i + C_i). (16.9)

Using [10, Section 6], we see that the M-set problem (16.6) is equivalent to the two-set problem

find a least-squares solution of U ∩ V.
(16.10)

It follows from (16.8) and (16.9) that v is the unique vector in U − V that satisfies

(w_1, w_2, ..., w_M) ≠ (v_1, v_2, ..., v_M) and ⋂_{i=1}^M (C_i + w_i) ≠ ∅ ⇒ ∑_{i=1}^M ‖w_i‖² > ∑_{i=1}^M ‖v_i‖². (16.11)

We have the following result for the problem of finding a least-squares solution for the intersection of a finite family of sets.

Corollary 16.6. Suppose that C_1, ..., C_M are closed convex subsets of X. Let T = Id − P_U + P_V R_U, let x ∈ X and recall assumption (16.8). Then the shadow sequence (P_U T^n x)_{n∈N} converges to x̄ = (x̄, ..., x̄) ∈ U ∩ (v + V), where x̄ ∈ ⋂_{i=1}^M (C_i + v_i) and x̄ is a least-squares solution of (16.6).

Proof. Combine Theorem 16.1 with (16.11) and (16.9). □

In Figure 16.2 below we visualize Corollary 16.6 in the case when X = R^2 and M = 3.

Remark 16.7. When we particularize Corollary 16.6 from convex sets to halfspaces and X is finite-dimensional, we recover Spingarn's [131, Theorem 1]. Note that in this case, in view of [10, Facts 5.1(ii)], we have v ∈ U − V. Recall that Spingarn used the following version of his method of partial inverses from [129]:

(u_0, v_0) ∈ U × U^⊥ (16.12)

and

(∀n ∈ N)  u′_n = P_V(u_n + v_n),  v′_n = u_n + v_n − u′_n,  u_{n+1} = P_U u′_n,  v_{n+1} = v′_n − P_U v′_n. (16.13)

This method is the Douglas–Rachford algorithm in X, applied to U and V with starting point (u_0 − v_0) (see [37, Lemma 2.17]).

Figure 16.2: A GeoGebra [78] snapshot that illustrates Corollary 16.6. Three nonintersecting closed convex sets, C_1 (the blue triangle), C_2 (the red polygon) and C_3 (the green circle), are shown along with their translations forming the generalized intersection. The first few terms of the sequence (e(P_U T^n x))_{n∈N} (yellow points) are also depicted.
Here e : U → R^2 : (x, x, x) ↦ x.

Chapter 17

Conclusion

In this thesis we presented a novel analysis of the behaviour of the Douglas–Rachford algorithm when the underlying sum problem is possibly inconsistent, i.e., when the problem has no solution. Our results represent considerable progress towards a better understanding of the inconsistent case. A significant consequence of our new analysis was that we were able to provide a proof of weak convergence of the shadow sequence to a best approximation solution when the algorithm is specialized to solving convex feasibility problems.

The work in this thesis is based on the author's joint work that appears in the publications [13], [14], [15], [26], [31], [33], [36], [38] and [107] and the submitted manuscripts [39] and [110]. We start by summarizing the main contributions of this thesis.

17.1 Main results

We introduced the new concept of the normal problem and associated normal solutions for the sum of two maximally monotone operators. A nice feature of the theory is that it recovers, as a particular case, the notion of normal solution (obtained by the least-squares method) of linear systems. Another interesting and very useful feature is that the theory fits perfectly with Attouch–Théra duality, in the sense that the Douglas–Rachford splitting operators for the primal and dual problems coincide. To complete the theory, we were able to provide an elegant description of the range of the displacement mapping associated with the Douglas–Rachford operator. This result is of particular importance because it gives an explicit tool to calculate the minimal displacement vector that defines the normal problem.

As a natural follow-up, we looked at the algorithmic consequences of our new structure. We provided sufficient conditions for the convergence of the shadow sequence produced by the Douglas–Rachford operator in the possibly inconsistent case. When specialized to the convex feasibility setting, we proved the full weak convergence of the shadow sequence to a best approximation solution, which is a powerful result. Indeed, the shadow sequence does not require prior knowledge of the minimal displacement vector that defines the normal problem, hence of the normal solutions. The convergence analysis turned out to be even more satisfying when the underlying sets are affine subspaces. In this case we get strong convergence of the shadow sequence, with a linear rate (under mild assumptions), and the limit is identified as the closest best approximation solution to the starting point. We also provided an application to the parallel splitting method that deals with more than two sets.

17.2 Side results

We now turn to our new results that arose either as auxiliary tools or as byproducts of the analysis of the inconsistent case.

We systematically studied Attouch–Théra duality for the sum problem. We provided new results related to Passty's parallel sum, to Eckstein and Svaiter's extended solution set, and to Combettes' fixed point description of the set of primal solutions. Furthermore, we proved that paramonotonicity is a key property because it allows for the recovery of all primal solutions given just one arbitrary dual solution.

We gave a systematic study of nearly convex sets, provided further characterizations, and extended known results on calculus, relative interiors, and applications. Although nearly convex sets need not be convex, we showed that many results on convex sets do extend.

As noticed, even though the sum problem does not depend on the order of the operators, the corresponding Douglas–Rachford operator does. Another question we tackled was: what can we say about the two Douglas–Rachford operators that can solve the sum problem? We proved that the reflectors of the underlying operators act as bijections between the sets of fixed points of the two operators.
Moreover, in some special cases, we were able to relate the two sequences obtained by iterating the two operators.

To make further progress in our convergence analysis, we presented some new Fejér monotonicity principles, which serve as major building blocks in the convergence results.

We developed a comprehensive study of iterates of nonexpansive operators, their corresponding inner/outer shifts and the associated sets of fixed points. When the operators are affine, we obtained strikingly elegant as well as useful results. As a consequence, we were able to extend and refine some of the well-known results on linear and strong convergence from linear operators to affine operators.

Finally, we presented a new proof of the convergence of the shadow sequence produced by the Douglas–Rachford algorithm in the consistent case.

17.3 Future work

In the following we discuss some possible directions of future research:

• Suppose that T is nonexpansive, and let x and y be in X. We present a list of open problems that may be easier than the general question (8.46), namely: under which conditions on T must (T^n x − T^n y)_{n∈N} converge weakly?

P1: Suppose that X = R, v = 0 but Fix T = ∅. Is (T^n x − T^n y)_{n∈N} convergent?
P2: Suppose that X = R, v ≠ 0 but Fix(v + T) = ∅. Is (T^n x − T^n y)_{n∈N} convergent?
P3: Does Corollary 8.38 remain true if dim(X) ≥ 3? Example 8.39 suggests that the answer to this question might be positive.
P4: What can be said about (8.46) if we replace "weakly" by "strongly"?

• Note that the minimal displacement vector can be found by using Fact 8.2. Conceptually, we can thus first find v_{(A,B)} via the iterations in (8.2), and then proceed by iterating the operator x ↦ T(x + v_{(A,B)}) to find a normal solution. It would be desirable to devise an algorithm that approximates v_{(A,B)} and a corresponding normal solution (should it exist) simultaneously.
Proposition 11.29, which leads us to solving a quadratic optimization problem, suggests that this may indeed be possible in general.

• In view of Example 14.15, Remark 14.16 and Example 15.6, it is natural to ask: which conditions imposed on the operators A and B guarantee that the shadow sequence of the Douglas–Rachford algorithm converges to a normal solution, provided there is one?

• One natural extension of the work of this thesis is to examine whether or not it can be extended to the nonconvex feasibility setting, where the operators A and B are normal cones of two sets, each of which is a finite union of convex sets. Convergence analysis of this setting in the consistent case is indeed developed in [16].

• Regarding the inconsistent case, it is tempting to explore the behaviour of the alternating direction method of multipliers (ADMM) in this case.

Bibliography

[1] F. J. Aragón Artacho and J. M. Borwein. Global convergence of a non-convex Douglas–Rachford iteration. J. Global Optim., 57:753–769, 2013.

[2] F. J. Aragón Artacho, J. M. Borwein, and M. K. Tam. Recent results on Douglas–Rachford methods for combinatorial optimization problems. J. Optim. Theory Appl., 163:1–30, 2014.

[3] H. Attouch and M. Théra. A general duality principle for the sum of two operators. J. Convex Anal., 3(1):1–24, 1996.

[4] A. Auslender and M. Teboulle. Asymptotic Cones and Functions in Optimization and Variational Inequalities. Springer Monographs in Mathematics. Springer-Verlag, New York, 2003.

[5] J.-B. Baillon. Quelques propriétés de convergence asymptotique pour les contractions impaires. C. R. Acad. Sci. Paris Sér. A-B, 283(8):Aii, A587–A590, 1976.

[6] J. B. Baillon, R. E. Bruck, and S. Reich. On the asymptotic behavior of nonexpansive mappings and semigroups in Banach spaces. Houston J. Math., 4:1–9, 1978.

[7] H. H. Bauschke.
Projection Algorithms and Monotone Operators. Ph.D. thesis, Simon Fraser University, Canada, 1996. → pages 68

[8] H. H. Bauschke. A note on the paper by Eckstein and Svaiter on “General projective splitting methods for sums of maximal monotone operators”. SIAM J. Control Optim., 48:2513–2515, 2009. → pages 114

[9] H. H. Bauschke. New demiclosedness principles for (firmly) nonexpansive operators. In Computational and analytical mathematics, volume 50 of Springer Proc. Math. Stat., pages 19–28. Springer, New York, 2013. → pages 111

[10] H. H. Bauschke and J. M. Borwein. Dykstra’s alternating projection algorithm for two sets. J. Approx. Theory, 79:418–443, 1994. → pages 135, 136, 182, 185

[11] H. H. Bauschke and P. L. Combettes. A weak-to-strong convergence principle for Fejér-monotone methods in Hilbert spaces. Math. Oper. Res., 26:248–264, 2001. → pages 67, 112

[12] H. H. Bauschke and P. L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics / Ouvrages de Mathématiques de la SMC. Springer, New York, 2011. ISBN 978-1-4419-9466-0. With a foreword by Hédy Attouch. → pages 1, 3, 6, 8, 9, 10, 12, 13, 14, 16, 17, 18, 19, 34, 35, 37, 44, 45, 46, 56, 64, 68, 70, 77, 78, 81, 82, 85, 90, 95, 102, 103, 104, 107, 110, 111, 112, 113, 122, 123, 124, 136, 147, 149, 150, 157, 158, 169, 173

[13] H. H. Bauschke and W. M. Moursi. On the Douglas–Rachford algorithm. To appear in Math. Program. (Ser. A). → pages iii, 47, 67, 102, 161, 187

[14] H. H. Bauschke and W. M. Moursi. On the order of the operators in the Douglas–Rachford algorithm. Optim. Lett., 10:447–455, 2016. → pages iii, 115, 187

[15] H. H. Bauschke and W. M. Moursi. The Douglas–Rachford algorithm for two (not necessarily intersecting) affine subspaces. SIAM J. Optim., 26:968–985, 2016. → pages iii, 74, 79, 174, 187

[16] H. H. Bauschke and D. Noll. On the local convergence of the Douglas–Rachford algorithm. Arch. Math. (Basel), 102(6):589–600, 2014. → pages 190

[17] H. H. Bauschke, J. M.
Borwein, and A. S. Lewis. The method of cyclic projections for closed convex sets in Hilbert space. In Recent Developments in Optimization Theory and Nonlinear Analysis (Jerusalem, 1995), volume 204 of Contemp. Math., pages 1–38. Amer. Math. Soc., Providence, RI, 1997. → pages 77

[18] H. H. Bauschke, P. L. Combettes, and D. R. Luke. Phase retrieval, error reduction algorithm, and Fienup variants: a view from convex optimization. J. Opt. Soc. Amer. A, 19:1334–1345, 2002. → pages 32, 114

[19] H. H. Bauschke, F. Deutsch, H. Hundal, and S.-H. Park. Accelerating the convergence of the method of alternating projections. Trans. Amer. Math. Soc., 355:3433–3461, 2003. → pages 85, 90

[20] H. H. Bauschke, P. L. Combettes, and D. R. Luke. Finding best approximation pairs relative to two closed convex sets in Hilbert spaces. J. Approx. Theory, 127:178–192, 2004. → pages 54, 135, 159, 163, 164, 166, 175

[21] H. H. Bauschke, P. L. Combettes, and D. R. Luke. A strongly convergent reflection method for finding the projection onto the intersection of two closed convex sets in a Hilbert space. J. Approx. Theory, 141:63–69, 2006. → pages 7, 112

[22] H. H. Bauschke, J. M. Borwein, and X. Wang. Fitzpatrick functions and continuous linear monotone operators. SIAM J. Optim., 18(3):789–809, 2007. → pages 17

[23] H. H. Bauschke, X. Wang, and L. Yao. Monotone linear relations: maximality and Fitzpatrick functions. J. Convex Anal., 16:673–686, 2009. → pages 12, 38

[24] H. H. Bauschke, X. Wang, and L. Yao. Examples of discontinuous maximal monotone linear operators and the solution to a recent problem posed by B. F. Svaiter. J. Math. Anal. Appl., 370:224–241, 2010. → pages 38

[25] H. H. Bauschke, X. Wang, and L. Yao. On Borwein–Wiersma decompositions of monotone linear relations. SIAM J. Optim., 20(5):2636–2652, 2010. → pages 11, 60

[26] H. H. Bauschke, R. I. Boţ, W. L. Hare, and W. M. Moursi. Attouch–Théra duality revisited: paramonotonicity and operator splitting. J. Approx. Theory, 164:1065–1084, 2012.
→ pages iii, 29, 47, 48, 102, 187

[27] H. H. Bauschke, S. M. Moffat, and X. Wang. Firmly nonexpansive mappings and maximally monotone operators: correspondence and duality. Set-Valued Var. Anal., 20:131–153, 2012. → pages 59, 106, 117, 154, 160

[28] H. H. Bauschke, D. R. Luke, H. M. Phan, and X. Wang. Restricted normal cones and the method of alternating projections: applications. Set-Valued Var. Anal., 21(3):475–501, 2013. → pages 90

[29] H. H. Bauschke, S. M. Moffat, and X. Wang. Near equality, near convexity, sums of maximally monotone operators, and averages of firmly nonexpansive mappings. Math. Program. (Ser. B), 139:55–70, 2013. → pages 18, 19, 141, 142

[30] H. H. Bauschke, J. Y. Bello Cruz, T. T. A. Nghia, H. M. Phan, and X. Wang. The rate of linear convergence of the Douglas–Rachford algorithm for subspaces is the cosine of the Friedrichs angle. J. Approx. Theory, 185:63–79, 2014. → pages 61, 105, 109, 124, 174, 177, 178

[31] H. H. Bauschke, W. L. Hare, and W. M. Moursi. Generalized solutions for the sum of two maximally monotone operators. SIAM J. Control Optim., 52:1034–1047, 2014. → pages iii, 47, 74, 126, 187

[32] H. H. Bauschke, X. Wang, and L. Yao. Rectangularity and paramonotonicity of maximally monotone operators. Optimization, 63:487–504, 2014. → pages 17

[33] H. H. Bauschke, M. N. Dao, and W. M. Moursi. On Fejér monotone sequences and nonexpansive mappings. Linear and Nonlinear Analysis, 1:287–295, 2015. → pages iii, 67, 187

[34] H. H. Bauschke, D. Noll, and H. M. Phan. Linear and strong convergence of algorithms involving averaged nonexpansive operators. J. Math. Anal. Appl., 421:1–20, 2015. → pages 115, 123

[35] H. H. Bauschke, J. Y. Bello Cruz, T. T. A. Nghia, H. M. Phan, and X. Wang. Optimal rates of convergence of matrices with applications. Numer. Algor., 73:33–76, 2016. → pages 92

[36] H. H. Bauschke, M. N. Dao, and W. M. Moursi. The Douglas–Rachford algorithm in the affine-convex case. Oper. Res. Lett., 44:379–382, 2016.
→ pages iii, 67, 181, 187

[37] H. H. Bauschke, M. N. Dao, D. Noll, and H. M. Phan. On Slater’s condition and finite convergence of the Douglas–Rachford algorithm for solving convex feasibility problems in Euclidean spaces. J. Global Optim., 65(2):329–349, 2016. → pages 115, 123, 185

[38] H. H. Bauschke, W. L. Hare, and W. M. Moursi. On the range of the Douglas–Rachford operator. Math. Oper. Res., 41:884–879, 2016. → pages iii, 10, 126, 141, 155, 187

[39] H. H. Bauschke, B. Lukens, and W. M. Moursi. Affine nonexpansive operators, Attouch–Théra duality and the Douglas–Rachford algorithm. arXiv:1603.09418v1 [math.OC], 2016. → pages iii, 10, 29, 47, 74, 79, 102, 187

[40] H. H. Bauschke, J. Schaad, and X. Wang. On the Douglas–Rachford operators that fail to be proximal mappings. Math. Program. (Ser. B), DOI 10.1007/s10107-016-1076-5, 2016. → pages 65

[41] Å. Björck. Numerical Methods for Least Squares Problems. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1996. → pages 127

[42] J. M. Borwein. Fifty years of maximal monotonicity. Optim. Lett., 4:473–490, 2010. → pages 1

[43] J. M. Borwein and B. Sims. The Douglas–Rachford algorithm in the absence of convexity. In Fixed-point Algorithms for Inverse Problems in Science and Engineering, volume 49 of Springer Optim. Appl., pages 93–109. Springer, New York, 2011. → pages 4, 123

[44] J. M. Borwein and M. K. Tam. A cyclic Douglas–Rachford iteration scheme. J. Optim. Theory Appl., 160:1–29, 2014. → pages 124

[45] J. M. Borwein and M. K. Tam. The cyclic Douglas–Rachford method for inconsistent feasibility problems. J. Nonlinear Convex Anal., 16(4):573–584, 2015. → pages 4

[46] J. M. Borwein and J. D. Vanderwerff. Convex Functions: Constructions, Characterizations and Counterexamples, volume 109 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 2010. → pages 1

[47] R. I. Boţ, S.-M. Grad, and G. Wanka.
New regularity conditions for strong and total Fenchel–Lagrange duality in infinite dimensional spaces. Nonlinear Anal., 69:323–336, 2008. → pages 44

[48] R. I. Boţ, S.-M. Grad, and G. Wanka. On strong and total Lagrange duality for convex optimization problems. J. Math. Anal. Appl., 337:1315–1325, 2008. → pages 44

[49] R. I. Boţ, G. Kassay, and G. Wanka. Duality for almost convex optimization problems via the perturbation approach. J. Global Optim., 42:385–399, 2008. → pages 142

[50] H. Brézis. Opérateurs Maximaux Monotones et Semi-groupes de Contractions dans les Espaces de Hilbert. North-Holland Publishing Co., Amsterdam-London; American Elsevier Publishing Co., Inc., New York, 1973. North-Holland Mathematics Studies, No. 5. Notas de Matemática (50). → pages 1

[51] H. Brezis and A. Haraux. Image d’une somme d’opérateurs monotones et applications. Israel J. Math., 23:165–186, 1976. → pages 17, 19, 152

[52] L. M. Briceño-Arias and P. L. Combettes. A monotone + skew splitting model for composite monotone inclusions in duality. SIAM J. Optim., 21:1230–1250, 2011. → pages 30, 37

[53] R. E. Bruck and S. Reich. Nonexpansive projections and resolvents of accretive operators in Banach spaces. Houston J. Math., 3:459–470, 1977. → pages 76, 80, 90

[54] R. S. Burachik and A. N. Iusem. A generalized proximal point algorithm for the variational inequality problem in a Hilbert space. SIAM J. Optim., 8:197–216 (electronic), 1998. → pages 16

[55] R. S. Burachik and A. N. Iusem. Set-Valued Mappings and Enlargements of Monotone Operators, volume 8 of Springer Optimization and Its Applications. Springer, New York, 2008. → pages 1

[56] G. Carlier. Remarks on Toland’s duality, convexity constraint and optimal transport. Pac. J. Optim., 4:423–432, 2008. → pages 30

[57] Y. Censor, A. N. Iusem, and S. A. Zenios. An interior point method with Bregman functions for the variational inequality problem with paramonotone operators. Math. Program. (Ser. A), 81:373–400, 1998. → pages 16

[58] P.
Combettes. The convex feasibility problem in image recovery. Advances in Imaging and Electron Physics, 42:155–270, 1995. → pages 1, 4, 173

[59] P. L. Combettes. Fejér monotonicity in convex optimization. In Encyclopedia of Optimization, pages 1016–1024. Springer, New York, 2001. → pages 67

[60] P. L. Combettes. Quasi-Fejérian analysis of some optimization algorithms. In Inherently Parallel Algorithms in Feasibility and Optimization and their Applications (Haifa, 2000), volume 8 of Stud. Comput. Math., pages 115–152. North-Holland, Amsterdam, 2001. → pages 67

[61] P. L. Combettes. Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization, 53:475–504, 2004. → pages 48, 62, 96, 113

[62] P. L. Combettes. Iterative construction of the resolvent of a sum of maximal monotone operators. J. Convex Anal., 16:727–748, 2009. → pages 103, 109, 114

[63] P. L. Combettes and J. Eckstein. Asynchronous block-iterative primal-dual decomposition methods for monotone inclusions. Math. Program. (Ser. B), 2016. doi: 10.1007/s10107-016-1044-0. → pages 29

[64] P. L. Combettes and J.-C. Pesquet. A Douglas–Rachford splitting approach to nonsmooth convex variational signal recovery. IEEE J. Sel. Topics Signal Process., 1:564–574, 2007. → pages 1

[65] P. L. Combettes and J.-C. Pesquet. A proximal decomposition method for solving convex variational inverse problems. Inverse Problems, 24:065014, 27, 2008. → pages 1

[66] P. L. Combettes and J.-C. Pesquet. Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators. Set-Valued Var. Anal., 20:307–330, 2012. → pages 30, 37, 112, 144

[67] P. L. Combettes and J.-C. Pesquet. Stochastic quasi-Fejér block-coordinate fixed point iterations with random sweeping. SIAM J. Optim., 25:1221–1248, 2015. → pages 1

[68] L. Demanet and X. Zhang. Eventual linear convergence of the Douglas–Rachford iteration for basis pursuit. Math. Comp., 85:209–238, 2016.
→ pages 173

[69] J. Douglas, Jr. On the numerical integration of ∂²u/∂x² + ∂²u/∂y² = ∂u/∂t by implicit methods. J. Soc. Indust. Appl. Math., 3:42–65, 1955. → pages 62

[70] J. Douglas, Jr. and H. H. Rachford, Jr. On the numerical solution of heat conduction problems in two and three space variables. Transactions of the American Mathematical Society, 82:421–439, 1956. → pages 1, 62, 63, 106, 113

[71] J. Eckstein. Splitting Methods for Monotone Operators with Applications to Parallel Optimization. Ph.D. thesis, Massachusetts Institute of Technology, USA, 1989. → pages 48, 49, 52, 65

[72] J. Eckstein and D. P. Bertsekas. On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. (Ser. A), 55:293–318, 1992. → pages 13, 48, 113, 165

[73] J. Eckstein and M. C. Ferris. Smooth methods of multipliers for complementarity problems. Math. Program. (Ser. A), 86:65–90, 1999. → pages 112, 144

[74] J. Eckstein and B. F. Svaiter. A family of projective splitting methods for the sum of two maximal monotone operators. Math. Program., 111:173–199, 2008. → pages 29, 34, 37, 54, 114, 116

[75] J. Eckstein and B. F. Svaiter. General projective splitting methods for sums of maximal monotone operators. SIAM J. Control Optim., 48:787–811, 2009. → pages 34, 37

[76] V. Elser and I. Rankenburg. Deconstructing the energy landscape: constraint-based algorithm for folding heteropolymers. Physical Review E, 73:026702, 2006. → pages 4

[77] V. Elser, I. Rankenburg, and P. Thibault. Searching with iterated maps. Proc. Natl. Acad. Sci. USA, 104(2):418–423 (electronic), 2007. → pages 4

[78] GeoGebra software. https://www.geogebra.org. → pages viii, ix, 100, 121, 136, 137, 151, 164, 167, 178, 183, 186

[79] R. Gerchberg. Super-resolution through error energy reduction. Optica Acta, 21:709–720, 1974. → pages 173

[80] R. Glowinski.
Variational Methods for the Numerical Solution of Nonlinear Elliptic Problems, volume 86 of CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2015. → pages 20

[81] K. Goebel and W. A. Kirk. Topics in Metric Fixed Point Theory, volume 28 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1990. → pages 10, 11

[82] K. Goebel and S. Reich. Uniform Convexity, Hyperbolic Geometry, and Nonexpansive Mappings, volume 83 of Monographs and Textbooks in Pure and Applied Mathematics. Marcel Dekker, Inc., New York, 1984. → pages 10, 11

[83] J.-P. Gossez. Opérateurs monotones non linéaires dans les espaces de Banach non réflexifs. Journal of Mathematical Analysis and Applications, 34:371–395, 1971. → pages 12

[84] N. Hadjisavvas and S. Schaible. On a generalization of paramonotone maps and its application to solving the Stampacchia variational inequality. Optimization, 55:593–604, 2006. → pages 16

[85] B. Halpern. Fixed points of nonexpanding maps. Bull. Amer. Math. Soc., 73:957–961, 1967. → pages 111

[86] Y. Haugazeau. Sur les Inéquations Variationnelles et la Minimisation de Fonctionnelles Convexes. Thèse, Université de Paris, 1968. → pages 112

[87] R. Hesse and D. R. Luke. Nonconvex notions of regularity and convergence of fundamental algorithms for feasibility problems. SIAM J. Optim., 23:2397–2419, 2013. → pages 4, 115, 123, 173

[88] R. Hesse, D. R. Luke, and P. Neumann. Alternating projections and Douglas–Rachford for sparse affine feasibility. IEEE Trans. Signal Process., 62(18):4868–4881, 2014. → pages 4, 115, 123, 173

[89] A. N. Iusem. On some properties of paramonotone operators. J. Convex Anal., 5:269–278, 1998. → pages 16, 17

[90] K. Knopp. Infinite Sequences and Series. Dover Publications, Inc., New York, 1956. → pages 95

[91] M. A. Krasnosel’skiĭ. Two remarks on the method of successive approximations. Uspehi Mat. Nauk (N.S.), 10(1(63)):123–127, 1955.
→ pages 102

[92] E. Kreyszig. Introductory Functional Analysis with Applications. John Wiley & Sons, New York-London-Sydney, 1978. → pages 91

[93] P. Lancaster and M. Tismenetsky. The Theory of Matrices. Computer Science and Applied Mathematics. Academic Press, Inc., Orlando, FL, second edition, 1985. → pages 24, 25, 26

[94] G. Li and T. K. Pong. Douglas–Rachford splitting for nonconvex optimization with application to nonconvex feasibility problems. Math. Program. (Ser. A), 159:371–401, 2016. → pages 4

[95] J. Lieutaud. Approximations d’opérateurs monotones par des méthodes de splitting. In Theory and Applications of Monotone Operators (Proc. NATO Advanced Study Inst., Venice, 1968), pages 259–264. Edizioni “Oderisi”, Gubbio, 1969. → pages 1, 113

[96] P.-L. Lions and B. Mercier. Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal., 16:964–979, 1979. → pages 1, 47, 48, 62, 101, 103, 105, 113, 114

[97] J. Luque. Convolutions of Maximal Monotone Mappings. Technical report LIDS-P-1597. MIT Libraries, Cambridge, MA, 1986. URL http://mit.dspace.org/handle/1721.1/2953. → pages 34

[98] W. R. Mann. Mean value methods in iteration. Proc. Amer. Math. Soc., 4:506–510, 1953. → pages 102

[99] Maple software. www.maplesoft.com. → pages 90

[100] J.-E. Martínez-Legaz. Some generalizations of Rockafellar’s surjectivity theorem. Pac. J. Optim., 4:527–535, 2008. → pages 36

[101] B. Mercier. Inéquations Variationnelles de la Mécanique, volume 1 of Publications Mathématiques d’Orsay 80 [Mathematical Publications of Orsay 80]. Université de Paris-Sud, Département de Mathématique, Orsay, 1980. → pages 30, 31

[102] C. Meyer. Matrix Analysis and Applied Linear Algebra. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2000. With 1 CD-ROM (Windows, Macintosh and UNIX) and a solutions manual (iv+171 pp.). → pages 20, 21, 23, 24, 25, 63, 127

[103] W. E. Milne. Numerical Solution of Differential Equations.
John Wiley & Sons, Inc., New York; Chapman & Hall, Limited, London, 1953. → pages 62

[104] J. Milnor. Dynamics in One Complex Variable, volume 160 of Annals of Mathematics Studies. Princeton University Press, Princeton, NJ, third edition, 2006. → pages 2

[105] G. J. Minty. Monotone (nonlinear) operators in Hilbert space. Duke Math. J., 29:341–346, 1962. → pages 14, 15

[106] S. Moffat. The Resolvent Average: An Expansive Analysis of Firmly Nonexpansive Mappings and Maximally Monotone Operators. Ph.D. thesis, University of British Columbia, Canada, 2014. → pages 142

[107] S. Moffat, W. M. Moursi, and X. Wang. Nearly convex sets: fine properties and domains or ranges of subdifferentials of convex functions. Math. Program. (Ser. A), 160:193–223, 2016. → pages iii, 142, 187

[108] U. Mosco. Dual variational inequalities. J. Math. Anal. Appl., 40:202–206, 1972. → pages 30

[109] A. Moudafi. On the stability of the parallel sum of maximal monotone operators. J. Math. Anal. Appl., 199:478–488, 1996. → pages 35

[110] W. M. Moursi. The forward–backward algorithm and the normal problem. arXiv:1608.02240 [math.OC], 2016. → pages iii, 4, 79, 187

[111] J. Nocedal and S. J. Wright. Numerical Optimization. Springer Series in Operations Research. Springer-Verlag, New York, 1999. → pages 90

[112] N. Ogura and I. Yamada. Nonstrictly convex minimization over the bounded fixed point set of a nonexpansive mapping. Numer. Funct. Anal. Optim., 24:129–135, 2003. → pages 16

[113] A. M. Ostrowski. Solution of Equations in Euclidean and Banach Spaces. Academic Press [A Subsidiary of Harcourt Brace Jovanovich, Publishers], New York-London, 1973. Third edition of Solution of Equations and Systems of Equations, Pure and Applied Mathematics, Vol. 9. → pages 67, 71

[114] A. Papoulis. A new algorithm in spectral analysis and band-limited extrapolation. IEEE Trans. Circuits and Systems, CAS-22:735–742, 1975. → pages 173

[115] G. B. Passty.
The parallel sum of nonlinear monotone operators. Nonlinear Anal., 10:215–227, 1986. → pages 33, 34, 35, 36

[116] A. Pazy. Asymptotic behavior of contractions in Hilbert space. Israel J. Math., 9:235–240, 1971. → pages 76, 80, 170

[117] D. W. Peaceman and H. H. Rachford, Jr. The numerical solution of parabolic and elliptic differential equations. J. Soc. Indust. Appl. Math., 3:28–41, 1955. → pages 62

[118] T. Pennanen. Dualization of generalized equations of maximal monotone type. SIAM J. Optim., 10:809–835, 2000. → pages 112, 113, 144

[119] H. M. Phan. Linear convergence of the Douglas–Rachford method for two closed sets. Optimization, 65:369–385, 2016. → pages 123

[120] G. Pierra. Decomposition through formalization in a product space. Math. Program., 28:96–115, 1984. → pages 184

[121] S. M. Robinson. Composition duality and maximal monotonicity. Math. Program. (Ser. A), 85:1–13, 1999. → pages 112, 144

[122] R. T. Rockafellar. On the maximal monotonicity of subdifferential mappings. Pacific J. Math., 33:209–216, 1970. → pages 12

[123] R. T. Rockafellar. Convex Analysis. Princeton Mathematical Series, No. 28. Princeton University Press, Princeton, N.J., 1970. → pages 1, 9, 19, 85, 142

[124] R. T. Rockafellar. Monotone operators and the proximal point algorithm. SIAM J. Control Optimization, 14:877–898, 1976. → pages 14, 65

[125] R. T. Rockafellar and R. J.-B. Wets. Variational Analysis, volume 317 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 1998. → pages 1, 13, 14, 18, 19

[126] J. Schaad. Modeling the 8-Queens Problem and Sudoku using an Algorithm based on Projections onto Nonconvex Sets. Masters thesis, University of British Columbia, Canada, 2010. → pages 65

[127] S. Simons. Minimax and Monotonicity, volume 1693 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 1998. → pages 1

[128] S. Simons. From Hahn-Banach to Monotonicity, volume 1693 of Lecture Notes in Mathematics.
Springer, New York, second edition, 2008. → pages 1

[129] J. E. Spingarn. Partial inverse of a monotone operator. Appl. Math. Optim., 10:247–265, 1983. → pages 185

[130] J. E. Spingarn. A primal-dual projection method for solving systems of linear inequalities. Linear Algebra Appl., 65:45–62, 1985. → pages 123

[131] J. E. Spingarn. A projection method for least-squares solutions to overdetermined systems of linear inequalities. Linear Algebra Appl., 86:211–236, 1987. → pages 181, 185

[132] B. F. Svaiter. On weak convergence of the Douglas–Rachford method. SIAM J. Control Optim., 49:280–287, 2011. → pages 2, 101, 103, 105, 111, 114

[133] V. Thomée. Finite difference methods for linear parabolic equations. In Handbook of Numerical Analysis, Vol. I, Handb. Numer. Anal., I, pages 5–196. North-Holland, Amsterdam, 1990. → pages 20

[134] T. Torii. Inversion of tridiagonal matrices and the stability of tridiagonal systems of linear equations. Information Processing in Japan, 6:41–46, 1966. → pages 23

[135] R. Wittmann. Approximation of fixed points of nonexpansive mappings. Arch. Math. (Basel), 58:486–491, 1992. → pages 111

[136] T. Yamamoto and Y. Ikebe. Inversion of band matrices. Linear Algebra Appl., 24:105–111, 1979. → pages 23

[137] C. Zălinescu. Convex Analysis in General Vector Spaces. World Scientific Publishing Co., Inc., River Edge, NJ, 2002. → pages 1, 8

[138] C. Zălinescu. A new convexity property for monotone operators. J. Convex Anal., 13:883–887, 2006. → pages 36

[139] E. H. Zarantonello. Projections on convex sets in Hilbert space and spectral theory. I. Projections on convex sets. In Contributions to Nonlinear Functional Analysis (Proc. Sympos., Math. Res. Center, Univ. Wisconsin, Madison, Wis., 1971), pages 237–341. Academic Press, New York, 1971. → pages 7, 113, 119

[140] E. Zeidler. Nonlinear Functional Analysis and its Applications. I. Springer-Verlag, New York, 1986. Fixed-point theorems, translated from the German by Peter R. Wadsack. → pages 1

[141] E. Zeidler.
Nonlinear Functional Analysis and its Applications. II/A. Springer-Verlag, New York, 1990. Linear monotone operators, translated from the German by the author and Leo F. Boron. → pages 1

[142] E. Zeidler. Nonlinear Functional Analysis and its Applications. II/B. Springer-Verlag, New York, 1990. Nonlinear monotone operators, translated from the German by the author and Leo F. Boron. → pages 1

Index

3∗ monotone, 17
R-linear convergence, 90
affine hull, 5
affine relation, 11
asymptotically regular operator, 89
asymptotically regular sequence, 72
Attouch–Théra dual problem, 30
averaged operator, 95
best approximation pair, 162
Brezis–Haraux Theorem, 19
cone, 5
conic hull, 5
conjugate function, 9
convex feasibility, 162
convex function, 7
convex hull, 5
convex set, 5
discrete Laplacian, 63
domain of an operator, 11
Douglas–Rachford splitting operator, 47
dual cone, 5
dual normal solutions, 133
dual pair, 30
dual solutions, 30
eigenvalues, 24
epigraph of a function, 7
Fejér monotone, 67
Fejér monotone sequence, 67
Fenchel conjugate, 9
Fermat’s rule, 9
firmly nonexpansive operator, 10
fixed points, 13
governing sequence, 101
graph of an operator, 11
Halpern-type algorithm, 111
Haugazeau-type algorithm, 111
Hilbert space, 5
inconsistent feasibility problem, 164
indicator function, 8
inner perturbation, 128
inner shift, 74
inverse of an operator, 11
inverse resolvent identity, 14
Krasnosel’skiĭ–Mann iteration, 102
Kronecker product of matrices, 24
Laplace operator, 63
least squares solutions, 127
linear convergence, 93
linear relation, 11
lower semicontinuous function, 7
maximally monotone operator, 12
minimal displacement vector, 76, 133, 156
minimizer of a function, 8
Minty parametrization, 15
monotone operator, 12
nearly convex, 18
nearly equal, 18
nonexpansive operator, 10
normal cone, 5
normal problem, 126, 133
normal solutions, 133
operator norm, 89
Ostrowski’s Theorem, 67
outer perturbation, 128
outer shift, 74
parallel space, 6
parallel splitting, 110
parallel sum, 34
paramonotone operator, 16
pointwise convergence of
linear operators, 90
polar cone, 5
positive semidefinite, 20
primal normal solutions, 133
primal problem, 30
primal solutions, 30
projection operator, 6
proper function, 7
proximal mapping, 8
range of an operator, 11
recession cone, 5
rectangular, 17
reflected resolvent, 13
relative interior, 5
resolvent, 13
root linear convergence, 90
shadow sequence, 101
shift operator, 128
solution mappings, 34
Spingarn’s method, 184
strictly monotone operator, 12
strong relative interior, 5
subdifferential operator, 8
symmetric matrix, 24
Toeplitz matrix, 21
total duality, 45
tridiagonal matrix, 21
uniform convergence of linear operators, 90
uniformly monotone operator, 12
zeros of a set-valued operator, 11
