UBC Theses and Dissertations


On monotone operator classes and the Borwein-Wiersma decomposition : with demonstrations using low dimensional… Edwards, Mclean Robert 2013



Full Text

On Monotone Operator Classes and the Borwein-Wiersma Decomposition
With demonstrations using low dimensional examples and the construction of decompositions.

by Mclean Robert Edwards
B.Sc., The University of Guelph, 2004
M.Sc., The University of Guelph, 2005

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in The Faculty of Graduate Studies (Mathematics)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)
April 2013
© Mclean Robert Edwards 2013

Abstract

In Hilbert spaces, five classes of monotone operators of relevance to the theory of monotone operators, variational inequality problems, equilibrium problems, and differential inclusions are investigated. These are the classes of paramonotone, strictly monotone, 3-cyclic monotone, 3∗-monotone (or rectangular, or ∗-monotone), and maximal monotone operators. Examples of simple operators with all possible combinations of class inclusion are given, which together with some additional results lead to an exhaustive knowledge of monotone class relationships for linear operators, linear relations, and for monotone operators in general. Many of the example operators considered are the sum of a subdifferential with a skew linear operator (and so are Borwein-Wiersma decomposable). Since the Borwein-Wiersma decompositions of a single operator are not unique, clean, essential, extended, and standardized decompositions are defined and the theory developed. In particular, every Borwein-Wiersma decomposable operator has an essential decomposition, and many sufficient conditions are given for the existence of a clean decomposition. Various constructive methods are demonstrated which, taken together, are able to obtain a decomposition of any Borwein-Wiersma decomposable operator, as long as the operator has a starshaped domain. These methods are more accurate if a clean decomposition exists.
The techniques used apply a variant of Fitzpatrick’s last function, the theory of which is developed here: this function is shown to consist of a Riemann integration and to be equivalent to Rockafellar’s antiderivative when applied to subdifferentials. Furthermore, a different saddle function representation for monotone operators is created using this function, which has theoretical and numerical advantages over more classical representations.

Preface

The research described herein is the original research of Mclean Robert Edwards. Much of the work on linear relations and linear operators in Chapter 4 is to appear with the title “Five classes of monotone linear relations and operators” in D. Bailey, H.H. Bauschke, P. Borwein, F. Garvan, M. Thera, J. Vanderwerff and H. Wolkowicz (editors), Computational and Analytical Mathematics, Springer Proceedings in Mathematics & Statistics, to appear. Most of the nonlinear examples in Chapter 5 have been submitted in a paper with the prepublication title of “Five classes of monotone operators with examples”. Parts of the introduction and many of the results in Chapter 3 also appear in these papers.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Symbols
Acknowledgements
Dedication
1 Introduction
2 Preliminaries
  2.1 Definitions
  2.2 Basic facts
3 Five Classes of Monotone Operators
  3.1 General relationships between monotone classes
  3.2 Properties of monotone operators
  3.3 Monotone operators on product spaces
  3.4 Further discussion of monotone properties
    3.4.1 Convex preimages
    3.4.2 Paramonotonicity
    3.4.3 3∗-monotonicity
  3.5 Single valued selections of monotone operators
  3.6 Sums of monotone operators
  3.7 Operators in one dimension (X = R)
  3.8 Spherically symmetric monotone operators
4 Monotone Classes of Linear Relations and Operators
  4.1 3∗-monotone linear relations and angle-boundedness
  4.2 Monotone linear operators on R^n
  4.3 Monotone linear operators on R^2
5 Examples of Monotone Operators
  5.1 Summary and comparison
6 A New Saddle Function Representation
  6.1 Krauss’ saddle functions
  6.2 A new utility function: M_T
    6.2.1 Introducing the operator M_T
    6.2.2 History of M_T - Fitzpatrick’s last function
    6.2.3 Linear relations
    6.2.4 Relationship of M_T to Rockafellar’s antiderivative
  6.3 The saddle function W_T
    6.3.1 Applications to equilibrium problems
7 Borwein-Wiersma Decompositions
  7.1 Starshaped sets and their properties
  7.2 Notions of relative interior and convexity
  7.3 Borwein-Wiersma decomposable monotone operators
  7.4 Standard and extended Borwein-Wiersma decompositions
  7.5 Further properties of Borwein-Wiersma decompositions
  7.6 Algebraic decompositions
  7.7 Decompositions using W_T
  7.8 General decompositions
  7.9 Decomposing linear relations
8 Conclusion and Future Work
  8.1 Conclusion
  8.2 Future work
Bibliography

List of Tables

5.1 Monotone class relationships
5.2 Monotone linear operators on R^2: monotone class relationships
5.3 Monotone linear operators: monotone class relationships

List of Figures

5.1 Visualization of Id
5.2 Visualization of Example 5.0.13
5.3 Visualization of Example 5.0.15
5.4 Visualization of Example 5.0.16
5.5 Comparing T_S with Q + T_S in Example 5.0.16
5.6 Visualization of Example 5.0.17
5.7 Visualization of Example 5.0.18
5.8 Visualization of Example 5.0.20
5.9 Visualization of Example 5.0.22
5.10 Visualization of Example 5.0.25
5.11 General monotone operators: monotone class relationships
5.12 Monotone linear operators: monotone class relationships
5.13 Monotone linear operators on R^n: monotone class relationships
5.14 Monotone linear operators on R^2: monotone class relationships
5.15 Visualization of maximal monotone operators
5.16 Visualization of monotone operators
5.17 Polar visualization of maximal monotone operators
5.18 Polar visualization of monotone operators
6.1 The saddle function H_∂f
6.2 The Krauss saddle function L_∂f
6.3 The saddle function W_∂f
7.1 The domain of a monotone operator where 0 ∉ Aff dom T

List of Symbols

X          A Hilbert space
R^n        The n-dimensional Euclidean space
R          The set of real numbers
Z          The set of integers
N          The natural numbers, N = {i : i ≥ 1, i ∈ Z}
ℓ2         The space of square-summable infinite sequences with index N
ℓ2(D)      The space of square-summable infinite sequences with index D
∅          The empty set
∂f         The subdifferential of a convex function f
∂̂h         The subdifferential of a nonconvex function h
P_V x      The projection of a point x onto a linear subspace V
P_V D      The projection of a set D onto a linear subspace V
cone D     The cone of D
core D     The core of D
int D      The interior of D
conv D     The convex hull of D
ri C       The relative interior of a convex set C
eri D      The encompassing relative interior of D (eri D = ri conv D)
D̄          The topological closure of a set D
span D     The linear span of a set D
Aff D      The affine hull of a set D
D^⊥        The set perpendicular to D, {y ∈ X : ⟨x, y⟩ = 0 ∀ x ∈ D}
N_C        The normal cone of a convex set C
cl f       The lower semicontinuous hull of a function f
dom A      The domain of an operator A
ran A      The range of an operator A
dom f      The domain of a function f
gra A      The graph of an operator A
f|_D       The function f with domain restricted to D
T|_D       The operator T with domain restricted to D
T^{-1}     The inverse mapping of an operator T
B_ε(x)     The open ε-ball about the point x
F_L        Fitzpatrick’s last function
L_T        The Krauss saddle function calculated from T
[x, y]     The set {λx + (1 − λ)y : 0 ≤ λ ≤ 1, λ ∈ R}
]x, y[     The set {λx + (1 − λ)y : 0 < λ < 1, λ ∈ R}
[x, y[     The set {λx + (1 − λ)y : 0 ≤ λ < 1, λ ∈ R}
]x, y]     The set {λx + (1 − λ)y : 0 < λ ≤ 1, λ ∈ R}
⟨x, y⟩     The inner product of x and y
‖x‖        The norm of x
Id         The identity map
ι_D        The indicator function of a set D
A + B      The set {x + y : x ∈ A, y ∈ B}
A − B      The set {x − y : x ∈ A, y ∈ B}
A∗         The adjoint of a linear operator or relation A
A+         The symmetric part of A (A+ = ½(A + A∗))
A∼         The skew part of A (A∼ = ½(A − A∗))

Acknowledgements

Many thanks are due to Philip Loewen for his support, feedback, and guidance in the creation of this thesis. Feedback from Heinz Bauschke and Michael Friedlander has also been very valuable in improving the quality of this research. The love and understanding of so many friends and family have been invaluable in providing the life support necessary to complete such a work as this. Of particular mention is my partner in life, Dawn Mair, who has been by my side throughout, and has made it possible to overcome many trials and setbacks over the years.

For my grandmother, who left before this work was completed.

Chapter 1

Introduction

In this thesis, we first examine the theory of monotone operators, specifically the relationships between different classes of monotonicity, with attention to linear relations as well as to general operators. In so doing, a complete list of examples demonstrating all possible combinations of monotone class is created. We then move on to the theory of saddle function representations for monotone operators, and expand on the theory of Fitzpatrick’s last function to create a new saddle function representation. Then, we turn our attention to Borwein-Wiersma decomposable operators, analyzing various properties of these operators and later demonstrating under which conditions they may be split into the sum of a subdifferential and a skew linear relation.
These constructive decompositions are made possible by using a variant of Fitzpatrick’s last function and the theory we have developed. After listing the main results in point form, we go over some of the background of various monotone classes before outlining the structure and relevance of this thesis.

Using notation not yet defined, the main results of this thesis are:

(i) the zoo of examples and results listed in Tables 5.1 and 5.3, which yields a full understanding of all possible monotone class relationships for
    (a) linear operators (Figure 5.12)
    (b) linear operators on R^n (Figure 5.13)
    (c) linear operators on R^2 (Figure 5.14)
    (d) monotone operators (Figure 5.11)

(ii) a new saddle function representation for monotone operators which
    (a) is simpler to calculate (Riemann integration along a line vs. supremum over a domain); see for instance Example 6.3.8
    (b) can be applied to any Borwein-Wiersma decomposable monotone operator with convex domain (Corollary 7.7.2)
    (c) yields the canonical saddle function representation (6.1.11) when applied to subdifferentials with convex domain (Corollary 6.2.25 and Corollary 7.7.2)

(iii) various results on the properties of Borwein-Wiersma decompositions T = ∂f + A, including
    (a) introducing notions of clean (dom ∂f = dom T) and essential (span dom ∂f ⊃ dom A) decompositions, where
        i. T = ∂f + A|_{span dom ∂f} is always an essential decomposition (Proposition 7.3.14)
        ii. not every Borwein-Wiersma decomposition is clean (Example 7.4.12), although
        iii. T = ∂f + A is a clean decomposition if T is maximal monotone and Aff dom T and Aff dom ∂f are closed (Proposition 7.8.1)
    (b) if a Borwein-Wiersma decomposition for an operator T : X → 2^X exists, there is one such decomposition T = ∂f + A where 0 ∈ Az if (by Theorem 7.3.17)
        i. T is maximal monotone
        ii. T has at least one clean decomposition
        iii. 0 ∈ Aff dom T
        iv. T : R^n → 2^{R^n}

(iv) a method to fully obtain a Borwein-Wiersma decomposition, given for any Borwein-Wiersma decomposable operator T such that
    (a) dom T is convex and there exists a clean decomposition of T (Theorem 7.7.13)
    (b) dom T = X (Theorem 7.8.4)
    (c) 0 ∈ int dom T, an extension of a result in [19] (Corollary 7.8.10)
    (d) dom T is starshaped, T is single-valued and has a clean decomposition (Theorem 7.8.12)
    (e) T is a linear relation and a clean decomposition exists, an extension of a result in [11] (Theorem 7.9.1)

(v) various inexact methods to obtain Borwein-Wiersma decompositions as long as dom T is starshaped, with direct reconstruction approaches for both the subdifferential term and the skew linear relation term

(vi) the ability to calculate, for some fixed z ∈ dom T, f(y) − f(x) and ⟨y, Ax⟩ for any x, y ∈ dom T for some fixed Borwein-Wiersma decomposition T = ∂f + A, as long as the line segments [x, z] and [y, z] are both in dom T (Theorem 7.8.15)

Monotone operators arise as a generalization of subdifferentials of convex functions, and are used extensively in variational inequality and equilibrium theory. Variational inequality and equilibrium problems provide a unified framework for constrained optimization, saddle point, Nash equilibrium, traffic equilibrium, frictional contact, and complementarity problems [38] [39]. Monotone operators are also important for the theory of partial differential equations, where monotonicity both characterizes the vector fields of self-dual Lagrangians [40] and is crucial for the determination of equilibrium solutions (using a variational inequality) for elliptic and evolution differential equations and inclusions (see for instance [2]).

Over the years, various classes of monotone operators have been introduced for a variety of theoretical and practical reasons. However, there have been few attempts to comprehensively compare them.
In the first part of this thesis, we compare five special classes of monotone operators: those that are paramonotone, strictly monotone, 3-cyclic monotone, 3∗-monotone, and maximal monotone. The role of each of these classes in the theory and application of monotone operators is briefly described below. A more detailed discussion of the role of paramonotonicity and 3∗-monotonicity follows in Section 3.4, after we have established some basic facts.

Definition 1.0.1 (Five classes of monotone operator) An operator T : X → 2^X is said to be [Class] (with abbreviation [Code]) if and only if T is monotone and for every (x, x∗), (y, y∗), (z, z∗) in gra T one has [Condition].

Code   Class                Condition
       monotone             ⟨x − y, x∗ − y∗⟩ ≥ 0
PM     paramonotone         ⟨x − y, x∗ − y∗⟩ = 0 ⇒ (x, y∗), (y, x∗) ∈ gra T
SM     strictly monotone    ⟨x − y, x∗ − y∗⟩ = 0 ⇒ x = y
3CM    3-cyclic monotone    ⟨x − y, x∗⟩ + ⟨y − z, y∗⟩ + ⟨z − x, z∗⟩ ≥ 0
3*     3∗-monotone          sup_{(a,a∗) ∈ gra T} ⟨z − a, a∗ − x∗⟩ < +∞
MM     maximal monotone     (∀a ∈ X)(∀a∗ ∈ T a) ⟨x − a, x∗ − a∗⟩ ≥ 0 ⇒ (x, x∗) ∈ gra T

The order above, PM-SM-3CM-3*-MM, is fixed to allow a 5-digit binary label of the classes to which an operator belongs. For instance, an operator with the label 10111 is paramonotone, not strictly monotone, 3-cyclic monotone, 3∗-monotone, and maximal monotone. It will be assumed that the label 00000 still implies that the operator is monotone.

Paramonotonicity first arose as a weakening of sufficient conditions for the iterative solution of variational inequality problems, and continues to be used in this context [25]. As such, a number of modern iterative methods for solving variational inequality problems rely on paramonotonicity to ensure convergence [3] [27] [28]. For more on the theory of paramonotone operators and why this condition is important for variational inequality problems, see [48] and [78].

Strict monotonicity is a stronger condition than paramonotonicity.
Strict monotonicity guarantees the uniqueness of a solution to the variational inequality problem (see for instance [38]). These operators are somewhat analogous to the subdifferentials of strictly convex functions.

3-cyclic monotone operators serve to represent a special case of n-cyclic monotone operators. Cyclic monotonicity both characterizes subdifferentials as monotone operators [65], and has been crucial in demonstrating that monotone operators can be decomposed into cyclic and acyclic parts, much as linear operators can be decomposed into skew and symmetric components [1] [19]. It is a well known fact that 3-cyclic monotonicity is the weakest non-trivial n-cyclic condition, since all n-cyclic monotone operators where n ≥ 3 are 3-cyclic monotone (outlined below). In particular, all 3-cyclic monotone operators are 3∗-monotone (see Fact 3.1.2 below).

We adopt the notation of [82] and use the term 3∗-monotone throughout. The property was simply denoted by “∗” by Brézis and Haraux when introduced [23], and such operators were sometimes called (BH)-operators [31] in honour of these original authors. More recently the property has also taken on the name “rectangular”, since the closure of the domain of the Fitzpatrick function of a monotone operator is rectangular precisely when the operator is 3∗-monotone [70]. 3∗-monotonicity has the important property that if T1 and T2 are 3∗-monotone and their sum is maximal monotone, then the closure [or interior] of the sum of their ranges is the closure [or interior] of the range of their sum. This can be used to provide sufficient conditions for the nonemptiness of T^{-1}(0) (see for instance [23]). Operators with bounded range [82] and strongly coercive operators [23] are 3∗-monotone.

There is a rich literature on the theory (see [15] for a good overview) and applications (for example [36] and [59]) of maximal monotone operators.
Furthermore, it is well known that a maximal monotone operator T has the property that T^{-1}(0) is convex, a property shared by paramonotone operators with convex domain (Proposition 3.4.2), and analogous to the fact that the minimizers of a convex function form a convex set. Maximal monotonicity is also an important property for general differential inclusions [59] [22].

After some preliminary definitions and basic results in Chapter 2, we examine properties of paramonotone, strictly monotone, 3-cyclic monotone, 3∗-monotone and maximal monotone operators and their relationships to one another in Chapter 3. This chapter brings many results together and builds upon these to form a coherent theory, which is then used to refine similar theory for linear relations in Chapter 4 and to examine the examples of Chapter 5. In particular, it is shown how the monotonicity classes of a sum of operators depend on the monotonicity classes of the component operators (Section 3.6) and how the composition of operators in the product space preserves (or fails to preserve) these monotone classes (Section 3.3). Since operators on R are 3-cyclic monotone and paramonotone (Section 3.7), we use this result in Section 3.8 to construct spherically symmetric monotone operators that share these properties, although in a higher dimensional setting. It is eventually shown in Chapter 5 that the general relationships between the monotone classes described in Section 3.1, namely that 3-cyclic monotone operators are 3∗-monotone, that strictly monotone operators are paramonotone, and that 3-cyclic maximal monotone operators are paramonotone, are the only ones that hold in the general setting.

In numerical applications, the explicit or implied use of single-valued selections means that monotone operators are often treated as if they are single-valued (i.e., T : X → X) even when they are not.
In Section 3.5, we define this concept and note that the only properties this process might violate are paramonotonicity and maximal monotonicity, although paramonotonicity can be preserved if the original operator is maximal monotone.

Now, linear relations are a multi-valued extension of linear operators, and are defined as those operators whose graph forms a vector space. This is a natural extension to consider, as monotone operators are often multi-valued and there is an increasing occurrence of linear relations in monotone operator theory [60], [9], [12]. We consider linear relations in Chapter 4, and explore their characteristics and structure in light of these monotone classes. Of particular note, we fully explore the manner in which linear relations can be multi-valued (Proposition 4.0.13), and remark on a curious property of linear relations whose domains are not closed (Proposition 4.0.17). In Section 4.1, we also obtain a generalization of the fact that bounded linear operators that are 3∗-monotone are also paramonotone (a corollary to a result in [23]), with conditions different from those in [12], by examining the related property of angle-boundedness. We construct a 3∗-monotone linear relation that is not paramonotone.

In Sections 4.2 and 4.3, we list various examples of linear operators satisfying or failing to satisfy the five properties defined above. The examples are chosen to have full domain, low dimension, and be continuous where possible. This is shown to yield a complete characterization of the dependence or independence of these five classes of monotone operator in R^2, R^n, and in a general Hilbert space X. One result of Section 4.3 is that the paramonotone linear operators on R^2 are exactly the monotone linear operators on R^2 that are symmetric or strictly monotone.

In Chapter 5, many examples of monotone operators, both specific and of general construction, are given.
In particular, there are 17 specific examples which together demonstrate all possible combinations of these monotone classes on R^2 or R^3, and the operators on R^2 are visualized both where they are defined and in comparison with one another (Figures 5.15 and 5.16). These are summarized in Table 5.1, which serves to demonstrate that these examples exhaust all possibilities. All possible relationships between these five monotone classes are summarized in Figure 5.11.

For ease of reuse and modification, all examples T : X → X given in Table 5.1, where X is one of R, R^2 or R^3, satisfy the following conditions:

(i) T(0) = 0,
(ii) T^{-1}(0) = 0,
(iii) dom T = X,
(iv) T is single valued,
(v) if T is maximal monotone, then T is continuous.

Note that single valued and continuous monotone operators are necessarily maximal (Fact 3.2.1 below). Since translation of an operator preserves its monotone class memberships, these examples can be further modified to T̃(x) := T(x − a) + a∗ for any a, a∗ ∈ X.

Note that the sum of a skew linear relation (or operator) and a subdifferential of a (proper lower semicontinuous) convex function yields precisely a Borwein-Wiersma decomposition. Many of the examples from Chapter 5 have this form. The theory of Borwein-Wiersma decompositions is in its infancy. There are currently only three papers detailing this theory: the original work of Borwein and Wiersma [19], which introduced this subject as a refinement of Asplund decompositions; the work of Bauschke, Wang, and Yao [11] on decomposing linear relations (also included in the thesis of Yao [80]); and a paper by Musev and Ribarska which, as done in [11], provided an example of a non-decomposable operator where the domain is not the full space (we do not consider non-decomposable operators in this thesis). This work extends the results of the two main papers, and develops the theory of Borwein-Wiersma decompositions.
The focus taken is to examine the properties of Borwein-Wiersma decomposable operators and to attempt to reconstruct Borwein-Wiersma decompositions given only the sum. In the larger context, this is part of the aim of constructing Asplund decompositions, as their theoretical existence is currently only guaranteed through the use of Zorn’s lemma. These attempts are largely successful, first by examining operators with convex domain (or operators on a convex subset of their domain), and then by examining operators that have starshaped domains. Unlike previous approaches, we are able to directly reconstruct the skew linear relation term.

For Borwein-Wiersma decomposable monotone operators with a convex domain, it is shown here to be possible to create a new saddle function representation, which differs from the classical methods in both calculation and result. This is the subject of Chapter 6, where after revisiting classical saddle function representations (Section 6.1), a variant of Fitzpatrick’s last function is defined (Section 6.2) and then used to create this new representation (Section 6.3). In so doing, we trace the history of Fitzpatrick’s last function and explicitly describe its relationship to the Rockafellar antiderivative.

Fitzpatrick’s last function is also useful in decomposing Borwein-Wiersma decomposable operators, which is the subject of Chapter 7, although there are many advances to the theory of these operators in this chapter as well. As a result, we obtain many results on constructive decompositions of Borwein-Wiersma decomposable operators. This lays a theoretical groundwork for potential applications, such as improving the convergence and stability of algorithms involving monotone operators, as it is often the acyclic portion of an operator which can cause an approach to be time consuming or to fail outright.
Indeed, it is in the context of equilibrium problems and variational inequalities that monotonicity has become important for the fields of engineering and economics (including optimal transport, signal processing, mechanics and game theory) [39], and for partial differential equations, where monotonicity both characterizes the vector fields of self-dual Lagrangians [40] and is crucial for the determination of equilibrium solutions (determined by a variational inequality) for elliptic and evolution differential equations and inclusions (see for instance [2]). Maximal monotonicity is also an important property for general differential inclusions [59] [22]. It is perhaps in control theory where the benefit of numerically decomposing an unknown operator would have the most value, especially for control using variational inequalities (see the work arising from [53], such as the more recent [35]).

As such, future benefit from a full or partial Borwein-Wiersma decomposition is likely to be realized by using the decomposition to improve current solution techniques for equilibrium problems and variational inequalities, such as by using a splitting algorithm that evaluates ∂f and A for a Borwein-Wiersma decomposition T = ∂f + A to more rapidly and reliably locate a solution x for 0 ∈ T x. As it stands, using a Borwein-Wiersma decomposition T = ∂f + A, the Douglas-Rachford splitting algorithm of Lions and Mercier [52] (which is a special case of the proximal point algorithm, see [36]) weakly converges to a solution of 0 ∈ T x if a solution exists and A is single valued (see Theorem 1(i) in [52]). It converges strongly if ∂f is strongly monotone [74], and can also be used to solve variational inequalities and equilibrium problems [74] [32].
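As a purely illustrative sketch of the splitting idea described above (and not a method taken from this thesis), the following applies a Douglas-Rachford iteration to 0 ∈ ∂f(x) + Ax on R^2. The choices f(x) = ½‖x − b‖² with b = (2, 0), the quarter-turn rotation A(x1, x2) = (−x2, x1), the step size gamma = 1, the iteration count, and all function names are assumptions made only for this example.

```python
# Illustrative Douglas-Rachford sketch for 0 in (grad f)(x) + A x on R^2.
# Assumptions (not from the thesis): f(x) = 0.5*||x - b||^2, A is the
# quarter-turn rotation (skew and linear), and the step size gamma is 1.

def resolvent_grad_f(z, b, gamma):
    """(Id + gamma * grad f)^(-1) z for f(x) = 0.5*||x - b||^2."""
    return [(zi + gamma * bi) / (1.0 + gamma) for zi, bi in zip(z, b)]

def resolvent_skew(w, gamma):
    """(Id + gamma * A)^(-1) w for A(x1, x2) = (-x2, x1)."""
    d = 1.0 + gamma * gamma
    return [(w[0] + gamma * w[1]) / d, (w[1] - gamma * w[0]) / d]

def douglas_rachford(b, gamma=1.0, iterations=60):
    z = [0.0, 0.0]
    for _ in range(iterations):
        x = resolvent_grad_f(z, b, gamma)
        reflected = [2.0 * xi - zi for xi, zi in zip(x, z)]
        y = resolvent_skew(reflected, gamma)
        z = [zi + yi - xi for zi, yi, xi in zip(z, y, x)]
    # The resolvent of the governing sequence approximates a zero of the sum.
    return resolvent_grad_f(z, b, gamma)

solution = douglas_rachford([2.0, 0.0])
```

For this particular choice, the zero of ∂f + A solves (Id + A)x = b, that is x = (1, −1), which gives a closed form to compare the iterate against.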
However, as the decomposition has numerical cost, perhaps a new sort of splitting algorithm could take better advantage of the special properties involved in the decomposition, such as the paramonotonicity of ∂f and the linearity of A.

We obtain, under certain conditions, a full decomposition of a Borwein-Wiersma decomposable operator T if its domain is starshaped and its encompassing relative interior is nonempty, and in any case a decomposition within error of a constant vector. This leads to a variety of corollaries, which themselves extend the best existing result for constructive decomposition (which requires that T : R^n → R^n is C^1 continuous and 0 ∈ int dom T) [19].

We assume throughout that X is a real Hilbert space, with inner product ⟨·, ·⟩. The space of subsets of X, including the empty set ∅, is denoted by 2^X. When an operator T : X → 2^X is such that for all x ∈ X, T x contains at most one element, we call T single-valued, as its image has a single value on its domain. Where T is single-valued, we often use T x to represent this point instead of using the full notation x∗ ∈ T x or T x = {x∗}. We use the convention that for set addition A + ∅ = ∅. The graph of T is gra T := {(x, x∗) : x∗ ∈ T x} ⊂ X × X, and the domain of T is dom T := {x ∈ X : T x ≠ ∅}. A monotone extension T̃ : X → 2^X of a monotone operator T : X → 2^X is a monotone operator such that gra T ⊆ gra T̃. A selection of an operator T : X → 2^X is an operator T̃ such that gra T̃ ⊂ gra T and dom T = dom T̃, and a single-valued selection of T is such an operator T̃ where T̃ : X → X. Define R_+ to be the set {t ∈ R : t ≥ 0}.

Chapter 2

Preliminaries

In this chapter, we go over definitions (Section 2.1) and basic facts (Section 2.2).

2.1 Definitions

Definition 2.1.1 (monotone) An operator T : X → 2^X is said to be monotone if
\[
\langle x - y, x^* - y^* \rangle \ge 0 \tag{2.1.1}
\]
for all x, y ∈ X, for all x∗ ∈ T x and for all y∗ ∈ T y.
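Definition 2.1.1 can be probed numerically for a single-valued operator by sampling pairs of points. The sketch below does this for two linear maps on R^2; the matrices, the sample grid, and the helper names are hypothetical and chosen only for illustration. A sampled check of this kind can refute monotonicity by exhibiting a violating pair, but it cannot prove monotonicity.

```python
# Sampling check of <x - y, Tx - Ty> >= 0 (Definition 2.1.1) on R^2.
# The operators and the sample grid below are illustrative assumptions.
from itertools import product

def inner(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def apply(matrix, x):
    return [inner(row, x) for row in matrix]

def violates_monotonicity(matrix, points):
    """Return a pair (x, y) with <x - y, Tx - Ty> < 0, or None if none is found."""
    for x, y in product(points, repeat=2):
        diff = [xi - yi for xi, yi in zip(x, y)]
        image_diff = [a - b for a, b in zip(apply(matrix, x), apply(matrix, y))]
        if inner(diff, image_diff) < 0:
            return x, y
    return None

grid = [[float(i), float(j)] for i in range(-2, 3) for j in range(-2, 3)]
rotation = [[0.0, -1.0], [1.0, 0.0]]    # skew: <x, Tx> = 0 for all x
reflection = [[1.0, 0.0], [0.0, -1.0]]  # not monotone: <e2, T e2> = -1

rotation_witness = violates_monotonicity(rotation, grid)
reflection_witness = violates_monotonicity(reflection, grid)
```

For a linear map the inequality reduces to ⟨d, Td⟩ ≥ 0 for every difference d = x − y, which is why the skew rotation passes and the reflection fails.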
Definition 2.1.2 (strictly monotone) An operator $T : X \to 2^X$ is said to be strictly monotone if $T$ is monotone and for all $(x, x^*), (y, y^*) \in \operatorname{gra} T$, $\langle x - y, x^* - y^* \rangle = 0$ implies that $x = y$.

Definition 2.1.3 (paramonotone) An operator $T : X \to 2^X$ is said to be paramonotone if $T$ is monotone and for $x^* \in Tx$, $y^* \in Ty$, $\langle x - y, x^* - y^* \rangle = 0$ implies that $x^* \in Ty$ and $y^* \in Tx$.

Definition 2.1.4 (strongly monotone) An operator $T : X \to 2^X$ is said to be strongly monotone if there exists an $\alpha > 0$ so that for all $(x, x^*), (y, y^*) \in \operatorname{gra} T$
\[ \langle x - y, x^* - y^* \rangle \ge \alpha \|x - y\|^2. \tag{2.1.2} \]
In this case, the operator is also said to be $\alpha$-strongly monotone or strongly monotone with parameter $\alpha$.

Definition 2.1.5 (coercivity) Consider an operator $T : X \to 2^X$. If all sequences $(y_n)_{n \in \mathbb{N}} \subset X$ such that $\|y_n\| \to +\infty$ satisfy
(i) $\lim_{n \to +\infty} \inf_{y^* \in Ty_n} \|y^*\| = +\infty$, then $T$ is weakly coercive;
(ii) $\lim_{n \to +\infty} \inf_{y^* \in Ty_n} \frac{\langle y_n, y^* \rangle}{\|y_n\|} = +\infty$, then $T$ is coercive;
(iii) for all $z \in X$, $\lim_{n \to +\infty} \inf_{y^* \in Ty_n} \frac{\langle y_n - z, y^* \rangle}{\|y_n\|} = +\infty$, then $T$ is strongly coercive.

Remark 2.1.6 Trivially, strong coercivity implies coercivity, which in turn implies weak coercivity. Since the infimum of an empty set is $+\infty$, if $T$ has bounded domain then it is strongly coercive, and only sequences $(y_n)_{n \in \mathbb{N}}$ in the domain of $T$ need be considered when testing for coercivity properties.

Remark 2.1.7 Strong monotonicity implies strict monotonicity, and hence paramonotonicity. This is obvious from Definition 2.1.4. Strongly monotone operators are strongly coercive. In demonstration, suppose $T : X \to 2^X$ is strongly monotone with parameter $\alpha$. Take any sequence $(y_n)_{n \in \mathbb{N}}$ in the domain of $T$ with $\|y_n\| \to +\infty$. Take an associated sequence $(y_n^*)_{n \in \mathbb{N}}$ where $y_n^* \in Ty_n$, and, for $z \in \operatorname{dom} T$ and $z^* \in Tz$, consider
\[
\begin{aligned}
\langle y_n - z, y_n^* \rangle &= \langle y_n - z, y_n^* - z^* \rangle + \langle y_n - z, z^* \rangle \\
&\ge \alpha \|y_n - z\|^2 + \langle y_n - z, z^* \rangle \\
&= \alpha \|y_n\|^2 + \alpha \|z\|^2 - \langle z, z^* \rangle - 2\alpha \langle y_n, z \rangle + \langle y_n, z^* \rangle \\
&\ge \alpha \|y_n\|^2 - (2\alpha \|z\| + \|z^*\|) \|y_n\| + \alpha \|z\|^2 - \langle z, z^* \rangle.
\end{aligned}
\]
Dividing by $\|y_n\|$, the right-hand side tends to $+\infty$ as $n \to +\infty$. Therefore, $T$ is strongly coercive.
For a simple example of a strictly monotone operator that is not strongly monotone, consider
\[
g : \mathbb{R} \to \mathbb{R} : x \mapsto \operatorname{sgn}(x) \ln(1 + |x|) =
\begin{cases}
-\ln(1 - x), & x < 0, \\
0, & x = 0, \\
\ln(1 + x), & x > 0.
\end{cases}
\tag{2.1.3}
\]
Strongly monotone operators are not fully compared with the other five classes of monotone operators, and are not considered in depth in this text.

Definition 2.1.8 ($n$-cyclic monotone) Let $n \ge 2$. An operator $T : X \to 2^X$ is said to be $n$-cyclic monotone if
\[
\left.
\begin{array}{l}
(x_1, x_1^*) \in \operatorname{gra} T \\
(x_2, x_2^*) \in \operatorname{gra} T \\
\quad \vdots \\
(x_n, x_n^*) \in \operatorname{gra} T \\
x_{n+1} = x_1
\end{array}
\right\}
\Rightarrow \sum_{i=1}^{n} \langle x_i - x_{i+1}, x_i^* \rangle \ge 0.
\tag{2.1.4}
\]
A cyclical monotone operator is one that is $n$-cyclic monotone for all $n \in \mathbb{N}$ with $n \ge 2$.

Remark 2.1.9 (3-cyclic monotone) There are two simple alternative synonymous definitions of 3-cyclic monotonicity worth explicitly stating. For an operator $T : X \to 2^X$ to be 3-cyclic monotone, every $(x, x^*), (y, y^*), (z, z^*) \in \operatorname{gra} T$ must satisfy
\[ \langle x - y, x^* \rangle + \langle y - z, y^* \rangle + \langle z - x, z^* \rangle \ge 0, \tag{2.1.5} \]
or equivalently
\[ \langle z - y, y^* - x^* \rangle \le \langle x - z, x^* - z^* \rangle. \tag{2.1.6} \]

Definition 2.1.10 ($3^*$-monotone) An operator $T : X \to 2^X$ is said to be $3^*$-monotone if $T$ is monotone and for all $z$ in the domain of $T$ and for all $x^*$ in the range of $T$
\[ \sup_{(y, y^*) \in \operatorname{gra} T} \langle z - y, y^* - x^* \rangle < +\infty. \tag{2.1.7} \]

Definition 2.1.11 (maximality) An operator is maximal $n$-cyclic monotone if its graph cannot be extended while preserving $n$-cyclic monotonicity. A maximal monotone operator is a maximal 2-cyclic monotone operator. A maximal cyclical monotone operator is a cyclical monotone operator such that all proper graph extensions are not cyclical monotone.

Remark 2.1.12 Note that 2-cyclic monotonicity is equivalent to monotonicity. By substituting $(x_n, x_n^*) := (x_1, x_1^*)$, it easily follows from the definition that any $n$-cyclic monotone operator is $(n - 1)$-cyclic monotone. 1-cyclic monotonicity is not defined, since the $n = 1$ case for (2.1.4) is trivial.
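The cyclic sum in (2.1.4) is easy to evaluate for concrete points. As a minimal sketch (the operator, the three points, and the helper name `cyclic_sum` are assumptions chosen for illustration), the rotation by $\pi/2$ in $\mathbb{R}^2$ is monotone, yet the three points below violate the 3-cyclic condition (2.1.5).

```python
import numpy as np

def cyclic_sum(points, images):
    """Sum over i of <x_i - x_{i+1}, x_i^*> with x_{n+1} = x_1, as in (2.1.4)."""
    n = len(points)
    return float(sum(np.dot(points[i] - points[(i + 1) % n], images[i])
                     for i in range(n)))

R = np.array([[0.0, -1.0], [1.0, 0.0]])  # rotation by pi/2 (skew, hence monotone)

xs = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([-1.0, 0.0])]
xstars = [R @ x for x in xs]

three_cycle = cyclic_sum(xs, xstars)        # negative: not 3-cyclic monotone
two_cycle = cyclic_sum(xs[:2], xstars[:2])  # zero: consistent with monotonicity
```

Because the operator is skew, every 2-cycle sums to zero, so monotonicity holds with equality while longer cycles can still fail.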
These classes of monotonicity can also be defined on subsets of the space, although in this document these definitions are used infrequently.

Definition 2.1.13 Given subsets $D, E \subset X$, consider the operators $T : D \to 2^E$ and $\tilde{T} : X \to 2^X$. The operator $\tilde{T}$ is said to be monotone, strictly monotone, paramonotone, $n$-cyclic monotone, or $3^*$-monotone on $D$ if it satisfies the relevant conditions on $\operatorname{gra} \tilde{T} \cap (D \times X)$ rather than on all of $\operatorname{gra} \tilde{T}$, or equivalently if $\hat{T} : X \to 2^X$ defined by
\[
\hat{T}x :=
\begin{cases}
\tilde{T}x, & x \in D, \\
\emptyset, & x \notin D,
\end{cases}
\]
is monotone, strictly monotone, paramonotone, $n$-cyclic monotone, or $3^*$-monotone. The operators $T$ and $\tilde{T}$ are said to be maximal monotone on $D \times E$ (or if $E = X$, maximal monotone on $D$), if there are no proper monotone extensions of $\operatorname{gra} T$ and $\operatorname{gra} \tilde{T}$ in $D \times E$.

Definition 2.1.14 (proper) A function $f : X \to \mathbb{R} \cup \{-\infty, +\infty\}$ is said to be proper if for all $x \in X$, $f(x) \neq -\infty$, and for at least one $x \in X$, $f(x) < +\infty$.

Definition 2.1.15 (lower semicontinuous) A function $f : X \to \mathbb{R} \cup \{-\infty, +\infty\}$ is said to be lower semicontinuous if for every $x \in X$, every sequence $(x_n)_{n \in \mathbb{N}} \subset X$ such that $x_n \to x$ satisfies
\[ \liminf_{n \to +\infty} f(x_n) \ge f(x), \]
or more succinctly for every $x \in X$
\[ \liminf_{y \to x} f(y) \ge f(x). \]

Definition 2.1.16 (upper semicontinuous) A function $f : X \to \mathbb{R} \cup \{-\infty, +\infty\}$ is said to be upper semicontinuous if $-f$ is lower semicontinuous.

Remark 2.1.17 A function is continuous if and only if it is both upper and lower semicontinuous.

Definition 2.1.18 (epigraph) For a function $f : X \to \mathbb{R} \cup \{-\infty, +\infty\}$, the epigraph of $f$, denoted $\operatorname{epi} f$, is the subset of $X \times \mathbb{R}$ determined by
\[ \operatorname{epi} f := \{(x, h) \in X \times \mathbb{R} : f(x) \le h\}. \tag{2.1.8} \]

Definition 2.1.19 (convexity) A set $C \subset X$ is convex if for every $x, y \in C$ and for every $\lambda \in \,]0, 1[$, $\lambda x + (1 - \lambda) y \in C$.

Definition 2.1.20 (convex function) A function $f : X \to \mathbb{R} \cup \{-\infty, +\infty\}$ is convex if its epigraph, $\operatorname{epi} f$, is convex.

Definition 2.1.21 (concave function) A function $f : X \to \mathbb{R} \cup \{-\infty, +\infty\}$ is concave if $-f$ is convex.

Definition 2.1.22 (subdifferential) Given a proper convex function $f : X \to \mathbb{R} \cup \{+\infty\}$, the subdifferential of $f$, denoted by $\partial f : X \to 2^X$, is defined for $x \in \operatorname{dom} f$ by
\[ \partial f(x) := \{x^* : f(y) - f(x) \ge \langle y - x, x^* \rangle \ \ \forall y \in X\}. \tag{2.1.9} \]
Each $x^* \in \partial f(x)$ is called a subgradient of $f$ at $x$, and the subgradient if $\partial f(x)$ is a singleton. For $x \notin \operatorname{dom} f$, $\partial f(x) = \emptyset$.

Definition 2.1.23 (closure) The closure of a convex function $f : X \to \mathbb{R} \cup \{-\infty, +\infty\}$, denoted by $\operatorname{cl} f$, is the pointwise supremum of continuous affine minorants of $f$. Explicitly,
\[ \operatorname{cl} f(x) := \sup_{m \in M_f} m(x), \tag{2.1.10} \]
where
\[ M_f := \{m \in \mathcal{A} : m(y) \le f(y) \ \ \forall y \in X\} \tag{2.1.11} \]
and
\[ \mathcal{A} := \{m : X \to \mathbb{R} : h \in X,\ c \in \mathbb{R},\ m(x) := \langle x, h \rangle + c\}. \tag{2.1.12} \]

Remark 2.1.24 Closure is sometimes called 'lower semicontinuous closure', 'epigraph closure', or 'lower semicontinuous hull' in view of Fact 2.2.2 and Corollary 2.2.3 below. As a pointwise supremum of continuous functions, $\operatorname{cl} f$ is always a lower semicontinuous function. The following are equivalent:
(i) $\operatorname{cl} f(x) = -\infty$ for some $x \in X$,
(ii) $M_f = \emptyset$,
(iii) $\operatorname{cl} f(x) = -\infty$ for all $x \in X$,
as (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (i).
As we shall see below in Fact 2.2.4, if $f(x) = -\infty$ when $x$ belongs to a closed convex set, and $f(x) = +\infty$ elsewhere, then $f$ is lower semicontinuous but $f \neq \operatorname{cl} f$. As an example, consider the function $f : X \to \{-\infty, +\infty\}$ where
\[
f(x) :=
\begin{cases}
-\infty, & x = 0, \\
+\infty, & x \neq 0.
\end{cases}
\tag{2.1.13}
\]

Definition 2.1.25 (cone) Given a set $D$ in a vector space $Y$, the cone of $D$, denoted by $\operatorname{cone} D$, is the set defined by
\[ \operatorname{cone} D := \{\lambda x : x \in D,\ \lambda > 0\}. \tag{2.1.14} \]

Definition 2.1.26 (convex hull) Given a set $D$ in a vector space $Y$, the convex hull of $D$, denoted by $\operatorname{conv} D$, is the set defined by
\[ \Big\{ \sum_{i=1}^{n} \lambda_i x_i : \sum_{i=1}^{n} \lambda_i = 1,\ x_j \in D,\ \lambda_j \in [0, 1] \ \ \forall j \in \{1, 2, \ldots, n\} \text{ where } n \in \mathbb{N} \Big\}. \tag{2.1.15} \]

Definition 2.1.27 (span) Given a set $D$ in a vector space $Y$, the span of $D$, denoted by $\operatorname{span} D$, is the set defined by
\[ \Big\{ \sum_{i=1}^{n} \lambda_i x_i : x_j \in D,\ \lambda_j \in \mathbb{R} \ \ \forall j \in \{1, 2, \ldots, n\} \text{ where } n \in \mathbb{N} \Big\}. \tag{2.1.16} \]

Definition 2.1.28 (affine hull) Given a set $D$ in a vector space $Y$, the affine hull of $D$, denoted by $\operatorname{Aff} D$, is the set defined by
\[ \Big\{ \sum_{i=1}^{n} \lambda_i x_i : \sum_{i=1}^{n} \lambda_i = 1,\ x_j \in D,\ \lambda_j \in \mathbb{R} \ \ \forall j \in \{1, 2, \ldots, n\} \text{ where } n \in \mathbb{N} \Big\}. \tag{2.1.17} \]

Definition 2.1.29 (orthogonal) Two sets $C, D \subset X$ are orthogonal, denoted by $C \perp D$, if
\[ \langle c, d \rangle = 0 \quad \text{for all } c \in C,\ d \in D. \tag{2.1.18} \]
The orthogonal complement of a set $C \subset X$, denoted by $C^\perp$, is defined as
\[ C^\perp := \{x \in X : \langle x, c \rangle = 0 \ \ \forall c \in C\}. \tag{2.1.19} \]

2.2 Basic facts

Here we define, state, and derive some of the basic results well known in the literature; see, e.g., [7], [49], [18].

Fact 2.2.1 A function $f : X \to \mathbb{R} \cup \{-\infty, +\infty\}$ is lower semicontinuous if and only if its epigraph is closed, and if and only if its sublevel sets are closed.

Proof. Let $f$ be lower semicontinuous. For any converging sequence $((x_n, h_n))_{n \in \mathbb{N}} \subset \operatorname{epi} f$, where $(x_n, h_n) \to (x, h)$, we have
\[ h = \lim_{n \to +\infty} h_n \ge \liminf_{n \to +\infty} f(x_n) \ge f(x), \]
and so $(x, h) \in \operatorname{epi} f$, hence $\operatorname{epi} f$ is closed.

Next, consider any arbitrary $\lambda \in \mathbb{R} \cup \{-\infty, +\infty\}$. For any convergent sequence $(x_n)_{n \in \mathbb{N}} \subset \{y \in X : f(y) \le \lambda\}$ where $x_n \to x$, it follows that $f(x) \le \liminf_{n \to +\infty} f(x_n) \le \lambda$. Hence, the sublevel sets of $f$ are closed.

For the converse, suppose $f$ is not lower semicontinuous, so that there exists an $x \in X$ and a sequence $(x_n)_{n \in \mathbb{N}} \to x$ such that
\[ \alpha := \liminf_{n \to +\infty} f(x_n) < f(x). \]
Then, $\alpha \neq +\infty$ and $f(x) \neq -\infty$. Define $\beta \in \mathbb{R}$ by
\[
\beta :=
\begin{cases}
\tfrac{1}{2}(\alpha + f(x)), & f(x) \in \mathbb{R},\ \alpha \in \mathbb{R}, \\
\alpha + 1, & f(x) = +\infty,\ \alpha \in \mathbb{R}, \\
f(x) - 1, & f(x) \in \mathbb{R},\ \alpha = -\infty, \\
0, & f(x) = +\infty,\ \alpha = -\infty.
\end{cases}
\]
Then, in any case, there is some subsequence ($\gamma : \mathbb{N} \to \mathbb{N}$) of $(x_n)_{n \in \mathbb{N}}$ for which $f(x_{\gamma(n)}) \to \alpha$ and $f(x_{\gamma(i)}) \le \beta$ for all $i \in \mathbb{N}$. Therefore, the sequence $(x_{\gamma(n)}, \beta)_{n \in \mathbb{N}}$ is in the epigraph of $f$, yet $(x_{\gamma(n)}, \beta) \to (x, \beta) \notin \operatorname{epi} f$ and so $\operatorname{epi} f$ is not closed.
By the same argument, the sequence $(x_{\gamma(n)})_{n \in \mathbb{N}}$ is in the sublevel set $\{y \in X : f(y) \le \beta\}$, but $x$ is not, and so this sublevel set is also not closed.

Fact 2.2.2 A proper convex function $f : X \to \mathbb{R} \cup \{+\infty\}$ is lower semicontinuous if and only if $f = \operatorname{cl} f$.

Corollary 2.2.3 For any proper convex function $f : X \to \mathbb{R} \cup \{+\infty\}$, for all $x \in X$,
\[ \operatorname{cl} f(x) = \inf_{(x, h) \in \overline{\operatorname{epi} f}} h = \liminf_{y \to x} f(y). \tag{2.2.1} \]
Furthermore $\operatorname{cl} f$ is the largest proper lower semicontinuous function which minorizes $f$ (and so $\operatorname{cl} f = f^{**}$ where $f^{**}$ is the Fenchel biconjugate of $f$).

Proof. Define $\hat{f} : X \to \mathbb{R} \cup \{+\infty\}$ by
\[ \hat{f}(x) := \inf_{(x, h) \in \overline{\operatorname{epi} f}} h, \tag{2.2.2} \]
so that $\operatorname{epi} \hat{f} = \overline{\operatorname{epi} f}$. By Fact 2.2.2, we have $\hat{f} = \operatorname{cl} f$. From Fact 2.2.1, we know that $\hat{f}$ is the largest-valued lower semicontinuous function which minorizes $f$, namely $\hat{f}(x) = \liminf_{y \to x} f(y)$.

Fact 2.2.4 (improper convex functions) Let $f : X \to \mathbb{R} \cup \{-\infty, +\infty\}$ be a convex function, and let $C := \{y \in X : f(y) = -\infty\}$. Then, $C$ is convex, and if $C \neq \emptyset$,
(i) $f(x) = +\infty$ everywhere outside of the closure $\overline{C}$ of $C$,
(ii) $f$ is possibly finite only on the boundary of $C$,
(iii) $(\operatorname{cl} f)(y) = -\infty$ for all $y \in X$, and
(iv) $f$ is lower semicontinuous if and only if $C$ is closed and $f$ is nowhere finite.

Proof. Since $f$ is convex, $C$ is convex.
(i) Suppose that for some $y \notin \overline{C}$, $f(y) < +\infty$, and let $x \in C$. There is a $\lambda \in \,]0, 1[$ such that $\lambda x + (1 - \lambda) y \notin C$. By convexity, $f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y) = -\infty$, which contradicts the fact that $\lambda x + (1 - \lambda) y \notin C$.
(ii) Immediate from (i).
(iii) As continuous affine mappings are nowhere infinite, $f$ has no continuous affine minorant and so $(\operatorname{cl} f)(z) = -\infty$ for all $z \in X$.
(iv) If $f$ is lower semicontinuous, $C$ is closed by Fact 2.2.1. From (ii) therefore, $f$ is nowhere finite. For the converse, suppose $C$ is closed and that $f(x) = +\infty$ for all $x \notin C$. Then, by definition, $f$ is lower semicontinuous as any convergent sequence $x_n \to x$ along which $f(x_n) = -\infty$ must have $x \in C$, and so $f(x) = -\infty$.
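The identity $\operatorname{cl} f(x) = \liminf_{y \to x} f(y)$ from Corollary 2.2.3 can be illustrated on a one-dimensional function that is proper and convex but not lower semicontinuous. The function `f`, the grid-based estimator `closure_estimate`, and its radius and sample count are all assumptions made for this sketch; the estimator is a crude numerical stand-in for the limit inferior, not an exact computation.

```python
import math

def f(x):
    """Assumed toy function: convex and proper, but not lsc at x = -1 and x = 1."""
    if abs(x) < 1.0:
        return x * x
    if abs(x) == 1.0:
        return 2.0
    return math.inf

def closure_estimate(func, x, radius=1e-6, samples=2001):
    """Crude estimate of liminf_{y -> x} func(y): minimum over a small grid around x."""
    grid = [x - radius + 2.0 * radius * k / (samples - 1) for k in range(samples)]
    return min(func(x), min(func(y) for y in grid))

cl_at_1 = closure_estimate(f, 1.0)  # close to 1, although f(1) = 2
cl_at_0 = closure_estimate(f, 0.0)  # agrees with f(0) = 0
cl_at_2 = closure_estimate(f, 2.0)  # stays +inf outside [-1, 1]
```

The closure only changes the value at the two boundary points, where the jump of $f$ is removed; elsewhere $f$ is already lower semicontinuous.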
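Fact 2.2.5 (effect of closure) Given a convex function $f : X \to \mathbb{R} \cup \{-\infty, +\infty\}$, define the following sets:
\[
\begin{aligned}
A &:= \{y \in X : f(y) = +\infty\} && (2.2.3) \\
B &:= \{y \in X : f(y) \notin \{-\infty, +\infty\}\} = \operatorname{dom} f && (2.2.4) \\
C &:= \{y \in X : f(y) = -\infty\} && (2.2.5) \\
\hat{A} &:= \{y \in X : (\operatorname{cl} f)(y) = +\infty\} && (2.2.6) \\
\hat{B} &:= \{y \in X : (\operatorname{cl} f)(y) \notin \{-\infty, +\infty\}\} && (2.2.7) \\
\hat{C} &:= \{y \in X : (\operatorname{cl} f)(y) = -\infty\}. && (2.2.8)
\end{aligned}
\]
Then,
(i) $A \cup B \cup C = X = \hat{A} \cup \hat{B} \cup \hat{C}$,
(ii) if $C \neq \emptyset$, $\hat{C} = X$,
(iii) $\hat{C} = X$ or $\hat{C} = \emptyset$,
(iv) if $\hat{C} = \emptyset$, then
(a) $B \subset \hat{B} \subset \overline{B}$, where $\overline{B}$ is the closure of $B$,
(b) $A \setminus \hat{B} = \hat{A} \subset A$.

For the following, $P_V$ refers to the orthogonal projection onto a closed subspace $V$.

Proposition 2.2.6 (dimension reduction) Suppose that $T : X \to 2^X$ is a maximal monotone operator such that $\operatorname{Aff}(\operatorname{dom} T)$ is closed. For some choice of $z \in \operatorname{dom} T$, let $V := \operatorname{span}(\operatorname{dom}(T) - z)$ and $w := P_{V^\perp} z$. Here, $V$ is a closed subspace and both $V$ and $w$ are invariant in the choice of $z \in \operatorname{dom} T$. Then,
(i) for all $y \in \operatorname{dom} T$, $T(y) = T(y) + V^\perp$, and
(ii) the operator $\tilde{T} : V \to 2^V$, defined on its domain $\operatorname{dom} \tilde{T} = \operatorname{dom} T - w$ to be
\[ x \mapsto \{P_V y^* : y^* \in T(x + w)\}, \tag{2.2.9} \]
is maximal monotone, and
(iii) $\operatorname{gra} T = \operatorname{gra} \tilde{T} + \{(w, x^*) : x^* \in V^\perp\}$.

Proof. By the definition of span, the closed subspace $V$ is invariant with respect to the choice of $z \in \operatorname{dom} T$. Since $V$ is a closed subspace, by the decomposition of the Hilbert space $X$, for all $x \in X$
\[ x = P_V x + P_{V^\perp} x. \]
For all $x, y \in \operatorname{dom} T$, it follows from
\[ V \ni y - x = P_V y - P_V x + P_{V^\perp} y - P_{V^\perp} x \]
that $P_{V^\perp} x = P_{V^\perp} y$. Therefore, $w$ also is invariant with the choice of $z$.
(i) Let $z^* \in V^\perp$, then for all $(x, x^*), (y, y^*) \in \operatorname{gra} T$,
\[ \langle y - x, y^* + z^* - x^* \rangle = \langle y - x, y^* - x^* \rangle \ge 0, \]
since
\[ y - x \in \operatorname{span}(\operatorname{dom} T - x) = \operatorname{span}(\operatorname{dom} T - z) = V. \]
Since $T$ is maximal monotone, $(y, y^* + z^*) \in \operatorname{gra} T$.
(ii) Define $\tilde{T}$ as above. For all $(x, x^*), (y, y^*) \in \operatorname{gra} \tilde{T}$, there exist $(x + w, x_0^*), (y + w, y_0^*) \in \operatorname{gra} T$ such that $x^* = P_V x_0^*$ and $y^* = P_V y_0^*$. Therefore,
\[ \langle x - y, x^* - y^* \rangle = \langle x - y, P_V x_0^* - P_V y_0^* \rangle = \langle (x + w) - (y + w), x_0^* - y_0^* \rangle \ge 0, \]
and so $\tilde{T}$ is monotone.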
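By Theorem 3.2.2, $(\operatorname{Id} + T)X = X$, and so
\[ \{P_V x + P_{V^\perp} x + P_V x^* + P_{V^\perp} x^* : (x, x^*) \in \operatorname{gra} T\} = X. \]
Therefore,
\[ \{P_V x + P_{V^\perp}(x + w) + P_V y^* + P_{V^\perp} y^* : (x + w, y^*) \in \operatorname{gra} T\} = X, \]
and so
\[ \{P_V x + P_V y^* : (x + w, y^*) \in \operatorname{gra} T\} = V. \]
As $\operatorname{dom} \tilde{T} = \operatorname{dom} T - w$, the operator $\tilde{T}$ is maximal monotone by Theorem 3.2.2.
(iii) Direct consequence of (i) and (ii).

Although the dimensionally reduced operator $\tilde{T}$ from Proposition 2.2.6 greatly reduces the 'multivaluedness' of the maximal monotone operator $T$, it will not necessarily be single valued itself. However, $\tilde{T}$ will be a single-valued selection if $T$ is a linear relation, as shown in Proposition 4.0.13.

Fact 2.2.7 Given a proper convex function $f : X \to \mathbb{R} \cup \{+\infty\}$, then
\[ \operatorname{gra}(\partial f) \subset \operatorname{gra}(\partial(\operatorname{cl} f)). \tag{2.2.10} \]

Proof. The result trivially follows if $\operatorname{gra}(\partial f) = \emptyset$. If $(x, x^*) \in \operatorname{gra}(\partial f)$, then $x \in \operatorname{dom} \partial f$, and so by Fact 3.2.12, $(\operatorname{cl} f)(x) = f(x)$. Consider any $y_0 \in X$. For any sequence $(y_n)_{n \in \mathbb{N}}$ such that $y_n \to y_0$,
\[ f(y_n) - f(x) \ge \langle y_n - x, x^* \rangle, \]
and therefore
\[ \liminf_{n \to +\infty} f(y_n) - f(x) \ge \langle y_0 - x, x^* \rangle. \]
By Corollary 2.2.3,
\[ (\operatorname{cl} f)(y_0) - (\operatorname{cl} f)(x) \ge \langle y_0 - x, x^* \rangle, \]
and so $(x, x^*) \in \operatorname{gra} \partial(\operatorname{cl} f)$.

Remark 2.2.8 As a consequence of Facts 2.2.2, 3.2.6, and 2.2.7, for any proper convex function $f$, the operator $\partial(\operatorname{cl} f)$ is a maximal monotone extension of $\partial f$.

Fact 2.2.9 Given a proper convex function $f : X \to \mathbb{R} \cup \{+\infty\}$, for any set $D \supset \operatorname{dom} \partial f$, define the function $g : X \to \mathbb{R} \cup \{+\infty\}$ for $x \in X$ by
\[
g(x) :=
\begin{cases}
f(x), & x \in \operatorname{dom} \partial f, \\
+\infty, & x \notin \operatorname{dom} \partial f,
\end{cases}
\tag{2.2.11}
\]
and the function $h : X \to \mathbb{R} \cup \{+\infty\}$ for $x \in X$ by
\[
h(x) :=
\begin{cases}
f(x), & x \in D, \\
+\infty, & x \notin D.
\end{cases}
\tag{2.2.12}
\]
Define the operator $\hat{\partial} g$ by stating that $(x, x^*) \in \operatorname{gra} \hat{\partial} g$ if and only if $x \in \operatorname{dom} g$ and
\[ g(y) - g(x) \ge \langle y - x, x^* \rangle \quad \text{for all } y \in X. \]
Then, $\hat{\partial} g$ and $\hat{\partial} h$ are monotone operators where $\operatorname{dom} \hat{\partial} g = \operatorname{dom} \hat{\partial} h = \operatorname{dom} \partial f$ and
\[ \operatorname{gra} \partial f \subset \operatorname{gra} \hat{\partial} h \subset \operatorname{gra} \hat{\partial} g. \]
If $f$ is lower semicontinuous, then $\hat{\partial} g = \hat{\partial} h = \partial f$.

Proof. By definition, for all $x \in X$, $f(x) \le h(x) \le g(x)$, and $f(x) = h(x) = g(x)$ if $x \in \operatorname{dom} \partial f$.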
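Furthermore, $\operatorname{dom} \partial f \supset \operatorname{dom} \hat{\partial} g$, since if $x \notin \operatorname{dom} \partial f$, then $g(x) = +\infty$, and so $x \notin \operatorname{dom} \hat{\partial} g$. Now, for all $x \in \operatorname{dom} \partial f$ and for all $y \in X$,
\[ g(y) - g(x) = g(y) - h(x) = g(y) - f(x) \]
and
\[ g(y) - f(x) \ge h(y) - f(x) \ge f(y) - f(x), \]
and so
\[ \operatorname{gra} \partial f \subset \operatorname{gra} \hat{\partial} h \subset \operatorname{gra} \hat{\partial} g. \]
Therefore, $\operatorname{dom} \partial f = \operatorname{dom} \hat{\partial} g$, and hence
\[ \operatorname{dom} \partial f = \operatorname{dom} \hat{\partial} h = \operatorname{dom} \hat{\partial} g. \]
Even though $g$ and $h$ may not be convex (since $\operatorname{dom} g$ and $\operatorname{dom} h$ can be nonconvex), $\hat{\partial} g$ and $\hat{\partial} h$ are monotone since, to choose $g$ as an example, for every $(x, x^*), (y, y^*) \in \operatorname{gra}(\hat{\partial} g)$,
\[ 0 = g(y) - g(x) + g(x) - g(y) \ge \langle y - x, x^* \rangle + \langle x - y, y^* \rangle. \]
If $f$ is lower semicontinuous, then by Fact 3.2.6, the operator $\partial f$ is maximal monotone. In this case, it follows from $\operatorname{gra} \hat{\partial} g \supset \operatorname{gra} \partial f$ that $\operatorname{gra} \hat{\partial} g = \operatorname{gra} \partial f$.

The following two facts are basic results, see for instance [7].

Fact 2.2.10 For any set $D$ in a vector space $Y$,
\[ \operatorname{conv} \operatorname{cone} D = \operatorname{cone} \operatorname{conv} D. \tag{2.2.13} \]

Fact 2.2.11 For any set $D$ in a vector space $Y$ and any $x, y \in D$,
\[ \operatorname{Aff} D = x + \operatorname{span}(D - x) = x + \operatorname{span}(D - y). \tag{2.2.14} \]

Fact 2.2.12 If a set $C \subset X$ is convex, then $0 \in \operatorname{Aff} C$, where $\operatorname{Aff} C$ is the affine hull of $C$, if and only if there exists a $z \in C$ and a $\lambda \in \mathbb{R}$ such that $\lambda \neq 0$ and $(1 + \lambda) z \in C$.

Proof. It is obvious that if for some $\lambda \in \mathbb{R}$, where $\lambda \neq 0$, $z, (1 + \lambda) z \in C$, then
\[ 0 = \frac{1 + \lambda}{\lambda} z + \frac{-1}{\lambda} (1 + \lambda) z \in \operatorname{Aff} C. \]
Now, suppose $0 \in \operatorname{Aff} C$, so that
\[ 0 = \sum_{i=1}^{n} a_i x_i \tag{2.2.15} \]
for some choice of $n \in \mathbb{N}$, $x_i \in C$, and $a_i \in \mathbb{R}$ such that $\sum_{i=1}^{n} a_i = 1$. Define $\gamma_1, \gamma_2 \in \mathbb{R}$ by
\[ \gamma_1 := \sum_{a_i < 0} a_i, \qquad \gamma_2 := \sum_{a_i > 0} a_i, \]
and define $z_1, z_2 \in X$ by
\[ z_1 := \frac{\sum_{a_i < 0} a_i x_i}{\gamma_1}, \qquad z_2 := \frac{\sum_{a_i > 0} a_i x_i}{\gamma_2}. \]
Since $C$ is convex, $z_1, z_2 \in C$. Therefore, from (2.2.15), $\gamma_1 z_1 + \gamma_2 z_2 = 0$, and $\gamma_1 + \gamma_2 = 1$. Hence,
\[ z_1 = \frac{1 - \gamma_1}{-\gamma_1} z_2, \]
and so $z_1 = (1 + \lambda) z_2$ for $\lambda = \frac{1}{-\gamma_1}$.

Fact 2.2.13 Given a set $D \subset X$ and a point $y \in D$, then
\[ (\operatorname{span}(D - y))^\perp = D^\perp + \operatorname{span} P_{(D - y)^\perp} y, \tag{2.2.16} \]
where $P_V$ is the orthogonal projection onto a closed subspace $V$.

Proof. By the definition of the span,
\[ \operatorname{span} D = \operatorname{span}(D - y) + \operatorname{span}\{y\}. \tag{2.2.17} \]
Therefore, as $(\operatorname{span} D)^\perp = D^\perp$,
\[ D^\perp = (D - y)^\perp \cap \{y\}^\perp. \]
Now,
\[
\begin{aligned}
(\operatorname{span}(D - y))^\perp &= (D - y)^\perp \\
&= \big( (D - y)^\perp \cap D^\perp \big) + \big( (D - y)^\perp \cap \operatorname{span} D \big) \\
&= D^\perp + \big( (D - y)^\perp \cap \operatorname{span} D \big).
\end{aligned}
\]
It remains to show that $\operatorname{span} P_{(D - y)^\perp} y = (D - y)^\perp \cap \operatorname{span} D$. By (2.2.17), $\operatorname{span}\{y\} \subset \operatorname{span} D$, and so since the orthogonal projection is a linear operator,
\[ P_{(D - y)^\perp} \operatorname{span}\{y\} = \operatorname{span} P_{(D - y)^\perp} y \subset (D - y)^\perp \cap \operatorname{span} D. \tag{2.2.18} \]
Now, suppose that $x \in (D - y)^\perp \cap \operatorname{span} D$. By (2.2.17), there exists a sequence $(x_k)_{k \in \mathbb{N}} \subset \operatorname{span}(D - y)$ and a sequence $(y_k)_{k \in \mathbb{N}} \subset \operatorname{span}\{y\}$ such that $(x_k + y_k) \to x$. By the linear decomposition of $X$ into the two orthogonal closed subspaces $(D - y)^\perp$ and $\operatorname{span}(D - y)$, each Hilbert spaces in their own right,
\[ P_{\operatorname{span}(D - y)}(x_k + y_k) + P_{(D - y)^\perp}(x_k + y_k) \to P_{\operatorname{span}(D - y)} x + P_{(D - y)^\perp} x. \tag{2.2.19} \]
Therefore, as $x \in (D - y)^\perp$,
\[ P_{\operatorname{span}(D - y)}(x_k + y_k) \to 0 \quad \text{and} \quad P_{(D - y)^\perp} y_k \to x, \]
and so $x \in \operatorname{span} P_{(D - y)^\perp} y$.

Fact 2.2.14 Given a set $D \subset X$ and a point $y \in D$, then
\[ \operatorname{span}(D - y) = \operatorname{span} D \iff 0 \in \operatorname{Aff} D. \tag{2.2.20} \]

Proof. By (2.2.17), $\operatorname{span} D \supset \operatorname{span}(D - y)$. Suppose that $z \in \operatorname{span} D$, so that
\[ z = \sum_{i=1}^{n} \lambda_i x_i \]
for some choice of $\lambda_i \in \mathbb{R}$ and $x_i \in D$, where $i \in \{1, 2, \ldots, n\}$. If $0 \in \operatorname{Aff} D$, then $0 \in y + \operatorname{span}(D - y)$ by the definition of $\operatorname{Aff}$. Therefore, $-y \in \operatorname{span}(D - y)$, and so
\[ \sum_{i=1}^{n} \lambda_i y \in \operatorname{span}(D - y). \]
Therefore,
\[ z = \sum_{i=1}^{n} \lambda_i (x_i - y) + \sum_{i=1}^{n} \lambda_i y \in \operatorname{span}(D - y), \]
and so $\operatorname{span} D = \operatorname{span}(D - y)$.

Fact 2.2.15 Given any two sets $V$ and $W$ in $X$, then
\[ (V \cap W)^\perp \supset V^\perp + W^\perp. \tag{2.2.21} \]

Proof. Let $x \in V^\perp + W^\perp$, so there are some $x_1 \in V^\perp$ and $x_2 \in W^\perp$ such that $x = x_1 + x_2$. Then, for any $y \in V \cap W$,
\[ \langle x, y \rangle = \langle x_1 + x_2, y \rangle = 0, \]
and so $x \in (V \cap W)^\perp$.

Chapter 3

Five Classes of Monotone Operators

In this chapter, five special classes of monotone operators are studied: strictly monotone, 3-cyclic monotone, $3^*$-monotone, paramonotone and maximal monotone. In Section 3.1, we go over some general relationships that hold between these classes. This is followed, in Section 3.2, by a collection of well known results that hold for monotone operators.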
We continue the discussion on the usefulness of certain monotone classes in Section 3.4 after detailing how the monotone classes of two operators combined in the product space affect the monotone class inclusion of the product. For applications, a single-valued selection is usually taken if a monotone operator is multivalued, and so we examine how this affects monotone class inclusions in Section 3.5. In Section 3.6, we bring together results on the sums of monotone operators as they relate to monotone classes. Section 3.8 uses the results of Section 3.7, where monotone operators on $\mathbb{R}$ are shown to always be cyclical monotone and paramonotone, to generate spherically symmetric operators in an arbitrary Hilbert space which are also cyclical monotone and paramonotone.

3.1 General relationships between monotone classes

There are precisely three relationships between the five classes of monotone operators here considered that hold in general. Any strictly monotone operator is paramonotone (Fact 3.1.1), any 3-cyclic monotone operator is $3^*$-monotone (Fact 3.1.2), and any operator that is both 3-cyclic monotone and maximal monotone is also paramonotone (Proposition 3.1.3). That these are the only such relationships that hold without further assumption is ultimately demonstrated with the zoo of examples presented in Chapter 5, as summarized in Table 5.1. The first two relationships follow immediately from their monotone class definitions, and the third was discovered in 2006 (Proposition 3.1 in [42]).

Fact 3.1.1 Any strictly monotone operator $T : X \to 2^X$ is also paramonotone.

Fact 3.1.2 If an operator $T : X \to X$ is 3-cyclic monotone, then it is also $3^*$-monotone.

Proof. It follows directly from Remark 2.1.9 (2.1.6) that for any given $(x, x^*), (z, z^*) \in \operatorname{gra} T$
\[ \sup_{(y, y^*) \in \operatorname{gra} T} \langle z - y, y^* - x^* \rangle \le \langle x - z, x^* - z^* \rangle < +\infty. \]

Proposition 3.1.3 [42] If $T$ is 3-cyclic monotone and maximal (2-cyclic) monotone, then $T$ is paramonotone.

Proof.
Suppose that for some choice of $(x, x^*), (y, y^*) \in \operatorname{gra}(T)$,
\[ \langle x - y, x^* - y^* \rangle = 0, \]
so $\langle y - x, x^* \rangle = \langle y - x, y^* \rangle$. Since $T$ is 3-cyclic monotone, every $(z, z^*) \in \operatorname{gra}(T)$ satisfies
\[
\begin{aligned}
0 &\ge \langle y - x, x^* \rangle + \langle z - y, y^* \rangle + \langle x - z, z^* \rangle \\
&= \langle -x, y^* \rangle + \langle z, y^* \rangle + \langle x - z, z^* \rangle \\
&= \langle z - x, y^* \rangle + \langle x - z, z^* \rangle \\
&= \langle x - z, z^* - y^* \rangle,
\end{aligned}
\]
and so
\[ \langle x - z, y^* - z^* \rangle \ge 0 \quad \forall (z, z^*) \in \operatorname{gra}(T). \]
Since $T$ is maximal monotone, $y^* \in Tx$. By exchanging the roles of $x$ and $y$ above, it also holds that $x^* \in T(y)$, and so $T$ is paramonotone.

Remark 3.1.4 A result partially converse to Proposition 3.1.3 holds. Given a paramonotone operator $T$ on $X$, if two points $(x_1, x_1^*), (x_2, x_2^*) \in \operatorname{gra} T$ are such that
\[ \langle x_2 - x_1, x_2^* - x_1^* \rangle = 0, \tag{3.1.1} \]
then by paramonotonicity, $(x_1, x_2^*) \in \operatorname{gra} T$. For any choice of a third point $(x_3, x_3^*) \in \operatorname{gra} T$, by monotonicity $\langle x_3 - x_1, x_3^* - x_2^* \rangle \ge 0$, and therefore $\langle x_3 - x_1, x_3^* \rangle \ge \langle x_3 - x_1, x_2^* \rangle$. In conclusion,
\[ \langle x_1 - x_2, x_1^* \rangle + \langle x_2 - x_3, x_2^* \rangle + \langle x_3 - x_1, x_3^* \rangle \ge 0, \]
so that the three points satisfy the condition for 3-cyclic monotonicity (2.1.5). However, general converse results do not exist, as shown by various counterexamples in Chapter 5. Specifically, refer to Example 5.0.31.

3.2 Properties of monotone operators

In this section we list known results for monotone operators, including among others characterizations for some of the classes of monotone operator, the relationship of the monotone classes to subdifferentials, and the preservation of monotone class under domain and range shifts and the inverse operation. The following is a consequence of the more general result in [24] that hemicontinuous mappings are maximal monotone on their domains.

Fact 3.2.1 Let $T : X \to 2^X$ be a monotone operator. If $T$ is single valued and continuous on some subset $D \subset \operatorname{dom} T$, then $T$ is maximal monotone on $\operatorname{int} D$.

Proof. Suppose the monotone operator $T : X \to 2^X$ is single valued and continuous on $D \subset \operatorname{dom} T$.
Suppose that $(a, \tilde{a})$, where $a \in \operatorname{int} D$, is a graph extension of $T$ so that for all $(x, x^*) \in \operatorname{gra} T$,
\[ \langle x - a, x^* - \tilde{a} \rangle \ge 0. \]
Now, since $D \subset \operatorname{dom} T$, there is already a point $(a, a^*) \in \operatorname{gra} T$. Aiming for a contradiction, suppose that $a^* \neq \tilde{a}$. Let $x_n = a + \frac{1}{n}(\tilde{a} - a^*)$, so that $\lim_{n \to +\infty} x_n = a$. Since $a \in \operatorname{int} D$, there is some $N_0 > 0$ such that for all $n > N_0$, $x_n \in D$. Furthermore, since $T$ is single valued and continuous on $D$, there is an $N > N_0 > 0$ such that for all $n > N$, $\|a^* - Tx_n\| < \|\tilde{a} - a^*\|$. Hence, for such an $n > N$,
\[
\begin{aligned}
\langle a - x_n, \tilde{a} - Tx_n \rangle &= -\tfrac{1}{n} \langle \tilde{a} - a^*, \tilde{a} - a^* + a^* - Tx_n \rangle \\
&\le -\tfrac{1}{n} \|\tilde{a} - a^*\|^2 + \tfrac{1}{n} \|\tilde{a} - a^*\| \, \|a^* - Tx_n\| \\
&< 0,
\end{aligned}
\]
which is a contradiction of monotonicity. Hence, $T$ is maximal monotone.

Theorem 3.2.2 (Minty's Theorem) A monotone operator $T : X \to 2^X$ is maximal monotone if and only if $\operatorname{ran}(\operatorname{Id} + T) = X$.

Proof. [55] [7] Suppose first that $\operatorname{ran}(\operatorname{Id} + T) = X$. Let $(x, x^*)$ be monotonically related to $\operatorname{gra} T$, so that for all $(y, y^*) \in \operatorname{gra} T$, $\langle x - y, x^* - y^* \rangle \ge 0$. Since $\operatorname{ran}(\operatorname{Id} + T) = X$, there is at least one such $(y, y^*) \in \operatorname{gra} T$ where $y + y^* = x + x^*$. For such $(y, y^*)$,
\[
\begin{aligned}
0 &\le \langle x - y, x^* - y^* \rangle \\
&= \langle x - y, y - x \rangle \\
&= -\|x - y\|^2 \\
&\le 0.
\end{aligned}
\tag{3.2.1}
\]
Hence, $x = y$, so $x^* = y^*$, and $(x, x^*) \in \operatorname{gra} T$.

Now suppose that $T$ is maximal monotone. Define the graph of the operator $W : X \to X$ by
\[ \operatorname{gra} W := \{(x + x^*, x - x^*) : (x, x^*) \in \operatorname{gra} T\}. \tag{3.2.2} \]
Note that $W$ is indeed single valued by (3.2.1) and the monotonicity of $T$. For all $(v, v^*), (w, w^*) \in \operatorname{gra} W$, note that there exists a unique pair $(x, x^*), (y, y^*) \in \operatorname{gra} T$ such that
\[
\begin{aligned}
\|v - w\|^2 - \|v^* - w^*\|^2 &= \|x + x^* - y - y^*\|^2 - \|x - x^* - y + y^*\|^2 \\
&= \|(x - y) + (x^* - y^*)\|^2 - \|(x - y) - (x^* - y^*)\|^2 \\
&= 4 \langle x - y, x^* - y^* \rangle \\
&\ge 0.
\end{aligned}
\tag{3.2.3}
\]
When an operator satisfies such a result, where for all $v, w \in \operatorname{dom} W$, $\|Wv - Ww\| \le \|v - w\|$, we call such an operator nonexpansive, or Lipschitz-1.
Kirszbraun's Theorem [50] and its Hilbert space extension [75] state that if $\operatorname{dom} W \neq X$, then there exists a Lipschitz-1 (i.e. nonexpansive) extension $\hat{W} : X \to X$ where $\operatorname{dom} \hat{W} = X$ and $\hat{W} = W$ on $\operatorname{dom} W$. Aiming for a contradiction, let $\operatorname{dom} W \neq X$ and consider such an extension $\hat{W}$. In this setting, there is a point $u \notin \operatorname{dom} W$ such that for all $v \in \operatorname{dom} W$, $\|\hat{W}u - Wv\| \le \|u - v\|$. Let $x = \frac{u + \hat{W}u}{2}$ and $x^* = \frac{u - \hat{W}u}{2}$, so that $x + x^* = u$ and $x - x^* = \hat{W}u$. Since $(u, \hat{W}u) \notin \operatorname{gra} W$, $(x, x^*) \notin \operatorname{gra} T$. However, by (3.2.3), for all $(v, v^*) \in \operatorname{gra} W$ and for their corresponding $(y, y^*) \in \operatorname{gra} T$, $\langle x - y, x^* - y^* \rangle \ge 0$. Since $T$ is maximal monotone, $(x, x^*) \in \operatorname{gra} T$, a contradiction. Hence, $\operatorname{dom} W = X$, and so $\{x + x^* : (x, x^*) \in \operatorname{gra} T\} = X$.

Fact 3.2.3 [55] Consider any operator $T : X \to 2^X$, and a convex set $C \subset \operatorname{dom} T$. Then, $T$ is monotone on $C$ if and only if for each $x \in C$, there exists an open ball $B_{\varepsilon(x)}(x)$ such that $T$ is monotone on $C \cap B_{\varepsilon(x)}(x)$.

Proof. (Improved from [55].) Given $T$ and $C$, first suppose the latter condition where $T$ is monotone on each of $C \cap B_{\varepsilon(x)}(x)$. Consider two points $(y, y^*), (z, z^*) \in \operatorname{gra} T$ with $y, z \in C$. For the (compact) line segment $L := \{\lambda y + (1 - \lambda) z : \lambda \in [0, 1]\}$, $\mathcal{B} := \{B_{\varepsilon(x)}(x) : x \in L\}$ is an open covering, hence there is a finite subcovering $\mathcal{B}_f$. Since each open sphere in the subcovering is centered on $L$, define $r_0$ to be the minimum intersection width along $L$,
\[
r_0 := \min \Big\{ \|x_2 - x_1\| - \varepsilon(x_1) - \varepsilon(x_2) : \|x_2 - x_1\| - \varepsilon(x_1) - \varepsilon(x_2) > 0,\ B_{\varepsilon(x_1)}(x_1), B_{\varepsilon(x_2)}(x_2) \in \mathcal{B}_f \Big\}. \tag{3.2.4}
\]
Define $\delta$ so that $r_0 > \delta > 0$ and $\|y - z\| / \delta$ is some integer $N > 0$. Let $\lambda_n := \frac{n \delta}{\|y - z\|} = \frac{n}{N}$, and define $x_n := \lambda_n y + (1 - \lambda_n) z$, where $n \in \{0, 1, 2, \ldots, N\}$. Note that $x_N = y$ and $x_0 = z$. Since $C \subset \operatorname{dom} T$, let $x_N^* = y^*$, $x_0^* = z^*$, and for all other $n$, let $x_n^* \in Tx_n$. Now, since for each $N \ge n \ge 1$, $\|x_n - x_{n-1}\| = \delta < r_0$, both $x_n$ and $x_{n-1}$ belong in at least one covering sphere, and so
\[ 0 \le \langle x_n - x_{n-1}, x_n^* - x_{n-1}^* \rangle = \delta \Big\langle \frac{y - z}{\|y - z\|}, x_n^* - x_{n-1}^* \Big\rangle. \]
Hence, summing for each $n \in \{1, 2, \ldots, N\}$,
\[ 0 \le \delta \Big\langle \frac{y - z}{\|y - z\|}, x_N^* - x_0^* \Big\rangle = \frac{1}{N} \langle y - z, y^* - z^* \rangle, \]
and so $T$ is monotone.

In the generalization from subdifferentials of proper lower semicontinuous convex functions to monotone operators, some of the structure and properties of subdifferentials are lost. Here, we show that maximal, $n$-cyclic, para-, and $3^*$-monotonicity reflect different aspects of the subdifferential that are not guaranteed for all monotone operators.

Fact 3.2.4 Given a proper convex function $f : X \to \mathbb{R} \cup \{+\infty\}$, the subdifferential of $f$, that is $\partial f$, is a monotone operator from $X$ to $2^X$.

Proof. If $\operatorname{gra}(\partial f) = \emptyset$, then it is trivially monotone. By definition, if $(x, x^*), (y, y^*) \in \operatorname{gra} \partial f$, then
\[ f(y) - f(x) \ge \langle y - x, x^* \rangle \quad \text{and} \quad f(x) - f(y) \ge \langle x - y, y^* \rangle, \]
and so adding these together
\[ 0 \ge \langle x - y, y^* - x^* \rangle. \]

Fact 3.2.5 The subdifferential $\partial f$ of a lower semicontinuous convex function $f : X \to \mathbb{R} \cup \{+\infty\}$ is strictly monotone if the original function $f$ is strictly convex.

Proof. Suppose $f$ is strictly convex and consider $x, y \in \operatorname{dom} \partial f$ such that $x \neq y$. Let $z = \frac{1}{2} x + \frac{1}{2} y$. By (2.1.9), for all $x^* \in \partial f(x)$, $y^* \in \partial f(y)$,
\[ f(z) - f(x) \ge \langle z - x, x^* \rangle \quad \text{and} \quad f(z) - f(y) \ge \langle z - y, y^* \rangle. \]
By strict convexity, $2 f(z) < f(x) + f(y)$, and so
\[
\begin{aligned}
0 > 4 f(z) - 2 f(x) - 2 f(y) &\ge \langle y - x, x^* \rangle + \langle x - y, y^* \rangle \\
&= \langle x - y, y^* - x^* \rangle.
\end{aligned}
\]
Hence, $\langle x - y, x^* - y^* \rangle > 0$ and $\partial f$ is strictly monotone.

Fact 3.2.6 The subdifferential of any proper, lower semicontinuous convex function is maximal monotone. [62]

Fact 3.2.7 The set of maximal cyclical monotone operators on a Banach space is identical to the set of subdifferentials of proper, lower semicontinuous convex functions on that space. [62] [65]

These last two Facts imply that if $T$ is maximal cyclical monotone, then $T$ is maximal monotone.

Proposition 3.2.8 [5, Theorem 2.9] If $T : X \to 2^X$ is maximal monotone, then for each $n \in \{2, 3, \ldots\}$, the following are equivalent:
(i) $T$ is $n$-cyclic monotone,
(ii) $T$ is maximal $n$-cyclic monotone.

By these results, if $T$ is maximal monotone, either it is maximal cyclical monotone (hence equal to a subdifferential by Fact 3.2.7) or there exists an $n$ such that $T$ is $n$-cyclic monotone but not $(n + 1)$-cyclic monotone. (See for instance Example 4.6 in [6].) Maximal monotonicity is required for this result, since there exist maximal 3-cyclic monotone operators which are not maximal 2-cyclic monotone [5] (duplicated in Example 5.0.30).

Corollary 3.2.9 For any operator $T : X \to 2^X$, the following are equivalent:
(i) $T$ is cyclical monotone and maximal monotone,
(ii) $T$ is maximal $n$-cyclic monotone for each $n \ge 2$,
(iii) $T$ is maximal cyclical monotone,
(iv) $T$ is the subdifferential of a proper, lower semicontinuous convex function on $X$.

Proof. That (i) $\Rightarrow$ (ii) follows directly from Proposition 3.2.8. That (ii) $\Rightarrow$ (iii) follows by the definitions. (iii) $\Leftrightarrow$ (iv) by Fact 3.2.7, and together with Fact 3.2.6, it follows that (iii) $\Rightarrow$ (i).

Corollary 3.2.10 The subdifferential of any proper lower semicontinuous convex function on a Hilbert space is paramonotone.

Proof. Apply Fact 3.2.7, Fact 3.2.6 and Proposition 3.1.3.

Remark 3.2.11 By the above, and Fact 3.1.2, subdifferentials of proper lower semicontinuous convex functions are necessarily cyclical monotone, maximal monotone, paramonotone, and $3^*$-monotone. Although these properties hold only for lower semicontinuous convex functions, the following fact demonstrates how proper convex functions are nearly lower semicontinuous anyway.

Fact 3.2.12 Given a proper convex function $f : X \to \mathbb{R} \cup \{+\infty\}$, $f$ is lower semicontinuous at any $x \in \operatorname{dom} \partial f$.

Proof. Aiming for a contradiction, suppose that $f$ is not lower semicontinuous at some $x_0 \in \operatorname{dom} \partial f$. Then, for some sequence $(x_n)_{n \in \mathbb{N}} \to x_0$, the limit $\lim_{n \to +\infty} f(x_n) = k < f(x_0)$. Moreover, for all $x^* \in X$,
\[ \langle x_n - x_0, x^* \rangle \ge -\|x^*\| \, \|x_n - x_0\| \to 0. \]
Taken together, and for any given $x^* \in X$, there is an $N > 0$ such that for all $n \ge N$,
\[ f(x_n) - f(x_0) < k + \frac{f(x_0) - k}{2} - f(x_0) = \frac{k - f(x_0)}{2} < 0, \]
and
\[ \langle x_n - x_0, x^* \rangle > \frac{k - f(x_0)}{2}. \]
Therefore, as $f(x_n) - f(x_0) < \langle x_n - x_0, x^* \rangle$ for sufficiently large $n$, there exists no subgradient $x^*$ at $x_0$, contradicting the assumption that $x_0 \in \operatorname{dom} \partial f$.

Fact 3.2.13 (shift-invariance) All classes of monotone operators defined above are shift-invariant. That is, if the operator $T : X \to 2^X$ satisfies the criteria defining any of the above classes of monotone operators, then so does the operator $S : X \to 2^X$ defined by $Sx := T(x - a) + a^*$, equivalently defined by $\operatorname{gra} S := \operatorname{gra} T + (a, a^*)$, for any $(a, a^*) \in X \times X$.

Proposition 3.2.14 (operator inverse) If an operator $T : X \to 2^X$ is any of
(i) monotone,
(ii) $3^*$-monotone,
(iii) paramonotone,
(iv) maximal monotone, or
(v) $n$-cyclic monotone,
then so is its inverse $T^{-1} : X \to 2^X$. However,
(vi) strict monotonicity and
(vii) strong monotonicity
are not necessarily preserved by the operator inverse.

Proof. (i)-(iv) Obvious as $(x, x^*) \in \operatorname{gra} T \Leftrightarrow (x^*, x) \in \operatorname{gra} T^{-1}$.
(v) Suppose $T$ is $n$-cyclic monotone for some $n$. Then, (2.1.4) holds for any sequence of points $(a_i, a_i^*)_{i \in \{1, 2, \ldots, n\}} \subset \operatorname{gra} T$, where $(a_{n+1}, a_{n+1}^*) := (a_1, a_1^*)$, as well as for that sequence of points in reverse $(b_i := a_{n+1-i})_{i \in \{1, 2, \ldots, n\}}$, where $(b_{n+1}, b_{n+1}^*) = (a_0, a_0^*) := (a_n, a_n^*)$. Therefore,
\[
\begin{aligned}
0 &\le \sum_{i=1}^{n} \langle b_i - b_{i+1}, b_i^* \rangle \\
&= \sum_{i=1}^{n} \langle a_{n+1-i} - a_{n-i}, a_{n+1-i}^* \rangle \\
&= \sum_{i=1}^{n} \langle a_{n+1-i}, a_{n+1-i}^* \rangle - \sum_{i=1}^{n} \langle a_{n-i}, a_{n+1-i}^* \rangle \\
&= \sum_{i=1}^{n} \langle a_i, a_i^* \rangle - \sum_{i=1}^{n} \langle a_i, a_{i+1}^* \rangle \\
&= \sum_{i=1}^{n} \langle a_i, a_i^* - a_{i+1}^* \rangle,
\end{aligned}
\]
and so $T^{-1}$ is also $n$-cyclic monotone.
(vi) The inverse of a strictly monotone operator is not necessarily strictly monotone since the operator $T : \mathbb{R} \to 2^{\mathbb{R}}$ defined by
\[
Tx :=
\begin{cases}
x - 1, & x < 0, \\
[-1, 1], & x = 0, \\
x + 1, & x > 0,
\end{cases}
\]
is strictly monotone, but its inverse is not.
(vii) Similarly, the inverse of the strongly monotone operator T : R → R defined by
$$ T x := \begin{cases} -e^{-x} + 1, & x < 0, \\ 0, & x = 0, \\ e^{x} - 1, & x > 0 \end{cases} $$
is the function defined in (2.1.3), which is not strongly monotone.

Remark 3.2.15 Note that the domain of T⁻¹ is the image of T, and so may fail to be either convex or connected. Although most other monotone operators considered in this section have full domain, the inverse of such an operator cannot have full domain unless the range of the operator is the entire space X.

3.3 Monotone operators on product spaces

Let X1 and X2 be Hilbert spaces, and consider set-valued operators T1 : X1 → 2^X1 and T2 : X2 → 2^X2. The product operator T1 × T2 : X1 × X2 → 2^(X1×X2) is defined as
(T1 × T2)(x1, x2) := {(x∗1, x∗2) : x∗1 ∈ T1 x1 and x∗2 ∈ T2 x2}.

Proposition 3.3.1 If both T1 and T2 are monotone, then the product operator T1 × T2 is also monotone.

Proof. For any points ((x1, x2), (x∗1, x∗2)), ((y1, y2), (y∗1, y∗2)) ∈ gra(T1 × T2),
⟨(x1, x2) − (y1, y2), (x∗1, x∗2) − (y∗1, y∗2)⟩ = ⟨x1 − y1, x∗1 − y∗1⟩ + ⟨x2 − y2, x∗2 − y∗2⟩ ≥ 0.
Hence, T1 × T2 is monotone.

Proposition 3.3.2 If both T1 and T2 are paramonotone, then the product operator T1 × T2 is also paramonotone.

Proof. If x∗i ∈ Ti xi, y∗i ∈ Ti yi for i ∈ {1, 2} and ⟨(x1, x2) − (y1, y2), (x∗1, x∗2) − (y∗1, y∗2)⟩ = 0, then ⟨xi − yi, x∗i − y∗i⟩ = 0 for i ∈ {1, 2}, since both T1 and T2 are monotone. By the paramonotonicity of T1 and T2, y∗i ∈ Ti xi and x∗i ∈ Ti yi for i ∈ {1, 2}, and so (x∗1, x∗2) ∈ (T1 × T2)(y1, y2) and (y∗1, y∗2) ∈ (T1 × T2)(x1, x2).

By following the same proof structure as Proposition 3.3.2, a similar result immediately follows for some other monotone classes.

Proposition 3.3.3 If both T1 and T2 belong to the same monotone class, where that class is one of strict, n-cyclic, or 3∗-monotonicity, then so does their product operator T1 × T2.
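As a minimal numerical sketch of the product construction and Proposition 3.3.1, assume single-valued operators on R represented as Python callables. The names `product_operator`, `inner`, and `is_monotone_on`, and the choice of T1 and T2, are illustrative assumptions and do not come from the text. A finite sample can only fail to find a counterexample; it is not a proof.

```python
from itertools import product as cartesian

def product_operator(T1, T2):
    """Product operator (T1 x T2)(x1, x2) = (T1 x1, T2 x2) for single-valued T1, T2."""
    return lambda x: (T1(x[0]), T2(x[1]))

def inner(u, v):
    """Euclidean inner product of two tuples."""
    return sum(a * b for a, b in zip(u, v))

def is_monotone_on(T, points, tol=1e-12):
    """Check <x - y, Tx - Ty> >= 0 for every sampled pair (a finite test only)."""
    for x, y in cartesian(points, repeat=2):
        dx = tuple(a - b for a, b in zip(x, y))
        dT = tuple(a - b for a, b in zip(T(x), T(y)))
        if inner(dx, dT) < -tol:
            return False
    return True

T1 = lambda t: t ** 3          # monotone on R (illustrative choice)
T2 = lambda t: max(t, 0.0)     # monotone on R (illustrative choice)
grid = [(-2 + 0.5 * i, -2 + 0.5 * j) for i in range(9) for j in range(9)]

print(is_monotone_on(product_operator(T1, T2), grid))
print(is_monotone_on(product_operator(lambda t: -t, T2), grid))
```

The second call replaces T1 by the non-monotone map t ↦ −t, so the sampled inequality fails for the product as well.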
Proposition 3.3.4 If both T1 and T2 are maximal monotone, then the product operator T1 × T2 is also maximal monotone.

Proof. Suppose T1 × T2 is not maximal monotone. Then there exists a point ((x1, x2), (x∗1, x∗2)) ∉ gra(T1 × T2) such that for all ((y1, y2), (y∗1, y∗2)) ∈ gra(T1 × T2),
⟨x1 − y1, x∗1 − y∗1⟩ + ⟨x2 − y2, x∗2 − y∗2⟩ ≥ 0, (3.3.1)
and at least one of (x1, x∗1) ∉ gra T1 or (x2, x∗2) ∉ gra T2 holds. Suppose without loss of generality that (x1, x∗1) ∉ gra T1.

By the maximality of T1, ⟨x1 − z1, x∗1 − z∗1⟩ < 0 for some (z1, z∗1) ∈ gra T1, and so by setting (y1, y∗1) := (z1, z∗1) in (3.3.1), ⟨x2 − y2, x∗2 − y∗2⟩ ≥ 0 for all (y2, y∗2) ∈ gra T2. Since T2 is maximal monotone, it must be that (x2, x∗2) ∈ gra T2.

Clearly, ((z1, x2), (z∗1, x∗2)) ∈ gra(T1 × T2), yet
⟨(x1, x2) − (z1, x2), (x∗1, x∗2) − (z∗1, x∗2)⟩ < 0.
This contradicts (3.3.1), and so T1 × T2 is maximal monotone.

Remark 3.3.5 Of course, if an operator T1 : X → 2^X fails to satisfy the conditions for one of the classes of monotone operator considered here, then the product of that operator with any other operator T2 : Y → 2^Y, namely T1 × T2 : X × Y → 2^(X×Y), will also fail the same condition. Simply consider the set of points P in the graph of T1 which violate a particular condition in X, and instead consider the set of points P̃ := {((p, a), (p∗, a∗)) : (p, p∗) ∈ P} for a fixed arbitrary point (a, a∗) ∈ gra T2. Clearly P̃ ⊂ gra(T1 × T2), and this set will violate the same conditions in X × Y that P violates for T1 in X. For instance,
⟨(w, a) − (x, a), (y∗, a∗) − (z∗, a∗)⟩ = ⟨w − x, y∗ − z∗⟩.
In this manner, the lack of a monotone class property (be it n-cyclic, para-, maximal, 3∗-, or strict monotonicity) is dominant in the product space.
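The lifting argument of Remark 3.3.5 can be illustrated with a small sketch. It assumes, purely for illustration, that T1 is the rotation by π/2 on R² (single-valued and monotone, but not paramonotone) and that T2 is the identity on R; the helper names and the flat-tuple representation of points in R² × R are likewise assumptions.

```python
def inner(u, v):
    """Euclidean inner product of two tuples."""
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    """Componentwise difference u - v."""
    return tuple(a - b for a, b in zip(u, v))

def T1(x):
    """Rotation by pi/2 on R^2: monotone, but not paramonotone."""
    return (-x[1], x[0])

def T2(t):
    """Identity on R (hypothetical second factor), on 1-tuples."""
    return (t[0],)

def T_prod(x):
    """Product operator on R^2 x R, with points stored as flat 3-tuples."""
    return T1(x[:2]) + T2(x[2:])

# A pair witnessing the failure of paramonotonicity for T1:
# <w - x, T1 w - T1 x> = 0 although T1 w != T1 x.
w, x = (1.0, 0.0), (0.0, 0.0)
a = (5.0,)  # arbitrary fixed point of dom T2

lifted = inner(sub(w + a, x + a), sub(T_prod(w + a), T_prod(x + a)))
original = inner(sub(w, x), sub(T1(w), T1(x)))
print(lifted, original, T_prod(w + a) != T_prod(x + a))
```

The lifted pair has the same inner product as the original pair and still has distinct images, so the same violation is visible for the product operator.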
Remark 3.3.6 Taken together, the results of this section show that the product operator T1 × T2 of monotone operators T1 and T2 behaves, with respect to monotone class inclusion, as a logical AND applied to the monotone classes of T1 and T2. For instance, suppose that T1 is paramonotone, not strictly monotone, 3-cyclic monotone, 3∗-monotone, and maximal monotone (with binary label 10111), and suppose that T2 is paramonotone, strictly monotone, not 3-cyclic monotone, not 3∗-monotone, and maximal monotone (with binary label 11001). Then T1 × T2 is paramonotone, not strictly monotone, not 3-cyclic monotone, not 3∗-monotone, and maximal monotone (with binary label 10001).

3.4 Further discussion of monotone properties

In this section, we investigate how paramonotonicity and maximal monotonicity guarantee the convexity of the preimage of a point under a monotone operator, and elaborate on the properties of paramonotone and 3∗-monotone operators and how they are useful in theory and application.

3.4.1 Convex preimages

We note that both maximal monotonicity and paramonotonicity allow easy guarantees that the preimage of a point is convex. With the possible exception of results involving paramonotonicity, the following results are well known. When finding the zeros of a monotone operator, it can be useful to know whether the solution set is convex. It is well known that for a maximal monotone operator T, T⁻¹(0) is a closed convex set (see for instance [7]).

Proposition 3.4.1 Let T : X → 2^X be a maximal monotone operator. Then T⁻¹(0) is a closed convex set.

Proof. Suppose T⁻¹(0) is a nonempty set. (The empty set is closed and convex.) Let x, y, z ∈ X be such that 0 ∈ T x, 0 ∈ T z, and y = αx + (1 − α)z for some α ∈ [0, 1]. By the monotonicity of T, for all (u, u∗) ∈ gra T,
⟨u − x, u∗⟩ ≥ 0 and ⟨u − z, u∗⟩ ≥ 0
⇒ ⟨αu − αx, u∗⟩ ≥ 0 and ⟨(1 − α)u − (1 − α)z, u∗⟩ ≥ 0
⇒ ⟨u − y, u∗⟩ ≥ 0.
So, by the maximal monotonicity of T, 0 ∈ T(y), and T⁻¹(0) is convex.
Let (xn)n∈N be any convergent sequence, xn → x̄, such that xn ∈ T⁻¹(0). Then, for any (y, y∗) ∈ gra T, ⟨xn − y, 0 − y∗⟩ ≥ 0 by monotonicity, and so in the limit ⟨x̄ − y, −y∗⟩ ≥ 0. Since T is maximal monotone, (x̄, 0) ∈ gra T, and so T⁻¹(0) is closed.

A similar result also holds for paramonotone operators.

Proposition 3.4.2 Let T : X → 2^X be a paramonotone operator with convex domain. Then T⁻¹(0) is a convex set.

Proof. Suppose T⁻¹(0) is nonempty. Let x, y, z ∈ X be such that 0 ∈ T x, 0 ∈ T z, and y = αx + (1 − α)z for some α ∈ ]0, 1[. Then x − y = (1 − α)(x − z) and y − z = α(x − z), so x − y = ((1 − α)/α)(y − z). Since T has convex domain, T y ≠ ∅. By the monotonicity of T, for all y∗ ∈ T y,
0 ≤ ⟨x − y, −y∗⟩ = ((1 − α)/α) ⟨y − z, −y∗⟩ and 0 ≤ ⟨y − z, y∗⟩,
and so ⟨y − z, y∗⟩ = 0. Therefore, by the paramonotonicity of T, 0 ∈ T(y), and so the set T⁻¹(0) is convex.

Remark 3.4.3 If an operator is not maximal monotone, there is no guarantee that T⁻¹(0) is closed, as the operator T : R → R below demonstrates:
$$ T x := \begin{cases} -1, & x \le -1, \\ 0, & x \in \left]-1, 1\right[, \\ 1, & x \ge 1. \end{cases} \tag{3.4.1} $$
Note that T is paramonotone.

Proposition 3.4.4 Suppose that T : X → 2^X is a monotone operator such that 0 ∈ T x ∩ T y for two points x, y ∈ X. Then for each c in the line segment joining the two points, c ∈ {z : z = λx + (1 − λ)y, λ ∈ ]0, 1[}, it follows that T(c) ⊆ [R(x − y)]⊥.

Proof. Consider any x, y ∈ X such that 0 ∈ T x, 0 ∈ T y. For any arbitrary λ ∈ ]0, 1[, let c := λx + (1 − λ)y. Then, by monotonicity, for any c∗ ∈ T c, should one exist,
0 ≤ ⟨λx + (1 − λ)y − x, c∗⟩ = (1 − λ)⟨y − x, c∗⟩
and
0 ≤ ⟨λx + (1 − λ)y − y, c∗⟩ = λ⟨x − y, c∗⟩.
Therefore, ⟨x − y, c∗⟩ = 0, and so for any z ∈ R(x − y), ⟨z, c∗⟩ = 0.

Corollary 3.4.5 Suppose that T : R → 2^R is a monotone operator with convex domain. Then the preimage T⁻¹(0) is convex.

Proof. Suppose T⁻¹(0) is a nonempty set. Consider any x, y ∈ R such that 0 ∈ T x, 0 ∈ T y. For any arbitrary λ ∈ ]0, 1[, let c := λx + (1 − λ)y.
By Proposition 3.4.4, T c ⊆ {0}. Since T has convex domain, T c ≠ ∅, so T c = {0}, and therefore T⁻¹(0) is always convex.

Remark 3.4.6 The conditions of convex domain in Proposition 3.4.2 and Corollary 3.4.5 are required. Consider for instance the paramonotone operator T := 0 with the point (0, 0) removed from its graph, so that T 0 = ∅. T remains paramonotone, yet the set T⁻¹(0) is not convex.

Remark 3.4.7 The above results can be extended in that the preimage of any point under a maximal monotone or paramonotone operator is a convex set. Let S = T − a for any a ∈ X. Then T⁻¹(a) = S⁻¹(0), and by Fact 3.2.13, S is respectively maximal monotone or paramonotone if T is. Therefore, for any a ∈ X, T⁻¹(a) is convex if T is maximal monotone, or if T has convex domain and either T is paramonotone or T : R → 2^R.

Remark 3.4.8 Suppose T : X → 2^X is maximal monotone. Proposition 3.4.1 and Remark 3.4.7 yield that T⁻¹(a) is closed and convex for all a ∈ X. By Proposition 3.2.14, T⁻¹ is maximal monotone, and (T⁻¹)⁻¹ = T. Hence, T(a) is closed and convex for all a ∈ X.

As the example below demonstrates, without maximal monotonicity or paramonotonicity, the preimage of a given point may not be a convex set.

Example 3.4.9 Let the operator T : R² → R² be defined by
$$ x \mapsto \begin{cases} (1, 0), & x_1 > 0, \\ (0, 0), & x_1 = 0,\ x_2 > 0, \\ (\tfrac{1}{2}, 0), & x_1 = 0,\ x_2 = 0, \\ (0, 0), & x_1 = 0,\ x_2 < 0, \\ (-1, 0), & x_1 < 0. \end{cases} \tag{3.4.2} $$
Note that T is monotone, but T⁻¹(0) is not convex. Furthermore, by Proposition 3.4.2, T is not paramonotone.

Fact 3.4.10 [63] Let T : X → 2^X be a monotone operator such that T is locally bounded at some point in the domain. Then int dom T is a nonempty convex set, T is locally bounded on int dom T, and T is not locally bounded at any x ∈ dom T such that x ∉ int dom T.

Corollary 3.4.11 Let T : X → 2^X be a monotone operator such that T is locally bounded at some point in the domain.
Then, T has full domain if and only if T is locally bounded.

Corollary 3.4.12 [63] Let T : X → 2^X be a monotone operator. If for some subset S ⊂ dom T, 0 ∈ int(conv T(S)), then there is an x ∈ dom T such that 0 ∈ T x.

3.4.2 Paramonotonicity

The condition of paramonotonicity prohibits angles of π/2 between a point in the domain and a point in its image. Paramonotonicity also guarantees that when solving for T⁻¹(0) by steepest descent the resulting operator is nonexpansive.

Paramonotonicity arises in the literature in part because it is the weakest condition required for the convergence of almost all iterative solutions to variational inequality problems. Indeed, the condition first emerged in this context [25] as a sufficient condition for the convergence of a projected-gradient-like method. Variational inequalities were first outlined in 1966 [44], and have since been used to model a large number of problems, as they provide a unified framework for, among others, constrained optimization, saddle point, Nash equilibrium, traffic equilibrium, frictional contact, and complementarity problems. For a good overview of sample problems and methods used to solve them, see [38].

Definition 3.4.13 (Variational Inequality Problem) Given a nonempty closed convex set C and a monotone operator T acting on C, the variational inequality problem, VIP(T, C), is to find an x̄ ∈ C such that for some x̄∗ ∈ T(x̄),
⟨c − x̄, x̄∗⟩ ≥ 0 for all c ∈ C. (3.4.3)

The variational inequality problem is a generalization of the following classical constrained convex minimization problem (with convex function f and convex set C):
min_{x∈C} f(x), (3.4.4)
since when f is convex, lower semicontinuous, and proper on C, T := ∂f is (maximal) monotone on C. Furthermore, minimizers x̄ of (3.4.4) are characterized by the inclusion 0 ∈ ∂f(x̄) + N_C(x̄), where N_C(x) is the normal cone to the set C at x ∈ C.
Additionally, when C = X, the variational inequality problem becomes the inverse problem, i.e., find x̄ such that 0 ∈ T(x̄).

A number of iterative methods for solving (3.4.3) have required paramonotonicity to converge. Examples include an interior point method using Bregman functions [29], an outer approximation method [28], and proximal point algorithms [3] [27]. Often, as in [13], with more work it is possible to show convergence with paramonotonicity where previously stronger conditions, such as strong monotonicity, were required. It should be mentioned that with further assumptions some proximal methods have been used successfully to bypass the restriction of paramonotonicity [26] [34].

A reason for the crucial role of paramonotonicity is detailed in [48] and outlined here. In the analysis of the convergence proofs for iterative solution methods of variational inequality problems, it is possible to show that ⟨x̄ − xC, x∗C⟩ ≥ 0 for iterate cluster points (xC, x∗C) and a solution x̄. The following fact then makes it possible to show that the iterates do indeed converge to a solution, as long as T is paramonotone.

Proposition 3.4.14 [48] Suppose T is paramonotone, and x̄ solves the variational inequality problem for some set C. Then an element xC ∈ C is also a solution to the variational inequality problem if and only if for some x∗C ∈ T xC,
⟨x̄ − xC, x∗C⟩ ≥ 0. (3.4.5)

Proof. [48] Suppose x̄ solves the variational inequality problem for the given C and paramonotone T, and some point (xC, x∗C) ∈ gra T satisfies xC ∈ C and ⟨x̄ − xC, x∗C⟩ ≥ 0. Then x̄ satisfies (3.4.3) for some x̄∗ ∈ T(x̄), so that ⟨x̄ − xC, x̄∗⟩ ≤ 0, and hence
⟨x̄ − xC, x̄∗ − x∗C⟩ ≤ 0.
Since T is monotone, ⟨x̄ − xC, x̄∗ − x∗C⟩ = 0, and so, by the paramonotonicity of T, x̄∗ ∈ T(xC). Furthermore, ⟨x̄ − xC, x̄∗⟩ = ⟨x̄ − xC, x∗C⟩ ≥ 0. Finally, for all c ∈ C,
⟨c − xC, x̄∗⟩ = ⟨c − x̄, x̄∗⟩ + ⟨x̄ − xC, x̄∗⟩ ≥ 0,
and so xC is also a solution to the variational inequality problem.
Conversely, if xC is such that (3.4.5) fails for every x∗C ∈ T(xC), then, since x̄ ∈ C, xC cannot satisfy (3.4.3), and hence cannot solve the variational inequality problem.

Steepest descent

Another way to understand the impact of paramonotonicity is to study the nature of steepest descent algorithms, for instance in solving for the zero of a monotone operator. Indeed, the variational inequality problem can be posed in this manner [68]. For algorithmic purposes, only single-valued monotone operators, or single-valued selections of monotone operators, are considered. We first describe the steepest descent step as an operator.

Definition 3.4.15 (Qλ) Given T : X → X, define Qλ : X → X by Qλ = Id − λT. Sometimes λ is not written explicitly and we refer simply to Q.

For x, y in the domain of T, the following decomposition of Q is useful:
$$
\begin{aligned}
\|Qx - Qy\|^2 &= \|x - \lambda Tx - y + \lambda Ty\|^2 \\
&= \|x - y\|^2 + \lambda^2 \|Tx - Ty\|^2 - 2\lambda \langle x - y, Tx - Ty \rangle.
\end{aligned}
\tag{3.4.6}
$$

By (3.4.6), if T is not paramonotone, then there are x, y ∈ X with ⟨x − y, T x − T y⟩ = 0 and T x ≠ T y, and so Q is expansive for this choice of x, y for every λ > 0. Clearly, this makes it difficult to prove convergence, and so it comes as no surprise that paramonotonicity is required for many methods based on steepest descent ideas, such as descent methods with line search.

For set-valued T, as an algorithm Q is often defined to be single-valued, selecting only one element x∗ ∈ T x for every x in the domain. From (3.4.6) we know that this should be done with care to avoid expansivity, since even when T itself is paramonotone, a single-valued selection of its graph might not be. In Section 3.5, we define single-valued selections and examine a single-valued selection that preserves paramonotonicity.

Hybrid steepest descent

The hybrid steepest descent method was introduced by Yamada in the comprehensive paper [77].
The method solves a variational inequality problem in which the set C is given as the fixed point set of a known nonexpansive operator, which is useful in cases where the projection operator onto the set is too expensive to compute. The method was refined somewhat for more general applications in [79] and [78]. The importance of paramonotonicity for the monotone operator in the variational inequality problem was hinted at in [77] and expounded upon further in [78], although convergence results using paramonotonicity alone are elusive. Often, the stronger condition of (α-)inverse strong monotonicity is used, and there has been much work on variants of this approach [46] [47] [57] [58] [30] [72].

Definition 3.4.16 Given an α > 0, a monotone operator T : X → X is α-inverse strongly monotone if
⟨x − y, T x − T y⟩ ≥ α‖T x − T y‖². (3.4.7)

Note that if T in Definition 3.4.15 is α-inverse strongly monotone, then, by (3.4.6), Qλ is nonexpansive for all λ ∈ [0, 2α].

3.4.3 3∗-monotonicity

We adopt the notation of [82] and use the term 3∗-monotone, although this property was first introduced with no name. The property was first referred to simply as “∗” by Brézis and Haraux [23]; as such, an operator with this property is also sometimes called a (BH)-operator [31] in honour of these original authors. More recently the property has acquired the synonym “rectangular”, since the closure of the domain of the Fitzpatrick function of a monotone operator is rectangular precisely when the operator is 3∗-monotone [70].

3∗-monotonicity has another important property in that if T1 and T2 are 3∗-monotone, then, up to closure and interior, the sum of their ranges is the range of T1 + T2. This statement is made precise by the following Proposition.

Proposition 3.4.17 [23] Let both T1 and T2 be monotone operators such that
(i) T1 + T2 is maximal monotone,
(ii) T2 is 3∗-monotone, and
(iii) dom(T1) ⊆ dom(T2) or T1 is 3∗-monotone.
Then,
$$ \overline{\operatorname{ran}(T_1 + T_2)} = \overline{\operatorname{ran}(T_1) + \operatorname{ran}(T_2)} \tag{3.4.8} $$
and
$$ \operatorname{int}\left(\operatorname{ran}(T_1 + T_2)\right) = \operatorname{int}\left(\operatorname{ran}(T_1) + \operatorname{ran}(T_2)\right). \tag{3.4.9} $$

The above Proposition has many uses. For instance, if two operators are 3∗-monotone, and one is surjective, then if the sum is maximal monotone it is also surjective. Furthermore, if both are continuous monotone linear operators, and at least one is 3∗-monotone, then the kernel of the sum is the intersection of the kernels [6]. These results can be used, as shown in [23], to determine when T⁻¹(0) is nonempty by demonstrating that 0 is in the interior (or is not in the closure) of the sum of the ranges of an intelligent decomposition of a difficult-to-evaluate maximal monotone operator. If the closure (or interior) is insufficient to determine existence of a solution, there is a refinement of the above result by Chu.

Proposition 3.4.18 [31] If the sum of two 3∗-monotone operators T1 and T2 is maximal monotone, then
ri(conv ran(T1) + conv ran(T2)) ⊆ ran(T1 + T2) ⊆ ran(T1) + ran(T2), (3.4.10)
where ri is the relative interior and conv is the convex hull. Furthermore, so long as ri(conv ran(T1) + conv ran(T2)) ≠ ∅,
ri(conv ran(T1) + conv ran(T2)) = ri ran(T1 + T2) = ri(ran(T1) + ran(T2)). (3.4.11)

For an interesting utilization of Proposition 3.4.18 for linear relations on Banach spaces, and its subsequent application to show existence of solutions to primal-dual problem pairs, see Theorems 3.27 and 3.28 in [60].

Some sufficient conditions for 3∗-monotonicity follow.

Proposition 3.4.19 [82] If an operator T in X is monotone and has bounded range, then T is also 3∗-monotone.

Proof. ([82] Proposition 32.41). Suppose T is monotone and has bounded range, so that there exists some M > 0 such that for all (x, x∗) ∈ gra T, ‖x∗‖ < M. For any combination of (x, x∗), (y, y∗), (z, z∗) ∈ gra T,
⟨z − y, y∗ − x∗⟩ ≤ ⟨z − x, y∗ − x∗⟩ (3.4.12)
since by the monotonicity of T, ⟨−y, y∗ − x∗⟩ ≤ ⟨−x, y∗ − x∗⟩.
Hence, by the Cauchy-Schwarz inequality, for any (x, x∗), (z, z∗) ∈ gra T,
$$ \sup_{(y, y^*) \in \operatorname{gra} T} \langle z - y, y^* - x^* \rangle \le \sup_{y^* \in \operatorname{ran} T} \|z - x\| \, \|y^* - x^*\| \le (M + \|x^*\|) \, \|z - x\|, \tag{3.4.13} $$
and so T is 3∗-monotone.

Corollary 3.4.20 3∗-monotone operators are not necessarily weakly coercive.

Proposition 3.4.21 [23] Strongly coercive monotone operators are 3∗-monotone.

Proof. Suppose an operator T : X → 2^X is strongly coercive. Let (z, z∗) ∈ gra T and x∗ ∈ ran T be chosen arbitrarily. Let ((yn, y∗n))n∈N ⊂ gra T be a sequence that establishes the supremum
$$ \lim_{n \to +\infty} \langle z - y_n, y_n^* - x^* \rangle = \sup_{(y, y^*) \in \operatorname{gra} T} \langle z - y, y^* - x^* \rangle. \tag{3.4.14} $$

First suppose that the sequence (‖yn‖)n∈N ⊂ R has a bounded subsequence, so that ‖y_σ(n)‖ ≤ M < +∞, where σ : N → N is the relabeling. Note that the sequence ((y_σ(n), y∗_σ(n)))n∈N also satisfies (3.4.14) in the place of ((yn, y∗n))n∈N. Since T is monotone, ⟨z − y_σ(n), z∗ − y∗_σ(n)⟩ ≥ 0, and so for all n ∈ N,
⟨z − y_σ(n), y∗_σ(n) − x∗⟩ ≤ ⟨z − y_σ(n), z∗ − x∗⟩, (3.4.15)
and by the Cauchy-Schwarz inequality
⟨z − y_σ(n), y∗_σ(n) − x∗⟩ ≤ (‖z‖ + M)‖z∗ − x∗‖. (3.4.16)
As a result,
$$ \sup_{(y, y^*) \in \operatorname{gra} T} \langle z - y, y^* - x^* \rangle = \lim_{n \to +\infty} \langle z - y_{\sigma(n)}, y_{\sigma(n)}^* - x^* \rangle \le (\|z\| + M) \, \|z^* - x^*\| < +\infty. $$

Suppose now that there is no such bounded subsequence, and that ‖yn‖ → +∞. In this case, there exists an N > 0 such that for all n ≥ N, ‖yn‖ ≥ max{‖x∗‖, ‖z‖} and, by strong coercivity, inf_{y∗ ∈ T yn} ⟨yn − z, y∗⟩ / ‖yn‖ ≥ 3‖x∗‖. Therefore, for all n > N,
$$
\begin{aligned}
\langle y_n - z, y_n^* - x^* \rangle &\ge 3 \|x^*\| \, \|y_n\| - \langle y_n - z, x^* \rangle \\
&\ge 3 \|x^*\| \, \|y_n\| - \|x^*\| \, \|y_n - z\| \\
&\ge 3 \|x^*\| \, \|y_n\| - 2 \|x^*\| \, \|y_n\| \\
&\ge \|x^*\| \, \|y_n\|.
\end{aligned}
$$
Hence, since x∗ might be 0,
$$ \sup_{(y, y^*) \in \operatorname{gra} T} \langle z - y, y^* - x^* \rangle = \lim_{n \to +\infty} \langle z - y_n, y_n^* - x^* \rangle \le 0 < +\infty, $$
and so T is 3∗-monotone.

For single-valued linear operators, stronger results hold.

Proposition 3.4.22 [23] Let A : X → X be a bounded monotone linear operator. Then A is 3∗-monotone if and only if A is α-inverse strongly monotone, that is, there exists an α > 0 such that ⟨x, Ax⟩ ≥ α⟨Ax, Ax⟩ = α‖Ax‖² for all x ∈ X.
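To make the criterion of Proposition 3.4.22 concrete, the following sketch estimates the best sampled constant α in ⟨x, Ax⟩ ≥ α‖Ax‖² for two 2×2 matrices, the identity and the rotation by π/2. The function names, the choice of matrices, and the sampling on the unit circle are illustrative assumptions. Because the ratio ⟨x, Ax⟩/‖Ax‖² is invariant under scaling of x, directions on the unit circle suffice, but a finite sample only suggests the answer and does not prove it.

```python
import math

def matvec(A, x):
    """Matrix-vector product for A given as a tuple of rows."""
    return tuple(sum(a * b for a, b in zip(row, x)) for row in A)

def inner(u, v):
    """Euclidean inner product of two tuples."""
    return sum(a * b for a, b in zip(u, v))

def alpha_estimate(A, samples):
    """Smallest sampled ratio <x, Ax> / ||Ax||^2 over samples with Ax != 0.

    A positive value is consistent with alpha-inverse strong monotonicity;
    a value <= 0 rules it out on the sampled directions.
    Assumes at least one sample has Ax != 0.
    """
    ratios = []
    for x in samples:
        Ax = matvec(A, x)
        norm_sq = inner(Ax, Ax)
        if norm_sq > 1e-12:
            ratios.append(inner(x, Ax) / norm_sq)
    return min(ratios)

circle = [(math.cos(2 * math.pi * k / 64), math.sin(2 * math.pi * k / 64))
          for k in range(64)]
identity = ((1.0, 0.0), (0.0, 1.0))
rotation = ((0.0, -1.0), (1.0, 0.0))   # skew: <x, Ax> = 0 for every x

print(alpha_estimate(identity, circle))
print(alpha_estimate(rotation, circle))
```

For the rotation, ⟨x, Ax⟩ vanishes identically while ‖Ax‖ = ‖x‖, so no α > 0 can work; by Proposition 3.4.22 this bounded monotone linear operator is therefore not 3∗-monotone.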
Corollary 3.4.23 If A : X → X is a bounded linear 3∗-monotone operator, then it is paramonotone.

We obtain a stronger result similar to Corollary 3.4.23 for unbounded and multivalued linear relations later in Proposition 4.1.11. Unfortunately, as is shown below in Example 5.0.16, there are nonlinear 3∗-monotone operators that fail to be paramonotone. Hence, the application of 3∗-monotonicity to variational inequality solution techniques requiring paramonotonicity applies only for bounded linear operators.

3.5 Single valued selections of monotone operators

Below we define selections and single-valued selections of monotone operators, and in particular require that such selections be domain preserving. That single-valued selections preserve n-cyclic, 3∗-, and strict monotonicity is clear from the definitions of these properties. We show that paramonotonicity can be preserved if the original operator is maximal monotone and the single-valued selection is chosen with care. Maximal monotonicity is never preserved if the original operator is multi-valued at any point in its domain.

Definition 3.5.1 (selection) Given an operator T : X → 2^X, a selection of T is an operator T̃ : X → 2^X such that dom T̃ = dom T and gra T̃ ⊂ gra T.

Note that a single-valued selection of an operator T is a selection of T that is single-valued, that is, for every x ∈ dom T, T̃x is a singleton. Recall that for single-valued operators, we often use the notation T x = x∗ instead of T x = {x∗}.

Fact 3.5.2 Let T̃ : X → X be a single-valued selection from an operator T : X → 2^X. If T is any of monotone, strictly monotone, n-cyclic monotone for some n, or 3∗-monotone, then so is T̃.

Proposition 3.5.3 Let T := ∂f : X → 2^X be the subdifferential of a proper lower semicontinuous (strictly) convex function f, and suppose that T is not single-valued everywhere. Then any single-valued selection T̃ of T is also (strictly) monotone and cyclical monotone; T is maximal monotone, while T̃ is not.
Proof. Direct from Fact 3.5.2 and Remark 3.2.11.

For the following Proposition, given any maximal monotone operator T : X → 2^X, let T̃ : X → X denote the single-valued selection determined by choosing the point of minimum norm:
T̃x := argmin{‖x∗‖ : x∗ ∈ T x}. (3.5.1)
In particular, if 0 ∈ T x, then T̃x = 0. Note that if T is maximal monotone, then T x is closed and convex for every x in the domain of T (see Remark 3.4.8 and, for instance, [7]). Since ‖·‖ is a sublinear function, the set {‖x∗‖ : x∗ ∈ T x} ⊂ [0, +∞[ is also closed and convex. Therefore, a unique minimum exists and T̃ is well defined when T is maximal monotone.

Proposition 3.5.4 If T is paramonotone and maximal monotone, then T̃ is paramonotone.

Proof. Suppose that T is paramonotone. If T̃ is strictly monotone, then it is paramonotone. Otherwise, there are distinct x and y in X such that ⟨T̃x − T̃y, x − y⟩ = 0. Since T is paramonotone, T̃x ∈ T y and T̃y ∈ T x. It cannot be that ‖T̃x‖ < ‖T̃y‖ since T̃x ∈ T y, nor can ‖T̃x‖ > ‖T̃y‖ since T̃y ∈ T x. Hence, ‖T̃x‖ = ‖T̃y‖. By the uniqueness of argmin{‖y∗‖ : y∗ ∈ T y}, and since T̃x ∈ T y, it follows that T̃x = T̃y, and that T̃ is paramonotone.

Remark 3.5.5 Should a single-valued selection be made in a different manner, the preservation of paramonotonicity is not guaranteed. The single-valued selection of the paramonotone subdifferential of the proper lower semicontinuous convex function f(x1, x2) := |x1|, given in Example 3.4.9 above, is not paramonotone. A similar non-paramonotone selection of the same convex function is given later in Example 5.0.17, although the operator T from Example 5.0.17 does not have the properties which facilitate the discussion which follows (in particular T⁻¹(0) = 0 and the graph of T would have to be changed at more than one point in order to obtain paramonotonicity).

Consider T to be the operator in Example 3.4.9. Taking T̃ as from (3.5.1), if T̃ is well-defined, it may be paramonotone even though T is not.
Take T as defined in Example 3.4.9 and add the point ((0, 0), (0, 0)) to its graph, so that gra S := gra T ∪ {((0, 0), (0, 0))}. The operator S is monotone since ((0, 0), (0, 0)) is a monotone extension of T. Now, even though S is not paramonotone (for the same reasons that T is not), S̃ is paramonotone since S̃(0, 0) = (0, 0) = S̃(0, k) for all k ∈ R.

3.6 Sums of monotone operators

For this section, T1 : X → 2^X and T2 : X → 2^X are both monotone operators, and the sum of these operators is defined by T1 + T2 : X → 2^X, where for x ∈ dom T1 ∩ dom T2, (T1 + T2)x := {x∗1 + x∗2 : x∗1 ∈ T1 x, x∗2 ∈ T2 x}, and for x ∉ dom T1 ∩ dom T2, (T1 + T2)x := ∅. (If T1 + T2 were defined outside of dom T1 ∩ dom T2 to equal either T1 or T2 as appropriate, monotonicity might not be preserved.)

We show that all monotone classes considered except for maximal monotonicity are preserved under addition. Maximal monotonicity is preserved as long as dom(T1) ∩ int dom(T2) ≠ ∅.

Remark 3.6.1 It is straightforward from the definitions that if T1 and T2 are both monotone, strictly monotone, or n-cyclic monotone, then so, respectively, is T1 + T2. Furthermore, strict monotonicity dominates under addition, in that if either T1 or T2 is strictly monotone, so is T1 + T2.

Proposition 3.6.2 (3∗-monotonicity is sum-preserving) Suppose T1 : X → 2^X and T2 : X → 2^X are both 3∗-monotone operators on X. Then the operator T := T1 + T2 is also 3∗-monotone.

Proof. This follows from a simple aspect of the supremum. For all (x, x∗) ∈ gra T, where x∗ = x∗1 + x∗2 for some x∗1 ∈ T1 x and x∗2 ∈ T2 x, and for all z ∈ X,
$$
\begin{aligned}
\sup_{(y, y^*) \in \operatorname{gra} T} \langle y^* - x^*, z - y \rangle
&\le \sup_{y \in \operatorname{dom} T} \Big( \sup_{y^* \in T_1 y} \langle y^* - x_1^*, z - y \rangle + \sup_{y^* \in T_2 y} \langle y^* - x_2^*, z - y \rangle \Big) \\
&\le \max\Big\{0, \sup_{(y, y^*) \in \operatorname{gra} T_1} \langle y^* - x_1^*, z - y \rangle \Big\} + \max\Big\{0, \sup_{(y, y^*) \in \operatorname{gra} T_2} \langle y^* - x_2^*, z - y \rangle \Big\} \\
&< +\infty.
\end{aligned}
\tag{3.6.1}
$$

Although the following fact was first proved in [66], a more concise proof is given in [16], where techniques are simplified by the use of the Fitzpatrick function.

Fact 3.6.3 [66] If maximal monotone operators T1 and T2 on a Hilbert space are such that dom(T1) ∩ int dom(T2) ≠ ∅, then T1 + T2 is maximal monotone.

Remark 3.6.4 It should be noted that this result remains famously open for general Banach spaces, and there has been some progress towards a result in recent years. Stephen Simons's monograph [71] gives an excellent overview of the problem, although some more progress has been made recently, see [8].

Remark 3.6.5 Note that the failure of maximal monotonicity dominates under addition if T1 is not maximal monotone and T2 is single-valued (or vice versa), provided that a point of monotone extension of either T1 or T2 occurs within dom T1 ∩ dom T2. Without loss of generality, suppose T1 is not maximal monotone and admits a monotone-preserving graph extension (x, x∗1). If x ∈ dom T2, then (x, x∗1 + x∗2) is a monotone-preserving graph extension of T1 + T2 for all x∗2 ∈ T2 x.

Proposition 3.6.6 [12] Suppose T1 : X → 2^X and T2 : X → 2^X are both paramonotone operators. Then the operator T := T1 + T2 is also paramonotone.

Proof. Suppose ⟨x − y, x∗ − y∗⟩ = 0 for the points (x, x∗), (y, y∗) ∈ gra(T1 + T2). Since x∗ = x∗1 + x∗2 for some x∗1 ∈ T1 x and x∗2 ∈ T2 x, and y∗ = y∗1 + y∗2 for some y∗1 ∈ T1 y and y∗2 ∈ T2 y, by monotonicity ⟨x − y, x∗1 − y∗1⟩ = 0 and ⟨x − y, x∗2 − y∗2⟩ = 0. By the paramonotonicity of T1 and T2, x∗1 ∈ T1 y, y∗1 ∈ T1 x, x∗2 ∈ T2 y, and y∗2 ∈ T2 x. Therefore, x∗1 + x∗2 ∈ (T1 + T2)y and y∗1 + y∗2 ∈ (T1 + T2)x, and so T1 + T2 is paramonotone.

Remark 3.6.7 Any operator summed with itself will belong to the same monotone classes; for instance, if T is paramonotone, so is T + T. Indeed, it is easy to verify that for any k > 0, kT belongs to the same monotone classes as T.
However, the sum of two monotone operators that are both not strictly monotone, not n-cyclic monotone, not paramonotone, or not 3∗-monotone, reveals nothing about the monotone classes to which the sum belongs, even if the domains of the operators coincide. This is demonstrated by the two cases below.

Consider the linear monotone operators Q : R² → R²,
$$ Q := \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, $$
and −Q. As shown below in Example 5.0.12, these operators are maximal monotone but belong to none of the other monotone classes here considered. On the other hand, Q − Q = 0, which is maximal monotone but also paramonotone, cyclical monotone, and 3∗-monotone (see Chapter 5).

For the case of strict monotonicity, consider the monotone operators T1, T2 : R → R:
$$ T_1 x := \begin{cases} x - 1, & x > 1, \\ 0, & x \in [0, 1], \\ x, & x < 0, \end{cases} \qquad T_2 x := \begin{cases} x - 1, & x > 2, \\ 1, & x \in [1, 2], \\ x, & x < 1. \end{cases} \tag{3.6.2} $$
Both T1 and T2 are not strictly monotone, yet their sum
$$ (T_1 + T_2) x := \begin{cases} 2x - 2, & x > 2, \\ x, & x \in [0, 2], \\ 2x, & x < 0 \end{cases} \tag{3.6.3} $$
is strictly monotone.

Finally, the addition of an arbitrarily small operator can be used to make an operator 3∗-monotone and strictly (and hence para-) monotone. The regularization below was noted by Brézis and Haraux upon introduction of the 3∗-monotone operator class [23], although without mention of strict monotonicity.

Proposition 3.6.8 (Regularization) For any monotone operator T : X → 2^X, adding δ Id for some δ > 0 regularizes the operator to be strictly monotone, 3∗-monotone, and paramonotone.

Proof. Given that strict monotonicity dominates under addition, T + δ Id must be strictly monotone, and therefore paramonotone. Also, since δ Id is strongly coercive, the sum must be as well, and by Proposition 3.4.21, T + δ Id is 3∗-monotone.
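The strict monotonicity example (3.6.2)-(3.6.3) can be checked numerically. The sketch below is an illustration under assumptions: the function names and the sample grid are not from the text, and for single-valued operators on R strict monotonicity is tested as strict increase along a sorted grid, which can only detect failures at sampled points.

```python
def T1(x):
    """First operator of (3.6.2): flat on [0, 1]."""
    if x > 1:
        return x - 1
    if x >= 0:
        return 0.0
    return x

def T2(x):
    """Second operator of (3.6.2): flat on [1, 2]."""
    if x > 2:
        return x - 1
    if x >= 1:
        return 1.0
    return x

def T_sum(x):
    """Closed form (3.6.3) of T1 + T2."""
    if x > 2:
        return 2 * x - 2
    if x >= 0:
        return x
    return 2 * x

def is_strictly_increasing(T, xs):
    """Strict increase of T along the sorted sample xs (a finite test only)."""
    vals = [T(x) for x in xs]
    return all(a < b for a, b in zip(vals, vals[1:]))

xs = [-3 + 0.25 * k for k in range(33)]   # grid on [-3, 5]

print(all(abs(T1(x) + T2(x) - T_sum(x)) < 1e-12 for x in xs))
print(is_strictly_increasing(T1, xs), is_strictly_increasing(T2, xs))
print(is_strictly_increasing(lambda x: T1(x) + T2(x), xs))
```

The flat pieces of T1 and T2 lie on [0, 1] and [1, 2] respectively and overlap only at the single point x = 1, which is why the sum has no flat piece.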
3.7 Operators in one dimension (X = R)

In this section we show that any monotone operator T : R → R is a selection of the subgradient of some proper lower semicontinuous convex function f : R → R ∪ {+∞}, and as such is paramonotone, 3-cyclic monotone, and 3∗-monotone as well.

Proposition 3.7.1 [64] Every monotone operator T : R → 2^R is cyclical monotone.

Proof. Since T is monotone, it is 2-cyclic monotone. For the inductive step, assume that T is n-cyclic monotone. Let (a_i, a∗_i) for 1 ≤ i ≤ n + 1 be arbitrary points in gra T, and define a_{n+2} = a_1. We must show that T is (n + 1)-cyclic monotone, by demonstrating that
$$ \sum_{i=1}^{n+1} (a_i - a_{i+1}) a_i^* \ge 0. \tag{3.7.1} $$
Since a cyclic permutation has no effect in (3.7.1), we can assume without loss of generality that a_{n+1} = max{a_j : 1 ≤ j ≤ n + 1}.

If a_{n+1} = a_n, then both a_n and a∗_n can be completely removed from (3.7.1), which becomes
$$ \sum_{i=1}^{n-2} (a_i - a_{i+1}) a_i^* + (a_{n-1} - a_{n+1}) a_{n-1}^* + (a_{n+1} - a_1) a_{n+1}^* \ge 0. $$
This holds since T is n-cyclic monotone.

Suppose instead that a_{n+1} > a_n. By monotonicity, a∗_{n+1} ≥ a∗_n, and so (a_{n+1} − a_1) a∗_n ≤ (a_{n+1} − a_1) a∗_{n+1}. Since T is n-cyclic monotone,
$$
\begin{aligned}
\sum_{i=1}^{n} (a_i - a_{i+1}) a_i^* &= \sum_{i=1}^{n-1} (a_i - a_{i+1}) a_i^* + a_n a_n^* - a_{n+1} a_n^* \\
&\ge 0 + a_1 a_n^* - a_{n+1} a_n^* \\
&\ge (a_1 - a_{n+1}) a_{n+1}^* \\
&= (a_{n+2} - a_{n+1}) a_{n+1}^*,
\end{aligned}
\tag{3.7.2}
$$
hence (3.7.1) holds and so T is (n + 1)-cyclic monotone.

Fact 3.7.2 Any one-dimensional monotone operator T : R → 2^R must be cyclical monotone, 3∗-monotone, and paramonotone, and gra T is a subset of the graph of the subgradient of some lower semicontinuous convex function g : R → R ∪ {+∞}.

Proof. By Proposition 3.7.1, as all monotone operators on R are 3-cyclic monotone, they are also 3∗-monotone by Fact 3.1.2. For any monotone operator T operating on R, and for any (x, x∗), (y, y∗) in the graph of T such that ⟨x − y, x∗ − y∗⟩ = (x − y)(x∗ − y∗) = 0 and x ≠ y, then x∗ = y∗, and so T has to be paramonotone.
As monotone operators on $\mathbb{R}$ are cyclical monotone, any maximal monotone one-dimensional operator must be the subgradient of a lower semicontinuous proper convex function [65]. Therefore, the graph of a monotone operator $T : \mathbb{R} \to \mathbb{R}$ is a subgraph of such a subgradient.

3.8  Spherically symmetric monotone operators

Spherically symmetric operators are those operators that are rotationally invariant (commutative with rotation operators), and are considered in this paper to also include the odd operators in $\mathbb{R}^1$, as rotational operators contain the reflection operator $x \mapsto -x$. Although this class of operators includes rotation operators themselves in $\mathbb{R}^2$, in dimensions higher than two rotation operators do not commute. Rotational operators in $\mathbb{R}^2$ and their sums with other spherically symmetric operators can be excluded by considering only cyclical monotone operators (see Proposition 5.0.14 below). Note that spherically symmetric operators do not have spherically symmetric graphs.

The properties of monotone operators on $\mathbb{R}$ (including paramonotonicity) also apply to the simplest of operators in larger spaces, those with spherical symmetry, as the result below demonstrates. Note that we implicitly use the definition of monotone classes on a subset of the space (Definition 2.1.13) for this result.

Proposition 3.8.1 Given a monotone operator $g : \mathbb{R} \to 2^{\mathbb{R}}$ such that $g(t) = \{0\}$ for all $t \le 0$, define $T : X \to 2^X$ by
\[ T : x \mapsto \begin{cases} \{0\}, & x = 0, \\ \left\{ g^* \dfrac{x}{\|x\|} : g^* \in g(\|x\|) \right\}, & x \ne 0. \end{cases} \tag{3.8.1} \]
Then,
(i) $T$ is monotone,
(ii) $T$ is strictly monotone if $g$ is strictly monotone on $\mathbb{R}_+$,
(iii) $T$ is paramonotone,
(iv) $T$ is cyclical monotone,
(v) $T$ is $3^*$-monotone,
(vi) $T$ is maximal monotone if $g$ is maximal monotone.

Proof. (i) Since $g$ is monotone and since $(0, 0) \in \operatorname{gra} g$, for all $t > 0$, $g^* \ge 0$ for all $g^* \in g(t)$. Let $(x, x^*) \in \operatorname{gra} T$, where $x \ne 0$, so that $x^* = g_x^* \frac{x}{\|x\|}$ for some $g_x^* \in g(\|x\|)$, and so
\[ \langle x - 0, x^* - 0 \rangle = \left\langle x, g_x^* \frac{x}{\|x\|} \right\rangle = g_x^* \|x\| \ge 0. \]
(3.8.2)

Now, let $(x, x^*), (y, y^*) \in \operatorname{gra} T$ such that $x \ne 0$, $y \ne 0$, and $x \ne y$, and let
\[ \varphi = \frac{\langle x, y \rangle}{\|x\| \|y\|}. \]
By the Cauchy-Schwarz inequality, $|\varphi| \le 1$. Then, for some $g_x^* \in g(\|x\|)$ and some $g_y^* \in g(\|y\|)$,
\[ \begin{aligned} \langle x - y, x^* - y^* \rangle &= \left\langle x - y, g_x^* \frac{x}{\|x\|} - g_y^* \frac{y}{\|y\|} \right\rangle \\ &= g_x^* \|x\| + g_y^* \|y\| - \varphi g_x^* \|y\| - \varphi g_y^* \|x\| \\ &\ge g_x^* \|x\| + g_y^* \|y\| - g_x^* \|y\| - g_y^* \|x\| \\ &= (g_x^* - g_y^*)(\|x\| - \|y\|) \\ &\ge 0. \end{aligned} \tag{3.8.3} \]
Together with the fact that $\langle x - y, x^* - y^* \rangle = 0$ if $x = y$, (3.8.2) and (3.8.3) demonstrate that $T$ is monotone.

(ii) If $g$ is strictly monotone on $\mathbb{R}_+$, then the inequalities in both (3.8.2) and (3.8.3) are strict, and so $T$ is strictly monotone.

(iii) Let $(x, x^*), (y, y^*) \in \operatorname{gra} T$ be such that $\langle x - y, x^* - y^* \rangle = 0$. First, suppose that either $x = 0$ or $y = 0$, and suppose further without loss of generality that $y = 0$. Then either $x = 0$ or $x^* = 0$, and in both cases $x^* \in Ty = \{0\}$ and $0 \in Tx$. Suppose now that $x \ne 0 \ne y$. Choose $g_x^*$ and $g_y^*$ so that $(\|x\|, g_x^*), (\|y\|, g_y^*) \in \operatorname{gra} g$, $g_x^* \frac{x}{\|x\|} = x^*$, and $g_y^* \frac{y}{\|y\|} = y^*$. Since $\langle x - y, x^* - y^* \rangle = 0$, it follows from (3.8.3) that $\varphi = 1$, and so $\langle x, y \rangle = \|x\| \|y\|$, and that $(g_x^* - g_y^*)(\|x\| - \|y\|) = 0$. Therefore, $\frac{x}{\|x\|} = \frac{y}{\|y\|}$ and either $g_x^* = g_y^*$ or $\|x\| = \|y\|$, and so either $x^* = y^*$ or $Tx = Ty$ respectively. Therefore, $T$ is paramonotone.

(iv) By the same reasoning as in (3.8.3), for every $(x, x^*), (y, y^*) \in \operatorname{gra} T$ such that $x \ne 0$ and $y \ne 0$, there exists a $g_x^* \in g(\|x\|)$ such that
\[ \langle x - y, x^* \rangle \ge (\|x\| - \|y\|) g_x^*, \tag{3.8.4} \]
which also holds in the cases where $x = 0$, $y = 0$, or both $x = y = 0$. Now, choose an arbitrary $n \in \mathbb{N}$, where $n > 1$. Consider $n$ points $(a_i, a_i^*) \in \operatorname{gra} T$, $i \in \{1, 2, \ldots, n\}$, and define $a_{n+1} := a_1$. Then, there exists by (3.8.4) a set of points $g_i^*$, where $g_i^* \in g(\|a_i\|)$ for all $i \in \{1, 2, \ldots, n\}$, such that
\[ \sum_{i=1}^{n} \langle a_i - a_{i+1}, a_i^* \rangle \ge \sum_{i=1}^{n} (\|a_i\| - \|a_{i+1}\|) g_i^*. \tag{3.8.5} \]
Since $g : \mathbb{R} \to 2^{\mathbb{R}}$ is monotone, $g$ is cyclical monotone, and so the quantity in (3.8.5) is nonnegative.
(v) Since 3-cyclic monotone operators are $3^*$-monotone (Fact 3.1.2), $T$ is $3^*$-monotone by (iv).

(vi) Suppose $g$ is maximal monotone, and so by Minty's Theorem (Theorem 3.2.2), $(\operatorname{Id} + g)(\mathbb{R}) = \mathbb{R}$. Since $(\operatorname{Id} + g)(\{t \in \mathbb{R} : t < 0\}) = \{t \in \mathbb{R} : t < 0\}$, $(\operatorname{Id} + g)(\mathbb{R}_+) = \mathbb{R}_+$. Let $y \in X$ such that $y \ne 0$. Then, there is an $a > 0$ and an $a^* \in g(a)$ such that $a + a^* = \|y\|$. Let $x = \frac{a y}{\|y\|}$ so that
\[ x + Tx \ni a \frac{y}{\|y\|} + a^* \frac{a y}{\|a y\|} = y. \]
Hence, since $0 + T(0) = 0$, we have that the range of $(\operatorname{Id} + T)$ is the space $X$ and so $T$ is maximal monotone by another application of Minty's Theorem (Theorem 3.2.2).

Note that if $\operatorname{dom} g = \mathbb{R}$ in Proposition 3.8.1, then $\operatorname{dom} T = X$. The strict and maximal monotonicity of $g$ are also necessary conditions for $T$ to inherit these properties, as shown below.

Proposition 3.8.2 Given an operator $g : \mathbb{R} \to 2^{\mathbb{R}}$ such that $g(t) = \{0\}$ for all $t \le 0$, define $T : X \to 2^X$ as in (3.8.1). Then,
(i) $T$ is monotone only if $g$ is monotone,
(ii) $T$ is strictly monotone only if $g$ is strictly monotone on $\mathbb{R}_+$,
(iii) $T$ is maximal monotone only if $g$ is maximal monotone.

Proof. (i)(ii) If $g$ is not monotone [or monotone but not strictly monotone], there exist $(a_1, a_1^*), (a_2, a_2^*) \in \operatorname{gra} g$ such that $a_1 \ne a_2$ and $(a_1 - a_2)(a_1^* - a_2^*) < 0$ [or $= 0$]. Choose an $x \in X$ such that $\|x\| = a_1 > 0$. Then, $\left(x, a_1^* \frac{x}{\|x\|}\right), \left(\frac{a_2}{a_1} x, a_2^* \frac{x}{\|x\|}\right) \in \operatorname{gra} T$, and so
\[ \left\langle x - \frac{a_2}{a_1} x, \; a_1^* \frac{x}{\|x\|} - a_2^* \frac{x}{\|x\|} \right\rangle = (a_1 - a_2)(a_1^* - a_2^*) < 0 \ [\text{or} = 0]. \]
Hence, $T$ is not monotone [or not strictly monotone].

(iii) Suppose that $g$ is monotone but not maximal monotone, and so has some monotone extension $\hat{g}$, where $\operatorname{gra} g \subsetneq \operatorname{gra} \hat{g}$. Then, by Proposition 3.8.1 (i), the map $\hat{T} : X \to 2^X$, where
\[ \hat{T} : x \mapsto \begin{cases} \{0\}, & x = 0, \\ \left\{ g^* \dfrac{x}{\|x\|} : g^* \in \hat{g}(\|x\|) \right\}, & \|x\| > 0, \end{cases} \]
is monotone. Since $\operatorname{gra} T \subsetneq \operatorname{gra} \hat{T}$, the operator $T$ is not maximal monotone.
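The construction (3.8.1) can be explored numerically. The sketch below is an illustration under stated assumptions, not the method of the thesis: it fixes the hypothetical single-valued choice $g(t) = t^2$ for $t > 0$ (and $g(t) = 0$ otherwise), works in $\mathbb{R}^2$, and samples random cycles to look for a violation of the cyclic sums appearing in (3.8.5). Floating-point rounding is absorbed by a small tolerance.

```python
import math
import random

def g(t):
    # Hypothetical choice of g: single-valued, monotone, and g(t) = 0 for t <= 0.
    return t * t if t > 0 else 0.0

def T(x):
    """Single-valued instance of (3.8.1) in R^2: T(0) = 0, T(x) = g(|x|) x / |x| otherwise."""
    norm = math.hypot(x[0], x[1])
    if norm == 0:
        return (0.0, 0.0)
    scale = g(norm) / norm
    return (scale * x[0], scale * x[1])

def cyclic_sum(points):
    """Sum of <a_i - a_{i+1}, T(a_i)> over the cycle, with a_{n+1} := a_1."""
    total = 0.0
    n = len(points)
    for i in range(n):
        a, b = points[i], points[(i + 1) % n]
        ta = T(a)
        total += (a[0] - b[0]) * ta[0] + (a[1] - b[1]) * ta[1]
    return total

rng = random.Random(0)  # fixed seed so the experiment is repeatable

def random_cycle():
    length = rng.randint(2, 6)
    return [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(length)]

# Smallest cyclic sum seen over many random cycles; cyclical monotonicity
# predicts that it is nonnegative up to rounding error.
worst = min(cyclic_sum(random_cycle()) for _ in range(2000))
print("smallest cyclic sum observed:", worst)
```

A sampled search of this kind can only fail to find a counterexample; it does not replace the proof of Proposition 3.8.1 (iv).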
Chapter 4

Monotone Classes of Linear Relations and Operators

In this chapter, we examine certain properties of linear relations, especially the ways in which they are multi-valued (Proposition 4.0.13), how $3^*$-monotonicity is related to the notion of angle-boundedness (Section 4.1), and the relationship between $3^*$-monotonicity and paramonotonicity for linear relations. Towards the end of this chapter, we examine further relationships that exist between monotone classes on $\mathbb{R}^n$ and $\mathbb{R}^2$. Using the nomenclature of R. Cross [33], we define linear relations, which are set-valued generalizations of linear operators.

Definition 4.0.3 (linear relation) An operator $A : X \to 2^X$ is a linear relation if $\operatorname{dom} A$ is a linear subspace of $X$ and for all $x, y \in \operatorname{dom} A$, $\lambda \in \mathbb{R}$,
(i) $\lambda Ax \subset A(\lambda x)$,
(ii) $Ax + Ay \subset A(x + y)$.

Fact 4.0.4 An operator $A : X \to 2^X$ is a linear relation if and only if its graph is a linear subspace of $X \times X$, that is,
(i) if $(x, x^*) \in \operatorname{gra} A$, then $\lambda (x, x^*) \in \operatorname{gra} A$ for all $\lambda \in \mathbb{R}$, and
(ii) if $(x, x^*), (y, y^*) \in \operatorname{gra} A$, then $(x + y, x^* + y^*) \in \operatorname{gra} A$.

The following results on linear relations are well known. Of note, results (i) and (ii) are considered basic results and will not be cited in the work below.

Fact 4.0.5 [76] For any linear relation $A : X \to 2^X$,
(i) $\lambda Ax = A(\lambda x)$ for all $x \in \operatorname{dom} A$, $0 \ne \lambda \in \mathbb{R}$,
(ii) $Ax + Ay = A(x + y)$ for all $x, y \in \operatorname{dom} A$,
(iii) $A0$ is a linear subspace of $X$,
(iv) $Ax = x^* + A0$ for all $(x, x^*) \in \operatorname{gra} A$,
(v) If $A$ is single-valued at any point, it is single-valued at every point in its domain.

Proof. [76] Consider arbitrary $x, y \in \operatorname{dom} A$.
(i) Since $\lambda Ax \subset A(\lambda x)$ for all $0 \ne \lambda \in \mathbb{R}$, $\frac{1}{\lambda} A(\lambda x) \subset Ax$, and so $\lambda Ax = A(\lambda x)$.
(ii) By (i) we have $A(-y) = -Ay$, and since $A(x + y) + A(-y) \subset Ax$, $A(x + y) \subset Ax + Ay$.
(iii) For all $z_1^*, z_2^* \in A0$, $z_1^* + z_2^* \in A0$ by (ii). Since $\operatorname{gra} A$ is a subspace of $X \times X$, $(0, 0) \in \operatorname{gra} A$ and so $0 \in A0$, therefore in combination with (i), for all $z_1^* \in A0$ and for all $\lambda \in \mathbb{R}$, $\lambda z_1^* \in A0$.
(iv) Since $Ax + A0 = Ax$, by (iii) for all $x^* \in Ax$ and all $z^* \in A0$, $x^* + z^* \in Ax$.
(v) A direct result of (iv).

Fact 4.0.6 For any set $C \subset X$, the set $C^\perp$ is closed in $X$ as by definition it is the intersection of the closed hyperplanes $h_c := \{x \in X : \langle x, c \rangle = 0\}$ where $c \in C$.

Proposition 4.0.7 Suppose $A : X \to 2^X$ is a linear relation, and let $x \in \operatorname{dom} A$. Then, $P_{A0^\perp} Ax$ is a singleton and
\[ Ax \subset P_{A0^\perp} Ax + \overline{A0}. \tag{4.0.1} \]
If $A0$ is closed, then there is a unique $x_0^* \in Ax$ such that $x_0^* \in A0^\perp$, where $x_0^* = P_{A0^\perp} x^*$ for all $x^* \in Ax$.

Proof. Let $x \in \operatorname{dom} A$. Since $\overline{A0}$ and $A0^\perp$ are closed subspaces such that $\overline{A0} + A0^\perp = X$, then for all $x^* \in X$, $x^* = P_{\overline{A0}} x^* + P_{A0^\perp} x^*$. By Fact 4.0.5 (iv), (4.0.1) holds and $P_{A0^\perp} Ax$ is a singleton. If $A0$ is closed, then for all $x^* \in Ax$, $Ax = x^* + A0 = P_{A0^\perp} x^* + A0$. Therefore, $P_{A0^\perp} y^* = P_{A0^\perp} x^*$ for all $y^* \in Ax$. Furthermore, since $0 \in A0$ always, $P_{A0^\perp} x^* \in Ax$.

Proposition 4.0.8 Any monotone linear relation $A : X \to 2^X$ with full domain is maximal monotone and single-valued.

Proof. Suppose that $A : X \to 2^X$ is a linear relation where $\operatorname{dom} A = X$. Let $(z, z^*)$ be a point such that $\langle z - y, z^* - y^* \rangle \ge 0$ for all $(y, y^*) \in \operatorname{gra} A$. Choose an arbitrary $z_0^* \in Az$. Let $y = z - \varepsilon x$ for arbitrary $(x, x^*) \in \operatorname{gra} A$ and $\varepsilon > 0$, so that by linearity $-\varepsilon x^* \in A(-\varepsilon x)$. Therefore $z_0^* - \varepsilon x^* \in Ay$ and so $\langle \varepsilon x, z^* - z_0^* + \varepsilon x^* \rangle \ge 0$. Divide out the $\varepsilon$, and send $\varepsilon \to 0^+$ so that $\langle x, z^* - z_0^* \rangle \ge 0$ for all $x \in X$. Hence $z^* = z_0^*$ and $A$ is single-valued and maximal monotone.

The following two results appear respectively as Proposition 2.2(i) and Proposition 2.4 in [9].

Proposition 4.0.9 [9] If $A : X \to 2^X$ is a monotone linear relation, then $\operatorname{dom} A \subset (A0)^\perp$ and $A0 \subset (\operatorname{dom} A)^\perp$.

Proof. Consider an arbitrary $(x, x^*) \in \operatorname{gra} A$. If for some $a_1 \in A0$, $\langle x, a_1 \rangle = k \ne 0$, let $a_2 = \frac{1}{k} (\langle x, x^* \rangle + 1) a_1$ so that $a_2 \in A0$ since $A0$ is a subspace. Now, since $(0, a_2) \in \operatorname{gra} A$ and $A$ is monotone,
\[ -1 = \langle x, x^* \rangle - (\langle x, x^* \rangle + 1) = \langle x - 0, x^* - a_2 \rangle \ge 0, \]
which is a contradiction.
Therefore, $A0 \perp \{x\}$ for all $x \in \operatorname{dom} A$, and so $\operatorname{dom} A \subset (A0)^\perp$ and $A0 \subset (\operatorname{dom} A)^\perp$.

Corollary 4.0.10 [9] If a linear relation $A : X \to 2^X$ is maximal monotone, then $(\operatorname{dom} A)^\perp = A0$, and so $\overline{\operatorname{dom} A} = (A0)^\perp$ and $A0$ is a closed subspace.

Proof. Choose an arbitrary $y^* \in (\operatorname{dom} A)^\perp$. For all $(x, x^*) \in \operatorname{gra} A$, $\langle x - 0, x^* - y^* \rangle = \langle x, x^* \rangle \ge 0$ since $A$ is monotone and $(0, 0) \in \operatorname{gra} A$ by linearity. Since $A$ is maximal monotone, $(0, y^*) \in \operatorname{gra} A$, and so $A0 \supset (\operatorname{dom} A)^\perp$. Finally, by Proposition 4.0.9, $A0 = (\operatorname{dom} A)^\perp$.

This leads to a partial converse result to Proposition 4.0.8, an application of Corollary 3.4.11.

Corollary 4.0.11 If a maximal monotone single-valued linear relation $A : X \to X$ is locally bounded, then it has full domain.

Proof. Since $A$ is single-valued, $A0 = 0$, and so by Corollary 4.0.10, $\overline{\operatorname{dom} A} = (A0)^\perp = X$. Choose any point $x \in X$. Since $\operatorname{dom} A$ is dense in $X$, there exists a sequence $(y_n, y_n^*)_{n \in \mathbb{N}} \subset \operatorname{gra} A$ such that $y_n \to x$. Since $A$ is locally bounded, a subsequence $(y^*_{\varphi(n)})_{n \in \mathbb{N}}$ of $(y_n^*)_{n \in \mathbb{N}}$ weakly converges to some point $x^* \in X$. Therefore, for all $z \in \operatorname{dom} A$,
\[ 0 \le \lim_{n \to +\infty} \langle y_{\varphi(n)} - z, y^*_{\varphi(n)} - Az \rangle = \langle x - z, x^* - Az \rangle. \]
Since $A$ is maximal monotone, $(x, x^*) \in \operatorname{gra} A$, and so $A$ has full domain.

The following fact appears in Proposition 2.2 in [9].

Fact 4.0.12 [9] Let $A : X \to 2^X$ be a monotone linear relation. For any $x, y \in \operatorname{dom} A$, the set $\{\langle y, x^* \rangle : x^* \in Ax\}$ is a singleton, the value of which can be denoted simply by $\langle y, Ax \rangle$.

Proof. Let $x, y \in \operatorname{dom} A$ and suppose that $x_1^*, x_2^* \in Ax$. By Fact 4.0.5 (iv), $x_2^* - x_1^* \in A0$. Now, by Proposition 4.0.9, $A0 \subset (\operatorname{dom} A)^\perp$, and so $x_2^* - x_1^* \in (\operatorname{dom} A)^\perp$. Since $y \in \operatorname{dom} A$, $\langle y, x_1^* \rangle = \langle y, x_2^* \rangle$.

Proposition 4.0.13 below demonstrates that multi-valued linear relations are closely related to a number of single-valued linear relations. Note especially that $V = A0^\perp$ and $V = \overline{\operatorname{dom} A}$ both satisfy the conditions below.

Proposition 4.0.13 (dimension reduction) Suppose that $A : X \to 2^X$ is a monotone linear relation.
Let $V \subset X$ satisfy
(i) $V$ is a closed subspace of $X$,
(ii) $\operatorname{dom} A \subset V$, and
(iii) $A0 \subset V^\perp$.
Define the operator $\tilde{A} : V \to 2^V$ by $\tilde{A}x := P_V Ax$ on $\operatorname{dom} A$, and let $\tilde{A}x = \emptyset$ when $x \notin \operatorname{dom} A$. Then, $\tilde{A}$ is a single-valued monotone linear relation and $\operatorname{dom} \tilde{A} = \operatorname{dom} A$. In the case where $V = A0^\perp$ and $A0$ is closed, the operator $\tilde{A}$ is a single-valued selection of $A$. If $A$ is maximal monotone, then $V = A0^\perp = \overline{\operatorname{dom} A}$ is the only subspace satisfying conditions (i)–(iii) above, and $\tilde{A}$ is a maximal monotone single-valued selection of $A$.

Proof. For any $x \in X$, $P_V(x) = P_V(P_{A0^\perp} x + P_{\overline{A0}} x) = P_V(P_{A0^\perp} x)$ as $A0 \subset V^\perp$. By Proposition 4.0.7, $\tilde{A}$ is always single-valued, and if $A0$ is closed, $P_{A0^\perp} x^* \in Ax$ for each $(x, x^*) \in \operatorname{gra} A$, and so if $V = A0^\perp$, then $\tilde{A}$ is a selection of $A$.

Consider now arbitrary $(y, \tilde{y}^*), (z, \tilde{z}^*) \in \operatorname{gra} \tilde{A}$, and $\lambda \in \mathbb{R}$. Then, for $y^* \in Ay$ and $z^* \in Az$, we have that $P_V y^* = \tilde{y}^*$ and $P_V z^* = \tilde{z}^*$. Since $A$ is a linear relation, $(y + \lambda z, y^* + \lambda z^*) \in \operatorname{gra} A$. Therefore, $(y + \lambda z, P_V(y^* + \lambda z^*)) \in \operatorname{gra} \tilde{A}$, and since $P_V$ is itself a linear operator, $P_V(y^* + \lambda z^*) = \tilde{y}^* + \lambda \tilde{z}^*$, it follows that $\tilde{y}^* + \lambda \tilde{z}^* \in \tilde{A}(y + \lambda z)$. Since $\operatorname{dom} A = \operatorname{dom} \tilde{A}$, the operator $\tilde{A}$ is a linear relation.

Finally, suppose that $A$ is maximal monotone, and so from Corollary 4.0.10 we have that $A0^\perp = \overline{\operatorname{dom} A}$ and $A0$ is closed. The only subspace $V$ satisfying the conditions in this case is $V = A0^\perp$. Suppose there exists a point $(x, x^*)$, where $x, x^* \in V = A0^\perp$, that is monotonically related to $\operatorname{gra} \tilde{A}$. For all $(z, z^*) \in \operatorname{gra} A$, there is a $y \in A0$ such that $y + P_V z^* = z^*$. Then, by Fact 4.0.5 (iv),
\[ \langle x - z, x^* - z^* \rangle = \langle x - z, x^* - y - P_V z^* \rangle = \langle x - z, x^* - P_V z^* \rangle \ge 0. \]
Therefore, $(x, x^*)$ is also monotonically related to $A$, and since $A$ is maximal monotone, $(x, x^*) \in \operatorname{gra} A$. Since $x^* \in V$, $P_V x^* = x^*$ and so $(x, x^*) \in \operatorname{gra} \tilde{A}$. Therefore, $\tilde{A}$ is maximal monotone.
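A small finite-dimensional instance may help make Proposition 4.0.13 and Fact 4.0.12 concrete. The Python sketch below is illustrative only: the relation on $\mathbb{R}^3$ with $\operatorname{dom} A = \operatorname{span}\{e_1, e_2\}$, $A0 = \operatorname{span}\{e_3\}$, and the matrix `M` are assumptions chosen for the example, with $V = \operatorname{span}\{e_1, e_2\}$ satisfying conditions (i) to (iii). Integer arithmetic keeps every comparison exact.

```python
# Hypothetical linear relation on R^3 (an assumed example):
#   dom A = span{e1, e2},  A0 = span{e3},
#   A(x1, x2, 0) = { (M (x1, x2), s) : s real }.
M = ((1, -1), (1, 1))  # <x, Mx> = x1^2 + x2^2 >= 0, so A is monotone

def element_of_A(x, s):
    """One element of Ax for x = (x1, x2, 0) in dom A, selected by s in A0 = span{e3}."""
    x1, x2, _ = x
    return (M[0][0] * x1 + M[0][1] * x2, M[1][0] * x1 + M[1][1] * x2, s)

def project_V(v):
    """Orthogonal projection onto V = span{e1, e2}."""
    return (v[0], v[1], 0)

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

x = (2, 1, 0)
y = (-1, 3, 0)

# Three different elements of the set Ax.
images = [element_of_A(x, s) for s in (-5, 0, 7)]

# P_V Ax collapses to a single point: this is the reduced operator at x.
reduced = {project_V(v) for v in images}

# The pairing <y, x*> does not depend on which x* in Ax is used (Fact 4.0.12).
pairings = {inner(y, v) for v in images}

print("P_V Ax =", reduced)
print("<y, Ax> =", pairings)
```

Because $A0 \subset V^\perp$, projecting onto $V$ removes exactly the multi-valued part, which is why both sets printed above contain a single element.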
From the results in this section so far, we know that monotone linear relations $A : X \to 2^X$ can only be multi-valued such that $A0$ is a subspace of $X$, $Ax = x^* + A0$ for any $x^* \in Ax$, and $A0 \subset (\operatorname{dom} A)^\perp$. For the purposes of calculation by the inner product, for any $x, z \in \operatorname{dom} A$,
\[ \langle x, \tilde{A}z \rangle = \langle x, Az \rangle, \tag{4.0.2} \]
where $\tilde{A}$ is the single-valued operator (a selection of $A$ if $A0$ is closed) as calculated in Proposition 4.0.13 for $V = A0^\perp$. In the other direction, any single-valued monotone linear relation $\tilde{A} : X \to 2^X$ can be extended to a multi-valued monotone linear relation $A : X \to 2^X$ by choosing any subspace $V \subset (\operatorname{dom} \tilde{A})^\perp$ and setting $Ax := \tilde{A}x + V$.

Now, in the unbounded linear case, maximal monotone operators may not have a closed domain. The concept of a halo captures this aspect well.

Definition 4.0.14 (halo) The halo of a monotone linear relation $A : X \to 2^X$ is the set
\[ \operatorname{halo} A := \{x \in X : (\exists M)(\forall (y, y^*) \in \operatorname{gra} A) \ \langle x - y, y^* \rangle \le M \|x - y\| \}. \tag{4.0.3} \]

Fact 4.0.15 [9] If $A : X \to 2^X$ is a monotone linear relation, then $\operatorname{dom} A \subset \operatorname{halo} A \subset (A0)^\perp$.

Fact 4.0.16 [9] If $A : X \to 2^X$ is a monotone linear relation, then $A$ is maximal monotone if and only if $A0^\perp = \overline{\operatorname{dom} A}$ and $\operatorname{halo} A = \operatorname{dom} A$.

Now, if the domain of a linear relation is not closed, we have the following curious result. Below, $A^m$ denotes the iterated operator composition, where for instance $A^3 x = A(A(Ax))$. Note that if $\operatorname{dom} A$ is dense in $X$, the operator $P_V A$ is the same as $A$.

Proposition 4.0.17 Suppose a maximal monotone linear relation $A : X \to 2^X$ is such that $\operatorname{dom} A$ is not closed, and let $V := \overline{\operatorname{dom} A}$. Then, there is a sequence $(z_n)_{n \in \mathbb{N}} \subset \operatorname{dom} A$ such that
\[ (P_V A)^m (z_n) \in \operatorname{dom} A, \quad \forall\, 1 \le m < n, \tag{4.0.4} \]
\[ (P_V A)^n (z_n) \notin \operatorname{dom} A, \tag{4.0.5} \]
where for all $z \in \operatorname{dom} A$, $P_V Az$ is a singleton set.

Proof. Since $A$ is maximal monotone, $\operatorname{dom} A = \operatorname{halo} A \subsetneq \overline{\operatorname{dom} A}$, and by Corollary 4.0.10, $V = A0^\perp$. Therefore, by Proposition 4.0.7, $P_V Az \subset Az$ and is a singleton for every $z \in \operatorname{dom} A$. Choose any point $z_0 \in V$ such that $z_0 \notin \operatorname{dom} A$.
We shall generate the sequence $(z_n)_{n \in \mathbb{N}} \subset \operatorname{dom} A$ iteratively as follows. For some $n \ge 0$, suppose that $z_n \in V$. By Minty's theorem [55], since $A$ is maximal monotone, $\operatorname{ran}(\operatorname{Id} + A) = X$. Therefore, there exists a $z_{n+1} \in \operatorname{dom} A$ such that $z_n \in z_{n+1} + Az_{n+1}$. Since $z_n, z_{n+1} \in V$, $z_n \in z_{n+1} + P_V Az_{n+1}$, and so as $P_V Az_{n+1}$ is a singleton,
\[ P_V Az_{n+1} = \{z_n - z_{n+1}\}. \]
Now, since both $P_V$ and $A$ are linear operators, if $n \ge 2$,
\[ \begin{aligned} (P_V A)^2 z_{n+1} &= P_V A(z_n - z_{n+1}) \\ &= P_V Az_n - P_V Az_{n+1} \\ &= \{z_{n-1} - 2z_n + z_{n+1}\}, \end{aligned} \tag{4.0.6} \]
a linear combination of the terms $z_{n-1}$, $z_n$, and $z_{n+1}$, with $z_{n-1}$ appearing with coefficient 1. Similarly, if $n \ge 3$,
\[ \begin{aligned} (P_V A)^3 z_{n+1} &= P_V A(z_{n-1} - 2z_n + z_{n+1}) \\ &= \{z_{n-2} - z_{n-1} - 2z_{n-1} + 2z_n + z_n - z_{n+1}\} \\ &= \{z_{n-2} - 3z_{n-1} + 3z_n - z_{n+1}\}. \end{aligned} \tag{4.0.7} \]
By iterative composition, $(P_V A)^m z_{n+1}$ is a linear combination of the terms $z_p$ for $n - m + 1 \le p \le n + 1$, with $z_{n-m+1}$ appearing with coefficient 1, as long as $n - m + 1 \ge 0$. Since $\operatorname{dom} A$ is a linear subspace of $X$, $(P_V A)^m z_{n+1} \subset \operatorname{dom} A$ if $n \ge m$. However, if $n + 1 = m$, the single point in $(P_V A)^m z_{n+1}$ is not in $\operatorname{dom} A$ since $z_0 \notin \operatorname{dom} A$.

For any linear relation $A : X \to 2^X$ where $\operatorname{dom} A$ is not closed, sequences like those in Proposition 4.0.17 are plentiful. Every point $x \in \overline{\operatorname{dom} A}$ such that $x \notin \operatorname{dom} A$, including for instance the points $\lambda x$ for $\lambda > 0$, generates a different sequence $(z_n)_{n \in \mathbb{N}}$ using the method from the proof of Proposition 4.0.17. The above result is tight in that the sequences in (4.0.5) cannot exist if the domain of $A$ is closed, as $V = \overline{\operatorname{dom} A} = \operatorname{dom} A$ in this circumstance, and so for every $x \in \operatorname{dom} A$, $P_V Ax \in \operatorname{dom} A$. To explore these concepts, consider the following example.

Example 4.0.18 Consider the infinite dimensional Hilbert space $\ell^2$, the space of infinite sequences $x = (x_k)_{k \in \mathbb{N}}$ such that $\sum_{k=1}^{+\infty} x_k^2 < +\infty$. Let $e_k$ denote the $k$th standard unit vector (the $k$th element in the sequence is 1, and all other elements in the sequence are 0).
Define the single-valued monotone relation $A : \ell^2 \to \ell^2$ for $x \in \operatorname{dom} A$ by
\[ Ax = A\left( \sum_{k=1}^{+\infty} x_k e_k \right) := \sum_{k=1}^{+\infty} k x_k e_k, \]
where
\[ \operatorname{dom} A := \{x \in \ell^2 : \exists N \in \mathbb{N} \text{ s.t. } x_k = 0 \ \forall k \ge N\}. \]

Considering the linear relation $A$ in the example above, the point $x := \sum_{k=1}^{+\infty} \frac{1}{k} e_k$ is not in $\operatorname{halo} A$. This is because the sequence $(y_n)_{n \in \mathbb{N}} \subset \operatorname{dom} A$ where $y_n := \sum_{i=1}^{n} \frac{1}{2i} e_i$ eventually violates (4.0.3) for any choice of $M > 0$ for a large enough $n$. (Therefore we know that $A$ is not maximal monotone.) However, the point $z := \sum_{i=1}^{+\infty} \frac{1}{i^2} e_i$ is in $\operatorname{halo} A$, and $\operatorname{gra} A$ could be extended by the point $(z, x)$ and remain monotone. Since $x \in \overline{\operatorname{dom} A}$ but $x \notin \operatorname{halo} A$, yet $x = Az$ and $z \in \operatorname{halo} A$, we have the beginning of a sequence like those in Proposition 4.0.17 for any monotone extension of $A$ containing $(z, x)$ that is also a linear relation.

Note that since $A$ is such that $\operatorname{dom} A \subsetneq \operatorname{halo} A \subsetneq \overline{\operatorname{dom} A}$, any linear operator $\tilde{A}$ with domain $\overline{\operatorname{dom} A}$ such that $\tilde{A}x = Ax$ for all $x \in \operatorname{dom} A$ (the existence of which is guaranteed by Zorn's lemma) is not monotone. This is because $\operatorname{gra}(\tilde{A}) \supset \operatorname{gra} A$, and so if $\tilde{A}$ is monotone, then
\[ \operatorname{halo} \tilde{A} \subset \operatorname{halo} A \subsetneq \overline{\operatorname{dom} A}, \]
which would mean that $\operatorname{halo} \tilde{A} \subsetneq \operatorname{dom} \tilde{A}$, a contradiction of Fact 4.0.15.

Proposition 4.0.19 Let $A : X \to 2^X$ be a monotone linear relation. Then there exists a maximal monotone linear relation $B : X \to 2^X$ such that $\operatorname{gra} A \subset \operatorname{gra} B$.

Proof. Let $V = \overline{\operatorname{dom} A}$, and define $\tilde{A} : V \to 2^V$ as in Proposition 4.0.13 ($\tilde{A}x := P_V Ax$ on $\operatorname{dom} A$). Therefore, $\tilde{A}$ is a single-valued monotone linear relation such that $\operatorname{dom} \tilde{A}$ is dense in $V$, and is therefore a monotone linear operator that is only densely defined. By Proposition 3.2(f) in [61] (by an application of Zorn's lemma), there is a maximal monotone linear extension $\tilde{B}$ of $\tilde{A}$. Since $\tilde{B}$ is a single-valued linear relation, $\tilde{B}(0) = \{0\}$ and $\operatorname{halo} \tilde{B} = \operatorname{dom} \tilde{B}$ by Fact 4.0.16. Let $B : X \to 2^X$ be defined by $Bx = \tilde{B}x + (\operatorname{dom} A)^\perp$, where $\operatorname{dom} B = \operatorname{dom} \tilde{B}$. Then, since $(\operatorname{dom} A)^\perp = (\operatorname{dom} B)^\perp$, $B$ is a maximal monotone linear relation by Fact 4.0.16. Finally, since for every $(x, x^*) \in \operatorname{gra} A$, $x^* \in P_V x^* + V^\perp = P_V x^* + (\operatorname{dom} B)^\perp = Bx$, $B$ is a monotone extension of $A$.

The following result is used later and appears in Proposition 4.6 in [11]. A linear relation is symmetric if $A = \frac{A + A^*}{2}$, where $A^*$ is the adjoint of $A$.

Proposition 4.0.20 [11] Suppose that $A : X \to 2^X$ is a linear relation. Then $A$ is maximal monotone and symmetric if and only if there exists a proper lower semicontinuous convex function $f : X \to \mathbb{R} \cup \{+\infty\}$ such that $A = \partial f$.

Note that the functions $q_A$ and $q$ that appear in the propositions below are similar to the function $M_T$ defined in Chapter 6, a variant of Fitzpatrick's last function, applied to linear operators. See for instance Proposition 6.2.9, where $q_A(x) = M_A(0, x)$. The following result is in the style of the results from [6].

Proposition 4.0.21 Every symmetric continuous monotone linear operator $A : X \to X$ must be both maximal monotone and maximal cyclical monotone.

Proof. Define $q_A$ to be the operator
\[ q_A : X \to \mathbb{R} : x \mapsto \frac{1}{2} \langle x, Ax \rangle. \]
Now, $A$ is monotone $\Leftrightarrow$ $\langle x, Ax \rangle \ge 0$ for all $x \in X$, so it follows that $q_A$ is convex $\Leftrightarrow$ $A$ is monotone, since quadratic functions that are nonnegative are convex. $A$ can be decomposed into a symmetric part $A_+ = \frac{1}{2}(A + A^*)$ and a skew part $A_\sim = \frac{1}{2}(A - A^*)$ so that $A = A_+ + A_\sim$. Now, $\langle x, A_\sim x \rangle = 0$ for all $x$ and for all $A$, so $q_A = q_{A_+}$ and hence $\nabla q_{A_+} = A_+ = \nabla q_A$. If $A$ is a symmetric continuous monotone operator then $A = A_+ = \nabla q_A$ is the gradient of a proper continuous convex function, and so by Corollary 3.2.9, $A$ is maximal cyclical monotone and maximal monotone.

Proposition 4.0.22 Every symmetric single-valued monotone linear relation $A : X \to 2^X$ with dense domain is the subgraph of a subgradient of a proper, convex function $q : X \to \mathbb{R} \cup \{+\infty\}$ that is lower semicontinuous on its domain, and so $A$ must be cyclical monotone.

Proof.
Define $q$ to be the operator
\[ q : X \to \mathbb{R} \cup \{+\infty\} : x \mapsto \begin{cases} \frac{1}{2} \langle x, Ax \rangle, & x \in \operatorname{dom} A, \\ +\infty, & x \notin \operatorname{dom} A. \end{cases} \]
Note that $q$ is proper. Now, for $\lambda \in \,]0, 1[$ and $x, y \in \operatorname{dom} A$,
\[ \lambda q(x) + (1 - \lambda) q(y) - q(\lambda x + (1 - \lambda) y) = \frac{1}{2} \lambda (1 - \lambda) \langle x - y, Ax - Ay \rangle \ge 0, \tag{4.0.8} \]
and so $q$ is convex (a known result [61], [9]). Now, since $\operatorname{dom} A$ is dense, and $A$ is symmetric, then by Proposition 5.3 in [61], $\operatorname{dom} \partial q = \operatorname{dom} A$ and $\partial q(x) = Ax$ for all $x \in \operatorname{dom} \partial q$. It is a property of convex functions that $q$ is lower semicontinuous at each $x \in \operatorname{dom} \partial q$ by Fact 3.2.12.

If not already lower semicontinuous, the function $q$ can be extended to a proper, lower semicontinuous convex function $\hat{q}$ such that $q = \hat{q}$ on $\operatorname{dom}(\partial q)$ and $\operatorname{dom} q \subset \operatorname{dom} \hat{q}$. To demonstrate this fact, define $\hat{q} : X \to \mathbb{R} \cup \{+\infty\}$ by
\[ \hat{q}(x) := \min_{\substack{x_n \in \operatorname{dom} A \\ x_n \to x}} \ \liminf_{n \to +\infty} q(x_n), \tag{4.0.9} \]
which is well defined since $\operatorname{dom} A$ is dense in $X$. From the definition of $\hat{q}$ it follows that $\hat{q}$ is lower semicontinuous and that for all $x \in X$, $\hat{q}(x) \le q(x)$. For all $x \in \operatorname{dom} A$, since $q$ is lower semicontinuous on $\operatorname{dom} A$, $\hat{q}(x) \ge q(x)$, and so $\hat{q}(x) = q(x)$. Since $q$ is proper and, by the monotonicity of $A$, $q(x) \ge 0$ for all $x \in \operatorname{dom} A$, the function $\hat{q}$ is proper.

For every $x, y \in X$, there exist sequences $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ such that for all $n \in \mathbb{N}$, we have that $x_n, y_n \in \operatorname{dom} A$, $x_n \to x$ and $y_n \to y$. (In the case where $x \in \operatorname{dom} A$ or $y \in \operatorname{dom} A$, consider the trivial sequences where $x_n = x$ or $y_n = y$ for all $n$.) To demonstrate the convexity of $\hat{q}$, notice that for any $\lambda \in \,]0, 1[$, each $x_n$ and $y_n$ satisfies the inequality (4.0.8), and so in the limit this inequality is preserved and convexity is assured.

Finally, since $\operatorname{gra} A$ is a subset of the graph of a subgradient of a lower semicontinuous proper convex function $\hat{q}$, by Fact 3.2.7, $A$ is cyclical monotone.

Remark 4.0.23 Example 4.0.18 demonstrates that Proposition 4.0.22 cannot be extended to show that such monotone operators are maximal.
This is not a contradiction since $q$ in Proposition 4.0.22 is not necessarily lower semicontinuous everywhere. Both Proposition 4.0.21 and 4.0.22 above follow from Theorem 4.6 in [11].

Proposition 4.0.24 [11] Suppose that $A : X \to 2^X$ is a linear relation. Then $A$ is maximal monotone and symmetric if and only if there exists a proper lower semicontinuous convex function $f : X \to \mathbb{R} \cup \{+\infty\}$ such that $A = \partial f$.

4.1  $3^*$-monotone linear relations and angle-boundedness

In this section, angle boundedness is defined and then shown to be a sufficient condition for the $3^*$-monotonicity of linear relations. Throughout this section, we make heavy use of Fact 4.0.12 for a simpler, more direct algebra.

Definition 4.1.1 (angle bounded) Let $A : X \to 2^X$ be a monotone linear relation. $A$ is angle bounded if and only if there exists an $a \ge 0$ such that for all $x, y \in \operatorname{dom} A$,
\[ |\langle y, Ax \rangle - \langle x, Ay \rangle|^2 \le 4a^2 \langle x, Ax \rangle \langle y, Ay \rangle. \tag{4.1.1} \]

Remark 4.1.2 If $A$ is monotone and symmetric, then $A$ is angle bounded, since in this case $\langle y, Ax \rangle = \langle x, Ay \rangle$ for all $x, y \in \operatorname{dom} A$, and so (4.1.1) holds for $a = 0$. Note that the angle bounded condition is a condition on the size of the skew operator $A_\sim := \frac{1}{2}(A - A^*)$ in relation to the symmetric operator $A_+ := \frac{1}{2}(A + A^*)$, in that it is equivalent to the condition
\[ |\langle y, A_\sim x \rangle|^2 \le a^2 \langle x, A_+ x \rangle \langle y, A_+ y \rangle. \tag{4.1.2} \]

The following result is a consequence of [1], and is proved directly in [4].

Proposition 4.1.3 [4] Given a linear monotone operator $A : X \to X$, then for any $n \ge 3$, $A$ is $n$-cyclic monotone if and only if $A$ is angle bounded with $a = 2|\tan(\pi/n)|$.

Definition 4.1.4 ($3\sigma$-monotone) An operator $T : X \to 2^X$ is $3\sigma$-monotone if and only if there exists a $\sigma > 0$ such that for all $(x, x^*)$, $(y, y^*)$, and $(z, z^*) \in \operatorname{gra} T$,
\[ \langle z - y, y^* - x^* \rangle \le \sigma \langle x - z, x^* - z^* \rangle. \tag{4.1.3} \]

Fact 4.1.5 By definition, if an operator is $3\sigma$-monotone, it is also $3^*$-monotone. Also, $3\sigma$-monotonicity with $\sigma = 1$ is equivalent to 3-cyclic monotonicity (2.1.6).
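The equivalence of (4.1.1) and (4.1.2) in Remark 4.1.2 rests on two identities for a single-valued linear operator on $\mathbb{R}^n$: $\langle y, Ax \rangle - \langle x, Ay \rangle = 2 \langle y, A_\sim x \rangle$ and $\langle x, Ax \rangle = \langle x, A_+ x \rangle$. The sketch below checks them exactly on random integer matrices. The construction $A = B^\top B + (S - S^\top)$ is an assumption made here only to generate monotone test matrices, and all helper names are illustrative.

```python
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [inner(row, x) for row in A]

def matmul(A, B):
    Bt = transpose(B)
    return [[inner(row, col) for col in Bt] for row in A]

def add(A, B, sign=1):
    """Entrywise A + sign * B."""
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

rng = random.Random(1)  # fixed seed so the experiment is repeatable
n = 3

def rand_matrix():
    return [[rng.randint(-3, 3) for _ in range(n)] for _ in range(n)]

def rand_vector():
    return [rng.randint(-5, 5) for _ in range(n)]

identities_hold = True
for _ in range(200):
    B, S = rand_matrix(), rand_matrix()
    # Symmetric part B^T B is positive semidefinite, so A is monotone.
    A = add(matmul(transpose(B), B), add(S, transpose(S), sign=-1))
    twice_sym = add(A, transpose(A))             # equals 2 * A_+
    twice_skew = add(A, transpose(A), sign=-1)   # equals 2 * A_~
    x, y = rand_vector(), rand_vector()
    # <y, Ax> - <x, Ay> = <y, 2 A_~ x>
    lhs = inner(y, matvec(A, x)) - inner(x, matvec(A, y))
    identities_hold &= lhs == inner(y, matvec(twice_skew, x))
    # 2 <x, Ax> = <x, 2 A_+ x>, and monotonicity gives <x, Ax> >= 0
    identities_hold &= 2 * inner(x, matvec(A, x)) == inner(x, matvec(twice_sym, x))
    identities_hold &= inner(x, matvec(A, x)) >= 0

print("identities hold on all samples:", identities_hold)
```

Squaring the first identity and substituting the second into the right-hand side of (4.1.1) gives (4.1.2) after dividing by 4.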
Lemma 4.1.6 [82] For a monotone linear relation $A : X \to 2^X$, $3\sigma$-monotonicity is equivalent to the condition that for all $x, y \in \operatorname{dom} A$,
\[ \langle y, Ax \rangle^2 \le 4\sigma \langle x, Ax \rangle \langle y, Ay \rangle. \tag{4.1.4} \]

Proof. (Proposition 32.43 in [82]) Let $A : X \to 2^X$ be a $3\sigma$-monotone linear relation, so that
\[ \langle w - v, Av - Au \rangle \le \sigma \langle u - w, Au - Aw \rangle, \quad \forall u, v, w \in \operatorname{dom} A. \]
Let $x = u - w$ and $y = v - w$. Since $\operatorname{dom} A$ is a linear subspace of $X$, $x, y \in \operatorname{dom} A$. Since $A$ is linear the above is equivalent to
\[ \langle y, Ax - Ay \rangle \le \sigma \langle x, Ax \rangle, \quad \forall x, y \in \operatorname{dom} A. \tag{4.1.5} \]
Substituting $tx$ for $x$ in (4.1.5) yields
\[ \sigma \langle x, Ax \rangle t^2 - \langle y, Ax \rangle t + \langle y, Ay \rangle \ge 0, \quad \forall x, y \in \operatorname{dom} A, \ \forall t \in \mathbb{R}, \]
which has the format of the quadratic equation $at^2 + bt + c$. Given that $a$ and $c$ are nonnegative, which we have by the monotonicity of $A$, quadratic equations of this form are nonnegative for all $t$ if and only if $b^2 \le 4ac$. This last condition is equivalent to (4.1.4).

The following shows the relationship between $3\sigma$-monotone and angle-bounded operators. Indeed, the condition for $3\sigma$-monotone is introduced in [23] as the definition for angle-bounded ("angle-borné") operators.

Proposition 4.1.7 [82] For monotone linear relations on $X$, $3\sigma$-monotonicity is equivalent to angle-boundedness.

Proof. (Proposition 32.43 in [82]) Let $A : X \to 2^X$ be a monotone linear relation. Since $A$ is monotone, for all $x, y \in \operatorname{dom} A$ and for all $\lambda \in \mathbb{R}$, $\langle x - \lambda y, A(x - \lambda y) \rangle \ge 0$, and so
\[ 0 \le \langle x, Ax \rangle - \lambda \langle x, Ay \rangle - \lambda \langle y, Ax \rangle + \lambda^2 \langle y, Ay \rangle. \]
Since the right hand side of the above inequality is a quadratic equation of the form $a\lambda^2 + b\lambda + c$, where $a \ge 0$, $c \ge 0$, it is nonnegative if and only if $b^2 \le 4ac$. Therefore,
\[ (\langle y, Ax \rangle + \langle x, Ay \rangle)^2 \le 4 \langle x, Ax \rangle \langle y, Ay \rangle. \tag{4.1.6} \]
First suppose that $A$ is angle bounded, that is, there is an $\alpha > 0$ such that for all $x, y \in \operatorname{dom} A$,
\[ |\langle y, Ax \rangle - \langle x, Ay \rangle|^2 \le 4\alpha^2 \langle x, Ax \rangle \langle y, Ay \rangle. \]
Note that
\[ \langle y, Ax \rangle = \frac{1}{2} (\langle y, Ax \rangle + \langle x, Ay \rangle) + \frac{1}{2} (\langle y, Ax \rangle - \langle x, Ay \rangle), \]
therefore, using the fact that $(a + b)^2 \le 2a^2 + 2b^2$,
\[ \langle y, Ax \rangle^2 \le \frac{1}{2} \left( (\langle y, Ax \rangle + \langle x, Ay \rangle)^2 + (\langle y, Ax \rangle - \langle x, Ay \rangle)^2 \right). \tag{4.1.7} \]
Combining (4.1.7) with (4.1.6),
\[ \langle y, Ax \rangle^2 \le (2 + 2\alpha^2) \langle x, Ax \rangle \langle y, Ay \rangle, \]
and $A$ is $3\sigma$-monotone with $\sigma = 2 + 2\alpha^2$ by Lemma 4.1.6.

Now suppose $A$ is $3\sigma$-monotone. By Lemma 4.1.6,
\[ \langle y, Ax \rangle^2 \le 4\sigma \langle x, Ax \rangle \langle y, Ay \rangle. \]
Note that
\[ \langle y, Ax \rangle - \langle x, Ay \rangle = 2 \langle y, Ax \rangle - (\langle y, Ax \rangle + \langle x, Ay \rangle). \]
Therefore, using again the fact that $(a + b)^2 \le 2a^2 + 2b^2$ and equation (4.1.6),
\[ (\langle y, Ax \rangle - \langle x, Ay \rangle)^2 \le (32\sigma + 8) \langle x, Ax \rangle \langle y, Ay \rangle, \]
and so $A$ is angle-bounded.

For monotone linear relations in $\mathbb{R}^n$, $3^*$-monotonicity and angle-boundedness are equivalent properties, which is shown in Proposition 4.2.11 below, although this is not the case in Hilbert space as is demonstrated by Example 4.2.12. The recent result for paramonotonicity and $3^*$-monotonicity is a portion of the main result in [12].

Proposition 4.1.8 [12] Suppose $A : X \to 2^X$ is a maximal monotone linear relation such that $\operatorname{dom} A$ and $\operatorname{ran} A_+$ are closed ($A_+$ is the symmetric part of $A$). Then, $A$ is $3^*$-monotone if and only if $A$ is paramonotone.

Here, we use a different approach to that used for Proposition 4.1.8, where, while avoiding the use of the Fitzpatrick function, we obtain results that apply to all monotone operators regardless of maximal monotonicity. This is done by examining the density of $\operatorname{dom} A$ rather than its closure, further extending these results. First, we characterize paramonotonicity for linear relations with the following two facts.

Fact 4.1.9 Suppose $A : X \to 2^X$ is a monotone linear relation. Then, $A$ is paramonotone if and only if for all $x \in X$,
\[ \langle x, Ax \rangle = 0 \ \Rightarrow \ Ax = A0. \tag{4.1.8} \]

Proof. Suppose that $A$ is paramonotone and that for some $x \in \operatorname{dom} A$, $\langle x, Ax \rangle = 0$. Then, $\langle x - 0, Ax - A0 \rangle = 0$, since $A0 \subset (\operatorname{dom} A)^\perp$ (Proposition 4.0.9). Therefore, by paramonotonicity, every $x^* \in Ax$ is also in $A0$. By Fact 4.0.5 (iii) and (iv), $Ax = A0$.

Now, suppose that (4.1.8) holds for $A$ and that for some $(y, y^*), (z, z^*) \in \operatorname{gra} A$, $\langle y - z, y^* - z^* \rangle = 0$. Let $x = y - z$. Since $A$ is a linear relation, $y^* - z^* \in Ax$, and so $\langle x, Ax \rangle = 0$.
Therefore, $Ax = A0$, and so $y^* - z^* \in A0$ and
\[ y^* \in z^* + A0; \qquad -z^* \in -y^* + A0. \]
By Fact 4.0.5 (i) and (iv), $-y^* + A0 = -Ay$. Hence $y^* \in Az$ and $z^* \in Ay$, so $A$ is paramonotone.

Fact 4.1.10 Suppose $A : X \to 2^X$ is a monotone linear relation, and let $x \in X$. Then, $Ax = A0$ if and only if $0 \in Ax$, and if $0 \in Ax$, then $P_{A0^\perp} Ax = \{0\}$. If $A0$ is closed and $P_{A0^\perp} Ax = \{0\}$, then $0 \in Ax$.

Proof. Let $Ax = A0$. Since $A0$ is a linear subspace of $X$ (Fact 4.0.5 (iii)), $0 \in Ax$. Now, let $0 \in Ax$. Then, by Fact 4.0.5 (iv), $Ax = A0$. By Proposition 4.0.7, $P_{A0^\perp} Ax$ is a singleton, and since $0 \in A0^\perp$ by the definition of the orthogonal complement, $P_{A0^\perp} Ax = \{0\}$. Now, let $P_{A0^\perp} Ax = \{0\}$ and suppose that $A0$ is closed. Then, by Proposition 4.0.7, $0 \in Ax$.

Proposition 4.1.11 Suppose $A : X \to 2^X$ is a monotone linear relation such that $\operatorname{dom} A$ is dense in $A0^\perp$ and $A0$ is closed. If $A$ is $3^*$-monotone, then $A$ is also paramonotone.

Proof. Suppose that $A$ is not paramonotone. Then, there exists an $x \in \operatorname{dom} A$ such that $\langle x, Ax \rangle = 0$ yet $Ax \ne A0$. Choose any $x^* \in Ax$, and let $x_0^* = P_{A0^\perp} x^*$. By Fact 4.1.10, $x_0^* \ne 0$ since $A0$ is closed. If $x_0^* \in \operatorname{dom} A$, let $w = \frac{1}{2} x_0^*$. If $x_0^* \notin \operatorname{dom} A$, there is a sequence $(y_n)_{n \in \mathbb{N}} \subset \operatorname{dom} A$ converging to $x_0^*$ since $\operatorname{dom} A$ is dense in $A0^\perp$. In this case, let $w = y_n$ for some $n$ such that
\[ \langle w, Ax \rangle = \langle y_n, x_0^* \rangle \ge \frac{1}{2} \|x_0^*\|^2. \]
Let $v = \lambda x$ for some $\lambda > 0$ and let $u = 0$ so that
\[ \langle w - v, Av - Au \rangle = \langle w - \lambda x, \lambda Ax \rangle \ge \frac{\lambda}{2} \|x_0^*\|^2, \]
which is unbounded with respect to $\lambda$. Hence, $A$ is not $3^*$-monotone, yielding the contrapositive.

Note that for any linear relation $A : X \to 2^X$, $A0$ is closed if and only if $Ax$ is closed for any $x \in \operatorname{dom} A$ by Fact 4.0.5 (iv).

We therefore obtain by a different method Proposition 4.5 from [12].

Corollary 4.1.12 [12] If the linear relation $A : X \to 2^X$ is maximal monotone and $3^*$-monotone, then $A$ is paramonotone.

Proof. Follows directly from Proposition 4.1.11 and Corollary 4.0.10.
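The argument in the proof of Proposition 4.1.11 can be traced on the rotation matrix $Q$ from Section 3.6, which is single-valued with full domain, so that $A0 = \{0\}$ and $A0^\perp = \mathbb{R}^2$. The Python sketch below is an illustration with assumed helper names: it takes $x = e_1$, for which $\langle x, Qx \rangle = 0$ but $Qx \ne 0$, sets $w = \frac{1}{2} Qx$ and $u = 0$, and evaluates the quantity $\langle w - v, Av - Au \rangle$ from the proof along $v = \lambda x$.

```python
Q = ((0, -1), (1, 0))  # the rotation matrix Q from Section 3.6

def apply(A, x):
    """Matrix-vector product for a 2x2 matrix given as nested tuples."""
    return (A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1])

def inner(u, v):
    return u[0] * v[0] + u[1] * v[1]

def gap(A, w, v, u):
    """The quantity <w - v, Av - Au> used in the proof of Proposition 4.1.11."""
    Av, Au = apply(A, v), apply(A, u)
    return inner((w[0] - v[0], w[1] - v[1]), (Av[0] - Au[0], Av[1] - Au[1]))

x = (1, 0)
x_star = apply(Q, x)                  # Qx = (0, 1): nonzero although <x, Qx> = 0
w = (x_star[0] / 2, x_star[1] / 2)    # w = Qx / 2, as in the proof
u = (0, 0)

lambdas = (1, 10, 100, 1000)
gaps = [gap(Q, w, (lam * x[0], lam * x[1]), u) for lam in lambdas]

print("<x, Qx> =", inner(x, x_star))
print("gaps along v = lambda * x:", gaps)
```

The printed values grow like $\lambda / 2$, matching the lower bound $\frac{\lambda}{2} \|x_0^*\|^2$ in the proof with $\|x_0^*\| = 1$.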
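Corollary 4.1.13 If the linear relation $A : X \to 2^X$ is $3^*$-monotone, then the operator $\tilde{A} : X \to 2^X$ defined by
\[ \tilde{A}x := Ax + (\operatorname{dom} A)^\perp \tag{4.1.9} \]
is a linear relation and is a $3^*$-monotone extension of $A$ that is paramonotone.

Proof. The operator $\tilde{A}$ is a linear relation since $A$ is a linear relation, since $\operatorname{dom} \tilde{A} = \operatorname{dom} A$, and since $(\operatorname{dom} A)^\perp$ is a linear subspace. (Recall that we are using the convention that $\emptyset + S = \emptyset$ for any set $S$.) More specifically, for all $x, y \in \operatorname{dom} \tilde{A} = \operatorname{dom} A$ and for all $\lambda \in \mathbb{R}$,
\[ \lambda \tilde{A}x = \lambda Ax + \lambda (\operatorname{dom} A)^\perp \subset A(\lambda x) + (\operatorname{dom} A)^\perp = \tilde{A}(\lambda x), \]
and
\[ \tilde{A}x + \tilde{A}y = Ax + (\operatorname{dom} A)^\perp + Ay \subset A(x + y) + (\operatorname{dom} A)^\perp = \tilde{A}(x + y). \]
By the definition of $(\operatorname{dom} A)^\perp$, for all $x, y, z \in \operatorname{dom} \tilde{A}$,
\[ \langle z - y, \tilde{A}y - \tilde{A}z \rangle = \langle z - y, Ay - Az \rangle. \]
Therefore, $\tilde{A}$ is monotone and $3^*$-monotone because $A$ is monotone and $3^*$-monotone. Since by Proposition 4.0.9, $A0 \subset (\operatorname{dom} A)^\perp$, it follows from Fact 4.0.5 (iv) that $\tilde{A}$ is a monotone extension of $A$ and that $\tilde{A}0 = (\operatorname{dom} A)^\perp$. Therefore, $\tilde{A}0^\perp = \overline{\operatorname{dom} A}$, and so by Proposition 4.1.11 and since $\operatorname{dom} A = \operatorname{dom} \tilde{A}$, $\tilde{A}$ is paramonotone.

If the linear relation $A$ from Proposition 4.1.11 is also a single-valued bounded linear operator, then Proposition 4.1.11 is a corollary to the stronger result from [23] (Proposition 3.4.22). However, there are $3^*$-monotone linear relations that are not paramonotone.

Example 4.1.14 Let $X = \ell^2$ and define the operators $\tilde{A}, A : X \to 2^X$ by
\[ \tilde{A}x := \sum_{k=1}^{+\infty} x_{2k} e_{2k} \tag{4.1.10} \]
and
\[ Ax := x_1 u + \tilde{A}x + A0, \tag{4.1.11} \]
where
\[ u := \sum_{k=1}^{\infty} \frac{1}{k} e_{2k+1}, \tag{4.1.12} \]
\[ A0 := \{x \in \ell^2 : \exists N \in \mathbb{N} \text{ s.t. } x_k = 0 \ \forall k \ge N \text{ and } x_{2k+1} = 0 \ \forall k \in \mathbb{N}\}, \tag{4.1.13} \]
and
\[ \operatorname{dom} A = \operatorname{dom} \tilde{A} = \operatorname{span}\{e_1, e_2, e_4, e_6, \ldots\}. \tag{4.1.14} \]
Then, $A$ is a $3^*$-monotone linear relation, but it is not paramonotone.

Proof. Both $A$ and $\tilde{A}$ are by definition linear relations. Note that $\tilde{A} = 0 \times J$ where $J$ is a subgraph of $\operatorname{Id}$. Therefore, $\tilde{A}$ is $3^*$-monotone as both $\operatorname{Id}$ and $0$ are $3^*$-monotone. Also, $A0$ is a dense subspace of $\overline{\operatorname{span}}\{e_{2k+1} : k \in \mathbb{N}\}$, and so $A0^\perp = \overline{\operatorname{span}}\{e_{2k} : k \in \mathbb{N}\}$.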
Therefore, $P_{A0^\perp} Ax = \tilde{A}x$ as $u \in (\operatorname{dom} A)^\perp$. Since $A0 \subset (\operatorname{dom} A)^\perp$ (Proposition 4.0.9), for all $(x, x^*), (y, y^*), (z, z^*) \in \operatorname{gra} A$,
\[ \langle z - y, y^* - x^*\rangle = \langle z - y, P_{A0^\perp} y^* - P_{A0^\perp} x^*\rangle = \langle z - y, \tilde{A}y - \tilde{A}x\rangle, \]
and so $A$ is also $3^*$-monotone. Now, $Ae_1 = u + A0 \not\subset A0$, and so $Ae_1 \neq A0$. However, $\langle e_1, Ae_1\rangle = \langle e_1, \tilde{A}e_1\rangle = 0$. Therefore, $A$ is not paramonotone.

Finally, for certain linear operators, there is another condition equivalent to $3^*$-monotonicity.

Fact 4.1.15 [6] For any continuous linear operator $A : X \to X$ where $\operatorname{ran} A_+$ is closed, $A$ is $3^*$-monotone if and only if the range of the skew part of $A$ is contained in the range of the symmetric part, i.e. $\operatorname{ran} A_\sim \subseteq \operatorname{ran} A_+$.

4.2 Monotone linear operators on $\mathbb{R}^n$

A linear operator is a single-valued linear relation with full domain, which is maximal monotone by Proposition 4.0.8. Although being single-valued and having full domain are restrictive conditions, when it comes to monotone classes, linear operators are highly characteristic of linear relations with closed domain.

If a monotone linear relation $A : X \to 2^X$ has closed domain, which is always the case if $X = \mathbb{R}^n$, then $\operatorname{dom} A$ is itself a Hilbert space and the results above hold in their strongest form, as they do for all linear operators. Let $\tilde{A} : \operatorname{dom} A \to 2^{\operatorname{dom} A}$ be the single-valued selection of $A$ generated in the manner of Proposition 4.0.13 with $V = \operatorname{dom} A$. By Proposition 4.0.8, $\tilde{A}$ is a monotone linear operator. Since $A$ and $\tilde{A}$ differ only by elements perpendicular to the domain, by Section 3.5, the only difference between their monotone classes is that $A$ may fail to be maximal monotone, and if so may also fail to be paramonotone. Below, we consider linear operators operating on $\mathbb{R}^2$, $\mathbb{R}^n$, and on Hilbert spaces of infinite dimension.
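As a finite-dimensional illustration of Fact 4.1.15, the range condition $\operatorname{ran} A_\sim \subseteq \operatorname{ran} A_+$ can be checked by hand for $2 \times 2$ matrices. The following is a minimal sketch under stated assumptions: matrices are stored as rows `((a, c), (b, d))`, $A_+ = \frac{1}{2}(A + A^t)$ and $A_\sim = \frac{1}{2}(A - A^t)$, and the helper names (`sym_part`, `skew_coeff`, `is_monotone`, `range_condition`) are hypothetical. In $\mathbb{R}^2$ the skew part is either zero or invertible, so the condition reduces to "the skew part vanishes or $A_+$ is invertible".

```python
def sym_part(A):
    """Symmetric part A_+ of a 2x2 matrix given as rows ((a, c), (b, d))."""
    (a, c), (b, d) = A
    h = 0.5 * (b + c)
    return ((a, h), (h, d))

def skew_coeff(A):
    """Return s such that the skew part A_~ equals [[0, -s], [s, 0]]."""
    (a, c), (b, d) = A
    return 0.5 * (b - c)

def is_monotone(A, tol=1e-12):
    """A is monotone iff A_+ is positive semidefinite."""
    (a, h), (_, d) = sym_part(A)
    return a >= -tol and d >= -tol and a * d - h * h >= -tol

def range_condition(A, tol=1e-12):
    """Check ran A_~ <= ran A_+ for a 2x2 matrix."""
    (a, h), (_, d) = sym_part(A)
    if abs(skew_coeff(A)) <= tol:
        return True                      # ran A_~ = {0}
    return abs(a * d - h * h) > tol      # ran A_~ = R^2 needs ran A_+ = R^2

Q = ((0.0, -1.0), (1.0, 0.0))            # rotation by pi/2
Id_plus_Q = ((1.0, -1.0), (1.0, 1.0))
S = ((1.0, 1.0), (1.0, 1.0))             # symmetric and singular

results = {name: (is_monotone(M), range_condition(M))
           for name, M in (("Q", Q), ("Id+Q", Id_plus_Q), ("S", S))}
print(results)
```

The rotation by $\pi/2$ is monotone but fails the range condition, while the other two matrices satisfy it.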
Note that linear operators acting on $\mathbb{R}^n$ will be identified with their matrix representation in the standard basis, and recall from Proposition 4.0.20 that symmetric linear operators are subdifferentials of lower semicontinuous convex functions. In $\mathbb{R}^n$, all subspaces are closed, and so by Corollary 4.0.10, any maximal monotone single-valued linear relation has full domain. Any single-valued linear relation $A$ that is not maximal monotone has as domain a closed subspace of $\mathbb{R}^n$. Restricted to this domain $D$, the operator $\tilde{A} : D \to D$ defined by $x \mapsto P_D(Ax)$, where $P_D$ is the orthogonal projection onto the domain, has full domain and is therefore maximal monotone.

Proposition 4.2.1 A single-valued monotone linear relation $A : \mathbb{R}^n \to \mathbb{R}^n$ is maximal monotone if and only if $\operatorname{dom} A = \mathbb{R}^n$.

Proof. In $\mathbb{R}^n$, all subspaces are closed, and so by Corollary 4.0.10, every maximal monotone single-valued linear relation has full domain. The converse follows from Proposition 4.0.8.

For these reasons, and for simplicity, in this section we consider only linear operators, usually denoted $A : \mathbb{R}^n \to \mathbb{R}^n$.

Definition 4.2.2 (linear operator) A linear operator $A : X \to X$ is a single-valued linear relation with full domain.

Remark 4.2.3 By Proposition 4.0.8, all monotone linear operators are maximal monotone, and the restriction that linear operators are single-valued is redundant as this also follows from having full domain.

Monotone linear operators on $\mathbb{R}^n$ are in one-to-one correspondence with the set of (not necessarily symmetric) positive semidefinite $n \times n$ matrices $A : \mathbb{R}^n \to \mathbb{R}^n$, allowing matrix theory to be applied.

Fact 4.2.4 (canonical form [56]) Since symmetric and skew-symmetric matrices are normal matrices ($AA^* = A^*A$), they can be represented in what is sometimes known as the Murnaghan canonical form. This canonical form simplifies according to whether $A$ is symmetric or skew-symmetric:
(i) Symmetric matrices are orthogonally diagonalizable.
That is, there exists an orthogonal matrix $Q$ ($Q^t = Q^{-1}$, $\ker Q = \{0\}$), such that $A = QDQ^t$, where $D$ is a diagonal matrix.
(ii) For any $n \times n$ skew-symmetric matrix $A$ ($A = -A^t$), there exists an orthogonal matrix $Q$ such that $A = Q\Sigma Q^t$, where the possible nonzero elements of $\Sigma$ are $\Sigma_{2i-1,2i} = -\Sigma_{2i,2i-1} = \pm\lambda_i$ for all $i \in \{1, 2, \ldots, \lfloor n/2 \rfloor\}$. All other entries of $\Sigma$ are $0$, and the values $\lambda_i \in \mathbb{R}$ are the magnitudes of the complex eigenvalues of $A$.

We continue to use the notation of Proposition 4.0.21, where we recognize that a linear operator $A$, considered as a matrix, can be decomposed into a symmetric part $A_+ = \frac{1}{2}(A + A^t)$ and a skew part $A_\sim = \frac{1}{2}(A - A^t)$ so that $A = A_+ + A_\sim$.

Corollary 4.2.5 For a monotone linear operator $A : \mathbb{R}^n \to \mathbb{R}^n$, the inner product $\langle x, Ax\rangle = x^t A_+ x = 0$ if and only if $x \in \ker A_+$.

Proof. Since $A_+$ is symmetric, $A_+ = QDQ^t$ for some orthogonal $Q$ and diagonal matrix $D$ with diagonal values $\lambda_i \in \mathbb{R}$, $i \in \{1, 2, \ldots, n\}$. Therefore, $\ker A_+ = \ker(DQ^t)$. Since $A_+$ is monotone, $\lambda_i \ge 0$ for all $i \in \{1, 2, \ldots, n\}$. Since
\[ x^t A_+ x = x^t QDQ^t x = (Q^t x)^t D (Q^t x) = \sum_{i=1}^{n} \lambda_i (Q^t x)_i^2, \]
$x^t A_+ x = 0$ if and only if for all $i \in \{1, 2, \ldots, n\}$ either $\lambda_i = 0$ or $(Q^t x)_i = 0$, which in turn occurs if and only if $DQ^t x = 0$, i.e. if $x \in \ker(DQ^t)$.

Proposition 4.2.6 [48] Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be a linear operator. Then
(i) $A$ is monotone if and only if $A_+$ is positive semidefinite,
(ii) $A$ is paramonotone if and only if $A$ is monotone and $\ker(A_+) \subseteq \ker(A)$,
(iii) $A$ is strictly monotone if and only if $A$ is monotone and $\ker(A_+) = \{0\}$ (i.e. $A_+$ is positive definite).

Proof. [48]
(i) Immediate from the fact that $\langle x - y, Ax - Ay\rangle = (x - y)^t A_+ (x - y)$.
(ii) If $A$ is paramonotone, then by (i) $A_+$ is positive semidefinite. For any $z \in \ker(A_+)$, we have that $0 = \langle z, Az\rangle = \langle z - 0, Az - A0\rangle$. Since $A$ is paramonotone, $Az = A0 = 0$, so $z \in \ker(A)$. For the converse, assume that $0 = \langle x - y, Ax - Ay\rangle$ for some $x, y \in \mathbb{R}^n$ so that $x - y \in \ker(A_+) \subseteq \ker(A)$. Hence, $A(x - y) = 0$, so $Ax = Ay$.
Since $A$ is also monotone, $A$ is paramonotone.
(iii) Assume $A$ is strictly monotone. For any $z \in \ker(A_+)$, it follows that $0 = \langle z, Az\rangle = \langle z - 0, Az - A0\rangle$. Since $A$ is strictly monotone, $z = 0$, and so $\ker(A_+) = \{0\}$. For the converse, suppose $x, y \in \mathbb{R}^n$ are such that $\langle x - y, Ax - Ay\rangle = 0$. Since $\ker(A_+) = \{0\}$, $x = y$. Since $A$ is also monotone, $A$ is strictly monotone.

Corollary 4.2.7 Symmetric monotone linear operators on $\mathbb{R}^n$ are paramonotone.

Proof. Direct from Proposition 4.2.6 (ii).

Proposition 4.2.8 The only paramonotone and skew-symmetric linear operator $A : \mathbb{R}^n \to \mathbb{R}^n$ is the zero operator $A = 0$.

Proof. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be a paramonotone and skew-symmetric linear operator. Since $A$ is skew-symmetric, $A = A_\sim$, and so $A_+ = 0$. Since $A$ is paramonotone, we have by Proposition 4.2.6 that $\mathbb{R}^n = \ker A_+ \subseteq \ker A$, and so $A = 0$.

That $3^*$-monotonicity is equivalent to paramonotonicity in $\mathbb{R}^n$ is proven more succinctly in [6]; however, the angle-bounded (or $3\sigma$-monotone) result given below in Proposition 4.2.9 is a tighter result.

Proposition 4.2.9 Every paramonotone linear operator $A : \mathbb{R}^n \to \mathbb{R}^n$ is angle bounded and therefore $3^*$-monotone.

Proof. By Proposition 4.2.8, it is sufficient to examine the case where $A_+ \neq 0$. There exist orthogonal matrices $Q_+$ and $Q_\sim$ such that $A_+ = Q_+ D Q_+^t$ and $A_\sim = Q_\sim \Sigma Q_\sim^t$, where $D$ and $\Sigma$ are as defined in Fact 4.2.4. Let $\lambda_m > 0$ be the smallest nonzero eigenvalue of $A_+$, corresponding to the smallest nonzero element in the diagonal matrix $D$. Let $\lambda_s$ be the largest element in $\Sigma$, being also the maximum magnitude of a complex eigenvalue of $A_\sim$. In this way, $\|A_\sim\|$, the operator norm of $A_\sim$, is equal to $\lambda_s$. By Cauchy-Schwarz, for all $x, y \in \mathbb{R}^n$,
\[ |\langle y, A_\sim x\rangle|^2 \le \|A_\sim x\|^2 \|y\|^2 \le \lambda_s^2 \|x\|^2 \|y\|^2. \]
If $\langle x, k\rangle = \langle y, k\rangle = 0$ for all $k \in \ker A_+$, then
\[ |\langle y, A_\sim x\rangle|^2 \le \frac{\lambda_s^2}{\lambda_m^2}\, \lambda_m\|x\|^2\, \lambda_m\|y\|^2 \le \frac{\lambda_s^2}{\lambda_m^2}\, \langle x, A_+ x\rangle \langle y, A_+ y\rangle. \]
(4.2.1)

For any $x, y \in \mathbb{R}^n$, there is an orthogonal decomposition $x = x_a + x_k$, $y = y_a + y_k$ where $x_k, y_k \in \ker A_+$, and where $x_a, y_a$ belong to the orthogonal space $(\ker A_+)^\perp := \{x : \langle x, k\rangle = 0\ \forall k \in \ker A_+\}$. Since $A$ is paramonotone, $\ker A_+ \subseteq \ker A$. For any $x_k \in \ker A_+$, $x_k \in \ker A = \ker(A_+ + A_\sim)$, and so $x_k \in \ker A_\sim$ and $x_k \in \ker(A_+ - A_\sim) = \ker(A^*)$. Hence,
\[ \langle x_a + x_k, A(x_a + x_k)\rangle = \langle x_a + x_k, Ax_a\rangle = \langle A^* x_a, x_a\rangle = \langle x_a, Ax_a\rangle. \]
Similarly, $\langle y_a + y_k, A_\sim(x_a + x_k)\rangle = \langle y_a, A_\sim x_a\rangle$. Hence, (4.2.1) holds for all $x, y \in \mathbb{R}^n$, which yields the general condition (4.1.2) for angle-boundedness, and therefore $A$ is $3^*$-monotone.

Proposition 4.2.10 [6] For any linear operator $A : \mathbb{R}^n \to \mathbb{R}^n$, $A$ is paramonotone if and only if $A$ is $3^*$-monotone.

Proof. Since $A$ has full domain, Proposition 4.1.11 and Proposition 4.2.9 combine to yield this result.

Proposition 4.2.11 Given a linear operator $A : \mathbb{R}^n \to \mathbb{R}^n$, $A$ is angle bounded if and only if $A$ is $3^*$-monotone.

Proof. A direct result of Proposition 4.2.10 and Proposition 4.2.9.

However, the following example from [45] demonstrates that Proposition 4.2.11 does not generalize to Hilbert spaces.

Example 4.2.12 [45] Define $T : X \to X$ where $X = \ell^2(\mathbb{Z})$ by
\[ Te_n = \begin{cases} n^{-2} e_n + n^{-1} e_{-n}, & n \neq 0, \\ 0, & n = 0, \end{cases} \tag{4.2.2} \]
where $e_n$ is the $n$th standard vector of $\ell^2(\mathbb{Z})$, so that $\{e_n : n \in \mathbb{Z}\}$ forms an orthonormal basis of $X$. Then, $T$ is a monotone continuous linear operator that is $3^*$-monotone, and indeed $\alpha$-inverse strongly monotone; however, $T$ is not angle bounded.

Proof. Let $x \in X$ so that $x = a_0 e_0 + \sum_{n=1}^{+\infty} (a_n e_n + a_{-n} e_{-n})$ for some sequence $(a_n)_{n\in\mathbb{Z}} \subset \mathbb{R}$ such that $\sum_{n\in\mathbb{Z}} a_n^2 < +\infty$. Then,
\[ Tx = \sum_{n=1}^{+\infty} \left[ \left(\frac{a_n}{n^2} - \frac{a_{-n}}{n}\right) e_n + \left(\frac{a_n}{n} + \frac{a_{-n}}{n^2}\right) e_{-n} \right], \]
and so
\[ \langle x, Tx\rangle = \sum_{n=1}^{+\infty} \left[ \frac{1}{n}(a_n a_{-n} - a_n a_{-n}) + \frac{1}{n^2}(a_n^2 + a_{-n}^2) \right] = \sum_{n=1}^{+\infty} \frac{1}{n^2}(a_n^2 + a_{-n}^2). \]
Now,
\[ \langle Tx, Tx\rangle = \sum_{n=1}^{+\infty} \left[ \frac{a_n^2}{n^4} + \frac{a_{-n}^2}{n^2} + \frac{a_{-n}^2}{n^4} + \frac{a_n^2}{n^2} \right] \le 2\langle x, Tx\rangle, \]
and so $T$ is $\alpha$-inverse strongly monotone with $\alpha = \frac{1}{2}$.
Similarly,
\[ \|Tx\|^2 = \langle Tx, Tx\rangle \le \sum_{n=1}^{+\infty} (2a_n^2 + 2a_{-n}^2) \le 2\|x\|^2, \]
and so $T$ is a bounded linear operator. By Proposition 3.4.22, $T$ is $3^*$-monotone. Now, for any $n \in \mathbb{N}$, note that
\[ \langle e_{-n}, Te_n\rangle - \langle e_n, Te_{-n}\rangle = \frac{2}{n} \]
and that
\[ \langle e_n, Te_n\rangle = \frac{1}{n^2} = \langle e_{-n}, Te_{-n}\rangle. \]
Therefore, by definition, $T$ is not angle bounded.

4.3 Monotone linear operators on $\mathbb{R}^2$

In this section we consider linear operators $A : \mathbb{R}^2 \to \mathbb{R}^2$, which can be represented by the matrix
\[ A = \begin{pmatrix} a & c \\ b & d \end{pmatrix}. \]
The operator $A$ so defined is monotone if and only if $a + d \ge 0$ and $4ad \ge (b + c)^2$. We consider some simple examples, examine their properties, and provide some sufficient and necessary conditions for inclusion within various monotone classes.

Proposition 4.3.1 (3-cyclic monotone linear operators on $\mathbb{R}^2$) If $A$ is 3-cyclic monotone, then
\[ \max\{|b|, |c|\} - a - d \le 0. \tag{4.3.1} \]

Proof. Choose $x = (0, 0)$, $y = (1, 0)$, and $z = (0, 1)$; let $x^* = Ax = (0, 0)$, $y^* = Ay = (a, b)$, and $z^* = Az = (c, d)$. If the mapping associated with $A$ is 3-cyclic monotone then
\[ 0 \le \langle x - y, x^*\rangle + \langle y - z, y^*\rangle + \langle z - x, z^*\rangle = \langle (1, -1), (a, b)\rangle + \langle (0, 1), (c, d)\rangle = a + d - b. \]
Similarly, by choosing different $y$ and $z$, the following conditions are also necessary for any matrix $A$ as defined above. In all cases, $x = (0, 0)$.
\[ 0 \ge \begin{cases} b - a - d, & y = (1, 0),\ z = (0, 1), \\ -b - a - d, & y = (-1, 0),\ z = (0, 1), \\ c - a - d, & y = (0, 1),\ z = (1, 0), \\ -c - a - d, & y = (0, -1),\ z = (1, 0). \end{cases} \tag{4.3.2} \]

There are many monotone linear operators in $\mathbb{R}^2$ that are not 3-cyclic monotone, and furthermore Examples 4.3.2 and 4.3.3 below demonstrate that 3-cyclic monotonicity does not follow from strict and maximal monotonicity.

Example 4.3.2 Consider the monotone linear operator $\tilde{R} : \mathbb{R}^2 \to \mathbb{R}^2$ defined by
\[ \tilde{R} = \begin{pmatrix} 1 & -2 \\ 3 & 1 \end{pmatrix}. \]
(4.3.3)

The operator $\tilde{R}$ violates the necessary condition (4.3.1) for 3-cyclic monotonicity since $b - a - d > 0$, and $\tilde{R}$ satisfies the monotonicity conditions $a + d \ge 0$ and $4ad \ge (b + c)^2$, using the format $\tilde{R} = \begin{pmatrix} a & c \\ b & d \end{pmatrix}$ above. Note that $\langle x, \tilde{R}x\rangle = 0$ implies that $x = 0$, so $\tilde{R}$ is strictly monotone and therefore paramonotone. Hence, by Proposition 4.2.10, $\tilde{R}$ is also $3^*$-monotone. Finally, $\tilde{R}$ is maximal monotone by Proposition 4.0.8.

Example 4.3.3 Consider the rotation operator $R_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ with matrix representation
\[ R_\theta = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}. \tag{4.3.4} \]
Note that $R_\theta$ is monotone if and only if $|\theta| \le \pi/2$, since this is precisely when $\cos(\theta) \ge 0$. In this range, $R_\theta$ is maximal monotone by Proposition 4.0.8. Now, $R_\theta$ is 3-cyclic monotone if and only if $|\theta| \le \pi/3$ by Fact 4.3.4 below. Therefore, for any $\theta \in\, ]\pi/3, \pi/2[$, $R_\theta$ is maximal monotone and strictly monotone, but not 3-cyclic monotone. Now, $\langle x, R_\theta x\rangle = 0$ implies that $x = 0$ unless $|\theta| = \pi/2$. Therefore, $R_\theta$ is strictly monotone and hence paramonotone when $|\theta| < \pi/2$. By Proposition 4.2.10, $R_\theta$ is $3^*$-monotone as well when $|\theta| < \pi/2$. When $|\theta| = \pi/2$, $R_\theta$ is not paramonotone, and therefore it is neither strictly monotone, nor, by Proposition 4.1.11, is it $3^*$-monotone.

By the following fact (Proposition 7.1 in [6]), $\mathbb{R}^2$ is large enough to contain distinct instances of $n$-cyclic monotone operators for $n \ge 2$.

Fact 4.3.4 [6] Let $n \in \{2, 3, \ldots\}$. Then $R_\theta$ is $n$-cyclic monotone if and only if $|\theta| \in [0, \pi/n]$.

Proof. See Example 4.6 in [6] for a detailed proof.

Finally, paramonotone linear operators in $\mathbb{R}^2$ are further restricted to be either strictly monotone or symmetric.

Proposition 4.3.5 A linear operator $A : \mathbb{R}^2 \to \mathbb{R}^2$ is paramonotone if and only if it is strictly monotone or symmetric.

Proof. Strictly monotone operators and symmetric linear operators are paramonotone by Fact 3.1.1 and Proposition 4.2.6 respectively. It remains to show that these are the only two possibilities. Assuming then that $A$ is paramonotone, consider the general case,
\[ A = \begin{pmatrix} a & c \\ b & d \end{pmatrix} \quad\text{and}\quad A_+ = \begin{pmatrix} a & \frac{b+c}{2} \\ \frac{b+c}{2} & d \end{pmatrix}. \]
If $\ker(A_+) = \{0\}$, then $A$ is strictly monotone by Proposition 4.2.6. If $\ker(A_+) \neq \{0\}$ then by Proposition 4.2.6 $\ker(A_+) \subseteq \ker(A)$, and so $\ker(A) \neq \{0\}$, from which $\det(A) = 0$ and $ad = bc$. Hence, since $\det(A_+) = 0$, $4bc = (b + c)^2$, so $(b - c)^2 = 0$ and $b = c$. Therefore $A$ is symmetric.

Remark 4.3.6 The only paramonotone linear operators in $\mathbb{R}^2$ that are not strictly monotone are the symmetric linear operators
\[ A := \begin{pmatrix} a & b \\ b & \frac{b^2}{a} \end{pmatrix} \]
for $a > 0$ and $b \in \mathbb{R}$, and the zero operator $x \mapsto (0, 0)$. Now, $A = \partial f$ for the continuous convex function
\[ f(x_1, x_2) := \frac{1}{2} a x_1^2 + b x_1 x_2 + \frac{b^2}{2a} x_2^2. \]
For the zero operator, $0 = \partial(x \mapsto 0)$. Therefore, by Facts 3.2.6 and 3.2.7, they are maximal monotone and maximal cyclical monotone.

Chapter 5

Examples of Monotone Operators with Various Combinations of Monotone Classes

There are 17 ways in which a monotone operator in a Hilbert space can be or fail to be a combination of paramonotone, strictly monotone, 3-cyclic monotone, $3^*$-monotone, and maximal monotone. Examples of these are given below, together with their 5-digit binary label of monotone class for easy tracking. Further examples are given afterwards, followed by a summary of the results of Chapter 3 and Chapter 4 in Tables 5.1 and 5.3, further visualizations, and commentary in Section 5.1.

Of the 17 examples listed in Table 5.1, only 6 are not spherically symmetric: Examples 5.0.13, 5.0.17, 5.0.19, 5.0.21, 5.0.23, and 5.0.24. Four are spherically symmetric operators that can be represented in $\mathbb{R}^1$, and only one of these is linear ($\mathrm{Id}$). There are two other linear examples operating on $\mathbb{R}^2$, for a total of four linear examples if $0$ (not appearing in Table 5.1) is included. Combined, these show all possible configurations for linear monotone operators on $\mathbb{R}^2$ [37]. There are four nonlinear examples in $\mathbb{R}^3$, and the remaining seven operators operate on $\mathbb{R}^2$.
Except for Example 5.0.17 and the Examples on $\mathbb{R}^3$, the remaining operators are depicted in Figure 5.15 if they are maximal monotone and Figure 5.16 if they are not maximal. Example 5.0.17 is not depicted as the visualization artifact for operators that are not maximal monotone obfuscates important characteristics.

Remark 5.0.7 Every operator $T$ cited as an example in Table 5.1 satisfies all of the following conditions:
(i) $T(0) = 0$,
(ii) $T^{-1}(0) = 0$,
(iii) $\operatorname{dom} T = X$,
(iv) $T$ is single valued,
(v) if $T$ is maximal monotone, then $T$ is continuous.

[Figure 5.1: Visualization of $\mathrm{Id}$. For $(x_1^*, x_2^*) \in T(x_1, x_2)$, height on left is $x_1^*$, colour on left is $x_2^*$, height on right is $r$, colour on right is $\theta$ where $(x_1^*, x_2^*) = r(\cos\theta, \sin\theta)$.]

Since strictly monotone operators are paramonotone and since 3-cyclic monotone operators are $3^*$-monotone, when these conditions either imply or preclude the other, this will not be made explicit.

We start with the spherically symmetric operators, generated in the method of Proposition 3.8.1. These will be referred to later by symbol ($\mathrm{Id}$, $0$, $\Phi$, $g$, $h$, and $m$) and subscripted with the dimension on which they operate (i.e. $\mathrm{Id}_n : \mathbb{R}^n \to \mathbb{R}^n$) if otherwise unclear.

The identity map ($\mathrm{Id} : x \mapsto x$) is the subdifferential of the proper lower semicontinuous convex function $f : X \to \mathbb{R}$ defined by $f(x) := \frac{1}{2}\|x\|^2$. As such it is maximal monotone and 3-cyclic monotone, and therefore $3^*$-monotone. Since for any $x, y \in X$, $0 = \langle x - y, x - y\rangle = \|x - y\|^2$ if and only if $x = y$, the operator $\mathrm{Id}$ is strictly monotone, hence paramonotone, and has the binary label 11111. As a reference for the visualizations below, we visualize $\mathrm{Id}$ in Figure 5.1.
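The count of 17 class combinations can be reproduced by enumerating the 5-digit binary labels. The sketch below is only a bookkeeping aid under stated assumptions: the digit order (paramonotone, strictly monotone, 3-cyclic monotone, $3^*$-monotone, maximal monotone) is inferred from the labelled examples in this chapter, and the third exclusion rule encodes the observation, attributed in this chapter to Proposition 3.1.3, that a 3-cyclic monotone operator that is not paramonotone cannot be maximal monotone. The names `CLASSES`, `decode`, and `admissible` are hypothetical.

```python
from itertools import product

# Assumed digit order of the binary label, inferred from the examples.
CLASSES = ("paramonotone", "strictly monotone", "3-cyclic monotone",
           "3*-monotone", "maximal monotone")

def decode(label):
    """Map a label such as '10111' to a dict of class memberships."""
    return {name: bit == "1" for name, bit in zip(CLASSES, label)}

def admissible(label):
    """Apply the three implications between classes quoted in the text."""
    pm, sm, c3, s3, mm = (bit == "1" for bit in label)
    if sm and not pm:            # strictly monotone => paramonotone
        return False
    if c3 and not s3:            # 3-cyclic monotone => 3*-monotone
        return False
    if c3 and mm and not pm:     # 3-cyclic and maximal => paramonotone
        return False
    return True

labels = ["".join(bits) for bits in product("01", repeat=5)
          if admissible("".join(bits))]
print(len(labels))
print(decode("11111"))
```

Of the 32 possible labels, the two pairwise implications leave 18, and the third rule removes the single label 00111.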
Similarly, the zero operator $0$ is the subdifferential of the proper lower semicontinuous convex function $f : x \mapsto 0$. Now, $\langle x - y, 0\rangle = 0$ for all $x, y \in X$, and so while $0$ is not strictly monotone, it is paramonotone as $0x = 0y$ always. As such, $0$ has the binary label 10111. However, $0^{-1}(0) = X$, and so we define the nonlinear operator $h$ in its stead.

Example 5.0.8 ($h$, 10111) Define the operator $h : X \to X$ by
\[ hx := \begin{cases} x, & \|x\| \le 1, \\ \dfrac{x}{\|x\|}, & \|x\| > 1. \end{cases} \tag{5.0.1} \]
The monotone operator $h$ is paramonotone, not strictly monotone, 3-cyclic monotone, and maximal monotone.

Proof. The operator $h$ is the spherically symmetric operator $T$ from Proposition 3.8.1, where $g : \mathbb{R} \to \mathbb{R}$ is defined by
\[ g(t) := \begin{cases} 0, & t \le 0, \\ t, & 0 < t \le 1, \\ 1, & t > 1. \end{cases} \tag{5.0.2} \]
Now, $g$ is monotone but not strictly monotone since it is increasing but not strictly so on $\mathbb{R}_+$. Since $g$ is single valued and continuous, it is maximal monotone by Fact 3.2.1. Therefore, by Propositions 3.8.1 and 3.8.2, $h$ is paramonotone, 3-cyclic monotone, and maximal monotone but is not strictly monotone.

Note that $h$ is equal to $\partial f$, for the convex function $f : X \to \mathbb{R}$ defined by
\[ f(x) := \begin{cases} \frac{1}{2}\|x\|^2, & \|x\| \le 1, \\ \|x\| - \frac{1}{2}, & \|x\| > 1. \end{cases} \tag{5.0.3} \]
The spherically symmetric operators $g$, $m$, and $\Phi$ below arise from the same process.

Example 5.0.9 ($g$, 10110) Define the operator $g : X \to X$ by
\[ gx := \begin{cases} 0, & x = 0, \\ \dfrac{x}{\|x\|}, & x \neq 0. \end{cases} \tag{5.0.4} \]
The monotone operator $g$ is paramonotone, not strictly monotone, 3-cyclic monotone, and not maximal monotone.

Proof. The operator $g$ is the same as $T$ from Proposition 3.8.1 using the function $g$ where
\[ g(t) := \begin{cases} 0, & t \le 0, \\ 1, & t > 0. \end{cases} \tag{5.0.5} \]
As $g$ is constant when $t > 0$, it is monotone but not strictly monotone on $\mathbb{R}_+$. By Proposition 3.8.1, $g$ is paramonotone, 3-cyclic monotone, and not strictly monotone. Now, choose any $y \in X$ such that $0 < \|y\| \le 1$. The Cauchy-Schwarz inequality yields that, for any $x \in X$ such that $x \neq 0$,
\[ \left\langle x - 0, \frac{x}{\|x\|} - y \right\rangle \ge \|x\| - \|x\|\,\|y\| \ge 0. \]
Therefore, $(0, y)$ monotonically extends $g$, and $g$ is not maximal monotone.
Also, $g$ is equal to the single valued selection of minimal norm (3.5.1) of $\partial f$, where $f : X \to \mathbb{R}$ is defined by $f(x) := \|x\|$ and
\[ (\partial f)(x) = \begin{cases} \{y \in X : \|y\| \le 1\}, & x = 0, \\ \dfrac{x}{\|x\|}, & x \neq 0. \end{cases} \tag{5.0.6} \]

Example 5.0.10 ($m$, 11110) Define the operator $m : X \to X$ by
\[ mx := \begin{cases} 0, & x = 0, \\ x + \dfrac{x}{\|x\|}, & x \neq 0. \end{cases} \tag{5.0.7} \]
The monotone operator $m$ is strictly monotone and 3-cyclic monotone, but is not maximal monotone.

Proof. The operator $m$ is the spherically symmetric operator $T$ from Proposition 3.8.1 using the function
\[ g(t) := \begin{cases} 0, & t \le 0, \\ t + 1, & t > 0. \end{cases} \tag{5.0.8} \]
Clearly, $g$ is strictly monotone. Therefore, by Proposition 3.8.1, $m$ is paramonotone, 3-cyclic monotone, and strictly monotone. Notice that $m = \mathrm{Id} + g$. Since $\operatorname{dom} \mathrm{Id} = \operatorname{dom} g = X$, and since $g$ is not maximal monotone, $m$ is not maximal monotone by Remark 3.6.5.

Since $m = \mathrm{Id} + g$, it is the single valued selection of minimal norm (3.5.1) of the maximal monotone operator $\partial f$, where $f : X \to \mathbb{R}$ is defined by $f(x) := \|x\| + \frac{1}{2}\|x\|^2$.

Although we have already defined an operator with the same monotone classes ($g$), we define $\Phi$ below for use in Example 5.0.15 which follows. Other spherically symmetric operators are defined within Examples 5.0.16, 5.0.18, and 5.0.20.

Example 5.0.11 ($\Phi$, 10110) Define $\Phi$ by
\[ \Phi x := \begin{cases} 0, & \|x\| \le 1, \\ x, & \|x\| > 1. \end{cases} \tag{5.0.9} \]
The operator $\Phi$ is paramonotone and 3-cyclic monotone, though it is not strictly monotone and is not maximal monotone.

Proof. The subdifferential of the continuous convex function $f : X \to \mathbb{R}$, where
\[ f(x) := \begin{cases} 0, & \|x\| \le 1, \\ \frac{1}{2}\|x\|^2 - \frac{1}{2}, & \|x\| \ge 1, \end{cases} \]
is
\[ (\partial f)(x) := \begin{cases} 0, & \|x\| < 1, \\ \{\lambda x : \lambda \in [0, 1]\}, & \|x\| = 1, \\ x, & \|x\| > 1. \end{cases} \tag{5.0.10} \]
As it is a subdifferential, $\partial f$ is maximal monotone, cyclical monotone, and paramonotone. Since $\operatorname{gra} \Phi \subsetneq \operatorname{gra} \partial f$, we have that $\Phi$ is not maximal monotone, though it is 3-cyclic monotone by Fact 3.5.2. Since $\Phi x := \operatorname{argmin}\{\|x^*\| : x^* \in (\partial f)(x)\}$, $\Phi$ is paramonotone by Proposition 3.5.4. Neither $\partial f$ nor $\Phi$ is strictly monotone because the image of both operators is constant on the open unit ball.
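The four spherically symmetric operators above are easy to probe numerically in $\mathbb{R}^2$. The following is a minimal sketch, not a proof: it transcribes $h$, $g$, $m$, and $\Phi$ from (5.0.1), (5.0.4), (5.0.7), and (5.0.9), samples the monotonicity inequality on a small grid (the grid and the tolerance are arbitrary choices), shows that $\Phi$ is not strictly monotone, and checks that a point $(0, y)$ with $0 < \|y\| \le 1$ monotonically extends $g$ on the sampled points. The helper names `gap`, `grid`, and `y_ext` are hypothetical.

```python
import math

def norm(x):
    return math.hypot(x[0], x[1])

def inner(u, v):
    return u[0] * v[0] + u[1] * v[1]

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])

def h(x):                      # (5.0.1): projection onto the unit ball
    n = norm(x)
    return x if n <= 1.0 else (x[0] / n, x[1] / n)

def g(x):                      # (5.0.4): normalization, with g(0) = 0
    n = norm(x)
    return (0.0, 0.0) if n == 0.0 else (x[0] / n, x[1] / n)

def m(x):                      # (5.0.7): m = Id + g
    gx = g(x)
    return (x[0] + gx[0], x[1] + gx[1])

def phi(x):                    # (5.0.9)
    return (0.0, 0.0) if norm(x) <= 1.0 else x

def gap(T, x, y):
    """The monotonicity quantity <x - y, Tx - Ty>."""
    return inner(sub(x, y), sub(T(x), T(y)))

grid = [(i / 2.0, j / 2.0) for i in range(-4, 5) for j in range(-4, 5)]

# Smallest sampled value of <x - y, Tx - Ty> over all four operators.
min_gap = min(gap(T, x, y) for T in (h, g, m, phi) for x in grid for y in grid)

# Phi vanishes on the unit ball, so distinct points can give a zero gap.
phi_gap = gap(phi, (0.5, 0.0), (0.0, 0.5))

# (0, y_ext) with ||y_ext|| = 0.5 extends g: <x - 0, g(x) - y_ext> >= 0.
y_ext = (0.3, 0.4)
ext_ok = all(inner(x, sub(g(x), y_ext)) >= 0.0 for x in grid)

print(min_gap >= -1e-12, phi_gap, ext_ok)
```

A sampled check of this kind can only fail to find counterexamples; the class memberships themselves rest on Proposition 3.8.1 and the proofs above.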
The following simple skew-symmetric operator will be referred to as $Q$ throughout the rest of the paper. In particular, note that $-Q$ is also a monotone operator with the same properties.

Example 5.0.12 ($Q$, 00001) Define $Q : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[ Q(x_1, x_2) := (-x_2, x_1). \tag{5.0.11} \]
The operator $Q$ is neither paramonotone nor $3^*$-monotone, although it is maximal monotone.

Proof. As $Q$ is a skew-symmetric operator, for all $x, y \in \mathbb{R}^2$, $\langle x - y, Qx - Qy\rangle = 0$, hence $Q$ is monotone, yet $Qx \neq Qy$ if $x \neq y$, and so it is not paramonotone. The operator $Q$ directly violates the condition for $3^*$-monotonicity. For $k \in \mathbb{N}$, let $x = (0, 0)$, $y_k = (k, 0)$, and $z = (0, 1)$, so that
\[ \lim_{k\to+\infty} \langle z - y_k, Qy_k - Qx\rangle = \lim_{k\to+\infty} \langle (-k, 1), (0, k)\rangle = \lim_{k\to+\infty} k = +\infty. \]
Finally, since $Q$ is continuous and single valued on $\mathbb{R}^2$, it is maximal monotone by Fact 3.2.1.

We obtain an operator which fails to satisfy any of the monotone properties besides monotonicity itself by adding $Q$ to a product-space composition of $g$ and $0$. For visualization of Example 5.0.13, see Figure 5.2.

Example 5.0.13 (00000) Define $T : \mathbb{R}^2 \to \mathbb{R}^2$ by $T := Q + g_1 \times 0_1$, so that
\[ T(x_1, x_2) = \begin{cases} (-1 - x_2, x_1), & x_1 < 0, \\ (-x_2, 0), & x_1 = 0, \\ (1 - x_2, x_1), & x_1 > 0. \end{cases} \tag{5.0.12} \]

[Figure 5.2: Visualization of Example 5.0.13. For $(x_1^*, x_2^*) \in T(x_1, x_2)$, height on left is $x_1^*$, colour on left is $x_2^*$, height on right is $r$, colour on right is $\theta$ where $(x_1^*, x_2^*) = r(\cos\theta, \sin\theta)$.]

The operator $T$ is monotone but is not paramonotone, $3^*$-monotone, or maximal monotone.

Proof. $T$ is monotone since $Q$, $g_1$, and $0_1$ are all monotone operators. Since $g_1 : \mathbb{R} \to \mathbb{R}$ is not maximal monotone, neither is $g_1 \times 0_1$.
Since $\operatorname{dom} Q = \operatorname{dom}(g_1 \times 0_1) = \mathbb{R}^2$, $T$ is not maximal monotone by Remark 3.6.5.

The operator $T$ fails to be paramonotone since, for instance, when $x = (1, 2)$ and $y = (1, 0)$,
\[ \langle x - y, Tx - Ty\rangle = \langle (0, 2), (-1, 1) - (1, 1)\rangle = 0, \]
even though $x \neq y$ and $Tx \neq Ty$. For any $k > 0$, let $x = (0, -1)$, $y = (k, 0)$ and $z = (0, 1)$, so that $Tx = (1, 0)$ and $Ty = (1, k)$. Now,
\[ \langle z - y, Ty - Tx\rangle = \langle (-k, 1), (0, k)\rangle = k, \tag{5.0.13} \]
and so $\sup_{(y, y^*) \in \operatorname{gra} T} \langle z - y, y^* - Tx\rangle = +\infty$. Therefore, by definition, $T$ is not $3^*$-monotone.

Linear combinations of $\mathrm{Id}$ and $Q$ result in rotation operators, which allows us to determine the $n$-cyclic monotonicity of the result as a corollary to Proposition 7.1 in [6].

Proposition 5.0.14 Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be given by $T = a\,\mathrm{Id} + bQ$, with constants $a > 0$ and $b \in \mathbb{R}$. Then, $T$ is $n$-cyclic monotone if and only if $|\arctan(b/a)| \le \pi/n$.

Proof. Define the rotation operator $R_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ to be
\[ R_\theta(x_1, x_2) = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}. \tag{5.0.14} \]
Then, $R_\theta = \cos(\theta)\,\mathrm{Id} + \sin(\theta)\,Q$, and as long as $a > 0$, $a\,\mathrm{Id} + bQ$ is monotone and
\[ a\,\mathrm{Id} + bQ = \sqrt{a^2 + b^2}\; R_{\arctan(b/a)}. \tag{5.0.15} \]
By Proposition 7.1 in [6], $R_\theta$ is $n$-cyclic monotone if and only if $\theta \in [-\pi/n, \pi/n]$, and therefore if and only if $|\arctan(b/a)| \le \pi/n$.

The following example resembles Example 3.5 in [12], which demonstrated a $3^*$-monotone operator that was not paramonotone, although the operator $A : \mathbb{R}^2 \to \mathbb{R}^2$ from that example was multivalued and its domain was the closed unit ball.

Example 5.0.15 (00010) Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by $T := Q + \Phi$, so that
\[ T(x_1, x_2) = \begin{cases} (-x_2, x_1), & \|x\| \le 1, \\ (x_1 - x_2, x_1 + x_2), & \|x\| > 1. \end{cases} \tag{5.0.16} \]
The operator $T$ is $3^*$-monotone, but is not 3-cyclic monotone, not paramonotone, and not maximal monotone.

Proof. $T$ is monotone since both $\Phi$ and $Q$ are monotone operators. Since $\Phi$ is not maximal monotone, neither is $T$ by Remark 3.6.5. Since $Tx = Qx$ when $\|x\| \le 1$, it follows that $T$ is not paramonotone and not 3-cyclic monotone.
To demonstrate this, let $x = (0, 0)$, $y = (1/2, 0)$ and $z = (0, 1/2)$, so that $Tx = (0, 0)$, $Ty = Qy = (0, 1/2)$, and $Tz = (-1/2, 0)$. Then
\[ \langle x - y, Tx - Ty\rangle = \langle (-1/2, 0), (0, -1/2)\rangle = 0, \]
yet $Ty = (0, 1/2) \neq Tx$, and so $T$ is not paramonotone. Also,
\[ \langle x - y, Tx\rangle + \langle y - z, Ty\rangle + \langle z - x, Tz\rangle = 0 + \langle (1/2, -1/2), (0, 1/2)\rangle + \langle (0, 1/2), (-1/2, 0)\rangle = -1/4 < 0, \]
violating the condition required by 3-cyclic monotonicity.

[Figure 5.3: Visualization of Example 5.0.15. For $(x_1^*, x_2^*) \in T(x_1, x_2)$, height on left is $x_1^*$, colour on left is $x_2^*$, height on right is $r$, colour on right is $\theta$ where $(x_1^*, x_2^*) = r(\cos\theta, \sin\theta)$.]

Now, for any $x, z \in \mathbb{R}^2$,
\[
\begin{aligned}
\sup_{(y, y^*) \in \operatorname{gra} T} \langle z - y, y^* - Tx\rangle
&= \max\Big\{ \sup_{(y, y^*) \in \operatorname{gra} T;\ \|y\| \le 1} \langle z - y, y^* - Tx\rangle,\ \sup_{(y, y^*) \in \operatorname{gra} T;\ \|y\| > 1} \langle z - y, y^* - Tx\rangle \Big\} \\
&\le \max\Big\{ \sup_{(y, y^*) \in \operatorname{gra} T;\ \|y\| \le 1} (\|y\| + \|z\|)(\|y^*\| + \|Tx\|),\ \sup_{(y, y^*) \in \operatorname{gra}(\mathrm{Id} + Q);\ \|y\| > 1} \langle z - y, y^* - Tx\rangle \Big\}.
\end{aligned}
\tag{5.0.17}
\]
The first supremum in (5.0.17) is bounded since $y$, and therefore $Ty$, are both bounded in norm when $\|y\| \le 1$. By Proposition 5.0.14, the operator $\mathrm{Id} + Q$ is 3-cyclic monotone since $\arctan(1) = \pi/4 \le \pi/3$. Therefore, the second supremum of (5.0.17) is also bounded by the $3^*$-monotonicity of $\mathrm{Id} + Q$, and so $T$ is $3^*$-monotone.

Example 5.0.16 (00011) Let $T := Q + T_S$ where $T_S : \mathbb{R}^2 \to \mathbb{R}^2$ is defined by
\[ T_S x := \begin{cases} 0, & \|x\| \le 1, \\ x - \dfrac{x}{\|x\|}, & \|x\| \ge 1, \end{cases} \]
so that
\[ T(x_1, x_2) = \begin{cases} (-x_2, x_1), & \|x\| \le 1, \\ (x_1 - x_2, x_1 + x_2) - \dfrac{(x_1, x_2)}{\|x\|}, & \|x\| \ge 1. \end{cases} \tag{5.0.18} \]
The operator $T$ is neither 3-cyclic monotone nor paramonotone; however, $T$ is maximal monotone and $3^*$-monotone.

Proof. As in Example 5.0.15, $T$ is neither 3-cyclic monotone nor paramonotone since $Tx = Qx$ when $\|x\| \le 1$.
Define $g : \mathbb{R} \to \mathbb{R}$ by
\[ g(t) := \begin{cases} 0, & t \le 1, \\ t - 1, & t > 1. \end{cases} \tag{5.0.19} \]
The function $g$ is monotone, single valued, and continuous, and so it is maximal monotone by Fact 3.2.1. Now, $T_S$ equals the $T$ from Proposition 3.8.1 as generated using $g$ from (5.0.19). Therefore, $T_S$ is maximal monotone. Since $T$ is the sum of two maximal monotone operators each with full domain, it is maximal monotone by Fact 3.6.3.

To demonstrate that $T$ is $3^*$-monotone, first note that for any $y \in \mathbb{R}^2$, we have that $\|Qy\| = \|y\|$ and $\langle y, Qy\rangle = 0$, and so for any $k \in \mathbb{R}$,
\[ \|ky + Qy\|^2 = \langle ky + Qy, ky + Qy\rangle = (k^2 + 1)\|y\|^2. \]
Now, consider any $x, z \in \mathbb{R}^2$. If $\|y\| \le 1$, then by the Cauchy-Schwarz inequality,
\[ \langle z - y, Ty - Tx\rangle = \langle z - y, Qy - Tx\rangle \le (\|z\| + \|Tx\|)\|y\| + \|z\|\,\|Tx\|. \tag{5.0.20} \]
If $\|y\| \ge 1$, then
\[
\begin{aligned}
\langle z - y, Ty - Tx\rangle
&= \Big\langle z - y, \Big(1 - \frac{1}{\|y\|}\Big) y + Qy - Tx \Big\rangle \\
&\le \|z\|\, \Big\| \Big(1 - \frac{1}{\|y\|}\Big) y + Qy \Big\| + \|y\|\,\|Tx\| + \|z\|\,\|Tx\| - \Big(1 - \frac{1}{\|y\|}\Big)\|y\|^2
\end{aligned}
\tag{5.0.21}
\]
\[ \le \big(\sqrt{2}\,\|z\| + \|Tx\|\big)\|y\| + \|z\|\,\|Tx\|. \tag{5.0.22} \]
Let $k := 4\max\{\|z\|, \|Tx\|, 1\}$. By (5.0.21), if $\|y\| \ge k$, then
\[ \langle z - y, Ty - Tx\rangle \le \Big(\sqrt{2}\,\|z\| + \|Tx\| - \frac{3}{4}\|y\|\Big)\|y\| + \|z\|\,\|Tx\| < \|z\|\,\|Tx\|. \tag{5.0.23} \]

[Figure 5.4: Visualization of Example 5.0.16. For $(x_1^*, x_2^*) \in T(x_1, x_2)$, height on left is $x_1^*$, colour on left is $x_2^*$, height on right is $r$, colour on right is $\theta$ where $(x_1^*, x_2^*) = r(\cos\theta, \sin\theta)$.]

Combining (5.0.20), (5.0.22), and (5.0.23),
\[
\begin{aligned}
\sup_{(y, y^*) \in \operatorname{gra} T} \langle z - y, y^* - Tx\rangle
&= \max\Big\{ \sup_{(y, y^*) \in \operatorname{gra} T;\ \|y\| \le k} \langle z - y, Ty - Tx\rangle,\ \sup_{(y, y^*) \in \operatorname{gra} T;\ \|y\| > k} \langle z - y, Ty - Tx\rangle \Big\} \\
&\le \max\Big\{ \sup_{(y, y^*) \in \operatorname{gra} T;\ \|y\| \le k} \big(\sqrt{2}\,\|z\| + \|Tx\|\big)\|y\| + \|z\|\,\|Tx\|,\ \sup_{(y, y^*) \in \operatorname{gra} T;\ \|y\| > k} \|z\|\,\|Tx\| \Big\} \\
&< +\infty.
\end{aligned}
\tag{5.0.24}
\]
Hence, by definition, $T$ is $3^*$-monotone.
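The two computations behind Examples 5.0.15 and 5.0.16 can be replayed numerically. The sketch below is illustrative only: it evaluates the 3-cyclic sum at the points $x = (0,0)$, $y = (1/2, 0)$, $z = (0, 1/2)$ used above, and then compares $\langle z - y_k, Ty_k - Tx\rangle$ along $y_k = (k, 0)$ for $Q$ alone and for $T = Q + T_S$ with $x = (0, 0)$ and $z = (0, 1)$ (these test points are a choice made here, following Example 5.0.12). The helper names `cyclic_sum` and `star_gap` are hypothetical.

```python
import math

def Q(x):
    """Skew operator (5.0.11)."""
    return (-x[1], x[0])

def T_S(x):
    """Spherically symmetric part of Example 5.0.16."""
    n = math.hypot(x[0], x[1])
    if n <= 1.0:
        return (0.0, 0.0)
    c = 1.0 - 1.0 / n
    return (c * x[0], c * x[1])

def T(x):
    """T = Q + T_S, as in (5.0.18)."""
    q, s = Q(x), T_S(x)
    return (q[0] + s[0], q[1] + s[1])

def inner(u, v):
    return u[0] * v[0] + u[1] * v[1]

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])

def cyclic_sum(op, x, y, z):
    """<x - y, op x> + <y - z, op y> + <z - x, op z>; 3-cyclic needs >= 0."""
    return (inner(sub(x, y), op(x)) + inner(sub(y, z), op(y))
            + inner(sub(z, x), op(z)))

def star_gap(op, x, y, z):
    """<z - y, op y - op x>, the quantity bounded in 3*-monotonicity."""
    return inner(sub(z, y), sub(op(y), op(x)))

cyc = cyclic_sum(T, (0.0, 0.0), (0.5, 0.0), (0.0, 0.5))

ks = (1.0, 10.0, 100.0)
gaps_Q = [star_gap(Q, (0.0, 0.0), (k, 0.0), (0.0, 1.0)) for k in ks]
gaps_T = [star_gap(T, (0.0, 0.0), (k, 0.0), (0.0, 1.0)) for k in ks]
print(cyc, gaps_Q, gaps_T)
```

Along this ray the gap for $Q$ grows like $k$, while for $T$ it equals $-k^2 + 2k$ when $k \ge 1$ and so stays bounded above, in line with (5.0.23).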
It is easy to check that $T_S$ from Example 5.0.16 is the subgradient of the convex function
\[ f(x) := \begin{cases} 0, & \|x\| \le 1, \\ \frac{1}{2}\|x\|^2 - \|x\| + \frac{1}{2}, & \|x\| \ge 1. \end{cases} \]

[Figure 5.5: Comparing $T_S$ with $Q + T_S$ in Example 5.0.16.]

All spherically symmetric monotone operators generated in the manner of Proposition 3.8.1 are 3-cyclic monotone and paramonotone. Indeed, the operators generated in this manner above show that strict monotonicity and maximal monotonicity can each hold or fail independently. It remains to be shown if there is a 3-cyclic monotone operator that is not paramonotone. If one exists, it cannot be strictly monotone, and by Proposition 3.1.3, it cannot be maximal monotone. Such an operator follows.

Example 5.0.17 (00110) The operator $T : \mathbb{R}^2 \to \mathbb{R}^2$ defined by
\[ x \mapsto \begin{cases} (1, 0), & x_1 > 0, \\ (\frac{1}{2}, 0), & x_1 = 0,\ x_2 > 0, \\ (0, 0), & x_1 = 0,\ x_2 = 0, \\ (-\frac{1}{2}, 0), & x_1 = 0,\ x_2 < 0, \\ (-1, 0), & x_1 < 0 \end{cases} \tag{5.0.25} \]
is 3-cyclic monotone, but is neither maximal monotone nor paramonotone.

Proof. The operator $T$ is a single valued selection of the multivalued mapping $\partial f$ for the convex function $f(x_1, x_2) := |x_1|$, that is $\operatorname{gra} T \subsetneq \operatorname{gra} \partial f$, where
\[ (\partial f)(x) = \begin{cases} (1, 0), & x_1 > 0, \\ \{(k, 0) : k \in [-1, 1]\}, & x_1 = 0, \\ (-1, 0), & x_1 < 0. \end{cases} \tag{5.0.26} \]
As such, $T$ is monotone but not maximal monotone.
Since the graph of $T$ is a subset of the graph of $\partial f$, $T$ is 3-cyclic monotone.

[Figure 5.6: Visualization of Example 5.0.17. For $(x_1^*, x_2^*) \in T(x_1, x_2)$, height on left is $x_1^*$, colour on left is $x_2^*$, height on right is $r$, colour on right is $\theta$ where $(x_1^*, x_2^*) = r(\cos\theta, \sin\theta)$.]

Finally, consider the points $(0, 1), (0, -1) \in \mathbb{R}^2$, where
\[ \langle (0, 1) - (0, -1), T(0, 1) - T(0, -1)\rangle = \langle (0, 2), (1/2, 0) - (-1/2, 0)\rangle = 0, \]
yet $(-\frac{1}{2}, 0) \notin T(0, 1)$ and $(\frac{1}{2}, 0) \notin T(0, -1)$. In this way, $T$ is not paramonotone.

Even though a linear operator on $\mathbb{R}^n$ is $3^*$-monotone if and only if it is paramonotone [6], various examples of paramonotone but not $3^*$-monotone linear operators in infinite dimensional Hilbert space appear in [12] and [37]. The operators defined below demonstrate that the independence between paramonotonicity and $3^*$-monotonicity can be shown for nonlinear operators in $\mathbb{R}^2$ and $\mathbb{R}^3$.

Example 5.0.18 (11000) Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be the operator defined by $T := T_S + Q$ where $T_S : \mathbb{R}^2 \to \mathbb{R}^2$ is defined by
\[ T_S x := \begin{cases} 0, & x = 0, \\ \dfrac{2\|x\| + 1}{\|x\| + 1}\,\dfrac{x}{\|x\|}, & x \neq 0, \end{cases} \tag{5.0.27} \]
and $Q$ is as defined in Example 5.0.12. This operator $T$ is strictly monotone; however, it is neither maximal monotone nor $3^*$-monotone.

Proof. Note that $T_S$ is of the form described in Proposition 3.8.1, where $g : \mathbb{R} \to \mathbb{R}$ is defined as
\[ g(t) := \begin{cases} 0, & t \le 0, \\ 1 + \dfrac{t}{t + 1}, & t > 0. \end{cases} \tag{5.0.28} \]
For all $a, b > 0$,
\[ \langle a - b, g(a) - g(b)\rangle = (a - b)\Big(\frac{2a + 1}{a + 1} - \frac{2b + 1}{b + 1}\Big) = \frac{(a - b)^2}{(a + 1)(b + 1)} \ge 0. \]
Hence, by Proposition 3.8.1, $T_S$ is cyclical monotone and is strictly monotone since $g$ is strictly monotone. Since strict monotonicity dominates under addition (Remark 3.6.1), $T$ is strictly monotone.
Consider any point $y \in \mathbb{R}^2$ such that $0 < \|y\| \le 1$. Then, $(0, y)$ monotonically extends $\operatorname{gra} T_S$ since for all nonzero $x \in \mathbb{R}^2$ (the case $x = 0$ being trivial),
\[
\langle x - 0,\ T_S x - y \rangle
= \left\langle x,\ \left(1 + \frac{\|x\|}{\|x\| + 1}\right) \frac{x}{\|x\|} - y \right\rangle
\ge \left(1 + \frac{\|x\|}{\|x\| + 1}\right) \|x\| - \|x\|
= \frac{\|x\|^2}{\|x\| + 1} \ge 0.
\]
Therefore, $T_S$ is not maximal monotone, and so by Remark 3.6.5, neither is $T$.

To demonstrate that $T$ is not 3∗-monotone, let $z := (0, 3)$, $x = (0, 0)$, and $y_k := (k, 0)$, where $k \in \mathbb{N}$, so that $T y_k = \left(\frac{2k+1}{k+1}, k\right)$ and $T x = (0, 0)$. Then,
\[
\lim_{k \to +\infty} \langle z - y_k,\ T y_k - T x \rangle
= \lim_{k \to +\infty} \left\langle (-k, 3),\ \left(\frac{2k + 1}{k + 1}, k\right) \right\rangle
= \lim_{k \to +\infty} \left( 3k - \frac{2k^2 + k}{k + 1} \right)
\ge \lim_{k \to +\infty} k = +\infty,
\]
and so $T$ is not 3∗-monotone.

Figure 5.7: Visualization of Example 5.0.18. For $(x_1^*, x_2^*) \in T(x_1, x_2)$, height on left is $x_1^*$, colour on left is $x_2^*$, height on right is $r$, colour on right is $\theta$, where $(x_1^*, x_2^*) = r(\cos\theta, \sin\theta)$.

Example 5.0.19 (10000) Let $T_M : \mathbb{R}^2 \to \mathbb{R}^2$ be the operator $T_S + Q$ from Example 5.0.18, and consider the operator $T : \mathbb{R}^3 \to \mathbb{R}^3$ defined by $T := T_M \times h_1$, so that, for $x = (x_1, x_2, x_3)$ and $\pi := x \mapsto (x_1, x_2)$,
\[
T x = \begin{pmatrix}
\dfrac{2\|\pi x\| + 1}{\|\pi x\|^2 + \|\pi x\|} & -1 & 0 \\
1 & \dfrac{2\|\pi x\| + 1}{\|\pi x\|^2 + \|\pi x\|} & 0 \\
0 & 0 & \min\left\{1, \dfrac{1}{|x_3|}\right\}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}. \tag{5.0.29}
\]
Then, $T$ is paramonotone, but is neither strictly monotone, nor 3-cyclic monotone, nor 3∗-monotone, nor maximal monotone.

Proof. Since $h_1$ is monotone and has binary label 10111 and $T_M$ is monotone and has binary label 11000, $T$ is monotone and has binary label 10000, as it is formed by the product space composition of these two operators.

Example 5.0.20 (11001) Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be the operator defined by $T := T_S + Q$, where $T_S : \mathbb{R}^2 \to \mathbb{R}^2$ is defined by
\[
T_S x := \frac{x}{\|x\| + 1}, \tag{5.0.30}
\]
and $Q$ is the skew symmetric operator from Example 5.0.12.
This operator $T$ is strictly monotone and maximal monotone, but is not 3∗-monotone.

Proof. As in Example 5.0.18, the operator $T_S$ can be constructed in the manner of Proposition 3.8.1, where $g : \mathbb{R} \to \mathbb{R}$ is defined as
\[
g(t) := \begin{cases} 0, & t \le 0, \\ \dfrac{t}{t + 1}, & t > 0. \end{cases} \tag{5.0.31}
\]

Figure 5.8: Visualization of Example 5.0.20. For $(x_1^*, x_2^*) \in T(x_1, x_2)$, height on left is $x_1^*$, colour on left is $x_2^*$, height on right is $r$, colour on right is $\theta$, where $(x_1^*, x_2^*) = r(\cos\theta, \sin\theta)$.

Since $g$ is single-valued and continuous, it is maximal monotone by Fact 3.2.1. For all $a, b \ge 0$,
\[
\langle a - b,\ g(a) - g(b) \rangle = (a - b)\left( \frac{a}{a + 1} - \frac{b}{b + 1} \right) = \frac{(a - b)^2}{(a + 1)(b + 1)} \ge 0.
\]
Since $g$ is also strictly monotone, $T_S$ is both strictly monotone and maximal monotone by Proposition 3.8.1. On the other hand, $Q$ is maximal monotone but not strictly monotone. From Section 3.6, we know that $T = T_S + Q$ is both strictly monotone and maximal monotone.

To demonstrate that $T$ is not 3∗-monotone, let $z = (0, 2)$, $x = (0, 0)$, and $y_k := (k, 0)$, where $k \in \mathbb{N}$, so that $T x = (0, 0)$ and $T y_k = \left(\frac{k}{k+1}, k\right)$. Then,
\[
\lim_{k \to +\infty} \langle z - y_k,\ T y_k - T x \rangle
= \lim_{k \to +\infty} \left\langle (-k, 2),\ \left(\frac{k}{k + 1}, k\right) \right\rangle
= \lim_{k \to +\infty} \left( 2k - \frac{k^2}{k + 1} \right)
\ge \lim_{k \to +\infty} k = +\infty,
\]
and so $T$ is not 3∗-monotone.

Example 5.0.21 (10001) Let $T_M : \mathbb{R}^2 \to \mathbb{R}^2$ be the operator $T_S + Q$ from Example 5.0.20, and consider the operator $T : \mathbb{R}^3 \to \mathbb{R}^3$ defined by $T := T_M \times h_1$, where $h$ is as defined in Example 5.0.8. For $x = (x_1, x_2, x_3)$ and $\pi := x \mapsto (x_1, x_2)$,
\[
T(x_1, x_2, x_3) = \begin{pmatrix}
\dfrac{1}{\|\pi x\| + 1} & -1 & 0 \\
1 & \dfrac{1}{\|\pi x\| + 1} & 0 \\
0 & 0 & \min\left\{1, \dfrac{1}{|x_3|}\right\}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}. \tag{5.0.32}
\]
Then, $T$ is paramonotone and maximal monotone, but is neither strictly monotone, nor 3-cyclic monotone, nor 3∗-monotone.

Proof.
Since $h_1$ is monotone and has binary label 10111 and $T_M$ is monotone and has binary label 11001, $T$ is monotone and has binary label 10001, as it is formed by the product space composition of these two operators.

The final linear example follows: $\mathrm{Id} + 2Q$. It will be used in the construction of the remaining operators.

Example 5.0.22 (11011) Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by $T := \mathrm{Id} + 2Q$, so that
\[
T(x_1, x_2) = (x_1 - 2x_2,\ 2x_1 + x_2). \tag{5.0.33}
\]
The operator $T$ is strictly monotone, 3∗-monotone, and maximal monotone, but it is not 3-cyclic monotone.

Proof. From the results of Section 3.6, since $\mathrm{Id}$ is strictly monotone and since both $\mathrm{Id}$ and $Q$ are maximal monotone and have full domain, $T$ is strictly monotone and maximal monotone. By Proposition 5.0.14, since $\arctan 2 > \pi/3$, $T$ is not 3-cyclic monotone. Since all paramonotone linear operators are also 3∗-monotone in $\mathbb{R}^2$ (Remark 4.11 in [6]), $T$ is 3∗-monotone.

Example 5.0.23 (10010) Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be defined by $T := (\mathrm{Id} + 2Q) \times g_1$, so that
\[
T(x_1, x_2, x_3) = \begin{cases}
(x_1 - 2x_2,\ 2x_1 + x_2,\ 1), & x_3 > 0, \\
(x_1 - 2x_2,\ 2x_1 + x_2,\ 0), & x_3 = 0, \\
(x_1 - 2x_2,\ 2x_1 + x_2,\ -1), & x_3 < 0.
\end{cases} \tag{5.0.34}
\]
The operator $T$ is paramonotone and 3∗-monotone, but not strictly monotone, not 3-cyclic monotone, and not maximal monotone.

Proof. Since $g_1$ has binary label 10110 and since $\mathrm{Id} + 2Q$ has binary label 11011, $T$ has binary label 10010, as it is formed by the product space composition of these two operators.

Figure 5.9: Visualization of Example 5.0.22. For $(x_1^*, x_2^*) \in T(x_1, x_2)$, height on left is $x_1^*$, colour on left is $x_2^*$, height on right is $r$, colour on right is $\theta$, where $(x_1^*, x_2^*) = r(\cos\theta, \sin\theta)$.
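The product-space proofs in this part of the chapter (Examples 5.0.19, 5.0.21, 5.0.23, and 5.0.24) combine binary labels digit by digit. The following Python sketch illustrates that bookkeeping. It assumes, as the uses of Section 3.3 in this chapter suggest but as is not restated here, that a product operator has one of the five properties exactly when both factors have it; the helper name `product_label` is hypothetical.

```python
def product_label(a: str, b: str) -> str:
    """Digit-wise AND of two binary labels (PM, SM, 3CM, 3*, MM).

    Assumption: a product T1 x T2 has a property exactly when both
    factors have it, so each digit of the result is the AND of the inputs.
    """
    if len(a) != len(b):
        raise ValueError("labels must have the same length")
    return "".join("1" if x == "1" and y == "1" else "0" for x, y in zip(a, b))


# Labels of the factors, as quoted in the proofs.
H1 = "10111"          # h1
G1 = "10110"          # g1
ID_PLUS_2Q = "11011"  # Id + 2Q (Example 5.0.22)

examples = {
    "5.0.19": product_label("11000", H1),     # T_M from Example 5.0.18
    "5.0.21": product_label("11001", H1),     # T_M from Example 5.0.20
    "5.0.23": product_label(ID_PLUS_2Q, G1),
    "5.0.24": product_label(ID_PLUS_2Q, H1),
}
for name, label in examples.items():
    print(f"Example {name}: {label}")
```

Under the stated assumption, the computed labels agree with the labels attached to these four examples in the text.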
Example 5.0.24 (10011) Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be defined by $T := (\mathrm{Id} + 2Q) \times h_1$, so that
\[
T(x_1, x_2, x_3) = \begin{cases}
(x_1 - 2x_2,\ 2x_1 + x_2,\ 1), & x_3 > 1, \\
(x_1 - 2x_2,\ 2x_1 + x_2,\ x_3), & |x_3| \le 1, \\
(x_1 - 2x_2,\ 2x_1 + x_2,\ -1), & x_3 < -1.
\end{cases} \tag{5.0.35}
\]
The operator $T$ is paramonotone but not strictly monotone, 3∗-monotone but not 3-cyclic monotone, and maximal monotone.

Proof. Since $h_1$ has the binary label 10111, and since $\mathrm{Id} + 2Q$ has the binary label 11011, $T$ has binary label 10011, as it is formed by the product space composition of these two operators.

Example 5.0.25 (11010) Define $T : \mathbb{R}^2 \to \mathbb{R}^2$ by $T := \mathrm{Id} + 2Q + g_2$, so that for all $x = (x_1, x_2) \in \mathbb{R}^2$,
\[
T(x_1, x_2) = \begin{cases}
(0, 0), & x = 0, \\
\left( \left(1 + \dfrac{1}{\|x\|}\right) x_1 - 2x_2,\ 2x_1 + \left(1 + \dfrac{1}{\|x\|}\right) x_2 \right), & x \neq 0.
\end{cases} \tag{5.0.36}
\]
Then, $T$ is strictly monotone and 3∗-monotone, but is neither 3-cyclic monotone nor maximal monotone.

Figure 5.10: Visualization of Example 5.0.25. For $(x_1^*, x_2^*) \in T(x_1, x_2)$, height on left is $x_1^*$, colour on left is $x_2^*$, height on right is $r$, colour on right is $\theta$, where $(x_1^*, x_2^*) = r(\cos\theta, \sin\theta)$.

Proof. By the results of Section 3.6, since $\mathrm{Id} + 2Q$ is strictly monotone, since both $\mathrm{Id} + 2Q$ and $g_2$ are 3∗-monotone, and since both have full domain but $g_2$ is not maximal monotone, $T$ is strictly monotone, 3∗-monotone, and not maximal monotone.

For proof that $T$ is not 3-cyclic monotone, consider the points $x = (0, 0)$, $y = (20, 0)$, and $z = (10, 10\sqrt{3})$. Applying the operator yields $T x = (0, 0)$ and $T y = (21, 40)$. Note that since $\langle z, Qz \rangle = 0$, $\langle z, T z \rangle = \|z\|^2 + \|z\| = 420$. Therefore, as $\sqrt{3} \ge 8/5$,
\[
\langle x - y, T x \rangle + \langle y - z, T y \rangle + \langle z - x, T z \rangle
= \langle (10, -10\sqrt{3}),\ (21, 40) \rangle + 420
= 630 - 400\sqrt{3} < 0,
\]
and so $T$ is not 3-cyclic monotone.
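The 3-cyclic violation in the proof of Example 5.0.25 can be reproduced numerically. The sketch below codes $T$ directly from (5.0.36), with $Q(x_1, x_2) = (-x_2, x_1)$ as read off from (5.0.33); the helper names `dot`, `sub`, and `cyclic_sum` are hypothetical, and the cyclic sum uses the sign convention of the displayed computation in the proof.

```python
import math


def T(x):
    """T = Id + 2Q + g2 as written in (5.0.36)."""
    x1, x2 = x
    r = math.hypot(x1, x2)
    if r == 0.0:
        return (0.0, 0.0)
    s = 1.0 + 1.0 / r
    return (s * x1 - 2.0 * x2, 2.0 * x1 + s * x2)


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])


def cyclic_sum(op, x, y, z):
    """<x - y, Tx> + <y - z, Ty> + <z - x, Tz>.

    A negative value for some triple shows that `op` is not
    3-cyclic monotone (convention taken from the proof above).
    """
    return (dot(sub(x, y), op(x))
            + dot(sub(y, z), op(y))
            + dot(sub(z, x), op(z)))


x, y, z = (0.0, 0.0), (20.0, 0.0), (10.0, 10.0 * math.sqrt(3.0))
value = cyclic_sum(T, x, y, z)
closed_form = 630.0 - 400.0 * math.sqrt(3.0)
print(value, closed_form)
```

Both printed numbers agree up to rounding and are negative, matching the closed form $630 - 400\sqrt{3}$.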
We now have a full zoo of examples for each possible monotone class combination, which are summarized in Table 5.1 and Figure 5.11. However, to fully represent the possibilities for linear operators, we require a few more examples. Note for these linear operators $A$ that while they meet all other criteria listed in Remark 5.0.7, $A^{-1}(0)$ may be multivalued.

Example 5.0.26 (10111) The orthogonal projection $A : \mathbb{R}^2 \to \mathbb{R}^2$ defined by $A(x_1, x_2) := (x_1, 0)$ is maximal monotone, paramonotone, 3-cyclic monotone, and 3∗-monotone, but it is not strictly monotone.

Proof. Using the notation of Section 3.3, we have that $A = \mathrm{Id} \times 0$, where $0 : \mathbb{R} \to \mathbb{R}$ is the zero operator and $\mathrm{Id} : \mathbb{R} \to \mathbb{R}$ is the identity. The zero operator is maximal monotone, paramonotone, 3-cyclic monotone, and 3∗-monotone, as is $\mathrm{Id}$, which is also strictly monotone, while the zero operator is not. The properties of $A$ follow directly from the results in Section 3.3.

All relationships among the classes of monotone linear operators in $\mathbb{R}^2$ are now known completely and are summarized in Table 5.2. Recall that all monotone linear operators are assumed to have full domain and are therefore maximal monotone by Proposition 4.0.8.

In Remark 4.3.6 we noted that the converse of Proposition 3.1.3 holds for monotone linear operators on $\mathbb{R}^2$ that are not strictly monotone. We now demonstrate that this result does not generalize to $\mathbb{R}^3$.

Example 5.0.27 Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be the linear operator defined by
\[
T x := \begin{pmatrix} 1 & -2 & 1 \\ 3 & 1 & 3 \\ 1 & -2 & 1 \end{pmatrix} x. \tag{5.0.37}
\]
The operator $T$ is paramonotone and maximal monotone, but not strictly monotone. Further, $T$ is not 3-cyclic monotone, but is 3∗-monotone.

Proof. The symmetric part of $T$ is
\[
T_+ := \begin{pmatrix} 1 & 1/2 & 1 \\ 1/2 & 1 & 1/2 \\ 1 & 1/2 & 1 \end{pmatrix}.
\]
Since the eigenvalues of $T_+$, namely $\{0, \tfrac{1}{2}(3 + \sqrt{3}), \tfrac{1}{2}(3 - \sqrt{3})\}$, are nonnegative, $T_+$ is positive semidefinite, hence monotone, and so $T$ is monotone.

An elementary calculation yields that $\ker T_+ = \{t(-1, 0, 1) : t \in \mathbb{R}\}$.
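The linear-algebra claims in this proof can be checked with exact rational arithmetic. The sketch below uses only the Python standard library; the helper names `matvec`, `dot`, `diff`, and `cyclic_sum` are hypothetical. It verifies that $(-1, 0, 1)$ is annihilated by both $T$ and $T_+$, computes the coefficients of the characteristic polynomial of $T_+$ (whose roots are the stated eigenvalues), and evaluates the 3-cyclic sum at the three points used later in the proof, assuming the sign convention of the computation in Example 5.0.25.

```python
from fractions import Fraction as F

T = [[F(1), F(-2), F(1)],
     [F(3), F(1), F(3)],
     [F(1), F(-2), F(1)]]

# Symmetric part T_+ = (T + T^T) / 2.
T_plus = [[(T[i][j] + T[j][i]) / 2 for j in range(3)] for i in range(3)]


def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def diff(u, v):
    return [a - b for a, b in zip(u, v)]


def cyclic_sum(A, x, y, z):
    """<x - y, Ax> + <y - z, Ay> + <z - x, Az> for a linear operator A."""
    return (dot(diff(x, y), matvec(A, x))
            + dot(diff(y, z), matvec(A, y))
            + dot(diff(z, x), matvec(A, z)))


v = [F(-1), F(0), F(1)]
kernel_ok = matvec(T, v) == [0, 0, 0] and matvec(T_plus, v) == [0, 0, 0]

# det(lambda*I - T_+) = lambda^3 - c2*lambda^2 + c1*lambda - c0.
P = T_plus
c2 = sum(P[i][i] for i in range(3))
c1 = sum(P[i][i] * P[j][j] - P[i][j] * P[j][i]
         for i in range(3) for j in range(i + 1, 3))
c0 = (P[0][0] * (P[1][1] * P[2][2] - P[1][2] * P[2][1])
      - P[0][1] * (P[1][0] * P[2][2] - P[1][2] * P[2][0])
      + P[0][2] * (P[1][0] * P[2][1] - P[1][1] * P[2][0]))

origin = [F(0), F(0), F(0)]
e1 = [F(1), F(0), F(0)]
e2 = [F(0), F(1), F(0)]
s = cyclic_sum(T, origin, e1, e2)
print(kernel_ok, c2, c1, c0, s)
```

The characteristic polynomial comes out as $\lambda^3 - 3\lambda^2 + \tfrac{3}{2}\lambda = \lambda(\lambda^2 - 3\lambda + \tfrac{3}{2})$, whose roots are $0$ and $\tfrac{1}{2}(3 \pm \sqrt{3})$, and the cyclic sum for the ordered triple $(0,0,0)$, $(1,0,0)$, $(0,1,0)$ is $-1$. The order of the points matters: swapping the last two gives a positive value.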
Since $T(-1, 0, 1) = 0$ and $T$ has rank 2, $\ker T = \ker T_+$, so by Proposition 4.2.6, $T$ is paramonotone. However, $T$ is not strictly monotone since the kernel contains more than the zero element.

Furthermore, $T$ is maximal monotone since it is linear and has full domain (Proposition 4.0.8). The operator $T$ is not 3-cyclic monotone since the points $(0, 0, 0)$, $(1, 0, 0)$, and $(0, 1, 0)$ do not satisfy the defining condition (2.1.5). (For a shortcut, call to mind Example 4.3.2 and Proposition 4.3.1.) Finally, since $T$ is a linear operator in $\mathbb{R}^3$ that is paramonotone, it is 3∗-monotone by Proposition 4.2.10.

Recall from Proposition 4.2.10 that linear paramonotone operators on $\mathbb{R}^n$ are 3∗-monotone. Example 5.0.28 below demonstrates that larger spaces are more permissive. A similar example appears in [12].

Example 5.0.28 Let $\theta_k := \pi/2 - 1/k^4$ and let $A : \ell^2 \to \ell^2$ be the linear operator defined by
\[
A x := \sum_{k=1}^{+\infty} \Big[ \big(\cos(\theta_k) x_{2k-1} - \sin(\theta_k) x_{2k}\big) e_{2k-1} + \big(\sin(\theta_k) x_{2k-1} + \cos(\theta_k) x_{2k}\big) e_{2k} \Big]. \tag{5.0.38}
\]
The structure of $A$ is such that every $x^* = A x$ obeys
\[
\begin{pmatrix} x^*_{2k-1} \\ x^*_{2k} \end{pmatrix} = R_{\theta_k} \begin{pmatrix} x_{2k-1} \\ x_{2k} \end{pmatrix} \tag{5.0.39}
\]
for all $x \in \ell^2$ and $k \in \mathbb{N}$, where $R_{\theta_k}$ is the rotation matrix as introduced in Example 4.3.3. The operator $A$ is strictly monotone and maximal monotone, but not 3∗-monotone. It follows that $A$ is also paramonotone but not 3-cyclic monotone.

Proof. The monotonicity of $A$ is evident from (5.0.39). Suppose that $x \in \ell^2$ is such that $\langle x, A x \rangle = 0$. Now,
\[
\langle x, A x \rangle = \sum_{k=1}^{+\infty} \cos(\theta_k) \left( x_{2k-1}^2 + x_{2k}^2 \right)
\]
is equal to zero if and only if $x = 0$, and so $A$ is strictly monotone. By Proposition 4.0.8, $A$ is maximal monotone since it is linear and has full domain.

Let $x = 0$, so that $A x = 0$, and let $z = \sum_{k=1}^{+\infty} \frac{1}{k} (e_{2k-1} + e_{2k})$. Define a sequence $y_n \in \ell^2$ by $y_n := n^2 e_{2n-1}$, so that $A y_n = n^2 \cos(\theta_n) e_{2n-1} + n^2 \sin(\theta_n) e_{2n}$. For all $n$, $0 < \cos(\theta_n) \le 1/n^4$, and from the Taylor series, $\sin(\theta_n) \ge 1 - 1/(2n^8)$ for all large $n$. Considering the inequality (2.1.7) for 3∗-monotonicity, we have
\[
\langle z - y_n,\ A y_n - A x \rangle
= n \big( \cos(\theta_n) + \sin(\theta_n) \big) - n^4 \cos(\theta_n)
\ge n \big( 0 + 1 - 1/(2n^8) \big) - 1 \to +\infty \quad \text{as } n \to +\infty, \tag{5.0.40}
\]
and so $A$ fails to be 3∗-monotone.

Remark 5.0.29 The operator $A$ from Example 5.0.28 can be modified to lose its strict monotonicity property by using the zero function $0 : \mathbb{R} \to \mathbb{R}$ as a prefactor in the product space, yielding $T = 0 \times A$. In this manner,
\[
T x := \sum_{k=1}^{+\infty} \Big[ \big(\cos(\theta_k) x_{2k} - \sin(\theta_k) x_{2k+1}\big) e_{2k} + \big(\sin(\theta_k) x_{2k} + \cos(\theta_k) x_{2k+1}\big) e_{2k+1} \Big]. \tag{5.0.41}
\]

Proof. The Hilbert space $\ell^2$ can be written as a product space $\ell^2 = \mathbb{R} \times \ell^2$. More precisely, all of these spaces can be embedded in the larger space $\ell^2(\mathbb{Z})$ with standard unit vectors $e_i$ for $i$ in $\mathbb{Z}$, the set of integers. In this setting $\ell^2 = \operatorname{span}\{e_i : i \in \mathbb{N}\}$, and let $V_0 = \operatorname{span}\{e_0\}$, so that $\ell^2(\mathbb{N} \cup \{0\}) = V_0 \times \ell^2$. Let $T = 0 \times A$, where $A$ is the linear operator from Example 5.0.28.

The operator $0 : V_0 \to V_0$ is paramonotone, maximal monotone, 3-cyclic monotone, and 3∗-monotone, but not strictly monotone on $\mathbb{R}$. The operator $A : \ell^2 \to \ell^2$ from Example 5.0.28 is strictly monotone and maximal monotone, but not 3∗-monotone. Therefore, by the results of Section 3.3, $T := 0 \times A$ is paramonotone and maximal monotone, and fails to be strictly monotone or 3∗-monotone.

Although the cases for the following three examples have already been considered, the first is included here as it is an important theoretical example. The latter two provide a means to create a variety of examples based on the rotation operator, so that operators with a precise angle of rotation, but satisfying specific monotone classes, can be created. For these latter two examples, the non-rotational operator from Example 4.3.2 ($\tilde{R}$) or the operator from Example 5.0.22 ($\mathrm{Id} + 2Q$) would also function as a replacement for $R_\theta$. Note that the following example does not have full domain, nor is the domain convex.
Example 5.0.30 ($M$ from [10]) The operator $M : \mathbb{R}^2 \to 2^{\mathbb{R}^2}$,
\[
x \mapsto \begin{cases}
[(0, 1), (0, -1)] + \mathbb{R}_+(1, 1) + \mathbb{R}_+(1, -1), & x = (1, 0), \\
(-t, 1 - t) + \mathbb{R}_+(1, 1), & x = (1 - t, t),\ 0 < t < 1, \\
(-1, 0) + \mathbb{R}_+(-1, 1) + \mathbb{R}_+(1, 1), & x = (0, 1), \\
(-1, 0) + \mathbb{R}_+(-1, 1), & x = (-t, 1 - t),\ 0 < t < 1, \\
[(-1, 0), (-1, -2)] + \mathbb{R}_+(-1, 1) + \mathbb{R}_+(-1, -1), & x = (-1, 0), \\
(t - 1, t - 2) + \mathbb{R}_+(-1, -1), & x = (t - 1, -t),\ 0 < t < 1, \\
(0, -1) + \mathbb{R}_+(-1, -1) + \mathbb{R}_+(1, -1), & x = (0, -1), \\
(0, -1) + \mathbb{R}_+(1, -1), & x = (t, t - 1),\ 0 < t < 1, \\
\emptyset, & \text{otherwise},
\end{cases} \tag{5.0.42}
\]
is (maximal) 3-cyclic monotone with the unit diamond as domain. However, it is not maximal monotone [10]. Furthermore, it is not paramonotone: consider the points $x = (1, 0)$ and $y = (0, 1)$. Now, $(0, 1) \in M x$ and $(-1, 0) \in M y$, so taking these values, $\langle (1, 0) - (0, 1),\ (0, 1) - (-1, 0) \rangle = 0$. However, $(-1, 0) \notin M x$, so $M$ is not paramonotone.

Example 5.0.31 Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be defined by $T = R_\theta \times g$, where $g : \mathbb{R} \to \mathbb{R}$ is either the sign operation
\[
g_1(x) := \operatorname{sgn}(x) = \begin{cases} -1, & x < 0, \\ 0, & x = 0, \\ 1, & x > 0, \end{cases}
\]
or
\[
g_2(x) := \operatorname{sgn}(x) + x,
\]
and $R_\theta$ is the rotation matrix from Example 4.3.3 with angle $\theta \in\ ]-\pi/2, \pi/2[$. In all cases, $T$ is paramonotone, 3∗-monotone, and not maximal monotone. When $g(x) := \operatorname{sgn}(x)$, $T$ is not strictly monotone, whereas it is strictly monotone when $g(x) := \operatorname{sgn}(x) + x$. When $|\theta| \in\ ]\pi/3, \pi/2[$, $T$ is not 3-cyclic monotone, whereas when $|\theta| \in [0, \pi/3]$ it is.

Proof. Recalling the results of Section 3.3, clearly $T$ is nonlinear and is not maximal monotone; nor, when $|\theta| > \pi/3$, is it 3-cyclic monotone. It is paramonotone since $g_1$, $g_2$, and $R_\theta$ are paramonotone.

When $g := g_1$, $T$ is not strictly monotone. That $T$ is strictly monotone when $g := g_2$ can be verified by Proposition 3.3.3 and the fact that $R_\theta$ is strictly monotone when $|\theta| < \pi/2$ (Example 4.3.3).
Since in either case $g$ is a one dimensional monotone operator, it is 3-cyclic monotone (Proposition 3.7.1), as is $R_\theta$ if $|\theta| \le \pi/3$ (Fact 4.3.4). Since the product of operators preserves $n$-cyclic monotonicity (Proposition 3.3.3), $T$ is 3-cyclic monotone when $|\theta| \le \pi/3$.

Furthermore, since $g$ is 3-cyclic monotone, it is also 3∗-monotone. Since $R_\theta$ is linear and paramonotone as long as $|\theta| < \pi/2$, it is also 3∗-monotone (Proposition 4.2.9). Since 3∗-monotonicity is preserved in the product space (Proposition 3.3.3), $T$ is 3∗-monotone.

Example 5.0.32 Define the nonlinear monotone operator $T : \mathbb{R}^2 \to \mathbb{R}^2$, where $x = (x_1, x_2)$, by
\[
T : x \mapsto R_\theta x + (\operatorname{sgn}(x_1), 0) = \begin{cases}
R_\theta x + (1, 0), & x_1 > 0, \\
R_\theta x, & x_1 = 0, \\
R_\theta x - (1, 0), & x_1 < 0,
\end{cases} \tag{5.0.43}
\]
where as before $R_\theta$ is the rotation matrix, $\operatorname{sgn}$ is as defined in Example 5.0.31, and $|\theta| \le \pi/2$. This operator $T$ is monotone but not maximal monotone. Furthermore,

(i) if $|\theta| \in [0, \pi/3]$, then $T$ is paramonotone, strictly monotone, 3-cyclic monotone, and 3∗-monotone;

(ii) if $|\theta| \in\ ]\pi/3, \pi/2[$, then $T$ is paramonotone, strictly monotone, and 3∗-monotone, but not 3-cyclic monotone;

(iii) if $|\theta| = \pi/2$, then $T$ is none of paramonotone, strictly monotone, 3-cyclic monotone, or 3∗-monotone.

Proof. Consider the partition of the domain $\mathbb{R}^2$ into three areas, defined by $x_1 > 0$, $x_1 = 0$, and $x_1 < 0$ for $(x_1, x_2) \in \mathbb{R}^2$. Recall Section 3.6 on the addition of monotone operators. $R_\theta$ is strictly monotone when $|\theta| < \pi/2$, so in each domain area, $T$ for such $\theta$ is strictly monotone.
When $x, y$ belong to distinct domain areas,
\[
\begin{aligned}
x_1 > 0,\ y_1 = 0: &\quad \langle x - y,\ (1, 0) + R_\theta x - R_\theta y \rangle = x_1 + \langle x - y,\ R_\theta x - R_\theta y \rangle > 0, \\
x_1 = 0,\ y_1 < 0: &\quad \langle x - y,\ (1, 0) + R_\theta x - R_\theta y \rangle = -y_1 + \langle x - y,\ R_\theta x - R_\theta y \rangle > 0, \\
x_1 > 0,\ y_1 < 0: &\quad \langle x - y,\ (2, 0) + R_\theta x - R_\theta y \rangle = 2(x_1 - y_1) + \langle x - y,\ R_\theta (x - y) \rangle > 0,
\end{aligned} \tag{5.0.44}
\]
and so $T$ as a whole is strictly monotone, and hence paramonotone, when $|\theta| < \pi/2$. When $|\theta| = \pi/2$, $T$ fails to be paramonotone within any domain area, simply because $\langle x - y,\ R_{\pm\pi/2}(x - y) \rangle = 0$ regardless of $x$ and $y$.

Similarly, $T$ fails to be 3∗-monotone when $|\theta| = \pi/2$. For any $k \in \mathbb{R}$, let $x = (0, -1)$, $y = (k, 0)$, and $z = (0, 1)$. Suppose $k > 0$ and let $\theta = \pi/2$. Then, $x^* = R_{\pi/2} x = (1, 0)$ and $y^* = R_{\pi/2} y + (1, 0) = (1, k)$. In this way,
\[
\langle z - y,\ y^* - x^* \rangle = \langle (-k, 1),\ (0, k) \rangle = k, \tag{5.0.45}
\]
and so $\sup_{(y, y^*) \in \operatorname{gra} T} \langle z - y,\ y^* - x^* \rangle = +\infty$. Now suppose $k < 0$ and let $\theta = -\pi/2$. Then, $x^* = (-1, 0)$ and $y^* = (-1, -k)$, and so $y^* - x^* = (0, -k)$. In this case,
\[
\langle z - y,\ y^* - x^* \rangle = -k,
\]
and so similarly $\sup_{(y, y^*) \in \operatorname{gra} T} \langle z - y,\ y^* - x^* \rangle = +\infty$. Hence, when $|\theta| = \pi/2$, $T$ is not 3∗-monotone.

When $|\theta| < \pi/2$, we know that $R_\theta$ is 3∗-monotone from the details of Example 4.3.3. Since $\operatorname{sgn}$ is also a 3∗-monotone operator, by Proposition 3.6.2, $T$ is 3∗-monotone when $|\theta| < \pi/2$.

By Fact 4.3.4, if $|\theta| \le \pi/3$, then $R_\theta$ is 3-cyclic monotone. Since the operator $x \mapsto (\operatorname{sgn}(x_1), 0)$ is in effect a one dimensional monotone operator, it is also 3-cyclic monotone by Proposition 3.7.1. Since 3-cyclic monotonicity is preserved under addition, $T$ is also 3-cyclic monotone for this range of $\theta$.

For each $\theta$, $T$ is not maximal monotone since the lack of maximal monotonicity dominates under addition (Remark 3.6.5), and $\operatorname{sgn}$ is not maximal monotone.

For the violation of the 3-cyclic monotone condition (2.1.5), let $x = (1, 0)$, $y = (3, 0)$, and $z = (2, \sqrt{3})$.
Then,
\[
\begin{aligned}
&\langle x - y, T x \rangle + \langle y - z, T y \rangle + \langle z - x, T z \rangle \\
&\quad = \langle (-2, 0),\ (1, 0) + R_\theta (1, 0) \rangle + \langle (1, -\sqrt{3}),\ (1, 0) + R_\theta (3, 0) \rangle + \langle (1, \sqrt{3}),\ (1, 0) + R_\theta (2, \sqrt{3}) \rangle \\
&\quad = -2 - 2\cos\theta + 1 + 3\cos\theta - 3\sqrt{3}\sin\theta + 1 + 2\cos\theta - \sqrt{3}\sin\theta + 2\sqrt{3}\sin\theta + 3\cos\theta \\
&\quad = 6\cos\theta - 2\sqrt{3}\sin\theta.
\end{aligned} \tag{5.0.46}
\]
When $\theta \in\ ]\pi/3, \pi/2]$, the above is negative, and so $T$ is not 3-cyclic monotone. When $\theta \in [-\pi/2, -\pi/3[$, the same computation with $z = (2, -\sqrt{3})$ gives $6\cos\theta + 2\sqrt{3}\sin\theta < 0$.

5.1 Summary and comparison

The examples of this chapter are completely representative of all possible monotone class combinations, as Table 5.1 demonstrates. Note that 3-cyclic monotonicity and maximal monotonicity are not enough to ensure strict monotonicity, let alone strong monotonicity, so that Proposition 3.1.3 can be considered a tight result.

It remains to be seen if paramonotone operators that are neither strictly monotone nor 3-cyclic monotone can exist with full domain on $\mathbb{R}^2$, though it is possible to show such examples $T$ where $\operatorname{dom} T \neq \mathbb{R}^2$. In all other cases, the dimension of $X$ in Table 5.1 is the lowest possible dimension an operator with that binary label can operate upon.

The operators used in these examples that are strictly monotone are mostly strongly monotone, with an exception being Example 5.0.28. For instance, the rotation operators $R_\theta$ are strongly monotone for all $\theta$ such that $|\theta| < \pi/2$, and strong monotonicity is preserved in the operator product.

Note that all linear operators are assumed to have full domain and are therefore maximal monotone by Proposition 4.0.8. Also, if a linear operator fails to be paramonotone, it fails to be 3∗-monotone and 3-cyclic monotone as well. The monotone class characterizations for linear operators in a Hilbert space are now also known completely, as summarized in Table 5.3 below. Therefore, all monotone class relationships are now known for general operators (Figure 5.11), linear operators (Figure 5.12), linear operators on $\mathbb{R}^n$ (Figure 5.13), and linear operators on $\mathbb{R}^2$ (Figure 5.14).
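The count of possible class combinations can be recovered mechanically from the three exclusion patterns listed in Table 5.1. The Python sketch below enumerates all 32 binary labels and discards those matching a pattern; the function name `is_excluded` is hypothetical, and the patterns are taken as given from the table (Fact 3.1.2, Proposition 3.1.3, Fact 3.1.1) rather than re-derived.

```python
from itertools import product


def is_excluded(bits):
    """Return True if the label matches an impossible pattern of Table 5.1.

    bits = (PM, SM, 3CM, 3*, MM), each 0 or 1.
    """
    pm, sm, cm3, star3, mm = bits
    if cm3 and not star3:        # pattern **10* (Fact 3.1.2)
        return True
    if cm3 and mm and not pm:    # pattern 0*1*1 (Proposition 3.1.3)
        return True
    if sm and not pm:            # pattern 01*** (Fact 3.1.1)
        return True
    return False


feasible = ["".join(map(str, bits))
            for bits in product((0, 1), repeat=5)
            if not is_excluded(bits)]
print(len(feasible))
print(feasible)
```

The enumeration leaves 17 labels, which are exactly the rows of Table 5.1 for which an example is listed.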
Since the only linear operators on $\mathbb{R}$ are multiplications by a constant, by the results shown in Table 5.2 and by Proposition 4.2.10, each example in Table 5.3 operates on a space with the lowest dimension for which its monotone class combination is possible. In particular, note how examples with binary label 1000, 1001, and 1100, although absent in Table 5.2, exist for spaces of higher dimension. Finally, for every operator $T$ in Tables 5.3 and 5.1, an operator with the same monotone class combination on any higher dimension can be constructed by a product space composition with $\mathrm{Id}$ (that is, $T \times \mathrm{Id}$).

Table 5.1: Monotone class relationships

PM  SM  3CM  3*  MM   X                 Result
0   0   0    0   0    $\mathbb{R}^2$    ∃ Example 5.0.13
0   0   0    0   1    $\mathbb{R}^2$    ∃ Example 5.0.12 ($Q$)
0   0   0    1   0    $\mathbb{R}^2$    ∃ Example 5.0.15
0   0   0    1   1    $\mathbb{R}^2$    ∃ Example 5.0.16
*   *   1    0   *                      ∅ Fact 3.1.2
0   *   1    *   1                      ∅ Proposition 3.1.3
0   0   1    1   0    $\mathbb{R}^2$    ∃ Example 5.0.17
0   1   *    *   *                      ∅ Fact 3.1.1
1   0   0    0   0    $\mathbb{R}^3$    ∃ Example 5.0.19
1   0   0    0   1    $\mathbb{R}^3$    ∃ Example 5.0.21
1   0   0    1   0    $\mathbb{R}^3$    ∃ Example 5.0.23
1   0   0    1   1    $\mathbb{R}^3$    ∃ Example 5.0.24
1   0   1    1   0    $\mathbb{R}^1$    ∃ Example 5.0.9 ($g$)
1   0   1    1   1    $\mathbb{R}^1$    ∃ Example 5.0.8 ($h$)
1   1   0    0   0    $\mathbb{R}^2$    ∃ Example 5.0.18
1   1   0    0   1    $\mathbb{R}^2$    ∃ Example 5.0.20
1   1   0    1   0    $\mathbb{R}^2$    ∃ Example 5.0.25
1   1   0    1   1    $\mathbb{R}^2$    ∃ Example 5.0.22 ($\mathrm{Id} + 2Q$)
1   1   1    1   0    $\mathbb{R}^1$    ∃ Example 5.0.10 ($m$)
1   1   1    1   1    $\mathbb{R}^1$    ∃ $\mathrm{Id}$

Where: 'PM' represents paramonotone, 'SM' represents strictly monotone, '3CM' represents 3-cyclic monotone, '3*' represents 3∗-monotone, 'MM' represents maximal monotone, 'X' represents the space of lowest dimension with a known example, 1 represents that the property is present, 0 represents an absence of that property, * represents that both 0/1 are covered by the result, ∃ represents that an example with these properties exists, and ∅ represents that this combination of properties is impossible.

Table 5.2: Monotone linear operators on $\mathbb{R}^2$: monotone class relationships.
PM  SM  3CM  3*   Result
0   0   0    0    ∃ Example 5.0.12 ($Q$)
0   *   *    1    ∅ Proposition 4.1.11
*   *   1    0    ∅ Fact 3.1.2
0   *   1    *    ∅ Proposition 3.1.3
0   1   *    *    ∅ Fact 3.1.1
1   *   *    0    ∅ Proposition 4.2.9
1   0   0    *    ∅ Remark 4.3.6
1   0   1    1    ∃ Example 5.0.26 ($(x_1, x_2) \mapsto (x_1, 0)$)
1   1   0    1    ∃ Example 5.0.22 ($\mathrm{Id} + 2Q$)
1   1   1    1    ∃ $\mathrm{Id}$

Where: 'PM' represents paramonotone, 'SM' represents strictly monotone, '3CM' represents 3-cyclic monotone, '3*' represents 3∗-monotone, 1 represents that the property is present, 0 represents an absence of that property, * represents that both 0/1 are covered by the result, ∃ represents that an example with these properties exists, and ∅ represents that this combination of properties is impossible.

Figure 5.11: General monotone operators: monotone class relationships. PM = paramonotone, SM = strictly monotone, 3CM = 3-cyclic monotone, 3* = 3∗-monotone, MM = maximal monotone.

Figure 5.12: Monotone linear operators: monotone class relationships. PM = paramonotone, SM = strictly monotone, 3CM = 3-cyclic monotone, 3* = 3∗-monotone.

Figure 5.13: Monotone linear operators on $\mathbb{R}^n$: monotone class relationships. PM = paramonotone, SM = strictly monotone, 3CM = 3-cyclic monotone, 3* = 3∗-monotone.
Table 5.3: Monotone linear operators: monotone class relationships

PM  SM  3CM  3*   X                 Result
0   0   0    0    $\mathbb{R}^2$    ∃ Example 5.0.12 ($Q$)
0   *   *    1                      ∅ Proposition 4.1.11
*   *   1    0                      ∅ Fact 3.1.2
0   *   1    *                      ∅ Proposition 3.1.3
0   1   *    *                      ∅ Fact 3.1.1
1   0   0    0    $\ell^2$          ∃ Remark 5.0.29
1   0   0    1    $\mathbb{R}^3$    ∃ Example 5.0.27
1   0   1    1    $\mathbb{R}$      ∃ $0$
1   1   0    0    $\ell^2$          ∃ Example 5.0.28
1   1   0    1    $\mathbb{R}^2$    ∃ Example 5.0.22 ($\mathrm{Id} + 2Q$)
1   1   1    1    $\mathbb{R}$      ∃ $\mathrm{Id}$

Where: 'PM' represents paramonotone, 'SM' represents strictly monotone, '3CM' represents 3-cyclic monotone, '3*' represents 3∗-monotone, 'X' represents the space of lowest possible dimension, 1 represents that the property is present, 0 represents an absence of that property, * represents that both 0/1 are covered by the result, ∃ represents that an example with these properties exists, and ∅ represents that this combination of properties is impossible.

Figure 5.14: Monotone linear operators on $\mathbb{R}^2$: monotone class relationships. PM = paramonotone, SM = strictly monotone, 3CM = 3-cyclic monotone, 3* = 3∗-monotone.
[Figure 5.15 panels: (11111) $\mathrm{Id}$; (10111) Example 5.0.8 ($h$); (00001) Example 5.0.12 ($Q$); (00011) Example 5.0.16; (11001) Example 5.0.20; (11011) Example 5.0.22 ($\mathrm{Id} + 2Q$). Horizontal axes: $x_1$, $x_2$.]

Figure 5.15: The visualization of maximal monotone operators of the form $T : \mathbb{R}^2 \to \mathbb{R}^2$, where for each $(x_1^*, x_2^*) \in T(x_1, x_2)$, the height of the surface at $(x_1, x_2)$ denotes $x_1^*$ and the colour at that point denotes $x_2^*$. The binary label denotes monotonicity class as per Definition 1.0.1.
[Figure 5.16 panels: (10110) Example 5.0.9 ($g$); (11110) Example 5.0.10 ($m$); (00000) Example 5.0.13; (00010) Example 5.0.15; (11000) Example 5.0.18; (11010) Example 5.0.25. Horizontal axes: $x_1$, $x_2$.]

Figure 5.16: Visualization of monotone operators of the form $T : \mathbb{R}^2 \to \mathbb{R}^2$, where for each $(x_1^*, x_2^*) \in T(x_1, x_2)$, the height of the surface at $(x_1, x_2)$ denotes $x_1^*$ and the colour at that point denotes $x_2^*$. The binary label denotes monotonicity class as per Definition 1.0.1. These operators are not maximal monotone and not continuous. Although the surface depicted is connected, this is an artifact of the visualization, and it is clear where the disconnect should lie. The discontinuities are more accurate in the colour axis.
[Figure 5.17 panels: (11111) $\mathrm{Id}$; (10111) Example 5.0.8 ($h$); (10110) Example 5.0.9 ($g$); (11110) Example 5.0.10 ($m$); (00001) Example 5.0.12 ($Q$); (00000) Example 5.0.13. Horizontal axes: $x_1$, $x_2$.]

Figure 5.17: Visualization of monotone operators of the form $T : \mathbb{R}^2 \to \mathbb{R}^2$, where for each $(x_1^*, x_2^*) \in T(x_1, x_2)$, the height of the surface at $(x_1, x_2)$ denotes $\|(x_1^*, x_2^*)\|$ and the colour at that point denotes the polar angle of $(x_1^*, x_2^*)$. The binary label denotes monotonicity class as per Definition 1.0.1.
[Figure 5.18 panels: (10110) Example 5.0.11 ($\Phi$); (00011) Example 5.0.16 ($\Phi + Q$); (10111) $T_S$ from Example 5.0.16; (00011) Example 5.0.16 ($T_S + Q$); (00110) Example 5.0.17. Horizontal axes: $x_1$, $x_2$.]

Figure 5.18: Visualization of monotone operators of the form $T : \mathbb{R}^2 \to \mathbb{R}^2$, where for each $(x_1^*, x_2^*) \in T(x_1, x_2)$, the height of the surface at $(x_1, x_2)$ denotes $\|(x_1^*, x_2^*)\|$ and the colour at that point denotes the polar angle of $(x_1^*, x_2^*)$. The binary label denotes monotonicity class as per Definition 1.0.1.

Chapter 6

A New Saddle Function Representation

In this chapter, we start by reviewing the main results of Krauss [51] regarding skew-symmetric saddle functions and their relation to monotone operators, using a more explicit approach than appears in the original work. In Section 6.2, we introduce a new function $M_T$, an alternative formulation of what has been referred to as Fitzpatrick's last function, and demonstrate that if $T$ is the subdifferential of a convex function $f$, the value of $M_T$ is the same as the value of Rockafellar's antiderivative ($f(y) - f(x)$). In Section 6.3, we demonstrate how $M_T$ can be used to construct skew-symmetric saddle functions of monotone operators.
We demonstrate with Example 6.3.8 that this is a different saddle function representation from that of Krauss. Furthermore, $W_T$ obtains the canonical skew-symmetric saddle function for subdifferentials (the Krauss saddle function does not), and its evaluation requires information only for collinear points in the graph of an operator $T$ (the Krauss saddle function requires considering any number of points in the graph of $T$). Therefore, there may be any number of applications of $W_T$ using these advantages, such as when using $W_T$ as the saddle function for an equilibrium problem.

Throughout this chapter and the next, whenever $x$ and $y$ are distinct in $X$, the notation for a linear segment $[x, y] \subset X$ will be used to denote the first of the sets
\[
[x, y] := \{\lambda x + (1 - \lambda) y : \lambda \in [0, 1]\}, \tag{6.0.1}
\]
\[
]x, y[\ := \{\lambda x + (1 - \lambda) y : \lambda \in\ ]0, 1[\}, \tag{6.0.2}
\]
where $[x, y[$ and $]x, y]$ are defined as is expected, in the latter case for instance with $0 < \lambda \le 1$. Set $[x, x] = \{x\}$ and $[x, x[\ =\ ]x, x]\ =\ ]x, x[\ = \emptyset$. $X$ is a Hilbert space throughout.

6.1 Krauss' saddle functions

In 1985, Krauss published various results on the ways monotone operators can be represented by skew-symmetric saddle functions, and vice versa [51]. We reproduce the pertinent results below with modification of the original proofs.

Definition 6.1.1 (saddle function) The function $L : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ is a saddle function if and only if the associated functions $L_x, \hat{L}_y : X \to \mathbb{R} \cup \{-\infty, +\infty\}$ defined by
\[
y \mapsto L_x(y) := L(x, y), \quad \text{and} \quad x \mapsto \hat{L}_y(x) := -L(x, y)
\]
are both convex. That is, $L$ is concave in the first component and convex in the second.

Here is a simple example of a saddle function.

Example 6.1.2 For any $p \ge 2$, the function $L : \mathbb{R} \times \mathbb{R} \to \mathbb{R} \cup \{+\infty\}$ defined by
\[
L(x, y) := \begin{cases} \tfrac{1}{p} y^p - \tfrac{1}{2} x^2, & y > 0, \\ +\infty, & y \le 0 \end{cases} \tag{6.1.1}
\]
is a saddle function.

Definition 6.1.3 (closure of saddle functions) Closure for saddle functions has two aspects: closure in the first, concave, component, and closure in the second, convex, component.
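In the notation of Definition 6.1.1, given a saddle function $L : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$,
\[ \operatorname{cl}_1 L(x, y) := -(\operatorname{cl} \hat{L}_y)(x) \tag{6.1.2} \]
and
\[ \operatorname{cl}_2 L(x, y) := (\operatorname{cl} L_x)(y). \tag{6.1.3} \]
$L$ is said to be closed if $\operatorname{cl}_1 L = \operatorname{cl}_2 L = L$.

Definition 6.1.4 (domains of saddle functions) The domain of a saddle function $L : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ is the product of the domains of the component concave and convex functions:
\[ \operatorname{dom} L := \operatorname{dom}_1 L \times \operatorname{dom}_2 L := \{x \in X : \forall y \in X,\ L(x, y) > -\infty\} \times \{y \in X : \forall x \in X,\ L(x, y) < +\infty\}. \tag{6.1.4} \]
However, for our purposes we will also use the alternative definition of domain from [51], where
\[ \widehat{\operatorname{dom}} L := \widehat{\operatorname{dom}}_1 L \times \widehat{\operatorname{dom}}_2 L := \{x \in X : \forall y \in X,\ \operatorname{cl}_2 L(x, y) > -\infty\} \times \{y \in X : \forall x \in X,\ \operatorname{cl}_1 L(x, y) < +\infty\} \]
\[ = \{x \in X : \forall y \in X,\ (\operatorname{cl} L_x)(y) > -\infty\} \times \{y \in X : \forall x \in X,\ (\operatorname{cl} \hat{L}_y)(x) > -\infty\}. \tag{6.1.5} \]

Definition 6.1.5 (subgradients of saddle functions) The subgradient of a saddle function $L : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ at a point $(x, y) \in X \times X$ is defined to be
\[ \partial L(x, y) := (-\partial \hat{L}_y)(x) \times (\partial L_x)(y). \tag{6.1.6} \]

Example 6.1.6 For some $p \ge 2$, define $L$ as in Example 6.1.2. Then $\operatorname{cl}_1 L = L$; however,
\[ \operatorname{cl}_2 L(x, y) = \begin{cases} \tfrac{1}{p} y^p - \tfrac{1}{2} x^2, & y \ge 0, \\ +\infty, & y < 0. \end{cases} \]
As for the domains and subdifferential of $L$,
\[ \operatorname{dom} L = \widehat{\operatorname{dom}} L = \mathbb{R} \times \,]0, +\infty[ \quad \text{and} \quad \partial L(x, y) = \{-x\} \times \begin{cases} \{y^{p-1}\}, & y > 0, \\ \emptyset, & y \le 0. \end{cases} \]

Definition 6.1.7 (skew-symmetric saddle function) A saddle function $L : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ is skew-symmetric if and only if for all $x, y \in X$
\[ \operatorname{cl}_2 L(x, y) = -\operatorname{cl}_1 L(y, x), \quad \text{or equivalently} \quad (\operatorname{cl} L_x)(y) = (\operatorname{cl} \hat{L}_x)(y). \tag{6.1.7} \]

Fact 6.1.8 For any saddle function $L : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$,
(i) $\widehat{\operatorname{dom}} L \subset \operatorname{dom} L$.
(ii) If $L$ is skew-symmetric, $\widehat{\operatorname{dom}} L = V \times V$, where $V = \{x \in X : \forall y \in X,\ \operatorname{cl}_2 L(x, y) > -\infty\}$.
(iii) If $L$ is closed, $\widehat{\operatorname{dom}} L = \operatorname{dom} L$.

Proposition 6.1.9 For any saddle function $L : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$, $\widehat{\operatorname{dom}} L = \operatorname{dom} L$ if and only if
\[ \forall f \in \{L_x : \forall y \in X,\ L_x(y) > -\infty\} \cup \{\hat{L}_y : \forall x \in X,\ \hat{L}_y(x) > -\infty\}, \tag{6.1.8} \]
there exists a continuous affine minorant of $f$.

Proof. Assume that $\widehat{\operatorname{dom}} L = \operatorname{dom} L$. Choose any $L_x$ and $\hat{L}_y$ such that $L_x(y) > -\infty$ for all $y \in X$ and $\hat{L}_y(x) > -\infty$ for all $x \in X$. Then, $(x, y) \in \operatorname{dom} L = \widehat{\operatorname{dom}} L$, and so $(\operatorname{cl} L_x)(y) > -\infty$ for all $y \in X$ and $(\operatorname{cl} \hat{L}_y)(x) > -\infty$ for all $x \in X$. Therefore, each has a continuous affine minorant.

If there is a continuous affine minorant of a convex function $f$, then $(\operatorname{cl} f)(x) > -\infty$ for all $x \in X$, and so the converse is immediate.

Fact 6.1.10 (improper saddle functions) Suppose that the saddle function $L : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ is such that for some $x_0, y_0 \in X$, $L(x_0, y_0) = -\infty$, and let $C_{x_0} := \{y \in X : L_{x_0}(y) = -\infty\}$. Then,
(i) $L_{x_0}(y_0) = -\infty$,
(ii) $\hat{L}_{y_0}(x_0) = +\infty$,
(iii) $(\operatorname{cl} L_{x_0})(y) = -\infty$ for all $y \in X$,
(iv) $(\operatorname{cl} \hat{L}_{x_0})(y) = -\infty$ for all $y \in X$, if $L$ is skew-symmetric,
(v) $C_{x_0}$ is a convex set,
(vi) for all $y \in C_{x_0}$: (a) $L_{x_0}(y) = (\operatorname{cl} L_{x_0})(y) = -\infty$, (b) $\hat{L}_y(x_0) = +\infty$,
(vii) for all $y \notin C_{x_0}$: (a) $L_{x_0}(y) = +\infty$, (b) $L_{x_0}(y) \ne (\operatorname{cl} L_{x_0})(y)$, (c) $\hat{L}_y(x_0) = -\infty$, (d) $(\operatorname{cl} \hat{L}_y)(x) = -\infty$ for all $x \in X$, (e) $(\operatorname{cl} L_y)(x) = -\infty$ for all $x \in X$, if $L$ is skew-symmetric.

Example 6.1.11 (canonical example) Given a proper lower semicontinuous convex function $f : X \to \mathbb{R} \cup \{+\infty\}$, define the skew-symmetric saddle functions $L_1, L_2 : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ by
\[ L_1(x, y) := \begin{cases} f(y) - f(x), & x, y \in \operatorname{dom} f, \\ +\infty, & x \in \operatorname{dom} f,\ y \notin \operatorname{dom} f, \\ -\infty, & x \notin \operatorname{dom} f, \end{cases} \tag{6.1.9} \]
\[ L_2(x, y) := \begin{cases} f(y) - f(x), & x, y \in \operatorname{dom} f, \\ +\infty, & y \notin \operatorname{dom} f, \\ -\infty, & y \in \operatorname{dom} f,\ x \notin \operatorname{dom} f. \end{cases} \tag{6.1.10} \]
$L_1$ and $L_2$ differ only when $x \notin \operatorname{dom} f$ and $y \notin \operatorname{dom} f$, in which case $L_1(x, y) = -\infty$ and $L_2(x, y) = +\infty$.

Remark 6.1.12 Note that the skew-symmetric saddle functions $L_1$ and $L_2$ in Example 6.1.11 are such that $\operatorname{cl}_1 L_2 = L_1$ and $\operatorname{cl}_2 L_1 = L_2$. As such, they are closed if and only if the domain of $f$ is the entire space $X$. Furthermore,
\[ \operatorname{dom} L_1 = \widehat{\operatorname{dom}} L_1 = \widehat{\operatorname{dom}} L_2 = \operatorname{dom} L_2 = \operatorname{dom} f \times \operatorname{dom} f. \]

Definition 6.1.13 ($T_L$) Given a skew-symmetric saddle function $L : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$, define $T_L : X \to 2^X$ by
\[ x^* \in T_L x \;\Leftrightarrow\; (-x^*, x^*) \in \partial L(x, x) \;\Leftrightarrow\; x^* \in \partial \hat{L}_x(x) \text{ and } x^* \in \partial L_x(x). \tag{6.1.11} \]

Fact 6.1.14 Suppose $L : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ is a skew-symmetric saddle function. Then for all $x \in X$,
\[ (\operatorname{cl} \hat{L}_x)(x) = (\operatorname{cl} L_x)(x) \le L_x(x) = -\hat{L}_x(x) \le -(\operatorname{cl} \hat{L}_x)(x) = -(\operatorname{cl} L_x)(x), \tag{6.1.12} \]
and so both $(\operatorname{cl} \hat{L}_x)(x) \le 0$ and $(\operatorname{cl} L_x)(x) \le 0$.

Lemma 6.1.15 [51] Given a skew-symmetric saddle function $L : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$,
\[ \operatorname{dom} T_L \subset \{x \in X : \operatorname{cl}_1 L(x, x) = \operatorname{cl}_2 L(x, x) = 0\} \subset \widehat{\operatorname{dom}}_1 L. \tag{6.1.13} \]
Equivalently,
\[ \operatorname{dom} T_L \subset \{x \in X : (\operatorname{cl} \hat{L}_x)(x) = (\operatorname{cl} L_x)(x) = 0\} \subset \{x \in X : (\operatorname{cl} L_x)(y) > -\infty,\ \forall y \in X\}. \tag{6.1.14} \]

Proof. If there is some $y$ such that $(\operatorname{cl} L_x)(y) = -\infty$, then by Fact 2.2.4 (iii), $(\operatorname{cl} L_x)(x) = -\infty$, and so, taking the contrapositive, the final inclusion of (6.1.13) and (6.1.14) holds.

Suppose that $x \in \operatorname{dom} T_L$, and let $x^* \in T_L(x)$. Then, as $x^* \in \partial L_x(x)$,
\[ \langle y - x, x^* \rangle \le L_x(y) - L_x(x), \quad \forall y \in X. \]
Rearranging yields
\[ L_x(x) - \langle x, x^* \rangle + \langle y, x^* \rangle \le L_x(y), \quad \forall y \in X. \]
Here, the left hand side as a function of $y$ is an affine minorant of $L_x$, and hence also minorizes $\operatorname{cl} L_x$. Substituting $y = x$ yields that $L_x(x) \le (\operatorname{cl} L_x)(x)$, and so $L_x(x) = (\operatorname{cl} L_x)(x)$. Similarly, $x^* \in \partial \hat{L}_x(x)$, and so $\hat{L}_x(x) = (\operatorname{cl} \hat{L}_x)(x)$. Therefore, as $L$ is skew-symmetric,
\[ \hat{L}_x(x) = (\operatorname{cl} \hat{L}_x)(x) = (\operatorname{cl} L_x)(x) = L_x(x) = 0, \]
since $-\hat{L}_x(x) = L(x, x) = L_x(x)$.

Remark 6.1.16 Krauss remarks in [51] that $\operatorname{dom} T_L$ is dense in the convex set $\operatorname{dom}_1 L$.

Theorem 6.1.17 [51] Let $L : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ be a skew-symmetric saddle function. Then, the following are equivalent:
(i) $(x, x^*) \in \operatorname{gra} T_L$,
(ii) $\langle h, x^* \rangle \le L_x(x + h) = L(x, x + h)$ for all $h \in X$,
(iii) $\langle h, x^* \rangle \le \hat{L}_x(x + h)$ for all $h \in X$.

Proof. By Definition 6.1.13 and the definition of the subgradient, $(x, x^*) \in \operatorname{gra} T_L$ is equivalent to both
\[ \langle h, x^* \rangle \le L_x(x + h) - L_x(x) \quad \forall h \in X \tag{6.1.15} \]
and
\[ \langle h, x^* \rangle \le \hat{L}_x(x + h) - \hat{L}_x(x) \quad \forall h \in X. \tag{6.1.16} \]
((i) ⇒ (ii)) Suppose $(x, x^*) \in \operatorname{gra} T_L$, so that $x \in \operatorname{dom} T_L$. By Lemma 6.1.15, $(\operatorname{cl} L_x)(x) = 0$, and so by (6.1.12), $L_x(x) = 0$.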
Therefore, from (6.1.15), $\langle h, x^* \rangle \le L_x(x + h) = L(x, x + h)$ for all $h \in X$.

((ii) ⇒ (iii)) Suppose that $x \in X$ and $x^* \in X$ are such that $\langle h, x^* \rangle \le L(x, x + h) = L_x(x + h)$ for all $h \in X$. Let $y = x + h$, so that taken as a function of $y$, $\langle y - x, x^* \rangle$ is a continuous affine minorant of $L_x$, and so $\langle h, x^* \rangle \le (\operatorname{cl} L_x)(x + h)$. Since $L$ is skew-symmetric, for all $h \in X$,
\[ \langle h, x^* \rangle \le (\operatorname{cl} \hat{L}_x)(x + h) \le \hat{L}_x(x + h). \tag{6.1.17} \]
((iii) ⇒ (ii)) Similarly, if we were to suppose instead that $\langle h, x^* \rangle \le \hat{L}_x(x + h)$ for all $h \in X$, then by the same argument and the skew-symmetric nature of $L$, for all $h \in X$,
\[ \langle h, x^* \rangle \le (\operatorname{cl} \hat{L}_x)(x + h) = (\operatorname{cl} L_x)(x + h) \le L_x(x + h). \tag{6.1.18} \]
((ii) ⇒ (i)) Finally, suppose that both (6.1.17) and (6.1.18) hold. Substituting $h = 0$ yields $(\operatorname{cl} \hat{L}_x)(x) \ge 0$ and $(\operatorname{cl} L_x)(x) \ge 0$. Hence, by Fact 6.1.14, $(\operatorname{cl} L_x)(x) = (\operatorname{cl} \hat{L}_x)(x) = 0$. By (6.1.12), $L(x, x) = L_x(x) = \hat{L}_x(x) = 0$, and so by (6.1.17) and (6.1.18), both (6.1.16) and (6.1.15) respectively are satisfied, and $(x, x^*) \in \operatorname{gra} T_L$.

Theorem 6.1.18 [51] If $L : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ is a skew-symmetric saddle function, then $T_L$ is monotone.

Proof. Let $(x, x^*), (y, y^*) \in \operatorname{gra} T_L$. By Theorem 6.1.17, for all $h_1, h_2 \in X$,
\[ \langle h_1, x^* \rangle \le L_x(x + h_1), \tag{6.1.19} \]
\[ \langle h_2, y^* \rangle \le \hat{L}_y(y + h_2) = -L_{y + h_2}(y). \tag{6.1.20} \]
Let $h_1 = y - x$ and let $h_2 = x - y$, so that adding together (6.1.19) and (6.1.20), we have that $\langle y - x, x^* - y^* \rangle \le 0$, and so $T_L$ is monotone.

Remark 6.1.19 A result of Krauss [51] is that $T_L$ is maximal monotone if $L$ is lower closed, i.e., $\operatorname{cl}_2 \operatorname{cl}_1 L = L$, which is equivalent to $\operatorname{cl}_2 L = L$.

We define $H_T$ specifically for Theorem 6.1.22 and its proof below.

Definition 6.1.20 ($H_T(x, y)$) Suppose $T : X \to 2^X$ is a monotone operator. Define the function $H_T : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ by
\[ H_T(x, y) := \inf \Big\{ \sum_{i=1}^n \lambda_i \langle y_i - x, y_i^* \rangle \;:\; n \in \mathbb{N},\ \lambda_i \ge 0,\ \sum_{i=1}^n \lambda_i = 1,\ \sum_{i=1}^n \lambda_i y_i = y,\ y_i^* \in T y_i \Big\}. \tag{6.1.21} \]

Remark 6.1.21 If $y \in \operatorname{conv}(\operatorname{dom} T)$, then $H_T(x, y) < +\infty$ since there is at least one admissible representation for $y$.
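Similarly, if $y \notin \operatorname{conv}(\operatorname{dom} T)$, then $H_T(x, y) = +\infty$.

Below we provide a more explicit proof of a result from [51].

Theorem 6.1.22 [51] Suppose $T : X \to 2^X$ is a monotone operator such that $\operatorname{dom} T \ne \emptyset$, and define the function $L_T : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ by
\[ L_T(x, y) := \begin{cases} \tfrac{1}{2} \left( H_T(x, y) - H_T(y, x) \right), & x \in \operatorname{conv}(\operatorname{dom} T),\ y \in \operatorname{conv}(\operatorname{dom} T), \\ +\infty, & x \in \operatorname{conv}(\operatorname{dom} T),\ y \notin \operatorname{conv}(\operatorname{dom} T), \\ -\infty, & x \notin \operatorname{conv}(\operatorname{dom} T). \end{cases} \tag{6.1.22} \]
Then, $L_T$ is a skew-symmetric saddle function with domain
\[ \operatorname{dom}_1 L_T = \operatorname{dom}_2 L_T = \widehat{\operatorname{dom}}_1 L_T = \widehat{\operatorname{dom}}_2 L_T = \operatorname{conv}(\operatorname{dom} T), \tag{6.1.23} \]
and is such that for all $x, y \in X$,
\[ -H_T(y, x) \le L_T(x, y) \le H_T(x, y), \tag{6.1.24} \]
and for all $x, y \in \operatorname{conv}(\operatorname{dom} T)$,
\[ -\infty < L_T(x, y) < +\infty. \tag{6.1.25} \]
Finally, the operator $T_{L_T}$ (see Definition 6.1.13) is a monotone extension of $T$, with $\operatorname{dom} T_{L_T} \subset \operatorname{conv}(\operatorname{dom} T)$.

Proof. To demonstrate that $L_T$ is a saddle function, we first show that the function $H_T(x, y)$ is convex in $y$. For this purpose, let $y = \lambda v + (1 - \lambda) w$ for some $0 < \lambda < 1$. If either $v \notin \operatorname{conv}(\operatorname{dom} T)$ or $w \notin \operatorname{conv}(\operatorname{dom} T)$, then $H_T(x, v) = +\infty$ or $H_T(x, w) = +\infty$, and in either case $H_T(x, y) \le \lambda H_T(x, v) + (1 - \lambda) H_T(x, w)$. Otherwise, for any set of points $(v_i, v_i^*), (w_j, w_j^*) \in \operatorname{gra} T$ and weights $\beta_i, \gamma_j \ge 0$ where
\[ \sum_i \beta_i = \sum_j \gamma_j = 1, \quad \sum_i \beta_i v_i = v, \quad \text{and} \quad \sum_j \gamma_j w_j = w, \]
we have that
\[ y = \sum_i \lambda \beta_i v_i + \sum_j (1 - \lambda) \gamma_j w_j \quad \text{and} \quad \lambda \sum_i \beta_i + (1 - \lambda) \sum_j \gamma_j = 1. \]
Therefore, by definition,
\[ H_T(x, y) \le \sum_i \lambda \beta_i \langle v_i - x, v_i^* \rangle + \sum_j (1 - \lambda) \gamma_j \langle w_j - x, w_j^* \rangle, \]
and so, as the infimum preserves the inequality, $H_T(x, y) \le \lambda H_T(x, v) + (1 - \lambda) H_T(x, w)$.

Now, $H_T(x, y)$ is concave in $x$, since $H_T(\cdot, y)$ is the pointwise infimum of continuous affine functions, which also implies that
\[ H_T(x, y) = \operatorname{cl}_1 H_T(x, y) = -\operatorname{cl}\left( -H_T(\cdot, y) \right)(x) \quad \text{for fixed } y \in X. \tag{6.1.26} \]
As $H_T$ is concave in its first argument and convex in its second, $H_T(x, y) - H_T(y, x)$ is concave in $x$ and convex in $y$ when $x, y \in \operatorname{conv}(\operatorname{dom} T)$.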
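As $\operatorname{conv}(\operatorname{dom} T)$ is convex, convexity and concavity are maintained as necessary outside of $\operatorname{conv}(\operatorname{dom} T)$, and so $L_T(x, y)$ is a saddle function.

Next, we show that $L_T$ is skew-symmetric. For each $x \in X$, define the convex functions $L_x(y) := L_T(x, y)$ and $\hat{L}_x(y) := -L_T(y, x)$ as in Definition 6.1.1. If $x \in \operatorname{conv}(\operatorname{dom} T)$, then $L_T(x, y) = -L_T(y, x)$, so $L_x(y) = \hat{L}_x(y)$ for all $y \in X$, and so $(\operatorname{cl} L_x)(y) = (\operatorname{cl} \hat{L}_x)(y)$. Suppose $x \notin \operatorname{conv}(\operatorname{dom} T)$. Then, for some $z \in \operatorname{conv}(\operatorname{dom} T)$, $\hat{L}_x(z) = -L_T(z, x) = -\infty$, and so $(\operatorname{cl} \hat{L}_x)(y) = -\infty = (\operatorname{cl} L_x)(y)$ for all $y \in X$.

We will now prove (6.1.24). Consider arbitrary points $x, y \in \operatorname{conv}(\operatorname{dom} T)$ and arbitrary convex representations $(x_j, x_j^*), (y_i, y_i^*) \in \operatorname{gra} T$ where
\[ \sum_i \beta_i y_i = y, \quad \sum_j \gamma_j x_j = x, \quad \text{and} \quad \sum_i \beta_i = \sum_j \gamma_j = 1. \]
Thus, by the monotonicity of $T$,
\[ \sum_i \beta_i \langle y_i - x, y_i^* \rangle = \sum_i \beta_i \Big\langle y_i - \sum_j \gamma_j x_j, y_i^* \Big\rangle = \sum_i \beta_i \sum_j \gamma_j \langle y_i - x_j, y_i^* \rangle \]
\[ \ge \sum_i \beta_i \sum_j \gamma_j \langle y_i - x_j, x_j^* \rangle = \sum_j \gamma_j \sum_i \beta_i \langle y_i - x_j, x_j^* \rangle = \sum_j \gamma_j \langle y - x_j, x_j^* \rangle, \]
and so for all $x, y \in \operatorname{conv}(\operatorname{dom} T)$, by definition $H_T(x, y) \ge -H_T(y, x)$. If $y \notin \operatorname{conv}(\operatorname{dom} T)$, then by definition $H_T(x, y) = +\infty$. Therefore, the inequality holds outside of $\operatorname{conv}(\operatorname{dom} T)$, and for all $x, y \in X$
\[ H_T(x, y) \ge -H_T(y, x). \tag{6.1.27} \]
Since $H_T(x, y) = +\infty$ for $y \notin \operatorname{conv}(\operatorname{dom} T)$, for $x \in \operatorname{conv}(\operatorname{dom} T)$ we have that $-H_T(y, x) \le L_T(x, y) \le H_T(x, y)$. When $x \notin \operatorname{conv}(\operatorname{dom} T)$, $H_T(y, x) = +\infty$ and $L_T(x, y) = -\infty$, and so (6.1.24) holds.

Now, as noted in Remark 6.1.21, if $y \in \operatorname{conv}(\operatorname{dom} T)$, then $H_T(x, y) < +\infty$, and so $-H_T(y, x) > -\infty$ if $x \in \operatorname{conv}(\operatorname{dom} T)$. Hence, (6.1.24) implies (6.1.25).

Concerning $T_{L_T}$, we know that $T_{L_T}$ is monotone by Theorem 6.1.18 since $L_T$ is a skew-symmetric saddle function. From the definition of $H_T$ and (6.1.24) we have that for every $(x, x^*) \in \operatorname{gra} T$,
\[ \langle h, x^* \rangle \le -H_T(x + h, x) \le L_T(x, x + h) \quad \forall h \in X, \]
and so by Theorem 6.1.17, $(x, x^*) \in \operatorname{gra} T_{L_T}$. Also by Theorem 6.1.17 and (6.1.22), if $x \notin \operatorname{conv}(\operatorname{dom} T)$, then $x \notin \operatorname{dom} T_{L_T}$.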
If $x \notin \operatorname{conv}(\operatorname{dom} T)$, then $L_T(x, y) = -\infty$ for all $y \in X$, and so $x \notin \operatorname{dom}_1 L_T$. Therefore, by Fact 6.1.8,
\[ \widehat{\operatorname{dom}}_2 L_T = \widehat{\operatorname{dom}}_1 L_T \subset \operatorname{dom}_1 L_T \subset \operatorname{conv}(\operatorname{dom} T). \]
Now, from (6.1.25) we have that
\[ \operatorname{dom}_2 L_T := \{y \in X : L_T(x, y) < +\infty \text{ for all } x \in X\} = \operatorname{conv}(\operatorname{dom} T). \]
Finally, from (6.1.26) it follows that $\operatorname{dom}_2 L_T = \widehat{\operatorname{dom}}_2 L_T$, from which we obtain (6.1.23).

6.2  A new utility function: $M_T$

6.2.1  Introducing the operator $M_T$

Here we define $M_T$ and explore some of its properties. We will note later in Section 6.2.2 that $M_T$ is equivalent to a version of Fitzpatrick's last function.

Theorem 6.2.1 Given a monotone operator $T : X \to 2^X$, consider the operator $M_T : \operatorname{dom} T \times \operatorname{dom} T \to \mathbb{R}$, defined for all $x, y \in \operatorname{dom} T$ such that $[x, y] \subset \operatorname{dom} T$ by the Riemann integral
\[ M_T(x, y) := \int_0^1 m(x, y, t)\, dt \tag{6.2.1} \]
where
\[ m(x, y, t) := \begin{cases} \sup_{x^* \in T(x)} \langle y - x, x^* \rangle, & t = 0, \\ \inf_{h^* \in T(x + t(y - x))} \langle y - x, h^* \rangle, & t \in \,]0, 1]. \end{cases} \tag{6.2.2} \]
Then, $m(x, y, t)$ is bounded and monotone increasing on $[0, 1]$, the operator $M_T(x, y)$ is well defined, and
\[ \sup_{x^* \in T(x)} \langle y - x, x^* \rangle \le M_T(x, y) \le \inf_{y^* \in T(y)} \langle y - x, y^* \rangle. \tag{6.2.3} \]

Proof. Consider any $x, y \in \operatorname{dom} T$ such that $[x, y] \subset \operatorname{dom} T$. It will first be shown that $m(x, y, t)$ as defined in (6.2.2) is Riemann integrable over $t \in [0, 1]$ since it is bounded and monotone increasing on this interval. Let $t_1, t_2 \in [0, 1]$ such that $t_2 > t_1$, and for simplicity let $z_1 := x + t_1(y - x)$ and $z_2 := x + t_2(y - x)$. By the monotonicity of $T$,
\[ (t_2 - t_1)\, m(x, y, t_2) = \inf_{z_2^* \in T(z_2)} \langle (t_2 - t_1)(y - x), z_2^* \rangle = \inf_{z_2^* \in T(z_2)} \langle z_2 - z_1, z_2^* \rangle \ge \sup_{z_1^* \in T(z_1)} \langle z_2 - z_1, z_1^* \rangle \ge (t_2 - t_1)\, m(x, y, t_1). \tag{6.2.4} \]
Therefore, for any $0 < t < 1$,
\[ \sup_{x^* \in T(x)} \langle y - x, x^* \rangle = m(x, y, 0) \le m(x, y, t) \le m(x, y, 1) = \inf_{y^* \in T(y)} \langle y - x, y^* \rangle, \tag{6.2.5} \]
and so $m(x, y, t)$ is bounded and monotone increasing on $[0, 1]$. Finally, from (6.2.5), it follows that $M_T$ satisfies the inequality (6.2.3).
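As a numerical illustration of Theorem 6.2.1, the following minimal sketch approximates the integral (6.2.1) by a midpoint Riemann sum and checks the bounds (6.2.3). It assumes a single-valued operator, so that the supremum and infimum in (6.2.2) coincide. The names `T_example` and `M_T`, the sample points, and the choice of operator (the gradient of $\|z\|^4/4$ plus a rotation) are hypothetical and chosen only for illustration.

```python
def dot(u, v):
    """Euclidean inner product of two tuples."""
    return sum(a * b for a, b in zip(u, v))

def T_example(z):
    """Hypothetical monotone operator on R^2: gradient of ||z||^4/4 plus a 90-degree rotation."""
    r2 = dot(z, z)
    return (r2 * z[0] - z[1], r2 * z[1] + z[0])

def M_T(T, x, y, n=2000):
    """Midpoint Riemann sum for (6.2.1); as written, valid only for single-valued T."""
    d = tuple(b - a for a, b in zip(x, y))          # direction y - x
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        z = tuple(a + t * c for a, c in zip(x, d))  # point x + t(y - x)
        total += dot(d, T(z))                       # m(x, y, t) for single-valued T
    return total / n

x, y = (1.0, 0.0), (0.0, 2.0)
d = (y[0] - x[0], y[1] - x[1])
value = M_T(T_example, x, y)
lower = dot(d, T_example(x))   # left-hand side of (6.2.3)
upper = dot(d, T_example(y))   # right-hand side of (6.2.3)
print(lower, value, upper)
```

For this choice the gradient part contributes $\|y\|^4/4 - \|x\|^4/4 = 3.75$ and the rotation contributes $2$, so the sum should be close to $5.75$, which lies between the two bounds.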
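Remark 6.2.2 Note that $M_T(x, y)$ is a Riemann integral for any monotone operator $T$ with convex domain and all $x, y \in \operatorname{dom} T$, even when $T$ is an unbounded operator such as the linear relation from Example 4.0.18.

Corollary 6.2.3 Given monotone operators $T_1, T_2 : X \to 2^X$ and points $x, y \in \operatorname{dom} T_1 \cap \operatorname{dom} T_2$ such that $[x, y] \subset \operatorname{dom} T_1 \cap \operatorname{dom} T_2$, then for all $\lambda \in \mathbb{R}$,
\[ M_{T_1 + \lambda T_2}(x, y) = M_{T_1}(x, y) + \lambda M_{T_2}(x, y). \tag{6.2.6} \]

Proof. Immediate from Theorem 6.2.1, as the integral is bounded and the inner product is linear. Let $h \in [x, y] \subset \operatorname{dom} T_1 \cap \operatorname{dom} T_2$. Since every element of $(T_1 + \lambda T_2)(h)$ has the form $h_1^* + \lambda h_2^*$ for some $h_1^* \in T_1(h)$ and some $h_2^* \in T_2(h)$, where the choice of one is independent of the other,
\[ \inf_{h^* \in (T_1 + \lambda T_2)(h)} \langle y - x, h^* \rangle = \inf_{h_1^* \in T_1(h),\, h_2^* \in T_2(h)} \langle y - x, h_1^* + \lambda h_2^* \rangle = \inf_{h_1^* \in T_1(h)} \langle y - x, h_1^* \rangle + \inf_{h_2^* \in T_2(h)} \lambda \langle y - x, h_2^* \rangle, \]
and similarly for the supremum.

Proposition 6.2.4 Given a monotone operator $T : X \to 2^X$, define $M_T : \operatorname{dom} T \times \operatorname{dom} T \to \mathbb{R}$ as in Theorem 6.2.1. For all $x, y \in \operatorname{dom} T$ such that $[x, y] \subset \operatorname{dom} T$,
\[ M_T(x, y) = \int_0^1 \hat{m}(x, y, t)\, dt \tag{6.2.7} \]
where
\[ \hat{m}(x, y, t) := \begin{cases} \sup_{h^* \in T(x + t(y - x))} \langle y - x, h^* \rangle, & t \in [0, 1[, \\ \inf_{y^* \in T(y)} \langle y - x, y^* \rangle, & t = 1, \end{cases} \tag{6.2.8} \]
and
\[ M_T(x, y) = -M_T(y, x). \tag{6.2.9} \]

Proof. By (6.2.4), $\hat{m}(x, y, t)$ is bounded and monotone increasing for $t \in [0, 1]$, and
\[ \limsup_{s \to t^-} m(x, y, s) \le \hat{m}(x, y, t) \le \liminf_{s \to t^+} m(x, y, s). \tag{6.2.10} \]
Therefore,
\[ \int_0^1 \hat{m}(x, y, t)\, dt = \int_0^1 m(x, y, t)\, dt = M_T(x, y). \]
For $w = 1 - t$, we have that
\[ -M_T(y, x) = -\int_0^1 m(y, x, t)\, dt = -\int_{0^+}^{1^-} \inf_{h^* \in T(y + t(x - y))} \langle x - y, h^* \rangle\, dt = \int_{0^+}^{1^-} \sup_{h^* \in T(y + t(x - y))} \langle y - x, h^* \rangle\, dt \]
\[ = -\int_{1^-}^{0^+} \sup_{h^* \in T(x + w(y - x))} \langle y - x, h^* \rangle\, dw = \int_{0^+}^{1^-} \hat{m}(x, y, w)\, dw = \int_0^1 m(x, y, w)\, dw = M_T(x, y), \]
and so we obtain (6.2.9).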
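The following minimal sketch illustrates Proposition 6.2.4 on a multivalued operator, the subdifferential of the absolute value on $\mathbb{R}$, which is set-valued at $0$. The integrands $m$ and $\hat{m}$ can differ only where $T$ is multivalued, so their integrals agree, and swapping the arguments flips the sign as in (6.2.9). The interval encoding `T_abs`, the helper `riemann`, and the sample points are assumptions made for illustration.

```python
def T_abs(z):
    """Subdifferential of |.| at z, returned as a closed interval (lo, hi)."""
    if z > 0:
        return (1.0, 1.0)
    if z < 0:
        return (-1.0, -1.0)
    return (-1.0, 1.0)

def m(x, y, t):
    """Integrand (6.2.2): supremum at t = 0, infimum on ]0, 1]."""
    lo, hi = T_abs(x + t * (y - x))
    values = ((y - x) * lo, (y - x) * hi)
    return max(values) if t == 0 else min(values)

def m_hat(x, y, t):
    """Integrand (6.2.8): supremum on [0, 1[, infimum at t = 1."""
    lo, hi = T_abs(x + t * (y - x))
    values = ((y - x) * lo, (y - x) * hi)
    return min(values) if t == 1 else max(values)

def riemann(integrand, x, y, n=3000):
    """Midpoint Riemann sum of integrand(x, y, .) over [0, 1]."""
    return sum(integrand(x, y, (i + 0.5) / n) for i in range(n)) / n

forward = riemann(m, -1.0, 2.0)          # approximates M_T(-1, 2)
forward_hat = riemann(m_hat, -1.0, 2.0)  # same integral computed via m_hat
backward = riemann(m, 2.0, -1.0)         # approximates M_T(2, -1)
print(forward, forward_hat, backward)
```

Here `forward` should match $|2| - |-1| = 1$ and `backward` its negative.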
6.2.2  History of $M_T$: Fitzpatrick's last function

Though independently arrived at in the quest to obtain a more tractable saddle function representation of monotone operators, the operator $M_T$ is one possible multivalued generalization of a function called Fitzpatrick's last function, denoted here by $FL_T$. We also show that $FL_T$ was in use, though only applied to subdifferentials and not named, long before the term was coined. Note that Fitzpatrick's last function is not the same as Fitzpatrick's function, and that the latter is much more prevalent in the literature. Fitzpatrick's last function was named in memory of Simon Fitzpatrick, who communicated this function to the authors of [19]. In that paper, it assumes the following form:

Definition 6.2.5 (Fitzpatrick's last function in [19]) Given a monotone operator $T : \mathbb{R}^n \to \mathbb{R}^n$, define for any $x \in \operatorname{dom} T$, $y \in \operatorname{int} \operatorname{dom} T$,
\[ \widehat{FL}_T(x, y) := \int_0^1 \langle x - y, T(y + t(x - y)) \rangle\, dt. \tag{6.2.11} \]

It has also been defined in [15] as:

Definition 6.2.6 (Fitzpatrick's last function [15]) Given a monotone operator $T : X \to 2^X$ such that $0 \in \operatorname{core} \operatorname{dom} T$ and $\operatorname{dom} T$ is open, define for any $x \in \operatorname{dom} T$,
\[ FL_T(x) := \int_0^1 \sup_{x^* \in T(tx)} \langle x, x^* \rangle\, dt. \tag{6.2.12} \]

Combining these two forms, a generalized, but unpublished, version of this function most similar to $M_T$ would be

Definition 6.2.7 (Fitzpatrick's last function) Given a monotone operator $T : X \to 2^X$, define for $x, y \in X$ such that $[x, y] \subset \operatorname{dom} T$
\[ FL_T(x, y) := \int_0^1 \sup_{h^* \in T(x + t(y - x))} \langle y - x, h^* \rangle\, dt. \tag{6.2.13} \]

By Proposition 6.2.4, $FL_T(x, y) = M_T(x, y)$ except possibly in the case where the set $T(y)$ is multivalued. In this case, if $\sup_{h^* \in T(y)} \langle y - x, h^* \rangle$ is finite, then $FL_T(x, y) = M_T(x, y)$, and if it is infinite then $FL_T(x, y) = M_T(x, y)$ only when taken as an improper integral. The advantage of $M_T(x, y)$ over $FL_T(x, y)$ is that it is always a Riemann integral as long as $[x, y] \subset \operatorname{dom} T$.
As such, $M_T$ can be defined for monotone operators $T$ where $0 \notin \operatorname{dom} T$, and without using notions of interiority. Prior appearances of Fitzpatrick's last function come with an assumption that either $0 \in \operatorname{core} \operatorname{dom} T$ or $0 \in \operatorname{int} \operatorname{dom} T$ (explicitly in [19] and [15] and implicitly in [11]).

Remark 6.2.8 It should be noted that this is not the first time the construction $FL_T$ has appeared in the literature, although not in this form. In [64], one such construction appears in the proof of the fact that the subdifferentials $\partial f$ of proper convex functions $f$ on $\mathbb{R}^n$ are exactly the monotone operators that are maximal cyclically monotone. The construction that appears is equivalent to $M_T$ applied to $\partial f$. In particular, it was shown that for any $x, y \in \operatorname{ri} \operatorname{dom} f$,
\[ f(y) - f(x) = \int_0^1 h'(t, 1)\, dt, \tag{6.2.14} \]
where for any convex function $g : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$, and any two points $v, w \in \mathbb{R}^n$,
\[ g'(v, w) = \lim_{\alpha \downarrow 0} \frac{g(v + \alpha w) - g(v)}{\alpha} \tag{6.2.15} \]
and
\[ h(t) = f((1 - t)x + ty). \tag{6.2.16} \]
Combining the above, along with the fact that for any $v, w \in X$, $f'(v, w) = \sup_{v^* \in \partial f(v)} \langle w, v^* \rangle$,
\[ h'(t, 1) = \sup_{h^* \in (\partial f)(x + t(y - x))} \langle y - x, h^* \rangle. \]
Therefore, (6.2.14) is equivalent to
\[ f(y) - f(x) = FL_{\partial f}(x, y), \tag{6.2.17} \]
a result we obtain for $M_T$ in more general circumstances in Proposition 6.2.17. Note also that in these conditions, $M_T = FL_T$.

6.2.3  Linear relations

In this section we examine $M_A$ applied to monotone linear relations $A$, although we use a different definition of skew linear relations than used earlier. In Chapter 4, a linear relation $A : X \to 2^X$ was skew if $A^* = -A$, and here we define skew to mean $\langle x, Ax \rangle = 0$ for all $x \in \operatorname{dom} A$, a more general condition. For instance, in the first case all skew linear relations are maximal monotone by Proposition 7.2 in [9], whereas the operators $A : x \mapsto 0$ and $B : x \mapsto X$ where $\operatorname{dom} A = \operatorname{dom} B = \{0\}$ are both skew linear relations in the second sense, yet $A$ is clearly not maximal monotone as $\operatorname{gra} A \subsetneq \operatorname{gra} B$.
Proposition 6.2.9 Given a monotone linear relation $A : X \to 2^X$, the operator $M_A : \operatorname{dom} A \times \operatorname{dom} A \to \mathbb{R}$ from Theorem 6.2.1 as applied to $A$ is well defined and for any $x, y \in \operatorname{dom} A$,
\[ M_A(x, y) = \Big\langle y - x, \frac{Ax + Ay}{2} \Big\rangle. \tag{6.2.18} \]

Proof. Since the domains of monotone linear relations are linear subspaces, $\operatorname{dom} A$ is convex, and so by Theorem 6.2.1, $M_A$ is well defined everywhere on the domain of $A$. The result (6.2.18) follows from Theorem 6.2.1, as for all $x, y \in \operatorname{dom} A$,
\[ M_A(x, y) = \int_0^1 \langle y - x, A(x + t(y - x)) \rangle\, dt = \int_0^1 \langle y - x, Ax \rangle\, dt + \int_0^1 t\, \langle y - x, A(y - x) \rangle\, dt \]
\[ = \langle y - x, Ax \rangle + \tfrac{1}{2} \langle y - x, Ay - Ax \rangle = \Big\langle y - x, \frac{Ax + Ay}{2} \Big\rangle, \]
using the notational simplification from Fact 4.0.12.

Corollary 6.2.10 Given a monotone linear relation $A : X \to 2^X$ that is skew (i.e., $\langle x, Ax \rangle = 0$ for all $x \in \operatorname{dom} A$), then for all $x, y \in \operatorname{dom} A$,
\[ M_A(x, y) = \langle y - x, Ax \rangle = \langle y - x, Ay \rangle. \tag{6.2.19} \]

Proof. Since $A$ is skew, $\langle y - x, A(y - x) \rangle = 0$, and so $\langle y - x, Ay \rangle = \langle y - x, Ax \rangle$. Therefore, (6.2.19) follows directly from (6.2.18).

Proposition 6.2.11 Given a monotone linear relation $A : X \to 2^X$, the operator $M_A(x, y) : \operatorname{dom} A \times \operatorname{dom} A \to \mathbb{R}$ is concave in $x$ and convex in $y$.

Proof. Recall (4.0.8), namely
\[ \lambda q(x) + (1 - \lambda) q(y) - q(\lambda x + (1 - \lambda) y) = \tfrac{1}{2} \lambda (1 - \lambda) \langle x - y, Ax - Ay \rangle \ge 0, \]
where $q(x) := \tfrac{1}{2} \langle x, Ax \rangle$ and $x, y \in \operatorname{dom} A$. Therefore, in addition to Fact 4.0.12, we know that $\langle y, Ay \rangle$ is a convex function of $y$ even when $A$ is not necessarily symmetric or single-valued. As $\langle y, Ax \rangle$ and $\langle x, Ay \rangle$ are linear functions with respect to $y$, by Proposition 6.2.9 it follows that $M_A$ is convex with respect to $y$, and since $M_A(x, y) = -M_A(y, x)$ (Proposition 6.2.4), $M_A$ is concave with respect to $x$.

6.2.4  Relationship of $M_T$ to Rockafellar's antiderivative

Rockafellar's antiderivative was introduced to show the equivalence between subgradients of proper lower semicontinuous convex functions and cyclical monotone operators in a Banach space (see Fact 3.2.7).
This result was introduced in [62], [63], and [65], but is best summarized (for $\mathbb{R}^n$) in [69] (Theorem 12.25).

We define a variant of the Rockafellar antiderivative and show its relationship to $M_T$. Natural upper and lower bounds of $M_T$ are also defined to better show the relationship of $M_T$ to Rockafellar's antiderivative. The final result of this section, Corollary 6.2.25, is equivalent to a combination of the Rockafellar antiderivative result (which is unrestricted by domain), and the equivalence between $M_T$ and the Rockafellar antiderivative, which has been well known at least since 1970 (see [64] and Remark 6.2.8). These results are presented here using a proof which obtains both simultaneously. Together with Proposition 6.2.18, this will allow us to consider operators which are not subgradients in the subsequent chapter (Borwein-Wiersma decomposable operators). Furthermore, since we have from before that $M_T$ is a Riemann integral on $\mathbb{R}$, we obtain easier numerical methods for calculating the Rockafellar antiderivative, since we need only consider collinear points rather than points over the entire domain.

Definition 6.2.12 (Rockafellar's antiderivative) Given a monotone operator $T : X \to 2^X$, and some $(x, x^*) \in \operatorname{gra} T$, define Rockafellar's antiderivative to be
\[ R_{T,(x,x^*)}(y) := \sup_{\substack{(x_i, x_i^*) \in \operatorname{gra} T \\ i \in \{1, 2, \ldots, n\},\ n \in \mathbb{N}}} \Big( \langle y - x_n, x_n^* \rangle + \langle x_n - x_{n-1}, x_{n-1}^* \rangle + \cdots + \langle x_1 - x, x^* \rangle \Big). \tag{6.2.20} \]

Definition 6.2.13 Given a monotone operator $T : X \to 2^X$, define the operators $N_T^-, N_T^+ : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ by
\[ N_T^-(x, y) := \sup \Big\{ \sum_{i=0}^n \langle h_{i+1} - h_i, h_i^* \rangle \;:\; n \in \mathbb{N},\ h_i \in X,\ h_0 = x,\ h_{n+1} = y,\ h_i^* \in T(h_i) \Big\}, \tag{6.2.21} \]
\[ N_T^+(x, y) := \inf \Big\{ \sum_{i=0}^n \langle h_{i+1} - h_i, h_{i+1}^* \rangle \;:\; n \in \mathbb{N},\ h_i \in X,\ h_0 = x,\ h_{n+1} = y,\ h_i^* \in T(h_i) \Big\}. \tag{6.2.22} \]

Proposition 6.2.14 The operator from (6.2.21) is equivalent to the supremum of Rockafellar antiderivatives with respect to the image of $x$,
\[ N_T^-(x, y) = \sup_{x^* \in Tx} R_{T,(x,x^*)}(y). \tag{6.2.23} \]

Proposition 6.2.15 For any monotone operator $T : X \to 2^X$ and for all $x, y \in \operatorname{dom} T$,
\[ N_T^+(x, y) = -N_T^-(y, x). \tag{6.2.24} \]

Proof. The equation (6.2.24) is obtained by reversing the order of the index for $h_i$ in (6.2.22). To be precise, first note that by (6.2.22),
\[ N_T^+(x, y) = -\sup \Big\{ \sum_{i=0}^n \langle h_i - h_{i+1}, h_{i+1}^* \rangle \;:\; n \in \mathbb{N},\ h_i \in X,\ h_0 = x,\ h_{n+1} = y,\ h_i^* \in T(h_i) \Big\}. \]
For any choice of $n \in \mathbb{N}$ and $(h_i, h_i^*) \in \operatorname{gra} T$ where $i \in \{0, 1, 2, \ldots, n + 1\}$, let $(\bar{h}_i, \bar{h}_i^*) := (h_{n+1-i}, h_{n+1-i}^*)$. In this way,
\[ \sum_{i=0}^n \langle h_i - h_{i+1}, h_{i+1}^* \rangle = \sum_{i=0}^n \langle \bar{h}_{n+1-i} - \bar{h}_{n-i}, \bar{h}_{n-i}^* \rangle = \sum_{i=0}^n \langle \bar{h}_{i+1} - \bar{h}_i, \bar{h}_i^* \rangle. \]
Since the conditions that $h_0 = x$ and $h_{n+1} = y$ are equivalent to the conditions that $\bar{h}_0 = y$ and $\bar{h}_{n+1} = x$,
\[ N_T^+(x, y) = -\sup \Big\{ \sum_{i=0}^n \langle \bar{h}_{i+1} - \bar{h}_i, \bar{h}_i^* \rangle \;:\; n \in \mathbb{N},\ \bar{h}_i \in X,\ \bar{h}_0 = y,\ \bar{h}_{n+1} = x,\ \bar{h}_i^* \in T(\bar{h}_i) \Big\}, \]
which is precisely $-N_T^-(y, x)$ as defined in (6.2.21).

The first part of the following fact was noted in Theorem 12.25 in [69], in which the Rockafellar antiderivative is introduced.

Fact 6.2.16 For any monotone operator $T : X \to 2^X$, $N_T^-(x, y)$ is lower semicontinuous and convex in $y$ and $N_T^+(x, y)$ is upper semicontinuous and concave in $x$.

Proof. The operator $N_T^-(x, y)$ is convex and lower semicontinuous in $y$ as it is the pointwise supremum of a collection of affine functions. Similarly, or if one would prefer by Proposition 6.2.15, $N_T^+(x, y)$ is upper semicontinuous and concave in $x$.

Proposition 6.2.17 Given a proper convex function $f : X \to \mathbb{R} \cup \{+\infty\}$, the operator $\partial f$ satisfies, for all $x, y \in \operatorname{dom} \partial f$,
\[ N_{\partial f}^-(x, y) \le f(y) - f(x) \le N_{\partial f}^+(x, y). \tag{6.2.25} \]

Proof. Recall from the definition of subgradient that for all $x, y \in X$, and for all $x^* \in \partial f(x)$ and all $y^* \in \partial f(y)$,
\[ \langle y - x, x^* \rangle \le f(y) - f(x) \le \langle y - x, y^* \rangle. \]
Hence, any collection of $(h_i, h_i^*) \in \operatorname{gra} \partial f$ where $i \in \{0, 1, 2, \ldots, n + 1\}$ must, by telescoping, satisfy
\[ \sum_{i=0}^n \langle h_{i+1} - h_i, h_i^* \rangle \le f(h_{n+1}) - f(h_0) \le \sum_{i=0}^n \langle h_{i+1} - h_i, h_{i+1}^* \rangle, \]
which combined with Definition 6.2.13 yields (6.2.25).

Proposition 6.2.18 If a monotone operator $T : X \to 2^X$ is not cyclical monotone, then for all $x, y \in \operatorname{dom} T$,
\[ N_T^+(x, y) = -\infty \quad \text{and} \quad N_T^-(x, y) = +\infty. \tag{6.2.26} \]

Proof. Suppose $T$ is not cyclical monotone, so that for some $m \in \mathbb{N}$ there exists a $d > 0$ and a collection of $m$ points $(a_i, a_i^*) \in \operatorname{gra} T$, where $i \in \{1, 2, \ldots, m\}$, such that
\[ \sum_{i=1}^m \langle a_{i+1} - a_i, a_i^* \rangle > d, \tag{6.2.27} \]
where $a_{m+1} := a_1$. For any $k \in \mathbb{N}$, let $n = km + 1$, and let
\[ (h_{jm+i}, h_{jm+i}^*) = (a_i, a_i^*); \quad (\hat{h}_{jm+i}, \hat{h}_{jm+i}^*) = (a_{m+1-i}, a_{m+1-i}^*) \]
for all $0 \le j < k$ and $1 \le i \le m$, and let
\[ (h_n, h_n^*) = (a_1, a_1^*); \quad (\hat{h}_n, \hat{h}_n^*) = (a_m, a_m^*). \]
Then, for any $x^* \in Tx$,
\[ N_T^-(x, y) \ge \langle h_1 - x, x^* \rangle + \sum_{i=1}^{n-1} \langle h_{i+1} - h_i, h_i^* \rangle + \langle y - h_n, h_n^* \rangle \]
\[ = \langle a_1 - x, x^* \rangle + \sum_{i=1}^{n-1} \langle h_{i+1} - h_i, h_i^* \rangle + \langle y - a_1, a_1^* \rangle \]
\[ = \langle a_1 - x, x^* \rangle + \langle y - a_1, a_1^* \rangle + k \sum_{i=1}^m \langle a_{i+1} - a_i, a_i^* \rangle \]
\[ > \langle a_1 - x, x^* \rangle + \langle y - a_1, a_1^* \rangle + kd. \]
Similarly, defining $a_0 := a_m$ and $a_0^* := a_m^*$, for all $y^* \in Ty$,
\[ N_T^+(x, y) \le \langle \hat{h}_1 - x, \hat{h}_1^* \rangle + \sum_{i=1}^{n-1} \langle \hat{h}_{i+1} - \hat{h}_i, \hat{h}_{i+1}^* \rangle + \langle y - \hat{h}_n, y^* \rangle \]
\[ = \langle a_m - x, a_m^* \rangle + \sum_{i=1}^{n-1} \langle \hat{h}_{i+1} - \hat{h}_i, \hat{h}_{i+1}^* \rangle + \langle y - a_m, y^* \rangle \]
\[ = \langle a_m - x, a_m^* \rangle + \langle y - a_m, y^* \rangle + k \sum_{i=1}^m \langle a_{m-i} - a_{m+1-i}, a_{m-i}^* \rangle \]
\[ = \langle a_m - x, a_m^* \rangle + \langle y - a_m, y^* \rangle + k \sum_{i=1}^m \langle a_i - a_{i+1}, a_i^* \rangle \]
\[ < \langle a_m - x, a_m^* \rangle + \langle y - a_m, y^* \rangle - kd. \]
As this is true for all $k \in \mathbb{N}$, we obtain (6.2.26) by letting $k \to +\infty$.

Corollary 6.2.19 If $T : X \to 2^X$ is a monotone operator, then
\[ N_T^-(x, y) \le N_T^+(x, y) \tag{6.2.28} \]
if and only if $\operatorname{gra} T$ is a subgraph of $\partial f$ for some proper lower semicontinuous convex function $f : X \to \mathbb{R} \cup \{+\infty\}$.

Proof. Combine Propositions 6.2.17 and 6.2.18 with the fact that the set of maximal cyclical monotone operators on $X$ is exactly the set of such subdifferentials (Fact 3.2.7).
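To make the cycle condition (6.2.27) from the proof of Proposition 6.2.18 concrete, the sketch below evaluates the cyclic sum for a rotation by 90 degrees in $\mathbb{R}^2$ (monotone but not cyclically monotone) and, for contrast, for the gradient of $\|z\|^2/2$. The operators, the four-point cycle, and the helper names `cycle_sum` and `lower_bound_N_minus` are hypothetical choices for illustration, not taken from the thesis.

```python
def dot(u, v):
    """Euclidean inner product of two tuples."""
    return sum(a * b for a, b in zip(u, v))

def rotation(z):
    """Rotation by 90 degrees: skew, so <z, rotation(z)> = 0 and the operator is monotone."""
    return (-z[1], z[0])

def identity(z):
    """Gradient of ||z||^2 / 2, a cyclically monotone operator."""
    return z

def cycle_sum(points, T):
    """Sum of <a_{i+1} - a_i, T(a_i)> over the cycle, with a_{m+1} := a_1, as in (6.2.27)."""
    m = len(points)
    total = 0.0
    for i in range(m):
        a, a_next = points[i], points[(i + 1) % m]
        step = (a_next[0] - a[0], a_next[1] - a[1])
        total += dot(step, T(a))
    return total

square = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
rotation_sum = cycle_sum(square, rotation)   # positive: cyclic monotonicity fails
gradient_sum = cycle_sum(square, identity)   # nonpositive, as cyclic monotonicity requires

def lower_bound_N_minus(k, x=(0.0, 0.0), y=(0.0, 0.0)):
    """Lower bound on N_T^-(x, y) from the proof: k loops around the cycle, entering and leaving at a_1."""
    a1 = square[0]
    enter = dot((a1[0] - x[0], a1[1] - x[1]), rotation(x))
    leave = dot((y[0] - a1[0], y[1] - a1[1]), rotation(a1))
    return enter + leave + k * rotation_sum

print(rotation_sum, gradient_sum, lower_bound_N_minus(10))
```

The bound grows linearly in the number of loops $k$, which is the mechanism that forces $N_T^- = +\infty$ in (6.2.26).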
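Definition 6.2.20 Given a monotone operator $T : X \to 2^X$ and an index $n \in \mathbb{N}$, define $M_n^-, M_n^+ : \operatorname{dom} T \times \operatorname{dom} T \to \mathbb{R}$ for those $x, y \in \operatorname{dom} T$ such that $[x, y] \subset \operatorname{dom} T$ as follows. Let $d_n := d_n(x, y) := \frac{y - x}{n + 1}$,
\[ M_n^-(x, y) := \sup_{\substack{x^* \in Tx \\ h_i = x + i d_n \\ h_i^* \in T(h_i)}} \Big( \langle h_1 - x, x^* \rangle + \sum_{i=1}^n \langle h_{i+1} - h_i, h_i^* \rangle \Big) = \sup_{\substack{x^* \in Tx \\ h_i = x + i d_n \\ h_i^* \in T(h_i)}} \sum_{i=0}^n \langle d_n, h_i^* \rangle, \tag{6.2.29} \]
and
\[ M_n^+(x, y) := \inf_{\substack{h_i = x + i d_n \\ h_i^* \in T(h_i) \\ y^* \in Ty}} \Big( \sum_{i=0}^{n-1} \langle h_{i+1} - h_i, h_{i+1}^* \rangle + \langle y - h_n, y^* \rangle \Big) = \inf_{\substack{h_i = x + i d_n \\ h_i^* \in T(h_i) \\ y^* \in Ty}} \sum_{i=0}^n \langle d_n, h_{i+1}^* \rangle. \tag{6.2.30} \]

Proposition 6.2.21 Suppose that $T : X \to 2^X$ is a monotone operator. Then, for all $x$ and $y$ such that $[x, y] \subset \operatorname{dom} T$,
\[ M_T(x, y) = \lim_{n \to +\infty} M_n^-(x, y) = \lim_{n \to +\infty} M_n^+(x, y). \tag{6.2.31} \]

Proof. Consider any $x, y \in \operatorname{dom} T$ such that $[x, y] \subset \operatorname{dom} T$. The lower Riemann integral associated with $M_T(x, y)$ is $L(x, y) := \liminf_{n \to +\infty} L_n(x, y)$ where
\[ L_n(x, y) := \inf_{\substack{h_i = x + i d_n \\ h_i^* \in T(h_i)}} \sup_{x^* \in Tx} \Big( \langle h_1 - x, x^* \rangle + \sum_{i=1}^n \langle h_{i+1} - h_i, h_i^* \rangle \Big), \]
and the upper Riemann integral is $U(x, y) = \limsup_{n \to +\infty} U_n(x, y)$ where
\[ U_n(x, y) := \sup_{\substack{h_i = x + i d_n \\ h_i^* \in T(h_i)}} \inf_{y^* \in Ty} \Big( \sum_{i=0}^{n-1} \langle h_{i+1} - h_i, h_{i+1}^* \rangle + \langle y - h_n, y^* \rangle \Big). \]
We have that $M_n^+(x, y) \le U_n(x, y)$ and $M_n^-(x, y) \ge L_n(x, y)$ for all $n \in \mathbb{N}$. As $M_T(x, y)$ is a Riemann integral, $U(x, y) = L(x, y)$, from which we obtain (6.2.31).

Proposition 6.2.22 Given a monotone operator $T : X \to 2^X$ and points $x, y \in \operatorname{dom} T$ such that $[x, y] \subset \operatorname{dom} T$, the operators $M_n^-$ and $M_n^+$ satisfy the following inequalities:
\[ M_n^-(x, y) \le N_T^-(x, y) \quad \forall n \in \mathbb{N}, \tag{6.2.32} \]
\[ N_T^+(x, y) \le M_m^+(x, y) \quad \forall m \in \mathbb{N}. \tag{6.2.33} \]

Proof. Immediate from Definition 6.2.20 and Definition 6.2.13.

Proposition 6.2.23 Given a monotone operator $T : X \to 2^X$, suppose that for some $x, y \in \operatorname{dom} T$ such that $[x, y] \subset \operatorname{dom} T$,
\[ N_T^-(x, y) \le N_T^+(x, y). \]
Then
\[ M_T(x, y) = N_T^-(x, y) = N_T^+(x, y). \tag{6.2.34} \]

Proof. Apply Proposition 6.2.21 to Proposition 6.2.22.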
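For a single-valued operator, the suprema and infima in Definition 6.2.20 are trivial, and $M_n^-$ and $M_n^+$ reduce to left and right Riemann sums along the segment $[x, y]$. The sketch below computes both for the gradient of $f(z) = \|z\|^2/2$ and checks that they bracket $f(y) - f(x)$, in the spirit of Propositions 6.2.17, 6.2.21 and 6.2.22. The function `M_n_bounds`, the choice of $f$, and the sample points are assumptions made for illustration only.

```python
def dot(u, v):
    """Euclidean inner product of two tuples."""
    return sum(a * b for a, b in zip(u, v))

def f(z):
    """Hypothetical convex function f(z) = ||z||^2 / 2."""
    return 0.5 * dot(z, z)

def grad_f(z):
    """Gradient of f; single-valued, so sup and inf in (6.2.29)-(6.2.30) are trivial."""
    return z

def M_n_bounds(T, x, y, n):
    """Return (M_n^-, M_n^+) for single-valued T: left and right sums with step d_n = (y - x)/(n + 1)."""
    d = tuple((b - a) / (n + 1) for a, b in zip(x, y))
    h = [tuple(a + i * c for a, c in zip(x, d)) for i in range(n + 2)]  # h_0 = x, ..., h_{n+1} = y
    m_minus = sum(dot(d, T(h[i])) for i in range(n + 1))
    m_plus = sum(dot(d, T(h[i + 1])) for i in range(n + 1))
    return m_minus, m_plus

x, y = (1.0, 0.0), (0.0, 2.0)
target = f(y) - f(x)
m_minus, m_plus = M_n_bounds(grad_f, x, y, 999)
print(m_minus, target, m_plus)
```

In this example the gap $M_n^+ - M_n^-$ equals $\|y - x\|^2/(n + 1)$, so both sums approach $f(y) - f(x) = 1.5$ as $n$ grows.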
Remark 6.2.24 Note that by Proposition 6.2.18, the conditions for Proposition 6.2.23 can only occur if the monotone operator $T$ is cyclical monotone, such as a subgradient of a proper, lower semicontinuous convex function. If $T$ is maximal monotone, such subgradients are the only cyclical monotone operators by Fact 3.2.7.

Corollary 6.2.25 Given a proper convex function $f : X \to \mathbb{R} \cup \{+\infty\}$, and points $x, y \in X$ such that $[x, y] \subset \operatorname{dom} \partial f$, one has
\[ M_{\partial f}(x, y) = N_{\partial f}^-(x, y) = N_{\partial f}^+(x, y) = f(y) - f(x). \tag{6.2.35} \]

Proof. Apply Proposition 6.2.17 to Proposition 6.2.23.

6.3  The saddle function $W_T$

In this section, we introduce a new saddle function representation for monotone operators that is easier to evaluate and numerically approximate by basing it on $M_T$. Specifically, Theorem 6.3.5 demonstrates that given a monotone operator $T : X \to 2^X$ with convex domain, the operator $W_T$ from Definition 6.3.1 below is a skew-symmetric saddle function that is Lagrangian to $T$ if $T$ is maximal monotone, and otherwise Lagrangian to a monotone extension of $T$, as long as $M_T(x, \cdot)$ is convex for all $x \in \operatorname{dom} T$. Furthermore, if $T$ is a subdifferential of a proper lower semicontinuous convex function, we obtain a canonical saddle function for $T$, an advantage over Krauss' saddle function.

In the next chapter, we shall show that the results below allow us to decompose the monotone operator $T$ into cyclic and acyclic parts (the Borwein-Wiersma decomposition) if $T$ is Borwein-Wiersma decomposable (Definition 7.3.4). In this context, $\hat{W}_T$ is defined here, where the concept of the convex kernel, or cern, is examined in Section 7.1 below.

Definition 6.3.1 ($W_T$, $\hat{W}_T$) Given a monotone operator $T : X \to 2^X$, define the operators $W_T, \hat{W}_T : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ by
\[ W_T(x, y) := \begin{cases} M_T(x, y), & [x, y] \subset \operatorname{dom} T, \\ +\infty, & x \in \operatorname{dom} T,\ [x, y] \not\subset \operatorname{dom} T, \\ -\infty, & x \notin \operatorname{dom} T, \end{cases} \tag{6.3.1} \]
and
\[ \hat{W}_T(x, y) := \begin{cases} M_T(x, y), & x \in \operatorname{cern} \operatorname{dom} T,\ y \in \operatorname{cern} \operatorname{dom} T, \\ +\infty, & x \in \operatorname{cern} \operatorname{dom} T,\ y \notin \operatorname{cern} \operatorname{dom} T, \\ -\infty, & x \notin \operatorname{cern} \operatorname{dom} T, \end{cases} \tag{6.3.2} \]
where $\operatorname{cern} \operatorname{dom} T$ denotes the convex kernel of $\operatorname{dom} T$.

Proposition 6.3.2 Given a monotone operator $T : X \to 2^X$,
\[ \hat{W}_T = W_{\hat{T}} \tag{6.3.3} \]
where the monotone operator $\hat{T} : X \to 2^X$ is defined by
\[ \hat{T} x := \begin{cases} Tx, & x \in \operatorname{cern} \operatorname{dom} T, \\ \emptyset, & x \notin \operatorname{cern} \operatorname{dom} T, \end{cases} \tag{6.3.4} \]
and where, if $\operatorname{dom} T$ is not starshaped, $\hat{W}_T$ is identically $-\infty$.

Proof. Clearly, $\operatorname{dom} \hat{T} = \operatorname{dom} T \cap \operatorname{cern} \operatorname{dom} T = \operatorname{cern} \operatorname{dom} T$. Since the convex kernel of any set is convex, $\operatorname{dom} \hat{T}$ is convex, and so $[x, y] \subset \operatorname{dom} \hat{T}$ if and only if $x \in \operatorname{dom} \hat{T}$ and $y \in \operatorname{dom} \hat{T}$.

Proposition 6.3.3 Suppose $T : X \to 2^X$ is a monotone operator. Then, $W_T$ is a saddle function (see Definition 6.1.1) if and only if $\operatorname{dom} T$ is convex and $M_T(x, y)$ is convex in $y$ for every $x \in \operatorname{dom} T$.

Proof. We first demonstrate the necessity. Suppose that $W_T$ is a saddle function, and $x, y \in \operatorname{dom} T$ but for some $\lambda \in \,]0, 1[$, $z := \lambda x + (1 - \lambda) y \notin \operatorname{dom} T$. Then, $W_T(x, y) = +\infty$, $W_T(z, y) = -\infty$, and $W_T(y, y) = M_T(y, y) = 0$. Clearly, $W_T(z, y) < \lambda W_T(x, y) + (1 - \lambda) W_T(y, y)$, and so $W_T(\cdot, y)$ is not concave, a contradiction. Therefore, $W_T$ is a saddle function only if $\operatorname{dom} T$ is convex.

Suppose now that there is an $x \in \operatorname{dom} T$ such that $M_T(x, y)$ is not convex in $y$ on $\operatorname{dom} T$. Clearly then, $W_T(x, y)$ is not a saddle function as it is not convex in $y$.

Now, suppose that $\operatorname{dom} T$ is convex and that for every $x \in \operatorname{dom} T$, $M_T(x, y)$ is convex in $y$. Let $x \in X$ and let $y = \lambda y_1 + (1 - \lambda) y_2$, where $\lambda \in [0, 1]$ and $y_1, y_2 \in X$. If $x \notin \operatorname{dom} T$, then clearly
\[ W_T(x, y) = \lambda W_T(x, y_1) + (1 - \lambda) W_T(x, y_2) = -\infty, \]
and if $x \in \operatorname{dom} T$, but either $[x, y_1] \not\subset \operatorname{dom} T$ or $[x, y_2] \not\subset \operatorname{dom} T$, then
\[ W_T(x, y) \le \lambda W_T(x, y_1) + (1 - \lambda) W_T(x, y_2) = +\infty. \]
Suppose now that $x, y_1, y_2 \in \operatorname{dom} T$, and since $\operatorname{dom} T$ is convex, $y \in \operatorname{dom} T$. Since $M_T(x, y)$ is convex in $y$, $W_T(x, y)$ is convex in $y$.

To demonstrate that $W_T(x, y)$ is concave in $x$, let $y \in X$ and let $x = \lambda x_1 + (1 - \lambda) x_2$ where $\lambda \in [0, 1]$ and $x_1, x_2 \in X$. If either $x_1 \notin \operatorname{dom} T$ or $x_2 \notin \operatorname{dom} T$, then clearly
\[ W_T(x, y) \ge \lambda W_T(x_1, y) + (1 - \lambda) W_T(x_2, y) = -\infty. \]
Henceforth, suppose that $x_1 \in \operatorname{dom} T$ and $x_2 \in \operatorname{dom} T$, and so by the convexity of the domain of $T$, $x \in \operatorname{dom} T$. If $y \notin \operatorname{dom} T$, then by Definition 6.3.1 all three values are $+\infty$, so
$$W_T(x, y) = \lambda W_T(x_1, y) + (1 - \lambda) W_T(x_2, y) = +\infty.$$
We are left with the case where $x, y, x_1, x_2 \in \operatorname{dom} T$. From Proposition 6.2.4, $M_T(x, y) = -M_T(y, x)$, and so together with the fact established above that $M_T(x, y)$ is convex in $y$, we have that $M_T(x, y)$, and hence $W_T(x, y)$, are both concave in $x$.

Corollary 6.3.4 The operator $\hat{W}_T$ is a saddle function if and only if for every $x \in \operatorname{cern} \operatorname{dom} T$, the function $M_T(x, y)$ is convex in $y$ on $\operatorname{cern} \operatorname{dom} T$.

Proof. Apply Proposition 6.3.2 to Proposition 6.3.3.

Theorem 6.3.5 Given a monotone operator $T : X \to 2^X$ such that $W_T$ is a saddle function, then $W_T$ is a skew-symmetric saddle function Lagrangian to the monotone operator $T_W : X \to 2^X$, where
$$T_W(x) := \partial W_T(x, \cdot)(x), \tag{6.3.5}$$
and $\operatorname{gra} T \subset \operatorname{gra} T_W$ and $\operatorname{dom} T = \operatorname{dom} T_W$.

Proof. By Proposition 6.3.3, $\operatorname{dom} T$ is convex. Therefore, from Theorem 6.2.1 we have for all $x, y \in \operatorname{dom} T$ that $M_T(x, y) = -M_T(y, x)$. By definition, $W_T(x, y) = -W_T(y, x)$ when $x \in \operatorname{dom} T$ and $y \notin \operatorname{dom} T$, or when $x \notin \operatorname{dom} T$ and $y \in \operatorname{dom} T$. When $x \notin \operatorname{dom} T$ and $y \notin \operatorname{dom} T$, recalling Definition 6.1.3 we know that $\operatorname{cl}_2 W_T(x, y) = -\infty$ and $\operatorname{cl}_1 W_T(x, y) = +\infty$, as there exists an $x_1$ such that $W_T(x_1, y) = +\infty$. Hence, for all $x, y \in X$, $\operatorname{cl}_2 W_T(x, y) = -\operatorname{cl}_1 W_T(x, y)$, and so $W_T$ is a skew-symmetric saddle function.

From Theorem 6.2.1 we have that
$$\sup_{x^* \in Tx} \langle y - x, x^* \rangle \le M_T(x, y).$$
Hence, for all $(x, x^*) \in \operatorname{gra} T$, and for all $h \in X$ such that $x + h \in \operatorname{dom} T$,
$$\langle h, x^* \rangle \le M_n^{-}(x, x + h) \le M_T(x, x + h) = W_T(x, x + h). \tag{6.3.6}$$
By Definition 6.3.1, for all $(x, x^*) \in \operatorname{gra} T$ and all $h \in X$ such that $x + h \notin \operatorname{dom} T$,
$$\langle h, x^* \rangle \le W_T(x, x + h) = +\infty. \tag{6.3.7}$$
Therefore, by Theorem 6.1.17 we have that $\operatorname{gra} T \subset \operatorname{gra} \tilde{T}_W$, where $\tilde{T}_W$ is defined as $T_L$ is in Definition 6.1.13 with $L = W_T$, that is,
$$\tilde{T}_W(x) = \{x^* \in X : x^* \in \partial W_T(x, \cdot)(x),\ x^* \in \partial(-W_T(\cdot, x))(x)\}.$$
(We say that $W_T$ is the saddle function Lagrangian to $T_W$.) By Theorem 6.1.18, $\tilde{T}_W$ is a monotone extension of $T$. It remains to show that $\tilde{T}_W = T_W$. Clearly, $\operatorname{gra} \tilde{T}_W \subset \operatorname{gra} T_W$. Let $(x, x^*) \in \operatorname{gra} T_W$. Then, by definition, $x \in \operatorname{dom} T$, as $\partial(x \mapsto -\infty) = \emptyset$. Since we have established that for $x \in \operatorname{dom} T$ and for all $y \in X$, $W_T(x, y) = -W_T(y, x)$, it follows that $x^* \in \partial(-W_T(\cdot, x))(x)$, and so $\operatorname{gra} \tilde{T}_W \supset \operatorname{gra} T_W$.

Corollary 6.3.6 If a maximal monotone operator $T : X \to 2^X$ is such that $W_T$ is skew-symmetric, then
$$Tx = \partial W_T(x, \cdot)(x)$$
for all $x \in \operatorname{dom} T$.

Corollary 6.3.7 Given a monotone operator $T : X \to 2^X$, $W_T$ is a skew-symmetric saddle function if and only if $\operatorname{dom} T$ is convex and $M_T(x, \cdot)$ is a convex function for all $x \in \operatorname{dom} T$.

Proof. Apply Proposition 6.3.3 to Theorem 6.3.5.

One particular advantage of $W_T$ over $L_T$ (from Theorem 6.1.22) is that for any subdifferential $\partial f$ with convex domain of a proper lower semicontinuous convex function $f : X \to \mathbb{R} \cup \{+\infty\}$, $W_{\partial f}(x, y) = f(y) - f(x)$ for $x, y \in \operatorname{dom} \partial f$, and $W_{\partial f}$ is a canonical skew-symmetric saddle function for $\partial f$. On the other hand, $L_{\partial f}$ can be much more complex, and is much more difficult to calculate. The following example demonstrates this issue.

Example 6.3.8 Define the proper and continuous convex function $f : \mathbb{R} \to \mathbb{R}$ by
$$f(x) := \begin{cases} \tfrac{1}{2} x^2, & x \ge 0, \\ -x, & x < 0, \end{cases} \tag{6.3.8}$$
and define $H_{\partial f}$ as in Definition 6.1.20 and $L_{\partial f}$ as in Theorem 6.1.22.
Then,  √  x − xy + 2y −x,      y 2 − xy,     −y, H∂f (x, y) =  x − y,      y 2 − xy,     −y − (x−1)2 , 4  and since L∂f (x, y) =                                   1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2  1 2  y > 0, x < −y 2 ,  −y 2 ≤ x < 1, y > 0,  0 ≤ x < 1, y < 0, x < 0, y < 0,  x ≥ 1, y >  x ≥ 1, y ≤  (6.3.9)  x−1 2 , x−1 2 ,  (H∂f (x, y) − H∂f (y, x)), the value of L∂f (x, y) is  √ x < −y 2 , 0 ≤ y < 1, 2x − xy + 2y −x , √ 2x − xy + 2y −x + 41 (y − 1)2 , x < −y 2 , x < 12 (y − 1), y ≥ 1, y 2 − x2 ,  0 ≤ x < 1, 0 ≤ y < 1,  y 2 − x2 ,  x < 1, x ≥ −y 2 , x ≥ 21 (y − 1), y ≥ 1  y 2 − xy + x ,  −y 2 ≤ x < 0, 0 ≤ y < 1,  y 2 − xy + x + 14 (y − 1)2 , √ [xy − 2y − 2x −y] ,  x < 1, −y 2 ≤ x ≤ 21 (y − 1), y ≥ 1,  0 ≤ x < 1, y < −x2  xy − x2 − y , −x2 ≤ y < 0, 0 ≤ x < 1,     x − y, x < 0, y < 0,    1 2 2   x ≥ 1, y ≥ 1, x ≥ 21 (y − 1), y ≥ 12 (x − 1),  2 y −x ,   1 1 2 2   1 ≤ x < 12 (y − 1), y ≥ 12 (x − 1), y ≥ 1,  2 y − xy + x + 4 (y − 1) ,  √  1 1 1 2 2    2 xy − 2y − 2x −y − 4 (x − 1) , x ≥ 1, y < −x , y < 2 (x − 1),   1 1 2 2   x ≥ 1, y < 1, −x2 ≤ y ≤ 21 (x − 1),  2 xy − x − y − 4 (x − 1) ,   1 2 2   x ≥ 1, x−1  2 y −x , 2 < y < 1,   1 1 2 2 1 ≤ y < 21 (x − 1), x ≥ 12 (y − 1), x ≥ 1. 2 xy − y − x − 4 (x − 1) ,  (6.3.10)  However, the new saddle function yields the canonical saddle function, since  W∂f (x, y) = f (y) − f (x) =  6.3.1         1 2 2y 1 2 2y  − 21 x2 , x ≥ 0, y ≥ 0, + x,   −y − 21 x2 ,     x − y,  x < 0, y ≥ 0,  x ≥ 0, y < 0,  (6.3.11)  x < 0, y < 0.  Applications to equilibrium problems  An equilibrium problem is, given a nonempty convex set C, to find a point x ¯ ∈ C such that,  for a given bifunction F : X × X → R  {−∞, +∞},  F (¯ x, y) ≥ 0  for all y ∈ C.  (6.3.12)  125  17.5  15.0  12.5  15  10.0  10  7.5  5  5.0  0 2  2.5  1 3  0  2 1  −1  0 x2  x1  0.0  −2  −1 −2  −3  −2.5  Figure 6.1: The saddle function H∂f from Example 6.3.8, with f as defined in (6.3.8). 
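To make the comparison in Example 6.3.8 concrete, the following minimal Python sketch evaluates $W_{\partial f}(x, y) = f(y) - f(x)$ for the function $f$ of (6.3.8), compares it with the four-case formula (6.3.11), and then uses $F = W_{\partial f}$ as the bifunction in the equilibrium problem (6.3.12). The sampled interval $C = [-2, 3]$, the grid spacing, and all function names are illustrative assumptions rather than part of the thesis.

```python
def f(x):
    """The convex function of (6.3.8): x^2/2 for x >= 0, and -x for x < 0."""
    return 0.5 * x * x if x >= 0 else -x


def W(x, y):
    """W_{df}(x, y) = f(y) - f(x); here dom(df) is all of R, so no infinite cases."""
    return f(y) - f(x)


def W_piecewise(x, y):
    """The four cases of (6.3.11), written out explicitly."""
    if x >= 0 and y >= 0:
        return 0.5 * y * y - 0.5 * x * x
    if x < 0 and y >= 0:
        return 0.5 * y * y + x
    if x >= 0 and y < 0:
        return -y - 0.5 * x * x
    return x - y


# Hypothetical sample of C = [-2, 3] with step 1/4.
grid = [k / 4 for k in range(-8, 13)]

# The closed form and the piecewise form agree, and W is skew-symmetric.
formulas_agree = all(
    abs(W(x, y) - W_piecewise(x, y)) < 1e-12 for x in grid for y in grid
)
skew_symmetric = all(abs(W(x, y) + W(y, x)) < 1e-12 for x in grid for y in grid)

# Grid points x_bar with W(x_bar, y) >= 0 for every sampled y, as in (6.3.12).
equilibria = [x_bar for x_bar in grid if all(W(x_bar, y) >= 0 for y in grid)]

print(formulas_agree, skew_symmetric, equilibria)
```

On this grid the only equilibrium point is the minimizer $x = 0$ of $f$, which is what one expects when $F(x, y) = f(y) - f(x)$.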
Figure 6.2: The Krauss saddle function $L_{\partial f}$ from Example 6.3.8, with $f$ as defined in (6.3.8).

Figure 6.3: The saddle function $W_{\partial f}$ from Example 6.3.8, with $f$ as defined in (6.3.8).

Equilibrium problems can model various other optimization problems such as convex minimization, where $F(x, y) = f(y) - f(x)$, and variational inequality problems, where $F(x, y) = \sup_{x^* \in Tx} \langle y - x, x^* \rangle$ (see [14] for a more complete exposition). By (6.2.3), $M_T(x, y) \ge \sup_{x^* \in Tx} \langle y - x, x^* \rangle$, so on any convex set $C \subset \operatorname{dom} T$, solving the equilibrium problem for $M_T$ solves the variational inequality.

Since the saddle function $W_T$ is different from the Krauss saddle function, and as it is Riemann integrable, it has the potential to inspire alternate methods for solving these and other problems. Indeed, the Borwein-Wiersma decomposition explored below using $W_T$ may have further applications in the line of current splitting methods used to obtain the solution of equilibrium problems (see for instance [32]). This is a promising avenue for further research.

Chapter 7

Borwein-Wiersma Decompositions

In this chapter we examine monotone operators of the form $T = \partial f + A$, where $f$ is a proper lower semicontinuous convex function and $A$ is a skew linear relation, meaning that for all $x \in \operatorname{dom} A$ we have $\langle x, Ax \rangle = 0$. These monotone operators are called Borwein-Wiersma decomposable, and the decomposition $T = \partial f + A$ is referred to as a Borwein-Wiersma decomposition. Throughout, we will examine the nonuniqueness of $f$ and $A$, isolate desirable properties of $f$ and $A$, describe some properties of these operators, and eventually outline constructive methods for obtaining any of $f$, $\partial f$, and $A$ directly.
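The single-valued linear case on $\mathbb{R}^n$ gives a quick feel for such a decomposition. As a hedged illustration (the matrix `M` and the helper names below are hypothetical, chosen only for this sketch), write $Tx = Mx$ with $M = S + K$, where $S = \tfrac{1}{2}(M + M^\top)$ and $K = \tfrac{1}{2}(M - M^\top)$. If $S$ is positive semidefinite, then $f(x) = \tfrac{1}{2}\langle x, Sx \rangle$ is convex with $\nabla f(x) = Sx$, and $K$ is skew, so $T = \partial f + A$ with $A = K$.

```python
def split(M):
    """Return the symmetric part S and the skew part K of a square matrix M."""
    n = len(M)
    S = [[0.5 * (M[i][j] + M[j][i]) for j in range(n)] for i in range(n)]
    K = [[0.5 * (M[i][j] - M[j][i]) for j in range(n)] for i in range(n)]
    return S, K


def matvec(M, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical monotone linear operator: its symmetric part is positive definite.
M = [[2.0, 1.0], [-3.0, 4.0]]
S, K = split(M)

x = [1.5, -0.5]
grad_f = matvec(S, x)      # gradient of f(x) = <x, Sx>/2
skew_part = matvec(K, x)   # the skew term Ax

# <x, Kx> vanishes for every x, and the two parts add back up to Mx.
skew_defect = dot(x, skew_part)
recombined = [g + s for g, s in zip(grad_f, skew_part)]

print(S, K, skew_defect, recombined, matvec(M, x))
```

This is only the classical symmetric/skew splitting of a matrix; the chapter treats the far more general setting in which $\partial f$ and $A$ may be set-valued and $\operatorname{dom} T$ need not be the whole space.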
We start with a discussion of the notions of the convex kernel and starshaped sets in Section 7.1, followed by various notions of relative interior and convexity, and how they relate to monotone operators, in Section 7.2. Following this, we define Borwein-Wiersma decomposable operators, and greatly expand on the theory and properties of these operators. We define clean and essential decompositions, and later show that the existence of a clean decomposition is crucial for precisely decomposing an operator. To further explore the properties of these operators, we define extended and standardized decompositions in Section 7.4.

We then demonstrate how to constructively decompose a Borwein-Wiersma decomposable operator using the function $M_T$ introduced in Section 6.2. The results of this are also applied later in Section 7.9, showing how existing results for the decomposition of linear relations can also be obtained from both $M_T$ and the Krauss saddle function. Using $M_T$ directly, it is possible to decompose Borwein-Wiersma decomposable operators as long as the domain of $T$ is starshaped. In Section 7.6, this is done to directly calculate the function of the subdifferential and the effect of the linear relation at a certain point and in a given direction. More generally, the various methods explored in Section 7.7 are able to recover much of a decomposition, and to do so exactly given sufficient conditions (usually relying on the existence of a clean decomposition).

7.1 Starshaped sets and their properties

Here we define starshaped sets; starshapedness will be an important condition later on for the domains of monotone operators $T$.

Definition 7.1.1 (starshaped sets) A set $D$ in a real vector space $Y$ is starshaped if there exists an $x \in D$ such that for all $y \in D$ the line segment joining the two lies entirely in $D$. That is,
$$\exists x \in D \ \text{ s.t. } \ \forall y \in D, \quad [x, y] \subset D. \tag{7.1.1}$$
The set of points $x \in D$ for which (7.1.1) is satisfied is called the convex kernel of $D$, denoted $\operatorname{cern} D$.
Hence, $D$ is starshaped if and only if $\operatorname{cern} D \ne \emptyset$. By the term "$D$ is starshaped at $x$" we mean that $D$ is starshaped and $x \in \operatorname{cern} D$.

Fact 7.1.2 A given nonempty set $C$ in a real vector space $Y$ is convex if and only if $\operatorname{cern} C = C$.

If a monotone operator $T : X \to 2^X$ has starshaped domain, then the operator $M_T(x, y)$ (see Section 6.2) is well defined as long as either $x$ or $y$ is in the convex kernel of $\operatorname{dom} T$. This will be important for the decomposition of $T$ obtained by the methods below. Below are listed some facts about the relationship between starshaped and convex sets.

Definition 7.1.3 (maximally convex/starshaped subset) Given a set $D$ in a real vector space $Y$ (and a point $x \in D$), a subset $M \subset D$ is called maximally convex (or maximally starshaped, or maximally starshaped at $x$) if there is no convex (or starshaped, or starshaped at $x$) set $C \subset D$ such that $M \subsetneq C$.

Fact 7.1.4 [73] Given a set $D$ in a real vector space $Y$, then
$$\operatorname{cern} D = \bigcap \{M : M \subset D,\ M \text{ is maximally convex}\}. \tag{7.1.2}$$

Fact 7.1.5 [73] Given a set $D$ in a real vector space $Y$, then $\operatorname{cern} D$ is a convex set.

Proof. Follows directly from Fact 7.1.4.

Fact 7.1.6 [21] Given a set $D$ in a real vector space $Y$ (and a point $x \in D$), every nonempty convex (or starshaped at $x$) set $E \subset D$ has a maximally convex (or maximally starshaped at $x$) superset $M \subset D$.

Remark 7.1.7 The distinction between maximally starshaped and maximally starshaped at $x$ is an important one. As demonstrated in [43], the set $D \subset \mathbb{R}^2$ defined by
$$D := \{(x, y) : x \ge \lceil y \rceil - y\}, \tag{7.1.3}$$
where $\lceil y \rceil = \min\{z \in \mathbb{Z} : z \ge y\}$ is the ceiling function, has no maximally starshaped subset; that is, for any starshaped set $E \subset D$, there is always another starshaped set $F \subset D$ such that $E \subsetneq F$.

Fact 7.1.8 [21] Given a set $D$ in a real vector space $Y$, then $D$ is convex if and only if there exists a unique maximally convex set $M \subset D$ (namely, $M = D$).

7.2 Notions of relative interior and convexity

Not every monotone operator has a convex domain.
In this section, we examine maximal monotone operators, which, in a Hilbert space, have almost convex domains in finite dimensions [54], and virtually convex domains otherwise [67]. Definitions of relative interior are expanded upon from [17]. Relative interior here (ri) is as defined in [7], and corresponds to the concepts called pseudo relative interior in [17] and intrinsic core in [41].

Definition 7.2.1 (relative interior (ri)) Given a convex set $C \subset X$, a point $x \in C$ is in the relative interior of $C$, denoted by $x \in \operatorname{ri} C$, if $x \in C$ and
$$\operatorname{cone}(C - x) = \operatorname{span}(C - x).$$

Fact 7.2.2 [17] Given a convex set $C \subset X$, $x \in \operatorname{ri} C$ if and only if $x \in C$ and
(i) $\operatorname{cone}(C - x)$ is a linear subspace of $X$;
(ii) for all $y \in C \setminus \{x\}$, there exists a $z \in C$ such that $x \in \,]y, z[$;
(iii) for all $y \in C \setminus \{x\}$, there exists a $\lambda > 1$ such that $\lambda x + (1 - \lambda) y \in C$.

The notion of relative interior and its generalizations below will often be applied in the manner of the following fact.

Fact 7.2.3 Let $C \subset X$ be a nonempty convex set. Then, for all $x \in \operatorname{ri} C$,
$$N_C(x) = (\partial \iota_C)(x) = (C - x)^{\perp},$$
where $N_C(x)$ is the normal cone to $C$ at $x$ and $\iota_C$ is the indicator function for $C$.

Proof. By definition, for all $x \in C$,
$$N_C(x) := \{x^* \in X : \langle y - x, x^* \rangle \le 0 \text{ for all } y \in C\} = (\partial \iota_C)(x).$$
Therefore, $(C - x)^{\perp} \subset \partial \iota_C(x)$. Suppose that $x \in \operatorname{ri} C$ and that $x^* \in \partial \iota_C(x)$. If there is a $y \in C$ such that $\langle y - x, x^* \rangle < 0$, then by Fact 7.2.2(iii), there is a $\lambda > 1$ such that $\lambda x + (1 - \lambda) y \in C$. Therefore,
$$\langle \lambda x + (1 - \lambda) y - x, x^* \rangle = (1 - \lambda) \langle y - x, x^* \rangle > 0,$$
a contradiction of $x^* \in \partial \iota_C(x)$. Hence, $\langle y - x, x^* \rangle = 0$ for all $y \in C$, and so $x^* \in (C - x)^{\perp}$.

Fact 7.2.4 [81] Given a proper convex function $f : X \to \mathbb{R} \cup \{+\infty\}$,
$$\operatorname{dom} \partial f \supset \operatorname{ri} \operatorname{dom} f. \tag{7.2.1}$$

Definition 7.2.5 (almost convex) A set $C \subset X$ is said to be almost convex if the interior of $\operatorname{conv} C$ with respect to $\operatorname{Aff} C$ is a subset of $C$, that is,
$$\operatorname{int}_{\operatorname{Aff} C} \operatorname{conv} C \subset C, \tag{7.2.2}$$
or more succinctly,
$$\operatorname{ri}(\operatorname{conv} C) \subset C. \tag{7.2.3}$$

Theorem 7.2.6 [54] For a maximal monotone operator $T : X \to 2^X$, the set $\operatorname{dom} T$ is almost convex.
Definition 7.2.7 (virtually convex) A set $C \subset X$ is said to be virtually convex if for any relatively compact (in the norm) subset $K \subset \operatorname{conv} C$, and for any $\varepsilon > 0$, there is a (strongly) continuous mapping $\varphi : K \to C$ such that, for every $x \in K$,
$$\|\varphi(x) - x\| \le \varepsilon. \tag{7.2.4}$$

Fact 7.2.8 [67] Given a finite dimensional Hilbert space $X$, any set $C \subset X$ is almost convex if and only if it is virtually convex.

Theorem 7.2.9 [67] Given a maximal monotone operator $T : X \to 2^X$, both sets $\operatorname{dom} T$ and $\operatorname{ran} T$ are virtually convex.

Definition 7.2.10 (nearly convex) Given a set $C \subset X$, $C$ is said to be nearly convex if its closure $\overline{C}$ is convex.

Borwein and Yao [20] provide a good summary of sufficient conditions for the domain of a maximal monotone operator to be nearly convex. It remains an open problem as to whether this is always the case.

Theorem 7.2.11 [63] Given a maximal monotone operator $T : X \to 2^X$, $\operatorname{dom} T$ is nearly convex if $\operatorname{int} \operatorname{dom} T \ne \emptyset$.

As the domains of maximal monotone operators need not be convex, the notion of encompassing relative interior is introduced below, relaxing the requirement that the interior be a subset of the original set. Since the set might not be convex, we consider the convex cone instead of the simple cone. Recall that if $C$ is convex, $\operatorname{conv} \operatorname{cone}(C) = \operatorname{cone}(C)$, or more generally for any set $D$ in a vector space, $\operatorname{conv} \operatorname{cone}(D) = \operatorname{cone} \operatorname{conv}(D)$ (see for instance Proposition 6.2 in [7]).

Definition 7.2.12 (encompassing relative interior) A point $x \in X$ is said to be in the encompassing relative interior of a set $D \subset X$, denoted by $x \in \operatorname{eri} D$, if
$$\operatorname{conv} \operatorname{cone}(D - x) = \operatorname{span}(D - x). \tag{7.2.5}$$

Fact 7.2.13 For any set $D \subset X$,
$$\operatorname{eri} D = \operatorname{ri} \operatorname{conv} D. \tag{7.2.6}$$

Proof. Let $D \subset X$, and let $x \in X$. The result follows immediately since $\operatorname{conv} \operatorname{cone}(D - x) = \operatorname{cone} \operatorname{conv}(D - x) = \operatorname{cone}(\operatorname{conv} D - x)$ and since $\operatorname{span}(D - x) = \operatorname{span}(\operatorname{conv} D - x)$.
7.3  Borwein-Wiersma decomposable monotone operators  As remarked in Chapter 1, much of monotone operator theory has been motivated by the theory of subdifferentials of convex functions. Indeed, subdifferentials of proper lower semicontinuous convex functions satisfy the properties of maximal, cyclical, 3∗ -, and paramonotonicity examined in that chapter. As such, it might be worthwhile to extract from a monotone operator the part of it which is like a subdifferential, and to decompose the operator into a subdifferential and a remainder term. Using the terminology of [19], we call this remainder term, if it is monotone, an acyclic operator. Definition 7.3.1 (acyclic) A monotone operator T : X → 2X is said to be acyclic if every  choice of proper lower semicontinuous convex function, f : X → R {+∞} and monotone operator A : X → 2X satisfying T = ∂f + A must satisfy for all x ∈ dom ∂f expression ∂f (x) = k for some constant k ∈ X.  dom A the  In [1], Asplund demonstrated under certain conditions, monotone operators can be decomposed into the sum of a subdifferential and a monotone acyclic operator. However, this result was an existence result only, and required the use of Zorn’s lemma.  134  Theorem 7.3.2 (Asplund) Given a monotone operator T : Y → 2Y  space is such that the weak* closure of dom T  Y  ∗  ∗  where Y is a Banach  has a nonempty (strong) interior, and is  strong-to-strong continuous at a point in this interior, then there is a proper lower semicontinuous convex f : Y → R  {+∞} and an acyclic operator A such that T = ∂f + A.  Remark 7.3.3 Note that the conditions for Theorem 7.3.2 are satisfied if Y is reflexive and T is locally bounded at some point in int dom T . It could be hoped that acyclic operators have some desirable property that allows for easier analysis, however no such property has yet been identified. 
While it is true that all linear skew operators are acyclic, Borwein and Wiersma [19] give an example of a nonlinear acyclic operator in $\mathbb{R}^2$, so any expectation of the converse would be unfounded.

For practical purposes, we will consider those operators which can be decomposed into the sum of a subdifferential and a skew linear relation. After exploring some properties of these operators, we proceed in the next section to demonstrate constructively how such operators can be decomposed into their component parts. Doing so will open new avenues of optimization theory and application, much in the way the theory of decomposing linear operators into skew and symmetric parts has done in the past.

There may be many Borwein-Wiersma decompositions for the same operator, so we characterize different decompositions from the perspective of their domains as being clean (or non clean) and essential (or nonessential). Proposition 7.3.14 below demonstrates that all Borwein-Wiersma decomposable operators have an essential decomposition, and Proposition 7.8.1 provides some sufficient conditions for an operator to have a clean decomposition.

Definition 7.3.4 (Borwein-Wiersma decompositions) A monotone operator $T : X \to 2^X$ is said to be Borwein-Wiersma decomposable if there exists a proper lower semicontinuous convex function $f : X \to \mathbb{R} \cup \{+\infty\}$ and a monotone linear relation $A : X \to 2^X$ that is skew (that is, $\langle x, Ax \rangle = 0$ for all $x \in \operatorname{dom} A$), such that
$$T = \partial f + A. \tag{7.3.1}$$
The statement "$T = \partial f + A$ is a Borwein-Wiersma decomposition of $T$" is taken to mean that the operators $T$, $f$, and $A$ satisfy the conditions above. Furthermore, if $\operatorname{span}(\operatorname{dom} \partial f) \supset \operatorname{dom} A$, then (7.3.1) is referred to as an essential decomposition, and if $\operatorname{dom}(\partial f) = \operatorname{dom} T$, it is referred to as a clean decomposition.

Remark 7.3.5 In [11], $A$ is taken to be linear, skew, and single-valued, and this decomposition is called the Borwein-Wiersma decomposition.
Here, we consider the more general case where $A$ can be a multivalued linear relation. However, in the general case Borwein-Wiersma decompositions might not be Asplund decompositions, since for instance the skew linear relation $A : \mathbb{R}^2 \to 2^{\mathbb{R}^2}$, where $\operatorname{gra} A = \{(0, x) : x \in \mathbb{R}^2\}$, is not acyclic: define $\tilde{A}$ so that $\operatorname{gra} \tilde{A} = \{(0, 0)\} \subset \operatorname{gra} A$, and so $A = \partial \iota_{\{0\}} + \tilde{A}$. However, given a Borwein-Wiersma decomposition $\partial f + A$, any single-valued selection of $A$ that is also a skew linear relation is acyclic.

For any Borwein-Wiersma decomposable operator $T$, there may exist many Borwein-Wiersma decompositions. Indeed, although there are (at times) infinitely many Borwein-Wiersma decompositions of $T$, given a single choice of a point $z^*$ and a proper lower semicontinuous convex function $h : \mathbb{R} \to \mathbb{R} \cup \{+\infty\}$, the following example enumerates 15 such decompositions. Unlike the examples that appear in [19] and [11], these explicit Borwein-Wiersma decompositions are such that not only is $0 \notin \operatorname{int} \operatorname{dom} T$, but both $0 \notin \operatorname{dom} T$ and $\operatorname{int} \operatorname{dom} T = \emptyset$.

Example 7.3.6 Define the closed and convex set $C$ by
$$C := \{(x_1, x_2, x_3) \in \mathbb{R}^3 : 1 \le x_1 \le 4,\ x_2 = 1,\ x_3 = 0\} \tag{7.3.2}$$
and the monotone operator $T : \mathbb{R}^3 \to 2^{\mathbb{R}^3}$ for $x = (x_1, x_2, x_3) \in \mathbb{R}^3$ by
$$Tx := \begin{cases} \{(y_1, y_2, y_3) : y_1 \le -1,\ y_2, y_3 \in \mathbb{R}\}, & x = (1, 1, 0), \\ \{(-1, y_2, y_3) : y_2, y_3 \in \mathbb{R}\}, & 1 < x_1 < 4,\ x_2 = 1,\ x_3 = 0, \\ \{(y_1, y_2, y_3) : y_1 \ge -1,\ y_2, y_3 \in \mathbb{R}\}, & x = (4, 1, 0), \\ \emptyset, & x \notin C. \end{cases} \tag{7.3.3}$$
Consider further any proper lower semicontinuous convex function $h : \mathbb{R} \to \mathbb{R} \cup \{+\infty\}$ and any point $z^* = (z_1, z_2, z_3) \in \mathbb{R}^3$. For $i \in \{1, 2, 3, 4, 5\}$, define the convex functions $f_i : \mathbb{R}^3 \to \mathbb{R} \cup \{+\infty\}$ for $x = (x_1, x_2, x_3) \in \mathbb{R}^3$ by
$$f_1(x) := \iota_C(x), \tag{7.3.4}$$
$$f_2(x) := \iota_C(x) + \langle x, (2, 1, 0) \rangle, \tag{7.3.5}$$
$$f_3(x) := \iota_C(x) - \langle x, (1, 1, 1) \rangle, \tag{7.3.6}$$
$$f_4(x) := \iota_C(x) + \langle x, z^* \rangle, \tag{7.3.7}$$
$$f_5(x) := \iota_{\{(y_1, y_2) : 1 \le y_1 \le 4,\ y_2 = 1\}}(x_1, x_2) + h(x_3). \tag{7.3.8}$$
For $i \in \{1, 2, \ldots, 6\}$, define the skew linear relations $A_i : \mathbb{R}^3 \to 2^{\mathbb{R}^3}$ by
$$A_1(x_1, x_2, x_3) := \{(-x_2, x_1, 0)\}, \tag{7.3.9}$$
$$A_2(x_1, x_2, x_3) := \{(-3x_2, 3x_1, 0)\}, \tag{7.3.10}$$
$$A_3(x_1, x_2, x_3) := \{(0, 0, 0)\}, \tag{7.3.11}$$
$$A_4(x_1, x_2, x_3) := \{(-(1 + z_1) x_2,\ (1 + z_1) x_1 + z_3 x_3,\ -x_2 z_3)\}, \tag{7.3.12}$$
$$A_5(x_1, x_2, x_3) := \{(-x_2 - x_3,\ x_1,\ x_1)\}, \tag{7.3.13}$$
$$A_6(x_1, x_2, x_3) := \{(0, -x_3, x_2)\}. \tag{7.3.14}$$
Define the subspace $Y \subset \mathbb{R}^3$ by $Y := \operatorname{span} \operatorname{dom} T = \mathbb{R}^2 \times \{0\}$, and for each $i \in \{1, 2, 3, 4\}$, define the skew linear relations $\hat{A}_i : \mathbb{R}^3 \to 2^{\mathbb{R}^3}$ and $\tilde{A}_i : \mathbb{R}^3 \to 2^{\mathbb{R}^3}$ by
$$\tilde{A}_i := A_i|_Y, \tag{7.3.15}$$
$$\operatorname{gra} \hat{A}_i := \operatorname{gra}(A_i|_Y) + \{0\} \times \{(0, 0, y) : y \in \mathbb{R}\}, \tag{7.3.16}$$
so that $\operatorname{dom} \hat{A}_i = \operatorname{dom} \tilde{A}_i = Y$, and, for instance,
$$\hat{A}_4(x_1, x_2, 0) = \{(-x_2 - z_1 x_2,\ x_1 + z_1 x_1,\ y) : y \in \mathbb{R}\}.$$
Then, for each $i \in \{1, 2, 3, 4\}$,
$$T = \partial f_i + \tilde{A}_i \quad \text{and} \quad T = \partial f_i + \hat{A}_i \tag{7.3.17}$$
are clean and essential decompositions. For each $i \in \{1, 2, 3, 4\}$,
$$T = \partial f_i + A_i \tag{7.3.18}$$
is a clean but not essential decomposition, as are both
$$T = \partial f_1 + A_5 \quad \text{and} \quad T = \partial f_3 + A_6. \tag{7.3.19}$$
Also, $T = \partial f_5 + \hat{A}_1$ is an essential decomposition of $T$ that is not clean. Note however that $\partial f_5 + \tilde{A}_1$ is not a Borwein-Wiersma decomposition of $T$, since $\operatorname{gra} T \ne \operatorname{gra}(\partial f_5 + \tilde{A}_1)$.

Figure 7.1: Visualization of the domain of a monotone operator where $0 \notin \operatorname{Aff} \operatorname{dom} T$, such as the operator $T$ from Example 7.3.6. Also displayed are the orthogonal subspaces $V$ and $W$, defined for any choice of $z \in \operatorname{dom} T$ by $V := \operatorname{span}(\operatorname{dom} T - z)$ and $W = \operatorname{span}(P_{(\operatorname{dom} T - z)^{\perp}} z)$, and where $V, W \subset \operatorname{span} \operatorname{dom} T$.

Note in the above example that at the endpoints of the domain, the image $Ax$, whatever the decomposition $T = \partial f + A$, is largely lost within the normal cone of the domain of $\partial f$ (or of $T$). As we shall see below, any attempt to decompose a Borwein-Wiersma decomposable operator will likely require some concept of an interior, be it relative interior or otherwise, in order to directly isolate the skew linear relation component.
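The claims of Example 7.3.6 can be spot-checked numerically. The sketch below is a minimal illustration under stated assumptions: the value chosen for $z^*$, the random sample, and all variable names are hypothetical. It checks that each $A_j$ in (7.3.9)-(7.3.14) is skew, and that on the relative interior of $C$, where the normal cone of $C$ is $\{0\} \times \mathbb{R} \times \mathbb{R}$, the first component of $\partial f_i + A_j$ equals $-1$ for the pairs in (7.3.18) and (7.3.19), as (7.3.3) requires.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical choice of the free point z* = (z1, z2, z3) used by f4 and A4.
z_star = (0.5, -2.0, 3.0)
z1, z2, z3 = z_star

# Single-valued skew maps A1..A6 from (7.3.9)-(7.3.14).
A = {
    1: lambda x: (-x[1], x[0], 0.0),
    2: lambda x: (-3.0 * x[1], 3.0 * x[0], 0.0),
    3: lambda x: (0.0, 0.0, 0.0),
    4: lambda x: (-(1.0 + z1) * x[1], (1.0 + z1) * x[0] + z3 * x[2], -x[1] * z3),
    5: lambda x: (-x[1] - x[2], x[0], x[0]),
    6: lambda x: (0.0, -x[2], x[1]),
}

# Linear tilts c_i, where f_i = iota_C + <., c_i> for i = 1..4.
tilt = {1: (0.0, 0.0, 0.0), 2: (2.0, 1.0, 0.0), 3: (-1.0, -1.0, -1.0), 4: z_star}

# The (f_i, A_j) pairs listed in (7.3.18) and (7.3.19).
pairs = [(1, 1), (2, 2), (3, 3), (4, 4), (1, 5), (3, 6)]

# Skewness: <x, A_j x> should vanish for every x in R^3.
random.seed(0)
samples = [tuple(random.uniform(-5.0, 5.0) for _ in range(3)) for _ in range(100)]
max_skew_defect = max(abs(dot(x, A[j](x))) for j in A for x in samples)

# On the relative interior of C only the first component of c_i + A_j(x)
# is determined; the other two are absorbed by the normal cone {0} x R x R.
interior = [(1.0 + 3.0 * k / 10.0, 1.0, 0.0) for k in range(1, 10)]
first_components = [tilt[i][0] + A[j](x)[0] for (i, j) in pairs for x in interior]

print(max_skew_defect, set(round(c, 12) for c in first_components))
```

The check covers only the relative interior of $C$; at the endpoints $(1, 1, 0)$ and $(4, 1, 0)$ the normal cone is larger, which is where the half-space conditions on $y_1$ in (7.3.3) come from.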
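Proposition 7.3.7 Given a Borwein-Wiersma decomposable monotone operator $T : X \to 2^X$, a Borwein-Wiersma decomposition $T = \partial f + A$ is clean and essential if and only if
$$\operatorname{dom} A = \operatorname{span} \operatorname{dom} \partial f. \tag{7.3.20}$$

Proof. First, consider any clean and essential Borwein-Wiersma decomposition $T = \partial f + A$. Since the decomposition is clean, $\operatorname{dom} \partial f = \operatorname{dom} T \subset \operatorname{dom} A$, and since $\operatorname{dom} A$ is a linear subspace, $\operatorname{span} \operatorname{dom} \partial f \subset \operatorname{dom} A$. Since the decomposition is essential, $\operatorname{dom} A = \operatorname{span} \operatorname{dom} \partial f$.

Now, suppose that $T = \partial f + A$ is a Borwein-Wiersma decomposition such that (7.3.20) holds. Then,
$$\operatorname{dom} \partial f \subset \operatorname{dom} A \subset \operatorname{span} \operatorname{dom} \partial f,$$
and so this decomposition is both clean and essential.

Remark 7.3.8 (domain restriction notation) For the results below, we will follow the conventions:
(i) for any operator $g : X \to Y$, and any set $D \subset X$,
$$\operatorname{gra}(g|_D) := \{(x, x^*) \in \operatorname{gra} g : x \in D\}, \tag{7.3.21}$$
(ii) for any function $g : X \to \mathbb{R} \cup \{+\infty\}$, for all $x \in X$,
$$g|_D(x) := \begin{cases} g(x), & x \in D \cap \operatorname{dom} g, \\ +\infty, & x \notin D \cap \operatorname{dom} g. \end{cases} \tag{7.3.22}$$

Remark 7.3.9 Below, and throughout, $\hat{\partial}$ denotes the subdifferential when it is applied to a function that need not be convex. While the definition is the same, and the same symbol $\partial$ could be used, the use of $\hat{\partial}$ will allow the reader to easily track when the underlying function is not convex.

The following facts are immediate results of the definition of the subdifferential.

Fact 7.3.10 Given any function $f : X \to \mathbb{R} \cup \{+\infty\}$, convex or not, the subdifferential $\hat{\partial} f : X \to 2^X$ is a monotone operator.

Proof. Let $(x, x^*), (y, y^*) \in \operatorname{gra} \hat{\partial} f$. Then,
$$f(y) - f(x) \ge \langle y - x, x^* \rangle \quad \text{and} \quad f(x) - f(y) \ge \langle x - y, y^* \rangle.$$
Adding the two inequalities gives
$$0 \ge \langle x - y, y^* - x^* \rangle,$$
and so $\hat{\partial} f$ is monotone.

Fact 7.3.11 Given any proper convex function $f : X \to \mathbb{R} \cup \{+\infty\}$ and any set $D \subset X$, it follows that
$$\operatorname{gra}((\partial f)|_D) \subset \operatorname{gra}(\partial f + \hat{\partial} \iota_D) \subset \operatorname{gra} \hat{\partial}(f|_D). \tag{7.3.23}$$

Proof. Let $(x, x^*) \in \operatorname{gra}(\partial f + \hat{\partial} \iota_D)$. Then, for some $(x, x_1^*) \in \operatorname{gra} \partial f$ and some $(x, x_2^*) \in \operatorname{gra} \hat{\partial} \iota_D$, we have that $x_1^* + x_2^* = x^*$.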
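Since $f|_D = f + \iota_D$, for all $y \in X$,
$$f(y) - f(x) + \iota_D(y) - \iota_D(x) \ge \langle y - x, x_1^* + x_2^* \rangle,$$
and so $(x, x^*) \in \operatorname{gra} \hat{\partial}(f|_D)$.

Now, let $(x, x^*) \in \operatorname{gra}((\partial f)|_D)$. Since $x \in D$ by construction, by the definition of the subdifferential and since $\iota_D(y) \ge 0$ for all $y \in X$, we have that $0 \in (\hat{\partial} \iota_D)(x)$. Therefore, $x^* \in (\partial f)(x) + (\hat{\partial} \iota_D)(x)$.

Fact 7.3.12 For any proper lower semicontinuous convex function $f : X \to \mathbb{R} \cup \{+\infty\}$ and any convex set $C \subset X$ such that $f|_C$ is proper,
$$\operatorname{cl}(f|_C) = f|_{\overline{C}} \tag{7.3.24}$$
if and only if
$$\operatorname{dom} f \cap \overline{C} \subset \overline{\operatorname{dom} f \cap C}. \tag{7.3.25}$$

Proof. By Corollary 2.2.3, for any $x \in X$,
$$(\operatorname{cl}(f|_C))(x) = (\operatorname{cl}(f + \iota_C))(x) = \liminf_{y \to x} \big( f(y) + \iota_C(y) \big) \ge f(x) + \iota_{\overline{C}}(x) = (f|_{\overline{C}})(x).$$
If $x \in \operatorname{dom} f \cap C$, then the lim inf is separable and equality holds. If either $x \notin \operatorname{dom} f$ or $x \notin \overline{C}$, then equality holds as well, since $(f|_{\overline{C}})(x) = +\infty$. That leaves the case where $x \in \operatorname{dom} f$ and $x \in \overline{C}$, but $x \notin C$. If the qualification constraint (7.3.25) holds, then (7.3.24) holds. On the other hand, if there is an $x \in \operatorname{dom} f \cap \overline{C}$ such that $x \notin \overline{\operatorname{dom} f \cap C}$, then $\liminf_{y \to x} \big( f(y) + \iota_C(y) \big) = +\infty$, yet $(f|_{\overline{C}})(x) < +\infty$, and the inequality is strict.

Proposition 7.3.13 For any set $D \subset X$, and any $x \in \operatorname{eri} D \cap D$,
$$\hat{\partial} \iota_D(x) = \partial \iota_{\overline{\operatorname{conv}} D}(x) = (D - x)^{\perp}, \tag{7.3.26}$$
where $\overline{\operatorname{conv}} D$ is the closure of the convex hull of $D$.

Proof. Clearly for any $z \in (D - x)^{\perp}$, and any $y \in D$,
$$\iota_D(y) - \iota_D(x) = 0 = \langle y - x, z \rangle,$$
and so
$$\hat{\partial} \iota_D(x) \supset (D - x)^{\perp}.$$
Now, suppose that $x^* \in \hat{\partial} \iota_D(x)$. Then, by the definition of the subdifferential, $0 \ge \langle y - x, x^* \rangle$ for all $y \in D$. Since $x \in \operatorname{eri} D$, for every $y \in D$ there is a $y_2 \in D$ such that $(y_2 - x) = \alpha (y - x)$ for some $\alpha < 0$. Therefore, for all $z \in (D - x)$,
$$\langle z, x^* \rangle = 0.$$
Hence, $x^* \in (D - x)^{\perp}$.

Now, $D \subset \overline{\operatorname{conv}} D$, and so if $x \in \operatorname{eri} D$, then $x \in \operatorname{eri} \overline{\operatorname{conv}} D$. Since $\overline{\operatorname{conv}} D$ is by definition closed and convex, $\iota_{\overline{\operatorname{conv}} D}$ is a lower semicontinuous convex function, and proper if $D \ne \emptyset$. Therefore, by Fact 7.2.3,
$$\hat{\partial} \iota_{\overline{\operatorname{conv}} D}(x) = \partial \iota_{\overline{\operatorname{conv}} D}(x) = (\overline{\operatorname{conv}} D - x)^{\perp}.$$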
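Finally, we obtain (7.3.26) from the fact that
$$(\overline{\operatorname{conv}} D - x)^{\perp} = (D - x)^{\perp}, \tag{7.3.27}$$
which we now prove. Since $(\overline{\operatorname{conv}} D - x) \supset (D - x)$, then $(\overline{\operatorname{conv}} D - x)^{\perp} \subset (D - x)^{\perp}$. Let $y \in (D - x)^{\perp}$. Then, for all $z \in \operatorname{span}(D - x) \supset \operatorname{conv}(D - x)$, $\langle z, y \rangle = 0$. Therefore, for all $z \in \overline{\operatorname{conv}} D - x$, $\langle z, y \rangle = 0$, since for any sequence $(z_n)_{n \in \mathbb{N}} \subset (\operatorname{conv} D - x)$ such that $z_n \to z$, $\langle z_n, y \rangle = 0$.

Proposition 7.3.14 For any monotone operator $T : X \to 2^X$ with a Borwein-Wiersma decomposition $T = \partial f + A$, the decomposition
$$T = \partial f + A|_{\operatorname{span}(\operatorname{dom} \partial f)} \tag{7.3.28}$$
is an essential decomposition of $T$.

Proof. Let $T = \partial f + A$ be a Borwein-Wiersma decomposition as above. Then, $A|_{\operatorname{span}(\operatorname{dom} \partial f)}$ is monotone because it is a subgraph of the monotone operator $A$, and $A|_{\operatorname{span}(\operatorname{dom} \partial f)}$ is a linear relation by Fact 4.0.4 since its graph, as an intersection of linear subspaces
$$\operatorname{gra} A|_{\operatorname{span}(\operatorname{dom} \partial f)} = \operatorname{gra}(A) \cap (\operatorname{span} \operatorname{dom} \partial f \times X),$$
is itself a linear subspace of $X \times X$. Finally, we obtain $T = \partial f + A = \partial f + A|_{\operatorname{span}(\operatorname{dom} \partial f)}$ from the fact that $\operatorname{dom} T = \operatorname{dom} \partial f \cap \operatorname{dom} A$.

Proposition 7.3.15 Suppose that $T = \partial f + A$ is a Borwein-Wiersma decomposition for some maximal monotone operator $T : X \to 2^X$ and consider any skew linear relation $\tilde{A} : X \to 2^X$ such that $\operatorname{gra} \tilde{A} \supset \operatorname{gra} A$. Then, $T = \partial f + \tilde{A}$ is a Borwein-Wiersma decomposition of $T$. In particular, $T = \partial f + (A + (\operatorname{dom} A)^{\perp})$ is a Borwein-Wiersma decomposition of $T$.

Proof. Since $\partial f + \tilde{A}$ is the sum of two monotone operators, it is itself monotone, and since $\operatorname{gra} A \subset \operatorname{gra} \tilde{A}$, $\operatorname{gra} T \subset \operatorname{gra}(\partial f + \tilde{A})$. Since $T$ is maximal monotone, $T = \partial f + \tilde{A}$ is a Borwein-Wiersma decomposition of $T$.

Proposition 7.3.16 Suppose that $T = \partial f + A$ is a clean Borwein-Wiersma decomposition for some monotone operator $T : X \to 2^X$ and consider any skew linear relation $\tilde{A} : X \to 2^X$ such that $\operatorname{gra} \tilde{A} \supset \operatorname{gra} A$. Then, $T = \partial f + \tilde{A}$ is a clean Borwein-Wiersma decomposition of $T$. In particular, $T = \partial f + (A + (\operatorname{dom} A)^{\perp})$ is a clean Borwein-Wiersma decomposition of $T$.

Proof.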
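Since $T = \partial f + A$ is a clean decomposition of $T$, $\operatorname{dom}(\partial f) = \operatorname{dom} T$. Therefore,
$$\operatorname{dom} \tilde{A} \supset \operatorname{dom} A \supset \operatorname{dom} T = \operatorname{dom}(\partial f),$$
and so $\operatorname{dom}(\partial f + A) = \operatorname{dom}(\partial f + \tilde{A})$. If $Ax = \tilde{A}x$ for all $x$ in $\operatorname{dom}(\partial f)$, then $\partial f + A = \partial f + \tilde{A}$, and so $T = \partial f + \tilde{A}$ is a Borwein-Wiersma decomposition of $T$. Now, suppose otherwise, so that $Ax \subsetneq \tilde{A}x$ for some $x \in \operatorname{dom} T$. As $\tilde{A}$ is a linear relation, this means that $A0 \subsetneq \tilde{A}0$ (Fact 4.0.5 (iv)). However, by Proposition 4.0.7 and the above,
$$\tilde{A}0 \subset (\operatorname{dom} \tilde{A})^{\perp} \subset (\operatorname{dom} A)^{\perp} \subset (\operatorname{dom}(\partial f))^{\perp}.$$
Since $\partial f$ is the subdifferential of a convex function,
$$(\operatorname{dom}(\partial f))^{\perp} \subset (\partial f)(x) \quad \text{for every } x \in X.$$
Therefore, $\partial f + A = \partial f + \tilde{A}$, and so $T = \partial f + \tilde{A}$ is a Borwein-Wiersma decomposition of $T$. In either case, the fact that $\operatorname{dom} T = \operatorname{dom}(\partial f)$ is unchanged, and so both Borwein-Wiersma decompositions are clean.

Note in the following theorem that the subspace $W$ does not depend on the choice of $z$.

Theorem 7.3.17 Suppose $T : X \to 2^X$ is a Borwein-Wiersma decomposable monotone operator with nonempty domain and consider any $z \in \operatorname{dom} T$. Define $W := \operatorname{span}\{P_{(\operatorname{dom} T - z)^{\perp}} z\}$. If at least one of the following holds:
(i) $T$ admits at least one clean decomposition;
(ii) $T$ is maximal monotone;
(iii) $0 \in \operatorname{Aff} \operatorname{dom} T$;
(iv) for some Borwein-Wiersma decomposition $T = \partial f + A$,
$$(x, x^*) \in \operatorname{gra} \partial f \ \Rightarrow \ x^* + W \subset (\partial f)(x); \tag{7.3.29}$$
then there is an essential decomposition $T = \partial f + A$ such that $0 \in Az$. If $T$ admits a clean decomposition, then there is a clean and essential decomposition $T = \partial f + A$ such that $0 \in Az$.

Proof. If a clean decomposition of $T$ exists, a clean and essential decomposition $T = \partial f + A$ exists by Proposition 7.3.14. Otherwise, let $T = \partial f + A$ be any essential decomposition of $T$. Choose any $z \in \operatorname{dom} T$. Since $\operatorname{dom} T \subset \operatorname{dom} A$, fix any $z^* \in Az$. Now, define the convex function $\tilde{f} : X \to \mathbb{R} \cup \{+\infty\}$ by
$$\tilde{f}(x) := \begin{cases} f(x) + \langle x, z^* \rangle, & x \in \operatorname{dom} f, \\ +\infty, & x \notin \operatorname{dom} f, \end{cases}$$
so that $\tilde{f}$ is also a proper lower semicontinuous convex function and $\partial \tilde{f} = \partial f + z^*$. Let $w := P_{(\operatorname{dom} T - z)^{\perp}} z$ and let $V := \operatorname{span}(\operatorname{dom} T - z)$.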
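Note that $W = \operatorname{span}\{w\}$ is a closed linear subspace of $X$ and at most 1-dimensional. Also note that $w$, $V$, and $W$ are independent of the choice of $z$, and that $\operatorname{Aff} \operatorname{dom} T = z + V$. Since $A$ is a linear relation, $\operatorname{dom} A$ must be a subspace, and so
$$\operatorname{dom} A \supset \operatorname{span} \operatorname{dom} T \supset \operatorname{Aff} \operatorname{dom} T \supset \operatorname{dom} T.$$
Define the operator $\tilde{A} : X \to 2^X$ so that $\operatorname{dom} \tilde{A} = \operatorname{dom} A$ and for all $x \in \operatorname{dom} A$,
$$\tilde{A}x := Ax - \frac{\langle x, w \rangle}{\|w\|^2} z^* + \frac{\langle x, z^* \rangle}{\|w\|^2} w. \tag{7.3.30}$$
Since $W$ is a closed subspace of $X$, for all $x \in X$, $x = P_W x + P_{W^{\perp}} x$. Since
$$P_{(\operatorname{dom} T - z)^{\perp}}(\operatorname{Aff} \operatorname{dom} T) = P_{V^{\perp}}(z + V) = w + 0,$$
then $P_W x = w$ for all $x \in \operatorname{Aff} \operatorname{dom} T$. Therefore, for all $x \in \operatorname{Aff} \operatorname{dom} T$, $\langle w, x \rangle = \|w\|^2$, and so
$$\tilde{A}x = Ax - z^* + \frac{\langle x, z^* \rangle}{\|w\|^2} w.$$
Since $A$ is skew, $\langle z, z^* \rangle = 0$. Therefore, since $z^* \in Az$ and since $A$ is a linear relation,
$$\tilde{A}z = Az - z^* + 0 = A0,$$
and so $0 \in A0 = \tilde{A}z$. The operator $\tilde{A}$ is a linear relation since it is the sum of a linear relation and a linear operator, and since $\operatorname{dom} \tilde{A} = \operatorname{dom} A$, which is a linear subspace. Since
$$\langle x, \langle x, z^* \rangle w \rangle = \langle x, z^* \rangle \langle x, w \rangle = \langle x, \langle x, w \rangle z^* \rangle,$$
and since $A$ is skew, then for all $x \in \operatorname{dom} \tilde{A}$, $\langle x, \tilde{A}x \rangle = 0$, and so $\tilde{A}$ is also skew.

(i) Suppose now that $\operatorname{dom} T = \operatorname{dom} \partial f$. Then, since $\partial f$ is maximal monotone, for every $(x, x^*) \in \operatorname{gra} \partial f$, we have that $x^* + (\operatorname{Aff} \operatorname{dom} T)^{\perp} = x^* + (\operatorname{Aff} \operatorname{dom} \partial f)^{\perp} \subset (\partial f)(x)$. Therefore, $x^* + W \subset (\partial f)(x)$. Note that this is the case of condition (iv).

(ii) Suppose instead that $T$ is maximal monotone. Similarly, we have that for every $(x, x^*) \in \operatorname{gra} T$, $x^* + (\operatorname{Aff} \operatorname{dom} T)^{\perp} \subset Tx$. Therefore, $x^* + W \subset Tx$.

(iii) Now, suppose that $0 \in \operatorname{Aff} \operatorname{dom} T$. Then, by Fact 2.2.14, $\operatorname{Aff} \operatorname{dom} T = \operatorname{span} \operatorname{dom} T$, and so $w = 0$.

In any case, enumerated (i)-(iv) above, for every $x \in \operatorname{dom} T$,
$$\begin{aligned} \partial \tilde{f} + \tilde{A} &= \partial f + z^* + A - z^* + \frac{\langle \cdot, z^* \rangle}{\|w\|^2} w \\ &= \partial f + A + \frac{\langle \cdot, z^* \rangle}{\|w\|^2} w \\ &= T + \frac{\langle \cdot, z^* \rangle}{\|w\|^2} w \\ &= T. \end{aligned} \tag{7.3.31}$$
Therefore, $T = \partial \tilde{f} + \tilde{A}$ is a Borwein-Wiersma decomposition of $T$, where $0 \in \tilde{A}z$. Since $\operatorname{dom} \partial \tilde{f} = \operatorname{dom} \partial f$, and since $\operatorname{dom} \tilde{A} = \operatorname{dom} A$, $T = \partial \tilde{f} + \tilde{A}$ is an essential decomposition of $T$, and it is clean if $T = \partial f + A$ is clean.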
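The construction in the proof can be traced on Example 7.3.6. The following is a minimal sketch, assuming a single-valued linear $A$ and the hypothetical helper name `recentre`; it applies formula (7.3.30) to $A_5$ from (7.3.13) with $z = (2, 1, 0)$, for which $w = P_{(\operatorname{dom} T - z)^{\perp}} z = (0, 1, 0)$. Note that $\partial f_1 + A_5$ is clean but not essential, so this checks the algebra of (7.3.30) (that $\tilde{A}z = 0$ and that $\tilde{A}$ stays skew) rather than every hypothesis of the theorem; $A_5$ is used only because it gives a nonzero $\tilde{A}$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def recentre(A, z, w):
    """Build A~ from (7.3.30) for a single-valued linear map A, with z* = A(z)."""
    z_star = A(z)
    ww = dot(w, w)  # ||w||^2, assumed nonzero in this sketch

    def A_tilde(x):
        ax = A(x)
        a = dot(x, w) / ww
        b = dot(x, z_star) / ww
        return tuple(ax[k] - a * z_star[k] + b * w[k] for k in range(len(x)))

    return z_star, A_tilde


def A5(x):
    """The skew map A5 of (7.3.13)."""
    return (-x[1] - x[2], x[0], x[0])


z = (2.0, 1.0, 0.0)   # a point of dom T = C
w = (0.0, 1.0, 0.0)   # projection of z onto (dom T - z)^perp = {0} x R x R
z_star, A_tilde = recentre(A5, z, w)

# A~ z should be the zero vector, and <x, A~ x> should vanish for every x.
image_of_z = A_tilde(z)
test_points = [(1.0, 2.0, 3.0), (-4.0, 0.5, 2.0), (3.0, 1.0, 0.0)]
skew_defects = [dot(x, A_tilde(x)) for x in test_points]

# On the relative interior of C, the tilted function f~ = f1 + <., z*> has
# subdifferential z* + {0} x R x R, so the first component of (df~ + A~)(x)
# is z*[0] + A~(x)[0], which must equal -1 to reproduce T.
interior = [(1.5, 1.0, 0.0), (2.0, 1.0, 0.0), (3.25, 1.0, 0.0)]
first_components = [z_star[0] + A_tilde(x)[0] for x in interior]

print(z_star, image_of_z, skew_defects, first_components)
```

Working (7.3.30) out by hand for this choice gives $\tilde{A}x = (-x_3,\ 2x_3,\ x_1 - 2x_2)$, which is visibly skew and vanishes at $z$.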
Remark 7.3.18 Theorem 7.3.17 is important for many reasons, one of which is that it demonstrates how the Borwein-Wiersma decomposition may be useful in solving for $0 \in Tz$ given that a monotone operator $T$ is Borwein-Wiersma decomposable. Suppose that there is a Borwein-Wiersma decomposable monotone operator $T : X \to 2^X$ such that one of the conditions from Theorem 7.3.17 applies, and that there is a $z \in \operatorname{dom} T$ such that $0 \in Tz$. Then, by this same theorem, there is a Borwein-Wiersma decomposition $T = \partial f + A$ such that $0 \in Az$. This means that if we solve for $0 \in \partial f$, we solve for $0 \in T$. Therefore, if we can isolate $\partial f$, as we do in the sections below, we can solve this problem for a monotone operator which, because it is the subdifferential of a convex function, is guaranteed to be paramonotone, cyclically monotone, and maximal monotone.

7.4  Standard and extended Borwein-Wiersma decompositions

In this section (7.4), and only for this section, the following qualification constraint will be assumed to hold for all Borwein-Wiersma decompositions $T = \partial f + A$.

Assumption 7.4.1 (A1)
\[ \operatorname{dom} f \cap \operatorname{dom} A \cap \operatorname{span}\operatorname{dom}\partial f \subset \operatorname{dom} f \cap \operatorname{dom} A \cap \operatorname{span}\operatorname{dom}\partial f \tag{7.4.1} \]
Note that assumption (A1) holds in the case where $\operatorname{dom} A$ is closed.

Definition 7.4.2 (standardized decomposition) Given a monotone operator $T : X \to 2^X$ with a Borwein-Wiersma decomposition $T = \partial f + A$, define the standardized decomposition operator $\ddot{T} : X \to 2^X$ by the standardized decomposition $\ddot{T} := \partial g + B$, where
\[ g := f|_{\operatorname{dom} A \,\cap\, \operatorname{span}\operatorname{dom}\partial f} \quad\text{and}\quad B := A|_{\operatorname{span}\operatorname{dom}\partial g}. \tag{7.4.2} \]

Definition 7.4.3 (extended decomposition) Given a monotone operator $T : X \to 2^X$ with a Borwein-Wiersma decomposition $T = \partial f + A$, define the extended decomposition operator $\hat{T} : X \to 2^X$ by the extended decomposition
\[ \hat{T} := \partial(f|_{\operatorname{dom} A})|_{\operatorname{dom}\partial f} + A|_{\operatorname{span}\operatorname{dom} T} + (\operatorname{dom} T)^{\perp}. \]
(7.4.3)  Remark 7.4.4 Note that it is highly possible that T¨ = T and Tˆ = T , and that (7.4.3) may not be a Borwein-Wiersma decomposition of Tˆ since ∂(f |dom A )|dom ∂f might not be maximal  monotone, and hence not the subdifferential of a lower semicontinuous convex function.  Lemma 7.4.5 Given a monotone operator T : X → 2X with an essential decomposition T = ∂f + A, then its standardized decomposition T¨ is T¨ = ∂(f |dom A ) + A|span dom ∂ (f | Proof. f |dom A  dom A  ).  (7.4.4)  Since T = ∂f + A is an essential decomposition, span dom ∂f ⊃ dom A, and so span dom ∂f  = f |dom A .  Lemma 7.4.6 Suppose that T : X → 2X is a monotone operator with nonempty domain with a Borwein-Wiersma decomposition T = ∂f + A and a standardized decomposition T¨ = ∂g + B. Then, (∂g)|dom B = (∂g)|dom A = ∂(g|dom B ) = ∂(g|dom A ).  (7.4.5)  145  Proof. We have that dom B = dom A  span(dom ∂g). By (A1),  dom f ∩ dom A ∩ span dom ∂f  (7.4.6)  ⊂ dom f ∩ dom A ∩ span dom ∂f  ⊂ dom f ∩ dom A ∩ span dom ∂f ∩ dom A, and so by Fact 7.3.12, ∂(cl(g|dom A )) = ∂(g|dom A ). Therefore, ∂(g|dom A ) ⊂ ∂(g|dom A ) = ∂g.  (7.4.7)  Hence, by Fact 7.3.11 and since dom B ⊂ dom A, gra ∂ (g|dom A ) = gra (∂ (g|dom A )) |dom ∂g  = gra (∂ (g|dom A )) |span dom ∂g ⊂ gra ∂ g|dom A  span dom ∂g  = gra ∂ (g|dom B )  (7.4.8)  = gra ∂ (g|dom B ) |dom A  ⊂ gra ∂ (g|dom A ) . Define D by the following  D := dom ∂ (g|dom B ) = dom ∂ (g|dom A ) . Then, by Fact 2.2.9, ∂ (g|dom B ) = ∂ˆ g|dom B  D  = ∂ˆ (g|D ) = ∂ˆ g|dom A  D  = ∂ (g|dom A ) .  (7.4.9)  Therefore, by (7.4.7), gra(∂g)|dom B ⊂ gra(∂g)|dom A ⊂ gra ∂(g|dom A ) = gra ∂(g|dom B ) ⊂ gra ∂g. Suppose that (x, x∗ ) ∈ ∂(g|dom B ). Then, x ∈ dom B and x∗ ∈ (∂g)(x), and so we obtain (7.4.5).  Proposition 7.4.7 Suppose that T : X → 2X is a monotone operator with nonempty domain, a Borwein-Wiersma decomposition T = ∂f + A, a standardized decomposition T¨ = ∂g + B, and an extended decomposition operator Tˆ. 
Then, both T¨ and Tˆ are monotone operators, T¨ = ∂g+B is an essential decomposition of T¨, and if (7.4.3) is a Borwein-Wiersma decomposition of Tˆ, then it is also an essential decomposition. Proof. Recall that for any set D ⊂ X and for any monotone operator M : X → 2X , gra M |D ⊂ gra M and so M |D is also monotone operator. Therefore, both Tˆ and T¨ are monotone since they are both sums of two monotone operators. By Fact 7.3.12 and (A1), g = cl g, and so g is lower 146  dom ∂f is a subspace, g is convex, and as dom T = ∅, g is proper. Therefore, since B is a linear relation, T¨ = ∂g + B is an Borwein-Wiersma decomposition.  semicontinuous. As dom A Since  span(dom ∂g) ⊃ span(dom ∂g)  dom A = dom B,  T¨ = ∂g + B is an essential decomposition of T¨. Similarly, as span dom(∂(f |dom A )|dom ∂f  ⊃ span(dom ∂f  dom A)  ⊃ span(dom T ) ⊃ dom(A|span dom T ), if (7.4.3) is a Borwein-Wiersma decomposition of Tˆ, then it is an essential decomposition of Tˆ.  Proposition 7.4.8 Given a monotone operator T : X → 2X with a Borwein-Wiersma decom-  position T = ∂f + A, then  dom Tˆ = dom T  (7.4.10)  gra T ⊂ gra Tˆ ⊂ gra T¨,  (7.4.11)  and where T¨ and Tˆ are respectively the standardized decomposition operator and extended decomposition operator of T . If T is maximal monotone, T = T¨ = Tˆ.  (7.4.12)  Proof. Applying Fact 7.3.11, gra T  = gra (∂f + A) = gra ((∂f )|dom T + A|span dom T ) ⊂ gra ((∂(f |dom A ))|dom ∂f + A|span dom T ) .  Therefore, gra T ⊂ gra Tˆ, and since dom Tˆ ⊂ dom T , dom Tˆ = dom T . Furthermore, by (A1),  Fact 7.3.12, and since  dom(∂(f |dom A  span dom ∂f  )) ⊃ dom(∂f )  dom A = dom T,  147  it follows that gra Tˆ = gra ((∂(f |dom A ))|dom ∂f + A|span dom T ) + X × (dom A)⊥ = gra ((∂(f |dom A ))|dom ∂f + A|span dom T ) ⊂ gra (∂(f |dom A ))|span dom ∂f + A|span dom ∂(f | ⊂ gra ∂(f |dom A  span dom ∂f )  ⊂ gra ∂(f |dom A  span dom ∂f  + A|span dom ∂(f |  ) + A|span dom ∂(f |  span dom ∂f  dom A  span dom ∂f  dom A  span dom ∂f  dom A  )  ) )  = gra T¨. 
Finally, if T is maximal monotone, then T = T¨ = Tˆ since T¨ and Tˆ are monotone (see Proposition 7.4.7). Proposition 7.4.9 Suppose that T : X → 2X is a monotone operator with nonempty domain, a Borwein-Wiersma decomposition T = ∂f + A, a standardized decomposition T¨ = ∂g + B, and an extended decomposition operator Tˆ. Then, both the standardized decomposition operator and the extended decomposition operator of T¨ are equal to T¨, and if (7.4.3) is a Borwein-Wiersma decomposition of Tˆ, then the extended decomposition operator of Tˆ is Tˆ. Proof. Consider the standardized decomposition T¨ = ∂g + B, which is also an essential decomposition of T¨. By Lemma 7.4.5, the standardized decomposition of T¨ is ∂g|dom B + B|span dom ∂ (g|  dom B  ).  (7.4.13)  Since dom B = dom A  span(dom ∂g) ⊃ dom A  span(dom ∂f ) ⊃ dom ∂g,  then dom B = span(dom ∂g). Therefore, by Fact 2.2.9, ∂ g|dom B = ∂g, and so the standardized decomposition is always its own standardized decomposition. Suppose now that (7.4.3) is a Borwein-Wiersma decomposition of Tˆ, so that for some proper lower semicontinuous function h : X → R  {+∞},  ∂h = ∂ (f |dom A ) |dom ∂f . As ∂h is maximal monotone, then dom ∂f ⊃ dom ∂h. Now, dom (A|span dom T ) = dom A  span dom T.  148  Hence, since dom ∂h ⊂ dom T ⊂ dom A, by Fact 2.2.9, ∂(h|dom A  span dom T )  = ∂h.  Finally, by Proposition 7.4.8, dom Tˆ = dom T , and so Tˆ is its own extended decomposition. Now, consider the standardized decomposition T¨ = ∂g + B. By Lemma 7.4.6, the extended decomposition of T¨ is ∂(g|dom B )|dom ∂g + B|span dom T¨ = ∂(g|dom A )|dom ∂g +B|span(dom ∂g)  span(dom T¨)  = ∂(g|dom A ) + B|span(dom T¨) = (∂g)|dom A + B|span(dom T¨ ) ⊂ ∂g + B. Since for every operator T , gra Tˆ ⊃ gra T (Proposition 7.4.8), the extended decomposition operator of T¨ is T¨. 
Proposition 7.4.10 Suppose that T : X → 2X is a monotone operator with nonempty domain, a clean Borwein-Wiersma decomposition T = ∂f +A, a standardized decomposition T¨ = ∂g+B, and an extended decomposition Tˆ. Then T = Tˆ = T¨  and  ∂f = ∂g;  A=B  (7.4.14)  Proof. Since T = ∂f + A is a clean decomposition, dom A ⊃ dom ∂f , and since dom A is a linear space, dom A ⊃ span dom ∂f . Therefore, by Fact 2.2.9, ∂g = ∂ f |dom A  span dom ∂f  = ∂ f |span dom ∂f  = ∂f,  and B = A|span(dom ∂g) = A. Therefore T = T¨, and so by Proposition 7.4.8, T = Tˆ = T¨. Proposition 7.4.11 Suppose that T : X → 2X is a monotone operator with nonempty domain and a Borwein-Wiersma decomposition T¨ = ∂g+B. Then, T¨ = ∂g+B is a clean decomposition if and only if dom A ⊃ dom ∂g = dom ∂(f |dom A  span dom ∂f ).  (7.4.15)  Furthermore, T¨ = ∂g + B is a clean decomposition if any of the following are are satisfied: (i) T = ∂f + A is a clean decomposition, (ii) dom A = dom A, 149  (iii) span dom T = span dom T and (dom A)⊥ = (dom T )⊥ . Proof. Recall that dom T¨ = dom A  dom ∂g = dom A  dom ∂ f |dom A  span dom ∂f  .  Therefore, T¨ = ∂g + B is a clean decomposition if and only if dom A ⊃ dom ∂g. (i) Suppose that T = ∂f + A is a clean decomposition of T . Then, dom ∂f = dom T = dom ∂f  dom A. Therefore, dom A ⊃ span dom ∂f , and so dom A ⊃ dom ∂f ⊃ dom ∂ f |spandom ∂f = dom ∂ f |dom A  spandom ∂f  = dom ∂g, by which T¨ = ∂g + B is a clean decomposition of T¨. (ii) Follows directly from the fact that dom ∂g ⊂ dom A. (iii) By the assumptions, dom A = span dom T = span dom T ⊂ dom A, as so T¨ = ∂g + B is a clean decomposition by (ii). The following example demonstrates that the condition dom ∂f = dom T is not a trivial one, and so demonstrates that there is a Borwein-Wiersma decomposable operator for which there is no clean decomposition. The convex function f is a modification of a similar example in [69], and is also a demonstration of a subdifferential with nonconvex domain. 
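Example 7.4.12 Define $T : \mathbb{R}^2 \to 2^{\mathbb{R}^2}$ by $T = \partial f + A$, where $f : \mathbb{R}^2 \to \mathbb{R} \cup \{+\infty\}$ and $A : \mathbb{R}^2 \to 2^{\mathbb{R}^2}$ are defined by
\[ f(x_1, x_2) := \begin{cases} \max\left\{ 1 - \sqrt{1 - (x_1 - 1)^2},\; |x_2| \right\}, & |x_1 - 1| \le 1, \\ +\infty, & |x_1 - 1| > 1, \end{cases} \tag{7.4.16} \]
and
\[ A(x_1, x_2) := \begin{cases} \{(x_2, y) : y \in \mathbb{R}\}, & x_1 = 0, \\ \emptyset, & x_1 \neq 0. \end{cases} \tag{7.4.17} \]
Here, $f$ is a lower semicontinuous proper convex function and $A$ is a linear relation, with $\operatorname{dom} T \neq \operatorname{dom}\partial f$. Furthermore, for any lower semicontinuous proper convex function $g$ and any linear relation $B$ such that $T = \partial g + B$, $\operatorname{dom} T \neq \operatorname{dom}\partial g$. Note however that $\operatorname{dom} T$ is not starshaped.

Proof. The function $h(t) = 1 - \sqrt{1 - (t - 1)^2}$ is continuous on $[0, 2]$, with $\nabla^2 h(t) = (2t - t^2)^{-3/2}$ on $]0, 2[$, so $h$ is also convex on this interval. The function $f$ is convex since it is the maximum of two convex functions, and it is trivially proper. It is also lower semicontinuous since $\liminf_{y \to x} f(y) = f(x)$ (Corollary 2.2.3). Note that $\nabla^2 h(t)$ is not defined at $0$ or $2$, and similarly, for $|x_2| < 1$, $\partial f(0, x_2) = \emptyset$. To demonstrate, assume to the contrary that for some $|x_2| < 1$, there exists an $x^* = (x_1^*, x_2^*) \in \partial f(0, x_2)$. In this case, for all $y \in \mathbb{R}^2$,
\[ f(y) - f(0, x_2) \ge \langle y - (0, x_2), x^* \rangle, \]
and as $h(0) = 1$, we have $f(0, x_2) = 1$, and
\[ f(y) - 1 \ge \langle y - (0, x_2), x^* \rangle. \tag{7.4.18} \]
In particular, for any $1 > \varepsilon > 0$ small enough so that $1 - \sqrt{2\varepsilon - \varepsilon^2} \ge |x_2|$, we have that $f(\varepsilon, x_2) = 1 - \sqrt{2\varepsilon - \varepsilon^2}$. Therefore, by (7.4.18) with $y = (\varepsilon, x_2)$,
\[ -\sqrt{2\varepsilon - \varepsilon^2} \ge \langle (\varepsilon, 0), x^* \rangle = \varepsilon x_1^*, \]
and so $x_1^* < 0$. Hence, $\sqrt{2\varepsilon - \varepsilon^2} \le \varepsilon |x_1^*|$, and so taking the square and dividing by $\varepsilon$,
\[ 2 \le \varepsilon\left(|x_1^*|^2 + 1\right), \]
which is false for $\varepsilon$ small enough, and the contradiction is hereby established. Now, since $\operatorname{dom} T = \operatorname{dom}\partial f \cap \operatorname{dom} A$,
\[ \operatorname{dom} T \subset \{(0, y) : |y| \ge 1\}. \]
Clearly, $\operatorname{dom} T \neq \operatorname{dom}\partial f$, since for instance $(1, 2) \in \operatorname{dom}\partial f$ (in a neighbourhood of $(1, 2)$, $\partial f(x_1, x_2) = \partial|x_2| = \{(0, 1)\}$).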
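Since the subdifferential of a proper lower semicontinuous convex function is maximal monotone, and since the domain of any maximal monotone operator on $\mathbb{R}^2$ is almost convex (see [54] or Definition 7.2.5 and Theorem 7.2.6), $\overline{\operatorname{dom}\partial g}$ is convex. Since convex sets are connected, and $\operatorname{dom} T$ is not connected, there exists no lower semicontinuous proper convex function $g$ such that $\operatorname{dom}\partial g = \operatorname{dom} T$, regardless of the choice of linear relation $B$.

Note however that since $\operatorname{dom} T$ is not connected in Example 7.4.12, $\operatorname{dom} T$ is neither starshaped nor convex.

Remark 7.4.13 (open question) Is there a Borwein-Wiersma decomposable operator $T$ such that every Borwein-Wiersma decomposition $T = \partial f + A$ of $T$ must satisfy $\operatorname{dom} T \neq \operatorname{dom}\partial f$, where $\operatorname{dom} T$ is starshaped? Where $\operatorname{dom} T$ is convex?

7.5  Further properties of Borwein-Wiersma decompositions

Proposition 7.5.1 Let $T : X \to 2^X$ be a monotone operator with a Borwein-Wiersma decomposition $T = \partial f + A$ such that $\operatorname{dom} T \neq \emptyset$. Then,
\[ \hat{\partial}(f|_{\operatorname{dom} T})(x) \supset (\partial f)(x) + (\operatorname{dom} A)^{\perp}. \tag{7.5.1} \]
For any $x \in \operatorname{eri}\operatorname{dom} T \cap \operatorname{dom} T$ and any $z \in \operatorname{dom} T$,
\[ \hat{\partial}(f|_{\operatorname{dom} T})(x) \supset (\partial f)(x) + (\operatorname{dom} T)^{\perp} + \operatorname{span}\{P_{(\operatorname{dom} T - z)^{\perp}} z\} \tag{7.5.2} \]
with equalities when $\partial f + \partial\iota_{\operatorname{conv}\operatorname{dom} T}$ is maximal monotone, and, for any $x^* \in Ax$,
\[ \hat{\partial}(\langle \cdot, x^* \rangle + \iota_{\operatorname{dom} T})(x) = x^* + (\operatorname{dom} T)^{\perp} + \operatorname{span}\{P_{(\operatorname{dom} T - z)^{\perp}} z\}. \tag{7.5.3} \]

Proof. Since $f|_{\operatorname{dom} T} = f + \iota_{\operatorname{dom} T} = f + \iota_{\operatorname{dom} A} + \iota_{\operatorname{dom}\partial f}$, by Fact 2.2.9,
\[ \operatorname{gra}\hat{\partial}(f|_{\operatorname{dom} T}) \supset \operatorname{gra}\left( \hat{\partial}(f + \iota_{\operatorname{dom}\partial f}) + \partial\iota_{\operatorname{dom} A} \right) = \operatorname{gra}(\partial f + \partial\iota_{\operatorname{dom} A}). \]
Since $\operatorname{dom} A$ is a subspace, $\operatorname{gra}\partial\iota_{\operatorname{dom} A} = \operatorname{dom} A \times (\operatorname{dom} A)^{\perp}$, and so we obtain (7.5.1). By Fact 7.3.11, $\operatorname{gra}\hat{\partial}(f|_{\operatorname{dom} T}) \supset \operatorname{gra}(\partial f + \partial\iota_{\operatorname{dom} T})$. Therefore, by Proposition 7.3.13,
\[ \operatorname{gra}\hat{\partial}(f|_{\operatorname{dom} T}) \supset \operatorname{gra}\left( \partial f + \partial\iota_{\operatorname{conv}\operatorname{dom} T} \right), \tag{7.5.4} \]
with equality if the right hand side of (7.5.4) is maximal monotone. The inclusion in (7.5.2) follows from Proposition 7.3.13 and Fact 2.2.13.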
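By the definition of the subdifferential and by Fact 2.2.9, for all $x \in \operatorname{dom}\partial f$,
\[ (\partial f)(x) = \hat{\partial}(f|_{\operatorname{dom}\partial f})(x) = (\partial f)(x) + (\operatorname{dom}\partial f)^{\perp}, \]
and so we obtain (7.5.2). Now, $\langle \cdot, x^* \rangle : X \to \mathbb{R}$ is a proper continuous linear (and therefore convex) function with full domain, and its subdifferential is the singleton mapping $\partial\langle \cdot, x^* \rangle = x \mapsto x^*$. Therefore, since $\partial\iota_{\operatorname{conv}\operatorname{dom} T}$ is maximal monotone, (7.5.3) follows directly from (7.5.2).

For instances of the assumptions used in Corollary 7.5.2, recall that if a linear relation $A$ is maximal monotone, then $A0 = (\operatorname{dom} A)^{\perp}$ (Corollary 4.0.10). For clean and essential decompositions $T = \partial f + A$, we have that $(\operatorname{dom} A)^{\perp} = (\operatorname{dom} T)^{\perp}$.

Corollary 7.5.2 Let $T : X \to 2^X$ be a monotone operator with a Borwein-Wiersma decomposition $T = \partial f + A$ such that $\operatorname{dom} T \neq \emptyset$ and

(i) $(\operatorname{dom} A)^{\perp} = (\operatorname{dom} T)^{\perp}$,
(ii) $A0 = (\operatorname{dom} A)^{\perp}$,
(iii) $0 \in \operatorname{Aff}\operatorname{dom} T$.

Then, for any $x \in \operatorname{eri}\operatorname{dom} T \cap \operatorname{dom} T$,
\[ \hat{\partial}(f|_{\operatorname{dom} T})(x) \supset (\partial f)(x) + (\operatorname{dom} A)^{\perp}, \tag{7.5.5} \]
and
\[ \hat{\partial}(\langle \cdot, x^* \rangle + \iota_{\operatorname{dom} T})(x) = Ax. \tag{7.5.6} \]

Proof. By Fact 2.2.14, since $0 \in \operatorname{Aff}\operatorname{dom} T$, we have that $(\operatorname{dom} T - z)^{\perp} = (\operatorname{dom} T)^{\perp}$ for all $z \in \operatorname{dom} T$, and so $\operatorname{span}\{P_{(\operatorname{dom} T - z)^{\perp}} z\} = \{0\}$. All else follows from Proposition 7.5.1.

Proposition 7.5.3 Let $T : X \to 2^X$ be a monotone operator with a clean and essential Borwein-Wiersma decomposition $T = \partial f + A$. Then, for any $x \in X$,
\[ \hat{\partial}(f|_{\operatorname{dom} T})(x) = (\partial f)(x), \tag{7.5.7} \]
and, for any $x \in \operatorname{eri}\operatorname{dom} T \cap \operatorname{dom} T$, for any $x^* \in Ax$, and for any $z \in \operatorname{dom}\partial f$,
\[ \hat{\partial}(\langle \cdot, x^* \rangle + \iota_{\operatorname{dom} T})(x) = x^* + (\operatorname{dom} A)^{\perp} + \operatorname{span}\{P_{(\operatorname{dom}\partial f - z)^{\perp}} z\}. \tag{7.5.8} \]

Proof. Since $T = \partial f + A$ is a clean decomposition, $\operatorname{dom}\partial f = \operatorname{dom} T$, and since it is also an essential decomposition, $\operatorname{span}\operatorname{dom}\partial f = \operatorname{dom} A$ by Proposition 7.3.7. Therefore, $(\operatorname{dom} T)^{\perp} = (\operatorname{dom} A)^{\perp}$. Hence, (7.5.7) follows directly from Fact 2.2.9, and (7.5.8) follows from Proposition 7.5.1.

Remark 7.5.4 (open question) Is there an example where the inclusion in (7.5.2) is strict?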
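Proposition 7.5.5 Let $T : X \to 2^X$ be a maximal monotone operator with a Borwein-Wiersma decomposition $T = \partial f + A$ such that $\operatorname{dom} T \neq \emptyset$ and $\operatorname{Aff}\operatorname{dom} T$ is closed. Then, for all $x \in \operatorname{dom} T$,
\[ \hat{\partial}(f|_{\operatorname{dom} T})(x) = (\partial f)(x) + (\operatorname{dom} T)^{\perp} + \operatorname{span}\{P_{(\operatorname{dom} T - z)^{\perp}} z\}, \tag{7.5.9} \]
and, for any $x^* \in Ax$,
\[ \begin{aligned} x^* + (\operatorname{dom} T)^{\perp} + \operatorname{span}\{P_{(\operatorname{dom} T - z)^{\perp}} z\} &\subset \hat{\partial}(\langle \cdot, x^* \rangle + \iota_{\operatorname{dom} T})(x) \\ &\subset x^* + (\operatorname{dom} T)^{\perp} + \operatorname{span}\{P_{(\operatorname{dom} T - z)^{\perp}} z\} + \{y_1^* - y_2^* : y_1^*, y_2^* \in \partial f(x)\}. \end{aligned} \tag{7.5.10} \]

Proof. For any $z \in \operatorname{dom} T$, let $V := \operatorname{span}(\operatorname{dom} T - z)$. Since $\operatorname{Aff}\operatorname{dom} T$ is closed, $V$ is a closed subspace of $X$. By the nature of the subdifferential,
\[ \operatorname{gra}\hat{\partial}(f|_{\operatorname{dom} T}) \supset \operatorname{gra}\left((\partial f)|_{\operatorname{dom} T}\right) + X \times V^{\perp}. \]
Since $V \subset \operatorname{span}\operatorname{dom} T \subset \operatorname{dom} A$, $V^{\perp} \supset (\operatorname{dom} A)^{\perp}$. Therefore, for every $x^* \in Ax$ and every $x \in \operatorname{dom} T$,
\[ \begin{aligned} \hat{\partial}(\langle \cdot, x^* \rangle + \iota_{\operatorname{dom} T})(x) &\supset x^* + V^{\perp} \\ &= x^* + (\operatorname{dom} A)^{\perp} + V^{\perp} \\ &\supset Ax + V^{\perp}, \end{aligned} \tag{7.5.11} \]
since, as a monotone linear relation, $Ax = x^* + A0 \subset x^* + (\operatorname{dom} A)^{\perp}$. Let
\[ T_1 := (\partial f)|_{\operatorname{dom} T} + (X \times V^{\perp}) \quad\text{and}\quad T_2 := A|_{\operatorname{dom} T} + (X \times V^{\perp}). \]
Clearly, both $T_1$ and $T_2$ are monotone operators. By Proposition 2.2.6,
\[ T_1 + T_2 = T + (X \times V^{\perp}) = T. \]
Suppose that there is an $x \in \operatorname{dom} T$ and an $x_1^* \in X$ such that $(x, x_1^*)$ monotonically extends $T_1$. Then, since $T$ is maximal monotone, for every $x_2^* \in T_2 x$, there is an $\hat{x}_1^* \in T_1 x$ and an $\hat{x}_2^* \in T_2 x$ such that
\[ \hat{x}_1^* + \hat{x}_2^* = x_1^* + x_2^*. \]
Now, by (7.5.11), for some points $w, \hat{w} \in V^{\perp}$, $x_2^* = x^* + w$ and $\hat{x}_2^* = x^* + \hat{w}$. Therefore, $x_1^* = \hat{x}_1^* + \hat{w} - w$. Since $\hat{w} - w \in V^{\perp}$, $x_1^* \in T_1 x$, and so $T_1$ is maximal monotone on $\operatorname{dom} T$. The equation (7.5.9) follows since, by Fact 2.2.13, $V^{\perp} = (\operatorname{dom} T)^{\perp} + \operatorname{span}\{P_{(\operatorname{dom} T - z)^{\perp}} z\}$.

Now, suppose that there is an $x \in \operatorname{dom} T$ and an $x_2^* \in X$ such that $(x, x_2^*)$ monotonically extends $T_2$. Then, since $T$ is maximal monotone, for every $x_1^* \in T_1 x$, there is an $\hat{x}_1^* \in T_1 x$ and an $\hat{x}_2^* \in T_2 x$ such that
\[ \hat{x}_1^* + \hat{x}_2^* = x_1^* + x_2^*. \]
Now, $x_1^* = y_1^* + w$ and $\hat{x}_1^* = y_2^* + \hat{w}$ for some $y_1^*, y_2^* \in (\partial f)(x)$ and some $w, \hat{w} \in V^{\perp}$. Therefore,
\[ x_2^* = \hat{x}_2^* + \hat{w} - w + y_2^* - y_1^*. \]
Since $\hat{w} - w \in V^{\perp}$, $x_2^* \in T_2 x + \{y_2^* - y_1^* : y_1^*, y_2^* \in (\partial f)(x)\}$. The equation (7.5.10) follows from Fact 2.2.13, and since $\hat{\partial}(\langle \cdot, x^* \rangle + \iota_{\operatorname{dom} T})$ is monotone.

Corollary 7.5.6 Let $T : X \to 2^X$ be a maximal monotone operator with a Borwein-Wiersma decomposition $T = \partial f + A$ such that $\operatorname{dom} T \neq \emptyset$, $\operatorname{Aff}\operatorname{dom} T$ is closed, and $\operatorname{Aff}\operatorname{dom}\partial f$ is closed. Then, $T = \partial f + A$ is a clean decomposition of $T$.

Proof. By the proof of Proposition 7.5.5, for all $z \in \operatorname{dom} T$,
\[ \begin{aligned} \operatorname{gra}\hat{\partial}(f|_{\operatorname{dom} T}) &= \operatorname{gra}\partial f + (\operatorname{dom} T - z)^{\perp} \\ &\supset \operatorname{gra}\partial f + (\operatorname{dom}\partial f - z)^{\perp} \\ &= \operatorname{gra}(\partial f), \end{aligned} \]
by Proposition 2.2.6. Since $T = \partial f + A$ is a Borwein-Wiersma decomposition, $\partial f$ is maximal monotone, and so $\partial(f|_{\operatorname{dom} T}) = \partial f$. Therefore,
\[ \operatorname{dom}\partial f \subset \operatorname{dom} T \subset \operatorname{dom}\partial f, \]
and so $T = \partial f + A$ is a clean decomposition.

7.6  The algebraic decomposition of Borwein-Wiersma decomposable operators

For clarification, the phrase "$T = \partial f + A$ is a Borwein-Wiersma decomposition" below is always taken to mean in addition that $f : X \to \mathbb{R} \cup \{+\infty\}$ is a proper lower semicontinuous convex function and $A : X \to 2^X$ is a skew linear relation, and that $T : X \to 2^X$ is a Borwein-Wiersma decomposable operator.

First, it is shown that given any $x, y$ such that their joining line segment lies within $\operatorname{dom} T$, the values of $f(y) - f(x)$ and $\langle y, Ax \rangle$ from a Borwein-Wiersma decomposition $T = \partial f + A$ can be recovered from evaluations of $T$ only, at least to within the error of a fixed constant.

Proposition 7.6.1 Let $T : X \to 2^X$ be a monotone operator and let $T = \partial f + A$ be a Borwein-Wiersma decomposition of $T$. Then, for all $x, y \in \operatorname{dom} T$ such that the line segment $[x, y] \subset \operatorname{dom} T$,
\[ M_T(x, y) = f(y) - f(x) + \langle y - x, Ax \rangle. \tag{7.6.1} \]

Proof. Follows immediately from Corollary 6.2.3, yielding that $M_T(x, y) = M_{\partial f}(x, y) + M_A(x, y)$, Corollary 6.2.25, from which $M_{\partial f}(x, y) = f(y) - f(x)$ follows, and Corollary 6.2.10, which yields that $M_A(x, y) = \langle y - x, Ax \rangle$.

Theorem 7.6.2 (Rockafellar-Veselý [7]) Suppose that $T : X \to 2^X$ is a maximal monotone operator. If $T$ is locally bounded at a point $x \in \overline{\operatorname{dom} T}$, then $x \in \operatorname{int}\operatorname{dom} T$.

Theorem 7.6.3 Suppose that $T : X \to 2^X$ is a monotone operator with a starshaped domain and that $T = \partial f + A$ is a Borwein-Wiersma decomposition of $T$. Choose any $z$ in the convex kernel of $\operatorname{dom} T$. Then, for all $x, y \in \operatorname{dom} T$,
\[ M_T(x, z) + M_T(z, y) = M_T(z, y) - M_T(z, x) = f(y) - f(x) + \langle y - x, Az \rangle. \tag{7.6.2} \]
So long as the line segment $\{\lambda x + (1 - \lambda) y : \lambda \in [0, 1]\} \subset \operatorname{dom} T$,
\[ M_T(x, y) - M_T(x, z) - M_T(z, y) = \langle y - x, Ax - Az \rangle = \langle y, Ax \rangle + \langle x - y, Az \rangle. \tag{7.6.3} \]

Proof. As $z$ is in the convex kernel of $\operatorname{dom} T$, this is direct from Proposition 7.6.1.

Corollary 7.6.4 Suppose that $T : X \to 2^X$ is a monotone operator with a starshaped domain and that $T = \partial f + A$ is a Borwein-Wiersma decomposition of $T$. In addition, suppose that $0$ is in the convex kernel of $\operatorname{dom} T$. Then, for all $x, y \in \operatorname{dom} T$,
\[ M_T(x, 0) + M_T(0, y) = f(y) - f(x). \tag{7.6.4} \]
So long as the line segment $\{\lambda x + (1 - \lambda) y : \lambda \in [0, 1]\} \subset \operatorname{dom} T$,
\[ M_T(x, y) - M_T(x, 0) - M_T(0, y) = \langle y, Ax \rangle. \tag{7.6.5} \]

Proof. Direct application of Theorem 7.6.3, with $z = 0$.

Theorem 7.6.5 Suppose that $T : X \to 2^X$ is a monotone operator with a starshaped domain that is Borwein-Wiersma decomposable, that is, $T = \partial f + A$ where $f : X \to \mathbb{R} \cup \{+\infty\}$ is a proper lower semicontinuous convex function and $A : X \to 2^X$ is a skew linear relation. Suppose further that there exists a $z$ in the convex kernel of $\operatorname{dom} T$ and a $\lambda \in \mathbb{R}$ such that $\lambda \neq 0$ and $(1 + \lambda) z$ is in the convex kernel of $\operatorname{dom} T$.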
Then for all $x, y \in \operatorname{dom} T$ such that $\{t x + (1 - t) y : t \in [0, 1]\} \subset \operatorname{dom} T$,
\[ f(y) - f(x) = \frac{1}{\lambda}\Big( (1 + \lambda) M_T(x, z) + (1 + \lambda) M_T(z, y) - M_T(x, (1 + \lambda) z) - M_T((1 + \lambda) z, y) \Big) \tag{7.6.6} \]
and
\[ \langle y, Ax \rangle = M_T(x, y) - \frac{1}{\lambda}\Big( (1 + \lambda) M_T(x, z) + (1 + \lambda) M_T(z, y) - M_T(x, (1 + \lambda) z) - M_T((1 + \lambda) z, y) \Big). \tag{7.6.7} \]

Proof. By Theorem 7.6.3,
\[ M_T(x, z) + M_T(z, y) = f(y) - f(x) + \langle y - x, Az \rangle \]
and
\[ M_T(x, (1 + \lambda) z) + M_T((1 + \lambda) z, y) = f(y) - f(x) + (1 + \lambda) \langle y - x, Az \rangle, \]
and so
\[ (1 + \lambda)\big( M_T(x, z) + M_T(z, y) \big) - M_T(x, (1 + \lambda) z) - M_T((1 + \lambda) z, y) = \lambda (f(y) - f(x)). \]
Dividing by $\lambda \neq 0$ gives (7.6.6), and (7.6.7) then follows from (7.6.1), since $\langle x, Ax \rangle = 0$.

7.7  Decomposing Borwein-Wiersma decomposable operators using $W_T$

Using the tools from Section 7.6, we show how to constructively decompose any Borwein-Wiersma decomposable operator with convex domain. First, we obtain some basic properties of $W_T$ when $T$ is Borwein-Wiersma decomposable.

Proposition 7.7.1 Given a monotone operator $T$ with a Borwein-Wiersma decomposition $T = \partial f + A$, the operator $\hat{W}_T(x, y)$ is a saddle function. Furthermore, if $\operatorname{dom} T$ is convex, then $W_T(x, y)$ is a saddle function.

Proof. By Proposition 7.6.1,
\[ M_T(x, y) = f(y) - f(x) + \langle y - x, Ax \rangle, \]
which is well defined and convex in $y$ on $\operatorname{cern}\operatorname{dom} T$ (itself always a convex set) and on $\operatorname{dom} T$ if this latter set is convex. Therefore, by Proposition 6.3.3, $\hat{W}_T$ is a saddle function, as is $W_T$ if $\operatorname{dom} T$ is convex.

Corollary 7.7.2 If $T : X \to 2^X$ is monotone and Borwein-Wiersma decomposable, then $\hat{W}_T$ is a skew symmetric saddle function, as is $W_{\tilde{T}_C}$ for any operator $\tilde{T}_C : X \to 2^X$ such that $\operatorname{gra}\tilde{T}_C \subset \operatorname{gra} T$ and $\operatorname{dom}\tilde{T}_C = C$ for some convex set $C \subset \operatorname{dom} T$.

Proof. Follows directly from the definition of $\hat{W}_T$ and $W_T$ (Definition 6.3.1), Theorem 6.3.5, and Proposition 7.7.1.

Corollary 7.7.3 Given a monotone operator $T : X \to 2^X$ and any convex set $C \subset \operatorname{dom} T$, if for any $x \in C$, the operator $M_T(x, \cdot)$ is not convex in $C$, then $T$ is not Borwein-Wiersma decomposable.
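The recovery formulas (7.6.4), (7.6.6) and (7.6.7) can be illustrated numerically. The sketch below rests on a loudly stated assumption: $M_T(x, y)$ is modelled as the line integral $\int_0^1 \langle y - x, T(x + t(y - x)) \rangle\, dt$, which agrees with (7.6.1) for the smooth single-valued example used here, but the definition of $M_T$ in Chapter 6 is not reproduced in this section. The function $f$, the rotation $A$, and the names `M_T`, `f_difference` and `skew_pairing` are all hypothetical choices made for illustration.

```python
# Sketch: recover f(y) - f(x) and <y, Ax> from evaluations of T alone.
# ASSUMPTION: M_T(x, y) is modelled as the line integral
#   int_0^1 <y - x, T(x + t (y - x))> dt,
# which matches (7.6.1) for this smooth example; it is not claimed to be
# the definition used in the thesis.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def f(x):
    """Illustrative convex function on R^2 (an assumption)."""
    return x[0] ** 2 + 2.0 * x[1] ** 2

def T(x):
    """T = grad f + A, where A is the rotation by 90 degrees (skew)."""
    grad_f = [2.0 * x[0], 4.0 * x[1]]
    Ax = [-x[1], x[0]]
    return [grad_f[0] + Ax[0], grad_f[1] + Ax[1]]

def M_T(x, y, n=64):
    """Composite midpoint rule for the line integral of T from x to y."""
    d = [yi - xi for xi, yi in zip(x, y)]
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        p = [xi + t * di for xi, di in zip(x, d)]
        total += dot(d, T(p))
    return total / n

def f_difference(x, y, z, lam):
    """Right-hand side of (7.6.6); requires lam != 0."""
    zl = [(1.0 + lam) * zi for zi in z]
    return ((1.0 + lam) * (M_T(x, z) + M_T(z, y))
            - M_T(x, zl) - M_T(zl, y)) / lam

def skew_pairing(x, y, z, lam):
    """Right-hand side of (7.6.7), an estimate of <y, Ax>."""
    return M_T(x, y) - f_difference(x, y, z, lam)

x, y, z, lam = [1.0, 2.0], [3.0, -1.0], [0.5, 1.0], 1.0
recovered_f = f_difference(x, y, z, lam)       # compare with f(y) - f(x)
recovered_skew = skew_pairing(x, y, z, lam)    # compare with <y, Ax>
origin_sum = M_T(x, [0.0, 0.0]) + M_T([0.0, 0.0], y)  # (7.6.4) with z = 0
```

Because $T$ is linear in this example, the integrand is affine in $t$ and the midpoint rule is exact up to rounding; for a general $f$ the quadrature would only approximate $M_T$. Here $\operatorname{dom} T = \mathbb{R}^2$, so every point lies in the convex kernel and the hypotheses on $z$ and $(1 + \lambda) z$ hold trivially.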
Proposition 7.7.4 Given any monotone operator $T : X \to 2^X$ with a Borwein-Wiersma decomposition $T = \partial f + A$, and given any convex set $C \subset \operatorname{dom} T$ such that $C \neq \emptyset$, choose an arbitrary $z \in C$. Then, the functions $\varphi_1, \varphi_2 : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ defined by
\[ \varphi_1(x, y) := \begin{cases} M_T(x, z) + M_T(z, y), & x, y \in C, \\ +\infty, & x \in C,\ y \notin C, \\ -\infty, & x \notin C, \end{cases} \tag{7.7.1} \]
\[ \varphi_2(x, y) := \begin{cases} M_T(x, y) - M_T(x, z) - M_T(z, y), & x, y \in C, \\ +\infty, & x \in C,\ y \notin C, \\ -\infty, & x \notin C, \end{cases} \tag{7.7.2} \]
are skew symmetric saddle functions. Furthermore,
\[ \varphi_1 = W_{S_1} \quad\text{and}\quad \varphi_2 = W_{S_2} \tag{7.7.3} \]
for any operators $S_1, S_2 : X \to 2^X$ that satisfy, for some $z^* \in Az$,
\[ S_1 := \partial f + \partial\iota_C + z^*, \tag{7.7.4} \]
\[ S_2 := A + \partial\iota_C - z^*. \tag{7.7.5} \]

Proof. No matter the choice of $z^*$, $\operatorname{dom} S_1 = \operatorname{dom} S_2 = C$, as $C \subset \operatorname{dom} T$ and $\operatorname{dom} T = \operatorname{dom}\partial f \cap \operatorname{dom} A$. Now, $\iota_C$ is a proper convex function since $C$ is not empty, and so by Corollary 6.2.25, for all $x, y \in C$,
\[ M_{\partial\iota_C}(x, y) = \iota_C(y) - \iota_C(x) = 0, \tag{7.7.6} \]
a convex function in $y$ for every $x \in C$. Since for every $x, y \in C$, $M_{S_1}(x, y) = M_{\partial f + z^*}(x, y) + M_{\partial\iota_C}(x, y)$ and $M_{S_2}(x, y) = M_{A - z^*}(x, y) + M_{\partial\iota_C}(x, y)$ (see Corollary 6.2.3), $M_{S_1}(x, y)$ and $M_{S_2}(x, y)$ are convex functions in $y$. Therefore, $W_{S_1}$ and $W_{S_2}$ are skew symmetric saddle functions. Now, for $x, y, z \in C$, we have from Theorem 7.6.3 that
\[ M_T(x, z) + M_T(z, y) = f(y) - f(x) + \langle y - x, Az \rangle \]
and that
\[ M_T(x, y) - M_T(x, z) - M_T(z, y) = \langle y - x, Ax - Az \rangle. \]
By (7.7.6), Corollary 6.2.25, and Proposition 6.2.9,
\[ M_{S_1}(x, y) = f(y) - f(x) + \langle y - x, z^* \rangle \]
and
\[ M_{S_2}(x, y) = \langle y - x, Ax - z^* \rangle. \]
(Recall by Fact 4.0.12 that $\langle y, Az \rangle = \langle y, z^* \rangle$ for all $z^* \in Az$.) Therefore, since $\operatorname{dom} S_1 = \operatorname{dom} S_2 = C$, $\varphi_1 = W_{S_1}$ and $\varphi_2 = W_{S_2}$.

Remark 7.7.5 If there is a convex set $C$ such that $0 \in C \subset \operatorname{dom} T$, then for the choice of $z = 0$, since $A$ is a linear relation, $0 \in Az = A0$. In this manner, $S_1 = \partial f + \partial\iota_C$ and $S_2 = A + \partial\iota_C$, which is likewise the case for any $z \in C \subset \operatorname{dom} T$ such that $0 \in Az$.
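As a worked low dimensional instance of Proposition 7.7.4, consider the following illustrative choices, which are assumptions made here and not an example from the thesis: $X = \mathbb{R}^2$, $f(x) = \tfrac{1}{2}\|x\|^2$, $A$ the rotation $A(x_1, x_2) = (-x_2, x_1)$, $C = \operatorname{dom} T = \mathbb{R}^2$ (so that $\partial\iota_C = 0$), and $z = (1, 0)$ with $z^* = Az = (0, 1)$. Using (7.6.2) and (7.6.3), the two saddle functions can then be written out explicitly:

```latex
% Illustrative data (assumed): f(x) = \tfrac12\|x\|^2, A(x_1,x_2) = (-x_2, x_1),
% C = \mathbb{R}^2, z = (1,0), z^* = Az = (0,1).
\begin{align*}
\varphi_1(x, y) &= f(y) - f(x) + \langle y - x, z^* \rangle
                 = \tfrac{1}{2}\|y\|^2 - \tfrac{1}{2}\|x\|^2 + (y_2 - x_2), \\
\varphi_2(x, y) &= \langle y - x, Ax - z^* \rangle
                 = x_1 y_2 - x_2 y_1 - (y_2 - x_2). \\
\intertext{Differentiating in the second argument and evaluating at $y = x$ gives}
\nabla_y \varphi_1(x, y)\big|_{y = x} &= x + z^* = \nabla f(x) + z^* = S_1 x, \\
\nabla_y \varphi_2(x, y)\big|_{y = x} &= (-x_2,\; x_1 - 1) = Ax - z^* = S_2 x,
\end{align*}
% so that S_1 x + S_2 x = \nabla f(x) + Ax = Tx.
```

In this instance the shift by $z^*$ cancels in the sum $S_1 + S_2$, and choosing $z = 0$ instead would give $S_1 = \nabla f$ and $S_2 = A$ directly, in line with Remark 7.7.5.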
Remark 7.7.6 If $\operatorname{dom} T$ is convex and $T = \partial f + A$ is a clean decomposition ($\operatorname{dom}\partial f = \operatorname{dom} T$), then $S_1 = \partial f + z^*$ for any $z^* \in Az$.

Proposition 7.7.7 Given any monotone operator $T : X \to 2^X$ with a Borwein-Wiersma decomposition $T = \partial f + A$ and a convex set $C \subset \operatorname{dom} T$, choose some $z \in C$ and define $\varphi_1$ and $\varphi_2$ as in Proposition 7.7.4. Then, for any $x \in C$ and any $z^* \in Az$,
\[ \partial\varphi_1(x, \cdot)(x) \supset (\partial f)(x) + (\partial\iota_C)(x) + z^*, \tag{7.7.7} \]
with equality when $\partial f + \partial\iota_C$ is maximal monotone, and
\[ \partial\varphi_2(x, \cdot)(x) \supset Ax + (\partial\iota_C)(x) - z^*, \tag{7.7.8} \]
with equality when $A + \partial\iota_C$ is maximal monotone.

Proof. Apply Theorem 6.3.5 to Proposition 7.7.4 to obtain (7.7.7) and (7.7.8), with equality respectively if $\partial f + \partial\iota_C$ or $A + \partial\iota_C$ is maximal monotone.

Remark 7.7.8 In Proposition 7.7.7, note that if $T$ is maximal monotone, then any maximal monotone extension $\tilde{A}$ of $A$ that is also a skew linear relation is such that $T = \partial f + \tilde{A}$ is also a Borwein-Wiersma decomposition of $T$.

Proposition 7.7.9 Given any monotone operator $T : X \to 2^X$ with a Borwein-Wiersma decomposition $T = \partial f + A$ and given a closed convex set $C \subset \operatorname{dom} T$ with nonempty interior, choose some $z \in C$ and define $\varphi_1$ and $\varphi_2$ as in Proposition 7.7.4. Then, for any $x \in C$ and any $z^* \in Az$,
\[ \partial\varphi_1(x, \cdot)(x) = (\partial f)(x) + (\partial\iota_C)(x) + z^*. \tag{7.7.9} \]
If $T$ is maximal monotone, then for any maximal monotone linear relation $\tilde{A}$ such that $\operatorname{gra}\tilde{A} \supset \operatorname{gra} A$,
\[ T = \partial f + \tilde{A}|_{\operatorname{dom} A} = \partial f + (A + (\operatorname{dom} A)^{\perp}) \tag{7.7.10} \]
is a Borwein-Wiersma decomposition of $T$, and, for all $x \in C$,
\[ \partial\varphi_2(x, \cdot)(x) = \tilde{A}x + (\partial\iota_C)(x) - z^*. \tag{7.7.11} \]

Proof. If $\operatorname{int} C$ is nonempty, then since $C \subset \operatorname{dom}\partial f$, the qualification constraint $C \cap \operatorname{int}(\operatorname{dom}\partial f) \neq \emptyset$ is satisfied. Therefore, by Fact 3.6.3, the sum $\partial f + \partial\iota_C + z^*$ is maximal monotone, and so by Proposition 7.7.7, $\partial\varphi_1(x, \cdot)(x) = (\partial f)(x) + (\partial\iota_C)(x) + z^*$.

Suppose now that $T$ is maximal monotone, and consider a maximal monotone extension $\tilde{A}$ of $A$ such that $\tilde{A}$ is a linear relation.
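(Such a maximal monotone extension exists by Proposition 4.0.19.) Since $\partial f + \tilde{A}$ is the sum of two monotone operators, it is itself monotone, and since $\operatorname{gra} A \subset \operatorname{gra}\tilde{A}$, we obtain that $\operatorname{gra} T \subset \operatorname{gra}(\partial f + \tilde{A})$. Therefore, as $T$ is maximal monotone, $T = \partial f + \tilde{A}$. Now, let $B := \tilde{A}|_{\operatorname{dom} A}$, so that $T = \partial f + B$. The operator $B$ is a linear relation since $\operatorname{dom} A$ is a subspace, and so for all $(x, x^*) \in \operatorname{gra} B$, we have that $x^* + (\operatorname{dom} B)^{\perp} = x^* + (\operatorname{dom} A)^{\perp} \subset Bx$. Therefore, $\operatorname{gra} B = \operatorname{gra} A + X \times (\operatorname{dom} A)^{\perp}$. Since $A$ is skew, for every $x \in \operatorname{dom} A$, $0 = \langle x, Ax \rangle = \langle x, Bx \rangle$, and so $B$ is a skew linear relation. Therefore, $T = \partial f + B$ is a Borwein-Wiersma decomposition of $T$. Since $C \subset \operatorname{dom} A \subset \operatorname{dom}\tilde{A}$, and since $\operatorname{int} C \neq \emptyset$, the qualification constraint $C \cap \operatorname{int}(\operatorname{dom}\tilde{A}) \neq \emptyset$ is satisfied, and so by Fact 3.6.3, $\tilde{A} + \partial\iota_C - z^*$ is maximal monotone. Therefore, by Proposition 7.7.7, we obtain (7.7.11).

Corollary 7.7.10 Suppose that $T : X \to 2^X$ is a maximal monotone operator such that $\operatorname{dom} T$ is closed, convex, and has nonempty interior, and where $0 \in \operatorname{dom} T$. Define $\varphi_1$ and $\varphi_2$ as in Proposition 7.7.4, where $z = 0$ and $C = \operatorname{dom} T$. Then, for any Borwein-Wiersma decomposition $T = \partial f + A$ of $T$,
\[ T = \partial(f|_{\operatorname{dom} T}) + (A + (\operatorname{dom} A)^{\perp}) \tag{7.7.12} \]
is a Borwein-Wiersma decomposition of $T$, and for all $y \in \operatorname{dom} T$,
\[ \partial_y \varphi_1(\cdot, y)(\cdot) = \partial(f|_{\operatorname{dom} T}), \tag{7.7.13} \]
and
\[ \partial_y \varphi_2(\cdot, y)(\cdot) = A + (\operatorname{dom} A)^{\perp} + (\partial\iota_{\operatorname{dom} T}). \tag{7.7.14} \]
If $T = \partial f + A$ is an essential decomposition, then for any $x \in \operatorname{ri}\operatorname{dom} T$,
\[ \partial\varphi_2(x, \cdot)(x) = Ax + (\operatorname{dom} T)^{\perp}. \tag{7.7.15} \]

Proof. Since $z = 0 \in \operatorname{dom} T$, let $z^* = 0 \in Az = A0$, and apply Proposition 7.7.9 to $T$ for $C = \operatorname{dom} T$. Therefore, for any Borwein-Wiersma decomposition $T = \partial f + A$, $\partial f + \partial\iota_{\operatorname{dom} T} \subset \partial(f|_{\operatorname{dom} T})$ is maximal monotone, and so $\partial f + \partial\iota_{\operatorname{dom} T} = \partial(f|_{\operatorname{dom} T})$, which yields (7.7.13). Also, (7.7.14) is immediate from (7.7.11). Since $\operatorname{gra}\partial(f|_{\operatorname{dom} T}) \supset \operatorname{gra}(\partial f)|_{\operatorname{dom} T}$, we obtain (7.7.12) from (7.7.10) and the maximal monotonicity of $T$.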
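Since $0 \in \operatorname{dom} T \subset \operatorname{Aff}\operatorname{dom} T$, by Fact 2.2.14 we obtain (7.7.15) for all $x \in \operatorname{ri}\operatorname{dom} T$ by Proposition 7.3.13.

Theorem 7.7.11 Given any monotone operator $T : X \to 2^X$ with a Borwein-Wiersma decomposition $T = \partial f + A$ such that $\operatorname{dom} T$ is nonempty and convex, choose some $z \in \operatorname{dom} T$ and define $\varphi_1$ and $\varphi_2$ as in Proposition 7.7.4. Then, for any $x \in \operatorname{ri}\operatorname{dom} T$ and any $z^* \in Az$,
\[ \partial\varphi_1(x, \cdot)(x) \supset (\partial f)(x) + (\operatorname{dom} A)^{\perp} + \operatorname{span}\{P_{(\operatorname{dom} T - z)^{\perp}} z\} + z^* \tag{7.7.16} \]
and, for all $x^* \in Ax$,
\[ \partial\varphi_2(x, \cdot)(x) = x^* + (\operatorname{dom} T)^{\perp} + \operatorname{span}\{P_{(\operatorname{dom} T - z)^{\perp}} z\} - z^*. \tag{7.7.17} \]
If $T = \partial f + A$ is a clean and essential decomposition of $T$, then, for all $x \in \operatorname{dom} T$,
\[ \partial\varphi_1(x, \cdot)(x) = (\partial f)(x) + z^*, \tag{7.7.18} \]
and for all $x \in \operatorname{ri}\operatorname{dom} T$ and all $z \in \operatorname{dom} T$,
\[ \partial\varphi_2(x, \cdot)(x) = Ax + (\operatorname{dom} A)^{\perp} + \operatorname{span}\{P_{(\operatorname{dom}\partial f - z)^{\perp}} z\} - z^*. \tag{7.7.19} \]

Proof. By the definition of $\varphi_1$ and by Theorem 7.6.3, for all $x \in \operatorname{dom} T$,
\[ \varphi_1(x, y) = f(y) - f(x) + \langle y - x, Az \rangle + \iota_{\operatorname{dom} T}(y). \]
Therefore, for all $z^* \in Az$,
\[ \partial\varphi_1(x, \cdot)(x) = \partial(f|_{\operatorname{dom} T} + \langle \cdot, z^* \rangle)(x) = \partial(f|_{\operatorname{dom} T})(x) + z^*. \]
Similarly,
\[ \partial\varphi_2(x, \cdot)(x) = \partial(\langle \cdot, Ax \rangle|_{\operatorname{dom} T} - \langle \cdot, z^* \rangle)(x) = \partial(\langle \cdot, Ax \rangle|_{\operatorname{dom} T})(x) - z^*. \]
The results then follow from Proposition 7.5.1 and Proposition 7.5.3.

Proposition 7.7.12 Suppose $T : X \to 2^X$ is a monotone operator with an essential Borwein-Wiersma decomposition $T = \partial f + A$ such that $\operatorname{dom} T$ is nonempty and convex. Suppose further that there exist $z \in \operatorname{dom} T$ and $\lambda \in \mathbb{R}$ such that $\lambda \neq 0$ and $(1 + \lambda) z \in \operatorname{dom} T$. Define the functions $\psi_1, \psi_2 : X \times X \to \mathbb{R} \cup \{-\infty, +\infty\}$ by
\[ \psi_1(x, y) := \begin{cases} \dfrac{1}{\lambda}\Big( (1 + \lambda) M_T(x, z) + (1 + \lambda) M_T(z, y) - M_T(x, (1 + \lambda) z) - M_T((1 + \lambda) z, y) \Big), & x, y \in \operatorname{dom} T, \\ +\infty, & x \in \operatorname{dom} T,\ y \notin \operatorname{dom} T, \\ -\infty, & x \notin \operatorname{dom} T, \end{cases} \tag{7.7.20} \]
\[ \psi_2(x, y) := \begin{cases} M_T(x, y) - \dfrac{1}{\lambda}\Big( (1 + \lambda) M_T(x, z) + (1 + \lambda) M_T(z, y) - M_T(x, (1 + \lambda) z) - M_T((1 + \lambda) z, y) \Big), & x, y \in \operatorname{dom} T, \\ +\infty, & x \in \operatorname{dom} T,\ y \notin \operatorname{dom} T, \\ -\infty, & x \notin \operatorname{dom} T. \end{cases} \tag{7.7.21} \]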
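Then, for all $x \in \operatorname{dom} T$,
\[ \partial\psi_1(x, \cdot)(x) \supset (\partial f)(x) + (\operatorname{dom} T)^{\perp}, \tag{7.7.22} \]
and for all $x \in \operatorname{ri}\operatorname{dom} T$,
\[ \partial\psi_2(x, \cdot)(x) = Ax + (\operatorname{dom} T)^{\perp} \tag{7.7.23} \]
and
\[ Tx - \partial\psi_2(x, \cdot)(x) = (\partial f)(x) + (\operatorname{dom} T)^{\perp}. \tag{7.7.24} \]

Proof. By Theorem 7.6.5,
\[ \psi_1(x, y) = \begin{cases} f(y) - f(x), & x, y \in \operatorname{dom} T, \\ +\infty, & x \in \operatorname{dom} T,\ y \notin \operatorname{dom} T, \\ -\infty, & x \notin \operatorname{dom} T, \end{cases} \tag{7.7.25} \]
\[ \psi_2(x, y) = \begin{cases} \langle y, Ax \rangle, & x, y \in \operatorname{dom} T, \\ +\infty, & x \in \operatorname{dom} T,\ y \notin \operatorname{dom} T, \\ -\infty, & x \notin \operatorname{dom} T. \end{cases} \tag{7.7.26} \]
By Fact 2.2.12, we have that $0 \in \operatorname{Aff}\operatorname{dom} T$. Therefore, by Fact 2.2.14, for all $z \in \operatorname{dom} T$, $P_{(\operatorname{dom} T - z)^{\perp}} z = 0$. The results then follow from Proposition 7.5.1.

Note that by Proposition 7.7.12, if $T = \partial f + A$ is a clean and essential decomposition, then $(\operatorname{dom} A)^{\perp} = (\operatorname{dom} T)^{\perp} = (\operatorname{dom}\partial f)^{\perp}$, and so $\partial f + (\operatorname{dom} A)^{\perp} = \partial f$. Since $B := x \mapsto Ax + (\operatorname{dom} A)^{\perp}$ is also a linear relation, $T = \partial f + B$ is an essential and clean decomposition of $T$ and, for $\psi_1$ and $\psi_2$ as defined in Proposition 7.7.12, for all $x \in \operatorname{ri}\operatorname{dom} T$,
\[ \partial\psi_2(x, \cdot)(x) = Bx \tag{7.7.27} \]
and
\[ Tx - \partial\psi_2(x, \cdot)(x) = (\partial f)(x). \tag{7.7.28} \]
Indeed, given any monotone operator with a clean and essential decomposition and a convex domain, it is possible to decompose the operator exactly on the relative interior of its domain.

The theorem below is not entirely constructive in the case where $0 \in \operatorname{Aff}\operatorname{dom} T$: it requires knowledge of two collinear points in $\operatorname{dom} T$, which are guaranteed to exist by Fact 2.2.12. If $0 \notin \operatorname{Aff}\operatorname{dom} T$, the process below does not specify how the skew linear relation behaves in one dimension; however, any specification that is a skew linear relation would result in a complete decomposition.

See Figure 7.3 for a visualization of $W$, $w$, $V$, and $\operatorname{Aff}\operatorname{dom} T$ as they may appear in Theorem 7.7.13 below for an operator $T$ such that $0 \notin \operatorname{Aff}\operatorname{dom} T$.

Theorem 7.7.13 Suppose that $T : X \to 2^X$ is a monotone operator with convex domain that has at least one clean and essential Borwein-Wiersma decomposition.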
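Consider any other clean and essential decomposition $T = \partial g + B$ (so that $\operatorname{dom} A = \operatorname{dom} B = \operatorname{span}\operatorname{dom} T$) and the decomposition $T = \partial f + \hat{A}$ where $\hat{A} : x \mapsto Ax + (\operatorname{dom} A)^{\perp}$, which is also clean and essential.

(i) If $0 \in \operatorname{Aff}\operatorname{dom} T$, then $\partial f = \partial g$ and $\operatorname{gra} B \subset \operatorname{gra}\hat{A}$. Indeed, for every $x \in \operatorname{dom} A$,
\[ Ax + (\operatorname{dom} A)^{\perp} = Bx + (\operatorname{dom} A)^{\perp}. \tag{7.7.29} \]
With $\psi_1$ and $\psi_2$ as defined in Proposition 7.7.12, for every $x \in \operatorname{dom} T$,
\[ \partial\psi_1(x, \cdot)(x) = (\partial f)(x), \tag{7.7.30} \]
and for every $x \in \operatorname{ri}\operatorname{dom} T$,
\[ \partial\psi_2(x, \cdot)(x) = \hat{A}x. \tag{7.7.31} \]

(ii) If $0 \notin \operatorname{Aff}\operatorname{dom} T$, choose some $z \in \operatorname{dom} T$ and $z^* \in Az$, with $\varphi_1$ and $\varphi_2$ as defined in Proposition 7.7.4. Then, there is an essential and clean decomposition $T = \partial\tilde{f} + \tilde{A}$ such that for all $x \in \operatorname{dom} T = \operatorname{dom}\partial f$,
\[ \partial\varphi_1(x, \cdot)(x) = (\partial\tilde{f})(x) = (\partial f)(x) + z^* = (\partial g)(x) + Bz. \tag{7.7.32} \]
Let $W := \operatorname{span}\{P_{(\operatorname{dom} T - z)^{\perp}} z\}$. Then, for all $x \in \operatorname{ri}\operatorname{dom} T$,
\[ P_{W^{\perp}}\, \partial\varphi_2(x, \cdot)(x) = P_{W^{\perp}}\big( \tilde{A}x + (\operatorname{dom}\tilde{A})^{\perp} \big). \tag{7.7.33} \]
Furthermore, for all $x \in \operatorname{dom} A$,
\[ \begin{aligned} P_{W^{\perp}}\big( \tilde{A}x + (\operatorname{dom}\tilde{A})^{\perp} \big) &= P_{W^{\perp}}\big( Ax - z^* + (\operatorname{dom} A)^{\perp} \big) \\ &= P_{W^{\perp}}\big( \hat{A}x - z^* \big) \\ &= P_{W^{\perp}}\big( Bx - Bz + (\operatorname{dom} B)^{\perp} \big). \end{aligned} \tag{7.7.34} \]
There exists at least one skew linear relation $\hat{B} : X \to 2^X$ such that, for all $x \in \operatorname{dom} T$,
\[ P_{W^{\perp}} \hat{B}x = P_{W^{\perp}}\big( \tilde{A}x + (\operatorname{dom}\tilde{A})^{\perp} \big), \]
and for any such skew linear relation $\hat{B}$, the decomposition $T = \partial\tilde{f} + \hat{B}$ is clean and essential.

Proof. Recall that for clean and essential decompositions $T = \partial f + A$, it is always the case that $(\operatorname{dom} T)^{\perp} = (\operatorname{dom} A)^{\perp}$.

(i) If $T = \partial f + A$ is a clean and essential decomposition of $T$, then $\operatorname{dom}\partial f \subset \operatorname{dom} A$, and so $(\operatorname{dom}\partial f)^{\perp} \supset (\operatorname{dom} A)^{\perp}$. Therefore, $T = \partial f + \hat{A}$ is also a clean and essential decomposition of $T$. Since $T = \partial g + B$ is also a clean and essential decomposition, $\operatorname{dom} A = \operatorname{dom} B = \operatorname{span}\operatorname{dom} T$ by Proposition 7.3.7.

If $0 \in \operatorname{Aff}\operatorname{dom} T$, then by Fact 2.2.12 and the convexity of $\operatorname{dom} T$, there exist $z \in \operatorname{dom} T$ and $\lambda \in \mathbb{R}$ such that $\lambda \neq 0$ and $(1 + \lambda) z \in \operatorname{dom} T$. By Proposition 7.7.12, for this choice of $z$ and $\lambda$ and for every $x \in \operatorname{ri}\operatorname{dom} T$, $\partial\psi_2(x, \cdot)(x) = \hat{A}x$, and for every $x \in \operatorname{dom} T$,
\[ \partial\psi_1(x, \cdot)(x) \supset (\partial f)(x) + (\operatorname{dom} A)^{\perp}. \]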
Since (dom ∂f )⊥ ⊃ (dom A)⊥ , since dom T = dom ∂f , and since ∂f is maximal monotone,  ∂ψ1 (x, ·)(x) = (∂f )(x). Since ψ1 and ψ2 depend only on T and not any particular decompo-  sition, if T = ∂g + B is also a clean and essential decomposition of T , then ∂f = ∂g, and ˆ for all x ∈ ri dom T . Since A and B are linear relations, by Fact 4.0.5, gra B ⊂ gra Aˆ Bx ⊂ Ax on span dom T , and therefore gra B ⊂ gra Aˆ on dom A. Therefore, for every x ∈ dom A, Ax + (dom A)⊥ = Bx + (dom A)⊥ .  (ii) Now, suppose that 0 ∈ / Aff dom T . By Fact 2.2.12, span dom T = Aff dom T . Since  0∈ / Aff dom T ⊃ dom T , then 0 ∈ / dom T .  By Theorem 7.3.17, since there exists a clean and essential decomposition of T , there exists ˜ a clean and essential decomposition T = ∂ f˜ + A˜ such that 0 ∈ Az. As Aff dom T is an affine subspace of X, Aff dom T = z + V where V = span(dom T − z) is  a linear subspace of X (as for W , the choice of z ∈ dom T here always yields the same V ). Let  w := P(dom T −z)⊥ z (this is independent of z), so that W = span{w}, a subspace of X. Note that W is closed and is at most 1-dimensional. In particular, w + V = Aff dom T .  Let φ1 , φ2 be as defined (for T and z) in Proposition 7.7.4. Then, by Proposition 7.7.4, for every x ∈ dom T ,  ∂φ1 (x, ·)(x) = (∂ f˜)(x),  164  and for every x ∈ ri dom T , ˜ + (dom A) ˜ ⊥. ∂φ2 (x, ·)(x) ⊃ Ax ˜ and T = ∂g + B are As a consequence of Theorem 7.7.11 and since T = ∂f + A, T = ∂ f˜ + A, all Borwein-Wiersma decompositions of T , for every x ∈ ri dom T , ˜ + (dom A) ˜⊥ PW ⊥ (∂φ2 (x, ·)(x)) = PW ⊥ Ax  = PW ⊥ Ax − z ∗ + (dom A)⊥  (7.7.35)  = PW ⊥ Bx − Bz + (dom B)⊥ .  ˆ : X → 2X on dom B ˆ = dom A by Define an operator B ˜ + (dom A) ˜ ⊥ x + γ(x)w, ˆ := PW ⊥ Ax Bx  (7.7.36)  ˆ is a linear relation and T = ∂ f˜ + B, ˆ where γ(x) : X → R is a linear function. In any case, B ˆ is skew, then B ˆ is monotone linear relation, and T = ∂ f˜+ B ˆ since gra ∂f +X ×W ⊂ gra ∂f . 
If B̂ is skew, then T = ∂f̃ + B̂ is a clean and essential decomposition of T. There is at least one operator B̂ satisfying (7.7.36) such that B̂ is skew, since when γ = ⟨w, P_W Ã⟩, we have that
\[ \operatorname{gra} \hat{B} = \operatorname{gra} \tilde{A} + X \times (\operatorname{dom} A)^{\perp} = \operatorname{gra} \tilde{A} + X \times (\operatorname{dom} \tilde{A})^{\perp}. \]

By the above theorem, finding a clean and essential decomposition of T is reduced to the problem of reconstructing the skew linear relation given one missing dimension when 0 ∉ Aff dom T. However, not knowing how the skew linear operator Ax behaves in the subspace W is not much of an issue, as this is perpendicular to dom T and contained in the image of gra ∂f. Therefore, given the great number of potential decompositions in this case, although the decomposition from Theorem 7.7.13 may not be exact, it yields the best result possible.

Note that by Theorem 7.7.13, in the case where 0 ∉ Aff dom T it is impossible to determine the decomposition uniquely, as for each choice of z ∈ dom T and z∗ ∈ Az there is another Borwein-Wiersma decomposition that differs by a constant vector. However, these are exactly the decompositions discovered by this process. Also in the case where 0 ∉ Aff dom T, if w ∈ dom T, then, by Theorem 7.6.3,
\[ M_T(x,y) - M_T(x,w) - M_T(w,y) = \langle y, \tilde{A}x \rangle + \langle x - y, \tilde{A}w \rangle = \langle y, \tilde{A}x \rangle. \]
Therefore, if w = z, the consideration of T = ∂f̃ + Ã in the proof would not be required as there would be no z∗ term to consider. However, doing so is of no import as the conclusions of this special case are the same as the general.

7.8  The general decomposition of Borwein-Wiersma decomposable operators

In this section, we demonstrate how to identify the subgradient and skew linear relation components of a Borwein-Wiersma decomposable monotone operator T, even if its domain is nonconvex. Indeed, all that is required to obtain similar results to decomposition using saddle functions is that the domain of T is starshaped and that a point in the convex kernel of dom T can be identified.
As such, we show how T can be decomposed exactly in many instances into ∂f + A for any point in eri(dom T) ∩ dom T. Furthermore, as long as the domain of T is starshaped and Aff dom T is closed, we show how a Borwein-Wiersma decomposition of T can always be constructed. In contrast to the decomposition using the saddle function W_T, we set up the problems to use one fewer evaluation of M_T for determining ∂f and A. As such, a proof where dom T is convex is shown for comparison.

At times, we shall assume that a given T = ∂f + A is essential. Since every Borwein-Wiersma decomposable operator has an essential decomposition (Proposition 7.3.14), this assumption is always trivial to satisfy. It is nevertheless included for completeness, in order to track how a non-essential decomposition may differ from the recovered decomposition. Similarly, if a clean decomposition of T exists, then the methods below will discover it, although there will of course be differences for any decomposition of T that is not clean. Recall that for any maximal monotone operator T : R^n → 2^(R^n), a clean decomposition will always exist by Proposition 7.4.11 and the fact that for any linear relation A : R^n → 2^(R^n), the domain of A is closed.

In the main theorems of this section, Theorem 7.8.7 and Theorem 7.8.8 below, as well as in their corollaries, two main assumptions are often made to simplify results, namely
(i) there is a clean decomposition of T, where dom T = dom ∂f,
(ii) one or two points in the convex kernel of dom T can be found,
where T is given as a Borwein-Wiersma decomposable monotone operator with some Borwein-Wiersma decomposition T = ∂f + A (although sometimes this is required to be (one of) the essential decomposition(s)). Both are somewhat problematic, as they are conditions that may not be known about the operator in black-box-like settings.
However, there are many cases in which these assumptions are more easily satisfied, as the following three propositions demonstrate. We will continue to use the notation of Fact 2.2.9, where ∂̂ denotes the subdifferential operator applied to a possibly nonconvex function. This helps the reader to track when the function itself may be nonconvex, which in this setting is most often due to it having a nonconvex domain.

Proposition 7.8.1 Let T : X → 2^X be a monotone Borwein-Wiersma decomposable operator, with some Borwein-Wiersma decomposition T = ∂f + A. Each of the following conditions implies that this decomposition is clean (dom T = dom ∂f):
(i) Aff dom T = X;
(ii) dom T = X;
(iii) int dom T ≠ ∅;
(iv) T is maximal monotone, and locally bounded near every x ∈ dom T;
(v) T is maximal monotone, and both Aff dom T and Aff dom ∂f are closed;
(vi) T is maximal monotone and X = R^n.

Proof. (i) If Aff dom T = X, then since dom T = dom ∂f ∩ dom A, and since dom A is a subspace of X, it must be that dom A = X. Therefore, dom ∂f = dom T.
(ii) ⇒ (i).
(iii) ⇒ (i).
(iv) By Theorem 7.6.2 (Rockafellar-Veselý), if T is locally bounded at x then x ∈ int dom T, which leads to condition (iii).
(v) See Corollary 7.5.6.
(vi) ⇒ (v), since subspaces and affine subspaces are always closed in R^n.

Proposition 7.8.2 Let T : X → 2^X be a maximal monotone Borwein-Wiersma decomposable operator. Then, a clean and essential decomposition of T exists if dom A is closed for some Borwein-Wiersma decomposition T = ∂f + A.

Proof. Follows from Proposition 7.4.11 and Proposition 7.4.8, since assumption (A1) in Section 7.4 holds for any decomposition T = ∂f + A where dom A is closed, and since dom B is closed for any standardized decomposition T̈ = ∂f + B of such an operator T.

Proposition 7.8.3 Let T : X → 2^X be a monotone operator.
Then, each of the following conditions implies that z ∈ cern dom T:
(i) z ∈ X and dom T = X;
(ii) z ∈ dom T and dom T is convex;
(iii) z ∈ dom T, T is maximal monotone, int dom T ≠ ∅, and dom T is closed;
(iv) T is maximal monotone and z ∈ int dom T;
(v) T is maximal monotone, z ∈ eri dom T ∩ dom T, and Aff dom T is closed;
(vi) T is maximal monotone, z ∈ eri dom T ∩ dom T, and X is finite dimensional.

Proof. (i) Since cern X = X, for any z ∈ X, z ∈ cern dom T.
(ii) For any convex set C ⊂ X, cern C = C.
(iii) Under these conditions, dom T is convex by Theorem 7.2.11.
(iv) Let z ∈ int dom T. Then, there is an ε > 0 such that the ball of radius ε about z, B_ε(z), is entirely within dom T. Consider an arbitrary x ∈ dom T and a λ ∈ ]0, 1[. Since x ∈ dom T, for every y ∈ B_ε(z), (1 − λ)x + λy ∈ conv dom T. Therefore,
B_λε((1 − λ)x + λz) ⊂ conv dom T,
and so (1 − λ)x + λz ∈ int(conv dom T) = ri(conv dom T). By Theorem 7.2.6, since T is maximal monotone, ri(conv dom T) ⊂ dom T. We now have that for all x ∈ dom T and all λ ∈ ]0, 1[, (1 − λ)x + λz ∈ dom T. Since x ∈ dom T and z ∈ dom T, then [x, z] ⊂ dom T, and so z ∈ cern dom T.
(v) Suppose that T is maximal monotone and that Aff dom T is closed, and consider any z ∈ eri dom T ∩ dom T if one exists. Then, using the notation of Proposition 2.2.6, let V := span(dom T − z) and w := P_{V⊥} z, so that Aff dom T = z + V = w + V. Since Aff dom T is closed, V is a closed subspace, and so define T̃ : V → 2^V for T as it is defined in Proposition 2.2.6. Since T is maximal monotone on X, so is T̃ on V. Since z ∈ eri dom T, (cone conv(dom T − z)) = span(dom T − z) = V. Therefore, since z − w ∈ dom T̃ ⊂ V, and since dom T̃ = dom T − w,
cone conv(dom T̃ − (z − w)) = span(dom T̃ − (z − w)) = V.
Therefore, (z − w) ∈ eri(dom T̃).
Since T̃ is maximal monotone, when V is regarded as a Hilbert space, as it is with regards to T̃, we have by Theorem 7.2.6 that
\[ \operatorname{eri} \operatorname{dom} \tilde{T} = \operatorname{ri} \operatorname{conv}(\operatorname{dom} \tilde{T}) = \operatorname{int}_{\operatorname{Aff} \operatorname{dom} \tilde{T}}(\operatorname{conv} \operatorname{dom} \tilde{T}) = \operatorname{int}(\operatorname{conv} \operatorname{dom} \tilde{T}) \subset \operatorname{dom} \tilde{T}. \tag{7.8.1} \]
Since (z − w) ∈ eri dom T̃, we have by (7.8.1) that (z − w) ∈ int(dom T̃), and so the conditions for (iv) are satisfied. Consider any arbitrary x ∈ dom T, so that (x − w) ∈ dom T̃ ⊂ V. Then, by (iv), [z − w, x − w] ⊂ dom T̃. Therefore, by Proposition 2.2.6, we have that gra T = gra T̃ + {(w, x∗) : x∗ ∈ V⊥}, and so [z, x] ⊂ dom T.
(vi) Follows from (v) above since if X is finite dimensional, Aff dom D is closed for all sets D ⊂ X.

We start the series of decomposition results with the simple case where dom T = X. In such a setting, a full decomposition is always possible, with a simple proof relying only on the properties of M_T.

Theorem 7.8.4 Suppose that T : X → 2^X is a monotone operator such that T = ∂f + A is a Borwein-Wiersma decomposition of T and dom T = X. Then, for all y ∈ X,
\[ M_T(0,y) = f(y) - f(0), \tag{7.8.2} \]
and
\[ \partial M_T(0,\cdot) = \partial f. \tag{7.8.3} \]
For each x ∈ X and for any y0 ∈ X,
\[ \partial_y \big( M_T(x,y) + M_T(y,0) \big)(y_0) = Ax, \tag{7.8.4} \]
and, for all y ∈ X,
\[ M_T(x,y) - M_T(x,0) - M_T(0,y) = \langle y, Ax \rangle. \tag{7.8.5} \]

Proof. For all x, y ∈ X, M_T(x, y) is well defined and convex since dom T = X. Since 0 ∈ A0 for any linear relation A, we obtain (7.8.2) from Theorem 7.6.1, from which follows (7.8.3). Since Theorem 7.6.1 also yields that M_T(x, y) + M_T(y, 0) = f(0) − f(x) + ⟨y, Ax⟩, we also obtain (7.8.4). Finally, for all x, y ∈ X, (7.8.5) follows directly from Corollary 7.6.4.

Remark 7.8.5 Recall from Theorem 7.6.2 (Rockafellar-Veselý) that if T : X → 2^X is a maximal monotone operator that is locally bounded on dom T, then dom T = X.
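The identities (7.8.2) and (7.8.5) of Theorem 7.8.4 can be checked numerically on a small example. The sketch below is illustrative only. It assumes that M_T(x, y) is the integral over t in [0, 1] of ⟨y − x, T(x + t(y − x))⟩ along the segment from x to y, which is consistent with the identities used in the proof but is not the formal definition from Chapter 6, and it approximates that integral with a midpoint rule. The operator on R^2, with f(x) = x1^4/4 + x2^2/2 and A the rotation by a quarter turn, is a hypothetical choice and does not come from the text.

```python
# Illustrative check of (7.8.2) and (7.8.5) for a hypothetical T = grad f + A on R^2.
# Assumption: M_T(x, y) = integral over t in [0, 1] of <y - x, T(x + t (y - x))>.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def f(x):
    # Hypothetical convex function, chosen only for illustration.
    return 0.25 * x[0] ** 4 + 0.5 * x[1] ** 2

def grad_f(x):
    return (x[0] ** 3, x[1])

def A(x):
    # Skew linear operator: <x, A x> = 0 for every x.
    return (-x[1], x[0])

def T(x):
    g, a = grad_f(x), A(x)
    return (g[0] + a[0], g[1] + a[1])

def M_T(x, y, n=2000):
    # Midpoint-rule approximation of the line integral from x to y.
    d = (y[0] - x[0], y[1] - x[1])
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        p = (x[0] + t * d[0], x[1] + t * d[1])
        total += dot(d, T(p))
    return total / n

ZERO = (0.0, 0.0)
x, y = (1.0, -2.0), (0.5, 3.0)

# Residual of (7.8.2): M_T(0, y) - (f(y) - f(0)).
potential_gap = M_T(ZERO, y) - (f(y) - f(ZERO))
# Residual of (7.8.5): M_T(x, y) - M_T(x, 0) - M_T(0, y) - <y, A x>.
skew_gap = (M_T(x, y) - M_T(x, ZERO) - M_T(ZERO, y)) - dot(y, A(x))
print(potential_gap, skew_gap)
```

Both residuals should be zero up to quadrature error. Only evaluations of T enter M_T, so the same routine could be pointed at an operator whose decomposition is not known in advance, provided the line-integral assumption above is appropriate for it.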
Remark 7.8.6 As a converse of sorts to Theorem 7.8.7, if T is a single-valued monotone operator with full domain, if M_T(0, y) is a lower semicontinuous proper convex operator, and if T − ∂M_T(0, y) is a skew linear operator, then T is decomposable, yielding an if and only if condition and partially generalizing a similar result in [19].

Recall for the theorems below that if dom T is convex, then cern dom T = dom T and eri dom T ∩ dom T = ri dom T.

Theorem 7.8.7 Suppose that T : X → 2^X is a monotone operator with a nonempty and starshaped domain and that T = ∂f + A is a Borwein-Wiersma decomposition of T. For z in the convex kernel of dom T, define Φ_{T,z} : X → R ∪ {+∞} by
\[ \Phi_{T,z}(x) := \begin{cases} M_T(z,x), & x \in \operatorname{dom} T, \\ +\infty, & x \notin \operatorname{dom} T. \end{cases} \tag{7.8.6} \]
Then, for all z∗ ∈ Az,
\[ \operatorname{gra} \hat{\partial}\Phi_{T,z} \supset \operatorname{gra}(\partial f + z^*) \cap (\operatorname{dom} T \times X), \tag{7.8.7} \]
and for any x ∈ dom T,
\[ Ax - Az = Ax - z^* \subset Tx - \hat{\partial}\Phi_{T,z}(x). \tag{7.8.8} \]
Let W := span{P_{(dom T − z)⊥} z}. If T is maximal monotone and Aff dom T is closed, then for all x ∈ dom T,
\[ \hat{\partial}\Phi_{T,z}(x) = (\partial f)(x) + z^* + (\operatorname{dom} T)^{\perp} + W, \tag{7.8.9} \]
and
\[ T - \hat{\partial}\Phi_{T,z} = A - z^* + (\operatorname{dom} T)^{\perp} + W + \{ y^* - x^* : x^*, y^* \in \partial f(\cdot) \}. \tag{7.8.10} \]
If the decomposition is clean (dom ∂f = dom T),
\[ \hat{\partial}\Phi_{T,z} = \partial f + z^*, \tag{7.8.11} \]
and
\[ T - \hat{\partial}\Phi_{T,z} = A - z^* + \{ y^* - x^* : x^*, y^* \in \partial f(\cdot) \}. \tag{7.8.12} \]

Proof. Throughout this proof, the restricted domain notation f|_D from Remark 7.3.8 is used, as defined in (7.3.21) and (7.3.22). As z ∈ cern dom T, from Proposition 7.6.1 we have that
\[ \Phi_{T,z}(x) = f|_{\operatorname{dom} T}(x) - f(z) + \langle x - z, Az \rangle, \tag{7.8.13} \]
and so for all z∗ ∈ Az,
\[ \hat{\partial}\Phi_{T,z}(\cdot) = \{ x^* + z^* : x^* \in \hat{\partial}(f|_{\operatorname{dom} T})(\cdot) \}. \tag{7.8.14} \]
The inclusion (7.8.7) follows since gra ∂̂(f|_{dom T}) ⊃ gra (∂f)|_{dom T} (or from Proposition 7.5.1). Therefore, for all x ∈ dom T, (7.8.8) holds.

If Aff dom T is closed, then (7.8.14) together with Proposition 7.5.5 similarly yields (7.8.9) and (7.8.10). If the decomposition T = ∂f + A is clean, then dom T = dom ∂f.
From Fact 2.2.9 we know that ∂̂(f|_{dom T}) = ∂f. This together with (7.8.14) yields (7.8.11) for all z∗ ∈ Az, from which follows (7.8.12).

Theorem 7.8.8 Suppose that T : X → 2^X is a monotone operator with a starshaped domain and a Borwein-Wiersma decomposition T = ∂f + A such that (dom A)⊥ = (dom T)⊥. For any x and z in the convex kernel of dom T, define Ψ_T : X → R ∪ {+∞} by
\[ \Psi_T(x,y,z) := \begin{cases} M_T(x,y) + M_T(y,z), & y \in \operatorname{dom} T, \\ +\infty, & y \notin \operatorname{dom} T, \end{cases} \tag{7.8.15} \]
and let W := span{P_{(dom T − z)⊥} z}. Then, for all x ≠ z in the convex kernel of dom T, for all z∗ ∈ Az, and for all y0 ∈ eri(dom T) ∩ dom T (which may be empty),
\[ \hat{\partial}_y \Psi_T(x,y,z)(y_0) = Ax - z^* + (\operatorname{dom} A)^{\perp} + \operatorname{span} P_{(\operatorname{dom} T - y_0)^{\perp}} y_0, \tag{7.8.16} \]
and
\[ Tx - \hat{\partial}_y \Psi_T(x,y,z)(y_0) = \partial f + z^* + (\operatorname{dom} A)^{\perp} + W, \tag{7.8.17} \]
where
\[ W = \{0\} \quad \text{if} \quad 0 \in \operatorname{Aff}(\operatorname{dom} T). \tag{7.8.18} \]
If A0 = (dom A)⊥ and 0 ∈ Aff dom T, then
\[ \hat{\partial}_y \Psi_T(x,y,z)(y_0) = Ax - z^*, \tag{7.8.19} \]
in which case
\[ Tx - \hat{\partial}_y \Psi_T(x,y,z)(y_0) = \partial f + (\operatorname{dom} A)^{\perp} + z^*. \tag{7.8.20} \]
If the decomposition is clean (dom ∂f = dom T), then
\[ Tx - \hat{\partial}_y \Psi_T(x,y,z)(y_0) = \partial f + z^*. \tag{7.8.21} \]

Proof. Since x and z are in the convex kernel of dom T, M_T(x, y) and M_T(y, z) are well defined, and so from Proposition 7.6.1, for all y ∈ dom T,
\[ M_T(x,y) + M_T(y,z) = M_T(x,y) - M_T(z,y) = f(z) - f(x) + \langle y - x, Ax \rangle - \langle y - z, Az \rangle, \]
and so, since A is a skew operator,
\[ M_T(x,y) + M_T(y,z) = f(z) - f(x) + \langle y, Ax - Az \rangle. \]
Therefore, Ψ_T(x, y, z) is affine in y on its domain. Let y0 ∈ dom T ∩ eri dom T, if this set is nonempty. By Proposition 7.3.13,
\[ \hat{\partial}_y \Psi_T(x,y,z)(y_0) = \partial_y \big( \langle y, Ax - Az \rangle + \iota_{\operatorname{dom} T} \big)(y_0) = Ax - Az + (\operatorname{dom} T - y_0)^{\perp}. \tag{7.8.22} \]
This together with Fact 2.2.13, from which
\[ (\operatorname{dom} T - y_0)^{\perp} = (\operatorname{dom} T)^{\perp} + \operatorname{span} P_{(\operatorname{dom} T - y_0)^{\perp}} y_0, \tag{7.8.23} \]
yields (7.8.16) since (dom A)⊥ = (dom T)⊥. Now, if 0 ∈ Aff dom T, then by Fact 2.2.14, W = span P_{(dom T − y0)⊥} y0 = {0}.
If A0 = (dom A)⊥ and if 0 ∈ Aff dom T as well, then by Corollary 7.5.2 and (7.8.22) with (7.8.23), we obtain (7.8.19).

Returning to the general case, we have from Fact 4.0.5 (iv) and Proposition 4.0.9 that since A is a linear relation, for all x∗ ∈ Ax, Ax = x∗ + A0 ⊂ x∗ + (dom A)⊥. Since (dom A)⊥ = (dom T)⊥, we obtain (7.8.17). By the definition of the subdifferential, for all (x, x∗) ∈ gra ∂f and all y ∈ dom ∂f, x∗ + (dom ∂f − y)⊥ ⊂ ∂f(x), and so from (7.8.22) follows (7.8.21).

Note that A + (dom A)⊥ is a monotone extension of A, a property of linear relations explored in Proposition 4.0.9. Therefore, we obtain the following result.

Corollary 7.8.9 Suppose the operators T, f, A, and Ψ_T and the points x, z, y0 are as defined in Theorem 7.8.8. Further, suppose that A is maximal monotone. Then, for all z∗ ∈ Az,
\[ \hat{\partial}_y \Psi_T(x,y,z)(y_0) = Ax - z^* + \operatorname{span} P_{(\operatorname{dom} T - y_0)^{\perp}} y_0, \tag{7.8.24} \]
and
\[ Tx - \hat{\partial}_y \Psi_T(x,y,z)(y_0) = \partial f + z^* + \operatorname{span} P_{(\operatorname{dom} T - y_0)^{\perp}} y_0. \tag{7.8.25} \]

As a corollary to the theorems above, we obtain a generalization of the best constructive decomposition of a general Borwein-Wiersma decomposable operator prior to this work. In [19], T is assumed to be C^1 and maximal monotone on R^n, and so only single-valued operators are considered. None of these assumptions are required below.

Corollary 7.8.10 Suppose that T : X → 2^X is a Borwein-Wiersma decomposable operator, and let T = ∂f + A be an essential decomposition of T. Suppose further that 0 ∈ int dom T. Then, T = ∂f + A is a clean decomposition, and 0 is in the convex kernel of dom T. Furthermore, with Φ_{T,z} as defined in Theorem 7.8.7 and Ψ_T as defined in Theorem 7.8.8,
\[ \hat{\partial}\Phi_{T,0} = \partial f \tag{7.8.26} \]
and for all x ∈ cern dom T ⊃ int dom T,
\[ \partial_y \big( \Psi_T(x,y,0) \big)(0) = Ax, \tag{7.8.27} \]
where also
\[ (\partial f)(x) = Tx - \hat{\partial}_y \Psi_T(x,y,0)(0). \tag{7.8.28} \]

Proof. Since 0 ∈ int dom T, 0 ∈ Aff dom T and Aff dom T = span(dom T) = X; therefore, since we have an essential decomposition of T, dom A = span(dom T) = X.
Therefore, dom T = dom A ∩ dom ∂f = dom ∂f, and the decomposition is also clean. Therefore, (dom A)⊥ = (dom T)⊥. Also, for z = 0 and z∗ = 0 (as 0 ∈ A0 for all linear relations A), ∂̂Φ_{T,0} = ∂f by equation (7.8.11) of Theorem 7.8.7.

As dom A = X, it follows that (dom A)⊥ = {0}. Since 0 ∈ dom T, for all z ∈ dom T
span{P_{(dom T − z)⊥} z} = span{P_{(dom T − 0)⊥} 0} = {0}.
Therefore, the rest follows from Theorem 7.8.8 using z = 0 and z∗ = 0.

In Corollary 7.8.10, note that the assumption of an essential decomposition is not necessary, since by its results there is only one possible Borwein-Wiersma decomposition and it is both clean and essential as long as 0 ∈ int dom T.

Recall from Theorem 7.6.2 (Rockafellar-Veselý) that if T : X → 2^X is a maximal monotone operator and there is an x such that T is locally bounded at x, then x ∈ int dom T. By Theorem 7.8.7 we obtain the following result.

Proposition 7.8.11 Suppose that T : X → 2^X is a Borwein-Wiersma decomposable operator such that
(i) 0 ∈ Aff(dom T),
(ii) there exists a clean Borwein-Wiersma decomposition T = ∂f + A such that for some known z ∈ cern dom T, 0 ∈ Az (this is always the case for z = 0 for any clean decomposition of T when 0 ∈ cern dom T).
Then, it is possible to construct a clean and essential decomposition of T at any point x such that (∂f)(x) is a singleton, or, if a point y0 ∈ eri(dom T) ∩ dom T is known, at any point x ≠ z such that x ∈ cern dom T by using the methods of Theorems 7.8.7 and 7.8.8.

Proof. Let T = ∂f + A be a clean Borwein-Wiersma decomposition such that, for some z ∈ cern dom T, 0 ∈ Az. Then, by Proposition 7.3.14, T = ∂f + A|_{span dom ∂f} is an essential and clean decomposition such that 0 ∈ (A|_{span dom T})z. Furthermore, T = ∂f + Â where Â : x → (A|_{span dom T})x + (dom A ∩ span dom T)⊥ is also a clean and essential decomposition of T, where again 0 ∈ Âz. (See for instance the proof of Theorem 7.7.13.) Since it is possible
to choose z∗ = 0, this latter decomposition can be recovered for all x such that (∂f)(x) is a singleton by (7.8.11) and (7.8.12) from Theorem 7.8.7. Similarly, if a y0 ∈ eri(dom T) ∩ dom T is known, this decomposition can be recovered for any x ∈ cern dom T where x ≠ z by (7.8.19) and (7.8.21) from Theorem 7.8.8.

However, by Theorem 7.3.17, we can improve on these results.

Theorem 7.8.12 Suppose that T : X → 2^X is a monotone operator with a nonempty and starshaped domain that admits a clean Borwein-Wiersma decomposition. For any z in the convex kernel of dom T, define Φ_{T,z} : X → R ∪ {+∞} as in (7.8.6). Then, there is a clean and essential decomposition T = ∂f + A such that
\[ \hat{\partial}\Phi_{T,z} = \partial f \tag{7.8.29} \]
and
\[ T - \hat{\partial}\Phi_{T,z} = A + \{ y^* - x^* : x^*, y^* \in \partial f(\cdot) \}. \tag{7.8.30} \]
In particular, if T is single-valued, then
\[ T = \hat{\partial}\Phi_{T,z} + (T - \hat{\partial}\Phi_{T,z}) \tag{7.8.31} \]
is a clean and essential decomposition of T.

Proof. Choose an arbitrary z ∈ dom T. Since a clean and essential decomposition for T exists, by Theorem 7.3.17 there is a clean and essential decomposition T = ∂f + A such that 0 ∈ Az. Therefore, both (7.8.29) and (7.8.30) follow from Theorem 7.8.7 with z∗ = 0. If T is single-valued, then ∂f is single-valued on dom T, and so (7.8.31) is a clean and essential decomposition of T.

Remark 7.8.13 Theorem 7.8.12 generalizes Theorem 4.8 in [11], which states that for a maximal monotone symmetric linear relation, there is a unique Borwein-Wiersma decomposition and it is a subdifferential. Here, we generalize this not only to any nonlinear Borwein-Wiersma decomposable monotone operator that admits a zero skew component, but to any Borwein-Wiersma decomposable monotone operator, maximal monotone or not, that admits a clean decomposition. However, this is not exactly true in our case, since our definition of the decomposition allows the skew linear relation component to be multivalued (even when it is zero); it is, however, true by the above for any clean decomposition.
The assumption of the existence of a clean decomposition does not appear in [11] because every maximal monotone linear relation that admits a decomposition where the single-valued skew linear relation is zero is clean.

Theorem 7.8.14 Suppose that T : X → 2^X is a maximal monotone operator with a nonempty and starshaped domain where 0 ∈ Aff dom T, where some point y0 ∈ eri(dom T) ∩ dom T is known, and that T has a Borwein-Wiersma decomposition T = ∂f + A such that (dom A)⊥ = (dom T)⊥. For any z in the convex kernel of dom T, define Ψ_T : X → R ∪ {+∞} as in (7.8.15). Then, for all x ∈ cern dom T where x ≠ z,
\[ \hat{\partial}_y \Psi_T(x,y,z)(y_0) = Ax. \tag{7.8.32} \]
If T admits a clean decomposition, then there is a clean and essential decomposition T = ∂f + A such that, for all x ∈ cern dom T where x ≠ z, (7.8.32) holds and
\[ Tx - \hat{\partial}_y \Psi_T(x,y,z)(y_0) = (\partial f)(x). \tag{7.8.33} \]

Proof. Since T is maximal monotone and by Theorem 7.3.17, there exists an essential decomposition T = ∂f + A such that 0 ∈ Az. Therefore, for x ∈ cern dom T where x ≠ z, (7.8.32) follows from Theorem 7.8.8 with z∗ = 0 and y0 as defined. If T has a clean decomposition, then again by Theorem 7.3.17 there is a clean and essential T = ∂f + A such that 0 ∈ Az. By Theorem 7.8.8 (with z∗ = 0 and y0 as defined), we obtain both (7.8.32) and (7.8.33) for any x ∈ cern dom T where x ≠ z.

Much of the difficulty above is due to trying to decompose ∂f and A as monotone operators by the calculation of a subgradient. However, by simply combining Proposition 7.6.1 with Theorem 7.3.17, we obtain the following result:

Theorem 7.8.15 Suppose that T : X → 2^X is a Borwein-Wiersma decomposable monotone operator, and fix a z ∈ dom T. Then, there is a monotone operator T̃ with a Borwein-Wiersma decomposition T̃ = ∂f + A so that, for all x, y ∈ dom T such that [x, z] ⊂ dom T and [y, z] ⊂ dom T,
\[ M_T(z,y) - M_T(z,x) = f(y) - f(x) \tag{7.8.34} \]
and, if [x, y] ⊂ dom T as well,
\[ M_T(x,y) - M_T(x,z) - M_T(z,y) = \langle y, Ax \rangle. \tag{7.8.35} \]
Furthermore, T̃ satisfies
\[ \operatorname{gra} \tilde{T} \supset \operatorname{gra} T \tag{7.8.36} \]
with equality if any of the following hold:
(i) T is maximal monotone;
(ii) 0 ∈ Aff dom T;
(iii) there exists a clean decomposition of T.

Remark 7.8.16 Note that in the setting of Theorem 7.8.15, for any Borwein-Wiersma decomposition T̃ = ∂f + A, for a fixed x ∈ dom T, M_T(z, y) − M_T(z, x) = g(y) for a Borwein-Wiersma decomposition T̃ = ∂g + A. This is because by defining the function g : X → R ∪ {+∞} by g(y) := f(y) − f(x), we obtain ∂g = ∂f. Furthermore, the action of the skew linear relation A on any finite dimensional subspace V ⊂ span dom T of X can be determined by a finite number of evaluations of M_T if dom T ∩ V is convex, with the only unknown being the dimension of the subspace A0 where {0} ⊂ A0 ⊂ (dom A)⊥. Therefore, with such a simple construction, it is possible to answer in the affirmative (in the case of Borwein-Wiersma decompositions) a question by Borwein and Wiersma in [19] as to whether it is possible to iteratively construct the acyclic part of a decomposable operator.

7.9  Decomposing linear relations

In this section, we show how the methods above generalize existing results for the constructive decomposition of Borwein-Wiersma decomposable linear relations found in [11]. In [11], Borwein-Wiersma decompositions are defined to be Borwein-Wiersma decompositions T = ∂f + A where A is single-valued, and T is taken to be a maximal monotone linear relation. For the following, note that a clean decomposition of A exists if A is maximal monotone and dom A is closed, by Proposition 7.4.11 and Proposition 7.4.8.

Theorem 7.9.1 Let A : X → 2^X be a Borwein-Wiersma decomposable linear relation, and define Â : X → 2^X to be the linear relation where gra Â = gra A + X × (dom A)⊥. Define φ2 as in Proposition 7.7.4 with z = 0 ∈ dom A and C = dom A.
Then, φ2 is the skew symmetric saddle function
\[ \phi_2(x,y) := \begin{cases} \langle y - x, Ax \rangle, & x, y \in C, \\ +\infty, & x \in C,\ y \notin C, \\ -\infty, & x \notin C, \end{cases} \tag{7.9.1} \]
and
\[ \hat{A} = \big( A - \partial_y \phi_2(\cdot,y)(\cdot) \big) + \partial_y \phi_2(\cdot,y)(\cdot) \tag{7.9.2} \]
is a Borwein-Wiersma decomposition of the linear relation A if a clean decomposition of A exists.

Proof. Suppose that A = ∂f + B is a Borwein-Wiersma decomposition of A, and that this decomposition is clean if a clean decomposition of A exists. Since dom A is a subspace, then B̃ := B|_{dom A} is a linear relation, dom B̃ = dom A, and A = ∂f + B̃ is a Borwein-Wiersma decomposition of A, and is clean if A = ∂f + B is a clean decomposition.

Since A is a linear relation, 0 ∈ dom A, and dom A is nonempty and convex. Let z = 0, and let z∗ = 0 ∈ Az. By Proposition 6.2.9,
\[ M_A(x,y) - M_A(x,z) - M_A(z,y) = \left\langle y - x, \tfrac{Ax + Ay}{2} \right\rangle - \left\langle z - x, \tfrac{Ax + Az}{2} \right\rangle - \left\langle y - z, \tfrac{Ay + Az}{2} \right\rangle = \langle y - x, Ax \rangle, \tag{7.9.3} \]
and so (7.9.1) holds. By (7.7.17) in Theorem 7.7.11, and by the properties of linear relations, for all x ∈ ri dom A = dom A,
\[ \partial \phi_2(x,\cdot)(x) = \tilde{B}x + (\operatorname{dom} \tilde{B})^{\perp} = Bx + (\operatorname{dom} A)^{\perp}, \]
which is itself a linear relation. Therefore, T − ∂φ2(x, ·)(x) = (∂f)(x) + (dom A)⊥. If A = ∂f + B is a clean decomposition, then gra ∂f + X × (dom A)⊥ = gra ∂f + X × (dom ∂f)⊥ = gra ∂f, and so we obtain (7.9.2).

Chapter 8

Conclusion and Future Work

8.1  Conclusion

In Chapter 5, we saw at the conclusion in Table 5.1 that all possible combinations of the five types of monotone operator class have been considered and have at least one example. Table 5.3 did the same for linear operators. Indeed, some extra restrictions exist in R^2 and R^n for linear operators, and these are covered in Table 5.2. In addition, we examined some properties of linear relations. In particular, the framework of Proposition 4.0.13 allows many results for linear operators to be applied to linear relations. In this way, a full set of test cases can be generated from the given examples in low or high dimension.
With a base dimension of R^6, convergence of algorithms (e.g., to find a zero of an operator or to solve an equilibrium problem) can be qualitatively compared for different cases of monotone classes. Further, operators in higher dimension can be constructed which belong or fail to belong to monotone classes in various ratios of dimension.

In Chapter 6, we re-examined the Krauss saddle function representation of monotone operators. Unfortunately, as shown in Example 6.3.8, the Krauss saddle function representation does not lead to the canonical saddle function representation for the subdifferential of proper lower semicontinuous convex functions. Indeed, as the example shows, this representation can be complex and difficult to calculate. To develop a new saddle function representation, Fitzpatrick's last function, here called M_T, was considered. A way of defining M_T so that M_T is a Riemann integral, and thus does not rely on being in the interior of the domain, was developed (see Theorem 6.2.1). For proper lower semicontinuous convex functions f, M_∂f is the same as the Rockafellar antiderivative of ∂f (see Section 6.2.4). As such, when used to create a saddle function W_∂f (see Definition 6.3.1), this saddle function is in the canonical form. Although W_T is not guaranteed to be a saddle function for all monotone operators, if W_T is a saddle function, it is a skew-symmetric saddle function Lagrangian to T (see Theorem 6.3.5). However, M_T can, unlike the Rockafellar antiderivative, be applied to monotone operators that are not subdifferentials. An important class of operators for which W_T is a skew-symmetric saddle function representation consists of those monotone operators with convex domain that are Borwein-Wiersma decomposable (see Proposition 7.7.1).

In Chapter 7, we examined the Borwein-Wiersma decomposition.
However, unlike [11] and [19], the focus taken was on determining the properties of Borwein-Wiersma decompositions and attempting to reconstruct them provided that they exist, rather than on determining conditions for their existence. We have expanded the definition of the Borwein-Wiersma decomposition (as defined for instance in [11]) to include multivalued linear relations as the skew part (see Definition 7.3.4). This increases the number of possible Borwein-Wiersma decompositions for a given decomposable operator, and so clean and essential decompositions are defined as well. Every Borwein-Wiersma decomposable operator has an essential decomposition (Proposition 7.3.14), and many sufficient conditions for the existence of clean decompositions are given (see especially Propositions 7.8.1 and 7.8.2). The existence of a clean decomposition is important for constructing Borwein-Wiersma decompositions (see for instance the decomposition theorems referenced below).

The saddle function W_T can be used to decompose Borwein-Wiersma decomposable operators, and as long as the domain of T is convex, can construct a decomposition of a monotone extension of T that has an error at most of a constant (Theorem 7.7.11). The main result of decompositions using saddle functions is given in Theorem 7.7.13. Here, it is shown that if an operator T has convex domain and a clean decomposition, then all possible Borwein-Wiersma decompositions can be characterized, and it is possible to construct a Borwein-Wiersma decomposition on the relative interior of the domain of T using only evaluations of T without the prior knowledge of any decomposition, with a method that differs depending on whether 0 ∈ Aff dom T or not. Later on, in Section 7.9, Theorem 7.9.1 applies this methodology to generalize the existing results on the Borwein-Wiersma decomposition of linear relations found in [11].

Unfortunately, the use of W_T to decompose operators requires that the domain of T be convex.
To overcome this issue, using the weaker condition of a starshaped domain, subgradients of various combinations of the function M_T are used to decompose the operator. Extending the results of [19], an exact decomposition of T is possible if it is Borwein-Wiersma decomposable and if 0 ∈ int dom T. Far more general cases can be considered, however, as the main Theorems 7.8.7 and 7.8.8 demonstrate.

Finally, in Theorems 7.8.12, 7.8.14, and 7.8.15, the main results of this chapter are summarized, and methods are given for the reconstruction of Borwein-Wiersma decompositions of monotone operators with starshaped domain.

8.2  Future work

Aside from building the table of relationships and examples further, using other classes of monotone operators, the core simple examples provided can be used together with the results of addition of monotone operators and product space preservation of class to generate large numbers and various kinds of higher-dimensional examples that satisfy or break certain conditions. This would be useful for instance in testing the robustness of proposed methods for solving variational inequalities. Furthermore, most examples are in R^2; however, the existence of monotone operators T : R^2 → R^2 that are paramonotone but neither strictly monotone nor 3-cyclic monotone has not yet been demonstrated, as the examples above demonstrating such properties operate on R^3. Since the binary label of the product of two operators (in the product space) is the result of the binary 'AND' operator applied to the binary label of the factors, the examples provided in Chapter 5 can be used to generate higher-dimensional examples with ease that not only satisfy any possible combination of monotone class, but also satisfy or fail to satisfy conditions in any specified ratio of dimension. The Asplund decomposition results from Chapter 7 could allow for a broader range of variational problems to be solved in the manner of Hybrid Steepest descent.
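The binary-label rule for product operators described above can be sketched in a few lines. This is a minimal illustration, assuming a five-bit label with one bit per monotone class. The bit order, the helper names `product_label` and `describe`, and the two factor labels are hypothetical and are not taken from the tables in Chapter 5.

```python
from functools import reduce

# Assumed bit order (most significant first); the thesis's own ordering may differ.
CLASSES = (
    "paramonotone",
    "strictly monotone",
    "3-cyclic monotone",
    "3*-monotone",
    "maximal monotone",
)
ALL_CLASSES = (1 << len(CLASSES)) - 1  # 0b11111

def product_label(*labels):
    """Label of the product operator: bitwise AND of the factors' labels."""
    return reduce(lambda a, b: a & b, labels, ALL_CLASSES)

def describe(label):
    """Names of the classes whose bit is set in `label`."""
    top = len(CLASSES) - 1
    return [name for i, name in enumerate(CLASSES) if (label >> (top - i)) & 1]

# Two hypothetical factor labels, not taken from the thesis tables.
first, second = 0b11011, 0b10111
combined = product_label(first, second)
print(format(combined, "05b"), describe(combined))
```

The sketch only combines labels. Whether a given label is actually attained by some operator is a separate question, settled by the examples and class relationships established earlier in the thesis.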
Various new topics for pursuit have been opened up by the work in Chapters 6 and 7. With the given framework, it would be a simple matter to extend the results for the decomposition of Borwein-Wiersma operators with starshaped domains even further, perhaps towards a complete characterization as obtained in Theorem 7.7.13. Especially important to pursue given this framework is a characterization of the existence of clean Borwein-Wiersma decompositions. For instance, does there exist a Borwein-Wiersma decomposable operator with a convex, or even starshaped, domain for which no clean decomposition exists? The extended and standardized decompositions analyzed here show further potential in this regard.

Furthermore, since M_T can be calculated as a Riemann integral with known error bounds (see Proposition 6.2.21), the next step would be to construct algorithms to directly calculate the Borwein-Wiersma decompositions of operators. As the linear skew term can be calculated directly (see Section 7.6), alternative approaches for low and high dimensions should be considered. Where this decomposition is useful, there may also be easy applications to improve algorithms that employ a line search (solving an equilibrium problem or variational problem, perhaps) by combining Definition 6.2.20 with Proposition 6.2.21.

Further investigation of W_T as a new type of saddle function representation of monotone operators may also be merited, especially since W_T yields the canonical form for subdifferentials and since it is itself in the form of an equilibrium problem (see Section 6.3.1). The methods developed here should also be applied to determine when a given operator may be Borwein-Wiersma decomposable, and to extend results such as Theorem 2 in [19], where under some strict conditions it was found that T is weakly decomposable (i.e., T = ∂f + A as before, but A can be nonlinear and skew) if and only if M_T(x, ·) is convex for x ∈ int dom T.
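As a hedged illustration of the Riemann-integration idea, the following sketch treats only the simplest case: a single-valued linear monotone operator T on R^2, so that 0 ∈ int dom T. It assumes a line-integral potential of the form f(x) = ∫_0^1 ⟨T(tx), x⟩ dt; this is an assumed form chosen for illustration, since the thesis's own definition of M_T (Chapter 6) is not reproduced in this section, and all function names are hypothetical. For linear T this integral equals ½⟨x, Tx⟩, so the gradient of f is the symmetric part of T and the remainder is the skew part.

```python
def matvec(T, x):
    """Matrix-vector product for a matrix given as nested lists."""
    return [sum(t_ij * x_j for t_ij, x_j in zip(row, x)) for row in T]

def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

def potential(T, x, n=1000):
    """Midpoint Riemann sum for f(x) = integral over [0, 1] of <T(t x), x> dt."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += dot(matvec(T, [t * xi for xi in x]), x)
    return total * h

def split(T):
    """Return the symmetric part P and skew part S of T, so that T = P + S."""
    n = len(T)
    P = [[(T[i][j] + T[j][i]) / 2 for j in range(n)] for i in range(n)]
    S = [[(T[i][j] - T[j][i]) / 2 for j in range(n)] for i in range(n)]
    return P, S

# Illustrative operator: its symmetric part diag(1, 2) is positive definite,
# so T is monotone.
T = [[1.0, -1.0], [1.0, 2.0]]
x = [3.0, -2.0]

P, S = split(T)
f_numeric = potential(T, x)            # Riemann-sum approximation of f(x)
f_exact = 0.5 * dot(x, matvec(P, x))   # closed form (1/2) <x, P x>
print(f_numeric, f_exact)
print(dot(matvec(S, x), x))            # skew part contributes nothing to <Tx, x>
```

For nonlinear or set-valued operators the integrand is no longer linear in t, so the Riemann sum is only approximate and the error bounds referred to above become relevant; the sketch does not attempt to cover that case.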
We have obtained certain certificates for when an operator T is not Borwein-Wiersma decomposable, such as in Corollary 7.7.3. However, as the example Ŝ in [19] demonstrates, a given acyclic operator may be a skew linear relation on a subset of its domain (in this case, Ŝ is linear on the unit ball). For this reason, further guarantees are required. Though the decomposition techniques above may appear to decompose an operator, without the guarantee of Borwein-Wiersma decomposability the assumed linearity condition of the acyclic term might be violated.

There may be applications of the techniques used here in determining when the sum of maximal monotone operators, in this case a subdifferential with a linear relation, is itself maximal monotone, using the linearity of M_T (where M_{T_1+T_2} = M_{T_1} + M_{T_2}) together with the graph inequalities throughout Chapter 7. First, however, the results presented here would need to be generalized to Banach spaces. Some care has been taken when developing this theory to avoid reliance on properties specific to Hilbert spaces, in the hope that most of these results can be generalized to Banach spaces.

Another goal would be to attempt to characterize acyclic operators, in the context of Borwein-Wiersma decompositions and of Asplund decompositions in general, perhaps by finding an efficient means of establishing when M_T(z, y) − M_T(z, x) from (7.8.34) is the zero operator, combined with determining the properties of Aff dom T.

Bibliography

[1] E. Asplund. A monotone convergence theorem for sequences of nonlinear mappings. In Nonlinear Functional Analysis, volume 18 of Proc. of Sympos. Pure Math., pages 1–9, Chicago, 1970.
[2] H. Attouch and A. Damlamian. On multivalued evolution equations in Hilbert spaces. Israel J. Math., 12(4):373–390, 1972.
[3] A. A. Auslender and M. Haddou. An interior-proximal method for convex linearly constrained problems and its extension to variational inequalities. Math. Program., 71:77–100, 1995.
[4] J.-B. Baillon and G. Haddad. Quelques propriétés des opérateurs angle-bornés et n-cycliquement monotones. Israel J. Math., 26:137–150, 1977.
[5] S. Bartz, H. H. Bauschke, J. M. Borwein, S. Reich, and X. Wang. Fitzpatrick functions, cyclic monotonicity and Rockafellar's antiderivative. Nonlinear Anal., 66:1198–1223, 2007.
[6] H. H. Bauschke, J. M. Borwein, and X. Wang. Fitzpatrick functions and continuous linear monotone operators. SIAM J. Optim., 18:789–809, 2007.
[7] H. H. Bauschke and P. L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer-Verlag, 2011.
[8] H. H. Bauschke, W. Wang, and L. Yao. An answer to S. Simons' question on the maximal monotonicity of the sum of a maximal monotone linear operator and a normal cone operator. Set-Valued Var. Anal., 17:195–201, 2009.
[9] H. H. Bauschke, W. Wang, and L. Yao. Monotone linear relations: maximality and Fitzpatrick functions. J. Convex Anal., 25:673–686, 2009.
[10] H. H. Bauschke and X. Wang. An explicit example of a maximal 3-cyclically monotone operator with bizarre properties. Nonlinear Anal., 69:2875–2891, 2008.
[11] H. H. Bauschke, X. Wang, and L. Yao. On Borwein-Wiersma decompositions of monotone linear relations. SIAM J. Optim., 20:2636–2652, 2010.
[12] H. H. Bauschke, X. Wang, and L. Yao. Rectangularity and paramonotonicity of maximally monotone operators. Optimization, in press.
[13] J. Y. Bello Cruz and A. N. Iusem. Convergence of direct methods for paramonotone variational inequalities. Comput. Optim. Appl., 46(2):247–263, 2010.
[14] E. Blum and W. Oettli. From optimization and variational inequalities to equilibrium problems. Math. Student, 63:123–145, 1994.
[15] J. M. Borwein. Maximal monotonicity via convex analysis. J. Convex Anal., 13(3):561–586, 2006.
[16] J. M. Borwein. Maximality of sums of two maximal monotone operators. Proc. Amer. Math. Soc., 134(10):2951–2955, 2006.
[17] J. M. Borwein and R. Goebel. Notions of relative interior in Banach space.
J. Math. Sci. (N.Y.), 115(4):2542–2553, 2003.
[18] J. M. Borwein and J. D. Vanderwerff. Convex functions: constructions, characterizations, and counterexamples, volume 109 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, USA, 2010.
[19] J. M. Borwein and H. Wiersma. Asplund decomposition of monotone operators. SIAM J. Optim., 18:946–960, 2007.
[20] J. M. Borwein and L. Yao. Some results on the convexity of the closure of the domain of a maximally monotone operator. Optim. Lett., in press.
[21] L. Bragard. Propriétés inductives et sous-ensembles maximaux. Bull. Soc. Roy. Sci. Liège, 38(1–2):8–13, 1969.
[22] A. Bressan and V. Staicu. On nonconvex perturbations of maximal monotone differential inclusions. Set-Valued Var. Anal., 2:415–437, 1994.
[23] H. Brézis and A. Haraux. Image d'une somme d'opérateurs monotones et applications. Israel J. Math., 23(2):165–186, 1976.
[24] F. E. Browder. Problèmes nonlinéaires. Number 15 in Séminaire de Mathématiques Supérieures. Les Presses de l'Université de Montréal, 1966.
[25] R. E. Bruck Jr. An iterative solution of a variational inequality for certain monotone operators in Hilbert space. Bull. Amer. Math. Soc., 81(5), 1975.
[26] R. S. Burachik and J. Dutta. Inexact proximal point methods for variational inequality problems. SIAM J. Optim., 20(5):2653–2678, 2010.
[27] R. S. Burachik and A. N. Iusem. An iterative solution of a variational inequality for certain monotone operators in a Hilbert space. SIAM J. Optim., 8:197–216, 1998.
[28] R. S. Burachik, J. O. Lopes, and B. F. Svaiter. An outer approximation method for the variational inequality problem. SIAM J. Control Optim., 43:2071–2088, 2005.
[29] Y. Censor, A. N. Iusem, and S. A. Zenios. An interior point method with Bregman functions for the variational inequality problem with paramonotone operators. Math. Program., 81(3), 1998.
[30] S. Chang, H. W. J. Lee, and C. K. Chan.
A new method for solving equilibrium problem fixed point problem and variational inequality problem with application to optimization. Nonlinear Anal., in press.
[31] L.-J. Chu. On the sum of monotone operators. Michigan Math. J., 43:273–289, 1996.
[32] P. L. Combettes and S. A. Hirstoaga. Equilibrium programming in Hilbert spaces. J. Nonlinear Convex Anal., 6:117–136, 2005.
[33] R. Cross. Monotone Linear Relations. M. Dekker, New York, 1998.
[34] P. J. Da Silva, J. Eckstein, and C. Humes. Rescaling and stepsize selection in proximal methods using separable generalized distances. SIAM J. Optim., 12:238–261, 2001.
[35] J. C. De Los Reyes. Optimal control of a class of variational inequalities of the second kind. SIAM J. Control Optim., 49:1629–1658, 2011.
[36] J. Eckstein and D. P. Bertsekas. On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program., 55:293–318, 1992.
[37] M. Edwards. Five classes of monotone linear relations and operators. In Computational and Analytical Mathematics, Springer Proceedings in Mathematics and Statistics, to appear.
[38] F. Facchinei and J. S. Pang. Finite-dimensional variational inequalities and complementarity problems. Springer, 2003.
[39] M. C. Ferris and J. S. Pang. Engineering and economic applications of complementarity problems. SIAM Rev., 39(4):669–713, December 1997.
[40] N. Ghoussoub. Selfdual partial differential systems and their variational principles, volume 14 of Springer Monogr. Math. Springer, 2010.
[41] M. S. Gowda and M. Teboulle. A comparison of constraint qualifications in infinite-dimensional convex programming. SIAM J. Control Optim., 28(4):925–935, 1990.
[42] N. Hadjisavvas and S. Schaible. On a generalization of paramonotone maps and its application to solving the Stampacchia variational inequality. Optimization, 55(5-6):593–604, October-December 2006.
[43] W. R. Hare Jr. and J. W. Kenelly. Intersections of maximal starshaped sets. Proc. Amer.
Math. Soc., 19(6):1299–1302, 1968.
[44] P. Hartmann and G. Stampacchia. On some non-linear elliptic differential-functional equations. Acta Math., 115(1):271–310, 1966.
[45] P. Hess. On nonlinear equations of Hammerstein type in Banach spaces. Proc. Amer. Math. Soc., 30(2):308–312, 1971.
[46] H. Iiduka and W. Takahashi. Strong convergence theorems for nonexpansive mappings and inverse-strongly monotone mappings. Nonlinear Anal., 61(3):341–350, 2005.
[47] H. Iiduka and W. Takahashi. Strong convergence studied by a hybrid type method for monotone operators in a Banach space. Nonlinear Anal., 68(12):3679–3688, 2008.
[48] A. N. Iusem. On some properties of paramonotone operators. J. Convex Anal., 5(2):269–278, 1998.
[49] J. Jahn. Vector Optimization: Theory, Applications, and Extensions. Springer, 2004.
[50] M. D. Kirszbraun. Über die zusammenziehenden und Lipschitzschen transformationen. Fund. Math., 22:77–108, 1934.
[51] E. Krauss. A representation of arbitrary maximal monotone operators via subgradients of skew-symmetric saddle functions. Nonlinear Anal., 9(12):1381–1399, 1985.
[52] P. L. Lions and B. Mercier. Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal., 16(6):964–979, 1979.
[53] F. Mignot. Contrôle dans les inéquations variationelles elliptiques. J. Funct. Anal., 22:130–185, June 1976.
[54] G. J. Minty. On the maximal domain of a monotone function. Michigan Math. J., 8:135–137, 1961.
[55] G. J. Minty. Monotone (nonlinear) operators in a Hilbert space. Duke Math. J., 29:341–346, 1962.
[56] F. D. Murnaghan and A. Wintner. A canonical form for real matrices under orthogonal transformations. Proc. Nat. Acad. Sci. USA, 17(7):417–420, 1931.
[57] N. Nadezhkina and W. Takahashi. Strong convergence theorem by a hybrid method for nonexpansive mappings and Lipschitz-continuous monotone mappings. SIAM J. Optim., 16(4):1230–1241, 2006.
[58] N. Nadezhkina and W. Takahashi.
Weak convergence theorem by an extragradient method for nonexpansive mappings and monotone mappings. J. Optim. Theory Appl., 128(1):191–201, 2006.
[59] N. S. Papageorgiou and N. Shahzad. On maximal monotone differential inclusions in R^N. Acta Math. Hungar., 78(3):175–197, 1998.
[60] T. Pennanen. Dualization of monotone generalized equations. PhD thesis, University of Washington, Seattle, Washington, USA, 1999.
[61] R. R. Phelps and S. Simons. Unbounded linear monotone operators in nonreflexive Banach spaces. J. Convex Anal., 5(2):303–328, 1998.
[62] R. T. Rockafellar. Characterization of the subdifferentials of convex functions. Pacific J. Math., 17(3):497–510, 1966.
[63] R. T. Rockafellar. Local boundedness of nonlinear, monotone operators. Michigan Math. J., 16(4):397–407, 1969.
[64] R. T. Rockafellar. Convex Analysis. Princeton University Press, Princeton, New Jersey, USA, 1970.
[65] R. T. Rockafellar. On the maximal monotonicity of subdifferential mappings. Pacific J. Math., 33:209–216, 1970.
[66] R. T. Rockafellar. On the maximality of sums of nonlinear monotone operators. Trans. Amer. Math. Soc., 149:75–88, 1970.
[67] R. T. Rockafellar. On the virtual convexity of the domain and range of a nonlinear maximal monotone operator. Math. Ann., 185(2):81–90, 1970.
[68] R. T. Rockafellar. Monotone operators and the proximal point algorithm. SIAM J. Control Optim., 14:877–898, 1976.
[69] R. T. Rockafellar and R. J.-B. Wets. Variational Analysis, volume 317 of Grundlehren der mathematischen Wissenschaften. Springer, 2nd edition, 2004.
[70] S. Simons. LC functions and maximal monotonicity. J. Nonlinear Convex Anal., 7:123–138, 2006.
[71] S. Simons. From Hahn-Banach to Monotonicity, volume 1693 of Lecture Notes in Mathematics. Springer-Verlag, 2nd edition, 2008.
[72] W. Takahashi and M. Toyoda. Weak convergence theorem for nonexpansive mappings and monotone mappings. J. Optim. Theory Appl., 118(2):341–350, 2005.
[73] F. A. Toranzos.
Radial functions of convex and star-shaped bodies. Amer. Math. Monthly, 74(3):278–280, 1967.
[74] P. Tseng. Applications of splitting algorithm to decomposition in convex programming and variational inequalities. SIAM J. Control Optim., 29:119–138, 1991.
[75] F. A. Valentine. A Lipschitz preserving extension for a vector function. Amer. J. Math., 67:83–93, 1945.
[76] A. Yagi. Generation theorem of semigroup for multivalued linear operators. Osaka J. Math., 28:385–410, 1991.
[77] I. Yamada. The hybrid steepest descent method for the variational inequality problem over the intersection of fixed point sets of nonexpansive mappings. In D. Butnariu, Y. Censor, and S. Reich, editors, Inherently parallel algorithms in feasibility and optimization and their applications, volume 9 of Stud. Comput. Math., pages 473–504. Elsevier, 2001.
[78] I. Yamada and N. Ogura. Hybrid steepest descent method for variational inequality problems over the fixed point set of certain quasi-nonexpansive mappings. Numer. Funct. Anal. Optim., 25:619–655, 2004.
[79] I. Yamada, N. Ogura, and N. Shirakawa. A numerical robust hybrid steepest descent method for the convexly constrained generalized inverse problem. Contemp. Math., 313:269–305, 2002.
[80] L. Yao. On Monotone Linear Relations and the Sum Problem in Banach Spaces. PhD thesis, University of British Columbia, Okanagan, British Columbia, Canada, 2012.
[81] C. Zălinescu. Hahn-Banach extension theorems for multifunctions revisited. Math. Methods Oper. Res., 68(3):493–508, 2008.
[82] E. Zeidler. II/B - Nonlinear Monotone Operators. Nonlinear Functional Analysis and its Applications. Springer-Verlag, 1990.
