Convex Hulls and Graph-Matrix Calculus: Algorithms in Computational Convex Analysis

by

Bryan Nicholas Carl Gardiner

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF BACHELOR OF SCIENCE HONOURS in The Irving K. Barber School of Arts and Sciences (Computer Science)

THE UNIVERSITY OF BRITISH COLUMBIA (Okanagan)

April 2010

© Bryan Nicholas Carl Gardiner 2010

Abstract

At the core of Convex Analysis and its applications are a collection of frequently used operators for transforming convex functions, along with the convex hull operation for convexifying functions. While numerical algorithms are usually applied to general functions with little known structure, we focus on the class of univariate piecewise linear-quadratic (PLQ) functions, for which exact algorithms have been developed.

This thesis presents two main results. In the first part, we investigate two convex hull algorithms for univariate PLQ functions. The first algorithm is an extension of the linear-time planar Beneath-Beyond algorithm, and performs a plane sweep that converts a function into its convex hull. The second uses convex duality theory to deconstruct a nonconvex function and build its convex hull using convex operators, in worst-case quadratic time. The trade-off is complexity: the second algorithm can be significantly simpler to implement.

The second part is concerned with the computation of convex transforms, such as the Legendre-Fenchel transform and the Moreau envelope. We introduce a new family of algorithms that stores and manipulates models of the subdifferential instead of the original function.

We performed numerical experiments comparing the two convex hull algorithms, and comparing the new subdifferential algorithms to existing PLQ algorithms. These experiments confirm the time complexity of the convex hull algorithms, and show that the subdifferential algorithms have the same complexity as the PLQ algorithms, while performing an order of magnitude faster.

Table of Contents

Abstract
. . . . . . . . . . . . . . . . . . . . . . . . . . ii
Table of Contents . . . . . . . . . . . . . . . . . . iii
Acknowledgements . . . . . . . . . . . . . . . . . . v
1 Introduction . . . . . . . . . . . . . . . . . . . 1
2 Preliminaries . . . . . . . . . . . . . . . . . . . 2
  2.1 Convex Analysis . . . . . . . . . . . . . . . . 2
    2.1.1 Convexity . . . . . . . . . . . . . . . . . 2
    2.1.2 Affine and Convex Hulls . . . . . . . . . . 3
    2.1.3 Set Interiors . . . . . . . . . . . . . . . 4
    2.1.4 Subdifferential . . . . . . . . . . . . . . 4
    2.1.5 Convex Conjugate . . . . . . . . . . . . . 4
    2.1.6 Epi-addition and Epi-multiplication . . . . 5
    2.1.7 Moreau Envelope . . . . . . . . . . . . . . 5
    2.1.8 Proximal Average . . . . . . . . . . . . . 6
    2.1.9 Self-Dual Smoothing Operators . . . . . . . 7
  2.2 Piecewise Linear-Quadratic Functions . . . . . 7
3 Convex Hull Algorithms . . . . . . . . . . . . . . 9
  3.1 Direct Approach . . . . . . . . . . . . . . . . 9
  3.2 Convex Duality Approach . . . . . . . . . . . . 18
  3.3 Performance Comparison . . . . . . . . . . . . 19
4 Graph-Matrix Calculus . . . . . . . . . . . . . . . 22
  4.1 Representation of PLQ Functions . . . . . . . . 22
  4.2 Goebel's Graph-Matrix Calculus . . . . . . . . 25
  4.3 Transforms of PLQ Functions . . . . . . . . . . 28
  4.4 Comparison of Algorithms . . . . . . . . . . . 31
5 Conclusion . . . . . . . . . . . . . . . . . . . . 35
Bibliography . . . . . . . . . . . . . . . . . . . . 37

Acknowledgements

I wish to thank Dr.
Yves Lucet most sincerely for providing me with the opportunity to do an Honours thesis under him, and the opportunity to pursue research in the area of Convex Analysis. His patient and caring mentorship has been essential, and he has given me much inspiration and confidence.

Many other people have provided support, encouragement, and knowledge throughout my thesis work. I would like to acknowledge Dr. Heinz Bauschke and Dr. Warren Hare for their teaching, and for their everlasting enthusiasm for Mathematics.

I would also like to thank Mike Trienis for his initial implementation of the direct PLQ convex hull algorithm.

Chapter 1

Introduction

Convex Analysis is a rich field with applications in diverse areas. For example, it provides algorithms to compute distance transforms [8] and reverse degradation [5] in image and signal processing, and to improve models in machine learning [4, 23].

There are numerous operators in Convex Analysis that transform one convex function into another. Common examples are the ubiquitous Legendre-Fenchel transform, the Moreau-Yoshida approximation, and the recently introduced proximal average [3]. It is natural to want fast algorithms to perform these operations. Since generic numerical algorithms do not generate an exact model of the transformed function, research has focused on special classes of functions [9, 16, 17]. Restricting oneself to a well-behaved class of functions can allow exact algorithms to be developed. Piecewise linear-quadratic (PLQ) functions are such a class. They are closed under many convex transforms, including those listed above.

Existing PLQ algorithms directly manipulate a model of a function [2]. Because the subdifferential of a PLQ function is a piecewise linear multivalued mapping, the subdifferentials of many transforms are related linearly to the original subdifferentials. Goebel [12] published the exact relationship for a number of operators.
We extend his results to additional operators, and provide a method to recover the function values.

First, we will look at algorithms for computing the convex hulls of PLQ functions. We will recall the extension by Trienis [22] of the Beneath-Beyond algorithm for points in a plane to PLQ functions with unbounded domains. We will then introduce a new algorithm for computing the convex hull of a PLQ function that uses only the convex conjugate and the pointwise maximum operators.

Chapter 2 sets notation and recalls the necessary theory. Chapter 3 discusses the two convex hull algorithms for PLQ functions. After bringing ourselves into the domain of convex functions, in Chapter 4 we apply Goebel's calculus rules to the computation of a number of convex operators, and compare these techniques to existing PLQ algorithms. Chapter 5 concludes the thesis with a summary of results and directions for future research.

Chapter 2

Preliminaries

In this chapter, we outline notation and recall results we will use in the rest of the thesis.

We denote the standard inner product on R^d by

    ⟨x, y⟩ = xᵀy = Σ_{i=1}^d x_i y_i,

and the quadratic energy function by

    q : R^d → R : x ↦ ½‖x‖².

We write the open ball centered at x with radius ε as

    B(x, ε) = {y ∈ R^d : ‖y − x‖ < ε}.

2.1 Convex Analysis

2.1.1 Convexity

We use standard definitions for the convexity of sets and functions.

Definition 2.1.1. A set C is convex if, and only if,

    ∀ x₁, x₂ ∈ C, ∀ λ ∈ [0, 1]: (1 − λ)x₁ + λx₂ ∈ C.

Another useful notion is that of the epigraph of a function.

Definition 2.1.2. Given a function f : R^d → R, its epigraph epi(f) is the set of points in R^{d+1} that are on or above the graph of f:

    epi(f) = {(x, α) : f(x) ≤ α}.

A function is defined to be convex iff its epigraph is convex. (Likewise, a function f is concave iff −f is convex, i.e. epi(−f) is a convex set.)

We will write R̄ to mean the set R ∪ {+∞} of extended reals.
When working with a function f with bounded domain dom(f) ⊆ R^d, it will sometimes be more convenient to instead work with the extended-value function f̃, where

    f̃(x) = f(x)  if x ∈ dom(f);  +∞  if x ∉ dom(f).

We denote the indicator function of a set S by ι_S. This function is defined to be zero for all elements in S, and +∞ for elements outside of S:

    ι_S(x) = 0  if x ∈ S;  +∞  if x ∉ S.

2.1.2 Affine and Convex Hulls

We recall the definitions of affine and convex hulls for sets in R^d, and of convex hulls of functions.

Fact 2.1.3. The affine hull of a set S is the set of all affine combinations of finite numbers of points in S; that is,

    aff(S) = { Σ_{i=1}^n λ_i x_i : x_i ∈ S for i ∈ {1, …, n}, Σ_{i=1}^n λ_i = 1, n ≥ 1 }.

A set is affine if it is equal to its affine hull.

Fact 2.1.4. The convex hull of a set S is the smallest convex set that contains S. It is also the intersection of all convex sets containing S. It can be written as

    conv(S) = { Σ_{i=1}^n λ_i x_i : x_i ∈ S, λ_i ≥ 0 for i ∈ {1, …, n}, Σ_{i=1}^n λ_i = 1, n ≥ 1 }.

The convex hull is, by definition, a convex set.

The closed convex hull of a function f is defined to be the largest lower semi-continuous, convex function that underestimates f. It is written co(f). It can be defined via the epigraph:

    (co f)(x) = inf{ t : (x, t) ∈ conv(epi f) }.

When we refer to the convex hull of a function, we mean its closed convex hull.

2.1.3 Set Interiors

Definition 2.1.5. The interior of a set S, int(S), is defined as

    int(S) = { x ∈ S : B(x, ε) ⊆ S for some ε > 0 }.

Definition 2.1.6. The relative interior of S, ri(S), is the interior of S when considered as a subset of its affine hull, that is,

    ri(S) = { x ∈ S : B(x, ε) ∩ aff(S) ⊆ S for some ε > 0 }.

The relative interior will be useful when discussing function domains.

2.1.4 Subdifferential

The subdifferential is a generalization of the derivative to convex but nondifferentiable functions.

Definition 2.1.7. We define the subdifferential of a convex function f : R^d →
R̄ at a point x to be the set of all subgradients s of f at x,

    ∂f(x) = { s ∈ R^d : ∀ y ∈ R^d, f(y) ≥ f(x) + ⟨s, y − x⟩ }.

The subdifferential is a multivalued map from R^d to R^d. We also define the graph of the subdifferential, gph ∂f, as a subset of R^d × R^d by

    (x, s) ∈ gph ∂f  ⇔  s ∈ ∂f(x).

2.1.5 Convex Conjugate

The convex conjugate, also known as the Legendre-Fenchel transform, or just the conjugate, is a fundamental transform in convex analysis. It can be used to construct many other transforms, such as the Moreau envelope and the proximal average.

Definition 2.1.8. For a proper function f : R^d → R̄, its conjugate is the function f* : R^d → R̄ defined by

    f*(y) = sup_{x ∈ dom(f)} { ⟨y, x⟩ − f(x) }.

Using extended-value functions, we can write the conjugate without constraints as f*(y) = sup_{x ∈ R^d} { ⟨y, x⟩ − f(x) }. As f* is the supremum of convex functions, it is convex.

Fact 2.1.9. [20, Theorem 12.2] If f is convex, then the conjugate f* is always convex and lower semi-continuous. Furthermore, f** is the closed convex hull of f, and f* is proper iff f is proper.

We highlight several cases where the conjugate is easy to compute.

Fact 2.1.10. [20, p. 106] The only function on R^d that is self-conjugate, i.e. f* = f, is q(x) = ½‖x‖².

Example 2.1.11. If f(x) = ax² + bx + c for a > 0 and b, c ∈ R, then f*(y) = (1/(4a))(y − b)² − c, and f** = f.

Example 2.1.12. If f(x) = bx + c for b, c ∈ R, then f*(y) = ι_{b}(y) − c, and f** = f.

Example 2.1.13. If f(x) = (1/p)‖x‖^p with x ∈ R^d and 1 < p < +∞, then f*(y) = (1/q)‖y‖^q, where 1/p + 1/q = 1.

2.1.6 Epi-addition and Epi-multiplication

We make use of two further operators, known as epi-addition (or inf-convolution) and epi-multiplication. We denote these, respectively, by

    (f □ g)(x) = inf_{y ∈ R^d} { f(y) + g(x − y) },

    (λ ⋆ f)(x) = λ f(x/λ)  if λ > 0;  ι_{0}(x)  if λ = 0.

2.1.7 Moreau Envelope

The Moreau envelope, also known as the Moreau-Yoshida approximate, is the inf-convolution of a function with (1/(2λ))‖·‖². That is,

    e_λ f(x) = (f □ (λ ⋆ q))(x) = inf_{y ∈ R^d} { f(y) + (1/(2λ))‖y − x‖² },

for λ > 0.
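To make this definition concrete: the Moreau envelope of f(x) = |x| has the well-known closed form of the Huber function, e_λ|·|(x) = x²/(2λ) for |x| ≤ λ and |x| − λ/2 otherwise. A brute-force discretization of the infimum reproduces it (an illustrative Python sketch, not part of the algorithms developed in this thesis):

```python
import numpy as np

lam = 0.5  # the smoothing parameter λ > 0

def moreau_numeric(x, ys):
    # Brute-force infimum of |y| + (1/(2λ))·(y − x)² over a fine grid.
    return float(np.min(np.abs(ys) + (ys - x) ** 2 / (2.0 * lam)))

def huber(x):
    # Known closed form of e_λ|·| (the Huber function).
    return x * x / (2.0 * lam) if abs(x) <= lam else abs(x) - lam / 2.0

ys = np.linspace(-10.0, 10.0, 400001)
for x in (-2.0, -0.25, 0.0, 0.3, 5.0):
    assert abs(moreau_numeric(x, ys) - huber(x)) < 1e-6
```

The minimizer of the inner problem is the proximal mapping discussed below; for f = |·| it is the soft-thresholding operator, which is why the envelope is quadratic near the kink and affine far from it.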
The Moreau envelope is an underestimator of f, which converges pointwise to f as λ ↘ 0 [21, Theorem 1.25]. By expanding the norm-squared, we can write the Moreau envelope in terms of the conjugate as

    e_λ f(x) = ‖x‖²/(2λ) − (1/λ) g_λ*(x),

where g_λ(x) = (λf + ½‖·‖²)(x). When f is convex, the Moreau envelope can be written using the conjugate,

    e_λ f = (f* + λ ⋆ q)*.

The proximal mapping is the set of minimizers of the Moreau envelope. It is defined as

    Prox_λ f(x) = argmin_{y ∈ R^d} { f(y) + (1/(2λ))‖y − x‖² }.

2.1.8 Proximal Average

The proximal average is an operator for continuously transforming one function into another, and unlike the arithmetic average, it produces a proper function even if the domains of two proper input functions f and g are disjoint. The proximal average is the function whose proximal mapping is the convex combination of the proximal mappings of the input functions [3, Remark 6.2].

Definition 2.1.14. Given two proper functions f and g, and a parameter λ ∈ [0, 1], the proximal average P_λ(f, g) : R^d → R̄ is defined as

    P_λ(f, g) = ( (1 − λ)(f + (1/μ) ⋆ q)* + λ(g + (1/μ) ⋆ q)* )* − (1/μ) ⋆ q

for a smoothing parameter μ > 0.

The proximal average is self-dual, that is,

    (P_λ(f, g))* = P_λ(f*, g*).

If f and g are convex, then P_λ(f, g) is also convex. Convexity with respect to other parameters has been studied in [15]. Recent extensions to the proximal average include the generalized kernel average of n functions in [19], as well as the nonconvex proximal average in [13].

The proximal average can be written as a minimization problem using [1, Formula 20],

    P_λ(f, g)(x) = inf_{x = (1−λ)x₁ + λx₂} { (1 − λ)f(x₁) + λg(x₂) + (λ(1 − λ)/(2μ))‖x₁ − x₂‖² }.

We call the proximal average exact at x if the infimum is attained for some x₁, x₂ ∈ R^d. The proximal average can also be written using Moreau envelopes as in [1, Theorem 8.3],

    P_λ(f, g)(x) = −e_μ( −((1 − λ) e_μ f + λ e_μ g) )(x).

2.1.9 Self-Dual Smoothing Operators

We will present results on two recently published operators that are self-dual, i.e. they satisfy T(f*)
= (Tf)*.

The first is a smoothing operator defined by Goebel in [12] as

    s_λ f = (1 − λ²) e_λ f + λ q.

The second operator was published in [19] and is defined as

    T_λ f = P_λ(f, q).

In both cases, λ ∈ (0, 1).

2.2 Piecewise Linear-Quadratic Functions

Piecewise linear-quadratic (PLQ) functions are piecewise functions f : R^d → R̄, where dom(f) is convex and partitioned into polyhedral pieces, and each piece is assigned a linear or quadratic function. We will additionally assume throughout this thesis that f is continuous on the interior of its domain, and differentiable on each of its pieces, though f may fail to be differentiable at the boundary of each piece.

The class of PLQ functions is well studied [2, 21] because it is closed under many of the standard operations in convex analysis, such as addition, scalar multiplication, conjugation, and the Moreau envelope. When d = 1, the class is also closed under the pointwise minimum and maximum operations.

In this thesis, we are specifically concerned with univariate PLQ functions f : R → R̄. We divide the real line into n pieces, with the ith piece being defined by f_i(x) = a_i x² + b_i x + c_i over the interval [x_{i−1}, x_i], where −∞ = x₀ < x₁ < x₂ < … < x_{n−1} < x_n = +∞. Given such a function, we represent it as the n × 4 matrix

    [ x₁  a₁ b₁ c₁
      x₂  a₂ b₂ c₂
      …
      +∞  a_n b_n c_n ].

Example 2.2.1. The univariate quadratic energy function q(x) = ½x² is represented by the matrix

    [ +∞  ½ 0 0 ].

Example 2.2.2. The function f(x) = ||x − 1| − 1| is represented by the matrix

    [ 0  0 −1  0
      1  0  1  0
      2  0 −1  2
      +∞ 0  1 −2 ].

As a special case, we support representation of the indicator function of a single point, f(x) = ι_{x̄}(x) + c, where x̄ ∈ R, with the matrix

    [ x̄  0 0 c ].

Any references to PLQ functions from this point forward will refer to univariate PLQ functions unless stated otherwise.

Chapter 3

Convex Hull Algorithms

Convex hulls form a bridge between the fields of Computational Geometry and Convex Analysis.
The former typically focuses on computing the convex hull of a set of points, lines, curves, or other objects in Euclidean space; the latter is concerned with the convex hulls of functions. Fortunately, there is a direct link: representing a function's curve can enable us to use an applicable algorithm from Computational Geometry.

The Beneath-Beyond algorithm is an incremental algorithm for computing the convex hull of a set of points in Euclidean space [7]. In the plane, this algorithm requires O(n) time, given sorted input (this is not an issue, as the first column of the matrix of a PLQ function is always sorted). The convex hull of a piecewise linear function could be calculated by passing to the Beneath-Beyond algorithm the set of points where the function is nondifferentiable. Quadratic pieces require special attention, but do not require asymptotically more time. The initial implementation of this algorithm was provided in [22], and it is studied further in [10].

After describing this algorithm, we present a new method for computing convex hulls [10] by decomposing a nonconvex PLQ function into individual convex pieces, conjugating, then applying the maximum operation to those pieces to build the convex hull. As this algorithm stores a model that may grow at each iteration, it takes O(n²) time in the worst case.

3.1 Direct Approach

The planar Beneath-Beyond algorithm performs a plane sweep and incrementally adds points to, and removes points from, the convex hull. We apply the same technique to a PLQ function f, performing a plane sweep along the x-axis to find the convex hull of the epigraph. At any point in the algorithm, the function to the left of our current position is convex.

Initially, there are a number of cases we must consider, which would cause co f to be −∞ everywhere.

Proposition 3.1.1. [10, Proposition 4.2] Let f be a PLQ function. Then
co f ≡ −∞ if and only if any of the following conditions hold:

(i) there is an x with f(x) = −∞;
(ii) a₁ < 0;
(iii) a_n < 0;
(iv) a₁ = a_n = 0, and b₁ > b_n.

The direct algorithm begins with the leftmost finite piece of f, and progresses rightward. At any point in the algorithm, the function to the left of the current index is convex. In each iteration, two adjacent pieces of f are considered. If those two pieces, treated as a single function, are convex, then we move right; otherwise we convexify the two pieces, backtracking if necessary. A sample run of the algorithm is shown in Figure 3.1. Pseudocode is presented in Algorithm 1, which we call plq_coDirect.

Notation 3.1.2. We write (f_i, f_{i+1}) to denote the ith and (i+1)th pieces of f considered together, i.e. f restricted to [x_{i−1}, x_{i+1}].

Before we begin the plane sweep, we simplify f by replacing any nonconvex pieces of f with their convex hulls. This simplifies the cases we must consider to convexify each pair of pieces. A single piece f_i is only nonconvex on its domain if a_i < 0, in which case we can replace the piece with the line segment between its two endpoints.

We also need a test to know when two adjacent pieces are convex together. The following proposition treats this case.

Proposition 3.1.3. [22, Theorem 3.34] Let f be a continuous PLQ function such that co f ≢ −∞. Then f is nonconvex iff there are two adjacent pieces f_i (on [x_{i−1}, x_i]) and f_{i+1} (on [x_i, x_{i+1}]) such that f_i'(x_i) > f_{i+1}'(x_i).

In order to construct the convex hull of two adjacent PLQ functions on line 21 of Algorithm 1, three cases must be considered. If both pieces f_i, f_{i+1} are quadratic (that is, a_i, a_{i+1} > 0), then the following system must be solved for x_α, x_β to find a tangent line joining the pieces:

    f_i'(x_α) = f_{i+1}'(x_β)                                                  (3.1)
    f_i'(x_α)(x − x_α) + f_i(x_α) = f_{i+1}'(x_β)(x − x_β) + f_{i+1}(x_β)      (3.2)
    x_α ∈ [x_{i−1}, x_i]                                                       (3.3)
    x_β ∈ [x_i, x_{i+1}]                                                       (3.4)

Figure 3.1(a): Initial, nonconvex PLQ function.
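The quadratic-quadratic system (3.1)–(3.2) can be reduced to a single quadratic equation in the tangent slope s: a line of slope s is tangent to a·x² + b·x + c (with a > 0) exactly when its intercept is c − (s − b)²/(4a), so equating the intercepts of the two pieces determines s. The sketch below is illustrative Python (the helper name is mine, and the domain checks (3.3)–(3.4) are omitted), not the thesis's Scilab implementation:

```python
import math

def common_tangent(q1, q2):
    # q = (a, b, c) encodes the convex quadratic a*x**2 + b*x + c with a > 0.
    a1, b1, c1 = q1
    a2, b2, c2 = q2
    # Equating the two tangent intercepts gives A*s**2 + B*s + C = 0 in the slope s.
    A = 1.0 / (4.0 * a1) - 1.0 / (4.0 * a2)
    B = -b1 / (2.0 * a1) + b2 / (2.0 * a2)
    C = b1**2 / (4.0 * a1) - b2**2 / (4.0 * a2) - c1 + c2
    if abs(A) < 1e-14:                      # equal curvatures: linear in s
        roots = [-C / B]
    else:
        disc = math.sqrt(B * B - 4.0 * A * C)
        roots = [(-B - disc) / (2.0 * A), (-B + disc) / (2.0 * A)]
    sols = []
    for s in roots:
        t = c1 - (s - b1) ** 2 / (4.0 * a1)  # shared intercept of the tangent line
        x_alpha = (s - b1) / (2.0 * a1)      # tangency point on q1
        x_beta = (s - b2) / (2.0 * a2)       # tangency point on q2
        sols.append((s, t, x_alpha, x_beta))
    return sols

# Two translated copies of x**2: the unique common tangent is the line y = 0.
(s, t, xa, xb), = common_tangent((1.0, 0.0, 0.0), (1.0, -8.0, 16.0))
assert abs(s) < 1e-9 and abs(t) < 1e-9 and abs(xa) < 1e-9 and abs(xb - 4.0) < 1e-9
```

Of the (up to two) candidate slopes returned, the algorithm would keep the one whose tangency points satisfy the domain constraints (3.3)–(3.4).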
The first two pieces are nonconvex; convexify and move forward.
Figure 3.1(b): Two linear pieces are nonconvex; collapse to one piece and backtrack.
Figure 3.1(c): Quadratic-linear nonconvex case; convexify. The slope at 5 is unchanged, so advance.
Figure 3.1(d): Linear-quadratic case; we need to backtrack again.
Figure 3.1(e): Several more iterations lead to the convex hull.
Figure 3.1: Demonstration of the steps of the plq_coDirect algorithm.

Algorithm 1: plq_coDirect
Input: f (a PLQ function)
Output: co f (a convex PLQ function)
 1  begin
 2      if any conditions in Proposition 3.1.1 hold then
 3          co f ← −∞;
 4          return
 5      end
 6      for i ← 1 to n do
 7          if a_i < 0 then
 8              Replace the ith piece of f with the linear function between (x_{i−1}, f(x_{i−1})) and (x_i, f(x_i));
 9          end
10      end
11      // Plane sweep.
12      i₀ ← i ← the index of the first finite piece of f;
13      i_f ← the index of the last finite piece of f;
14      // Let N denote the number of finite pieces of f.
15      while (i ≤ i_f − 1) and (N ≥ 2) do
16          if f_i'(x_i) ≤ f_{i+1}'(x_i) then
17              // (f_i, f_{i+1}) is convex.
18              i ← i + 1;  // Add f_i to the convex hull and advance.
19          else
20              // Convexify f_i and f_{i+1}.
21              Replace (f_i, f_{i+1}) with co((f_i, f_{i+1})) on [x_{i−1}, x_{i+1}] in f;
22              Update i_f according to inserted/removed pieces;
23              if f_{i−1}'(x_{i−1}) > f_i'(x_{i−1}) then
24                  // (f_{i−1}, f_i) has lost convexity.
25                  i ← max(i − 1, 1);
26              else
27                  i ← the index of the rightmost piece in [x_{i−1}, x_{i+1}];
28                  // f is now convex on [x_{i₀}, x_i].
29              end
30          end
31      end
32      co f ← f;
33  end

In this case, an additional linear piece will be inserted into f on [x_α, x_β], possibly replacing one or both quadratic pieces when x_α = x_{i−1} or x_β = x_{i+1}. A quadratic-quadratic convex hull is demonstrated in Figure 3.2.

If f_i is quadratic and f_{i+1} is linear, then we find a tangent line connecting f_i at x_α to the right endpoint of f_{i+1}. We set f to be this tangent line on [x_α, x_{i+1}].
Note that x_α may be x_{i−1}, in which case the quadratic piece will be removed. Symmetry handles the case where f_i is linear and f_{i+1} is quadratic; see Figure 3.3. The system we now solve is

    f_i'(x_α)(x_{i+1} − x_α) + f_i(x_α) = f_{i+1}(x_{i+1}),   (3.5)
    x_α ∈ [x_{i−1}, x_i].

Care must be given to the linear-quadratic case. After solving a linear-quadratic system and convexifying (f_i, f_{i+1}), if we have lost convexity at x_{i−1}, we first check to see if f_{i−1} is quadratic. If it is, then we may enter a cycle of alternately convexifying (f_{i−1}, f_i) and (f_i, f_{i+1}), converging to the convex hull of f on [x_{i−1}, x_{i+1}]. As the linear function f_i is repeatedly shifted downward, we treat this as a special case and remove f_i from the model, solving the quadratic-quadratic system (Equations 3.1–3.4) to find the tangent line for f_{i−1} and f_{i+1}.

The simplest case to handle is when both f_i and f_{i+1} are linear functions. We are convexifying because (f_i, f_{i+1}) is nonconvex, implying that b_i > b_{i+1}. Here we replace both linear pieces with a single linear function connecting the points (x_{i−1}, f_i(x_{i−1})) and (x_{i+1}, f_{i+1}(x_{i+1})). Figure 3.4 provides an example of this case.

Special attention may be necessary in implementing the last two cases, when a linear piece extends to infinity (e.g. x_{i+1} = +∞ in (3.5)). These cases are handled by finding a line whose slope is that of the unbounded piece. Examples of these cases are shown in Figures 3.3(b), 3.4(b), and 3.4(c).

After inserting co((f_i, f_{i+1})) into f, we would like to increment the index i and consider the next two pieces in the function. By Proposition 3.1.3, f is convex iff its slope increases at each of the breaks in its domain. The function (f_i, f_{i+1}) is now convex; however, if the slope of f approaching x_i from the right has decreased, then (f_{i−1}, f_i) may fail to be convex. Hence the algorithm must backtrack and reconsider (f_{i−1}, f_i).

Lemma 3.1.4. With consecutive backtracks, every backtrack after the first reduces the two pieces currently being considered to a single linear piece.

Proof.
After convexifying (f_i, f_{i+1}), Algorithm 1 only backtracks when necessary, i.e. when f_i'(x_{i−1}) decreases. The only time f_i'(x_{i−1}) decreases is when a linear piece is inserted on some interval [x_{i−1}, x_α], for x_{i−1} < x_α ≤ x_i.

Figure 3.2: Quadratic-quadratic convex hull case.
Figure 3.3: Linear-quadratic convex hull case; (a) bounded linear piece, (b) unbounded linear piece.
Figure 3.4: Linear-linear convex hull case; (a) bounded convex hull, (b) convex hull unbounded on the left, (c) convex hull unbounded on the right.

Hence, after backtracking, the right piece will be linear. After the first backtrack, we will be in either the quadratic-linear case or the linear-linear case. The only way backtracking can happen in these cases is if the current two pieces are replaced by a single linear piece. By induction, this holds for all consecutive backtracks beyond the first.

Proposition 3.1.5. Algorithm 1 runs in optimal linear time and space.

Proof. Assume co f ≢ −∞. Convexifying pieces in lines 6–10 requires O(1) time for each piece. To show that the while loop on lines 15–31 requires linear time, note again that each iteration requires constant time. The while loop terminates either when we reach the rightmost finite piece of f, or when f contains a single finite piece. In every iteration, we make progress toward at least one of these conditions.

Moving right will lead to the first condition being satisfied. If we don't move right, then we have lost convexity at x_{i−1}.
There are four subcases to consider, based on the types of the original f_i, f_{i+1} being convexified.

(i) Linear-linear: Taking the convex hull of two linear pieces always produces a single linear piece on [x_{i−1}, x_{i+1}], so progress is made toward N reaching 1.

(ii) Quadratic-linear: The only way for Algorithm 1 to decrease f_i'(x_{i−1}) is to insert a linear piece extending rightward from x_{i−1}. As f_{i+1} is linear, this means again that (f_i, f_{i+1}) has collapsed to a single linear piece.

(iii) Quadratic-quadratic: As in case (ii), a linear piece has been inserted whose left endpoint is x_{i−1}. In the quadratic-quadratic case we look for the linear function tangent to both quadratic pieces, so consider the right endpoint of the linear function, x_β. If x_β = x_{i+1}, then again co((f_i, f_{i+1})) is now a single linear piece. If x_β < x_{i+1}, then the convex hull is a linear-quadratic function, which is handled in the next case (after convexification).

(iv) Linear-quadratic: If x_β = x_{i+1}, then (f_i, f_{i+1}) collapses to a single piece. Otherwise, co((f_i, f_{i+1})) still contains two pieces. As we have lost convexity at x_{i−1}, consider the type of f_{i−1}.

If f_{i−1} is quadratic, then backtracking as usual would lead to a cycle of repeatedly convexifying (f_{i−1}, f_i) and (f_i, f_{i+1}), so remove the linear piece and solve the quadratic system instead.

If f_{i−1} is linear, then it is necessary to backtrack without decrementing the number of pieces. We claim that this can happen at most once before adding f_{i+1} to the convex hull. Fix x̄ as the current value of x_{i−1}, and let f_j denote, across all iterations, the piece of f whose right endpoint is the value of x_i before any backtracking. (Currently, f_j = f_i.) By Lemma 3.1.4, each consecutive backtrack will now decrease N, and the right piece in all these backtracks will be f_j.

When consecutive backtracking is no longer necessary, we will again consider having convexified (f_j, f_{j+1}) in the linear-quadratic case, and assume f is still nonconvex at x_{j−1}.
We enumerate the reasons why backtracking stopped.

(a) If j = 1, then we have convexified (f₁, f₂), and consequently move forward (we cannot backtrack any further).

(b) If f_{j−1} is quadratic, then we use the quadratic-quadratic system as above, and either move forward, or, when the two quadratic pieces collapse to one, backtrack.

(c) If f_{j−1} is linear, it may be necessary to backtrack again without collapsing (f_j, f_{j+1}). Note that, as there was at least one consecutive backtrack to get to this point, x_{j−1} is now strictly less than x̄, and also that x_{j−1} is one of the partition points that was in f when we assigned x̄ (backtracking cannot introduce new partition points). Hence there is a finite number of partition points that x_{j−1} can take on in this case, and once x_{j−1} moves left beyond a partition point, it is deleted from the model and cannot be considered by a future linear-quadratic case.

Thus, though a single linear-quadratic piece may require O(n) time, partition points removed from the model will not be considered by future linear-quadratic cases, so O(n) work is done by all linear-quadratic cases.

The other three cases always progress toward either i = i_f or N = 1, so overall, Algorithm 1 runs in linear time.

The function f begins with O(n) pieces, and each iteration adds at most one piece (a tangent line between two quadratics) or deletes at most one piece, hence Algorithm 1 requires linear space as well.

3.2 Convex Duality Approach

An alternate method that we present to compute the convex hull is based on dividing a PLQ function into individual pieces, each of which is convex. Consider a PLQ function f with n pieces.
Assuming co f ≢ −∞, this method will also assume that each individual piece is convex (convexifying individual pieces may initially be necessary, as with the direct algorithm). We name this algorithm plq_coSplit.

Define the functions (g_i)_{i=1}^n by

    g_i(x) = f_i(x)  if x ∈ [x_{i−1}, x_i];  +∞  otherwise.

Each g_i is the extended-value function of f_i restricted to [x_{i−1}, x_i]. We can write f = min g_i. Also, as each g_i is convex, co g_i = g_i. Hence we can write co f as

    co f = co min g_i = co min co g_i.

From here, we apply the following theorem.

Theorem 3.2.1. [14, Theorem X.2.4.4] Let g₁, …, g_n be proper, lower semi-continuous functions. Then

    (co max(g₁, …, g_n))* = co min(g₁*, …, g_n*).

This holds for an arbitrary finite number of functions.

By applying Theorem 3.2.1 with [14, Proposition X.2.6.1], we rewrite the convex hull of f as

    co f = co min(co g₁, …, co g_n) = (max(g₁*, …, g_n*))*.   (3.6)

This formulation provides a basis for computing the convex hull of f. Doing this requires the ability to compute the conjugate of already-convex functions, and to construct the pointwise maximum of n convex functions. The conjugate of each g_i is easily found via an explicit formula in [2, Proposition 5.2]. The complete method is presented in Algorithm 2.

As the difference in line counts suggests, Algorithm 2 requires less work to implement, provided that algorithms are already available for computing the maximum and conjugate of convex PLQ functions.

Algorithm 2: plq_coSplit
Input: f (a PLQ function)
Output: co f (a convex PLQ function)
 1  begin
 2      Check if co f ≡ −∞, and return if necessary;
 3      Convexify individual nonconvex pieces of f;
 4      g ← −∞;
 5      for i ← 1 to n do
 6          Construct g_i as above;
 7          g ← max(g, g_i*);
 8      end
 9      co f ← g*;
10  end

Proposition 3.2.2. Algorithm 2 has linear space complexity, worst-case quadratic time complexity, and best-case linear time complexity.

Proof.
Each g_i contains a single piece, hence g_i* will contain O(1) pieces. In every iteration, g has O(1) partition points added to it, in the case where the partition points of g_i* are not partition points of g. After n iterations, g will contain O(n) pieces. As every call to max is a linear operation, across all calls to max we take O(n²) time.

If every g_i after the first introduces no new pieces into g, then each call to max will take constant time, making the algorithm run in linear time. This is the case, e.g., when computing the convex hull of a piecewise affine interpolation of the function f(x) = −½|x|².

3.3 Performance Comparison

To compare the plq_coDirect and plq_coSplit algorithms, we compute the convex hull of a piecewise affine function with an increasing number of pieces. The results may be seen in Figures 3.5 and 3.6.

Figure 3.5 confirms that the worst-case complexity is attained for both algorithms when computing co(½‖·‖²), with plq_coDirect running in linear time and plq_coSplit showing quadratic performance. As the model of the convex hull does grow to contain n pieces, quadratic growth is expected. In contrast, when computing co(−½‖·‖²) in Figure 3.6, plq_coSplit's model does not grow, and ultimately a model with a single piece is produced (a linear function between the bounds of the domain).

Figure 3.5: Run time (seconds) of the convex hull algorithms computing the convex hull of an n-piece piecewise linear approximation of f(x) = ½x² on [−n/2, n/2].

In both graphs, plq_coDirect performs significantly better. Besides quadratic worst-case complexity, plq_coSplit also makes use of entire lower-level algorithms, so the overhead of invoking these subroutines certainly contributes to plq_coSplit's poorer performance.

All experiments were run on a quad-core 64-bit 2.0 GHz HP HDX 18 laptop with 4 GB RAM, running Gentoo Linux 10.0.
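The duality formula (3.6) underlying plq_coSplit can also be sanity-checked numerically, independently of any PLQ machinery: by Fact 2.1.9, conjugating twice yields the closed convex hull, and the discrete Legendre-Fenchel transform makes this a few lines of NumPy (an illustrative sketch, not the toolbox code used in the experiments above):

```python
import numpy as np

def discrete_conjugate(xs, fx, ys):
    # Discrete Legendre-Fenchel transform: f*(y) = max_x { y*x - f(x) }.
    return np.max(np.outer(ys, xs) - fx[None, :], axis=1)

# A nonconvex sample: f(x) = min(x**2, (x - 2)**2), restricted to [-3, 5].
xs = np.linspace(-3.0, 5.0, 1001)
fx = np.minimum(xs**2, (xs - 2.0) ** 2)
ys = np.linspace(-20.0, 20.0, 1001)  # covers every slope of co f on [-3, 5]

# Conjugating twice yields the closed convex hull (Fact 2.1.9).
hull = discrete_conjugate(ys, discrete_conjugate(xs, fx, ys), xs)

# co f bridges the two wells with their common tangent: the zero function on [0, 2].
assert abs(hull[np.searchsorted(xs, 0.99)]) < 0.05
assert abs(hull[np.searchsorted(xs, -1.0)] - 1.0) < 0.05
assert abs(hull[np.searchsorted(xs, 3.0)] - 1.0) < 0.05
```

This grid-based double transform is only approximate and quadratic in the grid size; the PLQ algorithms of this chapter compute the same hull exactly.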
Implementation was done under Scilab 5.1.1, extending the Computational Convex Analysis toolbox freely available from [18].

Figure 3.6: Run time (seconds) of the convex hull algorithms computing the convex hull of an n-piece piecewise linear approximation of f(x) = −½x² bounded on [−n/2, n/2].

Chapter 4. Graph-Matrix Calculus

We will now present a new class of algorithms for computing transforms of convex functions, based on unary "calculus rules" [12] for the subdifferential of a convex function. We recall these rules, then provide formulas for additional unary operators, and two binary operators. We then apply these rules to PLQ functions, and describe how to recover an exact model of a function from a transformed subdifferential.

4.1 Representation of PLQ Functions

Ultimately, we wish to be able to perform a transform of a convex PLQ function f,

    g = Tf,

by looking at the graph of the subdifferential of f,

    gph ∂f = {(x, y) : x ∈ dom f, y ∈ ∂f(x)},

and computing the subdifferential of the transform via a matrix multiplication

    gph ∂(Tf) = A gph ∂f,

for some matrix A ∈ R^{2d×2d}.

For univariate PLQ functions, A ∈ R^{2×2}, and gph ∂f must be represented in a way suitable for multiplying by A. The data structure used for this graph must also contain a constant of integration, to be able to use integration to recover the PLQ matrix model, and evaluate the function at points in its domain. Conversely, differentiation is able to convert a PLQ matrix model into what we call a GPH matrix,

    [ x_0  x_1  x_2  …  x_n  x_{n+1} ]
    [ s_0  s_1  s_2  …  s_n  s_{n+1} ]
    [ f_0  f_1  f_2  …  f_n  f_{n+1} ].

The subdifferential of a PLQ function, when single-valued, is a piecewise affine function, and the subdifferential of a convex PLQ function is made up of line segments connected end-to-end that extend to infinity in both directions. We note x_i points in dom f, s_i subgradients of f at x_i, and f_i constants of integration (f_i = f(x_i)).
The matrix above stores enough points (x_i, s_i) on this graph to be able to recover the entire graph of the subdifferential via piecewise linear interpolation of (x_i, s_i) for all i ∈ {0, …, n+1}.

The columns of the GPH matrix are stored in lexicographically increasing order. That is, for i ≤ j, we have x_i ≤ x_j, and also s_i ≤ s_j. If x_i = x_j then f_i = f_j. The values {x_1, …, x_n} are divisions between the pieces of f, though they are not necessarily distinct: if f is nondifferentiable at a break x̄, then there will be two columns i < j with x_i = x_j = x̄, and s_i < s_j will be the slopes of f approaching x̄ from the left and right, respectively.

The values x_0 and x_{n+1} are in the matrix to store the subdifferential of f on (−∞, x_1] and [x_n, +∞), respectively. The line segment from (x_1, s_1) to (x_0, s_0) is extrapolated to infinity, and correspondingly with (x_n, s_n) and (x_{n+1}, s_{n+1}). To represent f being bounded on the left (resp. right), x_0 = x_1 (resp. x_{n+1} = x_n) indicates a ray extending vertically to −∞ (resp. +∞). In this case, we set f_0 (resp. f_{n+1}) to +∞. Except for f_0 and f_{n+1}, all other f_i and all x_i, s_i must be finite.

Example 4.1.1. The quadratic energy function may be represented by the GPH matrix

    [ 0  2 ]
    [ 0  4 ]
    [ 0  4 ].

Example 4.1.2. The absolute value function may be encoded as the GPH matrix

    [ −1   0  0  1 ]
    [ −1  −1  1  1 ]
    [  1   0  0  1 ].

Example 4.1.3. The function where f(x) = 0 on x ∈ [−1, 1] and f(x) = x² − 1 elsewhere may be stored as

    [ −2  −1  −1  1  1  2 ]
    [ −4  −2   0  0  2  4 ]
    [  3   0   0  0  0  3 ].

Example 4.1.4. The function where f(x) = x² for x < 0, f(x) = 0 for x ∈ [0, 1], and f(x) = x − 1 for x > 1 may be represented by the matrix

    [ −1  0  1  1  2 ]
    [ −2  0  0  1  1 ]
    [  1  0  0  0  1 ].

There is no unique GPH matrix representation for a PLQ function.
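The examples above can be exercised with a small Python sketch (assumed helper names, not the toolbox's Scilab code). It stores Example 4.1.2 as columns (x_i, s_i, f_i) and recovers ∂f(x) from the polyline through the points (x_i, s_i), including the extrapolated end segments and the vertical segment at the kink:

```python
# Example 4.1.2: the absolute value, stored as GPH columns (x_i, s_i, f_i).
gph_abs = [(-1.0, -1.0, 1.0), (0.0, -1.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 1.0)]

def subdifferential(gph, x):
    """Recover ∂f(x) as an interval (lo, hi) by piecewise-linear
    interpolation of the points (x_i, s_i); the first and last segments are
    extrapolated to infinity (non-vertical end segments assumed, i.e. an
    unbounded domain)."""
    n = len(gph)
    vals = []
    for i in range(n - 1):
        (x0, s0, _), (x1, s1, _) = gph[i], gph[i + 1]
        if x0 == x1:                      # vertical segment: a kink at x0
            if x == x0:
                vals += [s0, s1]
        else:
            lo = -float("inf") if i == 0 else x0
            hi = float("inf") if i == n - 2 else x1
            if lo <= x <= hi:
                vals.append(s0 + (s1 - s0) * (x - x0) / (x1 - x0))
    return (min(vals), max(vals))

print(subdifferential(gph_abs, 0.0))   # → (-1.0, 1.0): the kink of |x|
```

The same interpolation, combined with a stored constant of integration, is what lets the GPH format stand in for the PLQ matrix model.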
Redundant points may be stored in the middle of the matrix, and the two endpoints (x_0, s_0) and (x_{n+1}, s_{n+1}) can be slid along their respective rays. While normalized GPH rules could be set, this would require normalization after many transformations, as the transformation matrix A can change gph ∂f drastically. Frequent cleaning could incur significant overhead, so we instead suggest that a separate algorithm be provided to "clean" the GPH matrix when desired, removing duplicate points.

To represent the special case of the point-indicator function f(x) = ι_{x̄}(x) + c, we use the matrix

    [ x̄    x̄   ]
    [ s_0  s_1 ]   (with any s_0 < s_1).
    [ c    c   ]

This matrix is the vertical line at x = x̄. An affine function f(x) = ax + b is stored as a matrix whose graph is a horizontal line,

    [ x_0       x_1      ]
    [ a         a        ]   (with any x_0 < x_1).
    [ ax_0 + b  ax_1 + b ]

Likewise, a quadratic f(x) = ax² + bx + c is encoded as

    [ x_0                x_1               ]
    [ 2ax_0 + b          2ax_1 + b         ]   (with any x_0 < x_1).
    [ ax_0² + bx_0 + c   ax_1² + bx_1 + c  ]

Given a matrix

    [ x_0  x_1 ]
    [ s_0  s_1 ]
    [ f_0  f_1 ],

if x_0 < x_1 and s_0 < s_1, then f is a quadratic function that can be recovered with the following formulas:

    f(x) = ax² + bx + c,
    a = (s_1 − s_0) / (2(x_1 − x_0)),
    b = s_0 − 2a x_0,
    c = f_0 − a x_0² − b x_0.

The case where x_0 = x_1 and s_0 = s_1 does not form a valid GPH matrix, as two distinct points are required to extrapolate to a maximal graph.

Considering that gph ∂f is stored in the first two rows of the GPH matrix, we are now able to calculate A gph ∂f with a simple matrix multiplication.

4.2 Goebel's Graph-Matrix Calculus

Numerous calculus rules were originally published in [12], then slightly extended in [11], showing how the graph of the subdifferential of a convex function is affected under core convex transformations operating on a single function. Specifically, (i), (ii), (iii), (iv), (vii), and (viii) were given in [12], and (v), (vi), and (ix) were introduced in [11].

Theorem 4.2.1 (Goebel's Graph-Matrix Calculus for unary operators). Let f : R^d → R ∪ {+∞} be a proper, lower semicontinuous, convex function.
Then the following rules hold, for all λ > 0, µ ≥ 0, and α ∈ [0, 1].

(i) gph ∂(f*) = [0 Id; Id 0] gph ∂f.

(ii) gph ∂(f + µq) = [Id 0; µId Id] gph ∂f.

(iii) gph ∂(λf) = [Id 0; 0 λId] gph ∂f.

(iv) gph ∂(e_λ f) = [Id λId; 0 Id] gph ∂f.

(v) gph ∂(λ ⋆ f) = [λId 0; 0 Id] gph ∂f.

(vi) gph ∂(f(λ·)) = [λ⁻¹Id 0; 0 λId] gph ∂f.

(vii) gph Prox_λ(f) = [Id λId; Id 0] gph ∂f.

(viii) gph ∂(s_α f) = [Id αId; αId Id] gph ∂f.

(ix) gph ∂(T_α f) = [(1 − α/2)Id (α/2)Id; (α/2)Id (1 − α/2)Id] gph ∂f.

Proof. Let J = [0 Id; Id 0]. Formula (i) is shown via

    (x, y) ∈ gph ∂f ⇔ y ∈ ∂f(x) ⇔ x ∈ ∂f*(y) ⇔ (y, x) ∈ gph ∂f*.

For (ii), we note that ∂(f + µq) = ∂f + µ∂q = ∂f + µId, thus (x, y) ∈ gph ∂f ⇔ (x, y + µx) ∈ gph ∂(f + µq). We get (iii) using ∂(λf) = λ(∂f):

    y ∈ ∂f(x) ⇔ λy ∈ ∂(λf)(x) ⇔ (x, λy) ∈ gph ∂(λf).

Similarly, ∂(e_λ f) = ∂(f □ λ⁻¹q) = ∂(f* + λq)*, and using the previous rules yields (iv). To obtain (v), we use the identity λ ⋆ f = (λf*)*, and applying the two previous rules gives

    gph ∂(λf*)* = J gph ∂(λf*) = J [Id 0; 0 λId] gph ∂f* = J [Id 0; 0 λId] J gph ∂f = [λId 0; 0 Id] gph ∂f.

Using both multiplication rules (iii) and (v), and rewriting f(λ·) as λ(λ⁻¹ ⋆ f), leads to

    gph ∂(f(λ·)) = gph ∂(λ(λ⁻¹ ⋆ f)) = [Id 0; 0 λId] [λ⁻¹Id 0; 0 Id] gph ∂f = [λ⁻¹Id 0; 0 λId] gph ∂f,

thus showing (vi). From [5, Section 2.2], we have Prox_λ f = (Id + λ∂f)⁻¹, so

    gph(Id + λ∂f) = [Id 0; Id λId] gph ∂f,

and inverting the graph gives (vii). By writing out the definition of s_α f and expanding based on previous rules,

    gph ∂(s_α f) = gph ∂((1 − α²) e_α f + αq) = [Id 0; αId Id] [Id 0; 0 (1 − α²)Id] [Id αId; 0 Id] gph ∂f,

and matrix multiplication proves (viii). The proof of (ix) is postponed until after the next theorem.

We also recall rules in [11] for calculation of the transformed subdifferential under the binary operations of addition, infimal convolution, and taking the proximal average.

Theorem 4.2.2 (Graph-Matrix Calculus for binary operators). Let f_1 and f_2 be two proper, lower semicontinuous, convex functions.
Then the following rules hold (for i ∈ {1, 2}).

(i) Addition rule: If ri dom f_1 ∩ ri dom f_2 ≠ ∅, then

    (x, s) ∈ gph ∂(f_1 + f_2) ⇔ ∃ (x_i, s_i) ∈ gph ∂f_i such that x = x_1 = x_2 and s = s_1 + s_2.

(ii) Inf-convolution rule: If ∂f_1(x_1) ∩ ∂f_2(x_2) ≠ ∅, then

    (x_1 + x_2, s) ∈ gph ∂(f_1 □ f_2) ⇔ (x_i, s) ∈ gph ∂f_i.

(iii) Proximal average rule: Let λ ∈ (0, 1). Assume that P_λ(f_1, f_2) is exact at x = (1 − λ)x_1 + λx_2, where x_i ∈ dom f_i for i ∈ {1, 2}. Then

    (x, s) ∈ gph ∂P_λ(f_1, f_2) ⇔ s_1 ∈ ∂f_1(x_1), s_2 ∈ ∂f_2(x_2), and s = x_1 + s_1 − x = x_2 + s_2 − x.

Proof. As f_1, f_2 are convex, we have by [14, Corollary XI.3.1.2]: ∂f_1(x) + ∂f_2(x) = ∂(f_1 + f_2)(x). Hence,

    s ∈ ∂(f_1 + f_2)(x) ⇔ s = s_1 + s_2 for s_i ∈ ∂f_i(x).

Similarly, [14, Corollary XI.3.4.2] gives ∂f_1(x_1) ∩ ∂f_2(x_2) = ∂(f_1 □ f_2)(x), so

    s ∈ ∂(f_1 □ f_2)(x) ⇔ ∃ x_1, x_2 with s ∈ ∂f_1(x_1) and s ∈ ∂f_2(x_2).

Using ∂P_λ(f_1, f_2)(x) = −x + ⋂_{i : λ_i > 0} (∂f_i(x_i) + x_i) from [1, Theorem 7.1],

    s ∈ ∂P_λ(f_1, f_2)(x) ⇔ s = x_1 + s_1 − x for some s_1 ∈ ∂f_1(x_1), and s = x_2 + s_2 − x for some s_2 ∈ ∂f_2(x_2).

We now include the proof of Theorem 4.2.1(ix).

Proof. Using the definition T_α f = P_α(f, q), and that s_2 ∈ ∂q(x_2) ⇔ s_2 = x_2, we have

    (x, s) ∈ gph ∂(T_α f) ⇔ x = (1 − α)x_1 + αx_2, s = x_1 + s_1 − x, and s = x_2 + s_2 − x.

Solving for x and s yields

    x = (1 − α/2)x_1 + (α/2)s_1,
    s = (α/2)x_1 + (1 − α/2)s_1,

which are the coefficients in Theorem 4.2.1(ix).

4.3 Transforms of PLQ Functions

Given a convex PLQ function f in GPH matrix format, Theorem 4.2.1 shows how to compute many of the fundamental unary convex transforms. Having computed A gph ∂f, we still need to recover at least one constant of integration to be able to make use of this new graph. We present ways to transform f_i values from ∂f's graph to new values in A gph ∂f's graph. Unlike the matrices presented by Theorem 4.2.1, the method to compute the new f_i's depends highly on the transform being performed.

We set our notation first.
We will assume we are transforming a convex function f given by the GPH matrix [x_f; s_f; y_f], and that we have calculated x_g, s_g for the transformed function g = Tf given by [x_g; s_g; y_g]. The notation x.*y denotes elementwise multiplication, and x.^n denotes elementwise exponentiation.

Recovery of y_g when g = λf or g = f + µq is straightforward: y_g = λy_f in the first case, and y_g = y_f + (µ/2)x_f.^2 in the second. The following two propositions show how to compute the transforms of other unary operators.

Proposition 4.3.1. To compute the Legendre-Fenchel transform, g = f*, y_g can be calculated via y_g = s_f.*x_f − y_f.

Proof. As f is convex, looking for a maximizer of g(s) = f*(s) = max_x (⟨s, x⟩ − f(x)) means finding x, s such that s ∈ ∂f(x). For all i ∈ {0, …, n+1}, we know that (s_f)_i ∈ ∂f((x_f)_i), hence y_g = g(x_g) = g(s_f) = s_f.*x_f − y_f.

Proposition 4.3.2. Under the Moreau envelope g = e_λ f, we have that y_g = y_f + (λ/2)s_f.^2.

Proof. The Moreau envelope can be written as e_λ f = (f* + λq)*. By the previous proposition as well as Theorem 4.2.1(i),

    f* := [x_1; s_1; y_1] = [s_f; x_f; s_f.*x_f − y_f].

Then, if we consider the function f* + λq,

    f* + λq := [x_2; s_2; y_2] = [x_1; s_1 + λx_1; y_1 + (λ/2)x_1.^2].

Conjugating again,

    g = (f* + λq)* := [x_g; s_g; y_g] = [s_2; x_2; s_2.*x_2 − y_2].

Expanding y_g yields y_g = y_f + (λ/2)s_f.^2.

To compute the sum of two convex PLQ functions f and g finite at more than a single point, the x-axis must be repartitioned to include all of the distinct elements in x_f and x_g. Pseudocode is presented in Algorithm 3. The algorithm requires time and space linear in the number of pieces of f and g, as Line 3 merges two sorted sequences, after which each partition is only considered once, in O(1) time. Note that additional (linear) work may be required before entering the for loop to determine in which pieces of f and g the entries of x fall.

The proximal average also requires more work to implement than the unary operators we have shown.

Proposition 4.3.3.
Let f_1 = [x_1; s_1; y_1] and f_2 = [x_2; s_2; y_2] be convex PLQ functions in GPH matrix format, and λ ∈ [0, 1]. Define the operator P by P = (Id + ∂f_2)⁻¹(Id + ∂f_1). Then the proximal average P_λ(f_1, f_2) = [x; s; y] is given by:

    x = (1 − λ)x_1 + λPx_1,
    s = x_1 + s_1 − x = x_2 + s_2 − x,
    y = (1 − λ)(y_1 + ½x_1.^2) + λ(f_2(Px_1) + ½(Px_1).^2) − ½x.^2.

Algorithm 3: gph_add
  Input: f, g (convex PLQ functions, in GPH matrix format)
  Output: h (a convex PLQ function, in GPH matrix format)
   1  begin
   2    h ← [];
   3    x ← the ascending distinct partition points in x_f and x_g;
   4    for each interval [x_i, x_{i+1}] produced by the partition x do
   5      Append to h a column for the right slope at x_i by adding the corresponding right slopes and function values of f and g at x_i;
   6      Append to h another column for the left slopes and function values at x_{i+1};
   7    end
   8    if f(x) = ∞ or g(x) = ∞ for x < x_1 then
   9      Bound h on the left with a column [x_1; s_1 − 1; ∞];
  10    end
  11    if f(x) = ∞ or g(x) = ∞ for x > x_n then
  12      Bound h on the right with a column [x_n; s_n + 1; ∞];
  13    end
  14    h ← UniqueColumns(h);
  15  end

Proof. In order to calculate the proximal average at x, we must find the x_1, x_2 that make the proximal average exact. To do this, we parametrize the graph with respect to x_1 and find the corresponding x_2. The formula for s was stated in Theorem 4.2.2. From s, we can see that x_1 + s_1 = x_2 + s_2, so (Id + ∂f_1)(x_1) = (Id + ∂f_2)(x_2). Thus x_2 = Px_1 gives the points that correspond to x_1, i.e.

    x_2 = (Id + ∂f_2)⁻¹(Id + ∂f_1)(x_1).

By using a reformulation of the proximal average [1, Proposition 4.3],

    P_λ(f_1, f_2) = inf_{x = (1−λ)x_1 + λx_2} ( (1 − λ)(f_1 + q)(x_1) + λ(f_2 + q)(x_2) − q(x) ),

and substituting x_2, we arrive at the formula for y.

It is important to note that (y_1 + ½x_1.^2), (f_2(Px_1) + ½(Px_1).^2), and (½x.^2) are independent of λ. These values can be precomputed to reduce the time needed to compute the proximal average for various λ and fixed f_1, f_2, x. The proximal average PLQ algorithm does not use such a precomputation scheme.
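The precomputation can be sketched in a few lines of Python (hypothetical names, not the toolbox's code). Here f_1 = q = ½x² and f_2(x) = x² are hardcoded so that P = (Id + ∂f_2)⁻¹(Id + ∂f_1) has the closed form P(x) = 2x/3:

```python
# Proposition 4.3.3 with f_1 = (1/2)x^2 and f_2(x) = x^2, so that
# (Id + d f_1)(x) = 2x, (Id + d f_2)^{-1}(z) = z/3, hence P(x) = 2x/3.
x1 = [-1.0, 0.0, 1.0]                    # abscissae of f_1's GPH matrix
y1 = [0.5 * t * t for t in x1]           # f_1 values at those points
Px1 = [2.0 * t / 3.0 for t in x1]        # x_2 = P x_1
f2P = [t * t for t in Px1]               # f_2 evaluated at P x_1

# The lambda-independent terms, computed once:
A = [v + 0.5 * t * t for v, t in zip(y1, x1)]
B = [v + 0.5 * t * t for v, t in zip(f2P, Px1)]

def prox_average_values(lam):
    """x- and y-rows of P_lam(f_1, f_2) via Proposition 4.3.3."""
    x = [(1 - lam) * t + lam * p for t, p in zip(x1, Px1)]
    y = [(1 - lam) * a + lam * b - 0.5 * t * t for a, b, t in zip(A, B, x)]
    return x, y

print(prox_average_values(0.0))   # lambda = 0 returns f_1: y-row is [0.5, 0.0, 0.5]
```

Only prox_average_values itself depends on λ; the lists A, B, and Px_1 are computed once and reused across every λ in a sweep, which is exactly the saving the precomputation scheme offers.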
Such a precomputation was demonstrated in [11] to compute P_λ(½‖·‖², ι_{0} + 1) for many λ (Figure 4.1), and significant performance increases were noticed: the PLQ algorithm required 117 seconds, while the GPH algorithm for multiple λ took only 33 seconds.

Figure 4.1: The proximal average of f_1 = ½‖·‖² and f_2 = ι_{0} + 1 as λ ranges from 0 (blue) to 1 (red).

4.4 Comparison of Algorithms

To compare the performance of GPH algorithms with existing algorithms, we implemented many of these GPH algorithms alongside their PLQ counterparts in [18]. The results may be seen in Figures 4.2 and 4.3.

We included four algorithms in our experiments. PLQ is the existing algorithm that works with PLQ functions directly. GPH is our new algorithm that works with GPH matrices. In Figure 4.2, LLT is the Linear-time Legendre Transform [16]. Finally, OPT is an algorithm using the n1fc1 nonsmooth bundle method solver built into Scilab 5.1.1 [6], with a warm start from the previously computed point.

First, we look at computing the conjugate, (x⁴)*, in Figure 4.2. The general OPT algorithm does not perform as well as the specialized algorithms (Figure 4.2(a)), and removing OPT from the picture makes it clear that, although both algorithms clearly require linear time, GPH is outperforming PLQ significantly (Figure 4.2(b)). The PLQ conjugate algorithm makes use of a "clean" function to normalize its matrix, whereas this is not necessary for the GPH matrix. The GPH algorithm, which produces an exact transformed model, shows performance comparable to the inexact LLT algorithm.

We also used the proximal average algorithm for comparison (Figure 4.3). The results were much the same: OPT was the slowest algorithm; PLQ performed much better; and GPH showed significant improvement over both algorithms.

All experiments were performed on the same computer as in Section 3.3.
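The gap is unsurprising: by Theorem 4.2.1 together with Propositions 4.3.1 and 4.3.2, the GPH conjugate and Moreau envelope amount to a row swap or a shear plus one elementwise expression per column. A plain-Python sketch (a stand-in for the toolbox's Scilab code; the two-column matrix for f(x) = x² is our own example, with columns written as (x, s, y) tuples):

```python
def gph_conjugate(gph):
    # Theorem 4.2.1(i): swap rows x and s; Proposition 4.3.1: y_g = s*x - y.
    return [(s, x, s * x - y) for (x, s, y) in gph]

def gph_moreau(gph, lam):
    # Theorem 4.2.1(iv): x -> x + lam*s; Proposition 4.3.2: y_g = y + (lam/2)*s^2.
    return [(x + lam * s, s, y + 0.5 * lam * s * s) for (x, s, y) in gph]

# Two columns of a GPH matrix for f(x) = x^2, sampled at x = 0 and x = 2.
gph_sq = [(0.0, 0.0, 0.0), (2.0, 4.0, 4.0)]

print(gph_conjugate(gph_sq))    # f*(s) = s^2/4: f*(4) = 4 with slope 2
print(gph_moreau(gph_sq, 1.0))  # e_1 f(x) = x^2/3: e_1 f(6) = 12
```

Each column is touched exactly once, with no partition bookkeeping and no normalization pass, which is consistent with the constant-factor gap between GPH and PLQ observed in Figure 4.2(b).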
Figure 4.2: Comparison of conjugate algorithms computing (x⁴)* approximated by n-piece piecewise linear functions on [−10, 10]. (a) All conjugate algorithms. (b) With OPT removed.

Figure 4.3: Comparison of proximal average algorithms computing P_{1/2}(x⁴, eˣ) approximated by n-piece piecewise linear functions on [−10, 10].

Chapter 5. Conclusion

We have provided two means to compute the convex hulls of continuous piecewise linear-quadratic functions. The first algorithm requires treating a number of special cases at each iteration, but runs in optimal linear time. The second requires worst-case quadratic time and runs slower than the first even when it achieves its best-case linear time, but requires less effort to implement.

We also showed that by storing the graph of the subdifferential of a PLQ function in matrix form, it is possible to build a class of algorithms that computes many convex transforms of convex PLQ functions via matrix multiplication, followed by a linear-time calculation of the value of the transformed function. We included complete algorithms for scalar multiplication, addition of the scaled energy function, addition of two functions, the Legendre-Fenchel transform, the Moreau envelope, and the proximal average.
Because these calculations are matrix and elementwise vector operations, and because no subalgorithms are necessary, we experienced significant performance gains over algorithms working with PLQ functions directly.

For both sets of algorithms, we experimentally validated our results with implementations in the Computational Convex Analysis toolbox [18].

Directions for future work include extending the univariate PLQ framework to model bivariate PLQ functions. Representation of the domain of a function becomes critical, and it is necessary to find a representation that aids the computation of the conjugate, from which other operators may be built. Computing the convex hull of a bivariate PLQ function is also important; however, certain techniques can be ruled out: the duality approach of Section 3.2 cannot be directly applied, as the pointwise maximum of two convex bivariate PLQ functions, while convex, may fail to be PLQ (see Figure 5.1).

Another direction would be to extend the Graph-Matrix Calculus algorithms to work with nonconvex functions. Additionally, instead of relying on convex hull algorithms to be able to use convex algorithms on nonconvex functions, individual algorithms may be extended to handle nonconvex cases as well.

Figure 5.1: f(x, y) = ½(x² + y²), g(x, y) = 4. max(f, g) is not a PLQ function, as one piece of its domain is a circle about (0, 0), which is not polyhedral as the PLQ class requires.

Bibliography

[1] H. H. Bauschke, R. Goebel, Y. Lucet, and X. Wang. The proximal average: Basic theory. SIAM Journal on Optimization, 19(2):766–785, July 2008.

[2] H. H. Bauschke, Y. Lucet, and M. Trienis. The piecewise linear-quadratic model for computational convex analysis. Computational Optimization and Applications, 43(1), May 2009.

[3] H. H. Bauschke, E. Matoušková, and S. Reich. Projection and proximal point methods: convergence results and counterexamples.
Nonlinear Analysis: Theory, Methods & Applications, 56(5):715–738, 2004.

[4] K. P. Bennett and E. Parrado-Hernández. The interplay of optimization and machine learning research. J. Mach. Learn. Res., 7:1265–1281, 2006.

[5] P. L. Combettes and J.-C. Pesquet. A proximal decomposition method for solving convex variational inverse problems. Inverse Problems, 24(6), 2008.

[6] Scilab Consortium. Scilab. http://www.scilab.org/, 2010.

[7] H. Edelsbrunner. Algorithms in Combinatorial Geometry. Springer-Verlag, 1987.

[8] P. F. Felzenszwalb and D. P. Huttenlocher. Distance transforms of sampled functions. Technical Report TR2004-1963, Cornell Computing and Information Science, September 2004.

[9] B. Gardiner and Y. Lucet. Numerical computation of Fitzpatrick functions. Journal of Convex Analysis, 16(4), January 2009.

[10] B. Gardiner and Y. Lucet. Convex hull algorithms for piecewise linear-quadratic functions in computational convex analysis. Submitted 2010.

[11] B. Gardiner and Y. Lucet. Graph-matrix calculus for computational convex analysis. Submitted 2010.

[12] R. Goebel. Self-dual smoothing of convex and saddle functions. Journal of Convex Analysis, 15:179–190, 2008.

[13] W. L. Hare. A proximal average for nonconvex functions: A proximal stability perspective. SIAM Journal on Optimization, 20(2):650–666, May 2009.

[14] J.-B. Hiriart-Urruty and C. Lemaréchal. Convex Analysis and Minimization Algorithms, volumes 305–306 of Grundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, 1993.

[15] V. Koch, J. Johnstone, and Y. Lucet. Convexity of the proximal average. Submitted June 29, 2009.

[16] Y. Lucet. Faster than the fast Legendre transform, the linear-time Legendre transform. Numerical Algorithms, 16(2):171–185, 1997.

[17] Y. Lucet. Fast Moreau envelope computation I: Numerical algorithms. Numerical Algorithms, 43(3):235–249, November 2006.

[18] Y. Lucet. Computational Convex Analysis.
http://people.ok.ubc.ca/ylucet/cca, 2010.

[19] S. M. Moffat. On the kernel average for n functions. MSc thesis, University of British Columbia Okanagan, 2009.

[20] R. T. Rockafellar. Convex Analysis. Princeton University Press, Princeton, New Jersey, 1970.

[21] R. T. Rockafellar and R. J.-B. Wets. Variational Analysis. Springer-Verlag, Berlin, 1998.

[22] M. Trienis. Computational convex analysis: From continuous deformation to finite convex integration. MSc thesis, University of British Columbia Okanagan, January 2007.

[23] J. Yu, S. V. N. Vishwanathan, S. Günter, and N. N. Schraudolph. A quasi-Newton approach to nonsmooth convex optimization problems in machine learning. J. Mach. Learn. Res., 11:1145–1200, 2010.