Convex Relaxations of the Maximin Dispersion Problem by Sheena Ayla Haines B.Sc. Hons., The University of British Columbia, 2009 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in The College of Graduate Studies (Mathematics) THE UNIVERSITY OF BRITISH COLUMBIA (Okanagan) July 2011 c⃝ Sheena Ayla Haines 2011 Abstract Recently, convex relaxations have achieved notable success in solving NP- hard optimization problems. This thesis studies semidefinite and second- order cone programming convex relaxations of the maximin dispersion prob- lem. Providing nontrivial approximation bounds, we believe that our SDP and SOCP relaxation methods are useful in large-scale optimization. The thesis is organized as follows. We begin by recalling some basic con- cepts from convex analysis, nonsmooth analysis, and optimization. We then introduce the weighted maximin dispersion optimization problem; locating point(s) in a given region 풳 ⊆ ℝ푛 that is/are furthest from a given set of 푚 points. Also given are several reformulations of the original problem, in- cluding a convex relaxation problem and semidefinite and second order cone programming relaxations. Using well known results on Lipschitz functions and subdifferentials of Lipschitz functions we then derive a theoretical char- acterization of the optimal solutions for a given convex region 풳 and equal weights. Next, we provide a proof that the weighted maximin dispersion problem is NP-hard even in the case where 풳 is is a box and the weights are equal. We then propose an algorithm for finding an approximate solution using the SDP relaxation, and derive an approximation bound that depends on the subset 풳 . In the case that 풳 is a box or a product of low-dimensional spheres, we show that the convex relaxation reduces to a second-order cone program, and provide further results on the bound provided. Lastly, we pro- vide numerical examples using the SDP and SOCP relaxations, and suggest future work. ii Table of Contents Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . vii Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Existence of Maximizers and Minimizers . . . . . . . . . . . 3 1.3 Matrix Algebra . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.4 Convex Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.4.1 Convex Sets and Convex Functions . . . . . . . . . . 7 1.4.2 Minimizers or Maximizers of Convex Functions . . . . 16 1.5 Nonsmooth Analysis . . . . . . . . . . . . . . . . . . . . . . . 16 1.5.1 Locally Lipschitz Functions . . . . . . . . . . . . . . . 16 1.5.2 Subdifferentials and Critical Points . . . . . . . . . . 18 1.6 Some Convex Optimization Problems . . . . . . . . . . . . . 19 1.7 NP-Complete Problems . . . . . . . . . . . . . . . . . . . . . 21 1.7.1 The Theory of NP-Completeness . . . . . . . . . . . . 22 1.7.2 Some NP-Hard Problems . . . . . . . . . . . . . . . . 23 1.7.3 The Complexity of Algorithms for Solving SDP and SOCP . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 2 The Weighted Maximin Location Problem . . . . . . . . . . 25 2.1 General Problem and Motivation . . . . . . . . . . . . . . . . 25 2.2 Convex Relaxations . . . . . . . . . . . . . . . . . . . . . . . 28 iii Table of Contents 2.2.1 SDP Relaxation . . . . . . . . . . . . . . . . . . . . . 30 2.2.2 SOCP Relaxation . . . . . . . . . . . . . . . . . . . . 31 2.2.3 More Properties on the Constraints of Convex Relax- ations . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 2.2.4 Why Does the SDP or SOCP Relaxation Have an Op- timal Solution? . . . . . . . . . . . . . . . . . . . . . 37 3 A Necessary Condition for Optimal Solutions . . . . . . . . 39 3.1 Properties of the Maximin Objective Functions . . . . . . . . 39 3.2 Characterization of Optimal Solutions . . . . . . . . . . . . . 41 4 NP-Hardness Results . . . . . . . . . . . . . . . . . . . . . . . 44 4.1 The Case That 풳 is a Box . . . . . . . . . . . . . . . . . . . 44 4.2 NP-Hardness of Restricted Problems . . . . . . . . . . . . . 49 5 Convex Relaxation Based Algorithms . . . . . . . . . . . . . 54 5.1 Auxiliary Results . . . . . . . . . . . . . . . . . . . . . . . . 54 5.2 SDP Relaxation Based Algorithm . . . . . . . . . . . . . . . 56 5.3 Proof of Theorem 5.4: SDP Relaxation Based Algorithm . . 65 5.4 Examples on the Lower Bound 훾∗ . . . . . . . . . . . . . . . 67 6 Numerics and Examples . . . . . . . . . . . . . . . . . . . . . 69 6.1 Approximation Bounds: Box Case . . . . . . . . . . . . . . . 69 6.2 Approximation Bounds: Ball Case . . . . . . . . . . . . . . . 70 6.3 Where the SDP Relaxation Fails . . . . . . . . . . . . . . . . 72 6.3.1 Necessary Condition for Non-trivial Bounds . . . . . 73 6.4 An Alternate Approach . . . . . . . . . . . . . . . . . . . . . 74 6.5 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . 75 7 Conclusions and Future Work . . . . . . . . . . . . . . . . . . 80 7.1 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 7.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 Appendices A Matlab Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 B Matlab Output . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 iv List of Tables 6.1 Maximal value of 푚 yielding a non-trivial approximation. . . 74 6.2 Numerical Results for Large 푛 = 5. . . . . . . . . . . . . . . . 78 6.3 Numerical Results for Large 푛 and 푚. . . . . . . . . . . . . . 79 v List of Figures 1.1 Examples of convex sets. . . . . . . . . . . . . . . . . . . . . . 8 1.2 Examples of nonconvex sets. . . . . . . . . . . . . . . . . . . . 8 1.3 The convex hull of some nonconvex sets. . . . . . . . . . . . . 9 1.4 A hyperplane with normal vector 푎 divides ℝ2 into two half- spaces. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 1.5 The Euclidean norm cone in ℝ3. . . . . . . . . . . . . . . . . 11 1.6 A polyhedron. . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 1.7 A convex function. . . . . . . . . . . . . . . . . . . . . . . . . 12 2.1 Plots of 푓(푥) for 풳 = [−1, 1]2, with the set 푆 on the left and set 푇 on the right. . . . . . . . . . . . . . . . . . . . . . . . . 27 vi Acknowledgements I would like to thank my supervisors Jason Loeppky and Shawn Wang for their infinite patience and encouragement. Their passion and enthusi- asm for mathematics and statistics has both inspired me and given me the motivation to pursue my Master’s degree. My thanks also go to my thesis committee members Heinz Bauschke, Yong Gao and Warren Hare for their insightful comments and suggestions which have improved the presentation of my thesis. I would also like to thank Richard Taylor for being the first professor to encourage me to pursue mathematics, and Rebecca Tyson for giving me my first opportunity to pursue mathematical research. Most im- portantly, I want to acknowledge the work of Paul Tseng. The work in this thesis is based on an unpublished paper that Paul was working on at the time of his disappearance. Recognizing the significance of his work, my supervisors suggested that we take Paul’s ideas and carry them to their con- clusion, the result of which is this thesis. Without Paul’s ideas this thesis would never have existed, and I hope that I have done justice to him in this work. vii Dedication This work is dedicated to my mom, who has always believed in me, and the late (Captain) Bob MacAuley, who was the first person to ever spark my interest in mathematics, physics and statistics. viii Chapter 1 Introduction The fundamental building blocks of all studies in optimization can be found in calculus, linear algebra and analysis. This chapter collects back- ground materials and known facts which will be used in later chapters. Sec- tions 1.1, 1.3 and 1.4 give a brief overview of the necessary results from these fields which are used to build the basic theoretical results of optimization. Sections 1.5.1 and 1.5 then introduce concepts which are required specifi- cally for solving non-convex optimization problems, like the one presented in this thesis. 1.1 Preliminaries In this work the geometry of problems is mainly concerned with Eu- clidean space and finding solutions that are subsets of ℝ푛. In this section we discuss the geometry and fundamental concepts of the real vector space ℝ푛. We also recall a few very basic definitions from calculus which are required throughout the rest of this thesis. We denote the set of 푛-dimensional real vectors by ℝ푛, and for any 푥 ∈ ℝ푛 we denote the 푖푡ℎ component of 푥 by 푥푖. Thus, each 푥 ∈ ℝ푛 is a column vector 푥 = ⎛⎜⎜⎜⎝ 푥1 푥2 ... 푥푛 ⎞⎟⎟⎟⎠ . For any 푥, 푦 ∈ ℝ푛 we define the inner product ⟨⋅, ⋅⟩ by ⟨푥, 푦⟩ = 푥푇 푦 = 푛∑ 푖=1 푥푖푦푖. (1.1) Definition 1.1. A norm ∥⋅∥ on ℝ푛 is a function that assigns a scalar ∥푥∥ to every 푥 ∈ ℝ푛 with the following properties: 1. ∥푥∥ ≥ 0 for all 푥 ∈ ℝ푛. 1 1.1. Preliminaries 2. ∥푥∥ = 0 if and only if 푥 = 0. 3. ∥훼푥∥ = ∣훼∣ ∥푥∥ for all 훼 ∈ ℝ and 푥 ∈ ℝ푛. 4. ∥푥+ 푦∥ ≤ ∥푥∥ + ∥푦∥ for all 푥, 푦 ∈ ℝ푛. This is known as the Triangle Inequality. Throughout this thesis we will be working exclusively with the Euclidean norm ∥푥∥2 = ( 푥푇푥 )1/2 = ( 푛∑ 푖=1 푥2푖 )1/2 (1.2) and so we will drop the subscript, so that ∥푥∥ refers to the Euclidean norm. Based on the definition of a norm we have the notion of distance. In general, the distance between two vectors 푥 and 푦 is given by 푑(푥, 푦) = ∥푥− 푦∥ . (1.3) The Euclidean distance between two vectors 푥, 푦 ∈ ℝ푛 is given by: ∥푥− 푦∥2 = √ (푥− 푦)푇 (푥− 푦) = ( 푛∑ 푖=1 (푥푖 − 푦푖)2 )1/2 . (1.4) As this is the only distance measure used throughout this thesis, we will again drop the subscript. A useful inequality that we will apply throughout this thesis is the Cauchy-Schwarz inequality: Fact 1.2. (Cauchy-Schwarz Inequality) [9] Let 푥, 푦 ∈ ℝ푛. Then ∣⟨푥, 푦⟩∣ ≤ ∥푥∥ ∥푦∥ , (1.5) with equality holding if 푥 and 푦 are collinear, i.e., 푥 being a scalar multiple of 푦 or vice versa. A basic result of analysis is that Fact 1.3. All norms in ℝ푛 are equivalent. More precisely, suppose that ∥⋅∥푎 and ∥ ⋅ ∥푏 are norms in ℝ푛, then there exists 훼, 훽 > 0 such that 훼∥푥∥푎 ≤ ∥푥∥푏 ≤ 훽∥푥∥푎 ∀푥 ∈ ℝ푛. 2 1.2. Existence of Maximizers and Minimizers 1.2 Existence of Maximizers and Minimizers Definition 1.4. The domain of a function 푓 , denoted dom (푓), is given by dom (푓) = {푥 ∈ ℝ푛∣ 푓(푥) < +∞}. Example 1.5. The domain of some common functions: − dom (푥2) = ℝ − dom (√푥) = {푥 ∈ ℝ ∣ 푥 ≥ 0} − dom ( 1푥) = {푥 ∈ ℝ ∣ 푥 ∕= 0} Definition 1.6. We say that the function 푓 is proper if its domain is nonempty and 푓 > −∞. Definition 1.7. The 훼-sublevel set of a function 푓 is the set {푥 ∈ dom 푓 ∣푓(푥) ≤ 훼} (1.6) and the epi-graph of 푓 is the set epi (푓) = { (푥, 푟) ∈ ℝ푛+1 ∣푥 ∈ dom 푓 and 푓(푥) ≤ 푟} . (1.7) The indicator function associated with 푆 ⊆ ℝ푛 is defined by 휄푆(푥) = { 0, if 푥 ∈ 푆; +∞, otherwise. Associated with 푓 : ℝ푛 → [−∞,+∞] we let argmin푓 = {푥 ∈ ℝ푛 ∣ 푓(푥) = min 푓} (1.8) and argmax푓 = {푥 ∈ ℝ푛 ∣ 푓(푥) = max 푓} . (1.9) Recall that Definition 1.8. A set 푆 ⊂ ℝ푛 is compact if it is closed and bounded. A continuous function on a compact set always has maximizers and min- imizers. 3 1.3. Matrix Algebra Theorem 1.9. (Extreme Value Theorem) [9] Assume that 푓 : ℝ푛 → ℝ is continuous and 풳 ⊂ ℝ푛 is compact. Then argmin (푓 + 휄풳 ) = { 푥 ∈ 풳 ∣∣∣∣ 푓(푥) = min풳 푓 } ∕= ∅ (1.10) and argmax (푓 + 휄풳 ) = { 푥 ∈ 풳 ∣∣∣∣ 푓(푥) = max풳 푓 } ∕= ∅. (1.11) Building on the vector space ℝ푛, the next section focuses on real matri- ces; that is, the set of all 푝× 푞 matrix with entries 푎푖푗 ∈ ℝ. Given the vector space ℝ푛, the Euclidean norm, and the theories of linear algebra presented in the next section, we can define many optimization problems of interest. 1.3 Matrix Algebra For any matrix 퐴, we denote its 푖, 푗푡ℎ element by 퐴푖,푗 . The transpose of 퐴, denoted 퐴푇 is defined by 퐴푇푖,푗 = 퐴푗,푖. For an 푛 × 푛 matrix, we write 퐴 ∈ ℝ푛×푛. Definition 1.10. Let 퐴 be a square matrix. We say that 퐴 is symmetric if 퐴 = 퐴푇 . (1.12) The space of all 푛× 푛 symmetric matrices is denoted 풮푛. Definition 1.11. The 푛 × 푛 identity matrix is a diagonal matrix whose diagonal elements are equal to 1, and is denoted 퐼 or sometimes 퐼푛. That is, 퐼푗,푗 = 1 for 푗 = 1, . . . , 푛 and 퐼푖,푗 = 0 for all 푖 ∕= 푗. Definition 1.12. The trace of an 푛 × 푛 square matrix 퐴 is defined to be the sum of the elements on the main diagonal. That is, trace(퐴) = tr(퐴) = 푛∑ 푖=1 퐴푖,푖. (1.13) Definition 1.13. For 퐴,퐵 ∈ 풮푛 we define the inner product by ⟨퐴,퐵⟩ = tr(퐴퐵). (1.14) 4 1.3. Matrix Algebra Fact 1.14. ([24] p.29 and p.45) For any 퐴,퐵 ∈ 풮푛 we have tr(퐴퐵) = tr(퐵퐴) (1.15) and tr(퐴+퐵) = tr(퐴) + tr(퐵). (1.16) Definition 1.15. A matrix 퐴 ∈ 풮푛 is said to be positive definite if 푥푇퐴푥 > 0 for all 푥 ∈ ℝ푛∖ {0} (1.17) and we write 퐴 ≻ 0. The matrix 퐴 ∈ 풮푛 is negative definite if 푥푇퐴푥 < 0 for all 푥 ∈ ℝ푛∖ {0} (1.18) and we write 퐴 ≺ 0. Definition 1.16. A matrix 퐴 ∈ 풮푛 is said to be positive semidefinite if 푥푇퐴푥 ≥ 0 for all 푥 ∈ ℝ푛 (1.19) and we write 퐴 ર 0. The matrix 퐴 ∈ 풮푛 is negative semidefinite if 푥푇퐴푥 ≤ 0 for all 푥 ∈ ℝ푛 (1.20) and we write 퐴 ⪯ 0. Definition 1.17. The set of a positive semidefinite symmetric matrices is denoted 풮푛+. Definition 1.18. Let 퐴 be any 푛×푛 square matrix. A principal submatrix of 퐴 is any 푚 × 푚 submatrix of 퐴 obtained by deleting 푛 − 푚 rows and corresponding columns. Example 1.19. Let 퐴 = ⎡⎢⎢⎣ 푎 푏 푐 푑 푒 푓 푔 ℎ 푖 푗 푘 푙 푚 푛 표 푝 ⎤⎥⎥⎦. One example of a principal submatrix of 퐴 is obtained by deleting the second and third columns and rows: [ 푎 푑 푚 푝 ] . (1.21) Another principal submatrix might delete just the fourth column and row, or the first and third, etc. 5 1.3. Matrix Algebra Proposition 1.20. If 퐴 ∈ 풮푛 is positive semidefinite, then every principal submatrix of 퐴 is symmetric positive semidefinite. Proof. Let 퐴푝 be any principal submatrix of 퐴. First, note that since we have deleted corresponding rows and columns of 퐴, a symmetric matrix, the submatrix 퐴푝 must be symmetric. Now, let 푥푝 be any nonzero vector of size 푝. “Enlarge” 푥푝 to a vector 푥 of size 푛 by inserting zeros into the positions corresponding to the rows (and columns) of 퐴 which were deleted to form 퐴푝. Then we have that 푥푇푝퐴푝푥푝 = 푥 푇퐴푥 ≥ 0. (1.22) Hence, 퐴푝 is positive semidefinite. Corollary 1.21. Let 퐴 ∈ 풮푛 be positive semidefinite. Then for each 푖 = 1, . . . , 푛− 1 we have [ 퐴푖,푖 퐴푖,푛 퐴푖,푛 퐴푛,푛 ] ર 0. (1.23) This follows directly from Proposition 1.20. Definition 1.22. The eigenvalues of an 푛 × 푛 matrix 퐴 are the values 휆1, . . . , 휆푛 (not necessarily distinct) such that det(퐴− 휆퐼) = 0. Fact 1.23. ([24] p.278) The determinant and trace of an 푛 × 푛 matrix 퐴 can be expressed in terms of the eigenvalues of 퐴 by: det(퐴) = 푛∏ 푖=1 휆푖 tr(퐴) = 푛∑ 푖=1 휆푖. (1.24) Fact 1.24. ([24] p.309) An 푛 × 푛 matrix 퐴 is positive semidefinite if and only if all of the eigenvalues of 퐴 are nonnegative. Proposition 1.25. If 퐴 is a 2 × 2 symmetric matrix, then 퐴 is positive semidefinite if and only if det(퐴) ≥ 0 and trace(퐴) ≥ 0. Proof. Suppose first that 퐴 is positive semidefinite. Then by Fact 1.24 we have 휆1, 휆2 ≥ 0. Then by Fact 1.23 we have det(퐴) = 2∏ 푖=1 휆푖 = 휆1휆2 ≥ 0 (1.25) and trace(퐴) = 2∑ 푖=1 휆푖 = 휆1 + 휆2 ≥ 0 (1.26) 6 1.4. Convex Analysis as required. Now, suppose that det(퐴) ≥ 0 and trace(퐴) ≥ 0. By Fact 1.23 we have that det(퐴) = 2∏ 푖=1 휆푖 = 휆1휆2 ≥ 0 (1.27) so that 휆1 and 휆2 must have the same parity. Furthermore, by Fact 1.23 we also have trace(퐴) = 2∑ 푖=1 휆푖 = 휆1 + 휆2 ≥ 0 (1.28) and thus we can conclude that 휆1, 휆2 ≥ 0 and hence 퐴 is positive semidefi- nite. Fact 1.26. (The Cholesky Factorization, [18] page 154.) Let 퐴 ∈ 풮푛 be positive definite. Then 퐴 = 퐿푇퐿, where 퐿 is a nonsingular upper triangular matrix. 1.4 Convex Analysis Recent applications of optimization are focused strongly on finding opti- mal solutions to non-convex problems. The approach to solving non-convex problems however, often involves creating convex relaxation problems which can then be solved by existing optimization methods. The following section provides a brief overview of convex analysis. 1.4.1 Convex Sets and Convex Functions Definition 1.27. A set 퐶 ⊆ ℝ푛 is affine if, for any 푥1, 푥2 ∈ 퐶 and 휃 ∈ ℝ we have 휃푥1 + (1− 휃)푥2 ∈ 퐶. (1.29) That is, an affine set contains the linear combination of any two points in the set, provided the coefficients sum to one. A point of the form 휃1푥1 + ⋅ ⋅ ⋅+ 휃푘푥푘, where 휃1 + ⋅ ⋅ ⋅+ 휃푘 = 1, is called an affine combination of the points 푥1, . . . , 푥푘. Definition 1.28. A set 퐶 ⊆ ℝ푛 is convex if, for any 푥1, 푥2 ∈ 퐶 and 0 ≤ 휃 ≤ 1 we have 휃푥1 + (1− 휃)푥2 ∈ 퐶. (1.30) 7 1.4. Convex Analysis Graphically, a set 퐶 is convex if the line segment between any two points in 퐶 is also contained in 퐶, see Figure 1.1. Every affine set is convex, since if a set contains the line through any two points, it must obviously contain the line between those two points. Given 휃푖 ≥ 0 with 휃1 + ⋅ ⋅ ⋅+ 휃푘 = 1 and points 푥1, . . . , 푥푘 ∈ 퐶, we say that the point 휃1푥1 + ⋅ ⋅ ⋅ + 휃푘푥푘 is a convex combination of the points 푥1, . . . , 푥푘. Figure 1.1: Examples of convex sets. Figure 1.2: Examples of nonconvex sets. Definition 1.29. The convex hull of a set 퐶 is the set of all convex combi- nations of points in 퐶: conv (퐶) = { 푘∑ 푖=1 휃푖푥푖 ∣∣∣∣∣ 푥푖 ∈ 퐶, 휃푖 ≥ 0, 푖 = 1, . . . , 푘, 푘∑ 푖=1 휃푖 = 1, 푘 ≥ 1 } . (1.31) The convex hull of the set 퐶 is the smallest convex set that contains 퐶. That is, if 퐷 is any convex set such that 퐶 ⊆ 퐷 then conv (퐶) ⊆ 퐷. 8 1.4. Convex Analysis Figure 1.3: The convex hull of some nonconvex sets. Fact 1.30. [22] If 퐶 ⊆ ℝ푛 is compact, i.e. closed and bounded, then conv(퐶) is compact. Corollary 1.31. Given 푥1, 푥2, . . . , 푥푘 in ℝ푛, conv {푥1, . . . , 푥푘} is compact. Definition 1.32. A set 퐾 ⊆ ℝ푛 is called a cone if, for any 푥 ∈ 퐾 and 휃 ≥ 0 we have 휃푥 ∈ 퐾. A set 퐾 is a convex cone if it is convex and a cone; that is, for any 푥1, 푥2 ∈ 퐾 and 휃1, 휃2 ≥ 0 we have 휃1푥1 + 휃2푥2 ∈ 퐾. (1.32) Example 1.33. − ℝ+ = {푥 ∣ 푥 ≥ 0} is a convex cone in ℝ. − ℝ푛+ = {(푥1, . . . , 푥푛) ∣ 푥1 ≥ 0, . . . , 푥푛 ≥ 0} is a convex cone in ℝ푛. − 풮푛+ = {퐴 ∣ 퐴 is an 푛× 푛 matrix, 퐴 is positive semidefinite} is a con- vex cone. A point of the form 휃1푥1 + ⋅ ⋅ ⋅ + 휃푘푥푘, where 휃푖 ≥ 0 for 푖 = 1, . . . , 푘 is called a conic combination of the points 푥1, . . . , 푥푘. Definition 1.34. Let 풦 ⊆ ℝ푛 be a cone. The dual cone of 풦, denoted 풦∗ is the set 풦∗ = {푦 ∈ ℝ푛 ∣∣푦푇푥 ≥ 0 ∀푥 ∈ 풦} (1.33) Definition 1.35. A hyperplane is a set of the form{ 푥 ∈ ℝ푛 ∣∣푎푇푥 = 푏} (1.34) where 푎 ∈ ℝ푛, 푎 ∕= 0 and 푏 ∈ ℝ. 9 1.4. Convex Analysis Analytically, a hyperplane is the solution set of a nontrivial linear equa- tion. Geometrically, the hyperplane 퐻 = { 푥 ∣∣푎푇푥 = 푏} can be interpreted as the set of points with constant inner product to a given vector 푎; that is, a hyperplane with normal vector 푎. Note that a hyperplane is affine (and thus convex). Definition 1.36. A (closed) halfspace is a set of the form{ 푥 ∈ ℝ푛 ∣∣푎푇푥 ≤ 푏} (1.35) where 푎 ∕= 0. A hyperplane divides ℝ푛 into two halfspaces (see Figure 1.4). Ana- lytically, a halfspace is the solution set of one nontrivial linear inequality. Halfspaces are convex, but not affine. a bxaT ≥ bxaT ≤ Figure 1.4: A hyperplane with normal vector 푎 divides ℝ2 into two half- spaces. 10 1.4. Convex Analysis Some frequently seen convex sets are listed below: − The closed norm ball in ℝ푛 with center 푥푐 and radius 푟 is given by 퐵푟[푥푐] = {푥 ∈ ℝ푛 ∣ ∥푥− 푥푐∥ ≤ 푟} (1.36) where 푟 > 0 and ∥⋅∥ denotes any norm on ℝ푛. − The norm cone associated with any norm ∥⋅∥ is given by {(푥, 푡) ∣ ∥푥∥ ≤ 푡, 푥 ∈ ℝ푛, 푡 ∈ ℝ} ⊆ ℝ푛+1. (1.37) The norm cone is a convex cone. See Figure 1.5 for an example of the norm cone associated with the Euclidean norm. − A polyhedron is the solution set of a finite number of linear equalities and inequalities, and is of the form{ 푥 ∈ ℝ푛 ∣∣푎푇푗 푥 ≤ 푏푗 , 푗 = 1, . . . ,푚, 푐푇푘 푥 ≤ 푑푘, 푘 = 1, . . . , 푝} . (1.38) Graphically, a polyhedron is the intersection of a finite number of halfspaces and hyperplanes. Note that the norm cone associated with the Euclidean norm is referred to as the second-order cone or Lorentz cone. Definition 1.37. The Lorentz cone is the cone in ℝ푛 defined as ℒ = { 푥 ∈ ℝ푛 ∣∣∣∣∣푥2푛 ≥ 푛−1∑ 푖=1 푥2푖 , 푥푛 ≥ 0 } . (1.39) Figure 1.5: The Euclidean norm cone in ℝ3. 11 1.4. Convex Analysis 1a 6a 5a 4a 3a 2a Figure 1.6: A polyhedron. Definition 1.38. A function 푓 : ℝ푛 → (−∞,∞] is said to be a convex function if dom 푓 is a convex set and, for all 푥, 푦 ∈ dom 푓 and 0 < 휃 < 1 we have 푓 (휃푥+ (1− 휃) 푦) ≤ 휃푓(푥) + (1− 휃)푓(푦). (1.40) Graphically, a function 푓 is convex if the line segment joining (푥, 푓(푥)) and (푦, 푓(푦)) lies above the graph of 푓 , (see Figure 1.7). A function 푓 is strictly convex if strict inequality holds in (1.40) whenever 푥 ∕= 푦 and 0 < 휃 < 1. We say that the function f is concave if −푓 is convex, and strictly concave if −푓 is strictly convex. (x, f (x)) ( y, f (y)) Figure 1.7: A convex function. Remark 1.39. A function 푓 is convex if and only if epi (푓) is a convex set. Fact 1.40. Every norm is a convex function on ℝ푛. 12 1.4. Convex Analysis Proof. Let 0 ≤ 휃 ≤ 1, and consider 푓(휃푥+ (1− 휃)푦) = ∥휃푥+ (1− 휃)푦∥ (1.41) ≤ ∥휃푥∥+ ∥(1− 휃)푦∥ (1.42) = 휃 ∥푥∥+ (1− 휃) ∥푦∥ (1.43) = 휃푓(푥) + (1− 휃)푓(푦). (1.44) Hence, 푓 is convex. Alternatively, simply note that the epigraph of the function 푓(푥) = ∥푥∥ is given by epi (푓) = {(푥, 푡) ∣ ∥푥∥ ≤ 푡} (1.45) which is a convex set in ℝ푛+1 (the norm cone), and so by Remark 1.39 푓 is a convex function. Definition 1.41. Let 푓 : ℝ푛 → ℝ푚 and 푥 ∈ int dom 푓 . If 푓 is differentiable at 푥, its Jacobian matrix ∇푓(푥) ∈ ℝ푚×푛 is given by ∇푓(푥)푖푗 = ∂푓푖(푥) ∂푥푗 (1.46) for all 푖 = 1, . . . ,푚, 푗 = 1, . . . , 푛. Definition 1.42. For a real-valued function 푓 : ℝ푛 → ℝ, if 푓 is twice differentiable at 푥, its Hessian matrix ∇2푓(푥) ∈ ℝ푛×푛 is given by ∇2푓(푥)푖푗 = ∂ 2푓 ∂푥푖∂푥푗 (1.47) for all 푖, 푗 = 1, . . . , 푛. Convexity of a differentiable function can also be established using the following conditions: Fact 1.43. (First order convexity condition) Suppose 푓 is differentiable on an open set 푂 ⊆ ℝ푛. Then 푓 is convex on 푂 if and only if 푂 is convex and 푓(푦) ≥ 푓(푥) +∇푓(푥)푇 (푦 − 푥) (1.48) for all 푥, 푦 ∈ 푂 ([22] page 46). 13 1.4. Convex Analysis Fact 1.44. (Second order convexity condition) Assume 푓 is twice differentiable on an open set 푂 ⊆ ℝ푛. Then 푓 is convex on 푂 if and only if 푂 is convex and ∇2푓 is positive semidefinite; i.e., ∇2푓(푥) ર 0 (1.49) for every 푥 ∈ 푂 ([22] page 46). Fact 1.45. The function 푓푖(푥) = 푤푖 ∥∥푥− 푥푖∥∥2 (1.50) is a convex function on ℝ푛, where 푤푖 > 0 ∈ ℝ and 푥푖 ∈ ℝ푛 is fixed. This follows from the second order convexity condition, and the fact that ∇2푓푖(푥) = 2푤푖퐼 ર 0. Definition 1.46. A vector 푣 ∈ ℝ푛 is said to be a subgradient of the convex function 푓 at the point 푥 if 푓(푦) ≥ 푓(푥) + ⟨푣, 푦 − 푥⟩ , ∀푦 ∈ dom 푓. (1.51) The set of all subgradients of 푓 is called the subdifferential of 푓 at 푥, and is denoted ∂푓(푥). A subdifferential mapping, being a surrogate for derivative, is fundamen- tally important in convex analysis and optimization. Definition 1.47. A vector 푥∗ is said to be normal to the convex set 퐶 at the point 푥 ∈ 퐶 if ⟨푦 − 푥, 푥∗⟩ ≤ 0 ∀푦 ∈ 퐶. (1.52) The set of all vectors 푥∗ that are normal to 퐶 at the point 푥 is called the Normal Cone to 퐶 at 푥, and is denoted 푁퐶(푥). If 푥 /∈ 퐶, then 푁퐶(푥) = ∅. Recall that for 퐶 ⊆ ℝ푛, 푥 is an interior point of 퐶 if there exists 휖 > 0 such that 퐵휖(푥) ⊆ 퐶. The set of interior points of 퐶 will be denoted by int(퐶). Example 1.48. Let 퐶 ⊆ ℝ푛. Immediately from the definition of ∂휄퐶 , we have ∂휄퐶 = 푁퐶 . Moreover, if 푥 ∈ int(퐶), then ∂휄퐶(푥) = {0}, i.e. 푁퐶(푥) = {0} . 14 1.4. Convex Analysis Definition 1.49. Let 푓 : ℝ푛 → [−∞,∞], and let 푥 be a point such that ∣푓(푥)∣ <∞. The one-sided directional derivative of 푓 at 푥 with respect to a vector 푦 is defined to be the limit 푓 ′(푥; 푦) := lim 휆↓0 푓(푥+ 휆푦)− 푓(푥) 휆 (1.53) if it exists (here ±∞ are allowed as limits). Definition 1.50. Let 푓 : ℝ푛 → [−∞,∞], and let 푥 be a point such that ∣푓(푥)∣ <∞. We say that 푓 is differentiable at 푥 if and only if there exists a vector 푥∗ with the property lim 푧→푥 푓(푧)− 푓(푥)− ⟨푥∗, 푧 − 푥⟩ ∥푧 − 푥∥ = 0. (1.54) Such an 푥∗, if it exists, is called the gradient of 푓 at 푥 and is denoted ∇푓(푥). Fact 1.51. If 푓 is differentiable at 푥, then 푓 ′(푥; 푦) exists and is a linear function of 푦 for all 푦. In particular, 푓 ′(푥; 푦) = ⟨∇푓(푥), 푦⟩ ∀푦 ∈ ℝ푛. (1.55) Fact 1.52. [2, Proposition 2.3.2] Let 푥̄ ∈ int(퐶), 퐶 ⊂ ℝ푛. Suppose that continuous functions 푔0, 푔1, . . . , 푔푚 : 퐶 → ℝ are differentiable at 푥̄, that 푔 is the max function 푔(푥) = max 푖=0,...,푚 푔푖(푥) (1.56) and define the index set 퐼 = {푖 ∣푔푖(푥̄) = 푔(푥̄)} . Then for all directions 푑 in ℝ푛, the directional derivative of 푔 is given by 푔′(푥̄; 푑) = max 푖∈퐼 {⟨∇푔푖(푥̄), 푑⟩} . (1.57) It turns out that the directional derivative of a convex function can be used to compute its subdifferential. Fact 1.53. [19, Theorem 23.2] Let 푓 : ℝ푛 → (−∞,+∞] be convex and 푥 ∈ dom 푓 . Then 푓 ′(푥; 푦) exists for every 푦 ∈ ℝ푛. Moreover, ∂푓(푥) = {푣 ∈ ℝ푛∣ 푓 ′(푥; 푦) ≥ ⟨푣, 푦⟩ ∀푦 ∈ ℝ푛}. 15 1.5. Nonsmooth Analysis 1.4.2 Minimizers or Maximizers of Convex Functions Theorem 1.54. [19, page 264] For a convex function 푓 , we have that 푥 ∈ argmin푓 ⇔ 0 ∈ ∂푓(푥). (1.58) Remark 1.55. We say that 푥 ∈ 퐷 is an extremal point of 퐷 if 푥 = 휆푦 + (1− 휆)푧 (1.59) with 0 < 휆 < 1, 푦, 푧 ∈ 퐷 implies that 푦 = 푧. Theorem 1.56. [22, Theorem 32.3] Let 푓 : ℝ푛 → ℝ be convex, and 퐾 ⊆ ℝ푛 be compact and convex. Then max 푥∈퐾 푓(푥) = max 푥∈ext(퐾) 푓(푥) (1.60) where ext(퐾) denotes the extremal points of 퐾. 1.5 Nonsmooth Analysis Since maximin problems are nonconvex in general, in Subsection 1.5.1 and 1.5.2 below we introduce the concept of a Lipschitz function and nons- mooth analysis. They are used in Section 3.2 to derive necessary optimality conditions for the solutions of nonconvex maximin optimization problem. 1.5.1 Locally Lipschitz Functions Definition 1.57. Given a function 푓 : ℝ푛 → ℝ, we say that 푓 is locally Lipschitz continuous at a point 푥0 if there exists 훿 > 0 and Λ > 0 such that whenever ∥푥− 푥0∥ < 훿 and ∥푦 − 푥0∥ < 훿 we have ∣푓(푥)− 푓(푦)∣ ≤ Λ ∥푥− 푦∥ . (1.61) Fact 1.58. If 푓 is locally Lipschitz at 푥0, then −푓 is locally Lipschitz at 푥0. The above fact follows directly from ∣−푓(푥)− (−푓(푦))∣ = ∣−(푓(푥)− 푓(푦))∣ = ∣푓(푥)− 푓(푦)∣ . (1.62) Fact 1.59. If 푓1, 푓2 are locally Lipschitz at 푥0, then max {푓1, 푓2} (1.63) and min {푓1, 푓2} (1.64) are locally Lipschitz at 푥0. 16 1.5. Nonsmooth Analysis Proof. Since 푓1 is locally Lipschitz at 푥0, there exists 훿1 > 0 and Λ1 > 0 such that ∣푓1(푥)− 푓1(푦)∣ ≤ Λ1 ∥푥− 푦∥ (1.65) whenever ∥푥− 푥0∥ < 훿1 and ∥푦 − 푥0∥ < 훿1. Similarly, there exists 훿2 > 0 and Λ2 > 0 such that ∣푓2(푥)− 푓2(푦)∣ ≤ Λ2 ∥푥− 푦∥ (1.66) whenever ∥푥− 푥0∥ < 훿2 and ∥푦 − 푥0∥ < 훿2. Set 훿 = min {훿1, 훿2}, and Λ = max {Λ1,Λ2} so that whenever ∥푥− 푥0∥ < 훿 and ∥푦 − 푥0∥ < 훿 we have − Λ ∥푥− 푦∥ ≤ 푓1(푥)− 푓1(푦) ≤ Λ ∥푥− 푦∥ (1.67) −Λ ∥푥− 푦∥ ≤ 푓2(푥)− 푓2(푦) ≤ Λ ∥푥− 푦∥ . (1.68) From this we have 푓1(푥)−max {푓1(푦), 푓2(푦)} ≤ Λ ∥푥− 푦∥ (1.69) 푓2(푥)−max {푓1(푦), 푓2(푦)} ≤ Λ ∥푥− 푦∥ . (1.70) Combining the above two equations we have max {푓1(푥), 푓2(푥)} −max {푓1(푦), 푓2(푦)} = max {푓1(푥)−max {푓1(푦), 푓2(푦)}, 푓2(푥)−max {푓1(푦), 푓2(푦)}} ≤ Λ ∥푥− 푦∥ . (1.71) Similarly, − Λ ∥푥− 푦∥ ≤ 푓1(푥)− 푓1(푦) ⇒ 푓1(푦)− 푓1(푥) ≤ Λ ∥푥− 푦∥ (1.72) −Λ ∥푥− 푦∥ ≤ 푓2(푥)− 푓2(푦) ⇒ 푓2(푦)− 푓2(푥) ≤ Λ ∥푥− 푦∥ (1.73) (1.74) so that 푓1(푦)−max {푓1(푥), 푓2(푥)} ≤ Λ ∥푥− 푦∥ (1.75) 푓2(푦)−max {푓1(푥), 푓2(푥)} ≤ Λ ∥푥− 푦∥ (1.76) and thus max {푓1(푦), 푓2(푦)} −max {푓1(푥), 푓2(푥)} ≤ Λ ∥푥− 푦∥ ⇒ max {푓1(푥), 푓2(푥)} −max {푓1(푦), 푓2(푦)} ≥ −Λ ∥푥− 푦∥ . (1.77) 17 1.5. Nonsmooth Analysis Hence, whenever ∥푥− 푥0∥ < 훿 and ∥푦 − 푥0∥ < 훿 we have that ∣max {푓1(푦), 푓2(푦)} −max {푓1(푥), 푓2(푥)}∣ ≤ Λ ∥푥− 푦∥ (1.78) which shows that max {푓1, 푓2} is locally Lipschitz at 푥0. Now, min {푓1, 푓2} = −max {−푓1,−푓2}. (1.79) Using Fact 1.58 we know that −푓1,−푓2 are both locally Lipschitz at 푥0. Then max {−푓1,−푓2} is locally Lipschitz at 푥0 by the above proof. Applying Fact 1.58 once more, we have that −max {−푓1,−푓2} and thus min {푓1, 푓2} is locally Lipschitz at 푥0. Remark 1.60. Fact 1.59 can also be seen by using max{푓1, 푓2} = 푓1 + 푓2 2 + ∣푓1 − 푓2∣ 2 , min{푓1, 푓2} = 푓1 + 푓2 2 − ∣푓1 − 푓2∣ 2 . Fact 1.61. Assume that 푓푖 is locally Lipschitz at 푥0 for 푖 = 1, . . . ,푚. Then max {푓1, . . . , 푓푚} (1.80) and min {푓1, . . . , 푓푚} (1.81) are locally Lipschitz at 푥0. Proof. This follows directly from applying induction to the results of Fact 1.59. 1.5.2 Subdifferentials and Critical Points Let 푂 ⊆ ℝ푛 be an open set. For a locally Lipschitz function 푓 : 푂 → ℝ, and 푥 ∈ 푂, we define its Clarke directional derivative with respect to 푣 ∈ ℝ푛 by 푓표(푥; 푣) = lim sup 푦→푥,푡↓0 푓(푦 + 푡푣)− 푓(푦) 푡 (1.82) and the Clarke subdifferential ∂표푓(푥) = {푥∗ ∈ ℝ푛 ∣ 푓표(푥; 푣) ≥ ⟨푥∗, 푣⟩ ∀푣 ∈ ℝ푛} . (1.83) The Clarke subdifferential plays a fundamental role in optimization problems involving nonconvex functions. Some key properties are: 18 1.6. Some Convex Optimization Problems Theorem 1.62. Let 푓, 푔 : 푂 ⊂ ℝ푛 → ℝ be locally Lipschitz and 푥 ∈ 푂. Then 1. ∂표푓(푥) is compact and convex ([7] page 40). 2. ∂표(푓 + 푔)(푥) ⊆ ∂표푓(푥) + ∂표푔(푥) ([7] Proposition 2.3.3). 3. ∂표(−푓)(푥) = −∂표푓(푥) ([7] Proposition 2.3.1). 4. If 푓 is a convex function on 푂, then ∂표푓 = ∂푓 ([7] Proposition 2.2.7). 5. If 푓 is continuously differentiable on 푂, then ∂표푓(푥) = {∇푓(푥)} ([7] Proposition 2.2.7). Theorem 1.63. [23, Proposition 7.4.7] If 푓 is locally Lipschitz about 푥̄ and 푥̄ is a local minimizer or a local maximizer of 푓 , then 0 ∈ ∂표푓(푥̄). Theorem 1.64. [23, Proposition 7.4.7] Let 퐼 be a finite set, and for all 푖 ∈ 퐼 let 푔푖 : ℝ푛 → ℝ be locally Lipschitz continuous around 푥̄. Let 푔 be the max-function defined as 푔(푥) = max {푔푖(푥) ∣ 푖 ∈ 퐼 } (1.84) and define the active index set at 푥̄ by 퐼(푥̄) = {푖 ∈ 퐼 ∣ 푔푖(푥̄) = 푔(푥̄)} . (1.85) Suppose that 푔푖 is differentiable at 푥̄. Then the Clarke subdifferential of 푔 at 푥̄ is given by ∂표푔(푥̄) = conv {∇푔푖(푥̄) ∣ 푖 ∈ 퐼(푥̄)} . (1.86) 1.6 Some Convex Optimization Problems In this section we give the general and standard forms of a semidefinite programming optimization problem (SDP) and a second-order cone pro- gramming problem (SOCP), which we will use in section 2.2. For a convex cone 퐾 ⊂ ℝ푝 and 푥 ∈ ℝ푝, we write 푥 ર퐾 0 if 푥 ∈ 퐾. Definition 1.65. ([4]) A conic form program (or conic program) is an op- timization problem which has a linear objective function and one inequality constraint function min 푐푇푥 (1.87) s.t. 퐹푥+퐺 ર퐾 0 퐴푥 = 푏 19 1.6. Some Convex Optimization Problems where 푐, 푥 ∈ ℝ푝, 퐴 ∈ ℝ푚×푝, 푏 ∈ ℝ푚, 퐹 ∈ 풮푝, 퐺 ∈ ℝ푝 and 퐾 is a convex cone in ℝ푝. Definition 1.66. When 퐾 is 풮푛+, the cone of positive semidefinite 푛 × 푛 matrices, the associated conic form problem is called a semidefinite program, and has the form min 푐푇푥 (1.88) s.t. 푥1퐹1 + ⋅ ⋅ ⋅+ 푥푚퐹푚 +퐺 ર 0 퐴푥 = 푏 where 푐, 푥 ∈ ℝ푚, 퐺,퐹1, . . . , 퐹푚 ∈ 풮푛 and 퐴 ∈ ℝ푝×푚. A standard SDP in inequality form is min 푐푇푥 (1.89) s.t. 푥1퐴1 + ⋅ ⋅ ⋅+ 푥푚퐴푚 ⪯ 퐵 where 퐴1, . . . , 퐴푚, 퐵 ∈ 풮푛, 푐, 푥 ∈ ℝ푚. A standard SDP is in the following form: min ⟨퐶,푋⟩ (1.90) s.t. ⟨퐴푖, 푋⟩ = 푏푖 푖 = 1, . . . ,푚 푋 ર 0 where 퐶,푋,퐴푖 ∈ 풮푛 and 푏푖 ∈ ℝ. One can show that (1.90) has a dual given by max 푏푇푥 (1.91) s.t. 푥1퐴1 + ⋅ ⋅ ⋅+ 푥푚퐴푚 ⪯ 퐶 which is in the form of (1.89), see [4, Example 5.11], [29]. Definition 1.67. ([4]) The second-order cone program (SOCP) is of the form min 푓푇푥 (1.92) s.t. ∥퐴푖푥+ 푏푖∥ ≤ 푐푇푖 푥+ 푑푖, 푖 = 1, . . . ,푚 퐹푥 = 푔 where 푓, 푥 ∈ ℝ푛, 퐴푖 ∈ ℝ푛×푛 and 퐹 ∈ ℝ푝×푛. 20 1.7. NP-Complete Problems Constraints of the form ∥퐴푥+ 푏∥ ≤ 푐푇푥+푑 are called second-order cone constraints because the affinely defined variables 푢 = 퐴푥+푏 and 푡 = 푐푇푥+푑 must belong to the second order cone. Note that SOCP is a special case of SDP by using ∥퐴푖푥+ 푏푖∥ ≤ 푐푇푖 푥+ 푑푖 ⇔ ( (푐푇푖 푥+ 푑)퐼 퐴푖푥+ 푏푖 (퐴푖푥+ 푏푖) ⊺ (푐푇푖 푥+ 푑) ) ર 0. Further readings on SDP and SOCP may also be found in [32], [27], [29]. 1.7 NP-Complete Problems In this section we will give a brief overview to the concept of complexity theory and NP-completeness. In order to begin discussing complexity of problems however, we first need a few basic definitions. Definition 1.68. [12] A problem is a general question to be answered, usu- ally containing several parameters with values left unspecified. A problem is described by giving a general description of all of its parameters, and a statement of what properties the answer (or solution) is required to satisfy. An instance of a problem is obtained by specifying particular values for all the problem parameters. Definition 1.69. [12] An algorithm is a general, step-by-step procedure for solving a problem. For concreteness, we often think of algorithms as being computer pro- grams written in a precise computer language. We say that an algorithm solves a problem Π if that algorithm can be applied to any instance 퐼 of Π and be guaranteed to provide a solution for that instance. Definition 1.70. [12] The input length for an instance 퐼 of a problem Π is defined to be the number of symbols in the description of 퐼 obtained from the encoding scheme for Π. Definition 1.71. [12] The time complexity function for an algorithm ex- presses its time requirement by giving, for each possible input length, the largest amount of time needed by the algorithm to solve a problem instance of that size. Note that by this definition, this function is not well-defined until the encoding scheme to be used for determining input length and the computer model for a particular problem are fixed. However, the particular choices made for these have little effect on the broad distinctions made in the theory of NP-completeness ([12] p.6). 21 1.7. NP-Complete Problems 1.7.1 The Theory of NP-Completeness In this section we discuss measures for the complexity of algorithms, and introduce the concept of NP-complete problems. Definition 1.72. [12] We say that a function 푓(푛) is 푂(푔(푛)) whenever there exists a constant 푐 such that ∣푓(푛)∣ ≤ 푐 ⋅ ∣푔(푛)∣ (1.93) for all values of 푛 ≥ 0. Definition 1.73. [12] A polynomial time algorithm is defined to be one whose complexity function is 푂(푝(푛)) for some polynomial function 푝, where 푛 is used to denote the input length. Any algorithm whose time complexity function can not be so bounded is called an exponential time algorithm. Note that this definition includes certain non-polynomial time complex- ity functions (like 푛log푛) which are not normally regarded as exponential functions. Definition 1.74. [12] A problem is called intractable if no polynomial time algorithm can possibly solve it. The idea of polynomial time reducibility is integral to the theory of NP- completeness, and refers to reductions for which the transformation can be executed by a polynomial time algorithm. This is significant, since if we have a polynomial time reduction from one problem to another, then this ensures that any polynomial time algorithm for the second problem can be converted into a corresponding polynomial time algorithm for the first problem. The simplest way to explain the idea of NP-completeness is as follows. First, define two classes of problems: 1. The class of problems with solution algorithms bounded by a polyno- mial of fixed degree, P [8]. 2. The class of problems with solutions that are verifiable in polynomial time, NP. That is, given a solution 푆 to problem Π, a polynomial time algorithm exists to verify that 푆 is in fact a solution of Π [31]. Fact 1.75. [8] P ⊆ NP. 22 1.7. NP-Complete Problems One of the most important open questions in theoretical computer sci- ences is whether or not P=NP. The notation NP stands for “non-deterministic polynomial time”, since the original definition of the class NP is the class of problems which can be solved by a non-deterministic Turing machine in polynomial time [31]. A problem 푃푖 is NP-hard if every problem in NP is polynomially re- ducible to 푃푖. One can say that 푃푖 is NP-hard⇔ If 푃푖 is solved in a polynomial time, then 푃 = 푁푃. A problem 푃푖 is NP-complete if it is both NP-hard and an element of NP. In essence, the NP-complete problems are the “hardest” problems in NP [12]. The question of whether NP-complete problems are intractable is still a major open problem in mathematics and computer science (See [31] and references therein). How to show that a problem 푃푖 is NP-hard? One often uses a reduction argument: 1. Confirm that 푃푖 is a decision problem; 2. Take one problem 푃푗 that is already known to be NP-hard or NP- complete; 3. Show that the problem 푃푗 is polynomially reducible to 푃푖; 4. This shows that 푃푖 is NP-hard. This is what we will use in the thesis. 1.7.2 Some NP-Hard Problems Based on the previous section, we see that the simplest way to prove that a problem is NP-complete is to reduce it in polynomial time to another problem which is known to be NP-complete (or NP-hard). In order for this to be possible, we require a compendium of problems which are already known to be NP-complete (or NP-hard). There are many books and papers including lists of problems that have been proven to be NP-complete or NP- hard (for example, see [12], [17], [6]). Here, we provide two examples of NP-hard problems that are integral to the work of this thesis. Fact 1.76. [6, MP1] It is NP-hard to determine the solvability of the integer linear program 퐵푦 = 푏, 푦 ∈ {−1, 1}푞 , (1.94) with 퐵 ∈ ℤ푝×푞, 푏 ∈ ℤ푝. 23 1.7. NP-Complete Problems Given an undirected graph 퐺 = (푉,퐸) with weights 푤푖푗 = 푤푗푖 = 1 on the edges (푖, 푗) ∈ 퐸, the Max Cut Problem is that of finding the set of vertices 푆 ⊆ 푉 that maximizes the weight of the edges in the cut (푆, 푆). Put 푤푖푗 = 0 if (푖, 푗) ∕∈ 퐸. The weight of a cut (푆, 푆) is 푤(푆, 푆) = ∑ 푖∈푆,푗 ∕∈푆 푤푖푗 . The Max Cut problem is to max 푆⊂푉 푤(푆, 푆). Assume that the graph has 푞 nodes, i.e., ∣푉 ∣ = 푞. Define two 푞×푞 symmetric matrices 퐴 = (푎푖푗) and 푊 respectively by setting 푎푖푗 = 푎푗푖 = 푤푖푗 , and 푊 = Diag(퐴푒)−퐴 ∈ ℤ푞×푞 where 푒 = (1, . . . , 1)⊺ ∈ ℝ푞. Definition 1.77. (See [21, page 309], [27], [14]) The MaxCut problem on a graph with 푞 nodes may be formulated as max 푦∈{−1,1}푞 푦푇푊푦, (1.95) where 푊 ∈ ℤ푞×푞 and 푊 ર 0. The Max Cut problem has long been known to be NP complete. Fact 1.78. ([14, page.1116], [21]) The Maxcut problem (1.95) is NP-hard. For further details on complexity theory and NP-hard problems, see [12]. 1.7.3 The Complexity of Algorithms for Solving SDP and SOCP To some extent, SDP is very similar to linear programming. Given any 휖 > 0, SDPs can be solved within an additive error of 휖 in polynomial time (휖 is part of the input, so the running time dependence on 휖 is polynomial in log(1/휖)), [14]. This is can be done through ellipsoid algorithm, polynomial algorithms for convex programming, as well as interior point methods. See [14], [15], [1]. If the feasible region is bounded, then SOCPs can also be solved to arbitrary fixed precision in polynomial time, [13]. For the current state of the art in SDP, see [32]. Currently, available software packages solving SDPs and SOCPs include SeDuMi (self-dual minimization) [25], CVX (a modeling system for disci- plined convex programming) [11] and SDPT3. 24 Chapter 2 The Weighted Maximin Location Problem 2.1 General Problem and Motivation Definition 2.1. The problem of finding a point 푥 in a closed set 풳 ⊆ ℝ푛 (푛 ≥ 1) that is furthest from a given set of points 푥1, . . . , 푥푚 in ℝ푛 in a weighted maximin sense is given by 휐푝 := max 푥∈풳 푓 (푥) with 푓 (푥) := min 푖=1,...,푚 푤푖 ∥∥푥− 푥푖∥∥2 , (2.1) where 푤푖 > 0. This is known as the weighted maximin dispersion prob- lem. In general ∥⋅∥ denotes the Euclidean norm; in the case that we consider another norm we will make the distinction explicit. In the equal weight case of 푤1 = ⋅ ⋅ ⋅ = 푤푚, (2.1) has the geometric in- terpretation of finding the largest Euclidean sphere with center in 풳 and en- closing no given point. To illustrate, let 퐵훿(푥) = {푦 ∣ ∥푦 − 푥∥ < 훿, 푦 ∈ ℝ푛} , and 푓(푥) = min푖=1,...,푚 ∥∥푥− 푥푖∥∥2. Example 2.2. Set 푟 = √ 푓(푥). Then 퐵푟(푥) ∩ { 푥1, . . . , 푥푚 } = ∅. Proof. Indeed, 푟2 ≤ ∥∥푥− 푥푖∥∥2 ∀푖 = 1, . . . ,푚 by definition. If 푥푖0 ∈ 퐵푟(푥) then ∥∥푥− 푥푖0∥∥ < 푟, so ∥∥푥− 푥푖0∥∥2 < 푟2. Thus we have 푟2 ≤ ∥∥푥− 푥푖0∥∥2 < 푟2 (2.2) which is a contradiction. Hence, 퐵푟(푥) ∩ { 푥1, . . . , 푥푚 } = ∅. 25 2.1. General Problem and Motivation Our interest in the minimax distance problem arises from deterministic function modeling in statistics [16]. We begin with a deterministic computer code which models a complex physical system. It is assumed that the code is very costly to run, and so our goal is to build a computationally cheap surrogate of the computer code. The statistical approach to this problem is to place a prior belief on the function space, 풳 , and then use the weighted maximin distance criteria to choose a set of sites in 풳 at which to run the computer code, so that we may model the resulting deterministic output as a stochastic process (see [26]). In general, the weighted maximin problem also has many diverse appli- cations in facility location, spatial management and pattern recognition (see [10], [30] and references therein). One applied example in facility location would be the determining of the location for a new and highly polluting in- dustry within some region 풳 . Suppose that there are 푚 cities within region 풳 , represented by the points 푥1, . . . , 푥푚. One criteria in choosing a location for the facility could be that the amount of pollution reaching any city be minimized. Then the optimization problem (2.1) may be used in choosing the location of the facility, provided we can assume that the amount of pol- lutant reaching a city is monotonically decreasing function of the distance between the city and the facility. For ease of notation throughout this thesis we define 푓푖(푥) = 푤푖 ∥∥푥− 푥푖∥∥2 (2.3) where 푤푖 > 0 is a fixed constant and 푥 푖 ∈ ℝ푛. Thus, the maximin dispersion problem becomes max 푥∈풳 푓(푥) (2.4) where, for all 푥 ∈ ℝ푛, 푖 ∈ {1, . . . ,푚} 푓(푥) = min 1≤푖≤푚 푓푖(푥). (2.5) In particular, we will focus on the case that 풳 in (2.1) is convex under componentwise squaring: 풳 = { 푥 ∈ ℝ푛 ∣∣∣(푥21, . . . , 푥2푛, 1)푇 ∈ 풦} , (2.6) for some closed convex cone 풦 ⊆ ℝ푛+1. We will examine two specific cases: the case where 풳 = [−1, 1]푛 (a box) corresponding to 풦 = {푦 ∈ ℝ푛+1 ∣푦푗 ≤ 푦푛+1, 푗 = 1, . . . , 푛} , (2.7) 26 2.1. General Problem and Motivation and the case of a unit Euclidean ball 풳 = { 푥 ∈ ℝ푛 ∣∣∣ ∥푥∥2 ≤ 1} correspond- ing to 풦 = {푦 ∈ ℝ푛+1 ∣푦1 + ⋅ ⋅ ⋅+ 푦푛 ≤ 푦푛+1} . (2.8) Even in the simple case of a box, 풳 = [−1, 1]푛, the function 푓(푥) can become complex quite quickly. Figure 2.1 shows 푓(푥) for two sets of points in 풳 ; the set 푆 = {(−1,−1), (−1, 1), (1,−1), (1, 1), (0, 0)} and the set 푇 , a randomly generated set of 10 points in 풳 . Figure 2.1: Plots of 푓(푥) for 풳 = [−1, 1]2, with the set 푆 on the left and set 푇 on the right. Remark 2.3. The function 푓 (푥) := min 푖=1,...,푚 푤푖 ∥∥푥− 푥푖∥∥2 (2.9) in Equation (2.1) is neither convex nor concave whenever 푚 ≥ 2 and 푥푖 ∕= 푥푗 . This is illustrated in Figure 2.1. Proposition 2.4. The optimal value of the maximin dispersion problem satisfies 휐푝 > 0. Proof. Take 푥 ∈ 풳 with 푥 ∕= 푥푖 for 푖 = 1, . . . ,푚. We have 푓(푥) > 0. Then 휐푝 ≥ 푓(푥) > 0. Since the weighted maximin dispersion problem is a nonconvex problem, we cannot apply convex programming methods. The goal of this thesis then is to consider convex relaxations of the problem, the solutions of which can be found using known techniques. The solutions to the relaxation problems can then provide a nontrivial lower bound for the original nonconvex problem. 27 2.2. Convex Relaxations 2.2 Convex Relaxations In this section we give several alternative formulations of the original problem (2.1) as well as defining the convex relaxation problems. First note that Proposition 2.5. If we let 퐴푖 = [ 퐼 −푥푖 − (푥푖)푇 ∥∥푥푖∥∥2 ] ∈ 풮푛+1 and 푋 = [ 푥푥푇 푥 푥푇 1 ] ∈ 풮푛+1 where 퐼 ∈ ℝ푛×푛, then tr[퐴푖푋] = ∥푥∥2 − 2 (푥푖)푇 푥+ ∥∥푥푖∥∥2 . Proof. We have 퐴푖푋 = [ 퐼 −푥푖 − (푥푖)푇 ∥∥푥푖∥∥2 ] [ 푥푥푇 푥 푥푇 1 ] = [ 푥푥⊺ − 푥푖푥⊺ 푥− 푥푖 −(푥푖)⊺푥푥⊺ + ∥푥푖∥2푥⊺ −(푥푖)⊺푥+ ∥푥푖∥2 ] so that tr퐴푖푋 = tr(푥푥⊺ − 푥푖푥⊺)− (푥푖)⊺푥+ ∥푥푖∥2 = tr(푥푥⊺)− tr(푥푖푥⊺)− (푥푖)⊺푥+ ∥푥푖∥2 = tr(푥⊺푥)− tr(푥⊺푥푖)− (푥푖)⊺푥+ ∥푥푖∥2 = ∥푥∥2 − 2(푥푖)⊺푥+ ∥푥푖∥2. For any 푥 ∈ ℝ푛 and 푖 ∈ {1, . . . ,푚}, by Proposition 2.5 we have∥∥푥− 푥푖∥∥2 = ∥푥∥2 − 2 (푥푖)푇 푥+ ∥∥푥푖∥∥2 = 〈퐴푖, [ 푥푥푇 푥 푥푇 1 ]〉 , (2.10) where 퐴푖 := [ 퐼 −푥푖 − (푥푖)푇 ∥∥푥푖∥∥2 ] (2.11) 28 2.2. Convex Relaxations and ⟨퐴,푍⟩ = tr[퐴푍]. By substituting 푋 = [ 푥푥푇 푥 푥푇 1 ] , we can reformulate the original prob- lem (2.1) as 휐푝 := max 푋,휁 휁 (2.12) s.t. 푤푖 〈 퐴푖, 푋 〉 ≥ 휁, 푖 = 1, . . . ,푚 (푋11, . . . , 푋푛푛, 1) 푇 ∈ 풦 풳푛+1,푛+1 = 1 푋 ર 0 rank푋 = 1, where “s.t.” abbreviates for “subject to”, 풦 is given as in either (2.7) or (2.8), and 퐴푖 is given in (2.11). Proposition 2.6. Let 푛 ≥ 2. The rank function rank : ℝ푛×푛 → [0, 푛] : 푋 7→ rank푋 is neither convex nor Lipschitz. Proof. It suffices to consider the case when 푛 = 2. Consider matrices 푋1 = ( 1 0 0 0 ) , 푋2 = ( 0 0 0 1 ) . We have rank ( 1 2 푋1 + 1 2 푋2 ) = 2 > 1 = 1 2 rank푋1 + 1 2 rank푋2, which implies that 푋 7→ rank푋 is not convex. Now let 휖 > 0 and 푌 = ( 1 0 0 휖 ) . We have ∣rank푋1 − rank푌 ∣ = ∣1− 2∣ = 1, ∥푋1 − 푌 ∥ = 휖 which gives ∣rank푋1 − rank푌 ∣ ∥푋1 − 푌 ∥ = 1 휖 → +∞ when 휖 ↓ 0. Therefore, rank is not Lipschitz at 푋1. 29 2.2. Convex Relaxations Since the rank function is neither convex nor Lipschitz, we drop the rank-1 constraint. This yields a convex programming relaxation of (2.1): 휐푐푝 := max 푋,휁 휁 s.t. 푤푖 〈 퐴푖, 푋 〉 ≥ 휁, 푖 = 1, . . . ,푚 (푋11, . . . , 푋푛푛, 1) 푇 ∈ 풦 푋푛+1,푛+1 = 1 푋 ર 0. (2.13) As (2.13), a convex relaxation of (2.12), is obtained by relaxing some of the constraints (2.12), all possible solutions of (2.12) are feasible for (2.13). Hence, Proposition 2.7. 휐푐푝 ≥ 휐푝. Our SDP and SOCP relaxations are formulated in next two subsections. 2.2.1 SDP Relaxation We will make the mild assumption that (2.13) has an optimal solution. This holds whenever 풳 is compact. If 풳 is convex, then we can strengthen this relaxation by adding the constraints (푋1,푛+1, . . . , 푋푛,푛+1) 푇 ∈ 풳 . Since 휐푝 > 0, 휁 ∕= 0 at an optimal solution of (2.13). By making the substitution 푍 = 푋/휁, we can reformulate (2.13) more compactly as 1 휐푐푝 = min 푍 푍푛+1,푛+1 s.t. 푤푖 〈 퐴푖, 푍 〉 ≥ 1, 푖 = 1, . . . ,푚 (푍11, . . . , 푍푛푛, 푍푛+1,푛+1) 푇 ∈ 풦 푍 ર 0. (2.14) Note that when 풦 is a polyhedral set having a tractable algebraic repre- sentation, for example (2.7), then (2.14) reduces to a semidefinite program (SDP), so we will refer to this as the semidefinite-programming relaxation of (2.1). Lemma 2.8. Let 푍∗ ∈ 풮(푛+1)×(푛+1) be an optimal solution of (2.14). Then 푍∗푛+1,푛+1 > 0 30 2.2. Convex Relaxations Proof. This follows from 1 푍∗푛+1,푛+1 = 휐푐푝 ≥ 휐푝 > 0. Theorem 2.9. When 풦 = {푦 ∈ ℝ푛+1 ∣ 푦푗 ≤ 푦푛+1, 푗 = 1, . . . , 푛} , we have 1. 푋 solves the convex relaxation problem (2.13) ⇔ 푋휐푐푝 solves (2.14). 2. 푍 solves (2.14) ⇔ 푋 = 푍휐푐푝 = 푍 1푍푛+1,푛+1 solves (2.13). 3. If 푍∗ solves (2.14), then 휐푐푝 = 1푍∗푛+1,푛+1 . 2.2.2 SOCP Relaxation Lastly, we consider a further relaxation that replaces the semidefinite- cone constraint, 푍 ર 0, of (2.14) with the following constraints[ 푍푗푗 푍푗,푛+1 푍푗,푛+1 푍푛+1,푛+1 ] ર 0, 푗 = 1, . . . , 푛. (2.15) Lemma 2.10. The constraints of equation (2.15),[ 푍푗,푗 푍푗,푛+1 푍푗,푛+1 푍푛+1,푛+1 ] ર 0 푖 = 1, . . . , 푛 (2.16) are equivalent to second order cone constraints. Proof. For each 푗 = 1, . . . , 푛, let 푍푗 denote the matrix 푍푗 = [ 푍푗,푗 푍푗,푛+1 푍푗,푛+1 푍푛+1,푛+1 ] . (2.17) Then by Proposition 1.25 we know that the constraints 푍푗 ર 0 are equivalent to the constraints det(푍푗) ≥ 0 and trace(푍푗) ≥ 0 for 푗 = 1, . . . , 푛. Now, det(푍푗) ≥ 0 ⇔ 푍푗,푗푍푛+1,푛+1 − (푍푗,푛+1)2 ≥ 0 ⇔ 푍푗,푗푍푛+1,푛+1 ≥ (푍푗,푛+1)2 (2.18) and trace(푍푗) ≥ 0 ⇔ 푍푗,푗 + 푍푛+1,푛+1 ≥ 0. (2.19) 31 2.2. Convex Relaxations We will show that (2.18) and (2.19) are equivalent to second order cone constraints, namely∥∥∥∥[ 2푍푗,푛+1푍푗,푗 − 푍푛+1,푛+1 ]∥∥∥∥ ≤ 푍푗,푗 + 푍푛+1,푛+1, (2.20) 0 ≤ 푍푗,푗 + 푍푛+1,푛+1. (2.21) To see that (2.18) and (2.20) are equivalent, consider the following:∥∥∥∥[ 2푍푗,푛+1푍푗,푗 − 푍푛+1,푛+1 ]∥∥∥∥ ≤ 푍푗,푗 + 푍푛+1,푛+1 ⇔ ∥∥∥∥[ 2푍푗,푛+1푍푗,푗 − 푍푛+1,푛+1 ]∥∥∥∥2 ≤ (푍푗,푗 + 푍푛+1,푛+1)2 ⇔ 4(푍푗,푛+1)2 + (푍푗,푗)2 − 2푍푗,푗푍푛+1,푛+1 + (푍푛+1,푛+1)2 ≤ (푍푗,푗)2 + 2푍푗,푗푍푛+1,푛+1 + (푍푛+1,푛+1)2 ⇔ 4(푍푗,푛+1)2 ≤ 4푍푗,푗푍푛+1,푛+1 ⇔ (푍푗,푛+1)2 ≤ 푍푗,푗푍푛+1,푛+1 (2.22) Equation (2.21) is exactly equation (2.19), which is already a second order cone constraint. That is, we have ∥퐴푥+ 푏∥ ≤ 푍푗,푗 + 푍푛+1,푛+1 (2.23) with 퐴 = 0 and 푏 = 0. Thus, substituting (2.15) for 푍 ર 0 in (2.14) yields the second-order cone programming (SOCP) relaxation of (2.1): 1 휐푠표푐푝 = min 푍 푍푛+1,푛+1 s.t. 푤푖 〈 퐴푖, 푍 〉 ≥ 1, 푖 = 1, . . . ,푚 (푍11, . . . , 푍푛푛, 푍푛+1,푛+1) 푇 ∈ 풦[ 푍푗푗 푍푗,푛+1 푍푗,푛+1 푍푛+1,푛+1 ] ર 0, 푗 = 1, . . . , 푛. (2.24) Theorem 2.11. One always has 휐푝 ≤ 휐푐푝 ≤ 휐푠표푐푝. 32 2.2. Convex Relaxations Corollary 2.12. Let 푍∗ ∈ 풮푛+1 be an optimal solution of (2.24). Then 푍∗푛+1,푛+1 > 0. Proof. This follows from 1 푍∗푛+1,푛+1 = 휐푠표푐푝 ≥ 휐푐푝 ≥ 휐푝 > 0. We have 휐푠표푐푝 = 1 푧∗푛+1,푛+1 if 푍∗ solves (2.24). 2.2.3 More Properties on the Constraints of Convex Relaxations (1) Constraint (푍11, . . . , 푍푛푛, 푍푛+1,푛+1) 푇 ∈ 풦. In either (2.14) or (2.24), what does the constraint (푍11, . . . , 푍푛푛, 푍푛+1,푛+1) 푇 ∈ 풦 (2.25) look like? We explicitly consider two cases: Box case: When 풦 = {푦 ∈ ℝ푛+1∣ 푦푗 ≤ 푦푛+1, 푗 = 1, . . . , 푛} (2.25) transpires to 푍푗푗 ≤ 푍푛+1,푛+1 ∀푗 = 1, . . . , 푛. (2.26) Ball case: When 풦 = {푦 ∈ ℝ푛+1∣ 푛∑ 푗=1 푦푗 ≤ 푦푛+1} (2.25) transpires to 푛∑ 푗=1 푍푗푗 ≤ 푍푛+1,푛+1. (2.27) In both cases, (2.25) consists of linear inequalities. (2) Constraint 푤푖 〈 퐴푖, 푍 〉 ≥ 1. In both (2.14) and (2.24), what does the constraint 푤푖⟨퐴푖, 푍⟩ ≥ 1 (2.28) look like? 33 2.2. Convex Relaxations We can exploit the structure of 퐴푖! For a matrix 푍 ∈ 풮푛+1, let 푧 = (푍1,푛+1, . . . , 푍푛,푛+1) 푇 and 푍∗ = ⎡⎢⎣ 푍11∗∗ . . . ∗∗ 푍푛푛 ⎤⎥⎦ so that we can write 푍 = [ 푍∗ 푧 (푧)푇 푍푛+1,푛+1 ] . (2.29) into a block form. Now consider the trace: ⟨퐴푖, 푍⟩ = Tr ([ 퐼 −푥푖 − (푥푖)푇 ∥∥푥푖∥∥2 ] 푍 ) = Tr ([ 퐼 −푥푖 − (푥푖)푇 ∥∥푥푖∥∥2 ] [ 푍∗ 푧 (푧)푇 푍푛+1,푛+1 ]) = Tr ([ 푍∗ − 푥푖푧푇 푧 − 푥푖푍푛+1,푛+1 − (푥푖)푇 푍∗ + ∥∥푥푖∥∥2 푧푇 −(푥푖)푇 푧 + ∥∥푥푖∥∥2 푍푛+1,푛+1 ]) = Tr ( 푍∗ − 푥푖푧푇 )+ Tr(−(푥푖)푇 푧 + ∥∥푥푖∥∥2 푍푛+1,푛+1) = ⎛⎝ 푛∑ 푗=1 푍푗푗 − (푥푖)푇 푧 ⎞⎠+ (−(푥푖)푇 푧 + ∥∥푥푖∥∥2 푍푛+1,푛+1) = 푛∑ 푗=1 푍푗푗 − 2(푥푖)푇 푧 + ∥∥푥푖∥∥2 푍푛+1,푛+1. (2.30) Hence (2.28) is equivalent to 푤푖 ( 푛∑ 푗=1 푍푗푗 − 2(푥푖)푇 푧 + ∥∥푥푖∥∥2 푍푛+1,푛+1) ≥ 1. (2.31) (3) The feasible region of (2.24) is strictly larger than the one in (2.14). A special case suffices. To this end, let 푛 = 2, we consider the matrix 푍 = ⎛⎝1 2 12 1 0 1 0 2 ⎞⎠ 34 2.2. Convex Relaxations We have ( 푍11 푍13 푍31 푍3,3 ) = ( 1 1 1 2 ) ∈ 풮2+ and ( 푍22 푍23 푍32 푍3,3 ) = ( 1 0 0 2 ) ∈ 풮2+, but 푍 ∕∈ 풮3+ because its principle submatrix( 푍11 푍12 푍21 푍22 ) = ( 1 2 2 1 ) ∕∈ 풮2+. This shows that algthough 푍 ≽ 0 ⇒ [ 푍푗푗 푍푗,푛+1 푍푗,푛+1 푍푛+1,푛+1 ] ≽ 0 ∀ 푗 = 1, . . . , 푛 always holds, the converse implication fails. The above discussions allow us to conclude: Theorem 2.13 (box case). Let 풦 = {푦 ∈ ℝ푛+1∣ 푦푗 ≤ 푦푛+1, 푗 = 1, . . . , 푛}. Then 1. The SDP relaxation (2.14) has an explicit form: 1 휐푐푝 = min 푍 푍푛+1,푛+1 s.t. 푤푖 (∑푛 푗=1 푍푗푗 − 2(푥푖)푇 푧 + ∥∥푥푖∥∥2 푍푛+1,푛+1) ≥ 1, 푖 = 1, . . . ,푚 푍푗푗 ≤ 푍푛+1,푛+1 푗 = 1, . . . , 푛 푍 ર 0, (2.32) where 푧 = (푍1,푛+1, . . . , 푍푛,푛+1) 푇 . 35 2.2. Convex Relaxations 2. The SOCP relaxation (2.24) has an explicit form: 1 휐푠표푐푝 = min 푍 푍푛+1,푛+1 s.t. 푤푖 (∑푛 푗=1 푍푗푗 − 2(푥푖)푇 푧 + ∥∥푥푖∥∥2 푍푛+1,푛+1) ≥ 1, 푖 = 1, . . . ,푚 푍푗푗 ≤ 푍푛+1,푛+1 푗 = 1, . . . , 푛[ 푍푗푗 푍푗,푛+1 푍푗,푛+1 푍푛+1,푛+1 ] ર 0, 푗 = 1, . . . , 푛, (2.33) where 푧 = (푍1,푛+1, . . . , 푍푛,푛+1) 푇 . Theorem 2.14 (ball case). Let 풦 = {푦 ∈ ℝ푛+1∣ 푛∑ 푗=1 푦푗 ≤ 푦푛+1}. Then 1. The SDP relaxation (2.14) has an explicit form: 1 휐푐푝 = min 푍 푍푛+1,푛+1 s.t. 푤푖 (∑푛 푗=1 푍푗푗 − 2(푥푖)푇 푧 + ∥∥푥푖∥∥2 푍푛+1,푛+1) ≥ 1, 푖 = 1, . . . ,푚∑푛 푗=1 푍푗푗 ≤ 푍푛+1,푛+1 푍 ર 0, (2.34) where 푧 = (푍1,푛+1, . . . , 푍푛,푛+1) 푇 . 2. The SOCP relaxation (2.24) has an explicit form: 1 휐푠표푐푝 = min 푍 푍푛+1,푛+1 s.t. 푤푖 (∑푛 푗=1 푍푗푗 − 2(푥푖)푇 푧 + ∥∥푥푖∥∥2 푍푛+1,푛+1) ≥ 1, 푖 = 1, . . . ,푚∑푛 푗=1 푍푗푗 ≤ 푍푛+1,푛+1[ 푍푗푗 푍푗,푛+1 푍푗,푛+1 푍푛+1,푛+1 ] ર 0, 푗 = 1, . . . , 푛, (2.35) where 푧 = (푍1,푛+1, . . . , 푍푛,푛+1) 푇 . 36 2.2. Convex Relaxations Remark 2.15. Note that in both box case and ball case, the SOCP relaxation only relies on (푍11, . . . , 푍푛푛, 푍푛+1,푛+1) and 푧 = (푍1,푛+1, . . . , 푍푛,푛+1) 푇 . Remark 2.16. In both box case and ball case, the feasible regions are un- bounded. Indeed, if 푍 is feasible, then 푘푍 is also feasible for every 푘 ≥ 1. Amazingly, in next section we show that both (2.14) and (2.24) do have optimal solutions! 2.2.4 Why Does the SDP or SOCP Relaxation Have an Optimal Solution? For a matrix 푍 ∈ ℝ푛×푛 we define its norm by ∥푍∥2 = max{∥푍푢∥ ∥푢∥ ≤ 1}. Fact 2.17. For every 푍 ∈ ℝ푛×푛, we have ∥푍∥2 = 휆max(푍⊺푍). Theorem 2.18 (box case). Let 풦 = {푦 ∈ ℝ푛+1∣ 푦푗 ≤ 푦푛+1, 푗 = 1, . . . , 푛}. Then the following hold: 1. The SDP relaxation (2.14) has 휐푐푝 < +∞ and at least one optimal solution. 2. The SOCP relaxation (2.24) has 휐푠표푐푝 < +∞ and at least one optimal solution. Proof. Part 1. First we show that 휐푐푝 < ∞. Assume to the contrary that 휐푐푝 =∞. Then there exists a sequence 푍(푘)푛+1,푛+1 ↓ 0 when 푘 →∞. Since by (2.26), 0 ≤ 푍(푘)푗푗 ≤ 푍(푘)푛+1,푛+1 ∀ 푗 = 1, . . . , 푛 we have lim푘→∞ 푍 (푘) 푗푗 = 0. Now the matrix 푍 (푘) ર 0 implies[ 푍 (푘) 푗푗 푍 (푘) 푗,푛+1 푍 (푘) 푗,푛+1 푍 (푘) 푛+1,푛+1 ] ર 0 푗 = 1, . . . , 푛 (2.36) so that 푍 (푘) 푛+1,푛+1푍 (푘) 푗푗 ≥ (푍(푘)푗,푛+1)2 ≥ 0 37 2.2. Convex Relaxations from which lim푘→∞ 푍 (푘) 푗,푛+1 = 0. By (2.31), we obtain that 푤푖 ( 푛∑ 푗=1 푍 (푘) 푗푗 − 2(푥푖)푇 푧(푘) + ∥∥푥푖∥∥2 푍(푘)푛+1,푛+1) ≥ 1 (2.37) where (푧(푘))⊺ = (푍(푘)1,푛+1, . . . , 푍 (푘) 푛,푛+1). When 푘 → ∞, (2.37) gives 0 ≥ 1, a contradiction. Next we show that 휐푐푝 is achieved. Assume that there exists a feasible sequence (푍(푘))∞푘=1 such that 푍 (푘) 푛+1,푛+1 ↓ 1/휐푐푝 when 푘 → ∞. Since 0 ≤ 푍 (푘) 푗푗 ≤ 푍(푘)푛+1,푛+1 for 푗 = 1, . . . , 푛, the sequence (푍(푘)푗푗 )∞푘=1 is bounded for 푗 = 1, . . . , 푛. This shows that (tr푍(푘))∞푘=1 is bounded. As the largest eigenvalue 0 ≤ 휆max(푍(푘)) ≤ tr푍(푘) we see that (휆max(푍 (푘)))∞푘=1 is bounded. Note that ∥푍(푘)∥2 = 휆max(푍(푘)) because 푍(푘) ∈ 풮푛+1+ . Since all norms are equivalent in finite dimensional space 풮(푛+1)×(푛+1), this means that (푍(푘))∞푘=1 is bounded. Therefore, there exists a subsequence of (푍(푘))∞푘=1 converges to a 푍 ∗ ∈ 풮푛+1+ . In particular, 푍∗ is feasible because each 푍(푘) is feasible. Without relabeling, let us assume that 푍(푘) → 푍∗. Then 푍(푘)푛+1,푛+1 → 푍∗푛+1,푛+1 and 푍∗푛+1,푛+1 = 1/휐푐푝. Hence 푍∗ is an optimal solution. Part 2. Apply the similar arguments as in Part 1. Theorem 2.19 (ball case). Let 풦 = {푦 ∈ ℝ푛+1∣ 푛∑ 푗=1 푦푗 ≤ 푦푛+1}. Then the following hold: 1. The SDP relaxation (2.14) has 휐푐푝 < +∞ and at least one optimal solution. 2. The SOCP relaxation (2.24) has 휐푠표푐푝 < +∞ and at least one optimal solution. Proof. The proof is similar to that of Theorem 2.18. Instead of using (2.26), we use (2.27). 38 Chapter 3 A Necessary Condition for Optimal Solutions In this Chapter we prove general results about the objective functions of the maximin dispersion problem, as well as deriving a necessary condition for the optimal solutions of (2.1). 3.1 Properties of the Maximin Objective Functions In this section, we will apply the results of Sections 1.4, 1.5 and 1.5.1 to the objective functions of the maximin dispersion problem, (2.3). First, we show that 푓(푥) as defined in (2.1) is locally Lipschitz at any 푥0 in ℝ푛. Theorem 3.1. The function 푓푖 as defined in (2.3) is locally Lipschitz at any 푥0 ∈ ℝ푛. Proof. Fix 푥0 ∈ ℝ푛 and choose 푥, 푦 ∈ ℝ푛 such that ∥푥− 푥0∥ ≤ 훿 and ∥푦 − 푥0∥ ≤ 훿 for some 훿 > 0. We have ∣푓푖(푥)− 푓푖(푦)∣ = ∣∣∣푤푖 ∥∥푥− 푥푖∥∥2 − 푤푖 ∥∥푦 − 푥푖∥∥2∣∣∣ (3.1) = 푤푖 ∣∣(∥∥푥− 푥푖∥∥− ∥∥푦 − 푥푖∥∥) (∥∥푥− 푥푖∥∥+ ∥∥푦 − 푥푖∥∥)∣∣ = 푤푖 ∣∣(∥∥푥− 푥푖∥∥− ∥∥푦 − 푥푖∥∥)∣∣ (∥∥푥− 푥푖∥∥+ ∥∥푦 − 푥푖∥∥) ≤ 푤푖 (∥∥(푥− 푥푖)− (푦 − 푥푖)∥∥) (∥∥푥− 푥푖∥∥+ ∥∥푦 − 푥푖∥∥) = 푤푖 (∥푥− 푦∥) (∥∥푥− 푥푖∥∥+ ∥∥푦 − 푥푖∥∥) . 39 3.1. Properties of the Maximin Objective Functions Now, ∥∥푥− 푥푖∥∥ ≤ ∥푥∥+ ∥∥푥푖∥∥ = ∥푥− 푥0 + 푥0∥+ ∥∥푥푖∥∥ ≤ ∥푥− 푥0∥+ ∥푥0∥+ ∥∥푥푖∥∥ ≤ 훿 + ∥푥0∥+ ∥∥푥푖∥∥ = 퐿1 <∞. (3.2) Similarly, there exists 퐿2 <∞ such that ∥∥푦 − 푥푖∥∥ ≤ 퐿2. Then∥∥푥− 푥푖∥∥+ ∥∥푦 − 푥푖∥∥ ≤ 퐿1 + 퐿2 (3.3) so that by setting 푀 = 푤푖(퐿1 + 퐿2) we have ∣푓푖(푥)− 푓푖(푦)∣ ≤ 푤푖 ∥푥− 푦∥ (퐿1 + 퐿2) (3.4) = 푀 ∥푥− 푦∥ whenever ∥푥− 푥0∥ ≤ 훿 and ∥푦 − 푥0∥ ≤ 훿, and hence 푓푖 is locally Lipschitz at 푥0. Corollary 3.2. The function 푓(푥) = min 푖=1,...,푚 푓푖(푥) = min 푖=1,...,푚 푤푖 ∥∥푥− 푥푖∥∥2 (3.5) is locally Lipschitz at any 푥0 ∈ ℝ푛. This follows directly from Theorem 3.1 and Fact 1.61. Lastly, we characterize the gradient of 푓푖(푥) as defined in (2.3), and give a theorem for the existence of solutions. Fact 3.3. For each 푖 = 1, . . . ,푚, and every 푥 ∈ 풳 the function 푓푖 is differ- entiable, with derivative ∇푓푖(푥) = 2푤푖 ( 푥− 푥푖) . (3.6) Proof. This follows from that 푓푖(푥) = 푤푖⟨푥− 푥푖, 푥− 푥푖⟩ = 푤푖 (⟨푥, 푥⟩+ 2⟨푥푖, 푥⟩+ ∥푥푖∥2). 40 3.2. Characterization of Optimal Solutions Theorem 3.4. Assume that 풳 is a compact subset of ℝ푛. Then argmax푥∈풳 푓 ∕= ∅. Proof. This follows directly from Theorem 1.9. Remark 3.5. If 풳 is not compact, then it may happen that argmax푓 = ∅. For example, let 풳 = ℝ푛. Then sup 푥∈ℝ푛 푓 =∞ so that argmax푥∈ℝ푛푓 = ∅. 3.2 Characterization of Optimal Solutions In this section we will provide a theoretical result that gives a necessary condition for 푥̄ to be an optimal solution of the equally weighted maximin dispersion problem, which is (2.1) with 푤푖 = ⋅ ⋅ ⋅ = 푤푚 = 푤. To begin, first rewrite the original problem (2.1) as min 푥∈풳 ( max 1≤푖≤푚 −푓푖 ) (3.7) and note that this is equivalent to min 푥∈풳 ( max 1≤푖≤푚 −푓푖 + 휄풳 ) . (3.8) Theorem 3.6. Let 푥̄ belong to the interior of 풳 , and let 푓(푥) = min 푖=1,...,푚 푤 ∥∥푥− 푥푖∥∥2 . (3.9) We have that if 푥̄ is an optimal solution of 푚푎푥푥∈풳 푓(푥) (3.10) then 푥̄ ∈ conv{푥푖 ∣푖 ∈ 퐼(푥̄)} (3.11) where 퐼(푥̄) = { 푖 ∣∣∣∣ ∥∥푥̄− 푥푖∥∥2 = min푖=1,...,푚 ∥∥푥̄− 푥푖∥∥2 } . (3.12) 41 3.2. Characterization of Optimal Solutions Proof. Since 푥̄ is a critical point of 푓 if and only if 푥̄ is a critical point of −푓 , we begin by converting our min-function 푓(푥) into a max-function −푓(푥) as follows: − 푓(푥) = − min 푖=1,...,푚 푤 ∥∥푥− 푥푖∥∥2 (3.13) −푓(푥) = max 푖=1,...,푚 ( −푤 ∥∥푥− 푥푖∥∥2) (3.14) −푓(푥) = max 푖=1,...,푚 푔푖(푥) (3.15) where 푔푖(푥) = −푤 ∥∥푥− 푥푖∥∥2 = −푓푖(푥). Now, by Theorem 3.1, we know that 푓푖(푥) is locally Lipschitz at any 푥 ∈ ℝ푛, and so applying Fact 1.58 we know that 푔푖(푥) is locally Lipschitz at any 푥 ∈ ℝ푛. Furthermore, by applying Fact 1.59 we know that the max of a family of locally Lipschitz functions is locally Lipschitz, so that −푓(푥) = max푖=1,...,푚 푔푖(푥) is locally Lipschitz at any 푥 ∈ ℝ푛. Thus, by applying Theorem 1.63 we know that if 푥̄ is a possible critical point of −푓 + 휄풳 , then we must have 0 ∈ ∂표(−푓 + 휄풳 )(푥̄) ⊆ ∂표(−푓)(푥̄) + ∂표휄풳 (푥̄). Thus, in order to derive a necessary condition for 푥̄ to be a critical point of −푓(푥), we need to characterize ∂표(−푓)(푥̄). Since −푓(푥) is locally Lipschitz at any 푥̄ ∈ ℝ푛 and 푔푖 is differentiable for each 푖, by Theorem 1.64 the Clarke subgradient of −푓 at 푥̄ is given by ∂(−푓(푥̄)) = conv {∇푔푖(푥̄) ∣ 푖 ∈ 퐼(푥̄)} . (3.16) Applying Fact 3.3 to the equally weighted case, we have that the derivative of the functions 푔푖 is ∇푔푖(푥) = −2푤(푥− 푥푖) (3.17) so that ∂표(−푓(푥̄)) = conv{−2푤(푥̄− 푥푖) ∣ 푖 ∈ 퐼(푥̄)} . (3.18) Thus 0 ∈ ∂표(−푓)(푥̄) + ∂표휄푋(푥̄) implies that 0 ∈ conv{−2푤(푥̄− 푥푖) ∣ 푖 ∈ 퐼(푥̄)}+ ∂휄풳 (푥̄). Dividing both sides by 푤 and using the fact that ∂표휄풳 (푥̄) = 푁풳 (푥̄) is a cone, we have ⇔ 0 ∈ conv{−(푥̄− 푥푖) ∣ 푖 ∈ 퐼(푥̄)}+ ∂표휄풳 (푥̄) (3.19) ⇔ 0 ∈ −푥̄+ conv{푥푖 ∣ 푖 ∈ 퐼(푥̄)}+ ∂표휄풳 (푥̄). (3.20) 42 3.2. Characterization of Optimal Solutions Since 푥̄ ∈ int풳 , we have ∂표휄풳 (푥̄) = 0 by Example 1.48 so that 푥̄ ∈ conv{푥푖 ∣ 푖 ∈ 퐼(푥̄)} . (3.21) Lastly, since 푖 ∈ 퐼(푥̄) if and only if 푔푖(푥̄) = −푓(푥̄) (3.22) ⇔ −푤 ∥∥푥̄− 푥푖∥∥2 = −푓(푥̄) (3.23) ⇔ 푤 ∥∥푥̄− 푥푖∥∥2 = 푓(푥̄) (3.24) ⇔ 푤 ∥∥푥̄− 푥푖∥∥2 = min 푖=1,...,푚 푤 ∥∥푥̄− 푥푖∥∥2 (3.25) ⇔ ∥∥푥̄− 푥푖∥∥2 = min 푖=1,...,푚 ∥∥푥̄− 푥푖∥∥2 (3.26) we have that 푥̄ is a critical point of 푓(푥) = min푖=1,...,푚푤 ∥∥푥− 푥푖∥∥2 if and only if 푥̄ ∈ conv{푥푖 ∣푖 ∈ 퐼(푥̄)} (3.27) where 퐼(푥̄) = { 푖 ∣∣∣∣ ∥∥푥̄− 푥푖∥∥2 = min푖=1,...,푚 ∥∥푥̄− 푥푖∥∥2 } . (3.28) Thus, we have provided a necessary condition for 푥̄ ∈ ℝ푛 to be a possible maximizer for the equally weighted maximin dispersion problem. Remark 3.7. Let 푓 be given as in (3.9). Then 푥̄ is an optimal solution of max푥∈풳 푓(푥) only if 0 ∈ ∂표(−푓)(푥̄) +푁풳 (푥̄). That is, 0 ∈ conv{−푤(푥̄− 푥푖) ∣ 푖 ∈ 퐼(푥̄)} +푁풳 (푥̄) (3.29) ⇔푥̄ ∈ conv{푥푖 ∣ 푖 ∈ 퐼(푥̄)} +푁풳 (푥̄) (3.30) ⇔푥̄ ∈ conv{푥푖 ∣ 푖 ∈ 퐼(푥̄)} +푁풳 (푥̄). (3.31) See Theorem 1.30. 43 Chapter 4 NP-Hardness Results Fact 4.1. The general weighted maximin distance problem is known to be NP-hard, even in the case where the distances satisfy the triangle inequality [20]. In this chapter we focus on a particular subset of maximin distance problems, such that 푤1 = ⋅ ⋅ ⋅ = 푤푚, 풳 = [−1, 1]푛 and 푥1, . . . , 푥푚 are in 풳 . We show that even in this reduced case of equal weighting and 풳 an 푛-dimensional hypercube, the problem is still NP-hard (see Theorem 4.5). We also present a suggested heuristic for solving (2.1) which involves solving restricted subproblems, and show that the restricted subproblems are still NP-hard. 4.1 The Case That 풳 is a Box In this section we will show that the equally weighted maximin optimiza- tion problem given by max 푥∈[−1,1]푛 min 푖=1,...,푚 푤 ∥∥푥− 푥푖∥∥2 (4.1) is NP-hard. In order to prove this result, we first provide some preliminary results. Proposition 4.2. The following are equivalent to solve: 1. 푦 is a solution of the integer linear program 퐵푦 = 푏, 푦 ∈ {−1, 1}푞 , (4.2) with 퐵 ∈ ℤ푝×푞, 푏 ∈ ℤ푝. 2. x is a solution to 퐶푥 ≤ 0, 푥 ∈ {−1, 1}푞+1 , (4.3) with 퐶 := [ 퐵 −푏 −퐵 푏 ] . 44 4.1. The Case That 풳 is a Box Note that when we say that (4.2) and (4.3) are equivalent to solve, we mean it in the following sense: 푦 ∈ {−1, 1}푞 solves (4.2) ⇒ (푦,−1) solves (4.3). 푥 = (푥1, . . . , 푥푞, 푥푞+1) solves (4.3) ⇒ ( 푥1 푥푞+1 , ⋅ ⋅ ⋅ , 푥푞 푥푞+1 ) solves (4.2). Proof. Consider:⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣ 퐵11 퐵12 ⋅ ⋅ ⋅ 퐵1푞 −푏1 퐵21 퐵22 ⋅ ⋅ ⋅ 퐵2푞 −푏2 ... ... 퐵푝1 퐵푝2 ⋅ ⋅ ⋅ 퐵푝푞 −푏푝 −퐵11 −퐵12 ⋅ ⋅ ⋅ −퐵1푞 푏1 ... ... −퐵푝1 −퐵푝2 ⋅ ⋅ ⋅ −퐵푝푞 푏푝 ⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦ ⎡⎢⎢⎢⎣ 푥1 푥2 ... 푥푞+1 ⎤⎥⎥⎥⎦ ≤ ⎡⎢⎢⎢⎣ 0 0 ... 0 ⎤⎥⎥⎥⎦ (4.4) ⇔ ⎧⎨⎩ 퐵11푥1 +퐵12푥2 + ⋅ ⋅ ⋅+퐵1푞푥푞 − 푏1푥푞+1 ≤ 0 퐵21푥1 +퐵22푥2 + ⋅ ⋅ ⋅+퐵2푞푥푞 − 푏2푥푞+1 ≤ 0 ... 퐵푝1푥1 +퐵푝2푥2 + ⋅ ⋅ ⋅+퐵푝푞푥푞 − 푏푝푥푞+1 ≤ 0 +퐵11푥1 +퐵12푥2 + ⋅ ⋅ ⋅+퐵1푞푥푞 − 푏1푥푞+1 ≥ 0 ... +퐵푝1푥1 +퐵푝2푥2 + ⋅ ⋅ ⋅+퐵푝푞푥푞 − 푏푝푥푞+1 ≥ 0 ⇔ ⎧⎨⎩ 푏1푥푞+1 ≤ 퐵11푥1 +퐵12푥2 + ⋅ ⋅ ⋅+퐵1푞푥푞 ≤ 푏1푥푞+1 ... 푏푝푥푞+1 ≤ 퐵푝1푥1 +퐵푝2푥2 + ⋅ ⋅ ⋅+퐵푝푞푥푞 ≤ 푏푝푥푞+1 And thus, since 푥푖 = ±1 for 푖 = 1, .., 푞 we obtain 푥푞+1푏 ≤ 퐵푥 ≤ 푥푞+1푏 ⇔ 퐵푥 = 푥푞+1푏 ⇔ 퐵 ( 푥 푥푞+1 ) = 푏. (4.5) Since 푥푞+1 = ±1, we have 푥푥푞+1 ∈ {−1, 1} 푞, and 푥푥푞+1 solves (4.2). Proposition 4.3. The following are equivalent to solve: 45 4.1. The Case That 풳 is a Box 1. x is a solution to 퐶푥 ≤ 0, 푥 ∈ {−1, 1}푞+1 , (4.6) with 퐶 := [ 퐵 −푏 −퐵 푏 ] ∈ 푍2푝×(푞+1). 2. x is a solution to 2퐶푥 ≤ 휌푒, 푥 ∈ {−1, 1}푞+1 , (4.7) for any 0 < 휌 ≤ 1, where 푒 = (1, . . . , 1)푇 . In this case, when we say that (4.6) and (4.7) are equivalent to solve, we mean that 푥 solves (4.6) ⇔ 푥 solves (4.7) . Proof. We can rewrite the inequality (4.7) in item 2 as 퐶푥 ≤ 휌 2 푒, 푥 ∈ {−1, 1}푞+1 . (4.8) Now, since 0 < 휌 ≤ 1 we have 0 < 휌2 ≤ 12 . Furthermore, as 퐶 ∈ ℤ2푝×(푞+1) and 푥 ∈ {−1, 1}푞+1 it is obvious to see that 퐶푥 ∈ ℤ2푝 so that solving 퐶푥 ≤ 12푒 is equivalent to solving 퐶푥 ≤ 0. Proposition 4.4. The integer linear program (1.94) is reducible in polyno- mial time to the equally weighted maximin dispersion problem max 푥∈[−1,1]푞+1 min 푖=1,...,2푝 ∥∥푥− 푥푖∥∥2 . (4.9) Moreover, max 푥∈[−1,1]푞+1 min 푖=1,...,2푝 ∥∥푥− 푥푖∥∥2 = max 푥∈{−1,1}푞+1 min 푖=1,...,2푝 ∥∥푥− 푥푖∥∥2 . (4.10) Proof. From Fact 1.76 we know that it is NP-hard to determine the solv- ability of 퐵푦 = 푏, 푦 ∈ {−1, 1}푞 , (4.11) with 퐵 ∈ ℤ푝×푞, 푏 ∈ ℤ푝. By Proposition 4.2 this is equivalent to 퐶푥 ≤ 0, 푥 ∈ {−1, 1}푞+1 , (4.12) 46 4.1. The Case That 풳 is a Box with 퐶 := [ 퐵 −푏 −퐵 푏 ] . By Proposition 4.3, it is equivalent to solve 2퐶푥 ≤ 휌푒, 푥 ∈ {−1, 1}푞+1 , (4.13) for any 0 < 휌 ≤ 1, where 푒 = (1, . . . , 1)푇 . We assume without loss of generality that 퐶 has no zero row. Let 푐푖 denote the 푖th column of 퐶푇 . Then (푐푖)푇 is the 푖th row of 퐶 so that ∥∥푐푖∥∥ ∕= 0. Let 푥푖 := 휚푖푐 푖 ∈ ℝ푞+1 with 휚푖 := 휌∥푐푖∥2 , 푖 = 1, . . . , 2푝. (4.14) Then 휚푖휌 = 휌2 ∥푐푖∥2 = ∥∥푥푖∥∥2, since∥∥푥푖∥∥ = ∥∥휚푖푐푖∥∥ = 휌∥푐푖∥2 ∥∥푐푖∥∥ = 휌∥푐푖∥ . Furthermore, we have 2퐶푥 = 2 [( 푐1 )푇 ( 푐2 )푇 ⋅ ⋅ ⋅ (푐2푝)푇 ]푇 푥 = ⎡⎢⎢⎢⎢⎣ 2 ( 푐1 )푇 푥 2 ( 푐2 )푇 푥 ... 2 ( 푐2푝 )푇 푥 ⎤⎥⎥⎥⎥⎦ (4.15) and thus 2퐶푥 ≤ 휌푒 ⇔ 2 (푐푖)푇 푥 ≤ 휌 ∀푖 = 1, . . . , 2푝 ⇔ 2휚푖 ( 푐푖 )푇 푥 ≤ 휚푖휌 ∀푖 = 1, . . . , 2푝 ⇔ 2 (푥푖)푇 푥 ≤ ∥∥푥푖∥∥2 , ∀푖 = 1, . . . , 2푝. (4.16) Then (4.13) is equivalent to 2 ( 푥푖 )푇 푥 ≤ ∥∥푥푖∥∥2 , 푖 = 1, . . . , 2푝, 푥 ∈ {−1, 1}푞+1 . (4.17) Now, from ∥∥푥푖 − 푥∥∥2 = ∥∥푥푖∥∥2 − 2 (푥푖)푇 푥+ ∥푥∥2, we have∥∥푥푖∥∥2 ≥ 2 (푥푖)푇 푥 푖 = 1, . . . , 2푝 ⇔ ∥∥푥푖 − 푥∥∥2 + 2 (푥푖)푇 푥− ∥푥∥2 ≥ 2 (푥푖)푇 푥 푖 = 1, . . . , 2푝 ⇔ ∥∥푥푖 − 푥∥∥2 ≥ ∥푥∥2 푖 = 1, . . . , 2푝 ⇔ ∥∥푥푖 − 푥∥∥2 ≥ 푞 + 1 푖 = 1, . . . , 2푝 (4.18) 47 4.1. The Case That 풳 is a Box (since 푥 ∈ {−1, 1}푞+1 implies ∥푥∥2 = 푞 + 1). Hence, (4.17) is in turn is equivalent to 푞 + 1 ≤ ∥∥푥푖 − 푥∥∥2 , 푖 = 1, . . . , 2푝, 푥 ∈ {−1, 1}푞+1 . (4.19) Thus (1.94) has a solution if and only if the optimal value of max 푥∈{−1,1}푞+1 min 푖=1,...,2푝 ∥∥푥푖 − 푥∥∥2 (4.20) is no less than 푞 + 1. That is, ∃ 푥 ∈ {−1, 1}푞+1 such that 푞 + 1 ≤ ∥∥푥푖 − 푥∥∥2 ∀푖 ⇔ ∃ 푥 ∈ {−1, 1}푞+1 such that 푞 + 1 ≤ min 푖=1,...,2푝 ∥∥푥푖 − 푥∥∥2 ⇔ 푞 + 1 ≤ max 푥∈{−1,1}푞+1 min 푖=1,...,2푝 ∥∥푥푖 − 푥∥∥2 . (4.21) Now, we show that (4.20) is equivalent to max 푥∈[−1,1]푞+1 min 푖=1,...,2푝 ∥∥푥푖 − 푥∥∥2 . (4.22) . Indeed, since ∥∥푥푖∥∥ = 휌∥푐푖∥ , we can take 휌 = min { 1,min 푖 ∥∥푐푖∥∥ /3} , (4.23) so that ∥∥푥푖∥∥ < 12 for all 푖. To see this, simply note that by (4.23) we have 휌 ≤ min 푖 ∥∥푐푖∥∥ 3 ⇒ 휌 ≤ ∥∥푐푖∥∥ 3 ∀ 푖 ⇒ 휌∥푐푖∥ ≤ 1 3 ∀ 푖. Now we show that the objective function value of (4.22) is strictly im- proved by setting 푥 = {±1}푞+1: For any 푥 ∈ [−1, 1]푞+1, if 0 ≤ 푥푗 < 1 for some 푗 ∈ {1, . . . , 푞 + 1}, then, for each 푖 ∈ {1, . . . , 2푝}, either (i) 푥푖푗 ≤ 푥푗 so that∣∣푥푗 − 푥푖푗∣∣ = 푥푗 − 푥푖푗 < 1− 푥푖푗 = ∣∣1− 푥푖푗∣∣ , or (4.24) 48 4.2. NP-Hardness of Restricted Problems (ii) 푥푖푗 > 푥푗 so that∣∣푥푗 − 푥푖푗∣∣ = 푥푖푗 − 푥푗 ≤ 푥푖푗 < 12 < 1− 푥푖푗 = ∣∣1− 푥푖푗∣∣ . (4.25) Then the objective function value of (4.22) is strictly improved by setting 푥푗 = 1. Similarly, if 0 ≥ 푥푗 > −1 for some 푗 ∈ {1, . . . , 푞 + 1}, then the objective function value of (4.22) is strictly improved by setting 푥푗 = −1. Lastly, it is readily seen that the size of 푥1, . . . , 푥2푝 (encoded in binary) is polynomial in the size of [퐵 푏] ([12] p.9-10, p.21). Thus the solvability of (1.94) is reducible in polynomial time to the optimization problem (4.22). We now state the main theorem of this Chapter: Theorem 4.5. Let 푤 > 0. The equally weighted maximin optimization problem given by max 푥∈[−1,1]푛 min 푖=1,...,푚 푤 ∥∥푥− 푥푖∥∥2 (4.26) is NP-hard. Proof. By Proposition 4.4, we know that the integer linear program (1.94) is reducible in polynomial time to (4.22). Since (1.94) is NP-hard (Fact 1.76), this implies that (4.22) is NP-hard. Lastly, since (4.22) is simply (4.26) with 푤푖 = 1 for all 푖, we thus have that (4.26) is NP-hard. 4.2 NP-Hardness of Restricted Problems In [30], the author describes an heuristic approach for solving (2.1) based on partitioning 풳 into 푚 Voronoi cells 풳 푖 := { 푥 ∈ 풳 ∣∣∣푤푖 ∥∥푥− 푥푖∥∥2 ≤ 푤푗 ∥∥푥− 푥푗∥∥2 ∀푗 ∕= 푖} 푖 = 1, . . . ,푚, (4.27) and maximizing ∥∥푥− 푥푖∥∥2 over 푥 ∈ 풳 푖, for 푖 = 1, . . . ,푚. However, since (2.1) is NP-hard, at least one of these restricted problems (or subproblems) is NP-hard. In fact, we show below that the restricted problem is NP-hard even when 푤1 = ⋅ ⋅ ⋅ = 푤푚 and 풳 = ℝ푛. Theorem 4.6. We have 49 4.2. NP-Hardness of Restricted Problems 1. 푓(푥) = min 1≤푖≤푚 푤푖 ∥∥푥− 푥푖∥∥2 = ⎧⎨⎩ 푤1 ∥∥푥− 푥1∥∥2 if 푥 ∈ 풳 1 푤2 ∥∥푥− 푥2∥∥2 if 푥 ∈ 풳 2 ... 푤푚 ∥푥− 푥푚∥2 if 푥 ∈ 풳푚 (4.28) where 풳 푖 = { 푥 ∈ 풳 ∣∣∣ 푤푖 ∥∥푥− 푥푖∥∥2 ≤ 푤푗 ∥∥푥− 푥푗∥∥2 ∀ 푗 ∕= 푖} , for 푖 = 1, . . . ,푚. 2. max 푥∈풳 푓(푥) = max { max 푥∈풳 1 푓(푥), . . . , max 푥∈풳푚 푓(푥) } . 3. If 푤1 = ⋅ ⋅ ⋅ = 푤푚, then each 풳 푖 is a convex set, since{ 푥 ∈ 풳 ∣∣∣ 푤푖 ∥∥푥− 푥푖∥∥2 ≤ 푤푗 ∥∥푥− 푥푗∥∥2 , 푗 ∕= 푖} = { 푥 ∈ 풳 ∣∣∣〈푥푗 − 푥푖, 푥〉 ≤ ∥∥푥푗∥∥2 − ∥∥푥푖∥∥2 , 푗 ∕= 푖} . Note that max푥∈풳 푖 푓(푥) is: max 푥∈풳 푖 푓(푥) = max 푥∈풳 푖 ∥∥푥− 푥푖∥∥2 s.t. ∥∥푥− 푥푖∥∥2 ≤ ∥∥푥− 푥푗∥∥2 푗 ∕= 푖. (4.29) Let 푊 ∈ ℤ푞×푞 and assume that 푊 is positive semidefinite and symmet- ric. We have the following Lemma: Lemma 4.7. The MaxCut problem on a graph with 푞 nodes max 푦∈{−1,1}푞 푦푇푊푦, (4.30) is equivalent to the following problem: max 푦∈[−1,1]푞 푦푇 (푊 +퐷) 푦, (4.31) where 퐷 ∈ ℝ푞×푞 is any diagonal matrix chosen so that 푊 + 퐷 is positive definite. 50 4.2. NP-Hardness of Restricted Problems To prove Lemma 4.7 we recall Theorem 1.56, which states that if 퐷 ⊂ ℝ푛 is compact and convex and 푓 : ℝ푛 → ℝ is convex, then max 푥∈퐷 푓 = max 푥∈푒푥푡(퐷) 푓 (4.32) where 푒푥푡(퐷) is the set of extremal points of 퐷. Proof. Note that the set of extremal points of [−1, 1]푞 is {−1, 1}푞. Then max 푦∈[−1,1]푞 푦푇 (푊 +퐷) 푦 = max 푦∈{−1,1}푞 { 푦푇푊푦 + 푦푇퐷푦 } = max 푦∈{−1,1}푞 ⎧⎨⎩푦푇푊푦 + 푞∑ 푗=1 푦2푗퐷푗푗 ⎫⎬⎭ = max 푦∈{−1,1}푞 ⎧⎨⎩푦푇푊푦 + 푞∑ 푗=1 퐷푗푗 ⎫⎬⎭ = max 푦∈{−1,1}푞 푦푇푊푦 + 푞∑ 푗=1 퐷푗푗 . (4.33) Since ∑푞 푗=1퐷푗푗 is constant, it is equivalent to solve max 푦∈{−1,1}푞 푦푇푊푦. (4.34) Now, since 푊 +퐷 is nonsingular, positive definite, by Fact 1.26 we can decompose 푊 + 퐷 = 퐿푇퐿 where 퐿 ∈ ℝ푞×푞 is a nonsingular and upper (or lower) triangular matrix, and make the substitution 푥 = 퐿푦, which leads to the following lemma: Lemma 4.8. Solving (4.31) is equivalent to solving max 푥∈ℝ푞 ∥푥∥2 s.t. 2 ( 푢푖 )푇 푥 ≤ 1, −2 (푢푖)푇 푥 ≤ 1, 푖 = 1, . . . , 푞, (4.35) where 푢푖 denotes the 푖th column of ( 퐿−1 )푇 /2. 51 4.2. NP-Hardness of Restricted Problems Proof. This follows from ( 퐿−1 )푇 = (2푢1, . . . , 2푢푞)⇒ 퐿−1 = ⎡⎢⎢⎢⎣ 2푢푇1 2푢푇2 ... 2푢푇푞 ⎤⎥⎥⎥⎦ (4.36) so that 푥 = 퐿푦 ⇒ 푦 = 퐿−1푥 = ⎡⎢⎢⎢⎣ 2푢푇1 2푢푇2 ... 2푢푇푞 ⎤⎥⎥⎥⎦푥⇒ 푦푖 = 2 (푢푖)푇 푥 푖 = 1, . . . , 푞 (4.37) and ∣푦푖∣ ≤ 1 ∀푖 = 1, . . . , 푞 ⇔ ∣∣∣2 (푢푖)푇 푥∣∣∣ ≤ 1 ∀푖 = 1, . . . , 푞 (4.38) Lastly, it is easily seen that ∥푥∥2 = ⟨퐿푦,퐿푦⟩ = 푦푇 (퐿푇퐿) 푦 = 푦푇 (푊 +퐷) 푦. (4.39) Remark 4.9. 푥 solves (4.35) if and only if 푦 = 퐿−1푥 solves (4.31). Lemma 4.10. Problem (4.35) can be written as max 푥∈ℝ푞 ∥푥∥2 s.t. ∥푥∥2 ≤ ∥∥푥− 푥푖∥∥2 , 푖 = 1, . . . , 2푞. (4.40) where 푢푖 denotes the 푖th column of ( 퐿−1 )푇 /2 and 푥푖 := 휌푖푢 푖, 푥푖+푞 := −휌푖푢푖, with 휌푖 := 1∥푢푖∥2 , 푖 = 1, . . . , 푞. (4.41) Proof. By definition of 푥푖, we have ∥∥푥푖∥∥ = ∥∥휌푖푢푖∥∥ = 1∥푢푖∥2 ∥∥푢푖∥∥ = 1∥푢푖∥ , so that ∥푥∥2 ≤ ∥∥푥− 푥푖∥∥2 = ∥푥∥2 − 2 〈푥, 푥푖〉+ ∥∥푥푖∥∥2 ⇔ 0 ≤ −2 〈푥, 푥푖〉+ 1∥푢푖∥2 = −2 〈푥, 푥푖〉+ 휌푖 ⇔ 2 〈푥, 푥푖〉 ≤ 휌푖, 푖 = 1, . . . , 2푞 (4.42) ⇔ ∣∣2 〈푥, 휌푖푢푖〉∣∣ ≤ 휌푖, 푖 = 1, . . . , 푞 ⇔ ∣∣2 〈푥, 푢푖〉∣∣ ≤ 1. 52 4.2. NP-Hardness of Restricted Problems Theorem 4.11. The MaxCut problem on a graph with 푞 nodes is reducible in polynomial time to the problem (4.40), max 푥∈ℝ푞 ∥푥∥2 s.t. ∥푥∥2 ≤ ∥∥푥− 푥푖∥∥2 , 푖 = 1, . . . , 2푞. (4.43) Proof. By applying Lemmas 4.7, 4.8 and 4.10 we know that the MaxCut problem is reducible to solving (4.40). Since 푊 ≥ 0, 푊 ∈ ℤ푞×푞, and by choosing 퐷 appropriately, we can ensure that 푥1, . . . , 푥2푞 have rational en- tries and the size of 푥1, . . . , 푥2푞 (encoded in binary) is polynomial in the size of 푊 ([12]). Thus MaxCut is reducible in polynomial time to the restricted problem (4.40). Corollary 4.12. The restricted problems of the heuristic method of [30], max 푥∈ℝ푞 ∥푥∥2 s.t. ∥푥∥2 ≤ ∥∥푥− 푥푖∥∥2 , 푖 = 1, . . . , 2푞. (4.44) are NP-hard. Proof. By Theorem 4.11, we know that MaxCut is reducible in polynomial time to the restricted problem (4.40). Thus, since the MaxCut problem is NP-hard (Fact 1.78), we have that (4.40) must be NP-hard. Corollary 4.13. One of the restricted problems max 푥∈풳 푖 푓(푥) must be NP-hard. Proof. Apply Corollary 4.12 and Theorem 4.6 part 2. 53 Chapter 5 Convex Relaxation Based Algorithms Given that (2.1) is NP-hard, see Chapter 4, in general it is very im- portant to explore methods for finding approximate solutions to (2.1). In this chapter, we outline an algorithm which can be used to find approximate solutions to (2.1) using our convex SDP (2.14) and SOCP (2.24) relaxations. 5.1 Auxiliary Results Let 푍∗ ∈ 풮푛+1 be an optimal solution of the SDP relaxation (2.14). Then 푍∗ ∈ 풮푛+1+ so that 푍∗푗푗 ≥ 0 for 푗 = 1, . . . , 푛 + 1. For 푖 = 1, . . . ,푚, let 푏푖 ∈ ℝ푛 be given by 푏푖푗 = √ 푍∗푗푗푥 푖 푗 , 푗 = 1, . . . , 푛 (5.1) where 푥푖, 푖 = 1, . . . ,푚 are the given set of points from (2.1). To prove our main result, Theorem 5.4, we require the following tail estimate by Ben-Tal, Nemirovski, and Roos; see [3]. For completeness, we include the proof. Lemma 5.1. Let 휁 ∈ {−1, 1}푛 be a random vector, componentwise inde- pendent, with Pr (휁푗 = 1) = Pr (휁푗 = −1) = 1 2 ∀푗 = 1, . . . , 푛. (5.2) Then for any 훼 > 0, Pr (( 푏푖 )푇 휁 ≥ √훼 ∥∥푏푖∥∥) ≤ 푒−훼/2, 푖 = 1, . . . ,푚. Proof. (based on proof of Lemma A.3 in [3]): First, note that 휁 is a random variable and 푏푖 is fixed for each 푖, so (푏푖)푇 휁 is a random variable. Since 훼 > 0 and ∥∥푏푖∥∥ are fixed, (푏푖)푇 휁 − √훼 ∥∥푏푖∥∥ is 54 5.1. Auxiliary Results also a random variable. Now, for any random variable 푦 with density function 푓 , for any 휌 ≥ 0 we have 퐸 (푒휌푦) = ∫ 0 −∞ 푒휌푦푓(푦)푑푦 + ∫ ∞ 0 푒휌푦푓(푦)푑푦 ≥ 0 + ∫ ∞ 0 푓(푦)푑푦 (since 푒휌푦 ≥ 1 for 푦 ≥ 0) = Pr (푦 ≥ 0) . (5.3) Hence, substituting 푦 = (푏푖)푇 휁 −√훼 ∥∥푏푖∥∥, we have: Pr ( (푏푖)푇 휁 ≥ √훼 ∥∥푏푖∥∥) = Pr ((푏푖)푇 휁 −√훼 ∥∥푏푖∥∥ ≥ 0) ≤ E ( 푒휌((푏 푖)푇 휁−√훼∥푏푖∥)) = E ( 푒휌((푏 푖)푇 휁) ) 푒−휌 √ 훼∥푏푖∥ (5.4) and thus E ( 푒휌((푏 푖)푇 휁) ) = 푛∏ 푗=1 E ( 푒휌푏 푖 푗휁푗 ) = 푛∏ 푗=1 1 2 ( 푒휌푏 푖 푗 + 푒−휌푏 푖 푗 ) (from 5.2) = 푛∏ 푗=1 cosh (휌푏푖푗) (5.5) Apply the inequality cosh 푡 ≤ 푒 12 푡2 (which holds by Taylor’s series) to obtain E ( 푒휌((푏 푖)푇 휁) ) ≤ 푛∏ 푗=1 푒 1 2 휌2(푏푖푗) 2 = 푒 1 2 휌2( ∑푛 푗=1(푏 푖 푗) 2) = 푒 1 2 휌2∥푏푖∥2 . Hence, we have for 휌 ≥ 0 Pr ( (푏푖)푇 휁 ≥ √훼 ∥∥푏푖∥∥) ≤ E(푒휌((푏푖)푇 휁)) 푒−휌√훼∥푏푖∥ ≤ ( 푒 1 2 휌2∥푏푖∥2) 푒−휌√훼∥푏푖∥ = 푒 1 2 휌2∥푏푖∥2−휌√훼∥푏푖∥. (5.6) 55 5.2. SDP Relaxation Based Algorithm The right hand side of this equation is minimized at 휌 = √ 훼/ ∥∥푏푖∥∥, with 푒 1 2 ( √ 훼/∥푏푖∥)2∥푏푖∥2−(√훼/∥푏푖∥)√훼∥푏푖∥ = 푒−훼2 (5.7) Thus, we have Pr (( 푏푖 )푇 휁 ≥ √훼 ∥∥푏푖∥∥) ≤ 푒−훼2 (5.8) as required. 5.2 SDP Relaxation Based Algorithm In this section, we consider the case of (2.6) and propose an algorithm for constructing an approximate solution of (2.1) based on the SDP relaxation (2.14), with an approximation bound of 1−√훼훾∗ 2 휐푐푝. (5.9) We will see in Section 6.1 that, in the case of 풳 = [−1, 1]푛, this yields an approximation bound that is uniformly bounded away from zero as 푛→∞. First we give some elementary properties of optimal solutions to the SDP relaxation. Lemma 5.2 (box case). Let 풦 = {푦 ∈ ℝ푛+1 ∣푦푗 ≤ 푦푛+1, 푗 = 1, . . . , 푛} . (5.10) If the SDP 1 휐푐푝 = min 푍 푍푛+1,푛+1 s.t. 푤푖 〈 퐴푖, 푍 〉 ≥ 1, 푖 = 1, . . . ,푚 (푍11, . . . , 푍푛푛, 푍푛+1,푛+1) 푇 ∈ 풦 푍 ર 0. has a solution, then it has a solution 푍∗ ∈ 풮푛+1 such that 푍∗푗푗 = 푍 ∗ 푛+1,푛+1 ∀ 푗 = 1, . . . , 푛. In particular, 푍∗푗푗 > 0 for 푗 = 1, . . . , 푛. 56 5.2. SDP Relaxation Based Algorithm Proof. The constraint (푍11, . . . , 푍푛푛, 푍푛+1,푛+1) 푇 ∈ 풦 gives 푍푗푗 ≤ 푍푛+1,푛+1 ∀ 푗 = 1, . . . , 푛. (5.11) The constraint 푤푖 〈 퐴푖, 푍 〉 ≥ 1 gives 푤푖 ( 푛∑ 푗=1 푍푗푗 + ∥∥푥푖∥∥2 푍푛+1,푛+1 − 2 (푥푖)푇 푧∗) ≥ 1. (5.12) If 푍 is a feasible solution, by (5.11) Diag(푍푛+1,푛+1 − 푍11, . . . , 푍푛+1,푛+1 − 푍푛푛, 0) ર 0. Then 푍̃ := 푍 + Diag(푍푛+1,푛+1 − 푍11, . . . , 푍푛+1,푛+1 − 푍푛푛, 0) ર 0 and also verifies (5.11) and (5.12), so 푍̃ is feasible. Moreover, 푍̃푗푗 = 푍푛+1,푛+1 ∀ 푗 = 1, . . . , 푛. Hence any optimal solution 푍∗ to the SDP can be modified so that 푍∗푗푗 = 푍 ∗ 푛+1,푛+1 ∀ 푗 = 1, . . . , 푛. The proof is complete by using Lemma 2.8. Lemma 5.3 (ball case). Let 풦 = {푦 ∈ ℝ푛+1∣ 푛∑ 푗=1 푦푗 ≤ 푦푛+1} (5.13) If the SDP 1 휐푐푝 = min 푍 푍푛+1,푛+1 s.t. 푤푖 〈 퐴푖, 푍 〉 ≥ 1, 푖 = 1, . . . ,푚 (푍11, . . . , 푍푛푛, 푍푛+1,푛+1) 푇 ∈ 풦 푍 ર 0. has a solution, then it has a solution 푍∗ ∈ 풮푛+1 such that 푛∑ 푗=1 푍∗푗푗 = 푍 ∗ 푛+1,푛+1. In particular, ∑푛 푗=1 푍 ∗ 푗푗 > 0. 57 5.2. SDP Relaxation Based Algorithm Proof. The constraint (푍11, . . . , 푍푛푛, 푍푛+1,푛+1) 푇 ∈ 풦 gives 푛∑ 푗=1 푍푗푗 ≤ 푍푛+1,푛+1. (5.14) The constraint 푤푖 〈 퐴푖, 푍 〉 ≥ 1 gives 푤푖 ( 푛∑ 푗=1 푍푗푗 + ∥∥푥푖∥∥2 푍푛+1,푛+1 − 2 (푥푖)푇 푧∗) ≥ 1. (5.15) If 푍 is a feasible solution, by (5.14) Diag(푍푛+1,푛+1 − 푛∑ 푗=1 푍푗푗 , 0, . . . , 0) ર 0. Then 푍̃ := 푍 + Diag(푍푛+1,푛+1 − 푛∑ 푗=1 푍푗푗 , 0, . . . , 0) ર 0 and also verifies (5.14) and (5.15), so 푍̃ is feasible. Moreover, 푛∑ 푗=1 푍̃푗푗 = 푍푛+1,푛+1. Hence any optimal solution 푍∗ to the SDP can be modified so that 푛∑ 푗=1 푍∗푗푗 = 푍 ∗ 푛+1,푛+1. The proof is complete by using Lemma 2.8. Theorem 5.4. Let 푍∗ ∈ 풮푛+1 be an optimal solution of the SDP-relaxation problem (2.14) 1 휐푐푝 = min 푍 푍푛+1,푛+1 s.t. 푤푖 〈 퐴푖, 푍 〉 ≥ 1, 푖 = 1, . . . ,푚 (푍11, . . . , 푍푛푛, 푍푛+1,푛+1) 푇 ∈ 풦 푍 ર 0 and define 훾∗ := max푗=1,...,푛 푍 ∗ 푗푗∑푛 푗=1 푍 ∗ 푗푗 . Then there exists a feasible solution 푥̃ for the original optimization problem (2.1) such that 58 5.2. SDP Relaxation Based Algorithm 1. 푥̃ = ⎛⎝ √푍∗11휁1√ 푍∗푛+1,푛+1 , √ 푍∗22휁2√ 푍∗푛+1,푛+1 , ⋅ ⋅ ⋅ , √ 푍∗푛푛휁푛√ 푍∗푛+1,푛+1 ⎞⎠ (5.16) where 휁 = (휁1, . . . , 휁푛) satisfies 휁 ∈ {−1, 1}푛 and (푏푖)푇 휁 < √ 훼 ∥∥푏푖∥∥, and 훼 = 2 ln (푚/휌). 2. 푓 (푥̃) = min 푖=1,...,푚 푤푖 ∥∥푥̃− 푥푖∥∥2 ≥ 1−√훼훾∗ 2 휐푐푝. (5.17) 3. 푓(푥̃) ≥ 1− √ 훼훾∗ 2 휐푝. Before proving Theorem 5.4 we provide some Propositions and defini- tions. Recall that in Section 5.1 we let 푍∗ ∈ 풮푛+1 be an optimal solution of the SDP relaxation (2.14), and for 푖 = 1, . . . ,푚, we let 푏푖 ∈ ℝ푛 be given by 푏푖 = (√ 푍∗11푥 푖 1, . . . , √ 푍∗푛푛푥 푖 푛 ) . (5.18) Proposition 5.5. Fix any 0 < 휌 < 1, and set 훼 = 2 ln (푚/휌). Then Pr (( 푏푖 )푇 휁 ≥ √훼 ∥∥푏푖∥∥) ≤ 휌 푚 , 푖 = 1, . . . ,푚. (5.19) Proof. Indeed, by Lemma 5.1 we have Pr (( 푏푖 )푇 휁 ≥ √훼 ∥∥푏푖∥∥) ≤ 푒−(2 ln 푚휌 )/2 = 푒 − ln 푚 휌 = 푒ln 휌 푚 = 휌 푚 . (5.20) Proposition 5.6. Fix any 0 < 휌 < 1, and set 훼 = 2 ln (푚/휌). We have Pr (( 푏푖 )푇 휁 < √ 훼 ∥∥푏푖∥∥ , 푖 = 1, . . . ,푚) ≥ 1− 휌 > 0, (5.21) so that there exists a 휁 ∈ {−1, 1}푛 satisfying( 푏푖 )푇 휁 < √ 훼 ∥∥푏푖∥∥ , 푖 = 1, . . . ,푚. (5.22) 59 5.2. SDP Relaxation Based Algorithm Proof. Consider: Pr (( 푏푖 )푇 휁 < √ 훼 ∥∥푏푖∥∥ , 푖 = 1, . . . ,푚) = 1− Pr (( 푏푖 )푇 휁 ≥ √훼 ∥∥푏푖∥∥ for some 푖 ∈ {1, . . . ,푚}) ≥ 1− 푚∑ 푖=1 Pr (( 푏푖 )푇 휁 ≥ √훼 ∥∥푏푖∥∥) ≥ 1− 푚∑ 푖=1 휌 푚 (Apply Proposition 5.5) = 1− 휌 > 0. (5.23) How do we find the random vector 휉 verifying (5.22)? Such a 휁 can be found by repeated random sampling, and the probability of finding such a 휁 after 푁 samples is at least 1 − 휌푁 . That is, the probability that we do not find such a 휁 in each sample is less than or equal to 휌, so the probability of finding such a 휁 in 푁 samples is greater than or equal to 1−휌푁 . Now, set 푧̃ ∈ ℝ푛 and 푧̃푛+1 according to 푧̃푗 = √ 푍∗푗푗휁푗 , 푗 = 1, . . . , 푛, 푧̃푛+1 = √ 푍∗푛+1,푛+1. (5.24) Then 푧̃ = ( √ 푍∗11휁1, . . . , √ 푍∗푛푛휁푛), (5.25)( 푧̃21 , . . . , 푧̃ 2 푛, 푧̃ 2 푛+1 )푇 = ( 푍∗11, . . . , 푍 ∗ 푛푛, 푍 ∗ 푛+1,푛+1 )푇 ∈ 풦. We will show that ∥∥푧̃ − 푥푖푧̃푛+1∥∥2 is not too small for all 푖, yielding a non- trivial approximation bound. Proposition 5.7. With 푧̃ given in (5.24), we have ∥∥푧̃ − 푥푖푧̃푛+1∥∥2 = 푛∑ 푗=1 푍∗푗푗 − 2 ( 푏푖 )푇 휁 √ 푍∗푛+1,푛+1 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 (5.26) For 푖 = 1, . . . ,푚. Proof. First,∥∥푧̃ − 푥푖푧̃푛+1∥∥2 = ∥푧̃∥2 − 2 (푥푖)푇 푧̃푧̃푛+1 + ∥∥푥푖∥∥2 푧̃2푛+1. (5.27) 60 5.2. SDP Relaxation Based Algorithm Then apply (5.1): 푏푖푗 = √ 푍∗푗푗푥 푖 푗 ⇒ 푥푖푗 = 푏푖푗√ 푍∗푗푗 (5.28) and (푥푖)푇 푧̃ = 푛∑ 푗=1 푥푖푗 푧̃푗 = 푛∑ 푗=1 푏푖푗√ 푍∗푗푗 √ 푍∗푗푗휁푗 = (푏 푖)푇 휁 (5.29) to get the desired result. Now, define 훾∗ := max푗=1,...,푛 푍 ∗ 푗푗∑푛 푗=1 푍 ∗ 푗푗 . (5.30) Proposition 5.8. We have 푛∑ 푗=1 푍∗푗푗 − 2 ( 푏푖 )푇 휁 √ 푍∗푛+1,푛+1 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 (5.31) ≥ 푛∑ 푗=1 푍∗푗푗 − 2 √√√⎷훼훾∗ 푛∑ 푗=1 푍∗푗푗 ∥∥푥푖∥∥√푍∗푛+1,푛+1 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1. Proof. First, since 휁 satisfies ( 푏푖 )푇 휁 < √ 훼 ∥∥푏푖∥∥ , 푖 = 1, . . . ,푚, we have that 푛∑ 푗=1 푍∗푗푗 − 2 ( 푏푖 )푇 휁 √ 푍∗푛+1,푛+1 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 ≥ 푛∑ 푗=1 푍∗푗푗 − 2 √ 훼 ∥∥푏푖∥∥√푍∗푛+1,푛+1 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1. (5.32) Furthermore, we have∥∥푏푖∥∥2 = 푛∑ 푗=1 푍∗푗푗 ( 푥푖푗 )2 ≤ ( max 푗=1,...,푛 푍∗푗푗 ) 푛∑ 푗=1 (푥푖푗) 2 = ( max 푗=1,...,푛 푍∗푗푗 )∥∥푥푖∥∥2 = ⎛⎝훾∗ 푛∑ 푗=1 푍∗푗푗 ⎞⎠∥∥푥푖∥∥2 (5.33) 61 5.2. SDP Relaxation Based Algorithm by (5.28) and (5.30). Then 푛∑ 푗=1 푍∗푗푗 − 2 √ 훼 ∥∥푏푖∥∥√푍∗푛+1,푛+1 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 (5.34) ≥ 푛∑ 푗=1 푍∗푗푗 − 2 √√√⎷훼훾∗ 푛∑ 푗=1 푍∗푗푗 ∥∥푥푖∥∥√푍∗푛+1,푛+1 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1. Combining equations (5.32) and (5.33) gives the desired result. Proposition 5.9. We have 푛∑ 푗=1 푍∗푗푗 − 2 √√√⎷훼훾∗ 푛∑ 푗=1 푍∗푗푗 ∥∥푥푖∥∥√푍∗푛+1,푛+1 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 ≥ ( 1− √ 훼훾∗ )⎛⎝ 푛∑ 푗=1 푍∗푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 ⎞⎠ . (5.35) Proof. First, 푛∑ 푗=1 푍∗푗푗 − 2 √√√⎷훼훾∗ 푛∑ 푗=1 푍∗푗푗 ∥∥푥푖∥∥√푍∗푛+1,푛+1 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 (5.36) = ( 1− √ 훼훾∗ )⎛⎝ 푛∑ 푗=1 푍∗푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 ⎞⎠+√훼훾∗ ⎛⎝√√√⎷ 푛∑ 푗=1 푍∗푗푗 − ∥∥푥푖∥∥√푍∗푛+1,푛+1 ⎞⎠2 which can be seen by simply expanding and grouping terms: ( 1− √ 훼훾∗ )⎛⎝ 푛∑ 푗=1 푍∗푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 ⎞⎠+√훼훾∗ ⎛⎝√√√⎷ 푛∑ 푗=1 푍∗푗푗 − ∥∥푥푖∥∥√푍∗푛+1,푛+1 ⎞⎠2 = (1− √ 훼훾∗) 푛∑ 푗=1 푍∗푗푗 + (1− √ 훼훾∗) ∥∥푥푖∥∥2 푍∗푛+1,푛+1 + √ 훼훾∗ ⎛⎝ 푛∑ 푗=1 푍∗푗푗 − 2 √√√⎷ 푛∑ 푗=1 푍∗푗푗 ∥∥푥푖∥∥√푍∗푛+1,푛+1 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 ⎞⎠ = 푛∑ 푗=1 푍∗푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 − 2 √√√⎷훼훾∗ 푛∑ 푗=1 푍∗푗푗 ∥∥푥푖∥∥√푍∗푛+1,푛+1. (5.37) 62 5.2. SDP Relaxation Based Algorithm Then simply note that √ 훼훾∗ ⎛⎝√√√⎷ 푛∑ 푗=1 푍∗푗푗 − ∥∥푥푖∥∥√푍∗푛+1,푛+1 ⎞⎠2 ≥ 0. (5.38) and hence ( 1− √ 훼훾∗ )⎛⎝ 푛∑ 푗=1 푍∗푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 ⎞⎠+√훼훾∗ ⎛⎝√√√⎷ 푛∑ 푗=1 푍∗푗푗 − ∥∥푥푖∥∥√푍∗푛+1,푛+1 ⎞⎠2 ≥ ( 1− √ 훼훾∗ )⎛⎝ 푛∑ 푗=1 푍∗푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 ⎞⎠ . (5.39) Next, we provide results that will help to establish our lower bound. Proposition 5.10. For each 푖 = 1, . . . ,푚, 1 푤푖 ≤ 2 ⎛⎝ 푛∑ 푗=1 푍∗푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 ⎞⎠ . (5.40) Proof. Recall that in (2.14) our first constraint is that 푤푖 〈 퐴푖, 푍 〉 ≥ 1 or 1 푤푖 ≤ 〈퐴푖, 푍〉 for 푖 = 1, . . . ,푚. Consider the following sequence of inequali- ties, which we will justify later: 1 푤푖 ≤ 〈퐴푖, 푍∗〉 = 〈[ 퐼 −푥푖 − (푥푖)푇 ∥∥푥푖∥∥2 ] , 푍∗ 〉 = 푛∑ 푗=1 푍∗푗푗 − 2 ( 푥푖 )푇 푧∗ + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 (5.41) ≤ 2 ⎛⎝ 푛∑ 푗=1 푍∗푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 ⎞⎠ . (5.42) For (5.41), let 푧∗ = ( 푍∗1,푛+1, . . . , 푍 ∗ 푛,푛+1 )푇 63 5.2. SDP Relaxation Based Algorithm and 푍∗∗ = ⎡⎢⎣ 푍 ∗ 11 ∗∗ . . . ∗∗ 푍∗푛푛 ⎤⎥⎦ so that we can write 푍∗ = [ 푍∗∗ 푧∗ (푧∗)푇 푍∗푛+1,푛+1 ] . (5.43) Now consider the trace: 푇푟 ([ 퐼 −푥푖 − (푥푖)푇 ∥∥푥푖∥∥2 ] 푍∗ ) 푇푟 ([ 퐼 −푥푖 − (푥푖)푇 ∥∥푥푖∥∥2 ] [ 푍∗∗ 푧∗ (푧∗)푇 푍∗푛+1,푛+1 ]) = 푇푟 ([ 푍∗∗ − 푥푖(푧∗)푇 푧∗ − 푥푖푍∗푛+1,푛+1 − (푥푖)푇 푍∗∗ + ∥∥푥푖∥∥2 (푧∗)푇 −(푥푖)푇 푧∗ + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 ]) = 푇푟 ( 푍∗∗ − 푥푖(푧∗)푇 )+ 푇푟 (−(푥푖)푇 푧∗ + ∥∥푥푖∥∥2 푍∗푛+1,푛+1) = ⎛⎝ 푛∑ 푗=1 푍∗푗푗 − (푥푖)푇 푧∗ ⎞⎠+ (−(푥푖)푇 푧∗ + ∥∥푥푖∥∥2 푍∗푛+1,푛+1) = 푛∑ 푗=1 푍∗푗푗 − 2(푥푖)푇 푧∗ + ∥∥푥푖∥∥2 푍∗푛+1,푛+1. (5.44) For (5.42), we have that 푍∗ ર 0 so that[ 푍∗푗푗 푍 ∗ 푗,푛+1 푍∗푗,푛+1 푍 ∗ 푛+1,푛+1 ] ર 0, 푗 = 1, . . . , 푛. (5.45) Then ( 푍∗푗푗 ) ( 푍∗푛+1,푛+1 )− (푍∗푗,푛+1)2 ≥ 0 for 푗 = 1, . . . , 푛 (5.46) 64 5.3. Proof of Theorem 5.4: SDP Relaxation Based Algorithm and thus 2 ∣∣∣(푥푖)푇 푧∗∣∣∣ ≤ 2∥∥푥푖∥∥ ∥푧∗∥ (Cauchy-Schwarz) = 2 ∥∥푥푖∥∥ √√√⎷ 푛∑ 푗=1 ( 푍∗푗,푛+1 )2 (5.47) ≤ 2∥∥푥푖∥∥ √√√⎷ 푛∑ 푗=1 푍∗푗푗푍 ∗ 푛+1,푛+1 (see (5.46)) = 2 (∥∥푥푖∥∥√푍∗푛+1,푛+1) ⎛⎝√√√⎷ 푛∑ 푗=1 푍∗푗푗 ⎞⎠ ≤ ∥∥푥푖∥∥2 푍∗푛+1,푛+1 + 푛∑ 푗=1 푍∗푗푗 (2푎푏 ≤ 푎2 + 푏2). Hence, 푛∑ 푗=1 푍∗푗푗 − 2 ( 푥푖 )푇 푧∗ + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 ≤ 푛∑ 푗=1 푍∗푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 + 푛∑ 푗=1 푍∗푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 = 2 ⎛⎝ 푛∑ 푗=1 푍∗푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 ⎞⎠ . (5.48) 5.3 Proof of Theorem 5.4: SDP Relaxation Based Algorithm Using the results of the previous section (5.2) we now provide the proof of Theorem 5.4. Proof. First, by combining Propositions 5.7, 5.8 and 5.9, we have ∥∥푧̃ − 푥푖푧̃푛+1∥∥2 ≥ (1−√훼훾∗) ⎛⎝ 푛∑ 푗=1 푍∗푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 ⎞⎠ . (5.49) 65 5.3. Proof of Theorem 5.4: SDP Relaxation Based Algorithm Combining the equation (5.49) with Proposition 5.10 we have that min 푖=1,...,푚 푤푖 ∥∥푧̃ − 푥푖푧̃푛+1∥∥2 ≥ (1− √ 훼훾∗) (∑푛 푗=1 푍 ∗ 푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1) 2 (∑푛 푗=1 푍 ∗ 푗푗 + ∥푥푖∥2 푍∗푛+1,푛+1 ) = 1−√훼훾∗ 2 . (5.50) Thus 푥̃ ∈ ℝ푛 given by 푥̃ := 푧̃ 푧̃푛+1 , (5.51) belongs to 풳 given by (2.6), since, by (5.24) ( 푥̃21, . . . , 푥̃ 2 푛 ) = ( 푍∗11 푍∗푛+1,푛+1 , . . . , 푍∗푛푛 푍∗푛+1,푛+1 ) , (5.52) and ( 푍∗11, . . . , 푍∗푛푛, 푍∗푛+1,푛+1 ) ∈ 풦 by feasibility. Since 풦 is a convex cone, 1 푍∗푛+1,푛+1 ( 푍∗11, . . . , 푍 ∗ 푛푛, 푍 ∗ 푛+1,푛+1 ) ∈ 풦, (5.53) so that ( 푥̃21, . . . , 푥̃ 2 푛, 1 ) ∈ 풦 and thus 푥̃ ∈ 풳 . This establishes part 1. Furthermore, 푥̃ satisfies 푓 (푥̃) = min 푖=1,...,푚 푤푖 ∥∥푥̃− 푥푖∥∥2 ≥ 1−√훼훾∗ 2 1 푧̃2푛+1 = 1−√훼훾∗ 2 휐푐푝, (5.54) since, by (5.49), min 푖=1,...,푚 푤푖 ∥∥푥̃− 푥푖∥∥2 = min 푖=1,...,푚 푤푖 ∥∥∥∥ 푧̃푧̃푛+1 − 푥푖 ∥∥∥∥2 = min 푖=1,...,푚 푤푖 푧̃2푛+1 ∥∥푧̃ − 푥푖푧̃푛+1∥∥2 ≥ 1− √ 훼훾∗ 2 1 푧̃2푛+1 . (5.55) Now, 푍∗ is an optimal solution to (2.14) and 푧̃2푛+1 = 푍∗푛+1,푛+1, so that by (2.14) 1 푧̃2푛+1 = 1 푍∗푛+1,푛+1 = 휐푐푝. 66 5.4. Examples on the Lower Bound 훾∗ Thus, we have shown that there exists a feasible solution 푥̃ of the original problem (2.1) such that 푓 (푥̃) ≥ 1− √ 훼훾∗ 2 휐푐푝 (5.56) as required in part 2. Finally, part 3 follows by combining part 2 and Theorem 2.11. Remark 5.11. Notice that we do not need 푍∗ ર 0 but only (5.45) to estab- lish (5.17). Since (5.45) is equivalent to second-order cone constraints, see section 2.2.2 or [3, 25], it suffices to solve a second-order program (SOCP) instead of an SDP, which is computationally much more expensive. To summarize, the key point is to find ( 푍∗11, . . . , 푍∗푛+1,푛+1 ) using the SDP or SOCP relaxations. Upon obtaining ( 푍∗11, . . . , 푍∗푛+1,푛+1 ) , we can construct a feasible solution 푥̃ of (2.1) as follows: 1. Find ( 푍∗11, . . . , 푍∗푛+1,푛+1 ) , which is obtained from a solution 푍∗ to the SDP (2.14) or SOCP (2.24) relaxation. 2. Generate 휁 ∈ {−1, 1}푛 as in Proposition 5.6. 3. Set 푥̃(휁) = 푧̃ 푧̃푛+1 = 1√ 푍∗푛+1,푛+1 (√ 푍∗11휁1, . . . , √ 푍∗푛푛휁푛 ) . (5.57) 5.4 Examples on the Lower Bound 훾∗ We would like 훾∗ to be small for the bound (5.17) to be tight. In this section we provide concrete examples of calculating the lower bound 훾∗. Example 5.12. Let 풳 = {−1, 1}푛, corresponding to 풦 = {푦 ∈ ℝ푛+1 ∣푦푗 = 푦푛+1, 푗 = 1, . . . , 푛} . (5.58) As ( 푌 ∗11, . . . , 푌 ∗푛푛, 푌 ∗푛+1,푛+1 )푇 ∈ 풦, in particular, 푌 ∗ ≽ 0, we have 푌 ∗푗푗 = 푌 ∗ 푛+1,푛+1, 푗 = 1, . . . , 푛, (5.59) so that 훾∗ = max푗=1,...,푛 푌 ∗ 푗푗∑푛 푗=1 푌 ∗ 푗푗 = 푌 ∗푛+1,푛+1 푛푌 ∗푛+1,푛+1 = 1 푛 . (5.60) 67 5.4. Examples on the Lower Bound 훾∗ Example 5.13. Suppose 푛 is even, and let 풳 = {푥 ∈ ℝ푛 ∣∣푥2푗−1 + 푥2푗 = 1, 푗 = 2, 4, . . . , 푛} . This corresponds to 풦 = {푦 ∈ ℝ푛+1 ∣푦푗−1 + 푦푗 = 푦푛+1, 푗 = 2, 4, . . . , 푛} . (5.61) As ( 푌 ∗11, . . . , 푌 ∗푛푛, 푌 ∗푛+1,푛+1 )푇 ∈ 풦, in particular 푌 ∗ ≽ 0, we have 푛∑ 푗=1 푌 ∗푗푗 = (푌 ∗ 11 + 푌 ∗ 22) + ⋅ ⋅ ⋅+ ( 푌 ∗푛−1,푛−1 + 푌 ∗ 푛푛 ) = 푛 2 푌 ∗푛+1,푛+1 (5.62) and 푌 ∗푗푗 ≤ 푌 ∗푛+1,푛+1, 푗 = 1, . . . , 푛. (5.63) Thus, 훾∗ = max푗=1,...,푛 푌 ∗ 푗푗∑푛 푗=1 푌 ∗ 푗푗 ≤ 푌 ∗ 푛+1,푛+1 푛 2푌 ∗ 푛+1,푛+1 = 2 푛 . (5.64) More examples of 훾∗ will be given in Chapter 6. 68 Chapter 6 Numerics and Examples In this chapter we begin by considering two specific cases of the weighted maximin problem, and use the results of section 5.2 to derive approximation bounds. In the last section, we numerically compute the SDP and SOCP approximation bounds for a few specific examples, and (where applicable) compare these results to an alternate approximate solution based on a grid search method. 6.1 Approximation Bounds: Box Case Lemma 6.1. If 풳 = [−1, 1]푛 so that 풦 is given by (2.7), i.e. 풦 = {푦 ∈ ℝ푛+1 ∣ 푦푗 ≤ 푦푛+1, 푗 = 1, . . . , 푛} then the relaxation problem (2.14) given by 1 휐푐푝 = min 푍 푍푛+1,푛+1 s.t. 푤푖 〈 퐴푖, 푍 〉 ≥ 1, 푖 = 1, . . . ,푚 (푍11, . . . , 푍푛푛, 푍푛+1,푛+1) 푇 ∈ 풦 푍 ર 0. (6.1) has at least one optimal solution 푍∗ with 푍∗푗푗 = 푍 ∗ 푛+1,푛+1 for 푗 = 1, . . . , 푛, corresponding to which 훾∗ = 1/푛. Proof. Since 풳 = [−1, 1]푛 which is compact, (2.14) has an optimal solution 푍∗. Suppose that 푍 ∗̄ 푗푗̄ < 푍∗푛+1,푛+1 for some 푗̄ ∈ {1, . . . , 푛}. Then, for 푖 = 1, . . . ,푚, by (5.44) we have〈 퐴푖, 푍∗ 〉 = 푛∑ 푗=1 푍∗푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 − 2 (푥푖)푇 푧∗, which is increasing with 푍 ∗̄ 푗푗̄ . Thus, increasing 푍 ∗̄ 푗푗̄ to 푍∗푛+1,푛+1 will maintain all constraints in (2.14) to be satisfied while not changing the objective function value. Hence, 푍∗푗푗 = 푍 ∗ 푛+1,푛+1 ∀푗 = 1, . . . , 푛, and so 훾∗ = 1푛 as in Example 5.12. 69 6.2. Approximation Bounds: Ball Case It follows from 훼 = 2 ln (푚/휌), (5.17) and Lemma 6.1 that the feasible solution 푥̃ of (2.1) found by the SDP-based algorithm in Section 5.2 satisfies 푓(푥̃) ≥ 1− √ 2 ln (푚/휌)/푛 2 휐푐푝. (6.2) Thus, 1 ≥ 푓(푥̃)/휐푐푝 ≥ 1− √ 훽 2 whenever 2 ln (푚/휌)/푛 ≤ 훽 < 1. As we noted at the end of Section 5.2, the approximation bound (6.2) still holds when we further relax the SDP constraint 푍 ર 0 in (2.14) to the SOCP constraints. To our knowledge, this is the first nontrivial approximation bound for an NP-hard problem based on SOCP relaxation. Theorem 6.2. If 풦 is given by (2.7), then when 푛→∞, lim inf 푛→∞ 푓(푥̃) 휐푐푝 ≥ 1 2 . (6.3) Moreover, lim inf 푛→∞ 푓(푥̃) 휐푝 ≥ 1 2 . Proof. This follows from Lemma 6.1 and the fact that as 푛→∞, (2 ln (푚/휌)/푛)→ 0 so that lim inf 푛→∞ 푓(푥̃) 휐푐푝 = lim inf 푛→∞ 1−√2 ln (푚/휌)/푛 2 ≥ 1 2 . (6.4) As 휐푝 ≤ 휐푐푝 and 푓(푥̃) > 0, we have 푓(푥̃) 휐푝 ≥ 푓(푥̃) 휐푐푝 so that lim inf 푛→∞ 푓(푥̃) 휐푝 ≥ lim inf 푛→∞ 푓(푥̃) 휐푐푝 ≥ 1 2 . 6.2 Approximation Bounds: Ball Case In this section we consider the case of 풳 = {푥 ∈ ℝ푛 ∣∥푥∥ ≤ 1} , 70 6.2. Approximation Bounds: Ball Case and derive analogous results for an approximation bound as we did in sec- tion 6.1. As noted in section 5.2, we want 훾∗ to be small for the bound to be tight, so we begin by placing a lower bound on 훾∗. Lemma 6.3. If 풳 = {푥 ∈ ℝ푛 ∣∥푥∥ ≤ 1} is a Euclidean ball so that 풦 = {푦 ∈ ℝ푛+1 ∣푦1 + 푦2 + ⋅ ⋅ ⋅+ 푦푛 ≤ 푦푛+1} as in (2.8), then 훾∗ ≥ 1/푛. Furthermore, 훾∗ = 1/푛 when 푍∗푗푗 = 1푛푍∗푛+1,푛+1 for 푗 = 1, . . . , 푛. Proof. In order to minimize 훾∗ = max푗=1,...,푛 푍 ∗ 푗푗∑푛 푗=1 푍 ∗ 푗푗 we want to maximize ∑푛 푗=1 푍 ∗ 푗푗 while minimizing max푗=1,...,푛 푍 ∗ 푗푗 . From the definition of 풦 it obvious that∑푛 푗=1 푍 ∗ 푗푗 is maximized at 푍 ∗ 푛+1,푛+1. Now, setting 푍 ∗ 푗푗 = 1 푛푍 ∗ 푛+1,푛+1 for 푗 = 1, . . . , 푛 yields ∑푛 푗=1 푍 ∗ 푗푗 = 푍 ∗ 푛+1,푛+1, and 훾∗ = max푗=1,...,푛 푍 ∗ 푗푗∑푛 푗=1 푍 ∗ 푗푗 = 1 푛푍 ∗ 푛+1,푛+1 푍∗푛+1,푛+1 = 1 푛 . (6.5) To show that this solution minimizes 훾∗, suppose that 푍∗푗푗 = 1 휆푗 푍∗푛+1,푛+1 for 푗 = 1, . . . , 푛 so that ∑푛 푗=1 푍 ∗ 푗푗 = 푍 ∗ 푛+1,푛+1, but assume that 1 휆푗 < 1푛 for some 푗 ∈ {1, 2, . . . , 푛}. Then we must have 1휆푘 > 1푛 for some 푘 ∕= 푗 in 1, . . . , 푛 (in order that ∑푛 푗=1 푍 ∗ 푗푗 = 푍 ∗ 푛+1,푛+1 is still satisfied). Thus, 훾∗ = max푗=1,...,푛 푍 ∗ 푗푗∑푛 푗=1 푍 ∗ 푗푗 = 푍∗푛+1,푛+1 max푗=1,...,푛 1 휆푗 푍∗푛+1,푛+1 > 1 푛푍 ∗ 푛+1,푛+1 푍∗푛+1,푛+1 > 1 푛 . (6.6) Then we have an analogous result to Lemma 6.1 for the Euclidean ball bounded case: Lemma 6.4. If 풦 is given by (2.8), then (2.14) has an optimal solution 푍∗ with 푍∗푗푗 = 1 푛푍 ∗ 푛+1,푛+1 for 푗 = 1, . . . , 푛 corresponding to which 훾 ∗ = 1/푛. Proof. Since 풳 is compact, (2.14) has an optimal solution, 푍∗. We know that 〈 퐴푖, 푍∗ 〉 = 푛∑ 푗=1 푍∗푗푗 + ∥∥푥푖∥∥2 푍∗푛+1,푛+1 − 2 (푥푖)푇 푧∗ (6.7) 71 6.3. Where the SDP Relaxation Fails is increasing with ∑푛 푗=1 푍 ∗ 푗푗 . Then setting 푍 ∗ 푗푗 = 1 푛푍 ∗ 푛+1,푛+1 maximizes∑푛 푗=1 푍 ∗ 푗푗 at 푍 ∗ 푛+1,푛+1, which maintains all constraints of (2.14) while not changing the objective function value. Thus, applying Lemma 6.3, we have 훾∗ = 1/푛. Then, as in section 6.1, we can set 훼 = 2 ln (푚/휌), and apply (5.17) and Lemma 6.4 so that the feasible solution 푥̂ of (2.1) found by the SDP-based algorithm in Section 5.2 satisfies 푓(푥̂) ≥ 1− √ 2 ln (푚/휌)/푛 2 휐푐푝. (6.8) Theorem 6.5. If 풦 is given by (2.8), then when 푛→∞, lim inf 푛→∞ 푓(푥̃) 휐푐푝 ≥ 1 2 . (6.9) Moreover, lim inf 푛→∞ 푓(푥̃) 휐푝 ≥ 1 2 . The proof is analogous to the proof of Theorem 6.2. 6.3 Where the SDP Relaxation Fails In this section we consider the approximation bound (6.2) for the case that 푛 is small and 푚 is large. The following example shows that the SDP relaxation (2.14) can be a poor approximation of (2.1) in such a case, so (6.2) cannot be significantly improved upon. Example 6.6. Suppose 푛 = 1 (i.e. in ℝ), 푤푖 = 1, and 푥푖 = 2(푖−1)푚 − 1, for 푖 = 1, . . . ,푚 + 1. Since 푥1, . . . , 푥(푚+1) form a grid of equal spacing on 풳 = [−1, 1], we will have an optimal point (call it 푥∗), at the halfway point between any two 푥푖. For convenience, choose 푥1 = −1 and 푥2 = −1 + 2푚 , so that 푥∗ = −1 + 1푚 . Then the optimal solution is given by 휐푝 = ∣∣푥∗ − 푥1∣∣2 = ∣∣∣∣−1 + 1푚 − (−1) ∣∣∣∣2 = ∣∣∣∣−1푚 ∣∣∣∣2 = 1 푚2 . 72 6.3. Where the SDP Relaxation Fails On the other hand, 푍 = 퐼 is a feasible solution of (2.14) with objective function value of 1. To see that 푍 = 퐼2×2 is feasible, need to show the following: − 〈퐴푖, 푍〉 ≥ 1. Recall that 〈퐴푖, 푍〉 = trace[퐴푖푍]. Then we have 퐴푖푍 = 퐴푖퐼 = 퐴푖 = ⎡⎣ 1 − ( 2(푖−1) 푚 − 1 ) − ( 2(푖−1) 푚 − 1 ) ( 2(푖−1) 푚 − 1 )2 ⎤⎦ ⇒ 〈퐴푖, 푍〉 = 1 + (2(푖− 1) 푚 − 1 )2 ≥ 1 − Here 풳 = [−1, 1] so that 풦 = {푦 ∈ ℝ2 ∣푦1 ≤ 푦2} . Then (푍11, 푍22) = (1, 1) ∈ 풦 is obvious. − 푍 = 퐼 ર 0. Thus we have that 1휐푐푝 = min푍 푍푛+1,푛+1 ≤ 1 so that 휐푐푝 ≥ 1. Hence, 휐푐푝 ≥ 1 is significantly larger than 휐푝 = 1푚2 . In fact, 휐푐푝 ≥ 푚2휐푝 which shows that in the case of 푛 small and 푚 large, 휐푐푝 is a poor approxi- mation for 휐푝. 6.3.1 Necessary Condition for Non-trivial Bounds In order for the approximation bounds to be useful, we need to ensure that they yield non-trivial results (bounds > 0). In this section we give con- ditions on the values of 푚 and 푛 for the case that 풳 is a box or a ball and thus 훾∗ = 1/푛 which will ensure that the approximation bounds provided by the SDP and SOCP relaxation problems are bounded away from 0. To obtain a non-trivial lower bound, we require that 1−√2 ln(푚/휌)/푛 2 휐푐푝 > 0. (6.10) 73 6.4. An Alternate Approach Since 휐푐푝 > 0 (since 휐푐푝 ≥ 휐푝), we need only verify that 1−√2 ln(푚/휌)/푛 2 > 0 (6.11) ⇔ 1 > √ 2 ln(푚/휌)/푛 (6.12) ⇔ 푛 2 > ln( 푚 휌 ) (6.13) ⇔ 푚 휌 < exp (푛 2 ) ⇒ 푚 < exp (푛 2 ) . (6.14) Thus, we can guarantee that the approximation bounds provided by the SDP and SOCP based algorithms will yield a non-trivial lower bound whenever 푛 and 푚 satisfy 푚/휌 < exp(푛2 ). Table 6.1 shows the largest value of 푚 that will yield a non-trivial bound for a given value of 푛 (setting 휌 = 1): 푚 = ⌊exp 푛2 ⌋, i.e 푚 = the largest integer ≤ exp 푛2 . 푛 1 2 3 4 5 6 7 8 9 10 푚 1 2 4 7 12 20 33 54 90 148 Table 6.1: Maximal value of 푚 yielding a non-trivial approximation. Note that this means that we can not guarantee a non-trivial bound for cases that 푚 ≥ exp(푛2 ). Remark 6.7. Consider the function 푔 : (0, 1]→ ℝ : 휌 7→ 1− √ 2/푛 ln(푚/휌) 2 . Since 휌 7→ ln(푚/휌) is a decreasing function, 푔 is an increasing function, therefore 푔(1) = max휌∈(0,1] 푔(휌). 6.4 An Alternate Approach Of course, SDP relaxation is not the only way to construct an approxi- mate solution of (2.1). For example, consider the following simple procedure for constructing an approximate solution that is not based on SDP. Definition 6.8. The ceiling, or least-integer function, is a function from the real numbers ℝ to the integers ℤ, that maps any real number 푥 to the the smallest integer 푦 such that 푦 ≥ 푥. It is denoted by ⌈푥⌉. For example, we have ⌈1.5⌉ = 2, ⌈−3.2⌉ = −3 and ⌈4⌉ = 4. 74 6.5. Numerical Results Example 6.9. Let 푁 := ⌈ (푚+ 1)1/푛 ⌉ . We partition 풳 = [−1, 1]푛 into 푁푛 boxes of length 2/푁 in each dimension. Since this box contains the Euclidean ball centered at 푥̂ of radius 1/푁 , we have ∥∥푥̂− 푥푖∥∥ ≥ 1/푁 for all 푖. Hence 푓(푥̂) = min 푖=1,...,푚 푤푖 ∥∥푥̂− 푥푖∥∥2 ≥ min푖푤푖 푁 = min푖푤푖⌈ (푚+ 1)1/푛 ⌉ . Also 푥̂ ∈ 풳 . Assuming that 푥1, . . . , 푥푚 are in 풳 so that ∥∥푥− 푥푖∥∥2 ≤ 4푛 for all 푥 ∈ 풳 , then 휐푝 ≤ 4푛min푖푤푖. Therefore 푓(푥̂) ≥ 1 4푛 ⌈ (푚+ 1)1/푛 ⌉ 휐푝. The approximate solution 푥̂ is good when 푥1, . . . , 푥푚 are distributed “uni- formly” over 풳 but can be arbitrarily bad when 푥1, . . . , 푥푚 are clustered near each other. In particular, unlike (6.2), the above approximation bound tends to zero as 푛→∞; namely lim 푛→∞ 1 4푛⌈(푚+ 1)1/푛⌉ = 0. (6.15) 6.5 Numerical Results In this section, we use cvx [11], an optimization package for Matlab, to numerically compare the SDP and SOCP approximation bounds. cvx is a modeling system for disciplined convex prgramming (DCP). DCP’s are convex programming problems that are described using a limited set of construction rules. cvx can solve standard problems such as linear programs (LPs), quadratic programs (QPs), second-order cone programs (SOCPs) and semidefinite programs (SDPs), as well as many other more complex convex optimization problems. Since the original optimization problems in the examples do not have a simple closed form solution, a Matlab program was written to find the approximate optimal point, yielding an approximate solution to the original weighted maximin dispersion problem (see Appendix C). Note that this code depends upon the full factorial (fullfact) function in Matlab, so that for high dimensional problems or very large grid sizes the computational time is very high, and thus this program is not meant to take the place of more complex solvers. 75 6.5. Numerical Results Example 6.10. Suppose that 풳 = [−1, 1]5, 푤푖 = 1 for 푖 = 1, . . . , 3 and 푥1 = (1, 0, 0, 0, 0) (6.16) 푥2 = (0, 0, 1, 0, 0) (6.17) 푥3 = (0, 0, 0, 0, 1) (6.18) so that we have the following equally-weighted maximin dispersion problem: max 푥∈[−1,1]5 min 푖=1,2,3 ∥∥푥− 푥푖∥∥2 . (6.19) Remark 6.11. We can easily derive the exact optimal solution for (6.19). Set 푥 = (푥1, 푥2, 푥3, 푥4, 푥5). Then min 푖=1,2,3 ∥∥푥− 푥푖∥∥2 = min { (푥1 − 1)2 + 푥22 + 푥23 + 푥24 + 푥25, 푥21 + 푥22 + (푥3 − 1)2 + 푥24 + 푥25, 푥21 + 푥 2 2 + 푥 2 3 + 푥 2 4 + (푥5 − 1)2 } = min { (푥1 − 1)2 + 푥23 + 푥25, 푥21 + (푥3 − 1)2 + 푥25, 푥21 + 푥23 + (푥5 − 1)2 } +푥22 + 푥 2 4. Thus, the optimal solution requires that 푥1 = 푥3 = 푥5 = −1. Then we can set 푥1 = 푥3 = 푥5 = −1 and 푥2 = ±1 and 푥4 = ±1 to obtain an optimal point, corresponding to the optimal solution of 휐푝 = 4 + 2 + 2 = 8. First, we obtain an approximate solution of (6.19) on a grid using the Matlab code from Appendix A: Using grid sizes of 5, 9 and 11, we have an approximate solution of 휐푝 = 8 (6.20) in each case, with the optimal value being achieved at one of four points: 푥∗1 = (−1,−1,−1,−1,−1) 푥∗2 = (−1, 1,−1,−1,−1) 푥∗3 = (−1,−1,−1, 1,−1) 푥∗4 = (−1, 1,−1, 1,−1) . (6.21) Now, we want to solve the SDP and SOCP relaxation problems using cvx and Matlab to compare the approximation bounds. Since 풳 = [−1, 1]5 is a 76 6.5. Numerical Results polyhedral set having a tractable algebraic representation, equation (2.14) yields the following semidefinite program: 1 휐푐푝 = min 푍 푍66 s.t. 〈 퐴푖, 푍 〉 ≥ 1, 푖 = 1, 2, 3 (푍11, . . . , 푍66) 푇 ∈ 풦 푍 ર 0. (6.22) while equation (2.24) yields the following second-order cone program: 1 휐푠표푐푝 = min 푍 푍66 s.t. 〈 퐴푖, 푍 〉 ≥ 1, 푖 = 1, 2, 3 (푍11, . . . , 푍66) 푇 ∈ 풦[ 푍푗푗 푍푗,6 푍푗,6 푍6,6 ] ર 0, 푗 = 1, . . . , 5. (6.23) where 풦 = {푦 ∈ ℝ6 ∣푦푗 ≤ 푦6 for 푗 = 1, . . . , 5} . Solving both the SDP and SOCP relaxation problems using cvx yield an optimal solution value of 1 휐푐푝 = 1 휐푠표푐푝 = 0.125. (6.24) See Appendix B. Interestingly, 1휐푐푝 = 0.125 ⇒ 휐푐푝 = 8, so that in this case, 휐푐푝 = 휐푝. However, this is not the case in general, and from our results in this paper we can only say with certainty that 휐푝 ≥ 1− √ 2 ln(3/휌)/5 2 휐푐푝. Substituting 휐푐푝 = 8 and choosing 휌 = 1 (recall that 0 < 휌 ≤ 1), we have the following lower bound on the optimal solution: 휐푝 ≥ 1− √ 2 ln(3)/5 2 8 ≈ 1.348. Since this is a case where 푛 and 푚 are both quite small, 휐푐푝 is not a very good approximation of the true solution 휐푝. However, in cases where 풳 is a box and 푛 and 푚 are small, it is certainly feasible to use the Matlab code provided in Appendix A to obtain an approximate solution to (2.1). We begin by examining the case where 푛 = 5, which is still feasible to solve using the code in Appendix A to yield an approximation bound and we compare the results to those produced using cvx. 77 6.5. Numerical Results Example 6.12. Suppose that 풳 = [−1, 1]5, 푤푖 = 1 for 푖 = 1, . . . , 3 and we have 푚 = 3, 4, . . . , 12 points where each point 푥푗 is randomly selected from 풳 . Using a grid size of 11 the following table compares the lower bound based on the optimal values 휐푐푝 and 휐푠표푐푝 resulting from solving the SDP and SOCP relaxation problems with cvx to the approximate optimal solution 휐푔 found using the grid solver. Table 6.2 summarizes the results of using the grid solver and CVX (setting 휌 = 1). Since 푛 is small we expect the bound to not be very good, and this expectation is confirmed form the reults in Table 6.2. However, for small values of 푚 the bound will significantly help in narrowing the search space for finding the optimal solution. 푚 1− √ 2 ln(푚)/5 2 휐푐푝 1− √ 2 ln(푚)/5 2 휐푠표푐푝 휐푔Grid 3 1.2100 1.2100 6.9436 4 0.8238 0.8238 5.7219 5 0.7549 0.7549 6.1027 6 0.5184 0.5184 5.3748 7 0.5189 0.5189 5.8145 8 0.3272 0.3272 5.4333 9 0.2482 0.2482 5.7362 10 0.1341 0.1341 5.9963 11 0.0868 0.0868 4.9257 12 0.0119 0.0119 5.0090 Table 6.2: Numerical Results for Large 푛 = 5. What we are more interested in is cases where 푛 and 푚 are large, so that applying the grid program becomes impossible. In particular, we examine a few cases where 푛 is “large” (푛 > 5) and 푚 < exp ( 푛 2 ) (setting 휌 = 1), which guarantees that the SDP and SOCP based algorithms will yield a non-trivial lower bound. Example 6.13. Let 풳 = [−1, 1]푛, 푤푖 = 1 for 푖 = 1, . . . ,푚, and 푥1, . . . , 푥푚 are a set of randomly generated points in 풳 . The following table gives the lower bound to the original problem 2.1 based on the approximate solutions 휐푐푝 and 휐푠표푐푝 resulting from using cvx to solve the relaxation problems (2.14) and (2.24): From this example it is once again apparent that the optimal solution to the simpler SOCP relaxation problem is identical to the optimal value of 78 6.5. Numerical Results 푛 푚 1− √ 2 ln(푚)/푛 2 휐푐푝 1− √ 2 ln(푚)/푛 2 휐푠표푐푝 6 20 0.0025 0.0025 7 33 0.0027 0.0027 8 54 0.0061 0.0061 9 90 0.0001 0.0001 10 148 0.0016 0.0016 11 244 0.0016 0.0016 12 403 0.0006 0.0006 13 665 0.0001 0.0001 Table 6.3: Numerical Results for Large 푛 and 푚. the SDP relaxation. Since the original problem is NP-hard, and the values of 푛 and 푚 are large enough that applying the grid solution is not feasible, we do not have a way to compare these results to the actual solution of the problem. However, since all of the bounds given are strictly greater than zero, Table 6.3 at least confirms that the lower bound for (2.1) resulting from the SDP and SOCP relaxation problems is a non-trivial bound. 79 Chapter 7 Conclusions and Future Work 7.1 Results The purpose of this thesis has been to develop SDP and SOCP convex relaxations for tackling the weighted maximin dispersion problem, which is NP-hard. We began by describing the general weighted maximin dispersion prob- lem in Section 2.1, and defining the convex relaxation problems in Section 2.2. Next, in Section 3.2, we used theoretical results from nonsmooth anal- ysis to derive a necessary condition for the optimal solutions of the equally weighted maximin problem. Provided that the set 풳 is convex, we showed that 푥̄ ∈ int(풳 ) is a possible solution of max 푥∈풳 min 푖=1,...,푚 푤 ∥∥푥− 푥푖∥∥2 (7.1) only if 푥̄ ∈ conv{푥푖 ∣푖 ∈ 퐼(푥̄)} . Chapter 4 was dedicated to the theory of NP-completeness. While it is well known that the general weighted maximin distance problem is NP-hard, we have provided a proof to show that even in the case of equally weighting and 풳 is a box, the problem is NP-hard. That is, the problem given by 푓(푥) = max 푥∈[−1,1]푛 min 푖=1,...,푚 푤 ∥∥푥− 푥푖∥∥2 (7.2) is NP-hard. Also in this Chapter, we investigated an heuristic approach based on partitioning the space 풳 into Voronoi cells and considering sub- problems based on restricting 푥 to be in one of 푚 Voronoi cells. Again, we showed that even in the case of equal weighting and 풳 = ℝ푛, the restricted subproblems are NP-hard. Finally, in Chapter 5, we used the convex relaxations problems given in Section 2.2 to develop an algorithm to construct an approximate solution to the general weighted maximin dispersion problem. Pulling results from 80 7.2. Future Work statistics, convex analysis and matrix algebra, we showed that there exists a feasible solution 푥̃ to the original problem max 푥∈풳 min 푖=1,...,푚 푤푖 ∥∥푥− 푥푖∥∥2 (7.3) such that 푓(푥̃) ≥ 1− √ 훼훾∗ 2 휐푐푝 (7.4) where 1/휐푐푝 is the optimal value of the SDP relaxation problem (2.14), and 훾∗ = max푗=1,...,푛 푍 ∗ 푗푗∑푛 푗=1 푍 ∗ 푗푗 , 훼 = 2 ln ( 푚 휌 ) . (7.5) Chapter 6 began by focusing on two specific examples - the case that 풳 is a box, and the case that 풳 is a ball. We showed that in either case, there exists an optimal solution to problem (2.10) with 푍∗푗푗 = 푍 ∗ 푛+1,푛+1 for 푗 = 1, . . . , 푛 and 훾∗ = 1/푛. Finally, Chapter 6 finished with a few numerical examples to apply the SDP relaxation based algorithm developed in Chapter 5. Matlab code was provided for implementing the algorithm developed in Chapter 5, as well as Matlab code that was developed for finding approximate solutions over a grid. 7.2 Future Work Possible directions for further investigation into the weighted maximin dispersion problem and the approximation bounds derived in this paper are: 1. Can we extend Lemma 6.1, or Lemma 6.4 to other 풳 of the form (2.6)? Or more generally, when 풳 is invariant under permutation. 2. Is (2.1) NP-hard for other choices of 풳 besides the box (e.g., 풳 is the Euclidean ball)? 3. Can the above results be extended to locate multiple points in 풳 ? And maxsum dispersion problems? 4. Can we improve the approximation bound by using the rank reduction scheme of Goemans-Williamson [14] (also see [28]) or its variant pro- posed by Bertsimas and Ye [5]? In particular, instead of generating 휁 81 7.2. Future Work uniformly on {−1, 1}푛, generate 휉 ∼ 푁(0, 푍∗), the real-valued normal distribution, and set 휁푗 = { 1 if 휉푗 > 0; −1 else, 푗 = 1, . . . , 푛. Note that this requires 푍∗ ર 0. Analyzing this will likely require new large deviation estimates. 5. Create specialized algorithms to solve the SOCP relaxation more effi- ciently when 푛 and 푚 are large (and large number of points to locate). 6. Find the lower bound for the performance of the SOCP relaxation. 7. One always has 휐푐푝 ≥ 휐푝. Under what conditions do the convex re- laxation problems (2.14) or (2.24) yield the exact solution. That is, under what conditions will we have 휐푐푝 = 휐푝? 8. Do some comparisons of our SDP and SOCP relaxations to existing relaxation methods, such as [16, 30, 33]. 9. In the thesis, we are considering the Maximin problem only for 푚 points in ℝ푛. What happen if we consider the Maximin problem for 푚 convex sets in ℝ푛? 82 Bibliography [1] F. Alizadeh. Interior point methods in semidefinite program- ming with applications to combinatorial optimization. SIAM Journal on Optimization, 5(1):13–51, 1995. → pages 24 [2] Jonathan M. Borwein and Adrain S. Lewis. Convex Analysis and Nonlinear Optimization, Theory and Examples. Springer, 2000. → pages 15 [3] A. Ben-Tal, A. Nemirovski, and C. Roos. Robust solutions of uncertain quadratic and conic-quadratic problems. SIAM Journal on Optimization, 13(2):535–560, 2002. → pages 54, 67 [4] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004. → pages 19, 20 [5] D. Bertsimas and Y. Ye. Semidefinite relaxations, multivari- ate normal distribution, and order statistics. In D.Z. Du and P.M. Pardalos, editors, Handbook of Combinatorial Optimiza- tion, volume 3, pages 1–19. Kluwer Academic Publishers, 1998. → pages 81 [6] Pierluigi Crescenzi and Viggo Kann. A compendium of np optimization problems, 1998. → pages 23 [7] Frank H. Clarke. Optimization and Nonsmooth Analysis. So- ciety for Industrial and Applied Mathematics, 1990. → pages 19 [8] Stephen Cook. The p vs np problem, 2000. → pages 22 [9] Kenneth R. Davidson and Allan P. Donsig. Real Analysis and Applications. Springer, 2010. → pages 2, 4 [10] B. Dasarathy and Lee J. White. A maxmin location problem. Operations Research, 28(6):1385–1401, 1980. → pages 26 83 [11] Micheal Grant and Stephen Boyd. cvx user’s guide; for cvx version 1.21. User’s Guide, September 2010. → pages 24, 75 [12] Michael R. Garey and David S. Johnson. Computers and In- tractability: A Guide to the Theory of NP-Completeness. Bell Telephone Laboratories, Incorporated, 1979. → pages 21, 22, 23, 24, 49, 53 [13] M. Grotschel, L. Lovasz, and A. Schrijver. Geometric Al- gorithms and Combinatorial Optimization. Springer-Verlag, Berlin, 1998. → pages 24 [14] Michel X. Goemans and David P. Williamson. Improved approximation algorithms for maximum cut and satisfiabil- ity problems using semidefinite programming. J. ACM, 42(6):1115–1145, 1995. → pages 24, 81 [15] C. Helmberg, F. Rendl, R.J. Vanderbei, and H. Wolkowicz. An interior-point method for semidefinite programming. SIAM Journal on Optimization, 6(2):342–361, 1996. → pages 24 [16] M. E. Johnson, L. M. Moore, and D. Ylvisaker. Minimax and maximin distance designs. Journal of Statistical Planning and Inference, 26(2):131 – 148, 1990. → pages 26, 82 [17] R.M. Karp. Reducibility among combinatorial problems. In R.E. Miller and J.W. Thatcher, editors, Complexity of Com- puter Computations, pages 85–103. Plenum Press, New York, 1972. → pages 23 [18] Carl D. Meyer. Matrix Analysis and Applied Linear Algebra. Society for Industrial and Applied Mathematics, 2000. → pages 7 [19] R.T. Rockafellar. Convex Analysis. Princeton University Press, Princeton, New Jersey, 1970. → pages 15, 16 [20] S. S. Ravi, D. J. Rosenkrantz, and G. K. Tayi. Heuristic and special case algorithms for dispersion problems. Operations Re- search, 42(2):299–310, 1994. → pages 44 [21] F. Rendl, G. Rinaldi, and A. Wiegele. Solving max-cut to opti- mality by intersecting semidefinite and polyhedral relaxations. Mathematical Programming, 121(2):307–335, 2010. → pages 24 84 [22] R.T. Rockafellar and Roger J-B. Wets. Variational Analysis. Springer-Verlag, 1998. → pages 9, 13, 14, 16 [23] Winfried Schirotzek. Nonsmooth Analysis. Springer-Verlag, Berlin, 2007. → pages 19 [24] Shayle R. Seale. Matrix Algebra Useful for Statistics. John Wiley and Sons Inc., 1982. → pages 5, 6 [25] J.F. Sturm. Using sedumi 1.02, a matlab* toolbox for optimiza- tion over symmetric cones (updated for version 1.05). Techni- cal report, Department of Econometrics, Tilburg University, Tilburg, The Netherlands, August 1998 - October 2001. → pages 24, 67 [26] Jerome Sacks, William J. Welch, Toby J. Mitchell, and Henry P. Wynn. Design and analysis of computer experiments. Statistical Science, 4(4):pp. 409–423, 1989. → pages 26 [27] M.J. Todd. Semidefinite optimization. Acta Numerica, 10:515– 560, 2001. → pages 21, 24 [28] Paul Tseng. Further results on approximating nonconvex quadratic optimization by semidefinite programming relax- ation. SIAM Journal on Optimization, 14(1):268–283, 2003. → pages 81 [29] L. Vandenberghe and S. Boyd. Semidefinite programming. SIAM Review, 38(1):49–95, 1996. → pages 20, 21 [30] D.J. White. A heuristic approach to a weighted maxmin dis- persion problem. IMA Journal of Management Mathematics, 7(3):219–231, 1996. → pages 26, 49, 53, 82 [31] Wikipedia. Complexity class. → pages 22, 23 [32] H. Wolkowicz, R. Saigal, and L. Vandenberghe. Handbook of Semidefinite Programming. Theory, Algorithms, and Applica- tions. Kluwer Academic Publishers, Boston, MA, 2000. → pages 21, 24 [33] Shuzhong Zhang. Quadratic maximization and semidefinite re- laxation. Mathematical Programming, 87(3):453–465, 2000. → pages 82 85 Appendix A Matlab Code The following matlab code was used to produce 3-dimensional pictures of the minimum distance function. Note that this function calls the MinNorm function, the code for which is also given. % Plot the minimum distance function over [-1,1]ˆ2 % define the range over which to calculate the minimum distances x1=(-1:0.01:1)’; x2=(-1:0.01:1)’; xmat=[kron(x1,ones(length(x2),1)) kron(ones(length(x1),1),x2)]; [x y]=meshgrid(x1,x2); % create two sets (S and T) of 10 points % in the 2-D unit square % S is a "nice" set of points S=[[-1,1];[1,-1];[1,1];[-1,-1];[0,0]]; % T is a random set of points t=randi([-10,10],[10,2]); T=t/10; % get the function values for S and T fminS=MinNorm(xmat,S); fminT=MinNorm(xmat,T); zS=reshape(fminS,length(x1),length(x2)); zT=reshape(fminT,length(x1),length(x2)); figure(1) surf(x,y,zS,’edgecolor’, ’none’) print(’-f1’, ’MinDistExS.eps’,’-depsc2’) 86 Appendix A. Matlab Code figure(2) surf(x,y,zT,’edgecolor’, ’none’) print(’-f2’, ’MinDistExT.eps’,’-depsc2’) The following code was written to estimate the optimal solution to (2.1) when 풳 = [−1, 1]푛 and 푤푖 = 1 for all 푖. Using a grid of size 푞 + 1, we find the point in 푥∗ ∈ 풳 that is furthest away from the given set of points 푆, using the maximin distance criteria. function [Mx, P]=MaximinGrid(q,S) %This function takes a set of points S and finds the % furthest point in [-1,1]ˆ{n} based on the maximin distance % criteria, using a q+1 grid % S is an m by n array, where n is the number of dimensions % and m is the number of points (each ROW is a point) [m n]=size(S); % create the q+1 grid on [-1,1]ˆn: x=((q+1)*ones([n,1])); % a vector of length n with entries q+1 X=fullfact(x); % a grid of size (q+1)ˆ{2} by n % scale the grid to be in [-1,1] [r c]=size(X); Y=-1+2*((X-1)/q); d=999*ones([r,1]); % a vector that stores the mindist from each point % on the grid to the closest point in S for ii=1:r % for each point on the grid k=0; % condition for the while loop while k==0 % enter loop for jj=1:m 87 Appendix A. Matlab Code % for each point in S D = sum( (Y(ii,:)-S(jj,:)).ˆ2); % calculate the distance if D==0 % if the grid point is in S d(ii)=D; % so it won’t have the max value of 1! k=1; % skip this point by exiting the while loop break; elseif D < d(ii) % if the new distance is smaller but not equal 0 d(ii) = D; % store the new mindist end end k=1; end end Mx=max(d); % get the maximin distance ind=(d==Mx); P=Y(ind,:); Mx=max(d); end 88 Appendix B Matlab Output In this section we provide the Matlab output from Example 6.10. Solve Example 6.10 using the MaximinGrid function with different grid sizes 푞1 = 4, 푞2 = 8 and 푞3 = 10: % Find the approximate optimal point for Numerics Example 1 % Do a couple of trials with different grid sizes % Define the given set of points S x1 = [1,0,0,0,0]; x2 = [0,0,1,0,0]; x3 = [0,0,0,0,1]; S = [x1; x2; x3]; % Choose the different mesh sizes for the grid q1=4; q2=8; q3=10; % Find the optimal points using the different grids [d1 p1]=MaximinGrid(q1,S); d1 = 8 p1 = -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 1 -1 -1 1 -1 1 -1 [d2 p2]=MaximinGrid(q3,S); d2 = 8 [d3 p3]=MaximinGrid(q3,S); d3 = 8 89 Appendix B. Matlab Output Output from solving the SDP relaxation for Example 6.10: Calling sedumi: 29 variables, 8 equality constraints ------------------------------------------------------------ SeDuMi 1.21 by AdvOL, 2005-2008 and Jos F. Sturm, 1998-2003. Alg = 2: xz-corrector, Adaptive Step-Differentiation, theta = 0.250, beta = 0.500 eqs m = 8, order n = 15, dim = 45, blocks = 2 nnz(A) = 39 + 0, nnz(ADA) = 64, nnz(L) = 36 it : b*y gap delta rate t/tP* t/tD* feas cg cg prec 0 : 4.82E+000 0.000 1 : -9.09E-002 1.99E+000 0.000 0.4133 0.9000 0.9000 2.47 1 1 2.7E+000 2 : 1.32E-001 8.58E-001 0.000 0.4308 0.9000 0.9000 4.31 1 1 4.4E-001 3 : 1.26E-001 2.41E-001 0.000 0.2805 0.9000 0.9000 1.57 1 1 1.0E-001 4 : 1.25E-001 1.18E-002 0.000 0.0489 0.9900 0.9900 1.22 1 1 4.8E-003 5 : 1.25E-001 3.71E-005 0.000 0.0032 0.9990 0.9990 1.01 1 1 1.5E-005 6 : 1.25E-001 7.28E-006 0.000 0.1960 0.9000 0.8959 1.00 1 1 2.9E-006 7 : 1.25E-001 5.39E-007 0.423 0.0741 0.9900 0.9438 1.00 1 1 2.2E-007 8 : 1.25E-001 6.57E-009 0.000 0.0122 0.9902 0.9900 1.00 1 1 3.3E-009 iter seconds digits c*x b*y 8 0.8 8.1 1.2500000099e-001 1.2499999993e-001 |Ax-b| = 2.7e-009, [Ay-c]_+ = 3.7E-010, |x|= 5.3e-001, |y|= 3.5e-001 Detailed timing (sec) Pre IPM Post 4.836E-001 7.800E-001 1.560E-001 Max-norms: ||b||=1, ||c|| = 1, Cholesky |add|=0, |skip| = 0, ||L.L|| = 316.544. ------------------------------------------------------------ Status: Solved Optimal value (cvx_optval): +0.125 Z = 0.1250 -0.0000 0.1250 -0.0000 0.1250 -0.1250 -0.0000 0.1250 -0.0000 -0.0000 -0.0000 0.0000 0.1250 -0.0000 0.1250 -0.0000 0.1250 -0.1250 -0.0000 -0.0000 -0.0000 0.1250 -0.0000 0.0000 0.1250 -0.0000 0.1250 -0.0000 0.1250 -0.1250 -0.1250 0.0000 -0.1250 0.0000 -0.1250 0.1250 90 Appendix B. Matlab Output Output from solving the SOCP relaxation for Example 6.10: Calling sedumi: 23 variables, 12 equality constraints ------------------------------------------------------------ SeDuMi 1.21 by AdvOL, 2005-2008 and Jos F. Sturm, 1998-2003. Alg = 2: xz-corrector, Adaptive Step-Differentiation, theta = 0.250, beta = 0.500 eqs m = 12, order n = 19, dim = 29, blocks = 6 nnz(A) = 47 + 0, nnz(ADA) = 144, nnz(L) = 78 it : b*y gap delta rate t/tP* t/tD* feas cg cg prec 0 : 3.80E+000 0.000 1 : -1.05E-001 1.62E+000 0.000 0.4249 0.9000 0.9000 2.47 1 1 2.8E+000 2 : 1.35E-001 6.38E-001 0.000 0.3950 0.9000 0.9000 4.86 1 1 3.3E-001 3 : 1.24E-001 5.51E-002 0.000 0.0863 0.9900 0.9900 1.45 1 1 2.4E-002 4 : 1.25E-001 5.14E-004 0.358 0.0093 0.9945 0.9945 1.07 1 1 2.3E-004 5 : 1.25E-001 1.23E-005 0.000 0.0240 0.9900 0.9900 1.00 1 1 5.9E-006 6 : 1.25E-001 6.14E-007 0.082 0.0498 0.9900 0.9356 1.00 1 1 3.1E-007 7 : 1.25E-001 1.11E-008 0.000 0.0182 0.9905 0.9900 1.00 1 1 8.7E-009 iter seconds digits c*x b*y 7 0.1 7.5 1.2500000462e-001 1.2500000087e-001 |Ax-b| = 3.6e-009, [Ay-c]_+ = 2.7E-009, |x|= 5.0e-001, |y|= 3.5e-001 Detailed timing (sec) Pre IPM Post 6.240E-002 6.240E-002 3.120E-002 Max-norms: ||b||=1, ||c|| = 1, Cholesky |add|=1, |skip| = 0, ||L.L|| = 108.52. ------------------------------------------------------------ Status: Solved Optimal value (cvx_optval): +0.125 Z = 0.1250 -0.0000 0.1250 -0.0000 0.1250 -0.1250 -0.0000 0.1250 -0.0000 -0.0000 -0.0000 0.0000 0.1250 -0.0000 0.1250 -0.0000 0.1250 -0.1250 -0.0000 -0.0000 -0.0000 0.1250 -0.0000 0.0000 0.1250 -0.0000 0.1250 -0.0000 0.1250 -0.1250 -0.1250 0.0000 -0.1250 0.0000 -0.1250 0.1250 91
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Convex relaxations of the maximin dispersion problem
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Convex relaxations of the maximin dispersion problem Haines, Sheena Ayla 2011
pdf
Notice for Google Chrome users:
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
Page Metadata
Item Metadata
Title | Convex relaxations of the maximin dispersion problem |
Creator |
Haines, Sheena Ayla |
Publisher | University of British Columbia |
Date Issued | 2011 |
Description | Recently, convex relaxations have achieved notable success in solving NP-hard optimization problems. This thesis studies semidefinite and second-order cone programming convex relaxations of the maximin dispersion problem. Providing nontrivial approximation bounds, we believe that our SDP and SOCP relaxation methods are useful in large-scale optimization. The thesis is organized as follows. We begin by recalling some basic concepts from convex analysis, nonsmooth analysis, and optimization. We then introduce the weighted maximin dispersion optimization problem; locating point(s) in a given region X ⊆ R^{n} that is/are furthest from a given set of m points. Also given are several reformulations of the original problem, including a convex relaxation problem and semidefinite and second order cone programming relaxations. Using well known results on Lipschitz functions and subdifferentials of Lipschitz functions we then derive a theoretical characterization of the optimal solutions for a given convex region X and equal weights. Next, we provide a proof that the weighted maximin dispersion problem is NP-hard even in the case where X is a box and the weights are equal. We then propose an algorithm for finding an approximate solution using the SDP relaxation, and derive an approximation bound that depends on the subset X. In the case that X is a box or a product of low-dimensional spheres, we show that the convex relaxation reduces to a second-order cone program, and provide further results on the bound provided. Lastly, we provide numerical examples using the SDP and SOCP relaxations, and suggest future work. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2011-10-25 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0072338 |
URI | http://hdl.handle.net/2429/38242 |
Degree |
Master of Science - MSc |
Program |
Mathematics |
Affiliation |
Arts and Sciences, Irving K. Barber School of (Okanagan) Computer Science, Mathematics, Physics and Statistics, Department of (Okanagan) |
Degree Grantor | University of British Columbia |
GraduationDate | 2012-05 |
Campus |
UBCO |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |
Download
- Media
- 24-ubc_2012_spring_haines_sheena.pdf [ 885.86kB ]
- Metadata
- JSON: 24-1.0072338.json
- JSON-LD: 24-1.0072338-ld.json
- RDF/XML (Pretty): 24-1.0072338-rdf.xml
- RDF/JSON: 24-1.0072338-rdf.json
- Turtle: 24-1.0072338-turtle.txt
- N-Triples: 24-1.0072338-rdf-ntriples.txt
- Original Record: 24-1.0072338-source.json
- Full Text
- 24-1.0072338-fulltext.txt
- Citation
- 24-1.0072338.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
data-media="{[{embed.selectedMedia}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.24.1-0072338/manifest