Second Order Necessary Conditions in Optimal Control

by Harry Huiheng Zheng

B.Sc. (Mathematics), Fudan University, 1983
M.Sc. (Mathematics), Fudan University, 1986

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES, Department of Mathematics

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
May 1993
© Harry Huiheng Zheng, 1993

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Mathematics, The University of British Columbia, Vancouver, Canada

Abstract

This thesis presents second order necessary conditions for the standard deterministic optimal control problem without conventional regularity assumptions. It reports three different advances in the second-order theory. First, we study conjugate points in smooth optimal control. We describe a set of conditions under which the second variation along an extremal trajectory is certain to be negative: when these are satisfied, they identify a certain point in the basic interval as a "Generalized Conjugate Point (GCP)." The GCP's include all conjugate points in the classical sense, and all points where the Legendre condition is violated. We also find GCP's in many problems left unresolved by the published literature, and deduce that the associated extremals are not optimal.
Second, we use a second-order tangent set to predict the second-order variations along a trajectory of a differential inclusion. Assuming only that the problem's data are Lipschitz continuous, we obtain a generalized form of the familiar statement that the second variation associated with a minimum point is nonnegative. The result is expressed in terms of a generalized second-order derivative introduced by Aubin, and reproduces the classical necessary conditions in the smooth case. Third, we apply Rockafellar's theory of epi-differentiability to integral functionals. Using Attouch's theorem, we show that certain nonconvex integral functionals of interest in optimal control are twice epi-differentiable. From this we deduce necessary conditions for optimality for problems without endpoint constraints.

Table of Contents

Abstract
List of Figures
Acknowledgement
Chapter 1: Introduction
1.1. Conjugate Points
1.2. Second Order Approximation for Dynamical Systems
1.3. Epi-derivatives of Integral Functionals
1.4. A User's Guide
1.5. Approximating Cones
Chapter 2: Generalized Conjugate Points
2.1. Preliminaries
2.2. Second Order Necessary Conditions
2.3. Generalized Conjugate Points
2.4. Examples
Chapter 3: Variational Inclusions
3.1. Background Material and Hypotheses
3.2. Approximations to the Reachable Set
3.3. Second Order Necessary Conditions
3.4. Unbounded Differential Inclusions
3.5. Application to Optimal Control Problems
Chapter 4: Second Epi-derivatives
4.1. Background Material
4.2. Epi-derivatives of Integral Functionals
4.3. Epi-derivatives of Bolza Functionals
4.4. Applications
4.5. Calculus of Epi-derivatives and Endpoint Constraints
References

List of Figures

Figure 1.1. Control u_k
Figure 1.2. Trajectory x_k
Figure 1.3. Set C and level curves of g
Figure 1.4. The set C
Figure 1.5.
The adjacent cone to C at 0

Acknowledgement

First of all, I would like to thank my supervisor, Philip Loewen, for his guidance, support, inspiration, and encouragement during my stay at the University of British Columbia. I had very good fortune to study under his supervision. I would like to thank the rest of my committee, Ulrich Haussmann, Wayne Nagata, and Edwin Perkins, for their helpful comments and criticism, and for their courses, which I had the pleasure of taking. I would also like to thank all UBC graduate students, faculty, and staff who made my graduate study a rewarding experience. Finally, I would like to thank my wife, Min Li, for her love and continual support. Financial support by the University of British Columbia Graduate Fellowship, and by the Teaching and Research Assistantships in the Department of Mathematics, is gratefully acknowledged. Without this support none of this work would have been possible.

Chapter 1
Introduction

1.1. Conjugate Points

This thesis describes several new second order necessary conditions in dynamic optimization. The first, an extended theory of conjugate points, provides new information even within the classical calculus of variations. To put our contribution in context, we review the basics of the calculus of variations here. The basic problem in the calculus of variations is to minimize the functional
$$J(x) = \int_0^T L(x(t), x'(t))\,dt$$
over all piecewise smooth functions x with fixed endpoints x(0) = A and x(T) = B, where L is a function from R^n × R^n to R defined by (x, u) ↦ L(x, u). A piecewise smooth function x is called an arc; an arc is admissible if it satisfies the endpoint conditions. An admissible arc x is called a local minimizer for J if there exists a positive constant ε such that for any piecewise smooth function h with h(0) = h(T) = 0 satisfying |h(t)| < ε for all t in [0, T], we have J(x + h) ≥ J(x).
An admissible arc x is called a weak local minimizer for J if for any h as above, with |h'(t)| < ε for all t in [0, T], we have J(x + h) ≥ J(x). Obviously, if x is a local minimizer for J, then it is a weak local minimizer for J. Suppose x is a weak local minimizer for J. Fix any h as above, and define g(α) = J(x + αh). Then the function g has a local minimum at α = 0, so we must have g'(0) = 0 and g''(0) ≥ 0 (provided these derivatives exist). If L is C^1, we can compute
$$g'(0) = \int_0^T \bigl(L_x(t)^T h(t) + L_u(t)^T h'(t)\bigr)\,dt,$$
where L_x(t) = L_x(x(t), x'(t)) and L_u(t) = L_u(x(t), x'(t)). (A similar interpretation applies to other functions as well.) Upon setting g'(0) = 0 and integrating by parts, we get
$$\int_0^T \Bigl(L_u(t) - \int_0^t L_x(s)\,ds\Bigr)^T h'(t)\,dt = 0.$$
Since the "variation" h is arbitrary, we obtain the Euler equation: for some constant c,
$$L_u(t) = c + \int_0^t L_x(s)\,ds \quad \text{a.e. } t \in [0, T].$$
If the arc x is a strong local minimizer for J, we also have the following condition, named for Weierstrass:
$$L(x(t), w) \ge L(x(t), x'(t)) + \langle L_u(t),\, w - x'(t)\rangle \quad \forall w \in \mathbf{R}^n, \text{ a.e. } t \in [0, T].$$
When the integrand L is C^2, the function g introduced above is twice differentiable, with
$$g''(0) = \int_0^T \begin{pmatrix} h(t) \\ h'(t) \end{pmatrix}^T \begin{pmatrix} L_{xx}(t) & L_{xu}(t) \\ L_{ux}(t) & L_{uu}(t) \end{pmatrix} \begin{pmatrix} h(t) \\ h'(t) \end{pmatrix} dt.$$
Since g has a local minimum at 0, this quantity is nonnegative for every variation h. The problem of minimizing g''(0) over all variations h is called the accessory problem. The fact that the zero arc solves the accessory problem implies the Legendre condition
$$L_{uu}(t) \ge 0 \quad \text{a.e. } t \in [0, T].$$
We say L is regular relative to x if the strengthened Legendre condition L_{uu}(t) > 0 holds for all t in [0, T]. If L is C^3 and regular, and if x ∈ C^1, then we have the Jacobi condition: there is no point in the interval (0, T) conjugate to T.
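Returning to the integrated Euler equation derived above, it can be checked numerically along a concrete extremal. The following sketch (our illustration, not part of the thesis) uses L(x, u) = u^2 - x^2 and the extremal x(t) = a sin t, for which L_u = 2x', L_x = -2x, and the constant works out to c = 2a.

```python
import math

# Illustrative check of the integrated Euler equation
#   L_u(t) = c + \int_0^t L_x(s) ds
# for L(x, u) = u^2 - x^2 along the extremal x(t) = a*sin(t).
# Here L_u(t) = 2x'(t), L_x(t) = -2x(t), and c = 2a.

a = 1.5
c = 2 * a

def x(t):  return a * math.sin(t)
def xp(t): return a * math.cos(t)

def integral_Lx(t, n=20000):
    # midpoint rule for \int_0^t -2 x(s) ds
    h = t / n
    return sum(-2 * x((k + 0.5) * h) for k in range(n)) * h

for t in [0.3, 1.0, 2.0, 3.0]:
    lhs = 2 * xp(t)               # L_u(t)
    rhs = c + integral_Lx(t)      # c + \int_0^t L_x(s) ds
    assert abs(lhs - rhs) < 1e-6
print("Euler equation holds along x(t) = a sin t")
```

The same template checks any candidate extremal: replace x, xp, and the partial derivatives of L, and verify that L_u(t) - \int_0^t L_x ds stays constant in t.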
A point c in (0, T) is called conjugate to T if there exists a nontrivial solution of the two-point boundary value problem on [c, T]:
$$\frac{d}{dt}\bigl(L_{uu}(t)h'(t) + L_{ux}(t)h(t)\bigr) = L_{xu}(t)h'(t) + L_{xx}(t)h(t), \qquad h(c) = 0,\ h(T) = 0.$$
In proving the Jacobi condition, the regularity of L relative to x plays an important role. For if c ∈ (0, T) were a conjugate point to T, there would be a nontrivial solution h of the Jacobi equation with h(T) = 0 and h(c) = 0. Extend h to [0, T] by setting h = 0 on [0, c). A simple calculation shows that for this h, one has g''(0) = 0. Thus h is a weak local minimizer for the accessory problem with a corner point at t = c. But the regularity of L implies that all solutions of the accessory problem must be smooth. This contradiction shows the nonexistence of conjugate points to T in (0, T). Conjugate point theory helps to eliminate non-optimal extremals obtained from the Euler equation. Consider, for example, minimizing the functional
$$J(x) = \int_0^{2\pi} \bigl(x'(t)^2 - x(t)^2\bigr)\,dt$$
over all arcs with x(0) = x(2π) = 0. It is easy to check that x(t) = a sin(t) is an extremal for any constant a and that L is regular (L_{uu}(t) = 2), but π is a conjugate point to 2π. So x(t) = a sin(t) cannot be a weak local minimizer for J. If L fails to be regular, the classical conjugate point theory outlined above does not apply. Consider the following problem with two-dimensional state x = (x_1, x_2):
$$\text{minimize } J(x) = \int_0^T \bigl(x_2'(t)^2 - x_1(t)^2\bigr)\,dt \quad \text{subject to } x(0) = 0,\ x(T) = 0.$$
It is easy to check that x = 0 is an extremal and that the Legendre condition holds:
$$L_{uu}(t) = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix} \ge 0.$$
But the integrand L is not regular, so we cannot apply classical conjugate point theory to determine whether x = 0 is optimal or not. In this thesis we will extend conjugate point theory to cover both regular and nonregular systems. We will see later on that x = 0 is not optimal for any T > 0.
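The effect of the conjugate point π in the first example can be seen numerically. The variation h(t) = sin(t/2), an assumption of ours chosen only for illustration, is admissible (h(0) = h(2π) = 0) and makes the quadratic functional J negative; since J(εh) = ε²J(h), the extremal x = 0 then fails to be a weak local minimizer.

```python
import math

# Numerical illustration that once the conjugate point pi lies inside
# (0, 2*pi), the extremal x = 0 for
#   J(x) = \int_0^{2 pi} (x'(t)^2 - x(t)^2) dt,  x(0) = x(2 pi) = 0,
# is not a weak local minimizer: the variation h(t) = sin(t/2) gives J(h) < 0.

def J(h, hp, T=2 * math.pi, n=100000):
    # midpoint rule for \int_0^T (h'(t)^2 - h(t)^2) dt
    dt = T / n
    return sum((hp(k * dt + dt / 2)**2 - h(k * dt + dt / 2)**2) * dt
               for k in range(n))

h  = lambda t: math.sin(t / 2)
hp = lambda t: 0.5 * math.cos(t / 2)

val = J(h, hp)
assert val < 0                               # second variation is negative
assert abs(val + 3 * math.pi / 4) < 1e-3     # exact value is -3*pi/4
print("J(h) =", val)
```

Because J is homogeneous of degree two, J(εh) = ε²J(h) < 0 for every ε > 0, so arbitrarily small admissible variations decrease J.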
Of course, our story does not stop here: most of our work deals with optimal control problems. The standard optimal control problem is to choose a measurable function u: [0, T] → U so as to
$$\text{minimize } J(u) = \ell(x(T)) + \int_0^T L(x(t), u(t))\,dt, \tag{1.1}$$
where U is a nonempty closed subset of R^m and x is the trajectory satisfying
$$x'(t) = f(x(t), u(t)) \ \text{a.e. } t \in [0, T], \qquad x(0) = A,\ x(T) \in C \subset \mathbf{R}^n. \tag{1.2}$$
By choosing f(x, u) = u, ℓ(x) = 0, C = {B}, and U = R^n, we see that the basic problem in the calculus of variations is a special case of the optimal control problem. An admissible pair (x, u) is called optimal if for any admissible pair (y, v) we have J(v) ≥ J(u). Necessary conditions for the calculus of variations have been extended to this problem as Pontryagin's celebrated maximum principle.

Theorem 1.1 (Pontryagin). If (x, u) is optimal, then there exist an arc p and a scalar λ ∈ {0, 1}, not both zero, such that the following hold:
(a) the adjoint equation −p'(t) = H_x(x(t), p(t), u(t)) a.e. t ∈ [0, T];
(b) the transversality condition −p(T) ∈ λℓ'(x(T)) + N_C(x(T));
(c) the maximum principle max_{u ∈ U} H(x(t), p(t), u) = H(x(t), p(t), u(t)) a.e. t ∈ [0, T].
Here H is the pre-Hamiltonian defined by H(x, p, u) = p^T f(x, u) − λL(x, u) and N_C(x(T)) is the Clarke normal cone to C at x(T).

Pontryagin's maximum principle extends the Euler equation, the Weierstrass condition, and the Legendre condition to problems of optimal control. There is no counterpart of conjugate point theory in optimal control in any standard textbook. Recently, however, there have been some efforts in this direction. Zeidan and Zezza [32, 33] define conjugate points for optimal control problems and show the nonexistence of such points if the system is strongly regular. Here the term "strongly regular" refers to a combination of the strengthened Legendre condition with certain strong normality conditions. When these conditions fail to hold, the results of [32, 33] give little information.
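To fix ideas about Theorem 1.1, the relations (a)–(c) can be verified on a small example of our own choosing (not from the thesis): minimize J(u) = ∫₀¹ (u² + x²) dt with x' = u, x(0) = 1, free right endpoint, λ = 1, ℓ = 0. With H(x, p, u) = pu − (u² + x²), maximizing over u ∈ R gives u = p/2, the adjoint equation reads −p' = H_x = −2x, and transversality forces p(1) = 0; the resulting two-point problem is solved by x(t) = cosh(1−t)/cosh(1), p(t) = −2 sinh(1−t)/cosh(1).

```python
import math

# Sanity check of the maximum-principle relations on a scalar LQ problem
# (our illustrative example): minimize \int_0^1 (u^2 + x^2) dt, x' = u,
# x(0) = 1, free endpoint, lambda = 1, ell = 0,
# H(x, p, u) = p*u - (u^2 + x^2).

x = lambda t: math.cosh(1 - t) / math.cosh(1)
p = lambda t: -2 * math.sinh(1 - t) / math.cosh(1)
u = lambda t: p(t) / 2            # maximizes H over u in R (H is concave in u)

eps = 1e-6
for t in [0.2, 0.5, 0.8]:
    pdot = (p(t + eps) - p(t - eps)) / (2 * eps)
    xdot = (x(t + eps) - x(t - eps)) / (2 * eps)
    assert abs(-pdot - (-2 * x(t))) < 1e-6   # adjoint: -p' = H_x = -2x
    assert abs(xdot - u(t)) < 1e-6           # dynamics: x' = u
assert abs(p(1.0)) < 1e-12                   # transversality: p(1) = 0
print("maximum-principle relations verified")
```

In this unconstrained-control setting the maximum condition (c) reduces to H_u = 0, which is exactly the relation u = p/2 used above.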
In Chapter 2 of this thesis we weaken these hypotheses considerably. We consider the optimal control problem (1.1), assuming that the endpoint constraint set is a smooth manifold C = {x ∈ R^n | Φ(x) = 0} and that the control set U is defined by equalities and inequalities. We first define an extremal for an admissible pair (x, u) and show that if (x, u) solves (1.1) and is normal, then (x, u) is an extremal under some linear independence conditions on the constraint sets. We then distinguish the admissible control directions v along which the first variation is zero, so the second variation is nonnegative. Now suppose V is such an admissible direction set (see (2.7)). The linearized system of (1.2) is
$$y'(t) = f_x(t)y(t) + f_u(t)v(t), \qquad y(0) = 0,\ v \in V. \tag{1.3}$$
We denote its solution at time t by y(t, v). For any solution (y, v) of (1.3) obeying Φ'(x(T))y(T) = 0, the first variation J_1(v) = 0, so the second variation J_2(v) ≥ 0. (See (2.13).) We then define a generalized conjugate point to T in the interval (0, T) in such a way that if there exists such a point, we can find an admissible v ∈ V for which J_2(v) is negative. The main result of Chapter 2 is Theorem 2.13, which we now quote:

Theorem 1.2. Suppose that U is a convex set, that (x, u) is an extremal, and that the set I(u(t)) is piecewise constant in time. If (x, u) actually solves the problem (1.1), then either
(a) there exists a nonzero vector a such that a^T Φ'(x(T)) y(T, v) ≤ 0 for all v ∈ V; or
(b) the interval (0, T) contains no generalized conjugate points to T.

If (a) does not happen, we say (x, u) is regular. It is worth mentioning the following three special cases.

• Classical calculus of variations. Any extremal x is automatically regular. If x is optimal, then there are no generalized conjugate points in (0, T), which implies both the Jacobi condition and the Legendre condition. The strengthened Legendre condition is not required.

• Free endpoint optimal control.
The extremal (x, u) is always regular. If (x, u) is optimal, then there are no generalized conjugate points in (0, T). Strong regularity conditions are not required.

• Research of Zeidan and Zezza [32, 33]. We prove that if the system is strongly regular, then the conjugate point set of Zeidan and Zezza is a subset of ours. We present some examples which lie beyond the scope of [32, 33], but involve non-optimal extremals with generalized conjugate points.

1.2. Second Order Approximation for Dynamical Systems

Nonsmooth and set-valued analysis is a very powerful tool that can be used to study dynamical systems more general than (1.2), such as the differential inclusion
$$x'(t) \in F(x(t)) \ \text{a.e. } t \in [0, T], \qquad x(0) = A. \tag{1.4}$$
This is a generalized differential equation to which the control system (1.2) can be reduced by setting F(x) = f(x, U) = {f(x, u) | u ∈ U}. Obviously, any solution of (1.2) also satisfies (1.4). The reverse is also true under some mild conditions, by the following result.

Lemma 1.3 (Filippov [2]). Suppose f is continuous and U is nonempty and closed. If an absolutely continuous function x satisfies (1.4), then there exists a measurable function u: [0, T] → U such that x'(t) = f(x(t), u(t)) a.e. t ∈ [0, T].

In general, we consider the following optimization problem:
$$\text{minimize } g(x(T)) \text{ over all arcs } x \text{ satisfying (1.4) and } x(T) \in C. \tag{1.5}$$
This model, besides being equivalent to the standard optimal control problem, provides a mathematical tool for studying nonsmooth control systems, feedback control systems (x' = f(x, u), u ∈ U(x)), and implicit dynamical systems (f(x, x') = 0). By setting F(x) = f(x, U(x)) and F(x) = {v | f(x, v) = 0}, respectively, the above dynamic constraints are reduced to differential inclusions. We assume throughout this thesis that the function F is continuous and convex-valued. The convexity of F is a standard and indispensable hypothesis in differential inclusion problems.
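The failure of existence without convexity can be seen numerically on the example treated just below (minimize J(u) = x₂(2π) with x₁' = u, x₂' = x₁² + (1 − u²)², x(0) = 0, x₁(2π) = 0, u(t) ∈ [−1, 1]). The following sketch integrates the system under the sawtooth controls u_k(t) = sgn(sin kt) and confirms that J(u_k) = 2π³/(3k²) → 0, even though no control attains the value 0.

```python
import math

# Numerical check that J(u_k) = x_2(2*pi) = 2*pi^3/(3*k^2) -> 0 for the
# sawtooth controls u_k(t) = sgn(sin(k t)).  Since |u_k| = 1, the term
# (1 - u_k^2)^2 vanishes and x_2(2*pi) = \int_0^{2*pi} x_1(t)^2 dt, where
# x_1 is the triangular wave produced by x_1' = u_k, x_1(0) = 0.

def J(k, n=200000):
    dt = 2 * math.pi / n
    x1, x2 = 0.0, 0.0
    for i in range(n):                       # forward Euler
        u = 1.0 if math.sin(k * i * dt) >= 0 else -1.0
        x2 += x1 * x1 * dt                   # x_2' = x_1^2 + (1 - u^2)^2
        x1 += u * dt                         # x_1' = u_k
    return x2

for k in [1, 2, 4, 8]:
    assert abs(J(k) - 2 * math.pi**3 / (3 * k**2)) < 0.02
assert J(8) < J(1)                           # the infimum 0 is approached
print([round(J(k), 4) for k in [1, 2, 4, 8]])
```

The minimizing sequence oscillates faster and faster; its weak limit u ≡ 0 is not admissible for the unrelaxed velocity set, which is exactly the nonconvexity obstruction described below.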
Even in the control literature, such an assumption is often made explicitly (see Fleming and Rishel [11]). Without the convexity condition, the direct method of proving the existence of optimal solutions does not work, so there may be no optimal solutions at all. For example, consider the problem of minimizing the functional J(u) = x_2(2π) over all (u, x) satisfying
$$x_1' = u, \qquad x_2' = x_1^2 + (1 - u^2)^2, \qquad x(0) = 0,\ x_1(2\pi) = 0, \qquad u(t) \in [-1, 1].$$
Since x_2' is nonnegative, x_2 is nondecreasing, so J(u) = x_2(2π) ≥ x_2(0) = 0 for any u. Fix any integer k and define a control u_k(t) = sgn(sin(kt)). The graph of the corresponding trajectory x_1 has a "sawtooth" shape. The corresponding values
$$J(u_k) = 2k \int_0^{\pi/k} t^2\,dt = \frac{2\pi^3}{3k^2}$$
converge to 0 as k goes to ∞. So the infimum of J is 0, but there exists no control u such that J(u) = 0 (since otherwise we must have both |u| = 1 and x_1 = 0, which is impossible). This is due to the nonconvexity of the multifunction F(x) = {(u, x_1^2 + (1 − u^2)^2) | u ∈ [−1, 1]}.

Figure 1.1. Control u_k. Figure 1.2. Trajectory x_k.

Define the Hamiltonian H(x, p) = max{⟨p, v⟩ : v ∈ F(x)}. We have the first order necessary condition in terms of a Hamiltonian inclusion:

Theorem 1.4 (Clarke [6, Theorem 3.2.6]). Suppose x solves the problem (1.5). Then there exist an arc p and a constant λ ∈ {0, 1}, not both zero, such that one has
• the Hamiltonian inclusion (−p'(t), x'(t)) ∈ ∂H(x(t), p(t)) a.e. t ∈ [0, T];
• the transversality condition −p(T) ∈ λg'(x(T)) + N_C(x(T)).
Here ∂H(x, p) refers to the Clarke generalized gradient of H at (x, p) and N_C is the Clarke normal cone. Loewen and Rockafellar [16, 17] give necessary conditions for much more general problems. Under some technical assumptions they assert that if x solves the problem, then there exist an arc p and a scalar λ ∈ {0, 1}, not both zero, such that the following hold:
• the Euler-Lagrange inclusion p'(t) ∈ co{w | (w, p(t)) ∈ LN_{gph F}(x(t), x'(t))} a.e.
t ∈ [0, T];
• the Hamiltonian inclusion (−p'(t), x'(t)) ∈ ∂H(x(t), p(t)) a.e. t ∈ [0, T];
• the Weierstrass-Pontryagin maximum condition ⟨p(t), x'(t)⟩ = H(x(t), p(t)) a.e. t ∈ [0, T];
• the transversality condition −p(T) ∈ λg'(x(T)) + LN_C(x(T)).
Here LN_C(x) is the limiting normal cone to C at x, which may not be convex. The Clarke normal cone is the closed convex hull of the limiting normal cone. We will use the "graphical" approach to study second order necessary conditions. Hamiltonian analysis is, in some loose sense, a dual approach to our methods. Let us briefly describe the main idea. For simplicity, suppose there are no endpoint constraints (C = R^n), and denote by R(T) the reachable set of the differential inclusion (1.4) at time T. If a trajectory x solves the problem (1.5), then the Fermat rule (1.19) tells us that the derivative g'(x(T)) is nonnegative in every tangent direction w ∈ D_{R(T)}(x(T)), or that g'(x(T)) is in the positive polar cone to D_{R(T)}(x(T)). (The adjacent cone D_{R(T)} and the second adjacent set D²_{R(T)} are discussed in more detail below.) In the case of a nonlinear system like ours, the best we can hope for is to characterize explicitly subsets Q of the adjacent cone D_{R(T)}(x(T)), using variations of the solution x. Since variational equations arising in ordinary differential equations are very useful in characterizing variational properties of ordinary control problems, we seek some kind of variational inclusions for differential inclusions. To this end, we must first define derivatives for set-valued functions. Suppose F is a set-valued function from R^n to R^n and u ∈ F(x). The first order derivative of F at (x, u) is the set-valued function dF(x, u), defined by
$$\text{gph}\, dF(x, u) = D_{\text{gph}\, F}(x, u).$$
The second order derivative of F at (x, u) relative to (y, v) ∈ D_{gph F}(x, u) is a set-valued function d^2F(x, u; y, v), defined by
$$\text{gph}\, d^2F(x, u; y, v) = D^2_{\text{gph}\, F}\bigl((x, u), (y, v)\bigr).$$
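The graphical definition of dF can be made concrete on a one-dimensional multifunction of our own choosing. For F(x) = [x², +∞), the graph is {(x, u) : u ≥ x²}; at (x, u) = (0, 0) the adjacent cone to the graph is the half-plane {(y, v) : v ≥ 0}, so dF(0, 0)(y) = [0, +∞) for every y. Tangency of a direction (y, v) is detected by the vanishing of the distance quotient dist((x, u) + h(y, v), gph F)/h as h → 0⁺.

```python
import math

# Illustration (our example) of gph dF(x, u) = D_{gph F}(x, u) for the
# multifunction F(x) = [x^2, +inf), whose graph is {(x, u): u >= x^2}.
# At (0, 0) the adjacent cone is {(y, v): v >= 0}.

def dist_to_graph(p, q):
    # distance from (p, q) to {(x, u): u >= x^2}, by scanning x on a grid
    return min(math.hypot(p - x, max(0.0, x * x - q))
               for x in (k / 10000.0 for k in range(-20000, 20001)))

def quotient(y, v, h):
    # dist((0,0) + h*(y, v), gph F) / h
    return dist_to_graph(h * y, h * v) / h

assert quotient(1.0, 0.5, 1e-4) < 1e-3    # v > 0: tangent direction
assert quotient(1.0, 0.0, 1e-4) < 1e-3    # v = 0: boundary of the cone
assert quotient(1.0, -0.5, 1e-4) > 0.4    # v < 0: not tangent
print("dF(0,0)(y) = [0, +inf) confirmed on sample directions")
```

The grid scan is a crude stand-in for an exact projection onto the epigraph of x²; it suffices here because only the order of magnitude of the quotient matters.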
Frankowska [12] has shown that if F is compact, convex-valued, and Lipschitzian on its domain, then the reachable set R_1(T) of the following "linearized inclusion" is a subcone of the adjacent cone D_{R(T)}(x(T)):
$$y'(t) \in dF(x(t), x'(t))\,y(t), \qquad y(0) = 0. \tag{1.6}$$
Such an inclusion easily implies Pontryagin's maximum principle. We take Frankowska's analysis one step further. Given a solution y of (1.6), we prove that the reachable set R_2(T) of the following inclusion is a subset of the second adjacent set D²_{R(T)}(x(T), y(T)):
$$z'(t) \in d^2F(x(t), x'(t); y(t), y'(t))\,z(t), \qquad z(0) = 0. \tag{1.7}$$
We then apply these approximations to (1.5) to obtain second-order necessary conditions. In Chapter 3 we study the differential inclusion problem (1.5), which is equivalent to
$$\text{minimize } f(x) = g(x) + \Psi_{R(T)\cap C}(x) \text{ over all } x \in \mathbf{R}^n. \tag{1.8}$$
(Here Ψ is standard notation for the indicator function; see (1.18) below.) If the arc x solves (1.5), then its endpoint x(T) solves (1.8). We have first- and second-order necessary conditions as in (1.19) and (1.20) below. Since we cannot explicitly characterize the set R(T), we have to use approximations. Our main concern is how to express the adjacent cone (set) of R(T) ∩ C in terms of some known subsets. It turns out that under some mild constraint qualification we can do this. Denote
$$Q(x) = \{p(T) \mid (-p'(t), x'(t)) \in \partial H(x(t), p(t))\}. \tag{1.9}$$
Here is the main result of Chapter 3 (Theorem 3.15).

Theorem 1.5. Let x be a local solution to (1.5) that satisfies the constraint qualification
$$Q(x) \cap \bigl(-N_C(x(T))\bigr) = \{0\}. \tag{1.10}$$
Then one has the first order necessary condition
$$g'(x(T))^T y(T) \ge 0 \quad \forall\, y(T) \in R_1(T) \cap D_C(x(T)), \tag{1.11}$$
where R_1(T) is the reachable set of (1.6).
Furthermore, if equality holds for some y(T), and there exist an integrable function k and a constant α_0 > 0 such that
$$d\bigl(x'(t) + \alpha y'(t),\ F(x(t) + \alpha y(t))\bigr) \le \alpha^2 k(t) \tag{1.12}$$
for all 0 < α ≤ α_0 and 0 ≤ t ≤ T, then one has the second order necessary condition
$$g'(x(T))^T z(T) + \tfrac{1}{2}\, y(T)^T g''(x(T))\, y(T) \ge 0$$
for all z(T) ∈ R_2(T) ∩ D²_C(x(T), y(T)), where R_2(T) is the reachable set of (1.7).

When applying the above results to optimal control problems, we find that the usual normality condition implies the constraint qualification above, and thus we recover all standard second-order necessary conditions. Of course, these results also apply to many other systems beyond the scope of optimal control problems.

1.3. Epi-derivatives of Integral Functionals

There are many generalized derivatives associated with nonsmooth functions. They play different roles in optimization. In this thesis we also investigate the epi-differentiability of nonconvex integral functionals, which can be used to obtain necessary conditions for optimal control problems. Epi-differentiability is a quite new subject. Rockafellar [24, 25] uses it to study constrained optimization in finite dimensional spaces. Elementary calculus tells us that if a point x minimizes a C^2 function f over R^n, then the gradient of f at x is zero and the Hessian matrix of f at x is nonnegative. Many constrained optimization problems are equivalent to minimizing an extended-valued function f = g + Ψ_C, where g is a C^2 objective function, C is a nonempty closed constraint set, and Ψ_C is an infinite penalty function. To extend unconstrained optimization results to this function f, we are led to study some kind of generalized derivatives. Epi-differentiability of extended-valued functions in finite dimensional spaces is introduced and developed by Rockafellar [24, 25]. Suppose f is a lower semicontinuous, extended-valued function and f(x) is finite.
The difference quotient of f at x is the function f_x(h, ·), defined by
$$f_x(h, y) = \frac{f(x + hy) - f(x)}{h}.$$
The function f is epi-differentiable at x if
$$\liminf_{h \to 0^+} \text{epi}\, f_x(h, \cdot) = \limsup_{h \to 0^+} \text{epi}\, f_x(h, \cdot),$$
and we define the epi-derivative f'_x by epi f'_x = lim_{h→0⁺} epi f_x(h, ·). If a vector v satisfies f'_x(y) ≥ ⟨v, y⟩ for all y, then v is called an epi-gradient of f at x. The second difference quotient of f at x relative to v is the function f_{x,v}(h, ·), defined by
$$f_{x,v}(h, y) = \frac{f(x + hy) - f(x) - h\langle v, y\rangle}{h^2/2}.$$
We define the second epi-derivative of f at x relative to an epi-gradient v by epi f''_{x,v} = lim_{h→0⁺} epi f_{x,v}(h, ·). If a function is fully amenable, that is, if it is the composition of a closed, polyhedral convex, proper function with a twice continuously differentiable function and satisfies a certain constraint qualification, then it is twice epi-differentiable. It is known that a large class of optimization problems in finite dimensional spaces can be described by fully amenable functions. Poliquin and Rockafellar [21, 22] develop a calculus of epi-derivatives and apply it to get optimality conditions. If x is a minimum point, then 0 is an epi-gradient of f at x and the second epi-derivative f''_{x,0}(y) is nonnegative for all y. Since epi-differentiability is so useful in finite dimensional optimization, various authors have tried to extend epi-derivatives to infinite dimensional spaces. Do [9] studies integral functionals with convex integrands in L^2 spaces, and Cominetti [8] studies functionals in reflexive Banach spaces under some rather strong assumptions. Motivated by the calculus of variations, we are especially interested in the following functional I with nonconvex integrand L, defined by
$$I(x) = \int_0^T L(x(t), x'(t))\,dt.$$
It is natural to ask the following questions: Does the epi-differentiability of L imply that of I? If so, can we express the epi-derivatives of I in terms of those of L?
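For a smooth function the epi-limits above reduce to pointwise limits, which makes the definitions easy to probe numerically. With v = f'(x), the second difference quotient f_{x,v}(h, y) converges to the quadratic form y^T f''(x) y as h → 0⁺; the sketch below (our illustration, with f(x) = x⁴ + x²) checks this at x = 1.

```python
# Pointwise check of the second difference quotient
#   f_{x,v}(h, y) = (f(x + h*y) - f(x) - h*<v, y>) / (h^2 / 2)
# with v = f'(x): for a C^2 function it converges to y * f''(x) * y.
# Illustrative example: f(x) = x^4 + x^2 at x = 1, y = 0.7.

f   = lambda t: t**4 + t**2
fp  = lambda t: 4 * t**3 + 2 * t      # f'(x), the epi-gradient used as v
fpp = lambda t: 12 * t**2 + 2         # f''(x)

x, y = 1.0, 0.7

def quotient(h):
    return (f(x + h * y) - f(x) - h * fp(x) * y) / (h**2 / 2)

vals = [quotient(10.0**(-k)) for k in range(2, 5)]
exact = fpp(x) * y**2                 # the second epi-derivative at y
assert abs(vals[-1] - exact) < 1e-3
print(vals, "->", exact)
```

For nonsmooth f the pointwise limit may fail to exist while the epi-limit still does, which is precisely why the definitions above are stated in terms of set convergence of epigraphs.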
We address these questions in Chapter 4, starting with the epi-differentiability of integral functionals I: L^2 → R ∪ {+∞} defined by
$$I(u) = \int_0^T f(u(t))\,dt.$$
Here the integrand has the form f(u) = a(F(u)) + Ψ_C(G(u)), where F: R^m → R^l and G: R^m → R^p are twice continuously differentiable functions, a: R^l → R is a piecewise linear-quadratic convex function, and C is a nonempty closed convex polyhedron in R^p. We denote U = {u ∈ R^m | G(u) ∈ C}. We now quote the main result of Chapter 4 (Theorem 4.11).

Theorem 1.6. Suppose f is defined as above, the set U is convex, and a certain constraint qualification holds at u(t) for almost every t. Also suppose there exists a constant c > 0 such that for any γ ∈ ∂a(F(u(t))) one has
$$a(F(u(t) + hx)) - a(F(u(t))) - h\langle \gamma, F'(u(t))x\rangle \ge -c|x|^2 h^2/2$$
for all x ∈ R^m, h > 0, and a.e. t ∈ [0, T]. Then I is twice epi-differentiable at u. Its second epi-derivative relative to w ∈ ∂I(u) is
$$I''_{u,w}(v) = \int_0^T f''_{u(t),w(t)}(v(t))\,dt \quad \forall\, v \in L^2.$$
In Chapter 4, we prove this result and then apply it to a Bolza functional to get second-order necessary conditions for free endpoint problems.

1.4. A User's Guide

All three chapters below present second-order necessary conditions for controlled dynamical systems. These results are closely related, but have different emphases. For the optimal control problem (1.1), we usually apply Pontryagin's maximum principle (Theorem 1.1) to find some extremals, then use Theorem 1.2 to eliminate extremals that are not optimal. The conditions of Theorem 1.2 are simple, while its conclusions are very strong. Theorem 1.2 is especially useful when the system is not strongly regular, that is, when the strengthened Legendre condition or the strong normality condition at 0 fails to hold, since there are no results in the literature dealing with these cases. Of course, we can also reduce the problem (1.1) to a differential inclusion problem (1.4) by adding an extra variable x_0.
Specifically, we introduce the compound state variable x̃ = (x_0, x) and define a multifunction on R^{n+1} by
$$F(\tilde x) = \{(L(x, u) + r,\ f(x, u)) \mid u \in U,\ r \ge 0\}.$$
The objective function becomes g(x̃) = ℓ(x) + x_0. Theorem 1.5 provides second-order necessary conditions for this situation. These may be hard to analyse, however. Not only do we have to add extra conditions on the system (F(x̃) must be a closed convex set), but we also have to verify condition (1.12) and solve a Hamiltonian inclusion (1.9) (an arc p in Q(x) is usually not the same as the adjoint arc from the maximum principle). As an illustration of this method, we show how to get second-order necessary conditions for Mayer problems in Section 3.5. As for using epi-derivatives to study optimal control, this approach is more restricted: at present we can treat only linear dynamical systems and problems without endpoint constraints. In a word, we prefer to use generalized conjugate points to study second order necessary conditions for smooth optimal control. When some functions in the system are not smooth, we fall back on the differential inclusion model. This is also the method of choice if a problem cannot be expressed in the optimal control framework (such as feedback control systems or implicit dynamical systems). Although verifying (1.10) and (1.12) may be hard work, Theorem 1.5 at least gives us some information on the second order necessary conditions. Epi-differentiability is a potentially useful method for studying dynamical systems if we can develop good calculus rules for second epi-derivatives in infinite dimensional spaces. Currently, epi-differentiability is mainly used in sensitivity analysis of marginal functionals and optimal solution sets. (See the surveys by Poliquin and Rockafellar [21, 22] for finite dimensional spaces.) In this thesis we are only concerned with necessary conditions. We do not address any results in that field. We note, however, that Theorem 1.6 is a proper extension of Do's results [9] to nonconvex integrands and can be used to reduce the sensitivity analysis of infinite dimensional spaces to that of finite dimensional ones.

In Chapter 2, Theorem 2.4 is important not only as the basis for Theorem 1.2 but also because it shows how to use perturbation methods to get nonnegativity of the second variation. When a control passes through a corner point of the control set, the corresponding tangent cone changes dramatically. This makes it hard to find an admissible variation defined on a fixed interval for every t, which is essential in applying the dominated convergence theorem and the inverse function theorem. Our perturbation method handles this difficulty well. In Chapter 2, Definition 2.10 is set up in such a way that if there is a generalized conjugate point, then the second variation is negative for some admissible control. So Theorem 1.2 is the strongest necessary condition available. Conjugate points defined by nontrivial solutions of the Jacobi equation form a subset of our generalized conjugate points. It is known that the Jacobi condition is equivalent to the existence of a solution to a certain Riccati equation, which makes a bridge linking necessary and sufficient conditions. Even though Theorem 1.2 is the strongest necessary condition, we do not know at the moment what Riccati equation (inequality) corresponds to the nonexistence of generalized conjugate points. If we could find it, we would reduce the gap between necessary and sufficient conditions. In Chapter 3, we use proximal analysis to estimate the Clarke normal cone to the reachable set and use the reduction method to solve unbounded differential inclusion problems. In Chapter 4, we use the Moreau-Yosida approximation to reduce epi-convergence to pointwise convergence, then apply the monotone convergence theorem and Fatou's lemma to prove the main results.

1.5.
Approximating Cones

We use several different methods to study the second order theory of dynamical systems in this thesis. Chapter 3 is based on set-valued analysis [3]. To appreciate the beauty and power of this new technique, let us give a brief review. We know that the gradient of a smooth function can be characterized by the normal vector to its graph, which is a smooth manifold. If we want to extend differentiability to a nonsmooth function, we have to use its epigraph instead of its graph. The epigraph of a nonsmooth function usually has some corner points on its boundary. We have to extend the tangent space to a more general setting, namely the tangent cone. There are many different ways to define a tangent cone to a set, so there are many different derivatives corresponding to these tangent cones. Every tangent cone has its own advantages and disadvantages; this makes nonsmooth analysis a very versatile and powerful tool for studying the differentiability of nonsmooth functions. Among the most famous cones are the Clarke tangent cone, the contingent cone, and the adjacent cone. Since this thesis concerns second order analysis, we are particularly interested in second order generalized derivatives and their geometric interpretations. Suppose {C_h}_{h>0} is a family of nonempty closed subsets of R^n. The lower limit of {C_h} is defined by
$$\liminf_{h \to 0^+} C_h = \{y \in \mathbf{R}^n \mid \forall h_k \to 0^+,\ \exists y_k \to y \text{ such that } y_k \in C_{h_k}\ \forall k\}. \tag{1.13}$$
The upper limit of {C_h} is defined by
$$\limsup_{h \to 0^+} C_h = \{y \in \mathbf{R}^n \mid \exists h_k \to 0^+,\ \exists y_k \to y \text{ such that } y_k \in C_{h_k}\ \forall k\}. \tag{1.14}$$
Suppose C is a nonempty closed subset of R^n and x is a point in C. Here are some important cones in set-valued analysis, written in terms of the difference quotient
$$\Delta_C(x, h) := \frac{C - x}{h}.$$
• The Clarke tangent cone to C at x is given by T_C(x) = liminf_{x'→x (x'∈C), h→0+} Δ_C(x',h);
• The adjacent cone to C at x is given by D_C(x) = liminf_{h→0+} Δ_C(x,h);
• The contingent cone to C at x is given by K_C(x) = limsup_{h→0+} Δ_C(x,h);
• The Clarke normal cone to C at x is given by N_C(x) = { y ∈ R^n : ⟨y, z⟩ ≤ 0 ∀z ∈ T_C(x) }.

These cones and their calculus have been discussed in detail in [3]. We only mention the following facts: for any set C and point x ∈ C, T_C(x) ⊆ D_C(x) ⊆ K_C(x); also, T_C(x) is a closed convex cone. We call the set C derivable at x if and only if D_C(x) = K_C(x), and tangentially regular at x if and only if T_C(x) = K_C(x). In the following two cases, C is certain to be tangentially regular at x:

1. C is a convex set; or
2. C is defined by smooth functions, that is,

    C = { y ∈ R^n : f_j(y) ≤ 0 ∀j ∈ I_1; f_j(y) = 0 ∀j ∈ I_2 }    (1.15)

and the following constraint qualification holds at x: the vectors {f'_j(x)} for j ∈ I(x) ∪ I_2 are linearly independent, where I(x) = { j ∈ I_1 : f_j(x) = 0 } is the active inequality index set.

In the second case, we have

    T_C(x) = { y ∈ R^n : f'_j(x)^T y ≤ 0 ∀j ∈ I(x), f'_j(x)^T y = 0 ∀j ∈ I_2 }    (1.16)

and

    N_C(x) = { y ∈ R^n : y = Σ_{j∈I(x)∪I_2} λ_j f'_j(x) with λ_j ≥ 0 ∀j ∈ I(x) }.    (1.17)

Here is an example in which C ⊆ R² (see [3], Figure 4.4):

    C = { (−x, x) : x ≥ 0 } ∪ { (2^{−n}, 2^{−n}) : n ∈ N },
    T_C(0) = {0},
    D_C(0) = { (−x, x) : x ≥ 0 },
    K_C(0) = { (−x, x) : x ≥ 0 } ∪ { (x, x) : x ≥ 0 }.

For a given vector y ∈ D_C(x), the second adjacent set to C at x relative to y is defined by

    D²_C(x,y) := liminf_{h→0+} (C − x − hy)/h².

Now we define generalized derivatives for extended-valued functions. Suppose f is a lower semicontinuous function from R^n to R ∪ {+∞}. The epigraph of f is a nonempty closed subset of R^n × R defined by

    epi f = { (x, r) : r ≥ f(x) }.

For any point x in the domain of f, the adjacent derivative of f at x is the function f'_a(x,·) defined by

    epi f'_a(x,·) = D_{epi f}(x, f(x)).
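The gap between K_C(0) and D_C(0) in the example above can be exhibited numerically. The following sketch (our own illustration; the function names are assumptions) computes the distance from y = (1,1) to the difference quotient Δ_C(0,h): along h_k = 2^{−k} the isolated points (2^{−n}, 2^{−n}) land exactly on y, so y ∈ K_C(0); along h_k = 1.5·2^{−k} the quotient stays bounded away from y, so y ∉ D_C(0).

```python
import math

def dist_to_quotient(y, h, N=60):
    """Distance from y to Δ_C(0,h) = C/h, where
    C = {(-x, x) : x >= 0} ∪ {(2^{-n}, 2^{-n}) : n in N}."""
    y1, y2 = y
    # the half-line {(-t, t) : t >= 0} is a cone, hence invariant under division by h;
    # minimize |y - (-t, t)|^2 over t >= 0
    t = max(0.0, (y2 - y1) / 2.0)
    d = math.hypot(y1 + t, y2 - t)
    # the isolated points become (s, s) with s = 2^{-n} / h
    for n in range(N):
        s = 2.0 ** (-n) / h
        d = min(d, math.hypot(y1 - s, y2 - s))
    return d

y = (1.0, 1.0)
# along h_k = 2^{-k}: the quotient contains y exactly, so y ∈ K_C(0)
along_pow2 = [dist_to_quotient(y, 2.0 ** (-k)) for k in range(5, 40)]
# along h_k = 1.5 * 2^{-k}: nearest diagonal points are (2/3, 2/3) and (4/3, 4/3)
along_other = [dist_to_quotient(y, 1.5 * 2.0 ** (-k)) for k in range(5, 40)]

print(max(along_pow2), min(along_other))   # 0.0 versus about sqrt(2)/3
```

The second sequence keeps distance √2/3 from (1,1), which is why the adjacent cone (a "for every h" lower limit) excludes the direction (1,1) even though the contingent cone (an "along some h_k" upper limit) contains it.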
It can be expressed by

    f'_a(x,y) = lim_{ε→0+} limsup_{h→0+} inf_{d(y',y)<ε} [ f(x + hy') − f(x) ] / h.

If D_{epi f}(x, f(x)) = K_{epi f}(x, f(x)), it follows that f is epi-differentiable at x with epi-derivative f'_x(y) = f'_a(x,y) for all y. If f'_a(x,y) is finite, then the second adjacent derivative f''_a(x,y,·) is defined by

    epi f''_a(x,y,·) = D²_{epi f}( (x, f(x)), (y, f'_a(x,y)) ),

or

    f''_a(x,y,z) = lim_{ε→0+} limsup_{h→0+} inf_{d(z',z)<ε} [ f(x + hy + h²z') − f(x) − h f'_a(x,y) ] / h².

If f ∈ C², then f'_a(x,y) = f'(x)^T y and f''_a(x,y,z) = f'(x)^T z + (1/2) y^T f''(x) y. Another useful choice for f is the indicator of a given set C in R^n, defined by

    Ψ_C(x) = 0 if x ∈ C,  +∞ if x ∉ C.    (1.18)

In this case Ψ'_{C,a}(x,y) = Ψ_{D_C(x)}(y) and Ψ''_{C,a}(x,y,z) = Ψ_{D²_C(x,y)}(z).

Consider the following optimization problem in R^n: minimize a locally Lipschitz function g over a nonempty closed set C. It is equivalent to minimizing the extended-valued function f = g + Ψ_C. Suppose x ∈ C solves the problem. We have the first-order necessary condition

    0 ≤ f'_a(x,y) = g'_a(x,y) + Ψ_{D_C(x)}(y)    ∀y ∈ R^n.    (1.19)

Furthermore, if equality holds for some y, we have the second-order necessary condition

    0 ≤ f''_a(x,y,z) = g''_a(x,y,z) + Ψ_{D²_C(x,y)}(z)    ∀z ∈ R^n.    (1.20)

Consider the following two-dimensional example (see Figure 1.3): minimize the function g(x₁,x₂) = 1 − x₁ − x₂² over the unit disk C = { (x₁,x₂) : x₁² + x₂² ≤ 1 }. The set C and some level curves of g appear in Figure 1.3. To illustrate the theory, we define f = g + Ψ_C. At the point x = (1,0), the adjacent cone D_C(x) is the half space { (y₁,y₂) : y₁ ≤ 0 }. The adjacent derivative f'_a(x,y) = −y₁ is clearly nonnegative for any y ∈ D_C(x), so the first order necessary condition holds at x. If equality holds for some y, that is, y = (0,β) for some β, then we have

    D²_C(x,y) = { (z₁,z₂) : z₁ ≤ −β²/2 }.

Figure 1.3. Set C and level curves of g
Consequently the second adjacent derivative is f''_a(x,y,z) = −z₁ − β², and taking z = (−β²/2, 0) ∈ D²_C(x,y) gives f''_a(x,y,z) = −β²/2 ≤ 0. Since f''_a(x,y,z) < 0 whenever β ≠ 0, the second-order necessary condition fails and x is not a local minimum point. In fact, direct calculation shows that (1/2, √3/2) and (1/2, −√3/2) are two global minimum points, and (−1,0) is a global maximum point. It is worth noting that second order theory gives more information here: since the second adjacent derivative is negative at (1,0) along vertical directions, it shows how to move from (1,0) toward points with lower values of f. This sort of information can be very important in numerical computations.

If the set C above is the unit circle instead of the unit disk, then at the point x the adjacent cone is the vertical line D_C(x) = { y ∈ R² : y₁ = 0 }, and the second adjacent set relative to any y ∈ D_C(x) is the vertical line { z ∈ R² : z₁ = −y₂²/2 }. Again, the point x satisfies the first order necessary condition but fails the second order one. Indeed, direct calculation shows that the point x is a local maximum point on the set C.

Let us consider one more example about the second order adjacent set. Suppose a nonconvex closed set C is defined by

    C = { x ∈ R² : (x₁+1)² + x₂² ≥ 1, x₁² + x₂² ≤ 4, x₂ ≥ 0 }.

(See Figure 1.4.) The adjacent cone at the origin is shown in Figure 1.5; analytically,

    D_C(0) = { y ∈ R² : y₁ ≥ 0, y₂ ≥ 0 }.

We have D²_C(0,y) = R² if y is an interior point of the adjacent cone. If y is a boundary point of the adjacent cone, then

    D²_C(0,y) = { z ∈ R² : z₁ ≥ −y₂²/2 }    if y₁ = 0 and y₂ > 0,
    D²_C(0,y) = { z ∈ R² : z₁ ≥ 0, z₂ ≥ 0 }    if y₁ = 0 and y₂ = 0,
    D²_C(0,y) = { z ∈ R² : z₂ ≥ 0 }    if y₁ > 0 and y₂ = 0.

Figure 1.4. The set C    Figure 1.5.
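The claims of the disk example are easy to confirm numerically. In this sketch (grid resolution and names are our own choices), we sample the boundary of the unit disk — the interior contains no critical points of g, since ∂g/∂x₁ ≡ −1 — and recover the global minimum value −1/4 near x₁ = 1/2 and the global maximum 2 at (−1,0); boundary points arbitrarily close to (1,0) already have g < 0 = g(1,0), so (1,0) is not even a local minimum.

```python
import math

def g(x1, x2):
    return 1.0 - x1 - x2 ** 2

# extrema of g over the unit disk lie on the boundary, since grad g never vanishes
N = 4000
pts = [(math.cos(2 * math.pi * i / N), math.sin(2 * math.pi * i / N)) for i in range(N)]
vals = [g(*p) for p in pts]
gmin, gmax = min(vals), max(vals)
pmin = pts[vals.index(gmin)]          # approximately (1/2, ±sqrt(3)/2)

print(round(gmin, 4), round(gmax, 4))                     # -0.25  2.0
print(g(math.cos(0.01), math.sin(0.01)) < g(1.0, 0.0))    # True: (1,0) is not a local min
```

On the boundary g reduces to x₁² − x₁, which makes the values −1/4 (at x₁ = 1/2) and 2 (at x₁ = −1) immediate by one-variable calculus.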
The adjacent cone to C at 0

Chapter 2

Generalized Conjugate Points

This chapter concerns second order necessary conditions and conjugate point theory for the following control problem:

    minimize J(u) = ℓ(x(T)) + ∫₀ᵀ L(t, x(t), u(t)) dt    (2.1)

over all piecewise smooth trajectories x with piecewise continuous controls u satisfying

    x'(t) = f(t, x(t), u(t)),  x(0) = x₀,  x(T) ∈ C,  u(t) ∈ U.    (2.2)

Here the endpoint constraint set C and the control set U have the form

    C = { x ∈ R^n : Φ(x) = 0 },
    U = { u ∈ R^m : g_j(u) ≤ 0, j ∈ I_1; g_j(u) = 0, j ∈ I_2 }.    (2.3)

The standard approach to conjugate point theory is to consider the so-called "accessory problem", which is to minimize the second variation J₂(v) over all solution pairs (y,v) for some linearized system. If (x,u) solves the original problem then the associated functional J₂ is nonnegative. If some v solves the accessory problem, applying standard necessary conditions to this case produces the nonexistence of conjugate points. Our approach is quite different. Assuming that a generalized conjugate point exists, we construct a control v directly to show that the second variation must be negative.

This chapter is organized as follows. Section 2.1 reviews some background material and presents the basic hypotheses. Section 2.2 gives the second order necessary conditions for the problem (2.1). Section 2.3 describes the theory of generalized conjugate points. Section 2.4 offers some examples.

2.1. Preliminaries

We consider the following Mayer problem:

    minimize J(u) = ℓ(x(T)) over all (x,u) satisfying (2.2).    (2.4)

The problem (2.1) can be reduced to (2.4) by adding an extra state variable. We will do this later.

Terminology. Implicit in the statement of problem (2.1) is a relatively open subset Ω of [0,T] × R^n for which the domain of both f and L is Ω × U. A state-control pair (x,u) is admissible for (2.1) if it satisfies (2.2) and obeys (t, x(t)) ∈ Ω for all t ∈ [0,T].
An admissible pair (x,u) is a (local) solution to (2.1) if (upon shrinking Ω if necessary) any other admissible pair (y,v) satisfies J(v) ≥ J(u). An admissible pair (x,u) is normal if the only solution of the following system is p ≡ 0:

    −p'(t) = f_x(t)^T p(t),  −p(T) ∈ N_C(x(T)),
    p(t)^T f(t, x(t), u) ≤ p(t)^T f(t, x(t), u(t))  ∀u ∈ U.    (2.5)

Here f_x(t) = f_x(t, x(t), u(t)). (This interpretation will be applied to other functions as necessary.)

Standing Hypotheses. Let an admissible pair (x,u) be fixed. Assume that f(t,y,v) is measurable in t and C² in (y,v), and that there exists an integrable function k such that whenever (y,v) is near (x(t), u(t)), the first and second order derivatives of f(t,y,v) with respect to (y,v) are bounded by k(t).

Assume that the functions ℓ: R^n → R, Φ: R^n → R^k, and g_j: R^m → R for j ∈ I_1 ∪ I_2 are C² near the given points. (The disjoint index sets I_1 and I_2 are finite.) For any u ∈ U, write I(u) := { j ∈ I_1 : g_j(u) = 0 } for the set of active inequality indices. Assume that Φ is smooth, and that the matrix Φ'(x(T)) has full row rank. The tangent cone and normal cone are then given by (1.16) and (1.17), respectively. That is,

    T_C(x(T)) = { y ∈ R^n : Φ'(x(T)) y = 0 },
    N_C(x(T)) = { y ∈ R^n : y = Φ'(x(T))^T λ for some λ ∈ R^k }.

Assume further that the set of vectors { g'_j(u(t)) : j ∈ I(u(t)) ∪ I_2 } is linearly independent for almost all t in [0,T]. The tangent cone T_U(u(t)) is given by (1.16).

Definition 2.1. An admissible pair (x,u) is called an extremal if there exist a piecewise smooth function p, piecewise continuous functions γ_j, and a vector λ ∈ R^k, satisfying

(a) the adjoint equation −p'(t) = H_x(t);
(b) the transversality condition −p(T) = Λ'(x(T));
(c) the maximum principle H_u(t) = 0;
(d) the complementary slackness γ_j(t) ≥ 0 and γ_j(t) g_j(u(t)) = 0 for all j ∈ I_1;

where

    H(t,x,p,u,γ) = p^T f(t,x,u) − Σ_{j∈I_1∪I_2} γ_j g_j(u),    Λ(x) = ℓ(x) + λ^T Φ(x).
Combining the maximum principle (Theorem 1.1) and the Kuhn-Tucker theorem [10], we have the following first-order necessary condition.

Theorem 2.2. Suppose (x,u) solves (2.4) and is normal. Then (x,u) is an extremal.

2.2. Second Order Necessary Conditions

Let (x,u) be an extremal with corresponding p, γ and λ. Let D(t) = { j ∈ I(u(t)) : γ_j(t) > 0 } denote the set of active indices for which the associated multiplier is strictly positive. Let

    V(t) = { v ∈ R^m : g'_j(u(t))^T v = 0 ∀j ∈ D(t) ∪ I_2,  g'_j(u(t))^T v ≤ 0 ∀j ∈ I(u(t)) \ D(t) }.    (2.6)

Notice that for each t, V(t) is a closed convex subcone of T_U(u(t)). We associate with V(t) the following family of functions:

    V = { v ∈ PWC[0,T] : v(t) ∈ V(t) ∀t }.    (2.7)

Thus V is a convex cone of piecewise continuous functions.

To obtain the second-order necessary condition, we perturb the control in (2.2) to get variational systems. Suppose the given control u belongs to a family of controls indexed by θ, so that u(t) = ψ(t,0) for all t and a general parameter value θ produces a control ψ(t,θ) with corresponding trajectory x(t,θ). We first give a definition that describes the properties desired in such a perturbation.

Definition 2.3. A function ψ is called an admissible control variation if there exist constants τ > 0, M > 0 such that ψ: [0,T] × [−τ,τ] → R^m is well defined and satisfies

• ψ(·,θ) is piecewise continuous and ψ(t,·) is C²;
• ψ(t,0) = u(t);
• ‖ψ(t,·)‖_∞, ‖ψ_θ(t,·)‖_∞, ‖ψ_θθ(t,·)‖_∞ ≤ M.

The key point in this definition is that τ and M are independent of t, which is essential later in applying the inverse function theorem.

Suppose ψ(t,θ) is an admissible control variation, and x(t,θ) is a trajectory satisfying x_t(t,θ) = f(t, x(t,θ), ψ(t,θ)) for all t and x(0,θ) = x₀. Then x(t,0) = x(t), x(t,·) is C², and for θ sufficiently small one has (t, x(t,θ)) ∈ Ω for all t ∈ [0,T].
We can write

    x(t,θ) = x₀ + ∫₀ᵗ f(s, x(s,θ), ψ(s,θ)) ds.    (2.8)

Since the first and second order derivatives of f(t,y,v) with respect to (y,v) are bounded by an integrable function k(t) whenever (y,v) is near (x(t), u(t)), and ψ_θ(t,θ) and ψ_θθ(t,θ) are bounded by a constant M, we can differentiate (2.8) near θ = 0 under the integral by the dominated convergence theorem. To describe the results, denote y(t) = x_θ(t,0) and v(t) = ψ_θ(t,0). Then (y,v) satisfies

    y'(t) = f_x(t) y(t) + f_u(t) v(t),  y(0) = 0.    (2.9)

Similarly, denote z(t) = x_θθ(t,0) and w(t) = ψ_θθ(t,0). Then (z,w) satisfies the following equation relative to (y,v):

    z'(t) = f_x(t) z(t) + f_u(t) w(t) + H₂(t, y(t), v(t)),  z(0) = 0,    (2.10)

where H₂(t,y,v) = y^T f_xx(t) y + 2 y^T f_xu(t) v + v^T f_uu(t) v.

Now we consider the variational system associated with (2.2) relative to the given pair (x,u), namely,

    y'(t) = f_x(t) y(t) + f_u(t) v(t),  y(0) = 0,  v ∈ V.    (2.11)

Let Z(t) ∈ R^{n×n} be a solution of the matrix differential equation Z'(t) = −Z(t) f_x(t), Z(T) = I. Then any solution pair (y,v) of (2.11) can be expressed as

    y(t,v) = ∫₀ᵗ Z(t)^{−1} Z(s) f_u(s) v(s) ds.

For any solution pair (y,v) of (2.11) with Φ'(x(T)) y(T) = 0, the corresponding first variation of the cost is J₁(v) = ℓ'(x(T))^T y(T) = 0. This can be obtained by combining Definition 2.1, the admissible direction set (2.6), and the linearized system (2.11), as shown below:

    J₁(v) = ℓ'(x(T))^T y(T)
          = −p(T)^T y(T)    by Definition 2.1 (b)
          = −∫₀ᵀ ( p(t)^T y(t) )' dt    by (2.11)
          = −∫₀ᵀ ( p'(t)^T y(t) + p(t)^T y'(t) ) dt    (2.12)
          = −∫₀ᵀ p(t)^T f_u(t) v(t) dt    by Definition 2.1 (a) and (2.11)
          = −∫₀ᵀ Σ_{j∈D(t)∪I_2} γ_j(t) g'_j(u(t))^T v(t) dt    by Definition 2.1 (c), (d)
          = 0    by (2.6).

Theorem 2.4. Suppose U is a convex set, (x,u) is an extremal pair, and I(u(t)) is a piecewise constant set.
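The representation of y(t,v) through Z can be checked numerically. In the scalar sketch below, the data f_x(t) = sin t, f_u(t) = 1 + t and v(t) = cos 3t are arbitrary choices for illustration; we integrate the variational equation (2.9) directly and compare with the formula y(T) = ∫₀ᵀ Z(s) f_u(s) v(s) ds, where Z(s) = exp(∫_s^T f_x) solves Z' = −Z f_x with Z(T) = 1.

```python
import math

# scalar illustration (assumed data): f_x(t) = sin t, f_u(t) = 1 + t, v(t) = cos 3t
a = lambda t: math.sin(t)
b = lambda t: 1.0 + t
v = lambda t: math.cos(3.0 * t)
T, N = 1.0, 100000
dt = T / N

# direct Euler integration of the variational equation y' = f_x y + f_u v, y(0) = 0
y = 0.0
for i in range(N):
    t = (i + 0.5) * dt
    y += dt * (a(t) * y + b(t) * v(t))

# variation-of-constants: y(T) = ∫_0^T Z(s) f_u(s) v(s) ds with Z(s) = exp(∫_s^T f_x),
# since (Z y)' = Z f_u v and Z(T) = 1; here ∫_0^t sin = 1 - cos t
A = lambda t: 1.0 - math.cos(t)
yT = sum(dt * math.exp(A(T) - A((i + 0.5) * dt)) * b((i + 0.5) * dt) * v((i + 0.5) * dt)
         for i in range(N))

print(abs(y - yT))    # agreement up to discretization error
```

The identity behind the comparison is (Z(t)y(t))' = Z(t) f_u(t) v(t), which is exactly how the fundamental matrix Z eliminates the f_x y term.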
If (x,u) actually solves the problem (2.4), then either

(a) there exists a nonzero vector α ∈ R^k such that α^T Φ'(x(T)) Z(t) f_u(t) v(t) ≤ 0 a.e. t, ∀v ∈ V; or

(b) for any solution (y,v) of (2.11) obeying Φ'(x(T)) y(T) = 0, we have the second variation J₂(v) ≥ 0, where

    J₂(v) = y(T)^T Λ''(x(T)) y(T) − ∫₀ᵀ ( y(t), v(t) )^T ( H_xx(t)  H_xu(t) ; H_ux(t)  H_uu(t) ) ( y(t), v(t) ) dt.    (2.13)

Proof. Let R₁(T) denote the reachable set of (2.11) at time T. Since R₁(T) is a convex cone containing 0, so is its linear image Φ'(x(T)) R₁(T). We must have either 0 ∈ bdy Φ'(x(T)) R₁(T) or 0 ∈ int Φ'(x(T)) R₁(T).

In the first case, by the separation theorem, there exists a nonzero vector α ∈ R^k such that

    α^T Φ'(x(T)) y(T,v) ≤ 0  ∀ y(T,v) ∈ R₁(T).    (2.14)

For any v ∈ V and 0 < t < T, choose ε > 0 sufficiently small that 0 < t + ε < T. Define v̂ ∈ V by v̂(s) = v(s) if s ∈ (t, t+ε) and v̂(s) = 0 otherwise. Substitute v̂ into (2.14) to get

    α^T Φ'(x(T)) ∫_t^{t+ε} Z(s) f_u(s) v(s) ds ≤ 0.

Divide by ε and let ε decrease to 0: this gives conclusion (a).

In the second case we have Φ'(x(T)) R₁(T) = R^k. We prove (b) in several steps. To simplify expressions, we denote

    ℓ₂(y₁, y₂) = y₁^T Λ''(x(T)) y₂,    (2.15)
    L₂(t, y₁, v₁, y₂, v₂) = ( y₁, v₁ )^T ( H_xx(t)  H_xu(t) ; H_ux(t)  H_uu(t) ) ( y₂, v₂ ).

Since u is piecewise continuous and I(u(t)) is piecewise constant, there exists a partition of [0,T], 0 = t₀ < t₁ < ⋯ < t_{l+1} = T, such that on each interval (t_i, t_{i+1}) the control u is continuous and the index set I(u(t)) is constant (i = 0, …, l). For ε > 0 sufficiently small, define A_ε = ∪_{i=0}^{l} [t_i + ε, t_{i+1} − ε] and B_ε = [0,T] \ A_ε. Now define a closed convex subcone of V(t) by

    V_ε(t) = V(t) if t ∈ A_ε,  {0} if t ∈ B_ε,

and write V_ε = { v ∈ PWC[0,T] : v(t) ∈ V_ε(t) }. Note that if ε₁ < ε₂, then V_{ε₂} ⊆ V_{ε₁} ⊆ V, and if m denotes Lebesgue measure, then m(B_ε) = 2ε(l+1). Denote by Γ_ε(T) the reachable set of (2.11) with control set V_ε in place of V.

Step 1. We show that if Φ'(x(T)) maps R₁(T) onto R^k, then it also maps the smaller set Γ_ε(T) onto R^k, provided that ε is sufficiently small.
Lemma 2.5. If Φ'(x(T)) R₁(T) = R^k, then there exists ε₀ > 0 such that Φ'(x(T)) Γ_ε(T) = R^k for all 0 < ε ≤ ε₀.

Proof. If the conclusion were false, there would exist a sequence ε_n → 0+ such that Φ'(x(T)) Γ_{ε_n}(T) ≠ R^k. Since Φ'(x(T)) Γ_{ε_n}(T) is a convex cone containing 0, we must have 0 ∈ bdy Φ'(x(T)) Γ_{ε_n}(T). Thus there exists a sequence α_n ∈ R^k with |α_n| = 1, such that

    α_n^T Φ'(x(T)) y(T,v) ≤ 0  ∀v ∈ V_{ε_n}.

We can choose a subsequence of {α_n} converging to some vector α with |α| = 1 (we use the same labels). Now for any v ∈ V, define v_n ∈ V_{ε_n} by v_n(t) = v(t) if t ∈ A_{ε_n} and v_n(t) = 0 otherwise. This construction gives α_n^T Φ'(x(T)) y(T, v_n) ≤ 0 for all n. Since ‖v_n − v‖²_{L²} = ∫_{B_{ε_n}} |v(t)|² dt → 0 implies ‖y(·,v_n) − y(·,v)‖_∞ → 0 as n → ∞, we certainly have |y(T,v_n) − y(T,v)| → 0 as n → ∞. Taking the limit, we have α^T Φ'(x(T)) y(T,v) ≤ 0. Since v ∈ V is arbitrary, we have Φ'(x(T)) R₁(T) ≠ R^k, a contradiction. Q.E.D.

Step 2. We show that for any v ∈ V_ε, we can construct an admissible control variation with the properties described in Definition 2.3.

Lemma 2.6. For any ε > 0, suppose v ∈ V_ε is given. Then there exist constants τ > 0 and M > 0, and an admissible control variation ψ: [0,T] × [−τ,τ] → R^m, such that ψ_θ(t,0) = v(t) for all t ∈ [0,T]. Furthermore, g_j(ψ(t,θ)) = 0 for all j ∈ D(t) ∪ I_2, and g_j(ψ(t,θ)) ≤ 0 for all j ∈ I_1 \ D(t), whenever 0 ≤ θ ≤ τ.

Proof. Fix an index i, 0 ≤ i ≤ l. On the interval [t_i + ε, t_{i+1} − ε], we know that u is continuous and I(u(t)) is constant, say I_i. Write U_i = { u(t) ∈ R^m : t_i + ε ≤ t ≤ t_{i+1} − ε }, G_i(u) = ( g_j(u) )_{j∈I_i∪I_2} (a vector), and G'_i(u) =
For any u E Ui, the matrix Gii (u) has full row rank by our standing hypothesis and g i (u) < 0 for all j E / 1 \ The pseudo-inverse of G'i (u) is defined by G#(u) = G:(u)T (Ci (u)Ci (u)Tri . Since the functions gj are twice continuously differentiable and the set U; is a compact subset of U, we can find constants Si > 0, ci > 0 such that for any u E SiB, we have • Gay) has full row rank; • g i(u) < 0 for j E \ • Gi(u),^(u) and their derivatives are bounded by ci. For each fixed t E^e, t;+1 — e], we solve the following differential equation with independent variable 9: ,l, (0) = G!° (0 (0))Gii (u(t))v (t) +^—^(0 (0))Ci (0 (0))) v(t)^ 0(0) = u(t). (2.16) We have 1,1)0(t,0) = v(t), and Galk(t,0))00(t,60) = Gau(t))v(t). In particular, 0)) T 0,9(t, 0) = g,(u(t))Tv(t) for any j E U/ 2 . Let 9(t) = inf {0 > 0 10(t, 0) — u(t)i^8i). That is, 9(t) is the first time the trajectory of (2.16) leaves the ball centered at u(t) with the radius bi. Certainly 0(0 > 0 and 0(t,.) is well-defined on the interval [0, OM]. For 0 < 9 < 9(t), we have 0(t,0) E u(t) + 5 B C SiB, so 10(t, 8)1 < iu(t)i^bi,100(t, 9 )1^( 1 + cniv(t)i and 1000(t, 61 )1 5 3 cnv(t)1169(t, OA by (2.16). Denote =^bi, (1 +^3c(1 + cDliveo 1. We have either 9(t) = oo or 9(t) is finite. In the latter case, I0(t , 0(t)) — (t, 0) 1=I f e(i) ^6.(t , 0) d01 5 MAO, Chapter 2. Generalized Conjugate Points^31 that is, OW > Ty := IA. Thus the function 0(t,.) is defined at least on the interval [0, Ti]. Similarly, we can prove that the solution for the differential equation (2.16) is well defined on the interval [—Ti, 0]. For 0 < 0 < Ti, we have M71(t , 0)) < 0, Vj E / 1 h, while for j E D(t)1.) /2 , the text immediately following (2.16) implies that 9 gj(k(i, 0)) = g j(0(t, 0)) + I (g j(0(t , 0))) 1 dB 0 = gi(u(t)) + f g i (11,(t , 0)) T 0(t , 0) dB =f f g i (u(t)) T v(t) dO 0. Similarly, for jEL\D(t) we have 0 g^(t, 0)) = g (u(t)) +^gii (u(t)) T v(t) dB 0 < 0. 
So for each t ∈ [t_i + ε, t_{i+1} − ε] there is a C² trajectory ψ(t,·) on the interval [−τ_i, τ_i], satisfying ‖ψ(t,·)‖_∞, ‖ψ_θ(t,·)‖_∞, ‖ψ_θθ(t,·)‖_∞ ≤ M_i. Since v is piecewise continuous on the interval [t_i + ε, t_{i+1} − ε], by continuous dependence of differential equations on parameters, we see that ψ(·,θ) is piecewise continuous on [t_i + ε, t_{i+1} − ε]. By following the argument above for each index i = 0, …, l, we define ψ(t,θ) for all t in A_ε. We then define ψ(t,θ) = u(t) if t ∈ B_ε. By letting τ = min_{0≤i≤l} τ_i and M = max_{0≤i≤l} M_i, we see immediately that ψ has all the desired properties. Q.E.D.

Step 3. We show that the second variation J₂(v) is nonnegative for controls v in the set V_ε.

Lemma 2.7. Suppose Φ'(x(T)) Γ_ε(T) = R^k. Then for any y(T,v) ∈ Γ_ε(T) with Φ'(x(T)) y(T,v) = 0, we have the second variation J₂(v) ≥ 0.

Proof. Let c = 1/(k+1), and let e_i (i = 1, …, k) denote the standard basis for R^k. Define

    ē₀ = −c Σ_{i=1}^k e_i,    ē_i = e_i + ē₀,  i = 1, …, k.

Then we have Σ_{i=0}^k ē_i = 0, while the vectors ē_i − ē₀ = e_i (i = 1, …, k) are linearly independent. Since Φ'(x(T)) Γ_ε(T) = R^k, there exist controls v_i ∈ V_ε (i = 0, …, k) such that Φ'(x(T)) y(T, v_i) = ē_i. Now for any y(T,v) ∈ Γ_ε(T) with Φ'(x(T)) y(T,v) = 0, and any 0 < τ < 1, define ṽ_i = (1−τ) v + τ v_i ∈ V_ε. Then y(t, ṽ_i) = (1−τ) y(t,v) + τ y(t,v_i), and Φ'(x(T)) y(T, ṽ_i) = τ ē_i. Observe that by (2.12),

    ℓ'(x(T))^T y(T, ṽ_i) = Λ'(x(T))^T y(T, ṽ_i) − λ^T Φ'(x(T)) y(T, ṽ_i) = −τ λ^T ē_i.

Since ṽ_i ∈ V_ε, there exist constants τ_i > 0, M_i > 0 and an admissible control variation ψ_i: [0,T] × [−τ_i, τ_i] → R^m satisfying all the conclusions of Lemma 2.6 for the control ṽ_i. Let τ = min_{0≤i≤k} τ_i, M = max_{0≤i≤k} M_i, and define ψ: [0,T] × B_{k+1}(τ) → R^m by

    ψ(t,θ) = u(t) + Σ_{i=0}^k ( ψ_i(t, θ_i) − u(t) ).

Let x(·,θ) be the corresponding trajectory of the full nonlinear system (2.2) with control ψ(t,θ). Since ψ_{θ_i}(t,0) = ṽ_i(t), define y_i(t) = x_{θ_i}(t,0); then (y_i, ṽ_i) satisfies (2.9). Define F: B_{k+1}(τ) → R^{k+1} by

    F(θ) = ( Σ_{i=0}^k θ_i, Φ(x(T,θ)) ).
We have F(0) = 0 and

    ∂F/∂θ_i (0) = ( 1, Φ'(x(T)) y(T, ṽ_i) ) = ( 1, τ ē_i ).

We show that F'(0) is nonsingular. Suppose a ∈ R^{k+1} satisfies F'(0) a = 0. Expand this to get Σ_{i=0}^k a_i = 0 and Σ_{i=0}^k a_i τ ē_i = 0. If we write a₀ = −Σ_{i=1}^k a_i and substitute it into the latter expression, we have Σ_{i=1}^k a_i (ē_i − ē₀) = 0, which implies a_i = 0 for i = 1, …, k. Thus a = 0 and F'(0) is nonsingular. By the inverse function theorem, there exist ε₀ > 0 and twice continuously differentiable functions θ_i: (−ε₀, ε₀) → R such that θ(0) = 0 and

    Σ_{i=0}^k θ_i(ε) = ε,    Φ(x(T, θ(ε))) = 0    ∀ε ∈ [0, ε₀).

Differentiate the above expressions with respect to ε and let ε = 0. This gives

    Σ_{i=0}^k θ'_i(0) = 1,    0 = Φ'(x(T)) Σ_{i=0}^k y(T, ṽ_i) θ'_i(0) = τ Σ_{i=0}^k θ'_i(0) ē_i.

By the choice of c and ē_i, we have Σ_{i=0}^k (θ'_i(0) − c) = 0 and Σ_{i=0}^k (θ'_i(0) − c) ē_i = 0. So θ'_i(0) = c for i = 0, …, k. For ε₀ > 0 sufficiently small, we have θ_i(ε) ≥ 0 for all 0 ≤ ε < ε₀. Since U is a convex set, we get ψ(t, θ(ε)) ∈ U for all 0 ≤ t ≤ T and 0 ≤ ε < ε₀. Define

    ȳ(t) = (d/dε) x(t, θ(ε))|_{ε=0},    z̄(t) = (d²/dε²) x(t, θ(ε))|_{ε=0},
    v̄(t) = (d/dε) ψ(t, θ(ε))|_{ε=0},    w̄(t) = (d²/dε²) ψ(t, θ(ε))|_{ε=0}.

Then (ȳ, v̄) satisfies (2.9), where direct calculation reveals v̄(t) = c Σ_{i=0}^k ṽ_i(t) and ȳ(t) = c Σ_{i=0}^k y(t, ṽ_i). Similarly, (z̄, w̄) satisfies (2.10) relative to (ȳ, v̄), and

    w̄(t) = Σ_{i=0}^k ( ψ_{i,θ_iθ_i}(t,0) θ'_i(0)² + ψ_{i,θ_i}(t,0) θ''_i(0) ) = c² Σ_{i=0}^k w_i(t) + Σ_{i=0}^k θ''_i(0) ṽ_i(t),

where w_i(t) = ψ_{i,θ_iθ_i}(t,0). Since g_j(ψ_i(t,θ_i)) = 0 for j ∈ D(t) ∪ I_2, we may differentiate twice with respect to θ_i and let θ_i = 0. This shows that w_i(t) satisfies

    g'_j(u(t))^T w_i(t) + ṽ_i(t)^T g''_j(u(t)) ṽ_i(t) = 0    ∀i = 0, …, k, ∀j ∈ D(t) ∪ I_2.    (2.17)

Since Φ(x(T, θ(ε))) = 0, we differentiate this relation twice with respect to ε and let ε = 0 to get Φ'(x(T)) ȳ(T) = 0 and

    Φ''(x(T))(ȳ(T), ȳ(T)) + Φ'(x(T)) z̄(T) = 0.

Now define h(ε) = ℓ(x(T, θ(ε))). This function has a local minimum over [0, ε₀) at the point ε = 0.
Since

    h'(0) = ℓ'(x(T))^T ȳ(T) = c Σ_{i=0}^k ℓ'(x(T))^T y(T, ṽ_i) = c Σ_{i=0}^k ( −τ λ^T ē_i ) = 0,

we must have

    0 ≤ h''(0) = ℓ''(x(T))(ȳ(T), ȳ(T)) + ℓ'(x(T))^T z̄(T).

Note that

    ℓ'(x(T))^T z̄(T) = −p(T)^T z̄(T) − λ^T Φ'(x(T)) z̄(T) = −p(T)^T z̄(T) + λ^T Φ''(x(T))(ȳ(T), ȳ(T)).

We expand the first term as

    p(T)^T z̄(T) = ∫₀ᵀ p(t)^T ( f_u(t) w̄(t) + H₂(t, ȳ(t), v̄(t)) ) dt = h₁ + h₂ + h₃,

where h₁, h₂ and h₃ are given by

    h₁ = c² Σ_{i=0}^k ∫₀ᵀ p(t)^T f_u(t) w_i(t) dt
       = c² Σ_{i=0}^k ∫₀ᵀ Σ_{j∈D(t)∪I_2} γ_j(t) g'_j(u(t))^T w_i(t) dt
       = −c² Σ_{i=0}^k ∫₀ᵀ Σ_{j∈D(t)∪I_2} γ_j(t) g''_j(u(t))( ṽ_i(t), ṽ_i(t) ) dt
       = −c²(1−τ)² ∫₀ᵀ Σ_{j∈D(t)∪I_2} γ_j(t) g''_j(u(t))( v(t), v(t) ) dt + O(τ),

    h₂ = Σ_{i=0}^k θ''_i(0) ∫₀ᵀ p(t)^T f_u(t) ṽ_i(t) dt = 0,

and

    h₃ = ∫₀ᵀ p(t)^T H₂(t, ȳ(t), v̄(t)) dt = c²(1−τ)² ∫₀ᵀ p(t)^T H₂(t, y(t), v(t)) dt + O(τ).

Using these three expressions, we have

    h''(0) = ℓ₂(ȳ(T), ȳ(T)) − p(T)^T z̄(T)
           = c²(1−τ)² ( ℓ₂(y(T), y(T)) − ∫₀ᵀ L₂(t, y(t), v(t), y(t), v(t)) dt ) + O(τ).

Since h''(0) ≥ 0 for all 0 < τ < 1, we may let τ → 0+ and divide by c² to get the desired conclusion. Q.E.D.

Step 4. We finish the proof of Theorem 2.4 (b). Since Φ'(x(T)) Γ_{ε₀}(T) = R^k for some ε₀ > 0 by Lemma 2.5, if we denote B(ε₀) = { v ∈ V_{ε₀} : ‖v‖_∞ ≤ 1 }, then 0 is an interior point of the convex set Φ'(x(T)) y(T, B(ε₀)). There is a constant η₀ > 0 such that { x ∈ R^k : |x| ≤ η₀ } is contained in Φ'(x(T)) y(T, B(ε₀)).

Suppose v ∈ V and Φ'(x(T)) y(T,v) = 0. For any ε > 0, define v_ε ∈ V_ε by v_ε(t) = v(t) if t ∈ A_ε and v_ε(t) = 0 otherwise. We have ‖v_ε − v‖_{L²} → 0 and ‖y(·,v_ε) − y(·,v)‖_∞ → 0 as ε → 0+. We will show how to construct a sequence of controls w_ε ∈ V_{ε₀} such that ‖w_ε‖_∞ → 0 while J₂(v_ε + w_ε) ≥ 0. Indeed, if Φ'(x(T)) y(T, v_ε) = 0, then it suffices to choose w_ε = 0 by Lemma 2.7. On the other
hand, if Φ'(x(T)) y(T, v_ε) ≠ 0, there exists a w ∈ V_{ε₀} with ‖w‖_∞ ≤ 1 such that

    Φ'(x(T)) y(T, w) = −η₀ Φ'(x(T)) y(T, v_ε) / |Φ'(x(T)) y(T, v_ε)|.

Choose w_ε = |Φ'(x(T)) y(T, v_ε)| w / η₀; then Φ'(x(T)) y(T, w_ε) = −Φ'(x(T)) y(T, v_ε), and w_ε ∈ V_{ε₀} ⊆ V_ε. In other words, Φ'(x(T)) y(T, w_ε + v_ε) = 0 and w_ε + v_ε ∈ V_ε, so by Lemma 2.7 we have the second variation J₂(w_ε + v_ε) ≥ 0. Since

    J₂(v_ε + w_ε) = J₂(v_ε) + J₂(w_ε) + 2 ℓ₂( y(T,v_ε), y(T,w_ε) ) − 2 ∫₀ᵀ L₂(t, y(t,v_ε), v_ε(t), y(t,w_ε), w_ε(t)) dt

and ‖w_ε‖_∞ → 0, ‖y(·,w_ε)‖_∞ → 0 as ε → 0+, we may let ε → 0+ to get the desired result J₂(v) ≥ 0. Q.E.D.

Remark 2.8. The convexity of U is only used in Lemma 2.7 to generate a control such that the corresponding trajectory satisfies the endpoint constraints. If we have a free endpoint problem, we do not need this procedure, so we can drop the convexity condition on the control set U.

Now we obtain necessary conditions for the problem (2.1).

Theorem 2.9. Suppose the function L satisfies the same conditions as the function f, and (x,u) is an extremal pair for problem (2.1). Then Theorem 2.4 still holds, provided that H is replaced by

    H = p^T f(t,x,u) − L(t,x,u) − Σ_{j∈I_1∪I_2} γ_j g_j(u).

Proof. If we define a new state variable x₀ by x'₀(t) = L(t, x(t), u(t)) with x₀(0) = 0, then the problem (2.1) is equivalent to the one we have discussed above. Standard methods give the result. We elaborate as follows. Denote x̃ = (x₀, x), f̃(t, x̃, u) = ( L(t,x,u), f(t,x,u) ), ℓ̃(x̃) = x₀ + ℓ(x), Φ̃(x̃) = Φ(x), and

    H̃(t, x̃, p̃, u, γ) = p₀ L + p^T f − Σ_{j∈I_1∪I_2} γ_j g_j(u).

This reduces (2.1) to the Mayer problem (2.4). If (x,u) solves (2.1), define x₀(t) = ∫₀ᵗ L(s, x(s), u(s)) ds. Then (x̃, u) solves the reduced problem. By hypothesis (x,u) is an extremal. Definition 2.1 (a)-(b) shows that the arc p̃ satisfies p₀(t) ≡ −1 and −p'(t) = f_x(t)^T p(t) − L_x(t), with −p(T) = Λ'(x(T)). Let Z̃ ∈ R^{(n+1)×(n+1)} be a solution of Z̃'(t) = −Z̃(t) f̃_x(t), Z̃(T) = I_{n+1}.
Then Z̃(t) has the form

    Z̃(t) = ( 1  ζ(t)^T ; 0  Z(t) )

for some row vector function ζ(t)^T, since the first column of f̃_x is zero. It is easy to check that

    α̃^T Φ̃'(x̃(T)) Z̃(t) f̃_u(t) v(t) = α^T Φ'(x(T)) Z(t) f_u(t) v(t),

because Φ̃'(x̃) = ( 0, Φ'(x) ) annihilates the first row of Z̃. The other expressions can be obtained in a similar way. Putting the above discussion together, we get the desired result. Q.E.D.

Comparison to other work. Warga considers the second variation for optimal control problems in [30], and proves results comparable to Theorem 2.4 and Theorem 2.9. His formulation involves an arbitrary convex control set U, and his accessory problem is based on a linearized system (comparable to our equation (2.11)) in which the controls v must satisfy both v(t) ∈ U − u(t) and p(t)^T f_u(t) v(t) = 0 for all t. When the control set U is given by a finite system of mixed equality and inequality constraints, these two conditions specify a subset of the set V(t) considered in this chapter. Indeed, we replace Warga's set U − u(t) by T_U(u(t)): it is well known that this tangent cone can be strictly bigger than either U − u(t) or the cone it generates. (To get from this cone to T_U(u(t)), a closure operation is required.) Thus assuming a specific structure for the set U allows us to consider more perturbation controls v than a direct application of Warga's theory would suggest. This improvement can be strict: in Example 2.21 below, an admissible v(t) in our sense yields a negative value for Warga's "second variation" even though the underlying extremal is optimal! The reason for this is that Warga's second variation functional does not satisfactorily account for the second-order shape of the boundary of the control set U. But Warga's second variation functional agrees with ours for controls v(t) restricted to U − u(t). We note also that Warga's methods apply equally well to time-varying control sets U(t): this is a refinement which could easily be added to our theory as well.

2.3. Generalized Conjugate Points

Definition 2.10.
Let (x,u) be an extremal. A point c ∈ (0,T) is a generalized conjugate point to T if there exists a triple (y, q, v) on [c, T] such that

    y'(t) = f_x(t) y(t) + f_u(t) v(t),  y(c) = 0,  y(T) ∈ T_C(x(T)),  v ∈ V,
    −q'(t) = f_x(t)^T q(t) + H_xx(t) y(t) + H_xu(t) v(t),  −q(T) ∈ Λ''(x(T)) y(T) − N_C(x(T)),
    ( q(t)^T f_u(t) + y(t)^T H_xu(t) + v(t)^T H_uu(t) ) v(t) ≥ 0,

and in addition, either (a) or (b) below holds:

(a) ( q(t)^T f_u(t) + y(t)^T H_xu(t) + v(t)^T H_uu(t) ) v(t) > 0 on a set of positive measure; or

(b) there exist a control v₁ ∈ V and an arc y₁ such that

    y'₁(t) = f_x(t) y₁(t) + f_u(t) v₁(t),  y₁(0) = 0,  y₁(c)^T q(c) > 0,  y₁(T) ∈ T_C(x(T)),
    ( q(t)^T f_u(t) + y(t)^T H_xu(t) + v(t)^T H_uu(t) ) v₁(t) ≥ 0.

Remark 2.11. The standard problem in the calculus of variations is to minimize the functional ∫₀ᵀ L(t, x(t), x'(t)) dt over all arcs x satisfying the endpoint constraints x(0) = x₀ and x(T) = x_T. To translate it into an optimal control problem, we may take f(t,x,u) = u and U = R^n, to get f_x(t) = 0, f_u(t) = I, Λ(x) = λ^T(x − x_T), T_C(x(T)) = {0}, and H = p^T u − L(t,x,u). Note that the system is normal. The conditions of Definition 2.10 refer to a triple (y, q, v) and a point c for which

    y'(t) = v(t),  y(c) = 0,  y(T) = 0,
    q'(t) = L_xx(t) y(t) + L_xu(t) v(t),  q(T) ∈ R^n,  v ∈ PWC,  v(t) ∈ R^n,
    ( q(t)^T − y(t)^T L_xu(t)^T − v(t)^T L_uu(t) ) v(t) ≥ 0.

These conditions certainly hold for any piecewise smooth pair (y, q) such that

    q'(t) = L_xx(t) y(t) + L_xu(t) y'(t),  y(c) = 0,  y(T) = 0,
    q(t) = L_ux(t) y(t) + L_uu(t) y'(t).    (2.18)

If L_uu(t) > 0 then the existence of a nontrivial solution y of the Jacobi equation is equivalent to the existence of a solution to (2.18) with q(c) ≠ 0. (Note that y'(c) ≠ 0 and L_uu(c) > 0 imply q(c) ≠ 0.) We may choose y₁(t) = t(T−t) q(c) and v₁ = y'₁. So our condition (b) holds.
From the above discussion we see that if c is a conjugate point in the calculus of variations and the strengthened Legendre condition holds, then c must be a generalized conjugate point. By choosing controls v as above we can find all the conjugate points of the calculus of variations. Of course, we may find other generalized conjugate points by choosing other controls v.

For example, suppose x is a smooth extremal for which the Legendre condition fails. That is, suppose there is a point c ∈ (0,T) and a unit vector ū such that ū^T L_uu(c) ū < 0. Let us show that c is a generalized conjugate point to T. Since ū^T L_uu(c) ū < 0, there exists δ > 0 such that ū^T L_uu(t) ū < 0 on [c, c+δ]. By the continuity of the functions L_xx(t), L_xu(t), and L_uu(t), there exist ε > 0 and M > 0 such that ū^T L_uu(t) ū < −ε and |L_xx(t)|, |L_xu(t)| are bounded by M for all c ≤ t ≤ c+δ. Now let 0 < d < 1 be so small that [c, c+2d] ⊆ [c, c+δ]. Denote c₁ = c+d and c₂ = c+2d, and consider the functions

    v(t) = ū on [c, c₁],  −ū on (c₁, c₂],  0 on (c₂, T];
    y(t) = ū(t−c) on [c, c₁],  −ū(t−c₂) on (c₁, c₂],  0 on (c₂, T];

and q(t) = ∫_c^t ( L_xx(s) y(s) + L_xu(s) v(s) ) ds. This choice of (y, q, v) satisfies all the basic conditions in Definition 2.10. The first three lines are obvious; to confirm the fourth, we need only consider the interval [c, c₂]. Here we have both |q(t)| ≤ d + 4Md and |( q(t)^T − y(t)^T L_xu(t)^T ) v(t)| ≤ d + 5Md, so

    ( q(t)^T − y(t)^T L_xu(t)^T − v(t)^T L_uu(t) ) v(t) ≥ −(1 + 5M)d + ε.

If, in addition to the restrictions imposed above, we insist that d < ε/(1+5M), then (y, q, v) satisfies all the basic conditions in Definition 2.10 and the supplementary condition (a). Therefore c is a generalized conjugate point to T.

The main result. Our next result provides a concrete construction to show that if an extremal x has a generalized conjugate point in (0,T), then the second order variation is negative for some admissible control.
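For a concrete classical instance, take L(t,x,x') = x'² − x² (our own choice for illustration). Along the extremal x ≡ 0, the Jacobi equation is y'' + y = 0, so conjugate points occur at distance π; by the discussion above, when T > π the point c = T − π lies in (0,T) and the second variation J₂(v) = ∫₀ᵀ (v² − y²) dt, taken over variations with y' = v and y(0) = y(T) = 0, should become negative. The sketch below evaluates J₂ at the test variation y(t) = sin(πt/T):

```python
import math

# accessory functional for L(t, x, x') = x'^2 - x^2 along the extremal x ≡ 0:
# J2(v) = ∫_0^T (v^2 - y^2) dt  with  y' = v, y(0) = y(T) = 0
def J2(T, N=20000):
    dt = T / N
    total = 0.0
    for i in range(N):
        t = (i + 0.5) * dt
        y = math.sin(math.pi * t / T)               # admissible variation vanishing at 0, T
        v = (math.pi / T) * math.cos(math.pi * t / T)
        total += dt * (v * v - y * y)
    return total

# exact value is (pi^2 / T^2 - 1) * T / 2: positive for T = 3 < π, negative for T = 3.5 > π
print(J2(3.0), J2(3.5))
```

The sign change at T = π is exactly the boundary drawn by conjugate point theory: on intervals shorter than π no (generalized) conjugate point exists and J₂ ≥ 0, while on longer intervals Theorem 2.12 below produces a variation with J₂ < 0.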
Theorem 2.12. Let $(x, u)$ be an extremal for problem (2.1). If there is a generalized conjugate point to $T$ in $(0, T)$, then there exists a pair $(y, v)$ satisfying the system (2.11) with $\Phi'(x(T))y(T) = 0$ such that $J_2(v) < 0$.

Proof. Suppose $c \in (0, T)$ is a generalized conjugate point to $T$. Starting with the triple $(y, q, v)$ of Definition 2.10, extend $y$ and $v$ to the whole interval $[0, T]$ by defining
$$y(t) = \begin{cases} 0, & t \in [0, c) \\ y(t), & t \in [c, T] \end{cases} \qquad v(t) = \begin{cases} 0, & t \in [0, c) \\ v(t), & t \in [c, T]. \end{cases}$$
Then $(y, v)$ is a solution pair of (2.11). The pointwise inequality condition in Definition 2.10 implies that, in the notation of (2.15),
$$J_2(v) = y(T)^T \ell''(x(T))y(T) - \int_c^T \big(y(t)^T H_{xx}(t)y(t) + 2y(t)^T H_{xu}(t)v(t) + v(t)^T H_{uu}(t)v(t)\big)\,dt$$
$$\le y(T)^T \ell''(x(T))y(T) + \int_c^T \big(y(t)^T (q'(t) + f_x(t)^T q(t)) + q(t)^T f_u(t)v(t)\big)\,dt$$
$$= y(T)^T \ell''(x(T))y(T) + \int_c^T \big(y(t)^T q(t)\big)'\,dt = y(T)^T \big(\ell''(x(T))y(T) + q(T)\big) - y(c)^T q(c) \le 0,$$
the final inequality holding because $y(c) = 0$, $y(T) \in T_C(x(T))$, and $\ell''(x(T))y(T) + q(T) \in N_C(x(T))$.

If condition (a) of Definition 2.10 holds, the first inequality above is strict, so $J_2(v) < 0$. Otherwise, condition (b) holds; if $J_2(v) < 0$ we are done, so we may suppose $J_2(v) = 0$. For $\alpha > 0$, define the control $v_\alpha = v_1 + \alpha v$; then $v_\alpha \in V$. Clearly, the arc $y_\alpha(t) = y_1(t) + \alpha y(t)$ is the solution of the system (2.11) corresponding to $v_\alpha$, and
$$J_2(v_\alpha) = J_2(v_1) + \alpha^2 J_2(v) + 2\alpha y_1(T)^T \ell''(x(T))y(T) - 2\alpha \int_c^T \big(y_1^T H_{xx} y + y_1^T H_{xu} v + y^T H_{xu} v_1 + v^T H_{uu} v_1\big)\,dt$$
$$\le J_2(v_1) + 2\alpha y_1(T)^T \ell''(x(T))y(T) + 2\alpha \int_c^T \big(y_1^T (q' + f_x^T q) + q^T f_u v_1\big)\,dt$$
$$= J_2(v_1) + 2\alpha\Big(y_1(T)^T \ell''(x(T))y(T) + \int_c^T (y_1^T q)'\,dt\Big) = J_2(v_1) + 2\alpha y_1(T)^T \big(\ell''(x(T))y(T) + q(T)\big) - 2\alpha y_1(c)^T q(c)$$
$$\le J_2(v_1) - 2\alpha y_1(c)^T q(c).$$
The right side approaches $-\infty$ as $\alpha \to \infty$, so for all $\alpha$ sufficiently large, we have $J_2(v_\alpha) < 0$. Q.E.D.

Combining Theorem 2.4 and Theorem 2.12, we immediately have the chapter's main result: Theorem 2.13.
Suppose that $U$ is a convex set, that $(x, u)$ is an extremal, and that the active index set $I(u(t))$ is piecewise constant. If $(x, u)$ actually solves the problem (2.1), then either

(a) there exists a nonzero vector $a \in \mathbb{R}^k$ such that $a^T \Phi'(x(T))Z(t)f_u(t)v(t) \le 0$ a.e. $t$, for all $v \in V$; or

(b) the interval $(0, T)$ contains no generalized conjugate points to $T$.

If $(x, u)$ solves the problem (2.1), then to eliminate possibility (a) in Theorem 2.13, we have to add some conditions on $(x, u)$.

Definition 2.14. An admissible pair $(x, u)$ is called regular if the only solution of the following system is $p \equiv 0$:
$$-p'(t) = f_x(t)^T p(t), \qquad -p(T) \in N_C(x(T)), \qquad p(t)^T f_u(t)v \le 0 \ \ \forall v \in V(t).$$
It is easy to prove that if $(x, u)$ is regular, then $\Phi'(x(T))R_1(T) = \mathbb{R}^k$. (See [32, Remark 3.2].) The next result therefore follows immediately from Theorem 2.13.

Theorem 2.15. Suppose the control set $U$ is convex. Let $(x, u)$ be a regular extremal. If $(x, u)$ actually solves the problem (2.1), then the interval $(0, T)$ contains no generalized conjugate points to $T$.

It is worth mentioning that in the classical calculus of variations, or in the free endpoint problem, any minimizing pair $(x, u)$ is automatically regular. The following example shows that in general, the regularity condition is indispensable.

Example 2.16. In the problem (2.4), let $\ell = 0$, $f(x, u) = u_1^2 + u_1 u_2 + u_2$, $x(0) = 0$, $x(1) = 0$, and define the control set $U$ by $g(u) = (-u_1, u_2) \le 0$. Any admissible pair $(x, u)$ is optimal. Obviously, $(x, u) = (0, 0)$ is an admissible pair and is also normal: (2.5) tells us that $-p'(t) = 0$ and $p(u_1^2 + u_1 u_2 + u_2) \le 0$ for all $u_1 \ge 0$ and $u_2 \le 0$. Since $u_1 = 0$ implies $p \ge 0$ while $u_2 = 0$ implies $p \le 0$, we get $p = 0$. Theorem 2.2 tells us that $(x, u) = (0, 0)$ is an extremal. We compute the corresponding $p$ and $\gamma$. (See Definition 2.1.) Here $H = p(u_1^2 + u_1 u_2 + u_2) + \gamma_1 u_1 - \gamma_2 u_2$.
Condition (a) gives $-p'(t) = 0$, condition (b) gives no information, and conditions (c) and (d) give
$$H_u = \big(2pu_1 + pu_2 + \gamma_1,\ pu_1 + p - \gamma_2\big) = 0, \qquad \gamma \ge 0, \qquad \gamma^T g(u) = 0.$$
By letting $u = 0$, we see that $p = 1$, $\gamma = (0, 1)$ satisfy the above conditions. Since $D(t) = \{2\}$, we have
$$V(t) = \{v \in \mathbb{R}^2 \mid g_1'(0)^T v \le 0,\ g_2'(0)^T v = 0\} = \mathbb{R}_+ \times \{0\}.$$
If we choose $y(t) \equiv 0$, $q(t) \equiv 1$, and $v(t) \equiv (1, 0)$, then this triple satisfies Definition 2.10. So even though $(x, u) = (0, 0)$ is optimal and normal, any point $c$ in $(0, 1)$ is a generalized conjugate point to $1$. This is because $(x, u) = (0, 0)$ is not regular: in this problem $f_u(t) = (0, 1)$, so $f_u(t)v(t) = 0$ for any $v \in V$. Here conclusion (a) of Theorem 2.13 holds, but (b) does not.

Comparison to other work. Zeidan and Zezza [32] discuss the optimal control problem (2.1) under the structural assumption that $U = \{u \in \mathbb{R}^m \mid g_j(u) = 0,\ j \in I_2\}$ for smooth functions $g_j$. In this case, our cone $V(t) = \{v \in \mathbb{R}^m \mid g_j'(u(t))^T v = 0,\ j \in I_2\}$ is a subspace of $\mathbb{R}^m$. They call a point $c \in (0, T)$ conjugate to $T$ if there exists a nonzero triple $(y, q, v)$ satisfying the first four conditions in Definition 2.10 and
$$f_u(t)^T q(t) + H_{xu}(t)^T y(t) + H_{uu}(t)^T v(t) \perp V(t).$$
Their definition also extends that of the calculus of variations with a similar formulation, and it is simpler than ours. But there is a price to pay for it: to apply the conjugate point theory in [32], the system must be strongly normal at both endpoints and a version of the strengthened Legendre condition must hold. (See [32, Theorem 6.1].) The pair $(x, u)$ is said to be strongly normal at $0$ if for any $c \in (0, T)$ the system
$$-p'(t) = f_x(t)^T p(t), \ t \in [0, c], \qquad p(t)^T f_u(t)v = 0 \ \ \forall v \in V(t), \ t \in [0, c]$$
has only the zero solution. Similarly, the pair $(x, u)$ is said to be strongly normal at $T$ if for any $c \in (0, T)$ the system
$$-p'(t) = f_x(t)^T p(t), \ t \in [c, T], \qquad p(t)^T f_u(t)v = 0 \ \ \forall v \in V(t), \ t \in [c, T], \qquad -p(T) \in N_C(x(T))$$
has only the zero solution.
Obviously, since there is no transversality condition at $0$, the strong normality condition at $0$ is unlikely to hold except in the calculus of variations or in free initial point problems. In fact, under these conditions the conjugate point set in [32] is a subset of ours: the strengthened Legendre condition of [32] implies that the triple $(y, q, v)$ being nonzero is equivalent to $q(c) \ne 0$, while the strong normality condition of [32] implies that the linearized system is completely controllable on both intervals $[0, c]$ and $[c, T]$ for any $c \in (0, T)$, so that condition (b) of Definition 2.10 holds.

The key point in this chapter is to give a rigorous treatment of second order necessary conditions for control sets involving inequality constraints along with equalities. Although Zeidan and Zezza [32, Remark 6.4] claim that the inequality constrained case can be studied by their methods, their justification is neither clear nor convincing. The problem arises because control sets defined by mixed constraint systems may have corners, whereas the methods of [32] are valid exclusively for smooth control sets. Even if the cited remark were correct, Zeidan and Zezza choose only a subspace of $V(t)$, namely
$$T(t) = \{v \in \mathbb{R}^m \mid g_j'(u(t))^T v = 0 \ \ \forall j \in I(u(t)) \cup I_2\},$$
to define conjugate points. Their definition is the same as that for smooth control sets, with $T(t)$ in place of $V(t)$. The examples below will show how our generalized conjugate points give much more information than those suggested by [32].

Zeidan and Zezza [33] define conjugate points for the general closed convex control set $U$ [33, Definition 3.1]. Their proof of nonexistence of conjugate points for a weak local solution is based on the strong normality condition, and on the second order variations in [4], which are not as sharp as ours since they do not consider the structure of the control set $U$. In fact, under the conditions of [33, Theorem 3.1] the conjugate point set of [33] is a subset of ours.
So our notion of generalized conjugate point is sharper and more general; the examples below will confirm this. In Theorem 2.15 we require neither the strong normality condition at the endpoint $0$ nor the strengthened Legendre condition. Also, in our definition of generalized conjugate point, condition (b) is much weaker than the strong controllability condition, and if condition (a) holds, we do not need to consider controllability at all. In [34] Zeidan and Zezza claim that the strong normality conditions on both ends of the basic interval are indispensable. That is because their definition of conjugate point and their method of proving nonexistence of conjugate points force them to impose this condition. Our theory proceeds without it.

2.4. Examples

Here we study several examples and compare our results with those available in the literature.

Example 2.17. Consider the following free endpoint problem:
$$\text{minimize } \int_0^T \big(u(t)^2 - x(t)^2\big)\,dt \quad \text{subject to } x' = u, \ x(0) = 0, \ u \ge 0.$$
The pair $(x, u) = (0, 0)$ is extremal for any $T > 0$, but when $T > \pi/2$ it fails to be optimal. Our Theorem 2.15 detects this, whereas the results of [32, 33] do not.

Here we have $H(x, p, u, \gamma) = pu - u^2 + x^2 + \gamma u$. Let $(x, u)$ be an extremal; that is, suppose there exist $p$ and $\gamma$ satisfying (a)-(d):
$$-p' = 2x, \quad p(T) = 0, \quad H_u = p - 2u + \gamma = 0, \quad \gamma \ge 0, \quad \gamma u = 0.$$
Obviously, $(x, p, u, \gamma) = (0, 0, 0, 0)$ is a solution, so $(x, u) = (0, 0)$ is an extremal. Since $D(t) = \emptyset$, we have $V(t) = \mathbb{R}_+$; also $f_x = 0$, $f_u = 1$, $H_{xx} = 2$, $H_{xu} = 0$, $H_{uu} = -2$. A point $c \in (0, T)$ is a generalized conjugate point to $T$ (Definition 2.10) if there exists a triple $(y, q, v)$ such that
$$y' = v, \quad y(c) = 0, \qquad -q' = 2y, \quad q(T) = 0, \qquad (q - 2v)v \ge 0, \quad v \ge 0. \tag{2.19}$$
Let $v(t) = q(t)/2$ in (2.19). Then (2.19) holds for $y(t) = \cos(T - t)$, $q(t) = 2\sin(T - t)$. The constraint $v \ge 0$ holds as long as $T - t \in [0, \pi]$. And when $T > \pi/2$, the choice $c = T - \pi/2$ makes $y(c) = 0$ and $q(c) = 2$.
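The triple just constructed for (2.19) can be checked numerically. The following sketch is illustrative only (not part of the thesis); the horizon $T = 2$ is an arbitrary choice exceeding $\pi/2$. It verifies the two differential relations, the boundary values, and the sign constraint $v \ge 0$ on $[c, T]$.

```python
import numpy as np

# Sanity check of the triple (y, q, v) proposed for system (2.19).
T = 2.0                    # any horizon with T > pi/2
c = T - np.pi / 2          # candidate generalized conjugate point
t = np.linspace(c, T, 2001)
dt = t[1] - t[0]

y = np.cos(T - t)          # proposed arc
q = 2 * np.sin(T - t)      # proposed adjoint
v = q / 2                  # proposed control variation, v = q/2

yp = np.gradient(y, dt)    # numerical derivative y'
qp = np.gradient(q, dt)    # numerical derivative q'

print(np.max(np.abs(yp - v)))       # y' = v       (residual ~ 0)
print(np.max(np.abs(-qp - 2 * y)))  # -q' = 2y     (residual ~ 0)
print(abs(y[0]), abs(q[-1]))        # y(c) = 0 and q(T) = 0
print(v.min() >= 0.0)               # v >= 0 on [c, T]
```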
By choosing $v_1(t) = 2/c$ and $y_1(t) = 2t/c$, we confirm condition (b) of Definition 2.10. So there is a generalized conjugate point in $(0, T)$ whenever $T > \pi/2$, and Theorem 2.15 implies that $(x, u) = (0, 0)$ is not optimal. In fact, if we let
$$u(t) = \begin{cases} a\cos t, & 0 \le t < \pi/2 \\ 0, & \pi/2 \le t \le T \end{cases}$$
where $a$ is positive, then the corresponding trajectory $x$ is
$$x(t) = \begin{cases} a\sin t, & 0 \le t < \pi/2 \\ a, & \pi/2 \le t \le T \end{cases}$$
and $J(u) = -a^2(T - \pi/2) < 0$.

In [32, Definition 6.1] (compare [32, Remark 6.4]) the subspace $T(t) = \{0\}$ must be chosen instead of $V(t)$ to determine the conjugate points in $(0, T)$. This choice forces $v(t) = 0$, so $y(t) = 0$ and thus $q(t) = 0$: the definition in [32] produces no conjugate points. In [33, Definition 3.1] with the control set $U = \mathbb{R}_+$, there is a conjugate point $c = T - \pi/2$ to $T$ if $T > \pi/2$, but the strong normality condition (H2) of [33] is violated, so Theorem 3.1 of [33] cannot be applied. In short, both papers fail to detect the nonoptimality of $(x, u) = (0, 0)$.

Example 2.18. Consider the following free endpoint problem:
$$\text{minimize } J(u) = \int_0^T \big(u_2(t)^2 - x(t)^2\big)\,dt \quad \text{subject to } x' = u_1 + u_2, \ x(0) = 0, \ u_1 \ge 0.$$
Here the pair $(x, u) = (0, 0)$ is extremal but not optimal for any $T > 0$. This follows from our Theorem 2.15, but the results of [32, 33] give partial answers at best.

Here we have $H(x, p, u, \gamma) = pu_1 + pu_2 - u_2^2 + x^2 + \gamma u_1$. Let $(x, u)$ be an extremal with the corresponding $p$ and $\gamma$; that is,
$$-p' = 2x, \quad p(T) = 0, \quad H_u = \big(p + \gamma,\ p - 2u_2\big) = 0, \quad \gamma \ge 0, \quad \gamma u_1 = 0.$$
Evidently $(x, p, u, \gamma) = (0, 0, 0, 0)$ is a solution, so $(x, u) = (0, 0)$ is an extremal. Since $D(t) = \emptyset$, we have $V(t) = \{v \in \mathbb{R}^2 \mid g_1'(0)^T v \le 0\} = \mathbb{R}_+ \times \mathbb{R}$; also $f_x = 0$, $f_u = (1, 1)$, $H_{xx} = 2$, $H_{xu} = 0$, and $H_{uu} = \begin{pmatrix} 0 & 0 \\ 0 & -2 \end{pmatrix}$. A point $c$ in $(0, T)$ is a generalized conjugate point to $T$ if there exists a triple $(y, q, v)$ such that
$$y' = v_1 + v_2, \quad y(c) = 0, \qquad -q' = 2y, \quad q(T) = 0, \qquad qv_1 + (q - 2v_2)v_2 \ge 0, \quad v \in V(t). \tag{2.20}$$
Let $v_1(t) = 1$, $v_2(t) = 0$ in (2.20).
Then (2.20) holds for $y(t) = t - c$ and $q(t) = (T + t - 2c)(T - t)$. Moreover, condition (a) of Definition 2.10 holds on $(c, T)$. Thus any point in $(0, T)$ is a generalized conjugate point to $T$, so $(x, u) = (0, 0)$ is not optimal for any $T > 0$. In fact, if we choose $u_1(t) = a$, $u_2(t) = 0$, where $a$ is positive, then $J(u) = -a^2T^3/3 < 0$.

In [32, Definition 6.1], the subspace $T(t) = \{0\} \times \mathbb{R}$ must be chosen to determine the conjugate points in $(0, T)$, and there is a conjugate point $c = T - \pi/2$ to $T$ when $T > \pi/2$. In [33, Definition 3.1], with the control set $U = \mathbb{R}_+ \times \mathbb{R}$, no conjugate points are found for any $T > 0$. Our result is sharper in both cases.

Example 2.19. Consider the following calculus of variations problem with two-dimensional state $x$:
$$\text{minimize } J(u) = \int_0^T \big(u_2(t)^2 - x_1(t)^2\big)\,dt \quad \text{subject to } x' = u, \ x(0) = 0, \ x(T) = 0, \ u \in \mathbb{R}^2.$$
Our Theorem 2.15 implies that the extremal $(x, u) = (0, 0)$ is not optimal for any $T > 0$. The results of [32, 33] fail to detect this.

We have $H(x, p, u) = p_1 u_1 + p_2 u_2 - u_2^2 + x_1^2$. The problem is normal, and the pair $(x, u) = (0, 0)$ is an extremal with $p = 0$. Here $V(t) = \mathbb{R}^2$; also $f_x = 0$, $f_u = I$, $H_{xx} = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}$, $H_{xu} = 0$, and $H_{uu} = \begin{pmatrix} 0 & 0 \\ 0 & -2 \end{pmatrix}$. A point $c \in (0, T)$ is a generalized conjugate point to $T$ if there exists a triple $(y, q, v)$ such that
$$y' = v, \quad y(c) = 0, \quad y(T) = 0, \qquad -q_1' = 2y_1, \quad -q_2' = 0, \qquad q_1 v_1 + (q_2 - 2v_2)v_2 \ge 0, \quad v \in \mathbb{R}^2. \tag{2.21}$$
Let $c = T/3$, $v_1(t) = 1$ on $[c, 2c]$ and $v_1(t) = -1$ on $(2c, T]$, and $v_2(t) = 0$ in (2.21). We get the solutions
$$y_1(t) = \begin{cases} t - c, & c \le t \le 2c \\ T - t, & 2c < t \le T \end{cases} \qquad q_1(t) = \begin{cases} t(2c - t), & c \le t \le 2c \\ (t - 4c)(t - 2c), & 2c < t \le T \end{cases}$$
and $y_2 = 0$, $q_2 = 0$. It is easy to see that condition (b) of Definition 2.10 holds. So there is a generalized conjugate point to $T$ for all $T > 0$, and thus $(x, u) = (0, 0)$ is not optimal for any $T > 0$. In fact, if we choose $u_1(t) = a$ on $[0, T/2]$ and $u_1(t) = -a$ on $(T/2, T]$, with $u_2(t) \equiv 0$, then $J(u) = -a^2T^3/12 < 0$.
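The closed form $J(u) = -a^2T^3/12$ for the comparison control above is easy to confirm by quadrature. The sketch below is illustrative only (not part of the thesis; the values of $a$ and $T$ are arbitrary): it integrates the bang-bang control, checks the endpoint constraint $x_1(T) = 0$, and compares the computed cost with the closed form.

```python
import numpy as np

T, a = 3.0, 1.0
t = np.linspace(0.0, T, 200_001)
dt = np.diff(t)
u1 = np.where(t <= T / 2, a, -a)        # bang-bang comparison control, u2 = 0

# trapezoidal cumulative integral: x1' = u1, x1(0) = 0
x1 = np.concatenate(([0.0], np.cumsum((u1[1:] + u1[:-1]) / 2 * dt)))

# J(u) = integral of (u2^2 - x1^2) with u2 = 0
f = -x1**2
J = float(np.sum((f[1:] + f[:-1]) / 2 * dt))

print(abs(x1[-1]))                      # endpoint constraint x1(T) ~ 0
print(J, -a**2 * T**3 / 12)             # computed cost vs closed form
```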
Notice that the strengthened Legendre condition $H_{uu} < 0$ fails in this problem, so the classical theory of conjugate points does not apply. This condition is required by the results of both [32] and [33], so the example lies beyond the scope of both papers. Of course, the hypothesis cannot simply be ignored: choosing the subspace $T(t) = \mathbb{R}^2$ in [32, Definition 6.1] produces no conjugate points, and similarly, no conjugate points can be found using the definition in [33].

Example 2.20. Consider the following free endpoint problem:
$$\text{minimize } J(u) = \int_0^T \big(u_1(t)^2 + u_2(t)^2 - x(t)^2\big)\,dt \quad \text{subject to } x' = u_1 + u_2, \ x(0) = 0, \ u \in U,$$
where the control set $U$ is defined by $g(u) = (u_1, -u_2) \le 0$. Here the pair $(x, u) = (0, 0)$ is an extremal for any $T > 0$, but when $T > \pi/2$ it fails to be optimal. Our Theorem 2.15 detects this, whereas the results of [32, 33] do not.

Here we have $H(x, p, u, \gamma) = pu_1 + pu_2 - u_1^2 - u_2^2 + x^2 - \gamma_1 u_1 + \gamma_2 u_2$. Let $(x, u)$ be an extremal; that is, suppose there exist $p$ and $\gamma$ such that
$$-p' = 2x, \quad p(T) = 0, \quad H_u = \big(p - 2u_1 - \gamma_1,\ p - 2u_2 + \gamma_2\big) = 0, \quad \gamma \ge 0, \quad \gamma^T g(u) = 0.$$
Observe that $(x, p, u, \gamma) = (0, 0, 0, 0)$ is a solution, so $(x, u) = (0, 0)$ is an extremal. Since $D(t) = \emptyset$, we have $V(t) = \{v \in \mathbb{R}^2 \mid g'(0)^T v \le 0\} = \mathbb{R}_- \times \mathbb{R}_+$; also $f_x = 0$, $f_u = (1, 1)$, $H_{xx} = 2$, $H_{xu} = 0$, $H_{uu} = -2I_2$. A point $c \in (0, T)$ is a generalized conjugate point to $T$ if there exists a triple $(y, q, v)$ such that
$$y' = v_1 + v_2, \quad y(c) = 0, \qquad -q' = 2y, \quad q(T) = 0, \qquad (q - 2v_1)v_1 + (q - 2v_2)v_2 \ge 0, \quad v \in V(t). \tag{2.22}$$
Let $v_1(t) = 0$, $v_2(t) = q(t)/2$ in (2.22). One solution is $y(t) = \cos(T - t)$, $q(t) = 2\sin(T - t)$, so when $T > \pi/2$ the choice $c = T - \pi/2$ gives $y(c) = 0$ and $q(c) = 2$. In Definition 2.10, condition (a) fails, but condition (b) holds for the choice $v_1(t) = (0, 2/c)$, $y_1(t) = 2t/c$, exactly as in Example 2.17. So there is a generalized conjugate point in $(0, T)$ if $T > \pi/2$, and $(x, u) = (0, 0)$ is not optimal by Theorem 2.15.
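Non-optimality in Example 2.20 can also be confirmed directly: placing the comparison control of Example 2.17 in the $u_2$ slot (so $u_1 \equiv 0 \le 0$ and $u_2 \ge 0$ are both admissible) gives $J(u) = -a^2(T - \pi/2) < 0$ when $T > \pi/2$. A numerical sketch, illustrative only (the values of $a$ and $T$ are arbitrary):

```python
import numpy as np

T, a = 2.0, 1.0                          # any T > pi/2
t = np.linspace(0.0, T, 200_001)
dt = np.diff(t)
u2 = np.where(t < np.pi / 2, a * np.cos(t), 0.0)   # admissible: u1 = 0, u2 >= 0
x = np.concatenate(([0.0], np.cumsum((u2[1:] + u2[:-1]) / 2 * dt)))  # x' = u1 + u2

f = u2**2 - x**2                          # integrand u1^2 + u2^2 - x^2 with u1 = 0
J = float(np.sum((f[1:] + f[:-1]) / 2 * dt))
print(J, -a**2 * (T - np.pi / 2))         # computed cost vs -a^2 (T - pi/2)
```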
Here the strengthened Legendre condition holds. Nevertheless, in [32, Definition 6.1] the subspace $T(t) = \{0\} \times \{0\}$ must be chosen to determine conjugate points in $(0, T)$, and none are found. Similarly, there is no conjugate point of the sort described in [33, Definition 3.1].

A key issue in second-order theory is the set of comparison controls participating in the linearized system (2.11). Our choice, based on the set $V(t)$ (see (2.6)), allows more controls than the results of Warga [30]. Restricted to Warga's control variation set, the two results agree; but Warga's result is false for our control variation set $V$. We now give an example in which an admissible control $v$ in our sense yields a negative value for Warga's "second variation" even though the underlying extremal is optimal.

Example 2.21. Consider the following two-dimensional control problem:
$$\text{minimize } \ell(x(1)) \quad \text{subject to } x' = u, \ x(0) = 0, \ u \in U,$$
where $\ell(x) = -x_1 - \tfrac{1}{2}x_2^2$ and $U = \{u \in \mathbb{R}^2 \mid (u_1 + 1)^2 + u_2^2 \le 1\}$. Denote by $R(1)$ the reachable set at $t = 1$. For any $x(1) \in R(1)$, a simple argument involving Jensen's inequality implies $x(1) \in U$. So $x_1(1)^2 + 2x_1(1) + x_2(1)^2 \le 0$; that is,
$$0 \le \tfrac{1}{2}x_1(1)^2 \le -x_1(1) - \tfrac{1}{2}x_2(1)^2.$$
It follows that the minimum value of $-x_1(1) - \tfrac{1}{2}x_2(1)^2$ over $R(1)$ equals $0$, attained at the origin. Therefore the control $u \equiv 0$ solves the problem. By the maximum principle (Theorem 1.1) we get the adjoint arc $p = (1, 0)$. Warga [30, Theorem 1.1, Form 1.2] asserts that $y(1)^T \ell''(x(1))y(1) \ge 0$ over all $y' = v$, $y(0) = 0$, $v(t) \in \Omega(t)$, where $\Omega(t) = \{v \in U \mid p(t)^T f_u(t)v = 0\}$. That is,
$$-y_2(1)^2 \ge 0 \ \text{ over all } \ y' = v, \ y(0) = 0, \ v(t) \in \Omega(t). \tag{2.23}$$
Since $\Omega(t) = \{v \in U \mid v_1 = 0\} = \{0\}$, Warga's second-order inequality (2.23) holds trivially. It still holds when $U$ is replaced by $\operatorname{cone} U$ in the definition of $\Omega(t)$: $\operatorname{cone} U$ is the open left half plane together with the origin, so again $\Omega(t) = \{0\}$.
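The two claims just made about Example 2.21 — that $x(1) \in U$ by Jensen's inequality, and that $\ell(x(1)) \ge 0$ with the value $0$ attained at the origin — can be spot-checked by sampling random admissible controls. The sketch below is purely illustrative (not part of the thesis); the piecewise-constant controls and sample counts are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def ell(x):                       # terminal cost  l(x) = -x1 - x2^2 / 2
    return -x[0] - 0.5 * x[1] ** 2

in_U, vals = True, []
for _ in range(2000):
    # random piecewise-constant control with values in U, the unit disk about (-1, 0)
    n = 200
    r = rng.uniform(0.0, 1.0, n)
    ang = rng.uniform(0.0, 2 * np.pi, n)
    u = np.stack([-1.0 + r * np.cos(ang), r * np.sin(ang)], axis=1)
    x1 = u.mean(axis=0)           # x(1) = integral of u over [0, 1]
    in_U &= (x1[0] + 1.0) ** 2 + x1[1] ** 2 <= 1.0 + 1e-12   # Jensen: x(1) in U
    vals.append(ell(x1))

print(bool(in_U))                 # every sampled endpoint lies in U
print(min(vals) >= 0.0)           # l(x(1)) >= 0 on all sampled endpoints
print(ell(np.zeros(2)))           # value 0 attained at the origin (u = 0)
```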
But our choice, namely $T_U(0)$, is bigger: using it in place of $U$ makes $\Omega(t) = \{0\} \times \mathbb{R}$, which is the same as our $V(t)$, and then Warga's second-order inequality (2.23) fails to hold. As we have mentioned, the reason is that Warga's second variation functional does not satisfactorily account for the second-order shape of the boundary of the control set $U$. In contrast, our theory involves the second variation functional
$$J_2(v) = -y_2(1)^2 + \int_0^1 v_2(t)^2\,dt.$$
The extra term comes from the boundary of the control set, and we get $J_2(v) \ge 0$ for all $v(t) \in V(t)$: indeed, for $v \in V(t)$ we have $y_2(1) = \int_0^1 v_2(t)\,dt$, so $y_2(1)^2 \le \int_0^1 v_2(t)^2\,dt$ by the Cauchy-Schwarz inequality.

Chapter 3

Variational Inclusions

Consider the following differential inclusion control problem:
$$\text{minimize } g(x(T)) \ \text{ over all arcs } x \text{ satisfying (3.2) and } x(T) \in C. \tag{3.1}$$
Here the dynamic constraint is the differential inclusion
$$x'(t) \in F(t, x(t)) \ \text{a.e.}\ t \in [0, T], \qquad x(0) \in C_0. \tag{3.2}$$
For the purposes of this chapter, an arc is an absolutely continuous function. If we denote by $R(T)$ the reachable set of (3.2) at time $T$, the problem (3.1) is equivalent to minimizing the extended-valued function $f$ over all $x$ in $\mathbb{R}^n$, where $f$ is defined by
$$f(x) = g(x) + \Psi_{R(T) \cap C}(x),$$
$\Psi_S$ denoting the indicator function of the set $S$. If an arc $x$ solves problem (3.1), then its endpoint $x(T)$ minimizes $f$. Hence the Fermat rule tells us that certain directional derivatives of $f$ at $x(T)$ must be nonnegative. To express these derivatives in terms of those of $g$, $\Psi_{R(T)}$, and $\Psi_C$, we must know the normal and tangent cones to the reachable set $R(T)$ at the optimal endpoint. Unfortunately, we cannot characterize $R(T)$ explicitly, so our best hope is to find good approximations to these cones.

In this chapter we first study the Clarke normal cone and the adjacent sets to the reachable set via proximal analysis and set-valued analysis. We then obtain second order necessary conditions for optimality in (3.1) with bounded differential inclusions, and extend the results to unbounded differential inclusions using the reduction method in Loewen and Rockafellar [16].
Finally, we apply the results to the optimal control problem and recover the accessory problem in Zeidan and Zezza [32].

3.1. Background Material and Hypotheses

Let $S$ be a closed subset of $\mathbb{R}^n$ and $s \in S$. A vector $\xi$ is called a proximal normal to $S$ at $s$, written $\xi \in PN_S(s)$, if there is some $M > 0$ such that $\langle \xi, s' - s\rangle \le M|s' - s|^2$ for all $s' \in S$. The limiting normal cone to $S$ at $s$ is defined by
$$LN_S(s) = \Big\{\xi \in \mathbb{R}^n \ \Big|\ \xi = \lim_{k \to \infty} \xi_k \ \text{for some sequence}\ \xi_k \in PN_S(s_k),\ s_k \to_S s\Big\}.$$
Here $s_k \to_S s$ means that $s_k \to s$ and $s_k \in S$ for all $k$. The important properties of the limiting normal cone are (see [16]):

1. If $s$ is a boundary point of $S$, then $LN_S(s)$ contains a nonzero element;
2. The multifunction $s \mapsto LN_S(s)$ has closed graph; and
3. The Clarke normal cone $N_S(s)$ is given by $N_S(s) = \overline{\operatorname{co}}\,LN_S(s)$.

The main results in this chapter are based on the following necessary conditions for general optimization problems in finite dimensional spaces.

Theorem 3.1. Let $g: \mathbb{R}^n \to \mathbb{R}$ be Lipschitzian on some open set containing $x$, and let $S_1$, $S_2$ be nonempty closed subsets of $\mathbb{R}^n$ containing $x$. Suppose $x$ solves the minimization problem
$$\text{minimize } g(y) \ \text{ over all } y \in S_1 \cap S_2 \tag{3.3}$$
and satisfies the constraint qualification
$$N_{S_1}(x) \cap (-N_{S_2}(x)) = \{0\}. \tag{3.4}$$
Then one has the first order necessary condition
$$g_a'(x, y) \ge 0 \quad \forall y \in D_{S_1}(x) \cap D_{S_2}(x). \tag{3.5}$$
Furthermore, if equality holds in (3.5) for some $y$, then one has the second order necessary condition
$$g_a''(x, y, z) \ge 0 \quad \forall z \in D^2_{S_1}(x, y) \cap D^2_{S_2}(x, y). \tag{3.6}$$

Proof. Denote $f = g + \Psi_{S_1 \cap S_2}$. Then $x$ is a minimum point of $f$. The Fermat rule [3, Theorem 6.1.9] tells us $f_a'(x, y) \ge 0$ for all $y$ in $\mathbb{R}^n$. Since $g$ is Lipschitzian around $x$, by [3, Theorem 6.3.1, Proposition 6.2.4] we have
$$f_a'(x, y) \le g_a'(x, y) + \Psi_{D_{S_1 \cap S_2}(x)}(y).$$
Finally, the constraint qualification (3.4) implies $D_{S_1 \cap S_2}(x) = D_{S_1}(x) \cap D_{S_2}(x)$ by [3, Corollary 4.3.5].
If $g_a'(x, y) = 0$ for some $y \in D_{S_1}(x) \cap D_{S_2}(x)$, that is, $f_a'(x, y) = 0$, then
$$0 \le f_a''(x, y, z) \le g_a''(x, y, z) \quad \text{for all } z \in D^2_{S_1 \cap S_2}(x, y)$$
by [3, Proposition 6.6.3, Theorem 6.6.2]. Finally, (3.4) implies $D^2_{S_1 \cap S_2}(x, y) = D^2_{S_1}(x, y) \cap D^2_{S_2}(x, y)$ by [3, Theorem 4.7.4]. Q.E.D.

Remark 3.2. Ward [29] has shown that the constraint qualification (3.4) can be weakened to
$$LN_{S_1}(x) \cap (-LN_{S_2}(x)) = \{0\}.$$
Correspondingly, we do not need to take the closed convex hull of $Q(x)$ in the constraint qualification (3.13).

A tube about an arc $x$ is a relatively open subset $\Omega$ of $[0, T] \times \mathbb{R}^n$ containing the graph of $x$. An arc $x$ is admissible for (3.1) if it is a trajectory of $F$ and satisfies both endpoint constraints. An admissible arc $x$ is a local solution to (3.1) if there exists an $\varepsilon > 0$ such that for any admissible arc $y$ satisfying $\|y - x\|_\infty < \varepsilon$ we have $g(y(T)) \ge g(x(T))$. Denote by $\Omega_\varepsilon$ the $\varepsilon$-tube about $x$ and by $\Omega_\varepsilon(t)$ its $t$-sections; that is,
$$\Omega_\varepsilon = \{(t, y) \in [0, T] \times \mathbb{R}^n \mid |y - x(t)| < \varepsilon\}, \qquad \Omega_\varepsilon(t) = \{y \in \mathbb{R}^n \mid |y - x(t)| < \varepsilon\}.$$
By choosing a sufficiently small $\varepsilon$ we may suppose $\Omega_\varepsilon \subseteq \Omega$. The following hypotheses about (3.1) will be assumed throughout the chapter.

(H1) The multifunction $F$ has nonempty, compact, convex images on the tube $\Omega$;

(H2) the multifunction $F$ is measurably Lipschitz on $\Omega$; that is, $F$ is measurable with respect to the $\sigma$-field $\mathcal{L} \times \mathcal{B}$ generated by products of Lebesgue subsets of $[0, T]$ with Borel subsets of $\mathbb{R}^n$, and there exists an integrable function $k$ such that
$$F(t, y) \subseteq F(t, x) + k(t)|y - x|\,\mathrm{cl}\,B \quad \forall (t, x), (t, y) \in \Omega,$$
where $B$ is the open unit ball in $\mathbb{R}^n$;

(H3) the multifunction $F$ is integrably bounded on $\Omega$; that is, there is an integrable function $k$ such that $|v| \le k(t)$ for all $v \in F(t, x)$, $(t, x) \in \Omega$;

(H4) the subsets $C_0$, $C$ of $\mathbb{R}^n$ are nonempty and closed;

(H5) the function $g: \mathbb{R}^n \to \mathbb{R}$ is Lipschitzian near $x(T)$.
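Hypotheses (H2) and (H3) are easy to check in concrete cases. As an illustration (this multifunction is not from the thesis), take $F(t, x) = \sin x + \mathrm{cl}\,B$ in $\mathbb{R}$, the closed ball of radius $1$ about $\sin x$: since $\sin$ is $1$-Lipschitz, (H2) holds with $k \equiv 1$, and (H3) holds with $k \equiv 2$. A numerical spot-check over random pairs of base points:

```python
import numpy as np

# Illustrative multifunction: F(t, x) = sin(x) + cl B (ball of radius 1 in R).
rng = np.random.default_rng(1)

def containment_radius(x, y):
    # smallest m with F(t, y) contained in F(t, x) + m cl B; for two balls of
    # equal radius this is the distance between their centres
    return np.abs(np.sin(y) - np.sin(x))

xs = rng.uniform(-3, 3, 10_000)
ys = rng.uniform(-3, 3, 10_000)
print(bool(np.all(containment_radius(xs, ys) <= np.abs(ys - xs) + 1e-12)))  # (H2), k = 1
print(bool(np.all(np.abs(np.sin(xs)) + 1.0 <= 2.0)))                        # (H3), k = 2
```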
The Hamiltonian associated with the multifunction $F$ is the function $\mathcal{H}: \Omega \times \mathbb{R}^n \to \mathbb{R}$ defined by
$$\mathcal{H}(t, x, p) = \max\{\langle v, p\rangle \mid v \in F(t, x)\}.$$
The function $\mathcal{H}$ has the following properties [6, Proposition 3.2.4].

Theorem 3.3. The function $\mathcal{H}$ is finite on $\Omega \times \mathbb{R}^n$. If $(t, x)$ is an interior point of $\Omega$, and if $p$ is any point in $\mathbb{R}^n$, then
(a) $s \mapsto \mathcal{H}(s, x, p)$ is measurable near $t$;
(b) $y \mapsto \mathcal{H}(t, y, p)$ is Lipschitz near $x$ of rank $|p|k(t)$;
(c) $q \mapsto \mathcal{H}(t, x, q)$ is Lipschitz near $p$ of rank $k(t)$.

Lemma 3.4. Suppose $x$ is an admissible arc and $(t, x(t))$ is in the interior of $\Omega$ for all $t$. If an arc $p$ satisfies $(-p'(t), x'(t)) \in \partial\mathcal{H}(t, x(t), p(t))$ for a.e. $t$, then
$$|p(t)| \le |p(T)|(1 + Ke^K) \quad \forall t \in [0, T],$$
where $\partial\mathcal{H}(t, x, p)$ is the Clarke generalized gradient of $\mathcal{H}(t, \cdot, \cdot)$ with respect to $(x, p)$, and $K = \int_0^T k(s)\,ds$.

Proof. By [6, Theorem 2.5.1] we have
$$\partial\mathcal{H}(t, x, p) = \operatorname{co}\big\{\lim \nabla_{(x, p)}\mathcal{H}(t, x_k, p_k) \ \big|\ (x_k, p_k) \to (x, p),\ (x_k, p_k) \notin S \cup \Omega_{\mathcal{H}}\big\},$$
where $S$ is a set of Lebesgue measure $0$ in $\mathbb{R}^n \times \mathbb{R}^n$ and $\Omega_{\mathcal{H}}$ is the set of points at which $\mathcal{H}$ fails to be differentiable. By Theorem 3.3(b) we have $|p'(t)| \le k(t)|p(t)|$. Since
$$p(t) = p(T) - \int_t^T p'(s)\,ds,$$
we have
$$|p(t)| \le |p(T)| + \int_t^T k(s)|p(s)|\,ds.$$
Denote $r(t) = -\int_t^T k(s)|p(s)|\,ds$. Obviously,
$$r'(t) = k(t)|p(t)| \le k(t)\big(|p(T)| - r(t)\big).$$
This implies
$$\frac{d}{dt}\Big(r(t)e^{\int_0^t k(s)\,ds}\Big) \le k(t)|p(T)|e^{\int_0^t k(s)\,ds}. \tag{3.7}$$
Integrate from $t$ to $T$ on both sides and note that $r(T) = 0$. This gives
$$-r(t) \le \int_t^T k(s)|p(T)|e^{\int_t^s k(s')\,ds'}\,ds \le |p(T)|Ke^K.$$
Substituting this into $|p(t)| \le |p(T)| - r(t)$, the desired result follows. Q.E.D.

From now on, let $x$ be a given local solution to (3.1) relative to a tube $\Omega_\varepsilon$. Without loss of generality, we may suppose that the domain of $F$ is $\Omega$ and that $\Omega = \Omega_{2\varepsilon}$. Otherwise we could define a new multifunction $\tilde F$ by restricting $F$ to $\Omega_{2\varepsilon}$. The arc $x$ would remain a local solution to the associated new problem (3.1), for which we are about to provide necessary conditions.
Since these necessary conditions involve only the local behaviour of $F$ along $x$, their conclusions for $\tilde F$ are identical to the desired results for $F$. Similarly, we may suppose that the sets $C_0$ and $C$ are compact.

3.2. Approximations to the Reachable Set

We first quote a result from Clarke [6, Theorem 3.1.7]. Suppose $\Gamma$ is an $\mathcal{L} \times \mathcal{B}$-measurable multifunction defined on a tube $\Omega$; suppose also that $\Gamma$ is nonempty, compact, and convex-valued on $\Omega$, upper semicontinuous in $x$, and integrably bounded. Suppose further that there is a multifunction $X$ from $[0, T]$ to $\mathbb{R}^n$ and a positive-valued function $\varepsilon(t)$ with the property that $X(t) + \varepsilon(t)B \subseteq \Omega(t)$ for all $t$.

Theorem 3.5. Let $\{x_k\}$ be a sequence of arcs that satisfy $x_k'(t) \in \Gamma(t, x_k(t))$ a.e. on $[0, T]$ and $x_k(t) \in X(t)$ on $[0, T]$, and suppose the sequence $\{x_k(0)\}$ is bounded. Then there is a subsequence of $\{x_k\}$ that converges uniformly to an arc $x$ that is a trajectory for $\Gamma$.

Let us denote the reachable set of (3.2) near $x$ by
$$R(T) = \{y(T) \in \mathbb{R}^n \mid y'(t) \in F(t, y(t)),\ y(t) \in \mathrm{cl}\,\Omega_\varepsilon(t),\ y(0) \in C_0\}.$$

Theorem 3.6. The set $R(T)$ is closed.

Proof. Suppose $y_k(T) \in R(T)$ and $y_k(T) \to a$ as $k \to \infty$. Then we have a sequence $\{y_k\}$ such that $y_k'(t) \in F(t, y_k(t))$, $y_k(t) \in \mathrm{cl}\,\Omega_\varepsilon(t)$, and $y_k(0) \in C_0$. Hypothesis (H3) and the compactness of $C_0$ show that $|y_k'(t)| \le k(t)$ and $|y_k(0)| \le c$ for some constant $c > 0$. By Theorem 3.5 there is a subsequence of $\{y_k\}$ (we use the same labels) such that $\|y_k - y\|_\infty \to 0$ as $k \to \infty$, where $y$ is a trajectory of $F$. The closedness of $\mathrm{cl}\,\Omega_\varepsilon(t)$ and $C_0$ implies $y(T) \in R(T)$ and $y(T) = a$. So $R(T)$ is a closed set. Q.E.D.

To discuss the Clarke normal cone to the reachable set, we need the following concept.

Definition 3.7. A trajectory $y$ for (3.2) is said to be proper if for any sequence $y_k$ of trajectories for (3.2) having the property that $y_k(T) \to y(T)$, there is a subsequence along which $\|y_k - y\|_\infty \to 0$ as $k \to \infty$.

Now suppose $x$ is a local solution of (3.1).
Then $x(T)$ is a minimum point of the extended-valued function $f(x) = g(x) + \Psi_{R(T) \cap C}(x)$. To apply Theorem 3.1, we must first verify the constraint qualification (3.4) with $S_1 = R(T)$ and $S_2 = C$. This requires that we describe the Clarke normal cone to the reachable set $R(T)$ at $x(T)$. For this, we denote by $Q(x)$ the set of endpoints of arcs $p$ that satisfy the Hamiltonian inclusion and initial condition associated with a given admissible arc $x$; that is,
$$Q(x) = \{p(T) \mid (-p'(t), x'(t)) \in \partial\mathcal{H}(t, x(t), p(t)),\ p(0) \in LN_{C_0}(x(0))\}. \tag{3.8}$$

Theorem 3.8. Let $x$ be a proper local solution for (3.1). Then
$$N_{R(T)}(x(T)) \subseteq \overline{\operatorname{co}}\,Q(x). \tag{3.9}$$

Proof. Let $y$ be a trajectory of (3.2) with $y(t)$ in $\Omega_\varepsilon(t)$ for all $t$, and let $\xi \in PN_{R(T)}(y(T))$. Then there exists a constant $M > 0$ such that $y(T)$ minimizes the smooth function $\langle -\xi, z(T)\rangle + M|z(T) - y(T)|^2$ over all $z(T) \in R(T)$. According to Loewen and Rockafellar [16, Theorem 1.2], this implies that there exists an arc $p$ satisfying the Hamiltonian inclusion and initial condition defining $Q(y)$ with $-p(T) = -\xi$, so $\xi = p(T) \in Q(y)$. Since $\xi$ is arbitrary, this shows that $PN_{R(T)}(y(T)) \subseteq Q(y)$.

For any $\xi$ in $LN_{R(T)}(x(T))$, there exist sequences $x_k(T) \in R(T)$ and $\xi_k \in PN_{R(T)}(x_k(T))$ such that $\xi_k \to \xi$ and $x_k(T) \to x(T)$. The properness of $x$ implies that there is a subsequence of $x_k$ (we use the same labels) along which $\|x_k - x\|_\infty \to 0$ as $k \to \infty$. For $k$ sufficiently large we have $x_k(t) \in \Omega_\varepsilon(t)$ for all $t$. Since $PN_{R(T)}(x_k(T)) \subseteq Q(x_k)$, there must be a sequence of adjoint arcs $p_k$ such that
$$(-p_k', x_k') \in \partial\mathcal{H}(t, x_k, p_k), \qquad p_k(0) \in LN_{C_0}(x_k(0)), \qquad p_k(T) = \xi_k.$$
Since $\xi_k \to \xi$, we may suppose $|p_k(T)| \le |\xi| + 1$. Theorem 3.3 and Lemma 3.4 imply that the multifunction $\partial\mathcal{H}$ is integrably bounded on some tube containing all the arcs $(x_k, p_k)$. By Theorem 3.5, applied to $\Gamma = \partial\mathcal{H}$ with $X(t) = \Omega_\varepsilon(t) \times \mathbb{R}^n$, there exists a further subsequence $p_k$ uniformly converging to an adjoint arc $p$ associated with $x$.
Since the graph of $LN_{C_0}$ is closed, we also have $p(0) \in LN_{C_0}(x(0))$; that is, $p(T) \in Q(x)$. So $LN_{R(T)}(x(T)) \subseteq Q(x)$, and taking closed convex hulls, together with $N_{R(T)}(x(T)) = \overline{\operatorname{co}}\,LN_{R(T)}(x(T))$, gives (3.9). Q.E.D.

In order to apply Theorem 3.1 to the problem (3.1), we also need to know the adjacent cone and the second adjacent set to the reachable set $R(T)$ at $x(T)$. The first order approximation has been discussed thoroughly in Frankowska [12], where the following result is given.

Theorem 3.9. Let $R_1(T)$ be the reachable set of the following system:
$$y'(t) \in dF(t, x(t), x'(t))y(t) \ \text{a.e.}\ t \in [0, T], \qquad y(0) \in D_{C_0}(x(0)). \tag{3.10}$$
Then one has $R_1(T) \subseteq D_{R(T)}(x(T))$.

Now we turn to the second adjacent set to $R(T)$ at $x(T)$ relative to $y(T) \in R_1(T)$.

Theorem 3.10. Suppose that the arc $y$ satisfies (3.10), and that there exist an integrable function $k$ and a constant $\alpha_0 > 0$ such that
$$d\big(x'(t) + \alpha y'(t),\ F(t, x(t) + \alpha y(t))\big) \le \alpha^2 k(t) \tag{3.11}$$
for all $0 < \alpha \le \alpha_0$ and $0 \le t \le T$. Then one has $R_2(T) \subseteq D^2_{R(T)}(x(T), y(T))$, where $R_2(T)$ is the reachable set of the following system:
$$z'(t) \in d^2F(t, x(t), x'(t); y(t), y'(t))z(t) \ \text{a.e.}\ t \in [0, T], \qquad z(0) \in D^2_{C_0}(x(0); y(0)).$$

Proof. Take any $z(T) \in R_2(T)$, and any sequence $t_k > 0$ converging to $0$. Since $z(0) \in D^2_{C_0}(x(0), y(0))$, by the definition of the second adjacent set there must be a sequence $s_k \to z(0)$ such that $x(0) + t_k y(0) + t_k^2 s_k \in C_0$. Let
$$z_k(t) := x(t) + t_k y(t) + t_k^2 z(t) + t_k^2(s_k - z(0)).$$
Then $z_k$ is an arc on $[0, T]$, and $z_k(0) \in C_0$. For $k$ sufficiently large, we have the estimate
$$d(z_k'(t), F(t, z_k(t))) \le t_k^2|z'(t)| + d\big(x'(t) + t_k y'(t), F(t, x(t) + t_k y(t))\big) + t_k^2 k(t)\big(|z(t)| + |s_k - z(0)|\big)$$
$$\le t_k^2\big(|z'(t)| + k(t) + k(t)(|z(t)| + 1)\big) \quad \text{a.e.}\ t \in [0, T]. \tag{3.12}$$
(The last inequality relies on assumption (3.11).) Fix $t \in [0, T]$ outside the measure zero set implicitly specified below.
Since $z'(t) \in d^2F(t, x(t), x'(t); y(t), y'(t))z(t)$, for the same sequence $t_k \to 0^+$ there exists a sequence $\{(\zeta_k, \xi_k)\}$ which converges to $(z(t), z'(t))$ such that
$$x'(t) + t_k y'(t) + t_k^2 \xi_k \in F(t, x(t) + t_k y(t) + t_k^2 \zeta_k).$$
Therefore we have the pointwise convergence statement below:
$$\frac{d(z_k'(t), F(t, z_k(t)))}{t_k^2} \le |\xi_k - z'(t)| + k(t)\big(|\zeta_k - z(t)| + |s_k - z(0)|\big) \to 0 \quad \text{as } k \to \infty.$$
Upon writing $\rho_k := \int_0^T d(z_k'(t), F(t, z_k(t)))\,dt$, we have by (3.12) and the dominated convergence theorem that
$$\lim_{k \to \infty} \frac{\rho_k}{t_k^2} = 0.$$
By [6, Theorem 3.1.6], there exist arcs $x_k$ on $[0, T]$ and a constant $c > 0$ such that each $x_k$ is a trajectory of $F$ with $x_k(0) = z_k(0) \in C_0$ and $\|x_k - z_k\|_\infty \le c\rho_k$. Each endpoint $x_k(T)$ lies in $R(T)$, and we have
$$\frac{x_k(T) - x(T) - t_k y(T)}{t_k^2} = \frac{x_k(T) - z_k(T)}{t_k^2} + z(T) + s_k - z(0) \to z(T) \quad \text{as } k \to \infty.$$
So we must have $z(T) \in D^2_{R(T)}(x(T), y(T))$, as required. Q.E.D.

3.3. Second Order Necessary Conditions

Now we are ready to state the necessary conditions for the problem (3.1).

Theorem 3.11. Let $x$ be a local solution to (3.1) satisfying the constraint qualification
$$\overline{\operatorname{co}}\,Q(x) \cap (-N_C(x(T))) = \{0\}. \tag{3.13}$$
Then one has the first order necessary condition
$$g_a'(x(T), y(T)) \ge 0 \quad \forall y(T) \in R_1(T) \cap D_C(x(T)). \tag{3.14}$$
Furthermore, if equality holds in (3.14) for some $y(T)$ and the assumption (3.11) holds along $y$, then one has the second order necessary condition
$$g_a''(x(T), y(T), z(T)) \ge 0 \quad \forall z(T) \in R_2(T) \cap D^2_C(x(T), y(T)). \tag{3.15}$$

Proof. If $x$ is proper, the result follows directly from Theorem 3.1 and the results above: Theorem 3.8 assures that assumption (3.13) implies the constraint qualification of Theorem 3.1, while the estimates of the relevant adjacent sets in terms of $R_1(T)$ and $R_2(T)$ are described in Theorem 3.9 and Theorem 3.10.
If $x$ is not proper, we may study an associated new problem of the form (3.1) with $\tilde{x} = (x_0, x) \in \mathbf{R}^{n+1}$, $\tilde{g}(\tilde{x}) = x_0 + g(x)$, $\tilde{C}_0 = \{0\} \times C_0$, $\tilde{C} = [-1, 1] \times C$, and $\tilde{F}(t, \tilde{x}) = \{|x - x(t)|^4\} \times F(t, x)$. Since $F$ satisfies (H1)-(H3), so does $\tilde{F}$. The extended Hamiltonian is given by
$$ \tilde{H}(t, \tilde{x}, \tilde{p}) = p_0 |x - x(t)|^4 + H(t, x, p). $$
It is easy to see that $\tilde{x} = (0, x)$ is a local solution for the new problem. It is also proper: indeed, let $\tilde{x}_k$ be a sequence of trajectories for the modified differential inclusion (3.2) having the property $\tilde{x}_k(T) \to \tilde{x}(T)$, that is,
$$ \int_0^T |x_k(t) - x(t)|^4\,dt \to 0 \quad \text{and} \quad x_k(T) \to x(T). $$
The integrable boundedness of $F$ and the compactness of $C_0$ imply that $|x_k'(t)| \le k(t)$ and $|x_k(0)| \le c$, so the sequence $x_k$ is uniformly bounded and equicontinuous. It has a subsequence converging uniformly to some limit $\bar{x}$, but the integral condition above forces $\bar{x} = x$, so $\tilde{x}$ is proper. Therefore $LN_{\tilde{R}(T)}(\tilde{x}(T)) \subset \overline{\mathrm{co}}\,\tilde{Q}(\tilde{x})$, where $\tilde{Q}$ is expressed by (3.8). Since the Hamiltonian does not depend on $x_0$, the arc $p_0$ is a constant, and we have $\tilde{Q}(\tilde{x}) \subset \mathbf{R} \times Q(x)$. Thus (3.13) gives
$$ \overline{\mathrm{co}}\,\tilde{Q}(\tilde{x}) \cap (-N_{\tilde{C}}(\tilde{x}(T))) \subset (\mathbf{R} \times \overline{\mathrm{co}}\,Q(x)) \cap (\{0\} \times (-N_C(x(T)))) = \{0\}. $$
Theorem 3.1 tells us that
$$ \tilde{g}'(\tilde{x}(T); \tilde{y}(T)) \ge 0 \qquad \forall\, \tilde{y}(T) \in \tilde{R}_1(T) \cap D_{\tilde{C}}(\tilde{x}(T)). \qquad (3.16) $$
Now we show that if $y(T) \in R_1(T)$, then $(0, y(T)) \in \tilde{R}_1(T)$. For any sequence $t_k \to 0+$ and any time $t$ where (3.10) holds, there exists a sequence $\{(\xi_k, \zeta_k)\}$ which converges to $(y(t), y'(t))$ such that $x'(t) + t_k \zeta_k \in F(t, x(t) + t_k \xi_k)$. Define $\tilde{\xi}_k = (0, \xi_k)$ and $\tilde{\zeta}_k = (t_k^3 |\xi_k|^4, \zeta_k)$; then $\tilde{\xi}_k \to (0, y(t))$, $\tilde{\zeta}_k \to (0, y'(t))$, and $\tilde{x}'(t) + t_k \tilde{\zeta}_k \in \tilde{F}(t, \tilde{x}(t) + t_k \tilde{\xi}_k)$. Thus the extended form of (3.10) also holds at time $t$. Also $D_{\tilde{C}_0}(\tilde{x}(0)) = \{0\} \times D_{C_0}(x(0))$. Thus we have $(0, y(T)) \in \tilde{R}_1(T)$. If $y(T) \in R_1(T) \cap D_C(x(T))$, then $(0, y(T)) \in \tilde{R}_1(T) \cap D_{\tilde{C}}(\tilde{x}(T))$. Substituting $(0, y(T))$ into (3.16), we obtain the first order necessary condition (3.14).
Furthermore, if equality holds for some $y(T)$ in (3.14), then equality holds for $\tilde{y}(T) = (0, y(T))$ in (3.16), and we have
$$ \tilde{g}''(\tilde{x}(T); \tilde{y}(T), \tilde{z}(T)) \ge 0 \qquad \forall\, \tilde{z}(T) \in \tilde{R}_2(T) \cap D^2_{\tilde{C}}(\tilde{x}(T), \tilde{y}(T)), \qquad (3.17) $$
provided the assumption (3.11) holds along $\tilde{y} = (0, y)$ for the multifunction $\tilde{F}$. It is easy to see that
$$ d\big(\tilde{x}'(t) + \alpha \tilde{y}'(t),\, \tilde{F}(t, \tilde{x}(t) + \alpha \tilde{y}(t))\big) = d\big((0, x'(t) + \alpha y'(t)),\ (\{\alpha^4 |y(t)|^4\},\ F(t, x(t) + \alpha y(t)))\big) \le \alpha^4 |y(t)|^4 + \alpha^2 k(t). $$
So (3.11) does hold along $\tilde{y}$. Similarly, we can show that if $z(T) \in R_2(T)$, then $(0, z(T)) \in \tilde{R}_2(T)$. Substituting $(0, z(T))$ into (3.17), we get the second order necessary condition (3.15). Q.E.D.

Remark 3.12. For a free endpoint problem ($C = \mathbf{R}^n$), we do not need to compute the set $Q(x)$, since (3.13) follows from the obvious equation $N_C(x(T)) = \{0\}$.

Remark 3.13. The first-order necessary conditions implicit in Theorem 3.11 are not directly comparable to those in the literature [4, 7] because our results are in primal rather than dual form. In other words, we assert the nonexistence of admissible descent directions for the objective function directly, instead of passing to a dual description involving an adjoint arc $p$. In Section 3.5 below we will make explicit the relationship between the necessary conditions of Theorem 3.11 and Pontryagin's maximum principle for a problem in which $F(t, x) = f(t, x, U)$ arises from a model in optimal control.

3.4. Unbounded Differential Inclusions

To get optimality conditions for unbounded differential inclusions, we first quote a definition from [16]. Let $\Gamma: \Omega \to \mathbf{R}^m$ be a multifunction with closed values, and suppose $\Gamma$ is $\mathcal{L} \times \mathcal{B}$ measurable on $\Omega$. Consider a point $(\bar{t}, \bar{x})$ in $\Omega$.

Definition 3.14. The multifunction $\Gamma$ is called sub-Lipschitzian at $(\bar{t}, \bar{x})$ if for every constant $\rho > 0$ there exist constants $\varepsilon > 0$ and $\alpha > 0$ such that
$$ \Gamma(t, y) \cap \rho\,\mathrm{cl}\,B \subset \Gamma(t, x) + \alpha |y - x|\,\mathrm{cl}\,B $$
for all $t \in (\bar{t} - \varepsilon, \bar{t} + \varepsilon) \cap [0, T]$ and all $x, y$ in $\bar{x} + \varepsilon B$.
The multifunction $\Gamma$ is called integrably sub-Lipschitzian in the large at $(\bar{t}, \bar{x})$ if there exist constants $\varepsilon > 0$ and $\beta > 0$, together with a nonnegative function $\alpha(t)$ integrable on $(\bar{t} - \varepsilon, \bar{t} + \varepsilon)$, such that
$$ \Gamma(t, y) \cap \rho\,\mathrm{cl}\,B \subset \Gamma(t, x) + (\alpha(t) + \beta\rho)|y - x|\,\mathrm{cl}\,B $$
for all $t \in (\bar{t} - \varepsilon, \bar{t} + \varepsilon) \cap [0, T]$, all $x, y$ in $\bar{x} + \varepsilon B$, and all $\rho > 0$.

Now for the differential inclusion control problem (3.1) we retain the hypotheses (H4)-(H5) and replace (H1)-(H3) with the following hypothesis:

(H6) The multifunction $F$ has nonempty, closed, convex values on $\Omega$, and one of the following two conditions holds:

(a) The arc $x$ is Lipschitzian, and the multifunction $F$ is sub-Lipschitzian at every point $(t, x(t))$ in $\mathrm{gph}\,x$.

(b) The multifunction $F$ is integrably sub-Lipschitzian in the large at every point $(t, x(t))$ in $\mathrm{gph}\,x$.

We have the following optimality conditions for the problem (3.1):

Theorem 3.15. Under the hypotheses (H4)-(H6), the conclusions of Theorem 3.11 still hold.

Proof. By [16, Proposition 2.4], there exist an integrable function $R$ on $[0, T]$ and a relatively open subset $\hat{\Omega}$ of $\Omega$ containing the graph of $x$ on which the truncated multifunction $\hat{F}$ defined by
$$ \hat{F}(t, x) = F(t, x) \cap (x'(t) + R(t)\,\mathrm{cl}\,B) $$
satisfies the hypotheses (H1)-(H3). The arc $x$ is also a local solution for the truncated problem (3.1), so the conclusions of Theorem 3.11 can be applied here with the corresponding constraint qualification and the sets $\hat{R}_1(T)$ and $\hat{R}_2(T)$. From the proof of [16, Proposition 2.4] we know there exists a constant $R_0 > 0$ such that $R(t) \ge R_0$ for all $t \in [0, T]$. Fixing $t \in [0, T]$ outside of a measure zero set and taking a small neighbourhood of $(x(t), x'(t))$, we see immediately that
$$ v \in F(t, \xi) \iff v \in \hat{F}(t, \xi) \quad \text{whenever } v \in x'(t) + R_0 B. $$
So the graph of $F$ and that of $\hat{F}$ are the same near the point $(x(t), x'(t))$. Since the derivative of $F$ at $(x(t), x'(t))$ is only related to the local behaviour of the graph of $F$ at that point, we have
$$ d\hat{F}(t, x(t), x'(t)) = dF(t, x(t), x'(t)). $$
So $\hat{R}_1(T) = R_1(T)$. Similarly we have $\hat{R}_2(T) = R_2(T)$. Finally, note that any $(-p', x') \in \partial\hat{H}(t, x, p)$ implies that $\langle p, x' \rangle \ge \langle p, v \rangle$ for all $v \in \hat{F}(t, x)$. The formula (3.4) in [16] tells us that $\partial\hat{H}(t, x, p) \subset \partial H(t, x, p)$. So we have $\hat{Q}(x) \subset Q(x)$. We get the desired result by putting all these together. Q.E.D.

Remark 3.16. If the multifunction $F$ satisfies the assumptions (H1)-(H3), it satisfies the assumption (H6) simply by taking $\alpha \equiv k$ and $\beta = 0$. Thus Theorem 3.15 is a true extension of Theorem 3.11.

3.5. Application to Optimal Control Problems

Theorem 3.15 is a second order necessary condition for a general multifunction $F$. It can be used to study many problems, such as feedback control systems, implicit dynamical systems, and so on (see [12]). Here we discuss its application to the optimal control problem below:
$$ \text{minimize } g(x(T)) \text{ over all arcs } x \text{ satisfying (3.19) and } x(T) \in C. \qquad (3.18) $$
Here the dynamic constraint is the controlled differential equation
$$ x'(t) = f(t, x(t), u(t)), \qquad x(0) \in C_0, \qquad u(t) \in U, \qquad (3.19) $$
where $u$ is required to be measurable and the given control set $U$ is nonempty and closed. Denote by $R(T)$ the reachable set at time $T$ of the control system (3.19), that is,
$$ R(T) = \{x(T) \mid x'(t) = f(t, x(t), u(t)),\ x(0) \in C_0,\ u(t) \in U\}. $$
Let $\Omega$ be a tube about an arc $x$ and assume that $f: \Omega \times U \to \mathbf{R}^n$ satisfies the following conditions:

• The function $f(t, \cdot, \cdot)$ is twice continuously differentiable, and there exists an integrable function $k$ such that at any point $(t, x, u) \in \Omega \times U$, all its derivatives with respect to $(x, u)$ are bounded by $k(t)$.

• The function $f(\cdot, x, u)$ and all its derivatives with respect to $(x, u)$ are measurable on $[0, T]$.

• The function $g$ is twice continuously differentiable near $x(T)$.

Consider the multifunction $F(t, x) = f(t, x, U)$. Clearly, $F$ is $\mathcal{L} \times \mathcal{B}$ measurable and satisfies the assumption (H6)(b); also $F$ has nonempty images on $\Omega$.
Here we explicitly suppose that $F(t, x)$ is closed and convex for all $(t, x) \in \Omega$. (An example is $f(t, x, u) = A(t)x + B(t)u$ with $U$ a closed convex set.)

Theorem 3.17. Let $R_1(T)$ be the reachable set of the following system:
$$ y'(t) = f_x(t)y(t) + f_u(t)v(t) \ \text{a.e. } t, \qquad y(0) \in D_{C_0}(x(0)), \qquad v \in L^\infty[0, T] \text{ and } v(t) \in D_U(u(t)). $$
Then we have $R_1(T) \subseteq D_{R(T)}(x(T))$.

Proof. According to Theorem 3.9, it suffices to show that for any fixed $t \in [0, T]$ outside some measure zero set, any $a \in \mathbf{R}^n$, and any $v \in D_U(u(t))$, we have
$$ f_x(t)a + f_u(t)v \in dF(t, x(t), x'(t))\,a. $$
This is equivalent to the assertion that for every $t_k \to 0+$ there exist $\xi_k \to a$, $\zeta_k \to f_x(t)a + f_u(t)v$, and $u_k \in U$, such that
$$ f(t, x(t), u(t)) + t_k \zeta_k = f(t, x(t) + t_k \xi_k, u_k). $$
Now we prove the latter statement. By assumption $v \in D_U(u(t))$, so there exists a sequence $v_k \to v$ such that $u_k := u(t) + t_k v_k \in U$. Let $\xi_k = a$ and
$$ \zeta_k = \big(f(t, x(t) + t_k a, u_k) - f(t, x(t), u(t))\big)/t_k. $$
Using a first order Taylor expansion, we see immediately that $\zeta_k \to f_x(t)a + f_u(t)v$. Q.E.D.

Theorem 3.18. Let $y(T) \in R_1(T)$ be as described in Theorem 3.17 with control $v$. If there exist an integrable function $k$ and a constant $\alpha_0$ such that
$$ d\big(x'(t) + \alpha y'(t),\, f(t, x(t) + \alpha y(t), U)\big) \le \alpha^2 k(t) \qquad (3.20) $$
for $0 < \alpha \le \alpha_0$ and $0 \le t \le T$, then one has $R_2(T) \subseteq D^2_{R(T)}(x(T), y(T))$. Here $R_2(T)$ is the reachable set of the following control system:
$$ z'(t) = f_x(t)z(t) + f_u(t)w(t) + H(t) \ \text{a.e. } t \in [0, T], \qquad z(0) \in D^2_{C_0}(x(0), y(0)), \qquad w \in L^\infty[0, T] \text{ and } w(t) \in D^2_U(u(t), v(t)), $$
where
$$ H(t) = \tfrac{1}{2} f_{xx}(t)(y(t), y(t)) + f_{xu}(t)(y(t), v(t)) + \tfrac{1}{2} f_{uu}(t)(v(t), v(t)). \qquad (3.21) $$
Proof. Similar to Theorem 3.17. Q.E.D.

We can now state a second order necessary condition for the optimal control problem (3.18). This result follows from Theorems 3.11 through 3.18.

Theorem 3.19. Let $x$ be a local solution for (3.18) with the corresponding control $u$.
If $x$ satisfies the constraint qualification
$$ \overline{\mathrm{co}}\,Q(x) \cap (-N_C(x(T))) = \{0\}, $$
then one has
$$ g'(x(T))^T y(T) \ge 0 \qquad \forall\, y(T) \in R_1(T) \cap D_C(x(T)). $$
Furthermore, if equality holds for some $y(T)$ and the assumption (3.20) holds along $y$, then one has
$$ g'(x(T))^T z(T) + \tfrac{1}{2} g''(x(T))(y(T), y(T)) \ge 0 \qquad \text{for all } z(T) \in R_2(T) \cap D^2_C(x(T), y(T)). $$
Recall that the set $Q(x)$ is defined by
$$ Q(x) = \{p(T) \mid (-p', x') \in \partial H(t, x, p),\ p(0) \in LN_{C_0}(x(0))\}, \qquad (3.22) $$
so the constraint qualification in Theorem 3.19 is expressed in terms of a Hamiltonian inclusion. In control theory an admissible pair $(x, u)$ is said to be normal if the only solution of the following system (3.23) is $p \equiv 0$:
$$ -p'(t) = f_x(t)^T p(t), \qquad p(0) \in N_{C_0}(x(0)), \qquad -p(T) \in N_C(x(T)), \qquad \langle p(t), f(t, x(t), u) \rangle \le \langle p(t), f(t, x(t), u(t)) \rangle \ \forall u \in U. \qquad (3.23) $$
We next show that the normality of $(x, u)$ implies the constraint qualification of Theorem 3.19 for an extended differential inclusion problem, and that Theorem 3.19 accurately recovers the admissible directions and the accessory problem in [32, Theorem 2.2].

Theorem 3.20. Suppose the optimal pair $(x, u)$ is normal and the control set $U$ is convex and compact. If the arc $y$ is a trajectory of the following system for some function $v \in L^\infty[0, T]$:
$$ y'(t) = f_x(t)y(t) + f_u(t)v(t), \qquad y(0) \in D_{C_0}(x(0)), \qquad y(T) \in D_C(x(T)), \qquad g'(x(T))^T y(T) = 0, \qquad v(t) \in U - u(t), \qquad (3.24) $$
then we have
$$ g'(x(T))^T z(T) + \tfrac{1}{2}\, y(T)^T g''(x(T))\, y(T) \ge 0 \qquad (3.25) $$
for any arc $z$ satisfying the equation
$$ z'(t) = f_x(t)z(t) + H(t), \qquad z(0) \in D^2_{C_0}(x(0), y(0)), \qquad z(T) \in D^2_C(x(T), y(T)). \qquad (3.26) $$
Here $H(t)$ is given by (3.21) above.

Proof. Following the trick in [5], we define a new differential inclusion problem
$$ \text{minimize } \tilde{g}(\tilde{x}(T)) \text{ such that } \tilde{x}' \in \tilde{F}(t, \tilde{x}),\ \tilde{x}(0) \in \tilde{C}_0,\ \tilde{x}(T) \in \tilde{C}. \qquad (3.27) $$
Here $\tilde{x} = (x_0, x) \in \mathbf{R}^{n+1}$,
$$ \tilde{F}(t, \tilde{x}) = \overline{\mathrm{co}}\,\{(|u - u(t)|^4,\ f(t, x, u)) \mid u \in U\}, $$
$\tilde{C}_0 = \{0\} \times C_0$, and $\tilde{C} = [-1, 1] \times C$. The extended Hamiltonian is
$$ \tilde{H}(t, \tilde{x}, \tilde{p}) = \max_{u \in U}\{p_0 |u - u(t)|^4 + \langle p, f(t, x, u) \rangle\}. $$
If $\tilde{y} = (y_0, y)$ is admissible for (3.27), then we have $\tilde{y}'(t) \in \tilde{F}(t, \tilde{y}(t)) \subset \mathbf{R} \times f(t, y, U)$, so $y'(t) \in f(t, y(t), U)$. Filippov's lemma tells us that there exists a measurable function $v$ such that $(y, v)$ is admissible for (3.18). Since $(x, u)$ is a local solution for (3.18), we have $\tilde{g}(\tilde{y}(T)) \ge g(y(T)) \ge g(x(T)) = \tilde{g}(\tilde{x}(T))$. So $\tilde{x} = (0, x)$ is a local solution for (3.27).

By [6, Theorem 2.8.6], the generalized gradient $\partial\tilde{H}(t, \tilde{x}, \tilde{p})$ is the convex hull of the vectors
$$ \big(0,\ f_x(t, x, \hat{u})^T p,\ |\hat{u} - u(t)|^4,\ f(t, x, \hat{u})\big) $$
over all $\hat{u}$ such that
$$ \max_{u \in U}\{p_0 |u - u(t)|^4 + \langle p, f(t, x, u)\rangle\} = p_0 |\hat{u} - u(t)|^4 + \langle p, f(t, x, \hat{u})\rangle. $$
Suppose $\tilde{p}(T) \in \tilde{Q}(\tilde{x})$. Since $\tilde{H}$ does not depend on $x_0$, the arc $p_0$ is a constant. We have $(0, -p', 0, x') \in \partial\tilde{H}(t, \tilde{x}, \tilde{p})$, so the only possible choice of $\hat{u}$ is $u(t)$. So $\tilde{Q}(\tilde{x})$ is the set of all $(p_0(T), p(T))$ such that $p_0(t)$ is a constant and $p$ satisfies $-p'(t) = f_x(t)^T p(t)$, $p(0) \in N_{C_0}(x(0))$, and
$$ \max_{u \in U}\{p_0 |u - u(t)|^4 + \langle p(t), f(t, x(t), u)\rangle\} = \langle p(t), f(t, x(t), u(t))\rangle. $$
(Here we take $N_{C_0}(x(0))$ instead of $LN_{C_0}(x(0))$.) It is easy to show that $\tilde{Q}(\tilde{x})$ is a closed convex set. Now any $\tilde{p}(T) \in \tilde{Q}(\tilde{x}) \cap (-N_{\tilde{C}}(\tilde{x}(T)))$ must have $p_0 = 0$, so $p$ satisfies the system (3.23). By the normality of $(x, u)$ we have $p \equiv 0$.

It is easy to show that if $y(T) \in R_1(T)$, then $\tilde{y}(T) = (0, y(T)) \in \tilde{R}_1(T)$. Since $y$ satisfies (3.24), we have $\tilde{g}'(\tilde{x}(T))^T \tilde{y}(T) = 0$. Since $U$ is a convex set, for all $0 < \alpha < 1$ we have $u(t) + \alpha v(t) \in U$. So
$$ d\big(\tilde{x}' + \alpha\tilde{y}',\ \tilde{F}(t, \tilde{x} + \alpha\tilde{y})\big) \le d\big((0, x' + \alpha y'),\ (\alpha^4 |v|^4,\ f(t, x + \alpha y, u + \alpha v))\big) \le \alpha^4 |v|^4 + \alpha^2 k(t). $$
So the assumption (3.20) holds along $\tilde{y}$. Finally, note that if $z(T) \in R_2(T)$, then $(0, z(T)) \in \tilde{R}_2(T)$, and that $0 \in D^2_U(u(t), v(t))$. By applying Theorem 3.11, we get the required conclusion. Q.E.D.

To complete the comparison between Theorem 3.20 and [32, Theorem 2.2], let $C_0$ be a singleton $\{x_0\}$ and $C = \{x \in \mathbf{R}^n \mid \Phi(x) = 0\}$.
Then by [3, Propositions 4.3.7 and 4.7.5] the endpoint constraints become $y(0) = 0$, $\Phi'(x(T))y(T) = 0$ in (3.24) and $z(0) = 0$, $\Phi'(x(T))z(T) + \tfrac{1}{2}\Phi''(x(T))(y(T), y(T)) = 0$ in (3.26). So (3.24) is equivalent to (2.3), (2.4) and (2.5) in [32]. If $p$ is an adjoint arc as in [32, Theorem 2.1] with $\lambda_0 = 1$, that is, if
$$ -p'(t) = f_x(t)^T p(t), \qquad p(T) = g'(x(T)) + \Phi'(x(T))^T \gamma, $$
then $(p(t)^T z(t))' = p(t)^T H(t)$. These observations allow us to rewrite (3.25) as
$$ 0 \le \tfrac{1}{2}\, y(T)^T g''(x(T))\, y(T) + p(T)^T z(T) - \gamma^T \Phi'(x(T)) z(T) = \tfrac{1}{2}\, y(T)^T \big(g'' + \gamma^T \Phi''\big)(x(T))\, y(T) + \int_0^T p(t)^T H(t)\,dt. $$
Thus Theorem 3.20 agrees with [32, Theorem 2.2].

Remark 3.21. By applying Theorem 3.15 to optimal control we get Theorem 3.20, which recovers the accessory problem under the conditions that $U$ is compact and $f(t, x, U)$ is convex. These conditions are not required in Chapter 2. Also, we choose $v(t) \in U - u(t)$ instead of $v(t) \in T_U(u(t))$ to make sure the condition (3.20) holds. So Theorem 3.20 is weaker than Theorem 2.4. This is not surprising, since Theorem 3.15 covers more problems and has a simpler structure, so some of its conditions may be stronger than necessary in concrete applications. Theorem 3.15 is potentially very useful, even though verifying all its conditions and computing the approximating cones are nontrivial work. There are still many open questions on this topic. Can we remove the condition (3.11)? (We do not need it in optimal control.) Can we remove the convexity condition on $F$? (It seems very hard.) Can we define generalized conjugate points in terms of the Hamiltonian? (We do not use the pre-Hamiltonian in differential inclusion problems.) This work is only one step toward completely understanding the second order theory of differential inclusion problems.
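Before leaving this chapter, the first-order approximation underlying Theorem 3.17 is easy to illustrate numerically. The sketch below is our own illustration, not part of the development: the dynamics $f(x, u) = -x^3 + u$, the controls, and the step size are arbitrary choices. It integrates the state equation and the linearized system $y' = f_x y + f_u v$ with Euler's method and checks that perturbing the control by $\varepsilon v$ moves the endpoint by $\varepsilon y(T)$ up to an error of order $\varepsilon^2$.

```python
import math

def euler(step, x0, n, h):
    # explicit Euler integration; returns the list [x_0, ..., x_n]
    xs = [x0]
    for k in range(n):
        xs.append(xs[-1] + h * step(k, xs[-1]))
    return xs

# illustrative dynamics (our choice): x' = f(x,u) = -x^3 + u,
# so f_x = -3x^2 and f_u = 1
f = lambda x, u: -x**3 + u
fx = lambda x: -3 * x**2

T, n = 1.0, 1000
h = T / n
u = [math.sin(k * h) for k in range(n)]   # nominal control
v = [math.cos(k * h) for k in range(n)]   # control perturbation

xs = euler(lambda k, x: f(x, u[k]), 1.0, n, h)            # nominal arc x
ys = euler(lambda k, y: fx(xs[k]) * y + v[k], 0.0, n, h)  # variational arc y

eps = 1e-3
xeps = euler(lambda k, x: f(x, u[k] + eps * v[k]), 1.0, n, h)

# the endpoint defect x_eps(T) - x(T) - eps*y(T) should be O(eps^2)
err = abs(xeps[-1] - xs[-1] - eps * ys[-1])
print(err, eps * abs(ys[-1]))
```

The variational arc computed this way is the exact derivative of the discrete Euler flow with respect to $\varepsilon$, so the endpoint defect shrinks quadratically in $\varepsilon$ while the first-order correction $\varepsilon\, y(T)$ shrinks only linearly.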
Chapter 4

Second Epi-derivatives

In this chapter, we study the epi-differentiability of integral functionals $I: L^2 \to \bar{\mathbf{R}}$ defined by
$$ I(u) = \int_0^T f(t, u(t))\,dt, \qquad (4.1) $$
where the integrand $f(t, \cdot)$ is a fully amenable function of a particular form, i.e., the sum of a finite-valued fully amenable function and an indicator function. (See Section 4.2 for details.) We show that the epi-derivatives of $I$ can be expressed in terms of the epi-derivatives of $f(t, \cdot)$. This reduces the analysis of infinite dimensional spaces to that of finite dimensional ones and lets us obtain optimality conditions for problems involving $I$. We also discuss the epi-differentiability of simple composite functionals of $I$, namely, Bolza functionals. We then apply the results to derive second order necessary conditions for free endpoint control problems. Since epi-differentiability has strong geometric meaning and can capture the local behavior of integral functionals near a given point, this method may be useful in studying optimality conditions for problems with constraints on both endpoints. In fact, we have succeeded in obtaining first order necessary conditions by this method. We will first give some background material about epi-differentiability, then study the epi-differentiability of integral functionals with nonconvex integrands, and finally apply our results to Bolza functionals.

4.1. Background Material

We write $\bar{\mathbf{R}}$ for the extended real line $\mathbf{R} \cup \{+\infty\}$. Suppose that $X$ is a Hilbert space with distance $d$ and $f(h, \cdot)$ is a family of proper ($> -\infty$ and $\not\equiv +\infty$) lower semicontinuous functions from $X$ to $\bar{\mathbf{R}}$ with $h > 0$.

Definition 4.1. The functions $f(h, \cdot)$ epi-converge to a function $f: X \to \bar{\mathbf{R}}$ as $h \to 0+$ if
$$ \limsup_{h \to 0+} \mathrm{epi}\,f(h, \cdot) = \liminf_{h \to 0+} \mathrm{epi}\,f(h, \cdot) = \mathrm{epi}\,f. \qquad (4.2) $$
Here the first equation is the epi-convergence criterion, while the second defines the function $f$. The resulting function $f$, called the epi-limit, is lower semicontinuous.
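To see that the epi-limit of Definition 4.1 can differ from the pointwise limit, consider a toy example of our own (not from the text): take $X = \mathbf{R}$ and let $f(h, \cdot)$ be a "notch" of depth $-1$ on $[h, 2h]$ and $0$ elsewhere. Pointwise, $f(h, x) \to 0$ for every $x$, yet the sequence $x_h = h \to 0$ carries the value $-1$, so the epi-limit equals $-1$ at $x = 0$ and $0$ elsewhere. The sketch below checks the two limits numerically.

```python
def f(h, x):
    # family of functions: a notch of depth -1 on [h, 2h], zero elsewhere
    return -1.0 if h <= x <= 2 * h else 0.0

hs = [10 ** (-k) for k in range(1, 8)]

# pointwise limit at x = 0 is 0 ...
assert all(f(h, 0.0) == 0.0 for h in hs)

# ... but along the recovery sequence x_h = h -> 0 the values are -1,
# so the epi-limit at 0 is -1, strictly below the pointwise limit
assert all(f(h, h) == -1.0 for h in hs)
print("epi-limit at 0 is -1, pointwise limit is 0")
```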
By the definitions of upper and lower limits for sets (see (1.13) and (1.14)), condition (4.2) is equivalent to either

(I) For any point $x \in X$ and any sequence $h_n \to 0+$ we have
(1) any sequence $x_n \to x$ obeys $\liminf_{n \to \infty} f(h_n, x_n) \ge f(x)$;
(2) there exists a sequence $x_n \to x$ such that $\limsup_{n \to \infty} f(h_n, x_n) \le f(x)$; or

(II) For all $x \in X$, we have
$$ \liminf_{h \to 0+,\ x' \to x} f(h, x') = \limsup_{h \to 0+,\ x' \to x} f(h, x') = f(x), $$
where
$$ \limsup_{h \to 0+,\ x' \to x} f(h, x') = \lim_{\varepsilon \to 0+}\ \limsup_{h \to 0+}\ \inf_{d(x', x) \le \varepsilon} f(h, x'), \qquad \liminf_{h \to 0+,\ x' \to x} f(h, x') = \lim_{\varepsilon \to 0+}\ \liminf_{h \to 0+}\ \inf_{d(x', x) \le \varepsilon} f(h, x'). $$

For $h > 0$, the family $\{f^\lambda(h, \cdot)\}_{\lambda > 0}$ of Moreau-Yosida approximates of $f(h, \cdot)$ is defined by
$$ f^\lambda(h, x) = \inf_{u \in X}\{f(h, u) + \lambda d^2(u, x)\} \qquad \forall x \in X. $$
Note that for fixed $(h, x)$, $f^\lambda(h, x)$ is a nondecreasing function of $\lambda$. The following result from Attouch [1, Theorem 2.65] will be useful in our discussion.

Theorem 4.2. Suppose there exist $r > 0$ and $x_0 \in X$ such that for every $h > 0$ and $x \in X$ we have
$$ f(h, x) + r(d^2(x, x_0) + 1) \ge 0. \qquad (4.3) $$
Then the following equalities hold: for every $x \in X$,
$$ \liminf_{h \to 0+,\ x' \to x} f(h, x') = \lim_{\lambda \to \infty}\ \liminf_{h \to 0+} f^\lambda(h, x), \qquad \limsup_{h \to 0+,\ x' \to x} f(h, x') = \lim_{\lambda \to \infty}\ \limsup_{h \to 0+} f^\lambda(h, x). $$

Theorem 4.2 implies that under condition (4.3) the epi-convergence of $f(h, \cdot)$ to $f(\cdot)$ can also be characterized by

(III) For all $x \in X$ we have
$$ \lim_{\lambda \to \infty}\ \liminf_{h \to 0+} f^\lambda(h, x) = \lim_{\lambda \to \infty}\ \limsup_{h \to 0+} f^\lambda(h, x) = f(x). $$

Now suppose $f: X \to \bar{\mathbf{R}}$ is a proper lower semicontinuous function and $x$ is a point in the domain of $f$. The first difference quotient of $f$ at $x$ is defined by
$$ f_x(h, y) = \frac{f(x + hy) - f(x)}{h}. $$
The function $f$ is called epi-differentiable at $x$ with epi-derivative $f'_x$ if $f_x(h, \cdot)$ epi-converges to $f'_x$ as $h$ goes to $0$. A vector $w \in X^*$ is called an epi-gradient of $f$ at $x$ if
$$ f'_x(y) \ge \langle w, y \rangle \qquad \forall y \in X. $$
Suppose $w$ is an epi-gradient.
The second difference quotient of $f$ at $x$ relative to $w$ is given by
$$ f_{x,w}(h, y) = \frac{f(x + hy) - f(x) - h\langle w, y \rangle}{h^2/2}. $$
The function $f$ is called twice epi-differentiable at $x$ relative to $w$ with second epi-derivative $f''_{x,w}$ if $f_{x,w}(h, \cdot)$ epi-converges to $f''_{x,w}$ as $h$ goes to $0$. The following definition is from Rockafellar [24, Definition 1.1].

Definition 4.3. A function $g: \mathbf{R}^d \to \bar{\mathbf{R}}$ with effective domain $D = \{u \mid g(u) < \infty\}$ is called piecewise linear-quadratic if $D$ can be expressed as the union of finitely many sets $D_j$, such that each $D_j$ is a convex polyhedron and the restriction of $g$ to $D_j$ is a quadratic (or affine) function.

Suppose that $\sigma: \mathbf{R}^l \to \mathbf{R}$ is a piecewise linear-quadratic convex function, that $F: \mathbf{R}^m \to \mathbf{R}^l$ and $G: \mathbf{R}^m \to \mathbf{R}^p$ are twice continuously differentiable functions, and that $C \subset \mathbf{R}^p$ is a nonempty convex closed polyhedron. We are interested in the epi-differentiability of the composite function
$$ f(u) = \sigma(F(u)) + \Psi_C(G(u)). \qquad (4.4) $$
The domain of $f$ is
$$ U = \{u \in \mathbf{R}^m \mid G(u) \in C\}. \qquad (4.5) $$
The constraint qualification associated with $f$ at a point $u \in U$ is the following:
$$ \text{if } \delta \in N_C(G(u)) \text{ and } \delta^T G'(u) = 0, \text{ then } \delta = 0. \qquad (4.6) $$
This is well known as the dual form of the Mangasarian-Fromovitz constraint qualification. Under (4.6) the tangent cone to $U$ at $u$ is
$$ T_U(u) = \{v \in \mathbf{R}^m \mid G'(u)v \in T_C(G(u))\}. $$
We have the following result from Rockafellar [24, Theorem 4.5].

Theorem 4.4. Suppose $u \in U$ and the constraint qualification (4.6) holds at $u$. Then $f$ is twice epi-differentiable at $u$. Its first epi-derivative is given by
$$ f'_u(v) = \sigma'_{F(u)}(F'(u)v) + \Psi_{T_U(u)}(v), $$
where the first term is simply
$$ \sigma'_{F(u)}(F'(u)v) = \lim_{h \to 0+} \frac{\sigma(F(u) + hF'(u)v) - \sigma(F(u))}{h}. $$
The function $f'_u$ is the support function of the generalized subgradient set
$$ \partial f(u) = F'(u)^T \partial\sigma(F(u)) + G'(u)^T N_C(G(u)), $$
which is the same as the set of all epi-gradients; this set is a nonempty convex polyhedron.
The second epi-derivative of $f$ at $u$ relative to a point $w \in \partial f(u)$ is given by
$$ f''_{u,w}(v) = \sigma''_{F(u)}(F'(u)v) + \max_{(\gamma,\delta) \in \Gamma(u,w)}\{v^T [\gamma^T F + \delta^T G]''(u)\, v\} + \Psi_{\Xi(u,w)}(v), $$
where
$$ \sigma''_{F(u)}(F'(u)v) = \lim_{h \to 0+} \frac{\sigma(F(u) + hF'(u)v) - \sigma(F(u)) - h\sigma'_{F(u)}(F'(u)v)}{h^2/2} $$
is finite,
$$ \Gamma(u, w) = \{(\gamma, \delta) \in \partial\sigma(F(u)) \times N_C(G(u)) \mid F'(u)^T\gamma + G'(u)^T\delta = w\} \qquad (4.7) $$
is a nonempty, bounded, polyhedral convex set, and
$$ \Xi(u, w) = \{v \in T_U(u) \mid \sigma'_{F(u)}(F'(u)v) = \langle w, v \rangle\} \qquad (4.8) $$
is a polyhedral convex cone.

Most standard optimization problems in finite dimensional spaces can be written in terms of a function $f$ of the form above. For example, a typical problem in nonlinear programming is
$$ \text{minimize } f_0(u) \text{ over all } u \in \mathbf{R}^m \text{ subject to } f_i(u) \le 0 \text{ for } 1 \le i \le p; \quad f_i(u) = 0 \text{ for } p+1 \le i \le p+q. \qquad (4.9) $$
All functions are twice continuously differentiable. Problem (4.9) can be studied by different penalty functions:

• Infinite penalty representation
$$ f(u) = f_0(u) + \Psi_{\mathbf{R}^p_- \times \{0\}^q}(f_1(u), \dots, f_{p+q}(u)). $$
• Exact penalty representation
$$ f(u) = f_0(u) + \sum_{i=1}^p a_i \max\{f_i(u), 0\} + \sum_{i=p+1}^{p+q} a_i |f_i(u)| $$
with $a_i > 0$.
• Smooth penalty representation
$$ f(u) = f_0(u) + \sum_{i=1}^p \tfrac{1}{2} r_i \big(\max\{f_i(u), 0\}\big)^2 + \sum_{i=p+1}^{p+q} \tfrac{1}{2} r_i |f_i(u)|^2 $$
with $r_i > 0$.

Instead of studying these models separately, we may discuss one model that includes all these cases and more (such as the augmented Lagrangian representation):
$$ f(u) = \max_{1 \le i \le l_1}\{f_i(u)\} + \sum_{i=l_1+1}^{l} \rho_i(d_{I_i}(f_i(u))) + \Psi_C(G(u)). \qquad (4.10) $$
Here each $\rho_i: \mathbf{R} \to \mathbf{R}$ is defined by $\rho_i(x) = \tfrac{1}{2} r_i x^2 + a_i x + b_i$ with $r_i \ge 0$ and $a_i \ge 0$. Each $d_{I_i}$ is the distance function associated with $I_i = (-\infty, 0]$ for $i = l_1+1, \dots, l_2$ and $I_i = [0, 0]$ for $i = l_2+1, \dots, l$. By writing
$$ \sigma(\alpha) = \max_{1 \le i \le l_1}\{\alpha_i\} + \sum_{i=l_1+1}^{l} \rho_i(d_{I_i}(\alpha_i)), \qquad F(u) = (f_1(u), \dots, f_l(u)), $$
we have $f(u) = \sigma(F(u)) + \Psi_C(G(u))$. Denote
$$ I(u) = \{1 \le i \le l_1 \mid f_i(u) = \max_{1 \le j \le l_1} f_j(u)\}, \qquad S(u) = \{\gamma \in \mathbf{R}^{l_1}_+ \mid \textstyle\sum_{i=1}^{l_1} \gamma_i = 1;\ \gamma_i = 0 \text{ for } i \notin I(u)\}, $$
$$ \mu_i(u) = r_i\, d_{I_i}(f_i(u)) + a_i, \quad i = l_1+1, \dots, l, \qquad S_i(u) = \mu_i(u)\,\partial d_{I_i}(f_i(u)), \quad i = l_1+1, \dots, l. $$
In this notation, the epi-derivative and subgradient of the convex function $\sigma$ at the point $F(u)$ are given by
$$ \sigma'_{F(u)}(\alpha) = \max_{i \in I(u)}\{\alpha_i\} + \sum_{i=l_1+1}^{l} \mu_i(u)\,(d_{I_i})'_{f_i(u)}(\alpha_i), \qquad \partial\sigma(F(u)) = S(u) \times S_{l_1+1}(u) \times \dots \times S_l(u). $$
We can write all expressions related to $d_{I_i}$ explicitly. The following are for $I_i = (-\infty, 0]$ and, inside brackets, for $I_i = [0, 0]$:
$$ d_{I_i}(f_i(u)) = \max\{0, f_i(u)\} \quad (|f_i(u)|), $$
$$ (d_{I_i})'_{f_i(u)}(\alpha_i) = \begin{cases} 0 \ \ (-\alpha_i) & f_i(u) < 0 \\ \max\{0, \alpha_i\} \ \ (|\alpha_i|) & f_i(u) = 0 \\ \alpha_i \ \ (\alpha_i) & f_i(u) > 0, \end{cases} \qquad \partial d_{I_i}(f_i(u)) = \begin{cases} [0, 0] \ \ ([-1, -1]) & f_i(u) < 0 \\ [0, 1] \ \ ([-1, 1]) & f_i(u) = 0 \\ [1, 1] \ \ ([1, 1]) & f_i(u) > 0. \end{cases} $$

We summarize the above discussion as follows:

Theorem 4.5. The function $f$ in (4.10) is twice epi-differentiable at $u \in U$ when the constraint qualification (4.6) holds at $u$. Its first epi-derivative is given by
$$ f'_u(v) = \max_{i \in I(u)}\{f_i'(u)^T v\} + \sum_{i=l_1+1}^{l} \big(a_i + r_i\, d_{I_i}(f_i(u))\big)\,(d_{I_i})'_{f_i(u)}(f_i'(u)^T v) + \Psi_{T_U(u)}(v). $$
The second epi-derivative, relative to some $w \in \partial f(u)$, is given by
$$ f''_{u,w}(v) = \sum_{i=l_1+1}^{l} r_i \big((d_{I_i})'_{f_i(u)}(f_i'(u)^T v)\big)^2 + \max_{(\gamma,\delta) \in \Gamma(u,w)}\{v^T [\gamma^T F + \delta^T G]''(u)\, v\} + \Psi_{\Xi(u,w)}(v), $$
where $\Gamma(u, w)$ and $\Xi(u, w)$ are given by (4.7) and (4.8).

4.2. Epi-derivatives of Integral Functionals

Now consider the integral functional $I$ defined by (4.1) with the integrand $f(t, u) = \sigma(F(t, u)) + \Psi_C(G(t, u))$, where $F: [0, T] \times \mathbf{R}^m \to \mathbf{R}^l$ and $G: [0, T] \times \mathbf{R}^m \to \mathbf{R}^p$ are measurable in $t$ and twice continuously differentiable in $u$, $\sigma: \mathbf{R}^l \to \mathbf{R}$ is piecewise linear-quadratic convex, and $C \subset \mathbf{R}^p$ is a nonempty closed convex polyhedron. To highlight the main idea and simplify the notation, we will suppress the explicit $t$-dependence from all functions, that is, write $f$ instead of $f(t, \cdot)$. When we take epi-derivatives, it is understood that these operations are only with respect to $u$, and $t$ is fixed. All the conditions and conclusions are the same for $f$ and $f(t, \cdot)$, and the proofs are also the same. The domain of $I$ is
$$ \{u \in L^2 \mid u(t) \in U \ \text{a.e. } t \in [0, T]\}. $$
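The case analysis for $(d_{I_i})'_{f_i(u)}$ above can be sanity-checked with one-sided difference quotients. A small sketch of our own (the step size and test points are arbitrary choices) for the case $I = (-\infty, 0]$, where $d_I(x) = \max\{0, x\}$:

```python
def d(x):
    # distance from x to the interval I = (-inf, 0]
    return max(0.0, x)

def dprime(x, a, h=1e-8):
    # one-sided difference quotient (d(x + h*a) - d(x)) / h
    return (d(x + h * a) - d(x)) / h

# matches the case table: the derivative in direction a is
# 0 when x < 0, max{0, a} when x = 0, and a when x > 0
assert abs(dprime(-1.0, 5.0)) < 1e-6
assert abs(dprime(0.0, 5.0) - 5.0) < 1e-6
assert abs(dprime(0.0, -5.0)) < 1e-6
assert abs(dprime(2.0, -3.0) - (-3.0)) < 1e-6
print("difference quotients agree with the case table")
```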
Throughout this section we deal with a fixed function $u \in L^2$ with $I(u)$ finite. Our main results are Theorem 4.6, which deals with first order derivatives, and Theorem 4.11, which treats the second order case.

Theorem 4.6. Suppose the constraint qualification (4.6) holds at each point $u(t)$, $t \in [0, T]$, and there exist a constant $c > 0$ and an integrable function $k$ such that
$$ \frac{\sigma(F(u(t) + hx)) - \sigma(F(u(t)))}{h} \ge k(t) - c|x|^2 \quad \text{a.e. } t \in [0, T]. $$
Then $I$ is epi-differentiable at $u$ and its epi-derivative is given by
$$ I'_u(v) = \int_0^T f'_{u(t)}(v(t))\,dt = \begin{cases} \int_0^T \sigma'_{F(u(t))}(F'(u(t))v(t))\,dt, & \text{if } v(t) \in T_U(u(t)) \ \text{a.e.} \\ +\infty, & \text{otherwise,} \end{cases} \qquad \forall v \in L^2. $$

We need the following concept and lemmas from Rockafellar [23] to prove Theorem 4.6.

Definition 4.7. A function $f: [0, T] \times \mathbf{R}^m \to \bar{\mathbf{R}}$ is called a normal integrand if its epigraph multifunction $t \mapsto \mathrm{epi}\,f(t, \cdot)$ is closed valued and measurable.

Lemma 4.8 [23, Theorem 2A]. Suppose the function $f(t, \cdot)$ is lower semicontinuous. Then $f$ is a normal integrand if and only if $f$ is $\mathcal{L} \times \mathcal{B}$ measurable.

Lemma 4.9 [23, Theorem 3A]. If $f$ is a normal integrand on $[0, T] \times \mathbf{R}^m$ and $\inf_{x \in L^2} \int_0^T f(t, x(t))\,dt < \infty$, then
$$ \inf_{x \in L^2} \int_0^T f(t, x(t))\,dt = \int_0^T \inf_{x \in \mathbf{R}^m} f(t, x)\,dt. $$

Proof of Theorem 4.6. By hypothesis, the first difference quotient obeys the estimate
$$ I_u(h, v) \ge \int_0^T \frac{\sigma(F(u(t) + hv(t))) - \sigma(F(u(t)))}{h}\,dt \ge \int_0^T k(t)\,dt - c\|v\|^2. $$
Thus $I_u(h, \cdot)$ is proper and condition (4.3) holds. By alternative definition (III), the epi-differentiability of $I$ at $u$ is equivalent to
$$ \lim_{\lambda \to \infty}\ \liminf_{h \to 0+} I_u^\lambda(h, v) = \lim_{\lambda \to \infty}\ \limsup_{h \to 0+} I_u^\lambda(h, v). \qquad (4.11) $$
For each $\lambda > 0$, we have
$$ I_u^\lambda(h, v) = \inf_{x \in L^2}\{I_u(h, x) + \lambda\|x - v\|^2\} = \inf_{x \in L^2} \int_0^T \big(f_{u(t)}(h, x(t)) + \lambda|x(t) - v(t)|^2\big)\,dt. $$
Since $\sigma$ and $\Psi_C$ are normal integrands, and the mapping $(t, x) \mapsto (F, G)(u(t) + hx)$ is measurable in $t$ and continuous in $x$, the function $(t, x) \mapsto f_{u(t)}(h, x)$ is a normal integrand by [23, Proposition 2N].
Note that
$$ \inf_{x \in L^2} \int_0^T \big(f_{u(t)}(h, x(t)) + \lambda|x(t) - v(t)|^2\big)\,dt \le \lambda \int_0^T |v(t)|^2\,dt < \infty. $$
We then apply Lemma 4.9 to deduce that $I_u^\lambda(h, v) = \int_0^T f^\lambda_{u(t)}(h, v(t))\,dt$. But
$$ f_{u(t)}(h, x) \ge k(t) - c|x|^2 \ge k_1(t) - 2c|x - v(t)|^2, $$
where $k_1(t) := k(t) - 2c|v(t)|^2$ is integrable. Thus whenever $\lambda > 2c$ we have
$$ k_1(t) \le f^\lambda_{u(t)}(h, v(t)) = \inf_{x \in \mathbf{R}^m}\{f_{u(t)}(h, x) + \lambda|x - v(t)|^2\} \le \lambda|v(t)|^2. $$
By Fatou's lemma (assuming $\lambda > 2c$) we have
$$ \liminf_{h \to 0+} I_u^\lambda(h, v) \ge \int_0^T \liminf_{h \to 0+} f^\lambda_{u(t)}(h, v(t))\,dt. $$
For any fixed $t$, $\liminf_{h \to 0+} f^\lambda_{u(t)}(h, v(t))$ is nondecreasing as a function of $\lambda$, and is bounded below by $k_1(t)$. By the monotone convergence theorem, we have
$$ \lim_{\lambda \to \infty}\ \liminf_{h \to 0+} I_u^\lambda(h, v) \ge \int_0^T \lim_{\lambda \to \infty}\ \liminf_{h \to 0+} f^\lambda_{u(t)}(h, v(t))\,dt = \int_0^T f'_{u(t)}(v(t))\,dt. $$
Similarly, we have
$$ \lim_{\lambda \to \infty}\ \limsup_{h \to 0+} I_u^\lambda(h, v) \le \int_0^T \lim_{\lambda \to \infty}\ \limsup_{h \to 0+} f^\lambda_{u(t)}(h, v(t))\,dt = \int_0^T f'_{u(t)}(v(t))\,dt. $$
So $I$ is epi-differentiable at $u$ and its epi-derivative is $\int_0^T f'_{u(t)}(v(t))\,dt$. Q.E.D.

Here is an immediate consequence of Theorem 4.6 in terms of epi-gradients.

Corollary 4.10. Under the conditions of Theorem 4.6, the epi-gradient set of $I$ at $u$ is
$$ \partial I(u) = \{w \in L^2 \mid w(t) \in \partial f(u(t)) \ \text{a.e. } t \in [0, T]\}. $$

Proof. If $w \in L^2$ satisfies $w(t) \in \partial f(u(t))$ a.e., then we have
$$ I'_u(v) = \int_0^T f'_{u(t)}(v(t))\,dt \ge \int_0^T \langle w(t), v(t) \rangle\,dt = \langle w, v \rangle $$
for any $v \in L^2$, so $w \in \partial I(u)$. On the other hand, if $w \in \partial I(u)$, then for any $v \in L^2$ we have
$$ \langle w, v \rangle \le I'_u(v) = \int_0^T f'_{u(t)}(v(t))\,dt, $$
that is,
$$ \inf_{x \in L^2} \int_0^T \big(f'_{u(t)}(x(t)) - \langle w(t), x(t) \rangle\big)\,dt \ge 0. \qquad (4.12) $$
Note $f'_{u(t)}(x) = \lim_{h \to 0+} g_h(t, x)$, where $g_h: [0, T] \times \mathbf{R}^m \to \bar{\mathbf{R}}$ is defined by
$$ g_h(t, x) = \frac{\sigma(F(u(t)) + hF'(u(t))x) - \sigma(F(u(t)))}{h} + \Psi_C(G(u(t)) + hG'(u(t))x). $$
Since $g_h$ is a normal integrand, Lemma 4.8 implies that $g_h$ is $\mathcal{L} \times \mathcal{B}$ measurable. Therefore $f'_{u(t)}(x)$, being the pointwise limit of $g_h(t, x)$, is also $\mathcal{L} \times \mathcal{B}$ measurable, and it is lower semicontinuous with respect to $x$ by the epi-derivative definition.
Applying Lemma 4.8 again, we deduce that $f'_{u(t)}(x)$ is a normal integrand. By applying Lemma 4.9 to (4.12) we get
$$ \int_0^T \inf_{x \in \mathbf{R}^m}\big(f'_{u(t)}(x) - \langle w(t), x \rangle\big)\,dt \ge 0. $$
But $\inf_{x \in \mathbf{R}^m}\big(f'_{u(t)}(x) - \langle w(t), x \rangle\big) \le 0$, so we must have $\inf_{x \in \mathbf{R}^m}\big(f'_{u(t)}(x) - \langle w(t), x \rangle\big) = 0$ for almost all $t$, that is, $w(t) \in \partial f(u(t))$ almost everywhere. Q.E.D.

Now we study the second epi-differentiability of $I$ at $u$ relative to $w \in \partial I(u)$.

Theorem 4.11. Suppose the constraint qualification (4.6) holds at $u(t)$ for almost every $t$, the set $U$ is convex, and there exists a constant $c > 0$ such that for almost all $t$ in $[0, T]$ and any $\gamma \in \partial\sigma(F(u(t)))$ one has
$$ \frac{\sigma(F(u(t) + hx)) - \sigma(F(u(t))) - h\langle\gamma, F'(u(t))x\rangle}{h^2/2} \ge -c|x|^2 \qquad (4.13) $$
for all $x \in \mathbf{R}^m$. Then $I$ is twice epi-differentiable at $u$. Its second epi-derivative relative to $w \in \partial I(u)$ is
$$ I''_{u,w}(v) = \int_0^T f''_{u(t),w(t)}(v(t))\,dt \qquad \forall v \in L^2. $$

Proof. Since $w(t)$ is an epi-gradient of $f$ at $u(t)$, there exist $\gamma(t) \in \partial\sigma(F(u(t)))$ and $\delta(t) \in N_C(G(u(t)))$ such that $w(t) = F'(u(t))^T \gamma(t) + G'(u(t))^T \delta(t)$. Fix $x$ in $\mathbf{R}^m$. If $u(t) + hx \notin U$, then $f_{u(t),w(t)}(h, x) = \infty$. Otherwise, $x \in T_U(u(t))$, that is, $G'(u(t))x \in T_C(G(u(t)))$, so $\langle\delta(t), G'(u(t))x\rangle \le 0$. Thus the inequality assumed in the theorem's statement leads to the estimate
$$ f(u(t) + hx) - f(u(t)) - h\langle w(t), x \rangle \ge \sigma(F(u(t) + hx)) - \sigma(F(u(t))) - h\langle\gamma(t), F'(u(t))x\rangle - h\langle\delta(t), G'(u(t))x\rangle \ge -\tfrac{1}{2} c h^2 |x|^2. $$
In either case, we have $f_{u(t),w(t)}(h, x) \ge -c|x|^2$. For any $0 < h < 1$ and $v \in L^2$, the second difference quotient satisfies
$$ I_{u,w}(h, v) = \int_0^T f_{u(t),w(t)}(h, v(t))\,dt \ge -c\|v\|^2. $$
Thus (4.3) holds. By alternative definition (III), the second order epi-differentiability of $I$ at $u$ relative to $w$ is equivalent to
$$ \lim_{\lambda \to \infty}\ \liminf_{h \to 0+} I^\lambda_{u,w}(h, v) = \lim_{\lambda \to \infty}\ \limsup_{h \to 0+} I^\lambda_{u,w}(h, v). $$
For each $\lambda > 0$, we have
$$ I^\lambda_{u,w}(h, v) = \inf_{x \in L^2}\{I_{u,w}(h, x) + \lambda\|x - v\|^2\} = \inf_{x \in L^2} \int_0^T \big(f_{u(t),w(t)}(h, x(t)) + \lambda|x(t) - v(t)|^2\big)\,dt. $$
Since $f_{u(t),w(t)}(h, x)$ is a normal integrand and $f_{u(t),w(t)}(h, x) \ge -2c|v(t)|^2 - 2c|x - v(t)|^2$, we also have
$$ -2c|v(t)|^2 \le f^\lambda_{u(t),w(t)}(h, v(t)) \le \lambda|v(t)|^2 $$
for $\lambda > 2c$. Thanks to Fatou's lemma and the monotone convergence theorem (just as in the proof of Theorem 4.6), the result follows. Q.E.D.

Theorem 4.12. Suppose $f$ is defined by (4.10). Let $u$ be a function such that $u(t)$ lies in $U$ and satisfies (4.6) for almost all $t$. Suppose there exists a constant $c > 0$ such that for any $x$ and $y$ in $\mathbf{R}^m$ we have $y^T f_i''(x) y \ge -c|y|^2$ for $i = 1, \dots, l_2$ and $|y^T f_i''(x) y| \le c|y|^2$ for $i = l_2+1, \dots, l$, while $\mu_i(u(t)) \le c$ for $i = l_1+1, \dots, l$. Suppose further that the function $|F'(u(t))|^2$ is integrable. Then $I$ is epi-differentiable at $u$ and its epi-derivative is given by $I'_u(v) = \int_0^T f'_{u(t)}(v(t))\,dt$. Furthermore, if $U$ is convex, then $I$ is twice epi-differentiable at $u$ and its second epi-derivative relative to any $w \in \partial I(u)$ is given by $I''_{u,w}(v) = \int_0^T f''_{u(t),w(t)}(v(t))\,dt$.

Proof. We only need to verify the conditions of Theorem 4.6 and Theorem 4.11. For any $\gamma \in \partial\sigma(F(u(t)))$, we have $0 \le \gamma_i \le 1$ for $i = 1, \dots, l_1$, $0 \le \gamma_i \le \mu_i(u(t)) \le c$ for $i = l_1+1, \dots, l_2$, and $|\gamma_i| \le \mu_i(u(t)) \le c$ for $i = l_2+1, \dots, l$. So
$$ \sigma(F(u(t) + hx)) - \sigma(F(u(t))) \ge \langle\gamma,\ F(u(t) + hx) - F(u(t))\rangle = \langle\gamma,\ hF'(u(t))x + \tfrac{1}{2}h^2 F''(x^*)(x, x)\rangle \ge -c_1 h|F'(u(t))||x| - c_2 h|x|^2 \ge -h\big(c_1|F'(u(t))|^2 + (c_1 + c_2)|x|^2\big) $$
for some constants $c_1, c_2$. The first inequality uses the subgradient's definition, the equality uses the mean value theorem, the next inequality follows from Cauchy's inequality and our hypotheses, and the last step holds because $|F'(u(t))||x| \le |F'(u(t))|^2 + |x|^2$. Q.E.D.

Comparison to other work. Noll [20] discusses the epi-differentiability of integral functionals with finite integrands which have second order Taylor expansions. Thus his integrands are at least Fréchet differentiable.
He shows that in this case the second epi-derivative is a quadratic functional. (See Noll [20, Theorem 3.1].) In contrast, we deal with extended real-valued, nonsmooth integrands. Our results include the case where f(u) = F(u) is smooth and scalar-valued as a function of u: we simply choose σ(a) = a ∈ ℝ, G ≡ 0, and C = ℝᵖ. In this case ∂f(u) = {F′(u)}, and it is easy to calculate Γ(u, w) = {(1, 0)} and Ξ(u, w) = ℝᵐ, so f″_{u,w}(v) = vᵀF″(u)v. If we add the assumption that inf_{x∈ℝᵐ} F″(x) > −∞, then our Theorem 4.6 shows that I is twice epi-differentiable at u with second epi-derivative

I″_{u,w}(v) = ∫₀ᵀ v(t)ᵀF″(u(t))v(t) dt,

as expected. Of course, membership in the class C² is a stronger condition on F than the existence of a second order Taylor expansion. But even in this case, we have something Noll [20] does not cover. He considers only integrands bounded by quadratic functions [20, (3.2)], or, in the C² case, integrands whose Hessian matrices are bounded [20, Theorem 4.2]. We impose only a lower bound on the Hessian matrix. Thus our theory pertains to arbitrary smooth convex integrands (such as F(u) = eᵘ in the case m = 1), whereas Noll's requires that the integrand grow at most quadratically. As for functionals in Sobolev spaces, or in our terms Bolza functionals, a similar remark applies.

Levy [14] discusses epi-differentiability of integral functionals in L² spaces and applies the results to the sensitivity analysis of set-valued functions. His paper contains results analogous to Theorem 4.6 and Theorem 4.11, which we proved independently at about the same time. (The proof techniques, however, are different: Levy chooses sequences satisfying alternative definition (I) directly, whereas we pass to Moreau-Yosida approximates and prove (III).) Indeed, a draft of [14] arrived in time for us to state Theorem 4.6 and Theorem 4.11 in a form that facilitates direct comparison.
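To make the smooth scalar case concrete, here is a small numerical sketch (an illustration only, not part of the formal development) checking that the second difference quotient [F(u+hv) − F(u) − h⟨w, v⟩]/(h²/2) approaches vᵀF″(u)v as h → 0+, for the integrand F(u) = eᵘ mentioned above. The base point u = 0 and direction v = 2 are arbitrary choices made for the illustration.

```python
import math

def second_diff_quotient(f, u, w, h, v):
    """Second difference quotient [f(u + h*v) - f(u) - h*w*v] / (h^2 / 2)."""
    return (f(u + h * v) - f(u) - h * w * v) / (h ** 2 / 2)

# F(u) = e^u: smooth and convex, so its Hessian is bounded below (by 0)
# even though it is unbounded above.
u, v = 0.0, 2.0
w = math.exp(u)  # the unique gradient of F at u

# As h -> 0+, the quotient should approach F''(u) * v^2 = e^0 * 4 = 4.
for h in (1e-1, 1e-2, 1e-3):
    print(h, second_diff_quotient(math.exp, u, w, h, v))
```

The point of the example is exactly the one made above: only a lower Hessian bound is needed here, so a quotient like this stays well behaved for eᵘ even though no global quadratic upper bound exists.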
Our Theorem 4.12, which identifies a significant class of functions to which the general theory applies, has no counterpart in [14]. Moreover, our applications are disjoint from those in [14]: whereas Levy concentrates on sensitivity analysis, we discuss the epi-differentiability of Bolza functionals and the resulting necessary conditions in optimal control.

Now we give an example to see how we may apply Theorem 4.6 and Theorem 4.11. Consider the problem

minimize I(u) = ∫₀ᵀ f(t, u(t)) dt over all u ∈ L²,    (4.14)

where f is given by (4.4). We have the following necessary condition for optimality:

Theorem 4.13. If u solves the problem and satisfies all conditions in Theorem 4.6 and Theorem 4.11, then for almost every t we have 0 ∈ ∂f(u(t)) and f″_{u(t),0}(v) ≥ 0 for any v ∈ ℝᵐ.

Proof. Since u is optimal, we have 0 ∈ ∂I(u) and I″_{u,0}(v) ≥ 0 for all v ∈ L². By Corollary 4.10, we have 0 ∈ ∂f(u(t)) almost everywhere and

inf_{v∈L²} ∫₀ᵀ f″_{u(t),0}(v(t)) dt ≥ 0.

It is easy to check that f″_{u(t),0}(v) is a normal integrand, so Lemma 4.9 implies

∫₀ᵀ inf_{v∈ℝᵐ} f″_{u(t),0}(v) dt ≥ 0.

But inf_{v∈ℝᵐ} f″_{u(t),0}(v) ≤ f″_{u(t),0}(0) = 0, so we must have inf_{v∈ℝᵐ} f″_{u(t),0}(v) = 0 for almost every t. The conclusion follows. Q.E.D.

The following example shows that these necessary conditions may be far from sufficient.

Example 4.14. Let f: ℝ → ℝ ∪ {+∞} be defined by f(u) = u/2 − sin u + Ψ_{[0,3π]}(u). Elementary calculus shows that f has local minima at the points u = π/3 and u = 7π/3, and that the first of these is the global minimum. For any measurable subset A of [0, 1], the function

u_A(t) = (π/3)χ_A(t) + (7π/3)(1 − χ_A(t))

satisfies both the first- and second-order necessary conditions of Theorem 4.13 concerning minimizers for I(u) = ∫₀¹ f(u(t)) dt; however, only in the case where m(A) = 1 does u_A provide a local minimum for I.
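The claims in Example 4.14 are easy to confirm numerically. The sketch below (reading the integrand as f(u) = u/2 − sin u plus the indicator of [0, 3π], as in the example) checks that both π/3 and 7π/3 are stationary points with positive second derivative, and that π/3 gives the global minimum over [0, 3π]:

```python
import math

def f(u):
    """f(u) = u/2 - sin(u) + indicator of [0, 3*pi] (+infinity outside)."""
    if not (0.0 <= u <= 3 * math.pi):
        return math.inf
    return u / 2 - math.sin(u)

def fprime(u):
    return 0.5 - math.cos(u)   # f'(u) = 1/2 - cos u

def fsecond(u):
    return math.sin(u)         # f''(u) = sin u

a, b = math.pi / 3, 7 * math.pi / 3

# Both points satisfy the pointwise first- and second-order conditions...
print(fprime(a), fsecond(a))   # ~0, positive
print(fprime(b), fsecond(b))   # ~0, positive

# ...but only pi/3 attains the global minimum of f on [0, 3*pi].
grid_min = min(f(3 * math.pi * k / 10**5) for k in range(10**5 + 1))
print(f(a), f(b), grid_min)
```

Since the necessary conditions of Theorem 4.13 only see the pointwise behaviour of f at each u_A(t), every measurable mixture of the two local minimizers passes them, which is exactly why the conditions are far from sufficient.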
4.3. Epi-derivatives of Bolza Functionals

Now we study the epi-differentiability of the Bolza functional. Suppose

J(u) = ∫₀ᵀ f(t, u(t), x(t)) dt    ∀ u ∈ L².    (4.15)

Here f(t, u, x) = L(t, u, x) + Ψ_C(G(t, u)), and the function x is given in terms of u by x(t) = a(t) + E(u)(t), where a ∈ L² and E is a bounded linear operator from L² to L². Denote

F(u) = (u, x),
I₁(u, v) = ∫₀ᵀ L(t, u(t), v(t)) dt,
I₂(u) = ∫₀ᵀ Ψ_C(G(t, u(t))) dt,
I(u, v) = I₁(u, v) + I₂(u).

Then we have J(u) = I(F(u)). Notice that F: L² → L² × L² is affine, with F′(u)v = (v, E(v)). Suppose the integrand L satisfies the following conditions:

(1) The function L is measurable in t and twice continuously differentiable in (u, x);
(2) There exists a constant c > 0 such that |L″(t, u, x)| ≤ c for all (t, u, x) ∈ [0, T] × ℝ^{m+n};
(3) The integrals ∫₀ᵀ |L(t, 0, 0)|² dt and ∫₀ᵀ |L′(t, 0, 0)|² dt are finite.

Remark 4.15. From now on we will suppress the explicit t-dependence from all functions. The discussion proceeds without any changes for t-dependent functions.

The functional I₁ is continuously differentiable and twice weakly Gâteaux differentiable, but in general it is not twice continuously differentiable unless L is precisely a quadratic function. (See Noll [20] for a detailed example and further references.) Since I₁ ∘ F ∈ C¹, we have ∂J(u) = F′(u)*I₁′(F(u)) + ∂I₂(u). Thus every w ∈ ∂J(u) gives rise to a function w₁ := w − F′(u)*I₁′(F(u)) ∈ ∂I₂(u), i.e., w₁(t) ∈ N_U(u(t)) by Corollary 4.10. For any v ∈ L², writing y = E(v), we have

⟨w, v⟩ = ⟨F′(u)*I₁′(F(u)), v⟩ + ⟨w₁, v⟩
= ⟨I₁′(F(u)), F′(u)v⟩ + ⟨w₁, v⟩
= ∫₀ᵀ ( L_u(t)ᵀv(t) + L_x(t)ᵀy(t) + ⟨w₁(t), v(t)⟩ ) dt
= ⟨(L_u, L_x) + (w₁, 0), F′(u)v⟩.

So we have w = F′(u)*w̃, where w̃(t) = (L_u(t), L_x(t)) + (w₁(t), 0) ∈ ∂f(u(t), x(t)). We are going to study the epi-differentiability of J.

Theorem 4.16. Suppose the constraint qualification (4.6) holds at u(t) for almost all t. Then J is epi-differentiable at u.
Its epi-derivative is given by

J′_u(v) = I′_{(u,x)}(v, y) = ∫₀ᵀ ( L′_u(t)ᵀv(t) + L′_x(t)ᵀy(t) ) dt + Ψ_V(v),

where y = E(v) and

V = T_𝒰(u) = { v ∈ L² | v(t) ∈ T_U(u(t)) a.e. t ∈ [0, T] }.    (4.16)

If, in addition, we suppose that the set U is convex, then J is twice epi-differentiable at u relative to w ∈ ∂J(u), and its second epi-derivative is

J″_{u,w}(v) = ∫₀ᵀ ( L₂(v(t), y(t)) + max_{δ∈Γ(t)} { v(t)ᵀ[δᵀG]″(u(t))v(t) } + Ψ_{Ξ(t)}(v(t)) ) dt.

Here w₁ := w − F′(u)*I₁′(F(u)) and

L₂(v, y) = vᵀL_uu(t)v + 2yᵀL_xu(t)v + yᵀL_xx(t)y,
Γ(t) = { δ ∈ N_C(G(u(t))) | G′(u(t))ᵀδ = w₁(t) },
Ξ(t) = { v ∈ T_U(u(t)) | vᵀw₁(t) = 0 }.

Proof. We have J_u(h, v) = I_{u,x}(h, v, y) where y = E(v). By version (I) of the definition, J is epi-differentiable at u if and only if for any point v ∈ L² and any sequence h_n → 0+ we have:

(i) for any sequence v_n converging to v, liminf_{n→∞} J_u(h_n, v_n) ≥ I′_{u,x}(v, y);
(ii) there exists a sequence v_n converging to v such that limsup_{n→∞} J_u(h_n, v_n) ≤ I′_{u,x}(v, y).

Given any sequence v_n → v in L², denote y_n = E(v_n). Then (v_n, y_n) → (v, y) in L² × L². Since I is epi-differentiable at (u, x), we have

liminf_{n→∞} I_{u,x}(h_n, v_n, y_n) ≥ I′_{u,x}(v, y).

That is condition (i). Now we want to show (ii). If I′_{(u,x)}(v, y) = ∞, we are done. So we suppose I′_{(u,x)}(v, y) is finite. The epi-differentiability of I implies that there exists a sequence (v_n, w_n) → (v, y) such that

limsup_{n→∞} I_{u,x}(h_n, v_n, w_n) ≤ I′_{(u,x)}(v, y).

Since the right side is finite, we have u(t) + h_n v_n(t) ∈ U almost everywhere for n sufficiently large. Now by definition, I_{u,x}(h_n, v_n, w_n) = I_{u,x}(h_n, v_n, y_n) + ε_n, where

ε_n = ∫₀ᵀ [ L(u + h_n v_n, x + h_n w_n) − L(u + h_n v_n, x + h_n y_n) ] / h_n dt.

Using the mean value theorem, we estimate

|L(u + v, x + y) − L(u + v, x + z)| ≤ |L_x(u + v, x + θy + (1 − θ)z)||y − z| ≤ ( |L_x(u, x)| + c|v| + c|y| + c|z| )|y − z|.

It follows that

|ε_n| ≤ ∫₀ᵀ ( |L_x(u, x)| + c|v_n| + c|w_n| + c|y_n| )|w_n − y_n| dt.
The sequence of integrals ∫₀ᵀ ( |L_x(u(t), x(t))| + c|v_n(t)| + c|w_n(t)| + c|y_n(t)| )² dt is bounded, and ∫₀ᵀ |w_n(t) − y_n(t)|² dt → 0. Therefore ε_n → 0 as n → ∞. So

limsup_{n→∞} J_u(h_n, v_n) = limsup_{n→∞} I_{u,x}(h_n, v_n, w_n) ≤ I′_{u,x}(v, y).

By alternative definition (I), the functional J is epi-differentiable at u with epi-derivative I′_{u,x}(v, y).

Next, suppose w ∈ ∂J(u). The second difference quotient of J at u relative to w is J_{u,w}(h, v) = I_{(u,x),(L_u+w₁,L_x)}(h, v, y). Note that I is twice epi-differentiable at (u, x) relative to (L_u + w₁, L_x), so a discussion similar to that for the first epi-derivative shows that J is twice epi-differentiable at u. The only difference is the estimate of ε_n. Here

ε_n = ∫₀ᵀ [ L(u + h_n v_n, x + h_n w_n) − L(u + h_n v_n, x + h_n y_n) − h_n⟨L_x(u, x), w_n − y_n⟩ ] / (h_n²/2) dt.

Since

|L(u + v, x + y) − L(u + v, x + z) − ⟨L_x(u, x), y − z⟩| = |⟨L_x(u + v, x + θy + (1 − θ)z) − L_x(u, x), y − z⟩| ≤ |y − z|( c|v| + c|y| + c|z| ),

we have

|ε_n| ≤ ∫₀ᵀ 2c( |v_n(t)| + |w_n(t)| + |y_n(t)| )|w_n(t) − y_n(t)| dt,

which converges to zero as n goes to infinity. Q.E.D.

4.4. Applications

In this section, we apply our results to obtain necessary conditions for optimality in the following free endpoint control problem:

minimize J(u) = ∫₀ᵀ L(u(t), x(t)) dt over all u ∈ 𝒰.    (4.17)

The corresponding trajectory x satisfies

x′(t) = A(t)x(t) + B(t)u(t), x(0) = x₀,    (4.18)

and the control set is 𝒰 = { u ∈ L² | u(t) ∈ U }. Let X(t) be the fundamental matrix function associated with A(t), i.e., the unique solution of the initial-value problem X′(t) = A(t)X(t), X(0) = I. Using X, we define the operator E: L² → L² by

E(u)(t) = ∫₀ᵗ X(t)X(s)⁻¹B(s)u(s) ds,

and the function a(t) = X(t)x₀. Then (4.18) gives x = a + E(u), so J takes the form (4.15). Now we give necessary conditions for the problem (4.17).
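As a concrete sanity check on the operator E, the following numerical sketch evaluates x(t) = X(t)x₀ + E(u)(t) by quadrature of the variation-of-constants formula and compares it with the closed-form solution of x′ = ax + bu. The scalar data A(t) ≡ a, B(t) ≡ b and the constant control u ≡ 1 are assumptions made purely for the illustration, not data from the text.

```python
import math

# Assumed illustrative data: scalar dynamics x'(t) = a*x(t) + b*u(t), x(0) = x0.
a_coef, b_coef, x0 = -0.5, 2.0, 1.0

def X(t):
    # Fundamental solution of X'(t) = a*X(t), X(0) = 1.
    return math.exp(a_coef * t)

def E(u, t, n=2000):
    # E(u)(t) = integral_0^t X(t) X(s)^{-1} b u(s) ds, by the midpoint rule.
    h = t / n
    return sum(X(t) / X((k + 0.5) * h) * b_coef * u((k + 0.5) * h) * h
               for k in range(n))

u = lambda s: 1.0
t = 1.0
x_numeric = X(t) * x0 + E(u, t)   # x = a + E(u), with a(t) = X(t)*x0
x_closed = math.exp(a_coef * t) * x0 + (b_coef / a_coef) * (math.exp(a_coef * t) - 1)
print(x_numeric, x_closed)
```

The two values agree to quadrature accuracy, confirming that the affine map u ↦ a + E(u) reproduces the trajectory of (4.18) in this simple case.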
Suppose u is an optimal control, that is, J(v) ≥ J(u) for any v ∈ L². Denote f(u, x) = L(u, x) + Ψ_U(u). We will apply the results in Section 4.3 to get optimality conditions for (4.17). The extended pre-Hamiltonian function for problem (4.17) is

H(t, x, p, u, γ) = pᵀ(A(t)x + B(t)u) − L(u, x) − γᵀG(u).

The "adjoint equation" below defines an arc p:

−p′(t) = H_x(t) = A(t)ᵀp(t) − L_x(u(t), x(t)), p(T) = 0.

Theorem 4.17. Suppose (u, x) is an optimal solution of (4.17), and the constraint qualification (4.6) holds at u(t) for almost all t. Then we have the following first order necessary condition:

∫₀ᵀ ( L_u(t)ᵀv(t) + L_x(t)ᵀy(t) ) dt ≥ 0    (4.19)

for all (v, y) satisfying v ∈ V and y = E(v). (The cone V is given by (4.16).) If, in addition, the control set U is convex, then we have the second order necessary condition

∫₀ᵀ ( L₂(v(t), y(t)) + max_{γ∈Γ(t)} { v(t)ᵀ[γᵀG]″(u(t))v(t) } + Ψ_{Ξ(t)}(v(t)) ) dt ≥ 0,

where

L₂(v, y) = vᵀL_uu(t)v + 2yᵀL_xu(t)v + yᵀL_xx(t)y,
Γ(t) = { γ ∈ N_C(G(u(t))) | H_u(t) = 0 },
Ξ(t) = { v ∈ T_U(u(t)) | (B(t)ᵀp(t) − L_u(t))ᵀv = 0 }.

Proof. Since J has a minimum at u, we have J′_u(v) ≥ 0 for all v ∈ L², that is, (4.19). Theorem 4.16 provides the second order necessary condition J″_{u,0}(v) ≥ 0, where

J″_{u,0}(v) = ∫₀ᵀ ( L₂(v(t), y(t)) + max_{γ∈Γ(t)} { v(t)ᵀ[γᵀG]″(u(t))v(t) } + Ψ_{Ξ(t)}(v(t)) ) dt.

This condition involves the function w₁ = −F′(u)*I₁′(F(u)), which we now compute. For any v ∈ L², we have

⟨F′(u)*I₁′(F(u)), v⟩ = ⟨I₁′(F(u)), F′(u)v⟩ = ⟨I₁′(u, x), (v, y)⟩
= ∫₀ᵀ ( L_u(t)ᵀv(t) + L_x(t)ᵀy(t) ) dt
= ∫₀ᵀ ( L_u(t)ᵀv(t) + (p′(t) + A(t)ᵀp(t))ᵀy(t) ) dt
= ∫₀ᵀ ( L_u(t)ᵀv(t) + (p′(t)ᵀy(t) + p(t)ᵀy′(t)) − p(t)ᵀB(t)v(t) ) dt
= ∫₀ᵀ ( L_u(t) − B(t)ᵀp(t) )ᵀv(t) dt = ⟨L_u − Bᵀp, v⟩.

(The third equality uses y′ = Ay + Bv; the fourth uses ∫₀ᵀ (p′ᵀy + pᵀy′) dt = p(T)ᵀy(T) − p(0)ᵀy(0) = 0, since p(T) = 0 and y(0) = 0.) So we have w₁ = −F′(u)*I₁′(F(u)) = −L_u + Bᵀp. The definition of L₂ here agrees with that in Theorem 4.16. As for the other
ingredients, we calculate

Γ(t) = { γ ∈ N_C(G(u(t))) | w₁(t) = Σᵢ γᵢ g′ᵢ(u(t)) } = { γ ∈ N_C(G(u(t))) | H_u(t) = 0 },
Ξ(t) = { v ∈ T_U(u(t)) | (B(t)ᵀp(t) − L_u(t))ᵀv = 0 }. Q.E.D.

Remark 4.18. If the gradients {g′ᵢ(u(t))} (active inequality and equality indices) are linearly independent for almost all t, then the multipliers γᵢ are unique, and this result is the same as Theorem 2.4(b).

4.5. Calculus of Epi-derivatives and Endpoint Constraints

Let us consider the following endpoint-constrained optimal control problem:

minimize J(u) = ∫₀ᵀ L(x(t), u(t)) dt    (4.20)

over all controls u ∈ 𝒰 with trajectories x = a + E(u) and x(T) ∈ C. Here 𝒰, a and E are the same as those in the first paragraph of Section 4.4. Let us define the following functions: F(u) = x(T) with x = a + E(u), W(x) = Ψ_C(x), and V(u) = ∫₀ᵀ L(x, u) dt + Ψ_𝒰(u). Then problem (4.20) is equivalent to minimizing J(u) = V(u) + W(F(u)) over all u ∈ L². In this section we show how the calculus of epi-derivatives leads to first-order necessary conditions for optimality in this problem. The second-order theory remains an open question.

Suppose the set U ⊂ ℝᵐ is closed and convex. The set 𝒰 is therefore closed and convex. We also suppose C is closed and tangentially regular, that is, T_C(x) = K_C(x) for x ∈ C. We know that the functional V is epi-differentiable at u ∈ 𝒰 under some mild conditions (see Theorem 4.16), and its epi-derivative is given by

V′_u(v) = ∫₀ᵀ ( L′_u(t)ᵀv(t) + L′_x(t)ᵀy(t) ) dt + Ψ_V(v),    (4.21)

where V = { v ∈ L² | v(t) ∈ T_U(u(t)) a.e. t ∈ [0, T] } and y = E(v). It is also easy to check that F is continuously differentiable at u with derivative F′(u)v = y(T), where y = E(v), and that W is epi-differentiable at x ∈ C with epi-derivative W′_x(y) = Ψ_{T_C(x)}(y). To show that J is epi-differentiable at u ∈ 𝒰 and to express its epi-derivative in terms of those of V and W, we need some stability assumptions. There are not many results on this topic.
One source is Aubin and Frankowska [3]. Their Theorem 6.3.1 and its remark show that a suitable condition is that there exist constants c > 0, 0 < ε < 1 and δ > 0 such that for any w ∈ 𝒰(δ) := { w ∈ 𝒰 | ‖w − u‖ ≤ δ } and z ∈ C(δ) := { z ∈ C | |z − x(T)| ≤ δ }, we have

B₁ ⊂ F′(u)( T_𝒰(w) ∩ B_c ) − T_C(z) + B_ε,    (4.22)

where B_c and B_ε are balls with center 0 and radius c in L² and radius ε in ℝⁿ, respectively. Under this condition, J is epi-differentiable at u and its epi-derivative is given by

J′_u(v) = V′_u(v) + Ψ_{T_C(x(T))}(y(T))

with y = E(v) and v ∈ V. Since F′(u)( T_𝒰(u) ∩ B_c ) = { y(T) | y = E(v), v ∈ T_𝒰(u), ‖v‖ ≤ c }, we simply denote it by Γ(u, c). The usual normality condition on the control system (4.20) is

Γ(u) − T_C(x(T)) = ℝⁿ,    (4.23)

where Γ(u) = { y(T) | y = E(v), v ∈ T_𝒰(u) }. Now we check its relation with the stability assumptions.

Theorem 4.19. If the normality condition (4.23) holds, then there exists a constant c > 0 such that

B₁ ⊂ Γ(u, c) − T_C(x(T)).    (4.24)

Proof. We know Γ(u, k) − T_C(x(T)) is convex and contains 0 for all k. If 0 is a boundary point for all k, then for each k there exists ξ_k ∈ ℝⁿ with |ξ_k| = 1 such that ⟨ξ_k, Γ(u, k) − T_C(x(T))⟩ ≤ 0. Without loss of generality, we may suppose ξ_k → ξ with |ξ| = 1. Since Γ(u, k) ↑ Γ(u) as k → ∞, we have ⟨ξ, Γ(u) − T_C(x(T))⟩ ≤ 0. This contradicts (4.23). So 0 must be an interior point for some K, that is, there exists ε > 0 such that B_ε ⊂ Γ(u, K) − T_C(x(T)). We let c = K/ε and rescale to get (4.24). Q.E.D.

Now we show that (4.24) implies (4.22).

Theorem 4.20. If (4.24) holds, then (4.22) holds.

Proof. By [3, Theorem 4.1.10] we have T_C(x(T)) = liminf_{y→x(T), y∈C} T_C(y). So there exists δ₁ > 0 such that for any y ∈ C(δ₁), we have T_C(x(T)) ⊂ T_C(y) + B_{1/4}. By [3, Theorem 4.1.13] we have T_𝒰(u) = liminf_{v→u, v∈𝒰} K_𝒰(v), where K_𝒰(v) is a weak contingent cone to 𝒰 at v. Since 𝒰 is closed and convex, the weak contingent cone coincides with the tangent cone: K_𝒰(v) = T_𝒰(v). There exists c₁ > 0 such that |y(T)| ≤ c₁‖v‖ for any y = E(v). Choose ε = 1/(4c₁).
There exists δ₂ > 0 such that if w ∈ 𝒰(δ₂), we have T_𝒰(u) ⊂ T_𝒰(w) + B_ε. So

T_𝒰(u) ∩ B_c ⊂ ( T_𝒰(w) + B_ε ) ∩ B_c ⊂ T_𝒰(w) ∩ B_{c₂} + B_ε

for c₂ = c + ε. For any y(T) ∈ Γ(u, c), there exists v ∈ T_𝒰(u) ∩ B_c such that y(T) = F′(u)v. So there exists v₁ ∈ T_𝒰(w) ∩ B_{c₂} such that ‖v₁ − v‖ ≤ ε. Let y₁(T) = F′(u)v₁. We have |y₁(T) − y(T)| ≤ c₁‖v₁ − v‖ ≤ 1/4. That is, Γ(u, c) ⊂ Γ(w, c₂) + B_{1/4}. Now let δ = min{δ₁, δ₂}. For any y ∈ C(δ) and w ∈ 𝒰(δ), we have from (4.24) that

B₁ ⊂ Γ(u, c) − T_C(x(T)) ⊂ Γ(w, c₂) − T_C(y) + B_{1/2}.

This is stability assumption (4.22). Q.E.D.

Summarizing the above discussion, we have the following first-order necessary condition:

Theorem 4.21. Suppose the normality condition (4.23) holds for (u, x). Then J is epi-differentiable at u. Its epi-derivative is given by

J′_u(v) = ∫₀ᵀ ( L′_u(t)ᵀv(t) + L′_x(t)ᵀy(t) ) dt + Ψ_V(v) + Ψ_{T_C(x(T))}(y(T)),

where y = E(v). If (u, x) solves the problem (4.20), then we have the first order necessary condition

∫₀ᵀ ( L′_u(t)ᵀv(t) + L′_x(t)ᵀy(t) ) dt ≥ 0

for all v ∈ T_𝒰(u) with y = E(v) and y(T) ∈ T_C(x(T)).

Remark 4.22. For the time being, we cannot use these methods to produce second-order necessary conditions. The reason is that there are no useful calculus rules for second epi-derivatives in infinite-dimensional spaces. This subject needs further investigation.

References

[1] H. Attouch, Variational Convergence for Functions and Operators, Pitman, Boston, 1984.
[2] J. P. Aubin and A. Cellina, Differential Inclusions, Springer-Verlag, New York, 1984.
[3] J. P. Aubin and H. Frankowska, Set-valued Analysis, Birkhäuser, Boston, 1990.
[4] D. S. Bernstein and E. G. Gilbert, Second order necessary conditions in optimal control: Accessory problem results without normality conditions, J. Optim. Theory Appl., 41 (1983), 75-105.
[5] F. Clarke, Optimal control and the true Hamiltonian, SIAM Review, 21 (1979), 157-166.
[6] F. Clarke, Optimization and Nonsmooth Analysis, Wiley-Interscience, New York, 1983.
[7] F. Clarke, Methods of Dynamic and Nonsmooth Optimization, CBMS-NSF Regional Conference Series in Applied Mathematics, 57, 1990.
[8] R. Cominetti, On pseudo-differentiability, Trans. Amer. Math. Soc., 324 (1988), 843-865.
[9] C. Do, Generalized second derivatives of convex functions in reflexive Banach spaces, Trans. Amer. Math. Soc., 334 (1992), 281-301.
[10] A. V. Fiacco and G. P. McCormick, Nonlinear Programming: Sequential Unconstrained Minimization Techniques, 1968.
[11] W. H. Fleming and R. W. Rishel, Deterministic and Stochastic Optimal Control, Springer-Verlag, New York, 1975.
[12] H. Frankowska, Contingent cones to reachable sets of control systems, SIAM J. Control and Optim., 27 (1989), 170-198.
[13] M. R. Hestenes, Calculus of Variations and Optimal Control Theory, John Wiley & Sons, 1966.
[14] A. Levy, Second-order epi-derivatives of integral functionals, preprint, 1992.
[15] P. D. Loewen, Optimal Control via Nonsmooth Analysis, CRM Proceedings & Lecture Notes, to appear.
[16] P. D. Loewen and R. T. Rockafellar, Optimal control of unbounded differential inclusions, SIAM J. Control and Optim., to appear.
[17] —, The adjoint arc in nonsmooth optimization, Trans. Amer. Math. Soc., 325 (1991), 39-72.
[18] P. D. Loewen and H. Zheng, Generalized conjugate points for optimal control problems, Nonlinear Analysis, to appear.
[19] G. P. McCormick, Optimality criteria in nonlinear programming, SIAM-AMS Proceedings, 9 (1976), 27-38.
[20] D. Noll, Second order differentiability of integral functionals on Sobolev spaces and L² spaces, J. für die Reine und Angewandte Mathematik, to appear.
[21] R. T. Rockafellar and R. A. Poliquin, A calculus of epi-derivatives with applications to optimization, preprint, 1991.
[22] —, Amenable functions in optimization, preprint, 1991.
[23] R. T.
Rockafellar, Integral functionals, normal integrands and measurable selections, in Nonlinear Operators and the Calculus of Variations, Springer-Verlag Lecture Notes in Math. 543 (1976), 157-207.
[24] —, First- and second-order epi-differentiability in nonlinear programming, Trans. Amer. Math. Soc., 307 (1988), 75-107.
[25] —, Second-order optimality conditions in nonlinear programming obtained by way of epi-derivatives, Math. of Oper. Res., 14 (1989), 462-484.
[26] —, Proto-differentiability of set-valued mappings and its applications in optimization, in Analyse Non Linéaire, ed. H. Attouch, J. P. Aubin, F. Clarke, I. Ekeland, 1989, 449-482.
[27] —, Generalized second derivatives of convex functions and saddle functions, Trans. Amer. Math. Soc., 320 (1990), 810-822.
[28] —, Nonsmooth analysis and parametric optimization, in Methods of Nonconvex Analysis, ed. A. Cellina, Springer-Verlag Lecture Notes in Math. 1446 (1990), 137-151.
[29] D. Ward, A chain rule for parabolic second-order epiderivatives, Optimization, to appear.
[30] J. Warga, A second-order Lagrangian condition for restricted control problems, J. Optim. Theory Appl., 24 (1978), 475-483.
[31] —, A second order condition that strengthens Pontryagin's maximum principle, J. Diff. Eqn., 28 (1979), 284-307.
[32] V. Zeidan and P. Zezza, The conjugate point condition for smooth control sets, J. Math. Anal. Appl., 132 (1988), 572-589.
[33] —, Necessary conditions for optimal control problems: conjugate points, SIAM J. Control and Optim., 26 (1988), 592-608.
[34] —, Conjugate points and optimal control: Counter-examples, IEEE Trans. Auto. Contr., 34 (1989), 254-256.
[35] H. Zheng, Second-order necessary conditions for differential inclusion problems, Applied Math. and Optim., to appear.