THE EXISTENCE OF OPTIMAL SINGULAR CONTROLS FOR STOCHASTIC DIFFERENTIAL EQUATIONS

By Wulin Suo
B.Sc., M.Sc. (Mathematics), Hebei University, China

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES, DEPARTMENT OF MATHEMATICS AND INSTITUTE OF APPLIED MATHEMATICS

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
February 1994
© Wulin Suo, 1994

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Mathematics and Institute of Applied Mathematics
The University of British Columbia
Vancouver, Canada V6T 1Z2

Date:

Abstract

We study a singular control problem where the state process is governed by an Itô stochastic differential equation allowing both classical and singular controls. By reformulating the state equation as a martingale problem on an appropriate canonical space, it is shown, under mild continuity conditions on the data, that an optimal control exists. The dynamic programming principle for the problem is established through the method of conditioning and concatenation. Moreover, it is shown that there exists a family of optimal controls such that the corresponding states form a Markov process. When the data is Lipschitz continuous, the value function is shown to be uniformly continuous and to be the unique viscosity solution of the corresponding Hamilton-Jacobi-Bellman variational inequality.
We also provide a description of the continuation region, the region in which the optimal state process is continuous, and we show that there exists a family of optimal controls which keeps the state inside the region after a possible initial jump. The last part is independent of the rest of the thesis. Through stretching of time, the singular control problem is transformed into a new problem that involves only classical control. Such problems are relatively well understood. As a result, it is shown that there exists an optimal control where the classical control variable is in Markovian form and the increment of the singular control variable on any time interval is adapted to the state process on the same time interval.

Table of Contents

Abstract
Acknowledgement
1 Introduction
2 Formulation of the Problem
2.1 Introduction
2.2 Model problems
2.3 Statement of the problem
2.4 Control rules
2.5 The topology on the canonical space
3 Existence of Optimal Controls
3.1 Introduction
3.2 An equivalent definition for control rules
3.3 Existence of optimal controls
3.4 Some comments
4 The Dynamic Programming Principle
4.1 Introduction
4.2 Some preparations
4.3 The dynamic programming principle
4.4 Markov property
5 The Value Function
5.1 Introduction
5.2 Continuity of the value function
5.3 The dynamic programming equation
5.3.1 Heuristic derivation of the dynamic programming equation
5.3.2 Viscosity solution
5.4 The uniqueness of viscosity solution to the HJB equation
6 The Continuation Region
7 The Existence of Optimal Control Laws
7.1 Introduction
7.2 Formulation of the problem
7.2.1 The singular control problem
7.2.2 The classical control problem
7.3 Equivalence of the two problems
7.4 Existence of optimal control laws
7.5 Some comments
A Set-valued functions
B Some results from real analysis
Bibliography

Acknowledgement

First and foremost I am
grateful to Professor Ulrich Haussmann. Throughout the years of my graduate study at UBC, I have benefited enormously from his ideas, enthusiasm and overall guidance. I thank him not only for serving as my research supervisor and spending countless hours sharing his broad knowledge of mathematics with me, but also for his encouragement, patience and consideration during these years. I also thank Professors Philip Loewen and Ed Perkins for serving on my supervisory committee, and for giving those wonderful courses which I have enjoyed taking. I am also indebted to many of my fellow graduate students, whose friendship has helped to make student life tolerable, and to the staff in the department, especially Ms. Tina Tang, for their help. In addition, I would like to take this opportunity to thank my best friends Yan-qun Liu and Xue-feng Zhang of Hebei University for the favors they have done me since I left China to pursue my graduate study here. Finally, I thank my wife Jane and my son Charles for their love, understanding and tolerance. To them this thesis is dedicated.

Chapter 1

Introduction

The class of singular stochastic control problems, which has been studied extensively in recent years, deals with systems described by a stochastic differential equation in which one restricts the cumulative displacement of the state caused by control to be of an additive nature, or in other words, of bounded variation on finite intervals. In classical control problems, this cumulative displacement is the integral of some function of the state (see Fleming and Rishel [22], Krylov [50]) and so is absolutely continuous. In impulsive control problems (see Bensoussan and Lions [5]), this cumulative displacement has jumps, between which it is either constant or absolutely continuous.
Singular control problems admit both of these possibilities and also the possibility that the displacement of the state caused by the optimal control is singularly continuous with respect to the Lebesgue measure on the time interval. More precisely, in singular control problems the state process is governed by the following d-dimensional stochastic differential equation

dx_t = b(t, x_t, u_t)\,dt + \sigma(t, x_t, u_t)\,dB_t + g(t)\,dv_t, \quad s \le t \le T,
x_s = x, \quad x \in \mathbb{R}^d,        (1.1)

on some filtered probability space (\Omega, \mathcal{F}, \mathcal{F}_t, P), where

b(\cdot,\cdot,\cdot): [0,T] \times \mathbb{R}^d \times U \to \mathbb{R}^d, \quad \sigma(\cdot,\cdot,\cdot): [0,T] \times \mathbb{R}^d \times U \to \mathbb{R}^{d \times l}, \quad g(\cdot): [0,T] \to \mathbb{R}^{d \times k}

are given deterministic functions, (B_t, t \ge 0) is an l-dimensional Brownian motion, x is the initial state at time s, and u: [0,T] \to U, v: [0,T] \to \mathbb{R}^k_+, with v nondecreasing componentwise, stand for controls. We call u the classical control variable, and v the singular control variable. When k = d and g(\cdot) = I, the d x d unit matrix, the problem is often referred to as a monotone follower problem, and when k = 2d, g(\cdot) = (I, -I), it is also called in the literature a bounded variation follower problem. Moreover, we should point out that in most of the literature about singular control problems, there is no classical control variable u involved. The expected cost has the form

J = E\left\{ \int_s^T f(t, x_t, u_t)\,dt + \int_{[s,T)} c(t) \cdot dv_t \right\},        (1.2)

where f(\cdot,\cdot,\cdot): [0,T] \times \mathbb{R}^d \times U \to \mathbb{R} and c(\cdot): [0,T] \to \mathbb{R}^k are given; f stands for the running cost rate of the problem and c for the cost rate of applying the singular control. If we let the value function of the problem be W(t, x), i.e., the infimum of J over all admissible controls, then a heuristic application of the dynamic programming principle will lead to the following variational inequality, or Hamilton-Jacobi-Bellman equation, on [0,T] \times \mathbb{R}^d:

\inf_{u \in U}(LW + f)(t, x, u) \ge 0,
(g^*\nabla_x W(t, x))^i + c^i(t) \ge 0, \quad i = 1, 2, \ldots, k,        (1.3)
\inf_{u \in U}(LW + f) \cdot \prod_{i=1}^k \left\{ (g^*\nabla_x W(t, x))^i + c^i(t) \right\} = 0,

where

L \equiv \frac{\partial}{\partial t} + \frac{1}{2} \sum_{i,j} a_{ij} \frac{\partial^2}{\partial x_i \partial x_j} + \sum_i b_i \frac{\partial}{\partial x_i}.
The problem reduces to the classical control problem if g = 0; this case has been studied extensively in the literature, cf. Fleming and Rishel [22], Krylov [50], Lions [55] among others. Let A be the subset of [0,T] \times \mathbb{R}^d on which

\inf_{u \in U}(LW + f)(t, x, u) = 0,        (1.4)

and on its complement A^c one of the other inequalities in (1.3) becomes an equality. If we can show that the value function is convex and in C^{1,2}([0,T] \times \mathbb{R}^d), and that the boundary \partial A of the subset A is smooth enough, then it can easily be verified that the optimal control exists, and has the following form: if the state process starts outside of A, then the optimal control will make it jump to some point on the boundary \partial A; thereafter v acts only when the state process is on \partial A, and pushes it back into A (along the direction -\nabla_x W). The classical control variable u acts when the optimal state process is inside A in such a way that (1.4) holds; in other words, the optimal classical control variable u acts optimally as if there were no singular control v in the problem. The optimal state process is thus a reflected diffusion in the set A, which is called the inaction region, and the singular optimal control is like the local time of the reflected diffusion at the boundary \partial A. This kind of problem was first studied in the late 1960s by Bather and Chernoff [3], who considered the problem of controlling the motion of a spaceship on some finite time horizon with a quadratic terminal cost and a cumulative cost for the singular control, and was taken up by Borodowski, Bratus and Chernous'ko [7], [8], [10], [11]. However, not much progress had been made until the seminal work of Beneš, Shepp and Witsenhausen [4] (1980), who studied several specific singular control problems, all of which were 1-dimensional with quadratic running costs and simple state processes (e.g., b = 0 and \sigma = 1 in (1.1)).
The solutions constructed were the first of a long list of solutions to singular stochastic control problems in which the value functions are twice continuously differentiable, even across the free-boundary \partial A where the optimal singular control acts. This so-called principle of smooth fit has played an important role in constructing the solutions of 1-dimensional problems (e.g., Harrison, Sellke and Taylor [25], Harrison and Taksar [26], Harrison and Taylor [27], Jacka [37], Karatzas [40], [41], Karatzas and Shreve [48], Lehoczky and Shreve [54], Shreve, Lehoczky and Gaver [68], Sun [74], and more recently, Ma [58]). In all these results, the diffusion part of the state process is either a Brownian motion or a diffusion with linear drift and constant diffusion coefficients (i.e., b, \sigma = constant). Moreover, the running cost function f is always assumed to be convex in the state variable x, and it can then be shown that the value function is convex in the state variable. In this case, the subset A_t, i.e., the t-section of the set A, is an interval on the real line. The optimal state process is thus a reflected Brownian motion (or a linear diffusion) on this interval, and the optimal singular control is the local time of the state at the endpoints. It is now clear that extending the principle of smooth fit to multi-dimensional cases will face problems like the unknown smoothness of the value function and the free-boundary, and the lack of knowledge of the direction of reflection. Moreover, even if intuitively we may guess -\nabla_x W to be the optimal direction of reflection (since this is the direction of least increase of the value function W), the gradient \nabla_x W may be zero at some points on the boundary, and thus leave the direction of reflection indeterminate.
These are essentially the main difficulties that prevent researchers from applying the principle of smooth fit, which has been very successful for specific 1-dimensional problems, to construct the optimal control for higher dimensional problems, and from showing the singular feature of the optimal control. However, this method is successfully used by Soner and Shreve [71] for a 2-dimensional singular control problem for the Brownian motion with infinite time horizon and discounted convex running cost rate. Their approach uses the gradient flow of the value function W (on \mathbb{R}^2) to change to a more convenient pair of coordinates, and to obtain a more standard free-boundary problem. By using this ingenious device, they characterize the value function W as the unique C^2-solution of the corresponding Hamilton-Jacobi-Bellman equation. They also show the free boundary \partial A to be of class C^{2,\alpha} for any \alpha \in (0, 1). Such smoothness of \partial A is essential in their construction of the optimal process, which is a 2-dimensional Brownian motion reflected along \partial A in the direction -\nabla_x W, and is obtained as the unique solution of a Skorohod problem satisfying the conditions of Lions and Sznitman [57]. As the authors point out, the proof of Soner and Shreve's result makes critical use of the 2-dimensional nature of the problem, and it cannot be extended to more general problems in higher dimensions. However, by analytic methods Soner and Shreve [72] obtain similar results in higher dimensions if the singular control can be exerted only in one direction. A 2-dimensional cheap monotone follower problem is solved in Chiarolla and Haussmann [12], [13] by geometric methods. It is shown there that the optimal control exists and is unique, and the free-boundary is C^2 except at one point. Again the method is specific to dimension two.
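The one-dimensional reflection picture underlying these constructions — a diffusion kept inside the inaction region, with the singular control acting as local time at the boundary — can be illustrated numerically. The following sketch is an illustration only, not taken from the thesis: the interval [lo, hi] stands in for a (here arbitrary, constant) inaction region, and the discrete two-sided Skorokhod map is applied to a random-walk approximation of Brownian motion, the accumulated minimal pushes playing the role of the nondecreasing singular controls.

```python
import random

def reflected_path(increments, x0, lo, hi):
    """Discrete two-sided Skorokhod map: keep the state in [lo, hi] by
    applying, at each step, the minimal push back into the interval.
    Returns the reflected path and the cumulative pushes L (up) and U
    (down), discrete analogues of the singular controls / local times."""
    x, L, U = x0, 0.0, 0.0
    path = [x]
    for dB in increments:
        y = x + dB                  # uncontrolled move
        dL = max(lo - y, 0.0)       # push up at the lower boundary
        dU = max(y - hi, 0.0)       # push down at the upper boundary
        x = y + dL - dU
        L += dL
        U += dU
        path.append(x)
    return path, L, U

random.seed(0)
n, T = 10_000, 1.0
dt = T / n
incs = [random.gauss(0.0, dt ** 0.5) for _ in range(n)]
path, L, U = reflected_path(incs, x0=0.0, lo=-0.5, hi=0.5)
assert all(-0.5 <= x <= 0.5 for x in path)  # state stays in the region
assert L >= 0 and U >= 0                    # controls are nondecreasing
```

The pushes are nonzero only on the steps where the uncontrolled move would exit the interval, mirroring the fact that the optimal singular control increases only when the state is on the boundary and is singular with respect to Lebesgue measure.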
Karatzas and Shreve [48] considered the singular control of a 1-dimensional Brownian motion under a constraint on the singular control variable (a finite-fuel constraint), i.e., the total variation of the singular control variable cannot exceed a given constant. The fuel remaining constitutes a second state variable, and the value function is shown analytically to be C^2 jointly in both state variables. This property enables them to construct an optimal control in the following form: act optimally as if there is no constraint, or in other words, with infinite fuel, until the fuel remaining is down to zero, and then leave the state uncontrolled. Problems of this type can also be found in Beneš, Shepp and Witsenhausen [4], Jacka [37], and Bridge and Shreve [9] (a more recent work on a multi-dimensional problem). Singular control can also be approached as the limit of either impulse controls or absolutely continuous controls. Menaldi and Rofman [64] study a so-called cheap control problem (i.e., c(\cdot) = 0) for an n-dimensional diffusion process with infinite time horizon where only impulse controls are allowed. They obtain the optimal cost as a limit of impulse control problems having a cost for each impulse. The existence of an optimal control is proved only after restricting the problem to a particular subset of impulse controls, which suggests that an optimal control for the problem has to be sought in a much larger set of admissible controls. In fact, Menaldi and Robin [62] prove that the value function of the problem is continuous and is the same over sets of absolutely continuous controls, impulse controls, or pure jump controls as long as they are all of bounded variation on finite intervals. Moreover, in the 1-dimensional case, they prove the existence of a singular optimal control in the class of monotone controls for a nondegenerate diffusion.
The value function is shown to be C^2 and is the maximum solution of the corresponding Hamilton-Jacobi-Bellman equation, when f is assumed to be convex in the state variable, and the state process is assumed to have constant drift and diffusion coefficients. Menaldi and Taksar [65] study a multi-dimensional problem under the same hypothesis on b and \sigma, with a positive cost c(\cdot) = constant for the singular control variable, which enters the state process additively and is of bounded variation. They approximate the value function through absolutely continuous controls by means of penalization, and prove that W is a generalized solution of the corresponding Hamilton-Jacobi-Bellman equation. The existence and uniqueness of optimal control is also established without any requirements of regularity of the free-boundary. It should be pointed out that this approach is used by many authors to establish the regularity of the value functions and to show that the value function is a generalized solution (in a certain sense) of the corresponding variational inequality, or Hamilton-Jacobi-Bellman equation; see Chow, Menaldi and Robin [14], Williams, Chow and Menaldi [80], Zhou [81] among others. The reader may consult Baldursson [2] for conditions under which an approximation by absolutely continuous controls is possible, and Heinricher and Mizel [34] for a counterexample. In Heinricher and Mizel's model the value function obtained by minimizing over the set of controls with bounded variation is strictly smaller than the value function corresponding to absolutely continuous controls. It was first observed by Bather and Chernoff [3] that the 1-dimensional singular control problem has a close connection with an optimal stopping problem.
In fact, it was shown that the space derivative of the value function W coincides with the optimal risk of an appropriate stopping problem, whose optimal continuation region is precisely the region of inaction A of the control problem. This connection between singular stochastic control and optimal stopping was developed rigorously by Karatzas [41], mostly by analytical methods based on properties of solutions to free-boundary problems and variational inequalities. Subsequently, Karatzas and Shreve [46] established the connection between the two problems by using only direct probabilistic arguments. In particular, they proved the existence of an optimal control for the problem of controlling a Brownian motion, in the setting of the monotone follower problem, and they showed the optimal stopping time for the associated stopping problem is exactly the first time the singular control acts, in other words, the first time the singular control is positive. Karatzas and Shreve [47] obtained similar results for bounded variation follower problems and optimal stopping for a Brownian motion with absorption at the origin (see also Baldursson [2]). Other results concerning the control of a Brownian motion and its relationship to optimal stopping problems can be found in El Karoui and Karatzas [20] and Taksar [77]. It should be noticed that control processes are more easily topologized than stopping times, and therefore the proof of existence of an optimal stopping time through this connection usually requires fewer conditions than the approach used in Friedman [24] and van Moerbeke [78]. The equivalence between singular control and optimal stopping problems is also used by Chow, Menaldi and Robin [14] to determine the free boundary \partial A for a problem whose 1-dimensional state process is governed by a linear stochastic differential equation, possibly degenerate, with time dependent coefficients, and finite time horizon.
They approach the control problem by a sequence of absolutely continuous control problems, which enable them to prove that the value function is the unique solution of the corresponding variational inequality. Then they construct the optimal control, which is, in fact, Markovian and whose input produces a reflected diffusion process as the optimal state process. To construct the reflected diffusion, they assume some regularity of the free-boundary. Singular optimal controls have been used to study various types of storage problems, cf. Harrison and Taksar [26], Harrison and Taylor [27], and Taksar [76]. In Martins and Kushner [59] (and also Kushner and Ramachandran [52], Krichagina and Taksar [49] among others) singular control problems for a Brownian motion are used to approximate queuing systems in heavy traffic. We should especially mention the work of Davis and Norman [16] (cf. Shreve and Soner [69] for the same model with fewer hypotheses). In this work they formulate an optimal investment/consumption problem with transaction costs as a 2-dimensional singular control problem. The inaction region (which is called the no-transaction region) is found to be a wedge, hence the results of Varadhan and Williams [79] establish the existence of the optimal control policy, which is a linear mapping of the local time of the reflected diffusion, i.e., of the optimal state process. In this thesis we use probabilistic methods to study the general d-dimensional control problem with the state process satisfying (1.1) and the expected cost function (1.2). We will show that the optimal control exists under some very mild conditions. The dynamic programming principle will be established. Moreover, the value function is shown to be uniformly continuous and is the unique viscosity solution of (1.3). We will define a continuation region for the problem and show that it has some of the features found in the specific problems. The adaptedness of the optimal control to the state will be investigated. As we have pointed out, in the literature singular control problems usually involve only the singular control variable v in (1.1) (exceptions can be found in [16], [54] and [69]). The present work, however, places both classical and singular control problems in a common framework. It includes both problems as special cases by letting g(\cdot) = 0 or U be a singleton. An outline of this thesis is as follows: In Chapter 2 we first recall some model problems which arise from different applications to show the particular features of optimal singular controls. After we formulate the problem, the concept of relaxed control is introduced. The problem is reformulated as an equivalent martingale problem on a canonical space, which simplifies taking limits when we apply the compactification method. The control rules defined in Section 2.4 enable us to consider the cost as a function on the collection of probabilities (i.e., control rules) on the canonical space. A topology for the canonical space is given that makes it a metrizable separable space. In Chapter 3, we apply the compactification method to show that the optimal control for the singular control problem exists. The cost function on the canonical space is defined and is shown to be lower semicontinuous. Moreover, the set of control rules starting from initial points that are in a bounded subset of [0,T] \times \mathbb{R}^d is shown to be compact, and as a result the existence of optimal control is established. The value function is also shown to be Borel measurable. In Chapter 4, we apply the method of conditioning and concatenation, used by Stroock and Varadhan [73] in the construction of solutions to stochastic differential equations, and by Haussmann [29], Haussmann and Lepeltier [30], and El Karoui et al. [18] in the setting of optimal (classical) control problems, to establish an abstract dynamic programming principle.
We also show that there exists a family of optimal controls such that the corresponding optimal state process forms a Markov process. Assuming Lipschitz conditions on the coefficients of the state process, we show in Chapter 5 that the value function is uniformly continuous on [0,T] \times \mathbb{R}^d. Motivated by the work of Lions [55] on classical control problems, we characterize the value function as the unique viscosity solution to the Hamilton-Jacobi-Bellman equation (1.3). Applying the results we obtained in previous chapters, we define in Chapter 6 the continuation region (or, the inaction region), and show that there exists an optimal control in the following form: if the state starts outside the region A, then the singular control variable v brings the state immediately to the closure of A by a jump. The state stays in the closure of A thereafter, and is continuous when it is inside A. This result coincides with the properties of optimal solutions of the specific problems solved in the literature. Chapter 7 is rather independent of the rest of this thesis. In this chapter, we consider the bounded variation control problem, i.e., k = 2d and g(\cdot) = (I, -I) with I the d x d unit matrix. We introduce a random time change which stretches out the time scale. Under this new time scale, the problem is transformed to a new control problem involving only classical controls. The new problem has been studied extensively, cf. Fleming and Rishel [22], Krylov [50], and especially Haussmann [29], Haussmann and Lepeltier [30], where the existence of an optimal Markovian control for the new problem is established under some mild continuity conditions on the coefficients of the state process. Applying this result and transforming the optimal control back to the original singular control problem, we show that an optimal control exists.
Moreover, it is shown that there exists an optimal control in the following form: the control variable u is in Markovian form, i.e., it depends only on the current state of the problem, and the increments of the singular control v during any time interval depend only on the state process in that interval. This type of control will be called a control law. This method gives an explicit way to construct the optimal control for the singular control problem when the optimal control to the new problem, which is relatively well understood, is known, e.g., by the maximum principle, dynamic programming, etc. (see Fleming and Rishel [22], Haussmann [28], Krylov [50]). However, we do not know whether the optimal state process is a reflected diffusion in some region. Finally, we list some notation that will be used throughout this thesis:

• \mathbb{R}^d, \mathbb{R} denote the d-dimensional Euclidean space and the real line respectively. \mathbb{R}_+ = \{x \in \mathbb{R} : x \ge 0\}, and \mathbb{R}^k_+ is defined similarly. For x = (x_i), y = (y_i) \in \mathbb{R}^d, x \cdot y = (x, y) = \sum_{i=1}^d x_i y_i, and |\cdot| denotes the Euclidean norm.

• T > 0 is the fixed horizon, and \Sigma = [0,T] \times \mathbb{R}^d. U is a compact metric space.

• C[0,T] denotes the collection of real-valued continuous functions defined on [0,T], and C^l[0,T] denotes the collection of \mathbb{R}^l-valued continuous functions defined on [0,T].

• D^d[0,T] denotes the collection of \mathbb{R}^d-valued functions defined on [0,T] that are left continuous and have right limits (i.e., lcrl functions).

• A^k[0,T] denotes the collection of functions a = (a^i) \subset D^k[0,T] such that each a^i is nondecreasing with a^i(0) = 0, i = 1, \ldots, k.

• \mathbb{R}^{l \times k} is the space of l x k matrices with the l x k-dimensional Euclidean norm.

• If Y is a metric space and \mathcal{B}(Y) denotes the corresponding Borel \sigma-field, then f \in \mathcal{B}(Y), f \in b\mathcal{B}(Y) mean that f is a \mathcal{B}(Y)-measurable and a bounded \mathcal{B}(Y)-measurable real-valued function respectively.
• We denote by M_+(Y), M_1(Y) the space of nonnegative Radon measures and the space of probabilities on Y, respectively. For any bounded function \phi: Y \to \mathbb{R}, we can extend \phi to M_1(Y) by \phi(\mu) \equiv \int_Y \phi(y)\,\mu(dy) for each \mu \in M_1(Y).

• If X is a random variable on a probability space (\Omega, \mathcal{F}, P), the expectation of X will be denoted by E^P(X).

• M^2 (M^{loc}) is the family of continuous square integrable martingales (continuous local martingales, respectively) on some given probability space (\Omega, \mathcal{F}, P) with a filtration \{\mathcal{F}_t\}.

Chapter 2

Formulation of the Problem

2.1 Introduction

In this chapter, we first give three model problems that arise from different applications, including stochastic decision and finance, and that have been solved explicitly. These problems suggest the general formulation of the stochastic control problem, which is presented in Section 2.3 and studied in this thesis. We also prove that the control problem is equivalent to the relaxed control problem; this simplifies taking limits when applying the compactification method to show the existence of optimal controls. As a consequence, the control problem is reformulated as a martingale problem in Section 2.4. When an appropriate canonical space is chosen, the concept of control rule is introduced. This enables us to consider the expected cost as a function defined on the space of probabilities on the canonical space. A topology on the canonical space is given in Section 2.5 to make it a separable metrizable space.

2.2 Model problems

We will present in this section some model problems which have been solved explicitly in the literature. The solutions to these problems provide us with an intuitive idea about the basic features of singular control problems.

Example 1. (Karatzas [40]) Monotone follower in stochastic decision. The problem is to optimally track a 1-dimensional Brownian motion B_t by a nondecreasing process v_t, adapted to the past of B, on a probability space (\Omega, \mathcal{F}, P).
The state is defined as

x_t = x + B_t - v_t, \quad 0 \le t \le T,

and the aim is to minimize the expected cost

J(v) = E\left\{ \int_0^T f(x_t)\,dt \right\}.

This decision problem can be reduced, through formal dynamic programming, to a free-boundary problem in partial differential equations. Let the value function be defined as

W(t, x) = \inf_v E\left\{ \int_0^t f(x_s)\,ds \right\},

where t is the time-to-go. Then the formal Hamilton-Jacobi-Bellman equation is a variational inequality:

\max\left\{ W_t - \tfrac{1}{2} W_{xx} - f(x),\; -W_x \right\} = 0 \quad \text{for } (t, x) \in \Sigma,

and W(0, x) = 0 for x \in \mathbb{R}. It turns out that under the assumption

k |x|^m \le f''(x) \le K |x|^m, \quad \forall x \in \mathbb{R},

for some constants 0 < k \le K and integer m \ge 0, there is a classical solution W \in C^{1,2} and a moving free-boundary, which is a C^1-curve on [0,T], and separates the do-nothing, or inactive, region from the active region in \Sigma. The optimal control v has the following form: if the initial state x is outside the inactive region, apply a jump to bring the state immediately to the free-boundary, and apply control thereafter only when the state hits the free-boundary, so as to keep it inside the inactive region. The optimal control v (after time 0) is thus the local time of the optimal state, which is a reflected Brownian motion, at the free-boundary, and is thus singularly continuous with respect to Lebesgue measure.

Example 2. (Chow, Menaldi and Robin [14]) Additive control for a linear system. Consider the 1-dimensional linear stochastic differential equation, with the control v, a process of bounded variation, entering additively into the system:

dx_t = (a(t) x_t + b(t))\,dt + \sigma(t)\,dB_t + dv_t, \quad t > s; \quad x_s = x.

The expected cost takes the form

J_{s,x}(v) = E\left\{ \int_s^T f(\theta, x_\theta)\,d\theta + \int_{[s,T)} c(\theta)\,d|v|_\theta \right\},

where |v|_t denotes the total variation function of v_t.
Under certain assumptions, especially the convexity of f(t, \cdot), the value function W(s, x) is characterized as the unique generalized solution of the variational inequality

\min\{ LW + f(s, x),\; W_x + c(s),\; -W_x + c(s) \} = 0 \quad \text{on } \Sigma,

where

LW = W_s + \tfrac{1}{2}\sigma^2(s) W_{xx} + (a(s)x + b(s)) W_x.

The moving boundary consists of two branches x_+(s) and x_-(s), which are given by

x_-(s) = \inf\{x : W_x(s, x) + c(s) > 0\}, \quad x_+(s) = \inf\{x : W_x(s, x) - c(s) \ge 0\}.

Moreover, if it is assumed that x_\pm(\cdot) are finite and continuous on [0,T], then the optimal state process is a reflected diffusion in the moving interval (x_-(s), x_+(s)), except possibly for an initial jump.

Example 3. (Davis and Norman [16], Shreve and Soner [69]) Investment and consumption model with transaction costs. Davis and Norman [16] (see also Shreve and Soner [69] for the same problem under more relaxed conditions) consider an optimal investment/consumption model in which a single agent consumes and distributes his wealth between two assets, a bond and a stock, with transaction charges equal to a fixed percentage of the amount transacted. The agent's portfolio (s_0(t), s_1(t)), where s_0(t), s_1(t) stand for the amount of bond and stock respectively, evolves according to

ds_0(t) = (r s_0(t) - u(t))\,dt - (1 + \lambda)\,dL_t + (1 - \mu)\,dU_t, \quad s_0(0) = x,
ds_1(t) = \alpha s_1(t)\,dt + \sigma s_1(t)\,dW_t + dL_t - dU_t, \quad s_1(0) = y,

where 0 < \lambda, \mu < 1 and \sigma, \alpha, r > 0 are constants, x + (1 - \mu)y > 0 and x + (1 + \lambda)y > 0, and u(t) \ge 0. Here (u, L, U) is called a control policy, with consumption rate u, cumulative stock acquisition L, and cumulative stock sales U; L, U are nondecreasing processes. The problem is to choose (u, L, U) to maximize the utility function

J(u, L, U) = E \int_0^\infty e^{-\delta t} f(u_t)\,dt,

subject to

(s_0(t), s_1(t)) \in S \equiv \{(x, y) \in \mathbb{R}^2 : x + (1 - \mu)y > 0,\; x + (1 + \lambda)y > 0\}, \quad 0 \le t < \infty,

where \delta > 0 is a constant, and f is assumed to be of the form f(u) = u^\gamma / \gamma\ (\gamma < 1, \gamma \ne 0) or f(u) = \log u.
The problem is reduced to the corresponding free-boundary problem, which can be shown to have a unique solution, and the optimal state process is found explicitly as follows: if the state starts from $(x,y) \in NT$ ($NT$, a wedge that can be found explicitly, is called the non-transaction region), do not use $L$, $U$, and solve the problem as in the classical control problem (the control variable is $u(t)$); when the state reaches the boundary $\partial NT$, apply $L$, $U$ as much as necessary to prevent it from leaving $NT$. When the state starts from $S \setminus NT$, use $L$, $U$ immediately to make the state jump to the boundary of $NT$ along appropriate directions. Hence the optimal process is a diffusion reflected in a wedge (after time 0), and the optimal singular controls $L$ and $U$ are the local times of the optimal state process at the lower and upper boundaries of the wedge $NT$.

2.3 Statement of the problem

We consider in this thesis the following optimal control problem, in which we allow both classical control and singular control to act at the same time. The dynamics are of the form
$$x_t = x + \int_s^t b(\theta, x_\theta, u_\theta)\,d\theta + \int_s^t \sigma(\theta, x_\theta, u_\theta)\,dB_\theta + \int_{[s,t)} g(\theta)\,dv_\theta, \quad \text{a.s.}, \qquad (2.1)$$
for $s \le t \le T$, where

• $u_\theta \in U$, $s \le \theta \le T$, and $U$, called the control set, is a compact metric space;

• $(\sigma, b): \bar\Sigma \times U \to \mathbb{R}^{d \times d} \times \mathbb{R}^d$ and $g: [0,T] \to \mathbb{R}^{d \times k}$ ($k > 0$ is a fixed integer); $\sigma(t,x,u)$, $b(t,x,u)$ are measurable, bounded, and continuous with respect to $(x,u)$; $g(t)$ is continuous on $[0,T]$;

• $(B_t,\ 0 \le t \le T)$ is a $d$-dimensional Brownian motion on some probability space;

• $v \in A^k[0,T]$.

We introduce the concept of controls for the stochastic differential equation (2.1).

Definition 2.1 A control is a term
$$\alpha = (\Omega, F, F_t, P, B_t, x_t, u_t, v_t, s, x)$$
such that

(C1) $(s,x) \in \bar\Sigma$;

(C2) $(\Omega, F, P)$ is a probability space with the filtration $\{F_t\}_{t \ge 0}$;

(C3) $u_t$ is a $U$-valued process, progressively measurable with respect to $\{F_t\}_{t \ge 0}$;

(C4) $v_t$ is an $\mathbb{R}^k$-valued process, progressively measurable with respect to $\{F_t\}_{t \ge 0}$.
The sample paths of $v$ are in $A^k[0,T]$, i.e., for each $\omega \in \Omega$, $v_.(\omega) \in A^k[0,T]$;

(C5) $B_t$ is a standard $d$-dimensional Brownian motion on $(\Omega, F, F_t, P)$, and $x_t$, the state process, is $F_t$-adapted with sample paths in $V^d[0,T]$, and such that (2.1) is satisfied. We assume that $x_r = x$ for $0 \le r \le s$.

We call $(s,x)$ the initial condition of the control $\alpha$. The collection of controls with initial condition $(s,x)$ is denoted by $A_{s,x}$. It is well known from the theory of stochastic differential equations that, under the above conditions, the set $A_{s,x}$ is nonempty for each fixed $(s,x)$ (e.g., take $u$ and $v$ to be constant). The cost corresponding to the control $\alpha$ is defined to be
$$J(\alpha) = E\Big\{\int_s^T f(t, x_t, u_t)\,dt + \int_{[s,T)} c(t) \cdot dv_t\Big\}, \qquad (2.2)$$
where

• $f: \bar\Sigma \times U \to \mathbb{R}$ is a measurable function, lower semicontinuous in $(x,u)$, satisfying
$$-K \le f(t,x,u) \le C(1 + \|x\|^m), \qquad (t,x,u) \in \bar\Sigma \times U,$$
for some constants $m \ge 0$, $C \ge 0$, and $K \ge 0$;

• $c = (c_t): [0,T] \to \mathbb{R}^k$ is lower semicontinuous and $c_t^i > 0$, $1 \le i \le k$.

Throughout this work we write
$$\int_{[s,t)} k(\theta) \cdot da(\theta) = \sum_{i=1}^k \int_{[s,t)} k_i(\theta)\,da_i(\theta)$$
for any $\mathbb{R}^k$-valued Borel measurable functions $k = (k_i)$ and $a = (a_i) \in A^k[0,T]$. For $v \in A^k[0,T]$, define
$$G_t(v) = \begin{cases} \int_{[s,t)} g(\theta)\,dv(\theta), & s < t \le T, \\ 0, & 0 \le t \le s. \end{cases} \qquad (2.3)$$
It can be verified easily that $G_.(v) \in V^d[0,T]$. The value function of the problem is defined, for $(s,x) \in \bar\Sigma$, by
$$W(s,x) = \inf_{\alpha \in A_{s,x}} J(\alpha). \qquad (2.4)$$
A control $\alpha \in A_{s,x}$ is called an optimal control if $W(s,x) = J(\alpha)$.

Remark 2.2 This model includes the classical control problem as a special case, i.e., $g(\cdot) \equiv 0$. When $k = 2d$ and $g(\cdot) \equiv (I, -I)$, with $I$ the $d \times d$ unit matrix, it is called the bounded variation control problem. When $k = d$ and $g(\cdot) \equiv I$, it is often called a monotone follower problem.

Remark 2.3 In our model, the function $g(\cdot)$ is fixed. A different formulation of the problem can be found in Fleming and Soner [23], Chapter 8. For a comparison, see the discussion in Section 7.5.

Remark 2.4 Note that there is no terminal cost in our model.
Due to the way we choose the topology on the canonical space, this method cannot treat the case with terminal cost. We will return to this point at the end of Chapter 3.

Remark 2.5 We take the singular control variable, and thus the state process, to be left continuous. This seems a natural choice for singular control problems, because the dynamic programming principle (Theorem 4.10) then takes a simple form. The reader may compare Theorem 4.10 with Menaldi and Robin [63], Theorem 1.3, which gives the dynamic programming principle for a one-dimensional problem with a right continuous singular control variable.

In order to apply the compactification method, we now reformulate the problem. Since the Brownian motion in the definition of controls is unknown in advance, we can reformulate the control problem as an equivalent martingale problem. This simplifies taking limits. In fact, let
$$\mathcal{L}\phi = \frac{1}{2}\sum_{i,j=1}^d a_{ij}\frac{\partial^2 \phi}{\partial x_i \partial x_j} + \sum_{i=1}^d b_i \frac{\partial \phi}{\partial x_i},$$
where $a = \sigma\sigma^*$. Then we can show that $\alpha \in A_{s,x}$ if and only if $\alpha$ satisfies (C1), (C2), (C3), (C4), and

(C5') $x_t$ is an $F_t$-adapted process with sample paths in $V^d[0,T]$ such that

• $P(x_t = x,\ u_t = u^0,\ v_t = 0,\ 0 \le t \le s) = 1$ for some arbitrary but fixed $u^0 \in U$;

• $\forall \phi \in C_b^2(\mathbb{R}^d)$, $M_t\phi$ ($s \le t \le T$) is in $\mathcal{M}$, i.e., $M_t\phi$ is a continuous square integrable martingale on the filtered probability space $(\Omega, F, F_t, P)$, where
$$M_t\phi(\omega) = \phi(x_t) - \int_s^t \mathcal{L}\phi(\theta, x_\theta, u_\theta)\,d\theta - \int_s^t \nabla\phi(x_\theta) \cdot g(\theta)\,dv_\theta - \sum_{s \le \theta < t}\big[\phi(x_{\theta+}) - \phi(x_\theta) - \nabla\phi(x_\theta) \cdot \Delta x_\theta\big]. \qquad (2.5)$$

Therefore we can delete the term $B$ from the notation of a control. The proof of the equivalence of the existence of weak solutions to a stochastic differential equation and the existence of solutions to the corresponding martingale problem, given in Proposition IV-2.1 of Ikeda and Watanabe [35], also works here despite the extra term $G_.(v)$.

Next we introduce the concept of relaxed controls, which gives a more suitable topological structure when applying the compactification method.
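The Stieltjes integrals $\int \nabla\phi(x_\theta)\cdot g(\theta)\,dv_\theta$ in (2.5), and $G_t(v)$ in (2.3), reduce to plain sums when the singular control $v$ is piecewise constant, i.e., acts through isolated jumps. The sketch below is an illustration of this reduction in the scalar case $k = d = 1$ only (the function name and the dictionary representation of the jumps are ours, not the thesis's):

```python
def stieltjes_integral(g, jumps):
    """∫_[0,T) g(θ) dv(θ) for a scalar, piecewise-constant, nondecreasing v,
    represented by its jumps {time: size}: the integral is the sum of
    g evaluated at each jump time, weighted by the jump size."""
    return sum(g(t) * dv for t, dv in jumps.items())
```

For example, with $g(t) = 2t$ and $v$ jumping by $0.5$ at $t=1$ and by $1$ at $t=2$, the integral is $2\cdot1\cdot0.5 + 2\cdot2\cdot1 = 5$.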
In a relaxed control problem, the $U$-valued process $\{u_t\}$ is replaced by an $\mathbb{M}_1(U)$-valued process $\{\mu_t\}$, where $\mathbb{M}_1(U)$ is the space of probability measures on $U$ endowed with the topology of weak convergence. $\mathbb{M}_1(U)$ is also a compact metrizable space. If $\phi: U \to \mathbb{R}$ is a bounded measurable function, then we extend $\phi$ to $\mathbb{M}_1(U)$ by letting
$$\phi(\mu) = \int_U \phi(u)\,\mu(du).$$

Definition 2.6 $\bar\alpha = (\Omega, F, F_t, P, x_t, \mu_t, v_t, s, x)$ is called a relaxed control if it satisfies the conditions (C1), (C2), (C4), (C5'), and

(C3') $\mu_t$ is $\mathbb{M}_1(U)$-valued, progressively measurable with respect to $\{F_t\}_{t \ge 0}$.

Remark 2.7 Note that we never work with $\sigma(t,x,\mu)$. Instead, we work with $a(t,x,\mu)$ through the above martingale formulation. It is not true in general that $\sigma(t,x,\mu)\sigma(t,x,\mu)^* = a(t,x,\mu)$; cf. El Karoui et al [18].

The collection of relaxed controls starting from time $s$ with the initial state $x$ is denoted by $\bar A_{s,x}$. Note that $A_{s,x}$ can be imbedded into $\bar A_{s,x}$ by letting $\mu_t(du) \equiv \delta_{u_t}(du)$, where $\delta_u$ denotes the Dirac measure at the point $u$. Hence for every $\alpha \in A_{s,x}$ there exists an $\bar\alpha \in \bar A_{s,x}$ such that $J(\bar\alpha) = J(\alpha)$, and therefore
$$\inf_{\bar\alpha \in \bar A_{s,x}} J(\bar\alpha) \le \inf_{\alpha \in A_{s,x}} J(\alpha). \qquad (2.6)$$
In order to get the reverse inequality, we define for each $(t,x) \in \bar\Sigma$
$$K(t,x) \equiv \{(a(t,x,u),\, b(t,x,u),\, z) : z \ge f(t,x,u),\ u \in U\}, \qquad (2.7)$$
a subset of $S^d \times \mathbb{R}^d \times \mathbb{R}$.

Proposition 2.8 Assume that $K(t,x)$ is convex for each $(t,x) \in \bar\Sigma$. Then for every $\bar\alpha \in \bar A_{s,x}$ there exists an $\alpha \in A_{s,x}$ such that $J(\alpha) \le J(\bar\alpha)$.

Proof. We first show that $K(t,x)$ is closed for each $(t,x) \in \bar\Sigma$. For $u_n \in U$ and $z_n \ge f(t,x,u_n)$, assume
$$(a(t,x,u_n),\, b(t,x,u_n),\, z_n) \to (a, b, z)$$
as $n \to \infty$. Since $U$ is compact, there exists $u_0 \in U$ such that (possibly extracting a subsequence) $u_n \to u_0$. Note that $a(t,x,\cdot)$, $b(t,x,\cdot)$ are continuous on $U$, therefore
$$(a(t,x,u_n),\, b(t,x,u_n)) \to (a(t,x,u_0),\, b(t,x,u_0))$$
as $n \to \infty$. From the assumption that $f(t,x,\cdot)$ is lower semicontinuous on $U$, we have
$$z = \lim_{n\to\infty} z_n \ge \liminf_{n\to\infty} f(t,x,u_n) \ge f(t,x,u_0).$$
Therefore $(a,b,z) \in K(t,x)$, i.e., $K(t,x)$ is closed. Let $\bar\alpha = (\Omega, F, F_t, P, x_t, \mu_t, v_t, s, x)$, and define $l: [0,T] \times \Omega \to S^d \times \mathbb{R}^d \times \mathbb{R}$ by
$$l(t,\omega) = \big(a(t, x_t(\omega), \mu_t(\omega)),\ b(t, x_t(\omega), \mu_t(\omega)),\ f(t, x_t(\omega), \mu_t(\omega))\big) = \int_U (a, b, f)(t, x_t(\omega), u)\,\mu_t(\omega, du). \qquad (2.8)$$
Since $K(t,x)$ is closed and convex, it follows that $l(t,\omega) \in K(t, x_t(\omega))$ for all $(t,\omega)$. It can be easily verified that $l$ is progressively measurable with respect to $F_t$. By a measurable selection theorem (see Theorem A.9 of Haussmann and Lepeltier [30]) there exist progressively measurable processes $u_t$ and $d_t$, which are $U$- and $\mathbb{R}_+$-valued respectively, such that for each $(t,\omega)$,
$$l(t,\omega) = \big(a(t, x_t(\omega), u_t(\omega)),\ b(t, x_t(\omega), u_t(\omega)),\ f(t, x_t(\omega), u_t(\omega)) + d_t(\omega)\big).$$
Now we define
$$\alpha = (\Omega, F, F_t, P, x_t, u_t, v_t, s, x).$$
By definition we know that $\alpha \in A_{s,x}$ (use (C5') rather than (C5)). Moreover,
$$J(\bar\alpha) = E\Big\{\int_s^T f(t, x_t, u_t)\,dt + \int_{[s,T)} c(t) \cdot dv_t\Big\} + E\int_s^T d_t\,dt \ge E\Big\{\int_s^T f(t, x_t, u_t)\,dt + \int_{[s,T)} c(t) \cdot dv_t\Big\} = J(\alpha).$$
The proof is thus complete.

By (2.6) and Proposition 2.8 we know that when $K(t,x)$ is convex,
$$\inf_{\bar\alpha \in \bar A_{s,x}} J(\bar\alpha) = \inf_{\alpha \in A_{s,x}} J(\alpha).$$
Moreover, if the infimum over relaxed controls is attained, then so is the infimum over ordinary controls. Hence we can restrict our attention to the case of relaxed controls if $K(t,x)$ is convex. If the convexity fails, our existence results pertain only to relaxed controls.

2.4 Control rules

Now we choose a canonical space to simplify the arguments in the compactification method. Define
$$\mathbb{U} \equiv \{\mu: [0,T] \to \mathbb{M}_1(U) \text{ Borel measurable}\}.$$
Consider the canonical space
$$X \equiv V^d[0,T] \times \mathbb{U} \times A^k[0,T].$$
All the above spaces are equipped with appropriate topologies, which we will discuss later. Let $\mathcal{D}$, $\mathcal{U}$, $\mathcal{A}$ denote their Borel $\sigma$-fields, and $\mathcal{D}_t$, $\mathcal{U}_t$, $\mathcal{A}_t$ the $\sigma$-fields up to time $t$, i.e., the Borel $\sigma$-fields generated by all the functions in $V^d[0,T]$, $\mathbb{U}$, $A^k[0,T]$ which are constant on $[t,T]$, respectively. Their precise definitions will be given in the next section. Let
$$\mathcal{X} \equiv \mathcal{D} \times \mathcal{U} \times \mathcal{A}, \qquad \mathcal{X}_t \equiv \mathcal{D}_t \times \mathcal{U}_t \times \mathcal{A}_t.$$
Definition 2.9 A control rule is a probability $R$ on the measurable space $(X, \mathcal{X})$ such that
$$\bar\alpha = (X, \mathcal{X}, \mathcal{X}_t, R, x_t, \mu_t, v_t, s, x)$$
is a relaxed control, where $x_t(\omega) = x_t$, $\mu_t(\omega) = \mu_t$, $v_t(\omega) = v_t$ for $\omega = (x_., \mu_., v_.) \in X$; i.e.,

1. $(X, \mathcal{X}, R)$ is a probability space with the filtration $\{\mathcal{X}_t\}$, $(s,x) \in \bar\Sigma$, and
$$R(x_r = x,\ \mu_r = \delta_{u^0},\ v_r = 0,\ 0 \le r \le s) = 1;$$

2. $\forall \phi \in C_b^2(\mathbb{R}^d)$, $M_t\phi$ ($s \le t \le T$) is in $\mathcal{M}$ on the filtered probability space $(X, \mathcal{X}, \mathcal{X}_t, R)$, where
$$M_t\phi(\omega) = \phi(x_t) - \int_s^t \mathcal{L}\phi(\theta, x_\theta, \mu_\theta)\,d\theta - \int_s^t \nabla\phi(x_\theta) \cdot g(\theta)\,dv_\theta - \sum_{s \le \theta < t}\big[\phi(x_{\theta+}) - \phi(x_\theta) - \nabla\phi(x_\theta) \cdot \Delta x_\theta\big]. \qquad (2.9)$$

Let $J(s,R) \equiv J(\bar\alpha)$. We denote by $\mathcal{R}_{s,x}$ the space of control rules for which the above relaxed control starts from time $s$ with initial state $x$. We will suppress $s$ in $J(s,R)$ when it is clear that $R \in \mathcal{R}_{s,x}$. Now the control problem can be described completely in terms of control rules.

Proposition 2.10 Let $\bar\alpha = (\Omega, F, F_t, P, x_t, \mu_t, v_t, s, x)$ be a relaxed control; then there exists a control rule $R \in \mathcal{R}_{s,x}$ such that $J(R) = J(\bar\alpha)$.

Proof. The proof of this result is standard: define a map $\Phi: \Omega \to X$ by
$$\Phi(\omega) \equiv (x_.(\omega), \mu_.(\omega), v_.(\omega)).$$
This map is measurable, and $F_t^{x,\mu,v} \subset \Phi^{-1}(\mathcal{X}_t) \subset F_t$, where $F_t^{x,\mu,v}$ is the filtration generated by $x$, $\mu$, $v$. We can show that
$$\alpha' = (\Omega, F, \Phi^{-1}(\mathcal{X}_t), P, x_t, \mu_t, v_t, s, x)$$
is also a relaxed control and is such that $J(\alpha') = J(\bar\alpha)$; see Haussmann and Lepeltier [30], Theorem 3.13. Let $R \equiv P \circ \Phi^{-1}$, which is a probability measure on $(X, \mathcal{X})$. It is easily seen that $(X, \mathcal{X}, \mathcal{X}_t, R, x_t, \mu_t, v_t, s, x)$ is a relaxed control satisfying the requirements in the Proposition.

2.5 The topology on the canonical space

In this section, we define the topologies on the spaces $V^d[0,T]$, $\mathbb{U}$ and $A^k[0,T]$.

The space $V^d[0,T]$. We first give a topology on $V^d[0,\infty)$, the collection of lcrl functions (i.e., left continuous and having right limits) on $\mathbb{R}_+$ taking values in $\mathbb{R}^d$. Then we can take $V^d[0,T]$ as a subset of $V^d[0,\infty)$ by extending each $x \in V^d[0,T]$ to an $x' \in V^d[0,\infty)$ through
$$x'(t) = \begin{cases} x(t), & 0 \le t < T, \\ x(T), & t \ge T, \end{cases}$$
and consider the induced topology on $V^d[0,T]$.
Define a measure $\lambda(\cdot)$ on the Borel subsets of $\mathbb{R}_+$ by $\lambda(dt) = e^{-t}\,dt$; note that $\lambda$ is a probability measure on $\mathbb{R}_+$. For a Borel measurable $f: \mathbb{R}_+ \to \mathbb{R}^d$, the image of the measure $\lambda(\cdot)$ under the mapping $t \mapsto (t, f(t))$ is called the pseudo-path of the function $f$, and is denoted by $\Psi(f)$. It is a probability law on $[0,\infty] \times [-\infty, +\infty]^d$. It is clear that $\Psi$ identifies two functions if and only if they are equal almost everywhere in the Lebesgue sense; in particular, $\Psi$ is one-to-one on $V^d[0,\infty)$. Thus it provides us with an imbedding of $V^d[0,\infty)$ into the compact Polish space $\mathcal{P}$ of all the probabilities on $[0,\infty] \times [-\infty,+\infty]^d$ (with the topology of weak convergence). The topology that $\mathcal{P}$ induces on $V^d[0,\infty)$ via the mapping $\Psi$ is called the pseudo-path topology, and makes $V^d[0,\infty)$ a separable metrizable space. The associated Borel $\sigma$-algebra on $V^d[0,\infty)$ is the same as the one we get from the Skorohod topology. In fact, Lemma 1 in [66] tells us that convergence in the pseudo-path topology is just convergence in measure. Let $\mathcal{D}$ denote the Borel $\sigma$-field on $V^d[0,\infty)$, and $\mathcal{D}_t$ the Borel $\sigma$-field up to time $t$, i.e., the $\sigma$-field generated by all the functions in $V^d[0,\infty)$ that are constant after $t$. Then we know that
$$\mathcal{D}_t = \sigma\{x(\theta): 0 \le \theta \le t\}, \qquad 0 \le t \le T.$$

Now we introduce some notation. For $x \in V^d[0,\infty)$, define $x^* = \sup_t \|x(t)\|$. For $u = (u_i)$, $v = (v_i) \in \mathbb{R}^d$, $u < v$ means $u_i < v_i$ ($1 \le i \le d$). Let
$$N_u^v(x) = \sum_{i=1}^d N_{u_i}^{v_i}(x_i),$$
where $N_{u_i}^{v_i}(x_i)$ denotes the number of upcrossings of $x_i(\cdot)$ on $[0,\infty)$ between the levels $u_i$ and $v_i$. Then a subset $A \subset V^d[0,\infty)$ such that
$$\sup_{x \in A} x^* < \infty, \qquad \sup_{x \in A} N_u^v(x) < \infty \quad \text{for any } u < v, \qquad (2.10)$$
is relatively compact in $V^d[0,\infty)$ with the pseudo-path topology. For details, see Meyer and Zheng [66].

Assume $(\Omega, F, P)$ is a probability space with filtration $\{F_t\}_{t \ge 0}$, and $X$ is an lcrl adapted process with $E\|X_t\| < \infty$ for each $t < \infty$. By the pseudo-law of the process $X$ we mean the image law of $P$ under the mapping $\omega \mapsto \Psi(X_.(\omega))$.
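The upcrossing numbers appearing in the relative-compactness criterion (2.10) are easy to compute for a discretely sampled scalar path. The sketch below is illustrative (the function name is ours); a completed upcrossing counts one passage from at or below $u$ up to at or above $v$:

```python
def upcrossings(path, u, v):
    """Number of upcrossings of the pair of levels u < v by a real sequence:
    completed passages from (-inf, u] up to [v, +inf)."""
    count, below = 0, False
    for x in path:
        if x <= u:
            below = True          # the path has visited level u or lower
        elif x >= v and below:
            count += 1            # ...and has now climbed to v or higher
            below = False
    return count
```

A path that oscillates between the levels many times has a large upcrossing count, which is exactly what (2.10) rules out uniformly over a relatively compact set.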
Let $\tau$ be a finite subdivision $0 = t_0 < t_1 < \cdots < t_n$, and define the conditional variations
$$\mathrm{Var}_\tau(X) = \sum_{i < n} E\big\|E[X_{t_{i+1}} - X_{t_i} \mid F_{t_i}]\big\| + E\|X_{t_n}\|, \qquad \mathrm{Var}(X) = \sup_\tau \mathrm{Var}_\tau(X).$$
If $\mathrm{Var}(X) < \infty$, then $X$ is said to be a quasimartingale. For a martingale $X$, $\mathrm{Var}(X) = \sup_t E\|X_t\|$. For a quasimartingale $X$, it can be shown that
$$c\,P(X^* \ge c) \le \mathrm{Var}(X), \qquad E\,N_u^v(X) \le \sum_{i=1}^d \frac{|u_i| + \mathrm{Var}(X^i)}{v_i - u_i}$$
for $u = (u_i)$, $v = (v_i) \in \mathbb{R}^d$ with $u < v$.

We state the main results about the pseudo-path topology in the following theorem.

Theorem 2.11 (a) Let $\{P_n\}$ be a sequence of probability laws on the Borel subsets of $V^d[0,\infty)$ such that under each $P_n$ the coordinate process $X$ is a quasimartingale with conditional variation $\mathrm{Var}(X)$ uniformly bounded in $n$. Then there exists a subsequence $\{P_{n_k}\}$ which converges weakly on $V^d[0,\infty)$ to a law $P$, and $X$ is a quasimartingale under $P$.

(b) Let $(X_t^n)$, $(X_t)$ be measurable processes on the probability space $(\Omega, F, P)$ such that the pseudo-law of $X^n$ converges to that of $X$. Then there exist a subsequence $(X^{n_k})$ and a set $I$ of full Lebesgue measure such that the finite dimensional distributions of $(X_t^{n_k})_{t \in I}$ converge to those of $(X_t)_{t \in I}$. Let $f$ be a bounded continuous function on $(\mathbb{R}^d)^k$; then the function
$$(t_1, t_2, \ldots, t_k) \mapsto E\big[f(X_{t_1}^n, X_{t_2}^n, \ldots, X_{t_k}^n)\big]$$
converges in measure to the corresponding function relative to $(X_t)$ as $n \to \infty$.

Proof. See Meyer and Zheng [66].

Remark 2.12 The corresponding results on $V^d[0,T]$ can be stated similarly.

Remark 2.13 As pointed out by Meyer and Zheng [66], $V^d[0,\infty)$ with the pseudo-path topology is not a Polish space. But from the definition we know that it is homeomorphic to a subspace of the Polish space $\mathcal{P}$, and hence is a separable metric space.

The space $\mathbb{U}$. $\mathbb{U}$ is the space of measurable transformations $\mu: [0,T] \to \mathbb{M}_1(U)$ endowed with the stable topology, which is defined as follows. Define, for $A \in \mathcal{B}([0,T])$ and $B \in \mathcal{B}(U)$,
$$\bar\mu(A \times B) \equiv \int_A \mu_t(B)\,dt.$$
Then $\bar\mu$ can be extended uniquely to an element of $\mathbb{M}_+([0,T] \times U)$, the space of nonnegative Radon measures on $[0,T] \times U$.
The stable topology on $\mathbb{U}$ is the weakest topology which renders continuous the mappings
$$\mu \mapsto \int_0^T \int_U \phi(t,u)\,\mu_t(du)\,dt$$
for all bounded measurable functions $\phi(t,u)$ which are continuous in $u$. Under this topology, $\mathbb{U}$ is a compact separable metrizable space. $\mathbb{U}$ is also endowed with its Borel $\sigma$-field $\mathcal{U}$, which is the smallest $\sigma$-field such that the mappings
$$\mu \mapsto \int_0^T \int_U \mu_t(du)\,f(t,u)\,dt$$
are measurable, where $f$ is a bounded measurable function continuous with respect to the variable $u$. The filtration $\{\mathcal{U}_t\}$ is the $\sigma$-field generated by $\{1_{[0,t]}\mu : \mu \in \mathbb{U}\}$. From the definition of the stable topology, we know that $\mathcal{U}_t$ is generated by the sets of the form
$$\Big\{\mu : \int_0^{s} \mu_\theta\,d\theta \in B\Big\}$$
with $s \le t$ and $B$ a Borel set in $\mathbb{M}(U)$. For more details, see Haussmann and Lepeltier [30].

The space $A^k[0,T]$. Let $V$ be the collection of functions $v: [0,T] \to \mathbb{R}$ such that each $v(\cdot)$ is of bounded variation and left continuous; we assume $v(0) = 0$. We first consider a topology on $V$. Let $\mathbb{M}[0,T]$ be the collection of signed Radon measures on $[0,T]$. Then there is a one-to-one correspondence between $V$ and $\mathbb{M}[0,T]$: for $a \in V$, let
$$\nu_a([s,t)) \equiv a(t) - a(s), \qquad 0 \le s \le t \le T.$$
So we need only consider a topology on $\mathbb{M}[0,T]$, and can then take the induced topology on $V$. Denote by $C[0,T]$ the collection of real valued continuous functions on $[0,T]$. It is well known that $\mathbb{M}[0,T]$, and therefore $V$, is the dual space of $C[0,T]$ with the supremum norm
$$\|f\| = \sup_{0 \le t \le T} |f(t)|, \qquad f \in C[0,T],$$
and the corresponding weak*-topology on $\mathbb{M}[0,T]$ is the topology induced by the weak convergence of measures. It can be easily seen that the measures of the form $\sum_{i=1}^N a_i \delta_{x_i}$, with $N$ finite and $a_i$, $x_i$ rational, comprise a countable dense subset of $\mathbb{M}[0,T]$; therefore $\mathbb{M}[0,T]$, with the weak*-topology, is separable. Let
$$A^0 = \{a \in V : a \text{ is nondecreasing}\}.$$
Then it is a closed subset of $V$ under the weak convergence topology, and the corresponding closed subset of $\mathbb{M}[0,T]$ is $\mathbb{M}_+[0,T]$.
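The correspondence $a \leftrightarrow \nu_a$ just described can be sketched concretely for a purely atomic measure: the left-continuous distribution function of $\sum_i a_i \delta_{x_i}$ accumulates each atom's mass only strictly after its location, so that $\nu_a([s,t)) = a(t) - a(s)$ recovers exactly the mass of the atoms lying in $[s,t)$. The helper below is an illustration (its name and the dictionary representation are ours):

```python
def cumulative(atoms):
    """Left-continuous distribution function a(t) = Σ_{x_i < t} a_i of the
    atomic measure Σ a_i δ_{x_i}, i.e., the element of V corresponding to
    the measure under a ↔ ν_a (note a(0) = 0 and the strict inequality,
    which gives left continuity)."""
    def a(t):
        return sum(w for x, w in atoms.items() if x < t)
    return a
```

With atoms of mass 1 at 0.25 and mass 2 at 0.5, for instance, $a(0.25) = 0$ (the atom at 0.25 is not yet counted), while $a(0.6) - a(0.25) = 3$, the mass of $[0.25, 0.6)$.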
We consider the induced topology on $\mathbb{M}_+[0,T]$. Let $\{\phi_k,\ k \ge 1\}$ be a dense subset of $C[0,T]$, and set $\phi_0 \equiv 1$. For $\lambda, \mu \in \mathbb{M}_+[0,T]$, define
$$d(\lambda, \mu) = \sum_{k=0}^\infty \frac{1}{2^k (1 + \|\phi_k\|)} \Big| \int_0^T \phi_k\,d\lambda - \int_0^T \phi_k\,d\mu \Big|. \qquad (2.11)$$
It is easy to verify that $d$ defines a metric on $\mathbb{M}_+[0,T]$, and hence a metric on $A^0$.

Theorem 2.14 For $\lambda_n, \lambda \in \mathbb{M}_+[0,T]$, $d(\lambda_n, \lambda) \to 0$ if and only if $\lambda_n \to \lambda$ weakly in $\mathbb{M}_+[0,T]$, i.e.,
$$\int_0^T f(t)\,d\lambda_n(t) \to \int_0^T f(t)\,d\lambda(t) \qquad (2.12)$$
for each $f \in C[0,T]$. Therefore $\mathbb{M}_+[0,T]$ is metrizable.

Proof. The sufficiency is obvious. To show that the condition is also necessary, take $f \in C[0,T]$ and suppose $d(\lambda_n, \lambda) \to 0$. Note that $\phi_0 \equiv 1$, so there exists a constant $C > 0$ such that $\lambda_n([0,T]),\ \lambda([0,T]) \le C$. For a given $\varepsilon > 0$, since $\{\phi_k\}$ is dense in $C[0,T]$, we can find $\phi_k$ such that
$$\|f - \phi_k\| \le \varepsilon. \qquad (2.13)$$
Since $d(\lambda_n, \lambda) \to 0$, there exists $N$ such that when $n \ge N$,
$$\Big|\int_0^T \phi_k\,d\lambda_n - \int_0^T \phi_k\,d\lambda\Big| \le 2^k(1 + \|\phi_k\|)\,d(\lambda_n, \lambda) \le \varepsilon.$$
Therefore, when $n \ge N$,
$$\Big|\int_0^T f\,d\lambda_n - \int_0^T f\,d\lambda\Big| \le \int_0^T |f - \phi_k|\,d\lambda_n + \Big|\int_0^T \phi_k\,d\lambda_n - \int_0^T \phi_k\,d\lambda\Big| + \int_0^T |\phi_k - f|\,d\lambda \le C\varepsilon + \varepsilon + C\varepsilon = (2C+1)\varepsilon.$$
Therefore (2.12) holds for $f$.

Now we can conclude that under the weak convergence topology $A^0$ is a separable metric space. We state the following theorem, which will be used later.

Theorem 2.15 For any constant $C > 0$,
$$\{\lambda \in \mathbb{M}_+[0,T] : \lambda([0,T]) \le C\} \qquad (2.14)$$
is a compact subset of $\mathbb{M}_+[0,T]$.

Proof. Note that the set (2.14) is a closed subset of $\mathbb{M}[0,T]$ in both the (variation) norm topology and the weak* topology. Also notice that if $\lambda \in \mathbb{M}_+[0,T] \subset \mathbb{M}[0,T]$, then the norm of $\lambda$ is
$$\|\lambda\| = \sup_{\|f\| \le 1} \int_0^T f\,d\lambda = \lambda([0,T]),$$
i.e., the set (2.14) is a norm-bounded subset of $\mathbb{M}[0,T]$. Therefore by the Banach-Alaoglu Theorem (see Larsen [53], Theorem 9.4.1) we can conclude that (2.14) is a compact subset of $\mathbb{M}[0,T]$, and therefore a compact subset of $\mathbb{M}_+[0,T]$.

Finally, observe that $A^k[0,T] = (A^0)^k$, and consider the product topology on $A^k[0,T]$ inherited from the weak topology of $A^0$. We can state the following result.

Corollary 2.16 $A^k[0,T]$ is metrizable and separable.
For $a_n, a \in A^k[0,T]$, $a_n \to a$ if and only if
$$\int_0^T f(t) \cdot da_n(t) \to \int_0^T f(t) \cdot da(t)$$
for any $f \in C([0,T], \mathbb{R}^k)$. Moreover, the set
$$V_M = \{a \in A^k[0,T] : \|a(T)\| \le M\} \qquad (2.15)$$
is compact for any constant $M > 0$.

We make the following observation for later use; its proof is obvious from the relative compactness criterion for the pseudo-path topology on $V^d[0,T]$. Recall that the map $G: A^k[0,T] \to V^d[0,T]$ is defined by (2.3).

Lemma 2.17 For any constant $M > 0$,
$$G(V_M) = \{G_.(v) : v \in V_M\}$$
is a relatively compact subset of $V^d[0,T]$, where $V_M$ is defined by (2.15).

Remark 2.18 Note that Lemma 2.17 is the main reason why we choose the pseudo-path topology for $V^d[0,T]$. This result is critical for the proof of Theorem 3.8. It is obvious that the Skorohod topology (i.e., the $J_1$-topology) is too strong for Lemma 2.17 to be true. We should point out that the Skorohod $M_1$-topology for $V^d[0,T]$ (cf. Skorohod [70]) also gives Lemma 2.17. However, we choose to use the pseudo-path topology because it makes the proof of the main result (Theorem 6.9) of Chapter 6 easier.

We let $\mathcal{A}$ be the Borel $\sigma$-field of $A^k[0,T]$, and $\mathcal{A}_t$ the Borel $\sigma$-field up to time $t$, i.e., the $\sigma$-field generated by all the functions in $A^k[0,T]$ that are constant after $t$. It can be verified that
$$\mathcal{A}_t = \sigma\{v(\theta) : 0 \le \theta \le t\}, \qquad 0 \le t \le T.$$
From now on we will always use the notation $\Omega = X$, $F = \mathcal{X}$ and $F_t = \mathcal{X}_t$. It is well known that $\mathbb{M}_1(\Omega)$, endowed with the Prohorov weak convergence topology, is a separable metrizable space. Denote the collection of all the control rules with initial condition $(s,x)$ by $\mathcal{R}_{s,x}$, which is a subset of $\mathbb{M}_1(\Omega)$. For any real number $\Lambda$, define
$$\mathcal{R}_{s,x}^\Lambda \equiv \{P \in \mathcal{R}_{s,x} : J(s,P) \le \Lambda\}. \qquad (2.16)$$

Proposition 2.19 There exists a constant $C \ge 0$ such that for $\Lambda(x) = C(1 + \|x\|^m)$ we have $\mathcal{R}_{s,x}^{\Lambda(x)} \ne \emptyset$ for each $(s,x) \in \bar\Sigma$. Recall that $m$ is given in the definition of $f$.

Proof. Under our assumptions it is known from the theory of stochastic differential equations (cf.
Stroock and Varadhan [73]) that there exists a $P_0 \in \mathcal{R}_{s,x}$ with
$$P_0(v_t = 0,\ \mu_t = \delta_{u^0},\ 0 \le t \le T) = 1.$$
Then from the definition of a control rule we know that under $P_0$,
$$x_t = x + \int_s^t b(\theta, x_\theta, u^0)\,d\theta + \int_s^t \sigma(\theta, x_\theta, u^0)\,dB_\theta;$$
therefore from the boundedness of $b$, $\sigma$ and the Burkholder-Davis-Gundy inequality we have
$$E^{P_0} \sup_{0 \le t \le T} \|x_t\|^m \le C\Big\{1 + \|x\|^m + E^{P_0}\Big(\int_s^T \mathrm{tr}\big(a(\theta, x_\theta, u^0)\big)\,d\theta\Big)^{m/2}\Big\} \le C(1 + \|x\|^m),$$
where $C$ is a constant independent of $s$, $x$. Now we have by definition
$$J(s, P_0) = E^{P_0}\Big\{\int_s^T f(\theta, x_\theta, u^0)\,d\theta\Big\} \le C(1 + \|x\|^m).$$
The proposition is thus proved by letting $\Lambda(x) = C(1 + \|x\|^m)$.

Chapter 3. Existence of Optimal Controls

3.1 Introduction

This chapter studies the existence of optimal controls by the compactification method. We first consider an equivalent definition of control rules. After showing that the cost function is lower semicontinuous on the canonical space $\Omega$ and that the collection of control rules is a compact subset of $\mathbb{M}_1(\Omega)$, we conclude the existence of an optimal control from the well-known fact that a lower semicontinuous function attains its minimum on compact sets. The value function is also shown to be Borel measurable. Some comments about the model we use and possible generalizations are presented in Section 3.4.

3.2 An equivalent definition for control rules

In order to show the existence of optimal controls, let us reformulate the problem as follows. Consider the stochastic differential equation (2.1), and let $y_t = x_t - G_t(v)$; then $x_t = y_t + G_t(v)$ ($0 \le t \le T$), and (2.1) becomes
$$y_t = x + \int_s^t b(\theta, x_\theta, u_\theta)\,d\theta + \int_s^t \sigma(\theta, x_\theta, u_\theta)\,dB_\theta. \qquad (3.1)$$
Similarly, we can state the following result for control rules.

Proposition 3.1 $P \in \mathcal{R}_{s,x}$ if and only if there exists an $F_t$-adapted process $y$ such that

1) $y_.$ is continuous w.p. 1, and $P(x_. = y_. + G_.(v)) = 1$;

2) $P(x_r = x,\ \mu_r = \delta_{u^0},\ v_r = 0,\ 0 \le r \le s) = 1$;
3) $M_t\phi \in \mathcal{M}$ for every $\phi \in C_b^2(\mathbb{R}^d)$, where
$$M_t\phi(\omega) = \phi(y_t(\omega)) - \int_s^t \tilde{\mathcal{L}}\phi(\theta, x_\theta, y_\theta, \mu_\theta)\,d\theta, \qquad (3.2)$$
and
$$\tilde{\mathcal{L}}\phi(\theta, x, y, u) = \frac{1}{2}\sum_{i,j=1}^d a_{ij}(\theta, x, u)\frac{\partial^2 \phi(y)}{\partial y_i \partial y_j} + \sum_{i=1}^d b_i(\theta, x, u)\frac{\partial \phi(y)}{\partial y_i}.$$

Proof. First, we assume that such a $y$ exists. To show $P \in \mathcal{R}_{s,x}$, we need only verify $M_t\phi \in \mathcal{M}$ for $\phi \in C_b^2(\mathbb{R}^d)$. Recall that $M_t\phi$ is defined by (2.5), which by 1) may be rewritten as
$$M_t\phi = \phi(y_t + G_t(v)) - \int_s^t \mathcal{L}\phi(\theta, y_\theta + G_\theta(v), \mu_\theta)\,d\theta - \int_s^t \nabla\phi(y_\theta + G_\theta(v)) \cdot dG_\theta(v) - \sum_{s \le \theta < t}\big[\phi(y_\theta + G_{\theta+}(v)) - \phi(y_\theta + G_\theta(v)) - \nabla\phi(y_\theta + G_\theta(v)) \cdot \Delta G_\theta(v)\big]. \qquad (3.3)$$
By letting $\phi(x) = x_i$ and $\phi(x) = x_i x_j$ respectively, for $x = (x_i) \in \mathbb{R}^d$ and $1 \le i, j \le d$, we can conclude that $y$ is a continuous local semimartingale, with
$$y_t^i = x^i + M_t^i + \int_s^t b_i(\theta, x_\theta, \mu_\theta)\,d\theta, \qquad M^i \in \mathcal{M}_{loc}^c, \qquad \langle M^i, M^j \rangle_t = \int_s^t a_{ij}(\theta, x_\theta, \mu_\theta)\,d\theta, \quad 1 \le i, j \le d.$$
Therefore by Ito's formula (cf. Ikeda and Watanabe [35], Chapter III),
$$\phi(y_t + G_t(v)) = \phi(x) + \int_s^t \nabla\phi(y_\theta + G_\theta(v)) \cdot dy_\theta + \frac{1}{2}\int_s^t \sum_{i,j}\frac{\partial^2 \phi}{\partial y_i \partial y_j}(y_\theta + G_\theta(v))\,d\langle y^i, y^j \rangle_\theta + \int_s^t \nabla\phi(y_\theta + G_\theta(v)) \cdot dG_\theta(v) + \sum_{s \le \theta < t}\big[\phi(y_\theta + G_{\theta+}(v)) - \phi(y_\theta + G_\theta(v)) - \nabla\phi(y_\theta + G_\theta(v)) \cdot \Delta G_\theta(v)\big]. \qquad (3.4)$$
Note that the stochastic integral $\int_s^t \nabla\phi(y_\theta + G_\theta(v)) \cdot dM_\theta$ arising from the $dy$ term in (3.4) is in $\mathcal{M}$. Now, comparing (3.3) with (3.4), we see that
$$M_t\phi = \phi(x) + \int_s^t \nabla\phi(y_\theta + G_\theta(v)) \cdot dM_\theta;$$
therefore we have shown that $M_t\phi \in \mathcal{M}$, i.e., $P \in \mathcal{R}_{s,x}$. The other half of the proof is similar.

The following result is used in the proof of Proposition 3.1 and will be used again. We write it down as a lemma for convenience.

Lemma 3.2 Assume $P \in \mathcal{R}_{s,x}$. For any $\phi, \psi \in C_b^2(\mathbb{R}^d)$ we have
$$\langle M\phi, M\psi \rangle_t = \int_s^t \sum_{i,j=1}^d a_{ij}(\theta, x_\theta, \mu_\theta)\,\frac{\partial \phi}{\partial y_i}(y_\theta)\,\frac{\partial \psi}{\partial y_j}(y_\theta)\,d\theta \qquad (3.5)$$
under the probability $P$. In particular, if we define, for $1 \le i \le d$,
$$\tilde M_t^i = y_t^i - \int_s^t b_i(\theta, x_\theta, \mu_\theta)\,d\theta, \qquad (3.6)$$
i.e., formally $\tilde M^i = M\phi$ with $\phi(y) = y_i - x^i$, then
$$\langle \tilde M^i, \tilde M^j \rangle_t = \int_s^t a_{ij}(\theta, x_\theta, \mu_\theta)\,d\theta, \qquad 1 \le i, j \le d.$$

Recall that $C^d[0,T]$ is the space of $\mathbb{R}^d$-valued continuous functions on $[0,T]$. We give $C^d[0,T]$ the uniform topology, i.e., for $x, y \in C^d[0,T]$ the distance between $x$ and $y$ is defined by
$$\rho(x,y) = \sup_{0 \le t \le T} \|x(t) - y(t)\|.$$
This makes $C^d[0,T]$ a Polish space; cf. Billingsley [6].
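The uncontrolled dynamics (3.1) with a frozen control, as used for the "do nothing" rule $P_0$ in Proposition 2.19, can be simulated with a standard Euler-Maruyama discretization. This is a crude numerical sketch only: the thesis works with weak solutions and martingale problems rather than discrete strong schemes, and all names and parameters below are ours.

```python
import math
import random

def euler_cost(x0, b, sigma, f, T=1.0, n=1000, seed=0):
    """Monte Carlo estimate (one sample path) of ∫_0^T f(t, x_t) dt for the
    scalar dynamics dx = b(t,x) dt + sigma(t,x) dB, with the control frozen
    at a fixed u° and absorbed into b, sigma, f.  Euler-Maruyama scheme."""
    rng = random.Random(seed)
    dt, x, cost = T / n, x0, 0.0
    for i in range(n):
        t = i * dt
        cost += f(t, x) * dt                                   # running cost
        x += b(t, x) * dt + sigma(t, x) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return cost
```

As a sanity check, if $f \ge -K$ then the estimate is bounded below by roughly $-KT$, mirroring the lower bound used throughout this chapter; and with $\sigma \equiv 0$, $b \equiv 0$ the estimate reduces to the deterministic integral $\int_0^T f(t, x_0)\,dt$.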
For a sequence F’ € with (sn, x) € E, the probability law of the process y, defined in Proposition 3.1, under F tm is defined by P(C) for C € d, where d is (w: T F h y.(w) € C) the Borel a-field of Cd[0, T]. Chapter 3. Existence of Optimal Controls 33 Proposition 3.3 If the sequence (sn, Zn) is bounded in E, then {F’} are relatively compact. Moreover, for any e > 0, there exists a compact subset K C Cd[0, T] such that 1 P’ ’ 1 (K) &, — V n. (3.7) Proof. To show that {FTh} is relatively compact, we need only verify the following: (a) limA_ infn P”-)y(0))) A) = 1, and (b) for any 7> 0, 2 limlimsupF’ sup 6O y(t) — y(s) 7 Os<tT = 0. (3.8) t—s<6 Note that (a) is obvions from the fact 2 P’ ( y(0) 1. By Billingsley [6], Theorem 12.3, = Zn) = (b) is implied by ) 2 Ffly( — for any n 1, 0 2 , 1 t t 4 y(ti) y(t) J4lQC (3.9) — T, where C is a constant. Recall from the definition of y that F’ (y(t) 1 with Al,,. e 2 Cit = Zn + j x, 0 t s,) b(8, x, e)dO + in(t) = = 1, and for t s,, under P, and (In)j tr(a(0,xo,pe))dO, t> s. = It can be easily verified that (3.9) follows from the boundedness of the coefficients u(.,.,.) and b(.,.,.) and the Burkholder-Davis-Gundy inequality. We have thus shown that the probability laws of y under pn are relatively compact. The existence of a compact set K such that (3.7) holds is a consequence of Prohorov’s theorem. D 18 then Proposition 3.4 If A is a bounded subset of d lim M—*cc inf P{w : iivTii PE7?’,X (s,x) e[O,T] xA where i 11 J denotes the Euclidean norm in k M} = 1, (3.10) Chapter 3. Existence of Optimal Controls Proof. If P e R, then J(P) 34 A. Since the function below, there exists a constant K > 0 such that f f is assnmed to be bounded from —K. From the assumption that c(t) is strictly positive and lower semicontinuous, there exists a constant c 0 > 0 such that c (t) t (1 i k, 0 t 0 c T). Thus T 1 { A > J(F) f(O, xo,po)dO + E = EIivTii I? E c(O) dvo} . coil vTii}. 
E”{—KT + Therefore j (A + KT)/co, and F(iivTiJ M) The proposition is obvious now. 3.3 U Existence of optimal controls Define a function on Q by Fs(w)Ej T T f(Oxoito)d8+j c(8).dvg (3.11) forw= (x.,u.,v.). Then we have Lemma 3.5 f(.) is lower sernicontinuous on fi, i.e., liminf F (w) 3 if w — ) 4 Q 3 r (3.12) win Q. Proof. We show the case when s = 0; for general .s the proof is similar. We can also assume that f(t,.,.), c(.) are continuous. In fact, since they are lower semicontinuous, we can find sequences of continuous functions fm(t,.,.) {fm(t, I , )}, {cm()} such that f(t, ., .), c(.) I c(.), 1 i k, Chapter 3. Existence of Optimal Controls 35 and thus if the lemma is true for continuous functions, then Vm> 1, {JT liminf lirninf f (w) 0 J fm(0, x(o), t(o))do + —* Cm(O) dv(O)} fm(9,X(0),(6))d8+f Cm(O)’dV(O). 0 Let m f 0 oo, and use the monotone convergence theorem to conclude the result. We assume that f(t,., .), and c(.) are continuous. From v — v in Ak[O, T] we have fT I JO Now we show that as n T 1 Let dQ 9 L — I c(O) dv(O). Jo —+ f(9, x(9), u)(du)d9 p(du)d9, dQ J c(9) dv(9) j f(9, x(9), u) (du)d9. 8 j (3.13) (du)dO, the right hand side of (3.13) can be rewritten as 0 f(O,x(O),u)dQ [0,T]xU = + f [0,TIxU f [f(O, x(9), u) From the definition of stable topology we know as J - f(O, x(O), u)]dQ. (3.14) [O,T]xU f(9,x(8),u)dQ [O,T]xU n —* 00, _*J [0,T]xU Now we show that the limit of the second integral in (3.14) is zero. For any positive integer m and constant 7> 0, let g(t, x, u) = f(t, x, u) Am = (t,u): I — f(t, z(t), u), sup IIx—r(t)I1/m Ig(t,x,u) 7 Then each t-section of the set Am is a closed subset of U from the continuity of the function 2 f(t.).AlsoAiD D ”DAmDAm+iD”.and A flAmø. Chapter 3. Existence of Optimal Controls 36 Applying the results in Jacod and Mémin [38] Proposition 2.11 we can get lim sup Qn(Am) Let B,, = Q(Am), limQ(Am) m = 0. 
(3.15) In order to show the limit of the last integral in (3.14) {(t, u) : g(t, x(t), u) is zero, we need only show lim Q(B) = 0 (3.16) by Jacod and Mémin [38] Cor 2.18. For a given e > 0, from (3.15) there exists an M > 0 such that Q(AM) a Recall that convergence in pseudo-path topology is equivalent to the convergence in Lebesgue measure, therefore x(.) exists N such that when n = {t IIx(t) x(t)II — x(.) in the Lebesgue measure 1, and there N, i {t: Let C —÷ - x(t)II > <a > 1/m}, then it is obvious B,, \ (C’ x U) C AM, and therefore we have Q(B) But Q(C,,” x U) = Q(AM) + Q(C x U). l(C”) <e, hence limsupQfl(B) limsupQ(AM)+E Q(AM)-f-e<2a Since e is arbitrary, we have shown (3.16), and the lemma is thus proved. For F e we have by definition that J(s, F) = i.e., the cost corresponding to the control rule F. We can state the following result now. 0 Chapter 3. Existence of Optimal Controls Theorem 3.6 The mapping (s, x, F) , .s, x, J(s, P) is lower semicontinuous on {(s, x, F) : (s, z) é —* with the induced topology of[O,T] x M+(Q), i.e., if(s,x) e F E — 37 — x, and P —f Proof. Assume (s,, x, , F,) 2 .s or s, — ) 8 r liminf 8 J( n , Pa). (3.17) (s, x, F) with F e —÷ s. When s E (r F e F e R. weakly, then J(s, F) two cases: s, , E because we have assumed that c It suffices to consider T s, E = F e {j f(O, x, o)d9 + (f j c(O) dvo} . —K(s—s) f(8z/Le)dO) 0. It follows lirninf 3 (r — 0. ) 8 r (3.18) When s,, j s, (3.18) follows from — E’ (1 where we have used the fact F(vo = = — 0, 0 j 0 s) f(o, Xn, = 1. In either case, from (3.18) and the lower semicontinuity of I(.), we have lirninf J(s, F) lirninf ET 8 + liminf 3 F 1 liminfE EFS (F = — ) 8 F J(s,P), i.e., J(.,.) is lower semicontinuous. Theorem 3.7 For any A> 0, o if A is a compact set in , then u{1: s E {0,T], x A} is compact. Proof. Since Pvti() is metrizable, we need only to show that each sequence {F’} C has a subsequence {Fnk} such that pflk — F compact and T < —* s, x , we may assume s, for some s —b [0,T], x E A. 
Because A s x for some s e [0,Tj, x E A. Chapter 3. Existence of Optimal Controls 38 By Proposition 3.1, the process y, defined by y.(w) = x.(w) — for w = (x, p, v), has continuous sample paths under the probability F, and is such that i’IIt c M (Mth is defined in Proposition 3.1) for each E C?(ffld). Now we introduce the following auxiliary space Define a probability z clXCd[O,TJ, Z FxL fr on (Z, Z) by the probability law of (x, z, v, y) with respect to , tm i.e., F for Z C (Z) T F h F (w: (x.(w), .(w), v.(w), y.(w)) e Z). In other words, for F E F, C C C, P(F x C) = We will show that the sequence F’ is tight. For a positive constant M, let VM = {v e Av[O,T] IIVTII M}. From Corollary 2.16 we know that VM is a compact Borel subset of Ak[O, T]. Proposition 3.4 implies that for any given e > 0, there exists an M such that F(Vd[O, T] for every F C >< U Vw) X 1 (3.19) — .s e [0,T], x e A. Therefore for the corresponding P, we have E(vd[0,Tj x U x VM x Cd[0,TJ) 1— e. (3.20) By Proposition 3.3 we know that the probability laws of y with respect to , tm denoted by F are relatively compact, and there exists a compact subset K (K) t F m 1 — c C°[o, T] with Chapter 3. Existence of Optimal Controls 39 or equivalently, F(w: y.Qu) e K) 1 — e, Vn. (3.21) From Proposition 3.1, (3.21) may be written as x K) 1— r, Vn. (3.22) We now consider the coordinate process x. Let D = K + G(VM) = {y + G(v), y E K, v E VM}. By Lemma 2.17 we know that GQIKw) is a relatively compact subset of Vd[0, T] under the pseudo-path topology. Since the uniform topology is stronger than the pseudo-path topology, K is also a compact subset in V’[O, T] and hence so is D. From Proposition 3.1 we have (D T P h x i-I x VM x K) Let S = = pTh(vd[o, T] x U x VM x K). (3.23) D x U x VM x K. Since U is a compact space, we know that S is a relatively compact subset in Z. Moreover, from (3.20), (3.22) and (3.23) we have (S) T F h for every Thus ii. {FTh} 1 — 2e is a tight sequence of probability measures on Z. 
By the Prohorov theorem there is a subsequence {P̃^{n_k}} and a probability P̃ on (Z, Z) such that P̃^{n_k} → P̃ weakly. Define P = P̃|_Ω, i.e., in the terminology of Jacod and Mémin [38], P is the Ω-marginal of P̃; then it is easy to see that P^{n_k} → P weakly. The proof of the theorem will be completed if P ∈ R^A_{s,x}. By Proposition 3.1 we need only show that there exists a continuous process Y on (Ω, F, F_t, P) such that

1. P( Y_· = x_· − G_·(v) ) = 1, with Y(0) = x a.s.;
2. P( x_t = x, v_t = 0, 0 ≤ t ≤ s ) = 1;
3. M_tφ ∈ M for every φ ∈ C_b²(R^d); and
4. J(s, P) ≤ A.

Note that 4) is obvious from Theorem 3.6. To show 1), note that the set {(w, y) : x_·(w) = y_· + G_·(v(w))} is a closed subset of Z = Ω × C^d[0,T], and therefore

P̃( x_· = y_· + G_·(v) ) ≥ limsup_k P̃^{n_k}( x_· = y_· + G_·(v) ) = 1.  (3.24)

If we define Y_·(w) = x_·(w) − G_·(v(w)), then P̃( (w, y) : Y_·(w) = y_· ) = 1. Thus Y is a continuous process on (Ω, F, F_t, P), and 1) follows from (3.24). Moreover, {y(0) = x} is closed in Z, so P̃{y(0) = x} ≥ limsup_k P̃^{n_k}{y(0) = x} = 1, or P̃{w : Y(0) = x} = 1. It follows that x(0) = x a.s. (P).

For 2), let

B_m = { w = (x_·, μ_·, v_·) : ||x(t) − x|| ≤ 1/m and v_t = 0 for 0 ≤ t ≤ s − 1/m }.

It is easy to see that B_m is closed in Ω and B_1 ⊃ B_2 ⊃ ⋯, with

∩_m B_m = { w = (x_·, μ_·, v_·) : x_t = x and v_t = 0, 0 ≤ t < s }.

Since P^{n_k} ∈ R^A_{s_{n_k}, x_{n_k}}, for each m we have P^{n_k}(B_m) = 1 for large k; since P^{n_k} → P weakly and B_m is closed,

P(B_m) ≥ limsup_k P^{n_k}(B_m) = 1,

and therefore

P{ w : x_t = x, v_t = 0, 0 ≤ t ≤ s } = P( ∩_m B_m ∩ {x(0) = x} ) = lim_m P(B_m) = 1.

Finally, we prove 3). For any bounded continuous function H(·) on Ω, if we define H̃(z) = H(w) for z = (w, y_·) ∈ Ω × C^d[0,T], then H̃ is a continuous function on Z. For any fixed t ≥ s, the function

M̃_tφ(z) = φ(y_t) − ∫_s^t L̃φ(θ, x_θ, μ_θ, y_θ) dθ, z = (x, μ, v, y) ∈ Z,

is continuous on Z. In fact, it can be shown as in the proof of Lemma 3.5 that the integral part of M̃_tφ is continuous on Z, and the continuity of the function z = (x, μ, v, y) ↦ φ(y_t) on Z follows from the fact that C^d[0,T] is endowed with the uniform topology.
Thus for s ≤ u < t ≤ T the function

z = (x, μ, v, y) ↦ H̃(z) [ M̃_tφ(z) − M̃_uφ(z) ]

is a bounded and continuous function on Z, and since P̃^{n_k} → P̃ weakly, we have

lim_k E^{P̃^{n_k}} { H̃ [ M̃_tφ − M̃_uφ ] } = E^{P̃} { H̃ [ M̃_tφ − M̃_uφ ] }.  (3.25)

Again, by (3.24) and the definition of P̃, we have P̃( M̃_·φ = M_·φ ) = 1, where M_·φ is defined by (3.2) with y replaced by Y. Thus (3.25) can be rewritten as

lim_k E^{P^{n_k}} { H [ M_tφ − M_uφ ] } = E^P { H [ M_tφ − M_uφ ] }.  (3.26)

For every bounded continuous function H on Ω that is F_u-measurable, the left hand side of (3.26) is zero since M_tφ ∈ M on the filtered probability space (Ω, F, F_t, P^{n_k}). By a routine limit procedure we have

E^P { H ( M_tφ − M_uφ ) } = 0

for each bounded F_u-measurable function H. Thus (M_tφ, F_t) is a martingale under P. The continuity of this martingale follows from that of Y. The proof is therefore complete. □

We can now prove the main theorem of this chapter.

Theorem 3.8 The control problem has an optimal solution, i.e., there exists a P* ∈ R_{s,x} such that

J(s, P*) = inf_{P ∈ R_{s,x}} J(s, P).

Proof. By Proposition 2.19 and Theorem 3.7 we know that R_{s,x} is nonempty and that R^A_{s,x} is compact for each A > 0. Moreover, it is obvious that

inf_{P ∈ R_{s,x}} J(s, P) = inf_{P ∈ R^A_{s,x}} J(s, P)

for A large enough. Now J(s, ·) is a lower semicontinuous function, so it attains its minimum on the compact set R^A_{s,x}, i.e., there exists a P* ∈ R^A_{s,x} ⊂ R_{s,x} such that

J(s, P*) = inf_{P ∈ R_{s,x}} J(s, P),

which completes the proof. □

Recall that the value function is defined by

W(s, x) = inf_{P ∈ R_{s,x}} J(s, P).

Let us define

R*_{s,x} = { P ∈ R_{s,x} : W(s, x) = J(s, P) }.

By Theorem 3.8, R*_{s,x} ≠ ∅ for any (s, x) ∈ Σ. It can easily be verified that it is a compact subset of M₁(Ω).

Before we prove the measurability of the value function W, we give a result which will be used later. We adopt the notation of Stroock and Varadhan [73], Chapter 12.

Lemma 3.9 The map R* : Σ → Comp(M₁(Ω)) is Borel measurable. Moreover, there exists a measurable selector H of R*, i.e., H(s, x) ∈ R*_{s,x} for all (s, x) ∈ Σ and H : Σ → M₁(Ω) is Borel measurable.

Proof.
By Stroock and Varadhan [73] Lemma 12.1.8, we need only show the following: for (s_n, x_n) ∈ Σ such that s_n → s, x_n → x, and P^n ∈ R*_{s_n, x_n}, there exists a subsequence {n_k} and P ∈ R*_{s,x} such that P^{n_k} → P. Since x_n → x, we may assume {P^n} ⊂ ∪{R^A_{s,x} : (s, x) ∈ [0,T] × A} with A = {x, x_1, x_2, …}, a compact set, for some constant A. By Theorem 3.7 there exists a subsequence {P^{n_k}} and P ∈ R_{s,x} such that P^{n_k} → P. From Theorem 3.6 it can easily be seen that P ∈ R*_{s,x}. The measurability of R* is thus proved. The existence of a measurable selector H is a consequence of Stroock and Varadhan [73] Theorem 12.1.10. □

Corollary 3.10 W(·,·) is a Borel measurable function.

Proof. From Theorem 3.6 we know the map (s, x, P) ↦ J(s, P) is lower semicontinuous and thus Borel measurable. The corollary follows from the fact that W(s, x) = J(s, H(s, x)) is the composition of two Borel measurable mappings. □

3.4 Some comments

(a) The model studied in this chapter includes the case of the monotone follower problem as formulated in Karatzas [40] and Karatzas and Shreve [46], by letting k = d and g(θ) ≡ I, 0 ≤ θ ≤ T, where I denotes the d × d unit matrix. Moreover, if we take k = 2d and g(θ) ≡ (I, −I), 0 ≤ θ ≤ T, then the model reduces to the bounded variation problem as discussed in Chow, Menaldi and Robin [14], among others.

We have assumed that c(·) > 0. This condition is necessary for the existence of optimal controls for the general problem formulated in this chapter (see the proof of Proposition 3.4). Thus it excludes the case of so-called cheap control problems, i.e., c(·) ≡ 0. This type of problem is discussed in Chiarolla and Haussmann [12], [13] and Menaldi and Robin [62]. However, our method works for problems with finite fuel constraints, because for any y ∈ R_+^k the set

{ v ∈ A^k[0,T] : v_i(t) ≤ y_i, 0 ≤ t ≤ T, 1 ≤ i ≤ k }

is closed in A^k[0,T]. This follows from the fact that if v^n → v in A^k[0,T], then v^n_i(t) → v_i(t) (1 ≤ i ≤ k) at all the continuity points of v_i.
For problems with finite fuel constraints, see Karatzas [42], Karatzas and El Karoui [20], and a more recent work by Bridge and Shreve [9], among others.

(b) Now we explain why the method used in this chapter is not suitable for problems where terminal costs are allowed. If we define F_3(w) = φ(x_T(w)) for w = (x, μ, v), with φ a continuous function on R^d, we cannot get the lower semicontinuity of F_3. The reason is that x^n → x in the pseudo-path topology only ensures that x^n(·) → x(·) in Lebesgue measure, and therefore x^n(T) → x(T) may not hold. But we can modify the formulation of the problem to allow a terminal cost, i.e., let

J(α) = E { ∫_s^T f(t, x_t, u_t) dt + ∫_{[s,T]} c(t)·dv_t + φ(x_{T+}) }.

Note the additional term c(T)·Δv_T in the second integral, and the relation x_{T+} = x_T + g(T)Δv_T. We can replace V^d[0,T] by V^d[0,T] × R^d and A^k[0,T] by A^k[0,T] × R^k in the canonical space to obtain the results if φ is lower semicontinuous.

(c) For the same reason as explained in (b), we cannot allow pointwise constraints of the type Φ(x_{t_0}) = 0 a.s. for some lower semicontinuous function Φ on R^d (which may take the value +∞) and fixed t_0 ∈ [0,T]. But it is easily seen that the following kind of integral constraint may be added to the problem:

∫_s^T f_0(t, x_t, u_t) dt + ∫_{[s,T]} c_0(t)·dv_t ≤ 0, a.s.,

where f_0 and c_0 satisfy the same conditions as f and c, except that the positivity of c_0(·) is relaxed to boundedness below. Moreover, from the proof of Lemma 3.5 we can conclude that the following kind of constraint may also be added to the model:

∫_s^T f_1(t, x_t, u_t) dt + ∫_{[s,T]} c_1(t)·dv_t = 0,

with f_1(t, ·, ·) continuous for each 0 ≤ t ≤ T and c_1(·) continuous on [0,T]. Of course, we must now assume the existence of an admissible control. See Haussmann and Lepeltier [30] for constraints of these types in classical control problems.
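The monotone follower specialization in comment (a) admits a simple numerical illustration. The sketch below (Python; the infinite-horizon discounted setting, all coefficients, and the grid are hypothetical choices made for brevity, not the finite-horizon model of this chapter) approximates the value function of a one-dimensional problem dx_t = σ dB_t + dv_t with running cost x² and proportional control cost c, by iterating a Markov chain approximation: a diffusion step followed by projection onto the gradient constraint W(x) ≤ W(x + h) + c·h.

```python
import numpy as np

# Toy discounted monotone-follower problem (hypothetical data):
#   minimize E[ integral of e^{-beta*t} x_t^2 dt  +  c * control effort ],
#   dx_t = sigma dB_t + dv_t, with v nondecreasing (pushes the state upward).
beta, sigma, c = 1.0, 1.0, 0.5
h = 0.05
x = np.arange(-2.0, 2.0 + h / 2, h)      # spatial grid
f = x ** 2                               # running cost
dt = h ** 2 / sigma ** 2                 # explicit-scheme time step
idx = np.arange(len(x))

W = np.zeros_like(x)
for _ in range(30000):
    # diffusion step: pay running cost, then move up or down with prob. 1/2
    Wu = np.empty_like(W)
    Wu[1:-1] = (f[1:-1] * dt + 0.5 * W[2:] + 0.5 * W[:-2]) / (1.0 + beta * dt)
    Wu[0], Wu[-1] = Wu[1], Wu[-2]        # crude reflecting boundaries
    # singular-control step: enforce W(x_i) <= W(x_j) + c*h*(j - i), j >= i,
    # via a backward running minimum of Wu[j] + c*h*j
    a = Wu + c * h * idx
    m = np.minimum.accumulate(a[::-1])[::-1]
    Wn = m - c * h * idx
    if np.max(np.abs(Wn - W)) < 1e-10:   # fixed point reached
        W = Wn
        break
    W = Wn
```

At the fixed point, the grid points where the control branch is strictly active approximate the complement of the continuation region in the spirit of Chapter 6; the projection step is a discrete analogue of the gradient inequality W(t, x) ≤ W(t, x + gh) + c(t)·h proved in Theorem 5.4.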
Chapter 4

The Dynamic Programming Principle

4.1 Introduction

In this chapter we will apply the method used in Haussmann [29], Haussmann and Lepeltier [30] and El Karoui et al. [18] to establish the dynamic programming principle. In Section 4.2 we present some techniques that are used by Stroock and Varadhan [73] to show the existence of a Markovian solution to a stochastic differential equation. The main results of that section are that a control rule remains a control rule for the problem starting at a later time from the point reached at that time, and that if we take a control rule and at some later time switch to another one whose initial value is the point reached by the first control rule, the object we obtain is again a control rule. Section 4.3 is devoted to an abstract form of the dynamic programming principle, which has played an essential role in the setting of classical control problems. In the last section, Section 4.4, we show that there exists a family of optimal control rules such that the associated optimal state process has the strong Markov property. Note that this approach to the dynamic programming principle does not require any regularity of the value function W other than Borel measurability.

4.2 Some preparations

This section presents some preparations for proving the dynamic programming principle. Define θ_s : Ω → Ω by

θ_s(w)_t = (x_t, μ_t, 0) for 0 ≤ t ≤ s;  θ_s(w)_t = (x_t, μ_t, v_t − v_s) for s < t ≤ T.  (4.1)

Note that if ŵ = θ_s(w), then v_θ(ŵ) = 0 on [0, s], (x_·(ŵ), μ_·(ŵ)) = (x_·(w), μ_·(w)) on [0, T], and v_θ(ŵ) = v_θ(w) − v_s(w) on [s, T]. Define F^s to be the σ-field generated by the canonical paths after time s.

Lemma 4.1 If P is a probability on (Ω, F^s) (0 ≤ s ≤ T) and ŵ ∈ Ω, then there exists a unique probability measure, denoted by δ_ŵ ⊗_s P, on (Ω, F) such that

δ_ŵ ⊗_s P ( w : w_t = ŵ_t, 0 ≤ t ≤ s ) = 1,  (4.2)

δ_ŵ ⊗_s P (A) = P( θ_s(A) ), ∀A ∈ F^s,  (4.3)

where by w_t = ŵ_t, 0 ≤ t ≤ s, we mean x_t = x̂_t and v_t = v̂_t for 0 ≤ t ≤ s, and μ_t = μ̂_t a.e. on [0, s].

Proof.
The uniqueness of such a probability measure is obvious, so we need only show its existence. If I is a subinterval of [0,T], we write V(I) for the set of measurable functions I → M₁(U). Let us recall an equivalent definition of the stable topology on U = V([0,T]). For μ ∈ U, define a mapping i : U → C([0,T], M_+(U)) by i(μ)(t) = ∫_0^t μ_θ dθ. The topology on U induced by i is exactly the stable topology we introduced in Chapter 2. For a discussion of this fact, see Haussmann and Lepeltier [30], §3.10. Similarly, we can consider the topology on V(I) induced by the mapping

i_I(μ)(t) = ∫_{I ∩ [0,t]} μ_θ dθ ∈ C([0,T], M_+(U)), μ ∈ V(I).

Write A_I for the R_+^k-valued nondecreasing functions on I with the topology inherited from A^k[0,T]. Let

X_0 = V^d[0,s] × V([0,s]) × A_{[0,s]},  X_1 = V^d[s,T] × V([s,T]) × A_{[s,T]},  X = X_0 × X_1,

and define π_0 : Ω → X_0 and π_1 : Ω → X_1 by

π_0(w)_t = ( x_t, ∫_0^t μ_θ dθ, v_t ), 0 ≤ t ≤ s;  π_1(w)_t = ( x_t, ∫_s^t μ_θ dθ, v_t − v_s ), s ≤ t ≤ T,

and let τ : X → Ω be the map which reassembles an element of Ω from a compatible pair (ξ_0, ξ_1) ∈ X, so that τ(π_0(w), π_1(θ_s(w))) = w whenever v_s(w) = 0. We define a probability P̄ on X by

P̄ = δ_{π_0(ŵ)} × P ∘ π_1^{-1},

and let P̂ = P̄ ∘ τ^{-1}. Now we verify that P̂ satisfies the conditions (4.2) and (4.3). For 0 ≤ t ≤ s the value τ(ξ)_t depends only on the first coordinate ξ_0, so

P̂( w : w_t = ŵ_t, 0 ≤ t ≤ s ) = P̄( (ξ_0, ξ_1) : τ(ξ_0, ξ_1)_t = ŵ_t, 0 ≤ t ≤ s ) = 1.

For A ∈ F^s, the event τ(ξ) ∈ A depends only on ξ_1, and therefore

P̂(A) = P̄( (ξ_0, ξ_1) : τ(ξ_0, ξ_1) ∈ A ) = P( θ_s(A) ).

The lemma is proved by letting δ_ŵ ⊗_s P = P̂. □

Remark 4.2 Notice that for P̂ = δ_ŵ ⊗_s P and A ∈ F^s we have P̂(A) = P(θ_s(A)), and

E^{P̂} { ∫_s^t h(θ)·dv_θ } = E^P { ∫_s^t h(θ)·dv_θ }

for any bounded R^k-valued Borel measurable function h(·). These properties will be used repeatedly in the rest of this section. □
Assume that τ is an F_t-stopping time, 0 ≤ τ ≤ T. A τ-transition probability (or τ-t.p.) is a family {Q_w : w ∈ Ω} of probability measures on (Ω, F) such that

w ↦ Q_w(A) is F_τ-measurable, ∀A ∈ F.

Note that F_τ is the collection of sets A such that A ∩ {τ ≤ t} ∈ F_t, ∀t ≤ T. For a fixed w such that τ(w) ≤ T, we denote by F^{τ(w)} the σ-field generated by the sets of the form

{ x_t ∈ A } ∩ { ∫_{τ(w)}^t μ_θ dθ ∈ B } ∩ { v_t − v_{τ(w)} ∈ C },  (4.4)

where A ∈ B(R^d), B ∈ B(M_+(U)), C ∈ B(R^k), τ(w) ≤ t ≤ T; so F^{τ(w)} is the collection of events that occur after the time τ(w). Notice that the topology on Ω is separable, so F is countably generated, and for a given probability P on (Ω, F) the regular conditional probability distribution (r.c.p.d.) of P given F_τ exists; it will be denoted by P_{τ,w}.

Given a stopping time τ (0 ≤ τ ≤ T) and a τ-t.p. {Q_w}, from Lemma 4.1 we know that for each w there exists a unique δ_w ⊗_{τ(w)} Q_w ∈ M₁(Ω) such that

δ_w ⊗_{τ(w)} Q_w ( w' : w'_t = w_t, 0 ≤ t ≤ τ(w) ) = 1,

δ_w ⊗_{τ(w)} Q_w (A) = Q_w( θ_{τ(w)}(A) ), ∀A ∈ F^{τ(w)}.

We write δ ⊗_τ Q for the family so obtained. It can be seen easily that for s ≤ τ ≤ T,

E^{δ_w ⊗_{τ(w)} Q_w} Γ_{τ(w)} = E^{Q_w} Γ_{τ(w)},  (4.5)

where Γ is defined by (3.11).

If P ∈ M₁(Ω), τ is a stopping time, and {Q_w} is a τ-t.p., then we have the following result, which is analogous to Theorem 6.1.2 in Stroock and Varadhan [73].

Lemma 4.3 There exists a unique probability, denoted by P ⊗_τ Q, such that

1. P ⊗_τ Q (A) = P(A) if A ∈ F_τ;
2. the r.c.p.d. of P ⊗_τ Q with respect to F_τ is δ_w ⊗_{τ(w)} Q_w.

Proof. The uniqueness of such a probability P ⊗_τ Q is obvious, so we need only prove the existence. In fact, it is enough to check that

w ↦ δ_w ⊗_{τ(w)} Q_w (A)

is F_τ-measurable, and then set

P ⊗_τ Q (A) = E^P [ δ_· ⊗_{τ(·)} Q_· (A) ], A ∈ F.

Once this measurability is known, the proof that P ⊗_τ Q has the desired properties is easy.
But if

A = { w = (x, μ, v) : x(t_i) ∈ A_i, ∫_0^{t_i} μ_θ dθ ∈ B_i, v(t_i) ∈ C_i, i = 1, …, n }

with 0 ≤ t_1 < ⋯ < t_n ≤ T, A_i ∈ B(R^d), B_i ∈ B(M_+(U)), C_i ∈ B(R^k), 1 ≤ i ≤ n, then

δ_w ⊗_{τ(w)} Q_w (A) = Σ_k 1_{[t_k, t_{k+1})}(τ(w)) ( Π_{i ≤ k} 1_{A_i}(x(t_i)) 1_{B_i}( ∫_0^{t_i} μ_θ dθ ) 1_{C_i}(v(t_i)) ) × Q_w ( ŵ : x̂(t_i) ∈ A_i, ∫_0^{τ(w)} μ_θ dθ + ∫_{τ(w)}^{t_i} μ̂_θ dθ ∈ B_i, v̂(t_i) − v̂(τ(w)) + v(τ(w)) ∈ C_i, i > k ),

and this is clearly F_τ-measurable. The lemma is thus proved by recalling that the collection of sets of the form A generates the σ-field F. □

Lemma 4.4 Assume that τ is an F_t-stopping time with s ≤ τ ≤ T, and that (M_tφ, F_t, P) is a martingale for φ ∈ C_b²(R^d). Then (M_tφ, F_t, P ∘ θ_τ^{-1}) is a martingale after τ.

Proof. Since (M_tφ, F_t, P) is a martingale,

∫_A M_uφ dP = ∫_A M_tφ dP, ∀A ∈ F_u, s ≤ u ≤ t ≤ T,  (4.6)

or

E^P { 1_A(·) [ M_tφ(·) − M_uφ(·) ] } = 0.  (4.7)

When t ≥ τ, the shift θ_τ changes v only by the constant v_τ on [τ, T], so the increments of M φ after τ are unaffected:

M_tφ(θ_τ(w)) − M_uφ(θ_τ(w)) = M_tφ(w) − M_uφ(w), τ ≤ u ≤ t ≤ T.

Therefore we get, for τ ≤ u ≤ t ≤ T and A ∈ F_u,

E^{P ∘ θ_τ^{-1}} { 1_A(·) [ M_tφ(·) − M_uφ(·) ] } = E^P { 1_A(θ_τ(·)) [ M_tφ(θ_τ(·)) − M_uφ(θ_τ(·)) ] } = E^P { 1_{θ_τ^{-1}(A)}(·) [ M_tφ(·) − M_uφ(·) ] } = 0,

because it is obvious that θ_τ^{-1}(A) ∈ F_u. The lemma is thus proved. □

Recall that for a stopping time τ, P_{τ,w} denotes the regular conditional probability distribution of P given F_τ. The family {P_{τ,w}} is obviously a τ-t.p., and so δ_w ⊗_τ P_{τ,w} is defined.

Lemma 4.5 If (M_tφ, F_t, P) is a martingale and τ is an F_t-stopping time, then there is a P-null set N ∈ F_τ such that for w ∉ N, (M_tφ, F_t, P_{τ,w}) is a martingale for t ≥ τ(w).

Proof. Cf. Stroock and Varadhan [73], Theorem 1.2.10. □

Corollary 4.6 If P ∈ R_{s,x} and τ is an F_t-stopping time, then there is a P-null set N ∈ F_τ such that for w ∉ N, (M_tφ, F_t, P_{τ,w} ∘ θ_{τ(w)}^{-1}) is a martingale for t ≥ τ(w), for every φ ∈ C_b²(R^d).

Proof. The proof is obvious from Lemmas 4.4 and 4.5. □

The next two results are important for the rest of the chapter.
The first states that a control rule remains a control rule for the problem starting at a later time from the point reached at that time. The second says that if we take a control rule and at some later time switch to another control rule, then this concatenated object is still a control rule.

Proposition 4.7 (Closure under Conditioning) If P ∈ R_{s,x} and τ is a stopping time, s ≤ τ ≤ T, then there exists a P-null set N ∈ F_τ such that

P_{τ,w} ∘ θ_{τ(w)}^{-1} ∈ R_{τ(w), x(τ(w))} for w ∉ N.

Proof. Let {φ_m} be a dense subset of C_0^∞(R^d). Then for each m, (M_tφ_m − M_{t∧τ}φ_m, F_t, P) is a martingale. For w ∈ Ω, define

M_t^w φ = φ(x_t) − φ(x_{t∧τ(w)}) − ∫_{t∧τ(w)}^t Lφ(θ, x_θ, μ_θ) dθ − ∫_{t∧τ(w)}^t ∇φ(x_θ)·g(θ) dv_θ.

By Lemma 4.5 there exists a P-null set N_m ∈ F_τ such that (M_t^w φ_m, F_t, P_{τ,w}) is a martingale for w ∉ N_m, and by Lemma 4.4, (M_t^w φ_m, F_t, P_{τ,w} ∘ θ_{τ(w)}^{-1}) is a martingale for w ∉ N_m. It is obvious from the definition that

P_{τ,w} ∘ θ_{τ(w)}^{-1} ( w̄ : x̄_t = x(τ(w)), v̄_t = 0, 0 ≤ t ≤ τ(w) ) = 1.

Let N = ∪_m N_m; then P(N) = 0, and through a limit procedure we can show that P_{τ,w} ∘ θ_{τ(w)}^{-1} ∈ R_{τ(w), x(τ(w))} for w ∉ N. □

Proposition 4.8 (Closure under Concatenation) Let P ∈ R_{s,x} and let τ be a stopping time such that s ≤ τ ≤ T. If Q is a τ-transition probability such that Q_w ∈ R_{τ(w), x(τ(w))}, then P ⊗_τ Q ∈ R_{s,x}.

Proof. It is obvious that we need only show that for each φ ∈ C_b²(R^d), (M_tφ, F_t, P ⊗_τ Q) is a martingale after time s. From Remark 4.2 and the fact that Q_w ∈ R_{τ(w), x(τ(w))}, it can easily be verified that (M_tφ, F_t, δ_w ⊗_τ Q_w) is a martingale after time τ. By definition, δ_w ⊗_τ Q_w equals the regular conditional probability distribution of P ⊗_τ Q given F_τ. The proof of the proposition now follows Theorem 1.2.10 in Stroock and Varadhan [73]. □

Set Q_w = H(τ(w), x(τ(w))), where H is a measurable selector of R*; by definition, it is a τ-t.p.. We denote P ⊗_τ H = P ⊗_τ Q.

Corollary 4.9 For P ∈ R_{s,x} we have P ⊗_τ H ∈ R_{s,x}. □
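Purely as a mnemonic (notation as above; θ_τ is the shift that restarts the singular control at time τ), the two closure properties can be displayed side by side:

```latex
\begin{aligned}
\text{conditioning:}\quad & P\in\mathcal{R}_{s,x}\ \Longrightarrow\
  P_{\tau,w}\circ\theta_{\tau(w)}^{-1}\in\mathcal{R}_{\tau(w),\,x(\tau(w))}
  \quad\text{for }P\text{-a.e. }w,\\
\text{concatenation:}\quad & P\in\mathcal{R}_{s,x},\
  Q_w\in\mathcal{R}_{\tau(w),\,x(\tau(w))}\ \Longrightarrow\
  P\otimes_\tau Q\in\mathcal{R}_{s,x}.
\end{aligned}
```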
4.3 The dynamic programming principle

With the preparations of Section 4.2, we can now establish the dynamic programming principle. For a given probability P ∈ R_{s,x}, define R_τ(P) to be the set of probabilities in R_{s,x} which coincide with P up to time τ, i.e.,

R_τ(P) = { P ⊗_τ Q : Q is a τ-t.p. such that Q_w ∈ R_{τ(w), x(τ(w))} } ⊂ R_{s,x}.

We introduce the following notation: for a measurable function φ ∈ B(Σ), let

Γ_s(t, φ)(w) = ∫_s^t f(θ, x_θ, μ_θ) dθ + ∫_{[s,t)} c(θ)·dv(θ) + φ(t, x_t), w = (x, μ, v) ∈ Ω.

Theorem 4.10 (Dynamic Programming Principle)

(a) If τ is an F_t-stopping time, s ≤ τ ≤ T, and P ∈ R_{s,x}, then

E^P { ∫_s^τ f(θ, x_θ, μ_θ) dθ + ∫_{[s,τ)} c(θ)·dv_θ + W(τ, x_τ) } = inf { E^{P̄} Γ_s : P̄ ∈ R_τ(P) },  (4.8)

where Γ_s(·) is defined by (3.11).

(b) For P ∈ R_{s,x}, (Γ_s(t, W), F_t, P) is a submartingale.

(c) If s ≤ τ ≤ T, then

W(s, x) = inf { E^P Γ_s(τ, W) : P ∈ R_{s,x} }.  (4.9)

(d) (Γ_s(t, W), F_t, P) is a martingale under P if and only if P is optimal.

Proof. (a) Recall that H denotes a measurable selector of R*; therefore by (4.5) and Remark 4.2 the left hand side of (4.8) is

E^P { ∫_s^τ f(θ, x_θ, μ_θ) dθ + ∫_{[s,τ)} c(θ)·dv_θ + E^{H(τ(·), x_τ(·))} Γ_τ } = E^{P ⊗_τ H} Γ_s,

and by Proposition 4.8 we know that P ⊗_τ H ∈ R_τ(P). On the other hand, for P̄ = P ⊗_τ Q ∈ R_τ(P),

E^{P̄} Γ_s = E^P { ∫_s^τ f(θ, x_θ, μ_θ) dθ + ∫_{[s,τ)} c(θ)·dv_θ + E^{Q_·} Γ_τ } ≥ E^P { ∫_s^τ f(θ, x_θ, μ_θ) dθ + ∫_{[s,τ)} c(θ)·dv_θ + W(τ(·), x_τ(·)) },

which is the left hand side of (4.8); so the proof of (a) is completed.

(b) For any s ≤ t < t + h ≤ T, we have

E^P { Γ_s(t+h, W) − Γ_s(t, W) | F_t } = E^P { ∫_t^{t+h} f(θ, x_θ, μ_θ) dθ + ∫_{[t,t+h)} c(θ)·dv_θ + W(t+h, x_{t+h}) | F_t } − W(t, x_t) = E^{P_{t,·}} Γ_t(t+h, W) − W(t, x_t)

by Lemma 4.3(2). Now we apply (4.8) and Proposition 4.7 to get

E^P { Γ_s(t+h, W) − Γ_s(t, W) | F_t } ≥ inf { E^{P̄} Γ_t(t+h, W) : P̄ ∈ R_{t, x_t} } − W(t, x_t) = 0.

Therefore (Γ_s(t, W), F_t, P) is a submartingale.

(c) Let ν(dθ dz) be the distribution of (τ, x_τ) under P. Then

E^P W(τ, x_τ) = ∫ W(θ, z) ν(dθ dz) = ∫ J(θ, H(θ, z)) ν(dθ dz) = E^P J(τ, H(τ, x_τ)).

Note that

E^P Γ_s(τ, W) = E^P { ∫_s^τ f(θ, x_θ, μ_θ) dθ + ∫_{[s,τ)} c(θ)·dv_θ } + E^P W(τ, x_τ) = J(s, P ⊗_τ H) ≥ inf { J(s, Q) : Q ∈ R_{s,x} } = W(s, x).

On the other hand, by Proposition 4.7,

E^P W(τ, x_τ) ≤ E^P [ E^{P_{τ,·}} Γ_τ ],

so that

E^P Γ_s(τ, W) ≤ E^P { ∫_s^τ f(θ, x_θ, μ_θ) dθ + ∫_{[s,τ)} c(θ)·dv_θ } + E^P [ E^{P_{τ,·}} Γ_τ ] = J(s, P),  (4.10)

and thus (c) is proved if we take the infimum over P ∈ R_{s,x} on the right hand side.

(d) If (Γ_s(t, W), F_t, P) is a P-martingale, then

W(s, x) = E^P Γ_s(s, W) = E^P Γ_s(T, W) = E^P Γ_s = J(s, P),

because from our assumptions W(T, ·) = 0; so P is optimal. If we assume that P ∈ R_{s,x} is optimal, then by (4.10), Proposition 4.7 and Corollary 4.6,

W(s, x) ≤ E^P Γ_s(t, W) ≤ J(s, P) = W(s, x).  (4.11)

Therefore (Γ_s(t, W), F_t, P) is a submartingale with constant mean value, so it is indeed a martingale. □

4.4 Markov property

For (s, x) ∈ Σ, recall that R*_{s,x} denotes the collection of optimal control rules with initial condition (s, x), i.e.,

R*_{s,x} = { P ∈ R_{s,x} : J(s, P) = W(s, x) }.

We have shown that R*_{s,x} is nonempty, compact and convex. Similarly to Propositions 4.7 and 4.8 for R, we state the following proposition.

Proposition 4.11 (a) R* is closed under conditioning, i.e., if P ∈ R*_{s,x} and τ is an F_t-stopping time, s ≤ τ ≤ T, then there exists a P-null set N ∈ F_τ such that

P_{τ,w} ∘ θ_{τ(w)}^{-1} ∈ R*_{τ(w), x(τ(w))} for w ∉ N.

(b) R* is closed under concatenation, i.e., if P ∈ R*_{s,x}, then P ⊗_τ H ∈ R*_{s,x}, where H(·, ·) is a measurable selector of R*.

Proof. (a) From Proposition 4.7, there exists a P-null set N' ∈ F_τ such that for w ∉ N', P_{τ,w} ∘ θ_{τ(w)}^{-1} ∈ R_{τ(w), x(τ(w))}, and therefore

W(τ(w), x_{τ(w)}) ≤ E^{P_{τ,w}} Γ_{τ(w)}, ∀w ∉ N'.  (4.12)

From our assumption P ∈ R*_{s,x} and Theorem 4.10(d), we know that (Γ_s(t, W), F_t, P) is a martingale. Thus

E^P [ Γ_s(T, 0) | F_τ ] = Γ_s(τ, W).  (4.13)

By the definition of Γ_s(·, ·), (4.13) is equivalent to

E^{P_{τ,w}} Γ_{τ(w)} = W(τ(w), x_{τ(w)}) for P-a.e. w.  (4.14)

Let N'' = { w : W(τ(w), x_{τ(w)}) < E^{P_{τ,w}} Γ_{τ(w)} };
then N'' ∈ F_τ. From (4.12) and (4.14) we have P(N'') = 0. Let N = N' ∪ N''; then P(N) = 0, and for w ∉ N, P_{τ,w} ∘ θ_{τ(w)}^{-1} ∈ R*_{τ(w), x(τ(w))}.

(b) If P ∈ R*_{s,x}, then by definition and Theorem 4.10(d),

W(s, x) = J(s, P) = E^P { ∫_s^τ f(θ, x_θ, μ_θ) dθ + ∫_{[s,τ)} c(θ)·dv_θ + E^P [ ∫_τ^T f(θ, x_θ, μ_θ) dθ + ∫_{[τ,T]} c(θ)·dv_θ | F_τ ] } ≥ E^P { ∫_s^τ f(θ, x_θ, μ_θ) dθ + ∫_{[s,τ)} c(θ)·dv_θ + W(τ, x_τ) } = J(s, P ⊗_τ H).

Moreover, we know that P ⊗_τ H ∈ R_{s,x}, and therefore P ⊗_τ H ∈ R*_{s,x}. □

Let D be a countable dense subset of R_+, and let {g_i} be a countable dense subset of C_b(R^d). Then { (λ^0, λ^1, …, λ^k, g_i) : λ^j ∈ D, i ≥ 1 } is a countable set; let { (λ^0_n, …, λ^k_n, g_n) : n ≥ 1 } be an enumeration of it. Now for (s, x) ∈ Σ define successively R^0_{s,x} = R*_{s,x} and

R^n_{s,x} = arginf { E^P { ∫_s^T e^{−λ^0_n θ} g_n(x_θ) dθ + Σ_{j=1}^k ∫_{[s,T]} e^{−λ^j_n θ} dv^j_θ } : P ∈ R^{n−1}_{s,x} }.

Exactly as we have shown for R*, we can prove that for each n, R^n_{s,x} is nonempty and compact. Let R^∞_{s,x} = ∩_n R^n_{s,x}; then R^∞_{s,x} ≠ ∅, and it is compact. Further, we can state the following result:

Lemma 4.12 (a) R^∞ is stable under conditioning: if P ∈ R^∞_{s,x} and τ is an F_t-stopping time, s ≤ τ ≤ T, then there exists a P-null set N ∈ F_τ such that P_{τ,w} ∘ θ_{τ(w)}^{-1} ∈ R^∞_{τ(w), x(τ(w))} for w ∉ N.

(b) R^∞ is stable under concatenation: if P ∈ R^∞_{s,x} and H(·,·) is a measurable selector of R^∞, then for any F_t-stopping time τ with s ≤ τ ≤ T we have P ⊗_τ H ∈ R^∞_{s,x}.

Proof. The proof of Proposition 4.11 also works here without any change. □

Now we can state the main theorem of this section. It ensures that there exists a measurable selection of R^∞ whose marginal distribution on V = V^d[0,T] is a strong Markov family.

Theorem 4.13 There exists a family of control rules {P^s_x} such that P^s_x ∈ R*_{s,x} for (s, x) ∈ Σ, and {P^s_x|_V} satisfies the strong Markov property on V, where P^s_x|_V is the marginal of P^s_x on V = V^d[0,T].

Proof. We first show that all elements of R^∞_{s,x} have the same marginal distribution on V. Assume P, Q ∈ R^∞_{s,x}; then for all n,

E^P { ∫_s^T e^{−λ^0_n θ} g_n(x_θ) dθ + Σ_{j=1}^k ∫_{[s,T]} e^{−λ^j_n θ} dv^j_θ } = E^Q { ∫_s^T e^{−λ^0_n θ} g_n(x_θ) dθ + Σ_{j=1}^k ∫_{[s,T]} e^{−λ^j_n θ} dv^j_θ }.

Through a limit procedure we can conclude that

E^P { ∫_s^T e^{−λ^0 θ} g(x_θ) dθ + Σ_{j=1}^k ∫_{[s,T]} e^{−λ^j θ} dv^j_θ } = E^Q { ∫_s^T e^{−λ^0 θ} g(x_θ) dθ + Σ_{j=1}^k ∫_{[s,T]} e^{−λ^j θ} dv^j_θ }

for any λ^0, …, λ^k ∈ R_+ and g ∈ C_b(R^d). By letting λ^j → ∞, 1 ≤ j ≤ k, we have

E^P { ∫_s^T e^{−λθ} g(x_θ) dθ } = E^Q { ∫_s^T e^{−λθ} g(x_θ) dθ }

for any λ ∈ R_+. From the uniqueness of Laplace transforms and the left continuity of the function g(x_·),

E^P g(x_θ) = E^Q g(x_θ), ∀ 0 ≤ θ ≤ T.  (4.15)

It can easily be seen through a routine limit procedure that (4.15) is true for any bounded measurable function g. In order to show P|_V = Q|_V we need only show that P|_V and Q|_V have the same finite dimensional distributions, i.e.,

E^P g_1(x_{t_1}) g_2(x_{t_2}) ⋯ g_m(x_{t_m}) = E^Q g_1(x_{t_1}) g_2(x_{t_2}) ⋯ g_m(x_{t_m})  (4.16)

for any bounded measurable functions g_1, …, g_m and s ≤ t_1 < t_2 < ⋯ < t_m ≤ T. By (4.15) we know that (4.16) is true for m = 1. Suppose it is true for m. Let P_w and Q_w be the r.c.p.d. of P and Q given F_{t_m}, respectively. We will show that there is an N ∈ F_{t_m} with P(N) = Q(N) = 0 such that for w ∉ N, δ_{w'} ⊗_{t_m} P_w and δ_{w'} ⊗_{t_m} Q_w are in R^∞_{t_m, x(t_m)}, where w' = (x_{·∧t_m}, δ_0, 0). For this purpose, let P^1_w, Q^1_w denote versions of the r.c.p.d. of P and Q given F_{t_m}. It is easy to check that w ↦ δ_w ⊗_{t_m} P^1_w is again an r.c.p.d. of P, and thus

δ_w ⊗_{t_m} P^1_w = P_w for w not in a P-null set A ∈ F_{t_m}.  (4.17)

Next, we can choose a P-null set N' ∈ F_{t_m} such that (M_tφ, F_t, P_w) is a martingale after time t_m for w ∉ N' and φ ∈ C_b²(R^d). Similarly, there exists a Q-null set M_0 ∈ F_{t_m}, together with the corresponding exceptional set E for Q in place of A, such that (M_tφ, F_t, Q_w) is a martingale after time t_m for w ∉ M_0. Note that by the induction hypothesis, P = Q on σ(x_{t_1}, …, x_{t_m}). Let N_0 = A ∪ E ∪ N' ∪ M_0; then P(N_0) = Q(N_0) = 0. Moreover, (M_tφ, F_t) is a martingale after time t_m with respect to P_w and Q_w for w ∉ N_0.
Similarly to the proof of Proposition 4.11, we can show that for each n there exists a P-null set N_n ∈ F_{t_m} such that δ_{w'} ⊗_{t_m} P_w ∈ R^n_{t_m, x(t_m)} for w ∉ N_n. Let N̄ = ∪_n N_n; then P(N̄) = 0, and for w ∉ N̄, δ_{w'} ⊗_{t_m} P_w ∈ R^∞_{t_m, x(t_m)}. We can draw the same conclusion for δ_{w'} ⊗_{t_m} Q_w, i.e., there exists an M̄ with Q(M̄) = 0 such that if w ∉ M̄, then δ_{w'} ⊗_{t_m} Q_w ∈ R^∞_{t_m, x(t_m)}. Thus if we set N = N_0 ∪ N̄ ∪ M̄, then N ∈ F_{t_m}, P(N) = Q(N) = 0, and for w ∉ N we have δ_{w'} ⊗_{t_m} P_w, δ_{w'} ⊗_{t_m} Q_w ∈ R^∞_{t_m, x(t_m)}. Thus by (4.15),

E^{P_w} g_{m+1}(x_{t_{m+1}}) = E^{Q_w} g_{m+1}(x_{t_{m+1}}) for w ∉ N.

Since P = Q on σ(x_{t_1}, …, x_{t_m}), we can conclude that

E^P g_1(x_{t_1}) ⋯ g_m(x_{t_m}) g_{m+1}(x_{t_{m+1}}) = E^Q g_1(x_{t_1}) ⋯ g_m(x_{t_m}) g_{m+1}(x_{t_{m+1}}),

and thus (4.16) is proved, i.e., P|_V = Q|_V.

Finally, let P^s_x be a measurable selector of R^∞ : Σ → M₁(Ω). We show that {P^s_x} satisfies the strong Markov property. Let τ be a stopping time of the filtration σ(x_θ : 0 ≤ θ ≤ t) with s ≤ τ ≤ T, and let F be an event determined by the path of x after time τ. By Lemma 4.12(a), there exists N with P^s_x(N) = 0 such that for w ∉ N, (P^s_x)_{τ,w} ∘ θ_{τ(w)}^{-1} ∈ R^∞_{τ(w), x(τ(w))}. Therefore, since all the elements of R^∞_{τ(w), x(τ(w))} have the same marginal distribution on V, we have for w ∉ N

(P^s_x)_{τ,w}(F) = P^{τ(w)}_{x(τ(w))}(F).  (4.18)

Recall that (P^s_x)_{τ,w} is the regular conditional probability distribution of P^s_x given F_τ, and thus (4.18) means that {P^s_x|_V} is a strong Markov family. □

Chapter 5

The Value Function

5.1 Introduction

It is well known in classical control theory that when the value function has sufficient regularity it satisfies the corresponding Hamilton-Jacobi-Bellman equation; cf. Fleming and Rishel [22], Krylov [50], and Lions [55]. We will show that this is still true in the case of singular stochastic control problems, where the Hamilton-Jacobi-Bellman equation becomes a variational inequality. In Section 5.2 we prove that the value function is uniformly continuous when the coefficients are Lipschitz continuous. Note that the proof of this fact is nontrivial compared with the same result in classical control problems, due to the presence of the singular control variable.
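For orientation, the variational inequality referred to above can be recorded in its standard form (a heuristic statement only; the sign convention matches the minimization problem, g_i denotes the i-th column of g, and the terminal condition reflects the absence of a terminal cost in our formulation):

```latex
\min\Bigl\{\,\partial_t W(t,x)+\inf_{u\in U}\bigl[\mathcal{L}^u W(t,x)+f(t,x,u)\bigr],\
\min_{1\le i\le k}\bigl[c_i(t)+g_i\cdot\nabla_x W(t,x)\bigr]\Bigr\}=0,
\qquad W(T,x)=0,
```

where L^u φ = ½ tr( σσ*(t,x,u) D²φ ) + b(t,x,u)·∇φ. The second branch is the differential form of the inequality W(t,x) ≤ W(t, x+gh) + c(t)·h established in Theorem 5.4 below.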
Applying the dynamic programming principle we have established in Chapter 4, we derive the Hamilton-Jacobi-Bellman equation for our control problem heuristi cally in Section 5.3. Motivated by work of Lions [55], we will prove that the value function is the unique viscosity solution of the corresponding Hamilton-Jacobi-Bellman equation. 5.2 Continuity of the value function In the next two chapters, we add the following assumptions: • c(.) is Lipschitz continuous, • f(., ., .), b(.,.,.), u(.,.,.) f(., ., .) is bounded, and g = (g i t ) is a constant d x k-matrix, satisfy the following conditions: If(t, x, u) f(s, y, u)( G(It s + liz iib(i, x, u) CQt — — b(s, y, u)ii IIu(t, x, u) c(s, y, )li — 61 — — sj + G(It i — + liz — — yii), yii), lix VIi) — (5.1) Chapter 5. The Value Function uniformly for 0 62 c T, x, y s,t u e U. We will prove that under these conditions, the value function W(., on D. In fact, there exists a constant C W(t,x) — W(s,y)j C (it .) is uniformly continuous 0 such that — sI + lix uii) o s,t — T, x,y e Note that the constancy of g is only required in the proof of Theorem 5.2. Theorem 5.1 The value function W(s, x) is uniformly Lipschitz continuous in the state vari able x, i.e., there exists a constant C iW(s,x’) — > 0 such that Clix’— xW, W(s,x)i VO t T, x, x’ (5.2) E Proof. In the following, we use the same notation C to denote the constants, which may change from time to time. For any 0 J41(s, for each ,, P 5 Q e R. ((2,P, P , P) 1 such that (2 = — T, x, x’ C W(s, x) 8 EF — where 1’ is the cost function defined by (3.11). C Take an arbitrary P C tension x’) s By the definition of control rules, there exists a standard ex of (12, F, F , P), i.e., there exists another probability space (12’, F’, F 1 ’, P’) 1 12 x 12’, F = F x F’, F 1 x.,&,v. to 12 by the following: for = = ‘ = x.(w), 1 x Fl, and F F = P x P’. We can extend the processes (w,w’) C 12, u.(&) 1 = &(w), v.() = v.(w). On (12, F , F) there exists a standard d-dimensional Brownian motion B. 
such that for s 1 = x+ I j b(O, , po)dO + 0 x I u(8, x , zs)dBe + gv 0 1 a.s. t T, (5.3) Consider the same equation (5.3) with the initial state x’, i.e., = x’ + I I b(8, ye, jze)d9 + a(9, y, 1 )dBg + gv 9 t 1 (5.4) Chapter 5. The Value Function 63 on the stochastic basis (, F, F, F). The strong solution for (5.4) exists from the assumptions on b(.,.,.) and a(., control rule Q •, .), and so c (Q, F, F, , y, , Vt, , x’) e Therefore there exists a such that e J(a) = J(.s,Q) = . 8 EF (5.5) By definition, = E = E f(6, xo,#e)d6 + c(O) dve} . f(O, xo,o)dO + {j j c(O) dvo} and by (5.5), T 1 { 8 Er f(O, y, io)dO + E = therefore from the Lipschitz continuity of — T 8 EF ye,ie) T c — e)Id8} 0 f(, x )Iye—xej)dO CEJ < c(O) . dv } 0 f we have - 8 EF j 0 EPy — 112d8) 0 x (j Now from the equations (5.3), (5.4) and the Lipschitz continuity conditions on b, a, we have for , — 2 x011 CIjx’— x11 2 +CE CIIx’ — o + CE sup o’<e 2+ x11 f CE (‘ — 2+ x11 IIb(h,yh,h) j — b(hxh#h)Ildh) 2 (a(h, Yh, h) j — Ib(h, Yh, h) IIa(h, yh, h) +CE c (j 2 EPj)y — — 2 Xh11 a(h, Xh, nh)) dBh — b(h, Xh, h)jI d 2 h a(h, Xh, h)II dh 2 dh). Chapter 5. The Value Function 64 We have used the Burkholder-Gundy inequality to get the second inequality. By Gronwall’s inequality, — 2 xsii Clix’ Clix’ e°(°) 2 xii — — . 2 xW Hence we have 8 r 9 E and therefore W(s, x,) W(s, x) — — Clix’ Clix’ itFS — — xii, xii. The proof of the theorem is thus complete since x, x” E are arbitrary. II Next we consider the continuity of the value function in the time variable t. Theorem 5.2 The value function W(t, x) is uniformly continuous in the time variable t. In fact, there exists a constant C> 0 such that W(s, x) for all 0 s, s’ — W(s’, x)j C s — s’ (5.6) T, x C Proof. As in the proof of Theorem 5.1, we use the same notation C to denote the constants. First we assume s s’, so that W(s’, x) — W(s, x) sup (E’I’ ’ 3 — ) 5 EI’ (5.7) PER, 5 x for each Q e fl. 
From the strict positivity of c(.), we may actually take the supremum in (5.7) over a subset R,(A) for some A> 0, where = {F e R: ltiiv(T)ii A}. (5.8) For F C fl ,(A), similarly as in the proof Theorem 5.1, there exists a standard extension 8 (, .1, f, P) of (Q, F, F, F) such that x(.) is a solution to t = x+ t b(O, xe, p )d6 + 9 u(O, x, jto)dB 9 + gv 1 (5.9) Chapter 5. The Value Function 65 (, .F, F, F), on the stochastic basis = -t—s’+s, I = ILt_s’+s, = ‘ where B. is a d-dimensional Brownian motion. Define 3 Vt_ and B = B_i+ 3 for t s’. Consider the following stochastic differential equation ft x + ft b(8, y, fte)d8 + J on the stochastic basis J Si u(O, y, i2o)dB + gI’t, t s’ We know that under assumption (5.1), there i. 8 eA unique strong solution y., and by definition c = there control rule exists a (5.10) Q e R. i 8 ,, exists a Therefore such that J(Q) = J(s’,Q) = Er . 81 (5.11) Recall that I is defined by (3.11). Thus by definition T 1 { 8 EF E f(8, x )d9 + 9 , 6 c(8) dvØ} j . and by (5.11), = E = E - {[ {f Therefore, noticing that f(O, ye,o)dO + c(8) do} . T—s’+s T—s’+s 1(8+ f is 5 Er ‘ — 3 i 8 ,ye+ , o)d0+ _ c(O+ s’ j — ) . dve}. bounded below by a constant —K and P e R(A), we get T-(s’-s) - — j E {j +j E {j f(8, x, ) — f(8 + s’ — , Y6+s’-s, iio)1d8 T-(s’-s) — < < E { k(°) c(O + s’ - If(8,xs,) - +CIs’ — f(9 + s’ — sYo+si_se)Ido} sIE”)Iv(T)II — jT—s’s f(8 From the Lipschitz continuity of the function f(8 + )II dilvoll + KJs’ sI} T—(s’—s) Xe, o) +Cs’ — - f(0 + s’ — , )d8} ’- 6 3 Yo+ , si. — f, we have s’ 8,ye+s’_s,ILe)I — — C(js’ — + Ixo — Ye+s’-sII). Chapter 5. The Value Function 66 Therefore J I Cjis’—sI+ (J - — 3 Er (Is’ C T-(8’--s) line s +E — T—(s’—s) — (5.12) Yo+s1_siIdO) EFlixo_Ye+_sil2d6) - s, By (5.10), we have for 6 pO+s’—s = YO-fs’—s X + J b(h,yh,uzh)dh (5.13) 8 + J a(h, Yh, I.th)dBu + 3 ’_ 8 gii 8’ + = J S b(h+ + J 8,Yh+st_s,Ph)dh 8’— a(h + &‘ Yh+s’-s, Ph)dBh + yvo. 
So from (5.9), (5.13) and the Burkholder–Davis–Gundy inequality we have, for s ≤ θ ≤ T − (s′ − s),

  E^P̄ ‖x_θ − y_{θ+s′−s}‖² ≤ 2 E^P̄ ( ∫_s^θ ‖b(h, x_h, μ_h) − b(h + s′ − s, y_{h+s′−s}, μ_h)‖ dh )² + 2 E^P̄ sup_{s≤θ′≤θ} ‖ ∫_s^{θ′} ( σ(h, x_h, μ_h) − σ(h + s′ − s, y_{h+s′−s}, μ_h) ) dB_h ‖²
    ≤ 2T E^P̄ ∫_s^θ ‖b(h, x_h, μ_h) − b(h + s′ − s, y_{h+s′−s}, μ_h)‖² dh + C E^P̄ ∫_s^θ ‖σ(h, x_h, μ_h) − σ(h + s′ − s, y_{h+s′−s}, μ_h)‖² dh
    ≤ C |s′ − s|² + C ∫_s^θ E^P̄ ‖x_h − y_{h+s′−s}‖² dh,

where we have used the Lipschitz continuity of b and σ in both the time and space variables. Gronwall's inequality implies

  E^P̄ ‖x_θ − y_{θ+s′−s}‖² ≤ C |s′ − s|² e^{Cθ} ≤ C |s′ − s|².

Hence from (5.12) we have | E^P̄ Γ_s − E^P̄ Γ̃_{s′} | ≤ C|s′ − s|, and therefore

  W(s′, x) − W(s, x) ≤ C |s′ − s|  for s′ > s.

Now we assume s′ < s. By the dynamic programming principle (cf. Theorem 4.10),

  W(s′, x) = inf_{P ∈ R_{s′,x}} E^P { ∫_{s′}^s f(θ, x_θ, μ_θ) dθ + ∫_{s′}^s c(θ)·dv_θ + W(s, x_s) }.

Take P⁰ ∈ R_{s′,x} such that P⁰( μ_θ = δ_{u⁰}, v_θ = 0, s′ ≤ θ ≤ T ) = 1 for some arbitrary but fixed u⁰ ∈ U. Then

  W(s′, x) ≤ E^{P⁰} { ∫_{s′}^s f(θ, x_θ, u⁰) dθ + W(s, x_s) } ≤ C|s′ − s| + E^{P⁰} W(s, x_s)

by the boundedness of f, and by Theorem 5.1 there exists a constant C such that

  W(s, x_s) ≤ W(s, x) + C ‖x_s − x‖.

Hence

  W(s′, x) − W(s, x) ≤ C ( |s′ − s| + E^{P⁰} ‖x_s − x‖ ) ≤ C ( |s′ − s| + ( E^{P⁰} ‖x_s − x‖² )^{1/2} ).   (5.14)

From the definition of control rules, we know that under P⁰,

  x_s − x = ∫_{s′}^s b(θ, x_θ, u⁰) dθ + M_s,   (5.15)

where M is a continuous square integrable martingale with

  ⟨M⟩_s = ∫_{s′}^s tr( a(θ, x_θ, u⁰) ) dθ.

Therefore by (5.15) and the Burkholder–Davis–Gundy inequality, we have

  E^{P⁰} ‖x_s − x‖² ≤ C ( |s′ − s|² + |s′ − s| ).   (5.16)

Combining (5.14) and (5.16) we have

  W(s′, x) − W(s, x) ≤ C |s′ − s|^{1/2},

where we have used the inequality √(a + b) ≤ √a + √b for a, b ≥ 0, and the fact that 0 ≤ s, s′ ≤ T with T finite. The theorem is thus proved. □

Now combining Theorems 5.1 and 5.2 we can state the main result of this section.
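Before stating it, a brief numerical aside (ours, not part of the thesis): both proofs above close with Gronwall's inequality — if 0 ≤ φ(t) ≤ a + C ∫_s^t φ(h) dh on [s, T], then φ(t) ≤ a e^{C(t−s)}. The sketch below checks the bound on the extremal discrete case φ′ = Cφ; the grid and constants are illustrative assumptions.

```python
import math

# Gronwall's inequality: if 0 <= phi(t) <= a + C * int_s^t phi(h) dh on [s, T],
# then phi(t) <= a * exp(C * (t - s)).  We check it on the extremal discrete
# case phi' = C * phi, phi(s) = a, discretized by Euler steps.

def gronwall_bound(a, C, ts):
    """Right-hand side a * exp(C * (t - s)) of Gronwall's inequality."""
    return [a * math.exp(C * (t - ts[0])) for t in ts]

def phi_extremal(a, C, ts):
    """Euler discretization of phi' = C * phi, the extremal case of the bound."""
    phi = [a]
    for i in range(1, len(ts)):
        phi.append(phi[-1] * (1.0 + C * (ts[i] - ts[i - 1])))
    return phi

ts = [i * 0.01 for i in range(101)]        # grid on [0, 1]
a, C = 2.0, 3.0
phi = phi_extremal(a, C, ts)
bound = gronwall_bound(a, C, ts)
# The discrete solution never exceeds the exponential bound.
assert all(p <= b + 1e-9 for p, b in zip(phi, bound))
```

The inequality is robust under discretization because each Euler factor satisfies 1 + C Δt ≤ e^{C Δt}, so the product over all steps stays below the exponential.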
Theorem 5.3 The value function W is uniformly continuous on [0, T] × ℝ^d. Moreover, there exists a constant C ≥ 0 such that

  |W(t, x) − W(s, y)| ≤ C ( |t − s|^{1/2} + ‖x − y‖ ),  0 ≤ s, t ≤ T, x, y ∈ ℝ^d.

5.3 The dynamic programming equation

Before we derive the dynamic programming equation heuristically, we prove a result which shows that there exists a set such that the optimal state process is continuous when it is in this set.

Theorem 5.4 (a) Assume (t, x) ∈ [0, T] × ℝ^d; then

  W(t, x) ≤ W(t, x + gh) + c(t)·h   (5.17)

for each h ∈ ℝ₊^k. Moreover, if equality holds for some h = (h_i) ∈ ℝ₊^k, then the same equality holds when we replace h by h̃ = (h̃_i) ∈ ℝ₊^k with h̃_i ≤ h_i (1 ≤ i ≤ k).

(b) Define, for 0 ≤ t ≤ T,

  A_t = { x : W(t, x) < W(t, x + gh) + c(t)·h, ∀ h ∈ ℝ₊^k, h ≠ 0 }.   (5.18)

Then the optimal state process x_t is continuous when it is in A_t. To be precise, we have

  P( Δx_t ≠ 0, x_t ∈ A_t ) = 0,  s < t ≤ T,   (5.19)

for every P ∈ R*_{s,x}, (s, x) ∈ [0, T] × ℝ^d.

Proof. (a) If (5.17) fails for some h ∈ ℝ₊^k, then

  W(t, x) > W(t, x + gh) + c(t)·h.   (5.20)

Take P ∈ R*_{t, x+gh}. We define Φ : Ω → Ω, for ω = (x., μ., v.), by

  Φ(ω) = (x̄., μ., v̄.),  (x̄_θ, v̄_θ) = ( x_θ − gh, v_θ ) for 0 ≤ θ ≤ t,  (x̄_θ, v̄_θ) = ( x_θ, v_θ + h ) for t < θ ≤ T,

and let P̄ = P ∘ Φ^{−1}. As in the proof of Lemma 4.4, we can show that P̄ ∈ R_{t,x}. From the definition of P̄ we have

  J(t, P̄) = E^P̄ { ∫_t^T f(θ, x_θ, μ_θ) dθ + ∫_t^T c(θ)·dv_θ } = E^P { ∫_t^T f(θ, x_θ, μ_θ) dθ + ∫_t^T c(θ)·dv_θ } + c(t)·h = J(t, P) + c(t)·h = W(t, x + gh) + c(t)·h.

Therefore J(t, P̄) < W(t, x) from (5.20), a contradiction.

Next, if (5.17) holds as an equality for h, then for h̃ with h̃_i ≤ h_i, 1 ≤ i ≤ k, applying (5.17) at the point x + gh̃ with the increment h − h̃ gives

  W(t, x + gh̃) ≤ W(t, x + gh) + c(t)·(h − h̃).

Hence

  W(t, x) = W(t, x + gh) + c(t)·h ≥ W(t, x + gh̃) + c(t)·h̃ ≥ W(t, x),

where the last inequality is (5.17) for h̃. Therefore

  W(t, x) = W(t, x + gh̃) + c(t)·h̃.

(b) For P ∈ R_{s,x} we know that, P-a.s.,

  x_t = x + ∫_s^t b(θ, x_θ, μ_θ) dθ + g v_t + a continuous local martingale.

Hence Δx_t = g Δv_t, and x_{t+} = x_t + Δx_t = x_t + g Δv_t. Since W(·,·) is continuous, for s ≤ t ≤ T,

  W(t, x_t) = lim_{t′ ↓ t} W(t′, x_{t′}).   (5.21)

Assume P ∈ R*_{s,x}; then by the dynamic programming principle (cf.
Theorem 4.10) we know that ( Γ_t(s, W), F_t, P ) is a martingale, where

  Γ_t(s, W) = ∫_s^t f(θ, x_θ, μ_θ) dθ + ∫_s^t c(θ)·dv_θ + W(t, x_t).

Hence for t′ > t,

  W(s, x) = E^P { ∫_s^{t′} f(θ, x_θ, μ_θ) dθ + ∫_s^{t′} c(θ)·dv_θ + W(t′, x_{t′}) }.   (5.22)

Let t′ ↓ t; note that from our assumption we know that c(·) is continuous on [0, T]. Therefore

  ∫_s^{t′} f(θ, x_θ, μ_θ) dθ → ∫_s^t f(θ, x_θ, μ_θ) dθ,  ∫_s^{t′} c(θ)·dv_θ → ∫_s^t c(θ)·dv_θ + c(t)·Δv_t.

Thus if (5.19) fails, then (5.21) and (5.22) imply

  W(s, x) = E^P { ∫_s^t f(θ, x_θ, μ_θ) dθ + ∫_s^t c(θ)·dv_θ + c(t)·Δv_t + W(t, x_t + gΔv_t) } > E^P { ∫_s^t f(θ, x_θ, μ_θ) dθ + ∫_s^t c(θ)·dv_θ + W(t, x_t) } = E^P Γ_t(s, W),

which contradicts the fact P ∈ R*_{s,x}. The inequality follows from the fact that

  W(t, x_t) ≤ W(t, x_t + gΔv_t) + c(t)·Δv_t,

and the strict inequality holds if x_t ∈ A_t and Δv_t ≠ 0. □

Remark 5.5 From Theorem 5.4 (a) we can see that if the value function W ∈ C^{1,2}, then

  ( g*∇_x W(t, x) )_i + c_i(t) ≥ 0,  i = 1, 2, ..., k,

where * means transpose and (·)_i denotes the i-th coordinate of a point in ℝ^k. For x ∉ A_t there exists h = (h_i) ∈ ℝ₊^k such that

  W(t, x) = W(t, x + gh) + c(t)·h.

Therefore we have

  ( g*∇_x W(t, x) )_i + c_i(t) = 0

for those i such that h_i > 0. □

5.3.1 Heuristic derivation of the dynamic programming equation

Recall Itô's formula: for φ ∈ C^{1,2}, (s, x) ∈ [0, T] × ℝ^d, s ≤ t ≤ T,

  φ(t, x_t) = φ(s, x) + ∫_s^t (Lφ)(θ, x_θ, μ_θ) dθ + ∫_s^t ∇_xφ(θ, x_θ)·σ(θ, x_θ, μ_θ) dB_θ + ∫_s^t ∇_xφ(θ, x_θ)·g dv_θ + Σ_{s≤θ<t} [ φ(θ, x_{θ+}) − φ(θ, x_θ) − ∇_xφ(θ, x_θ)·Δx_θ ],   (5.23)

where

  (Lφ)(t, x, u) = ∂φ/∂t (t, x) + ½ tr( σ(t, x, u)σ(t, x, u)* D_x²φ(t, x) ) + ⟨ b(t, x, u), ∇_xφ(t, x) ⟩.

Then for P ∈ R_{s,x}, (5.23) may be written as

  E^P φ(t, x_t) = φ(s, x) + E^P { ∫_s^t (Lφ)(θ, x_θ, μ_θ) dθ + ∫_s^t ∇_xφ(θ, x_θ)·g dv_θ + Σ_{s≤θ<t} [ φ(θ, x_{θ+}) − φ(θ, x_θ) − ∇_xφ(θ, x_θ)·Δx_θ ] }.

By the dynamic programming principle (cf. Theorem 4.10),

  W(s, x) = inf_{P ∈ R_{s,x}} E^P { ∫_s^t f(θ, x_θ, μ_θ) dθ + ∫_s^t c(θ)·dv_θ + W(t, x_t) },

so if we assume W ∈ C^{1,2}, then

  0 = inf_{P ∈ R_{s,x}} E^P { ∫_s^t (LW + f)(θ, x_θ, μ_θ) dθ + ∫_s^t ( c(θ) + g*∇_xW(θ, x_θ) )·dv_θ + Σ_{s≤θ<t} [ W(θ, x_{θ+}) − W(θ, x_θ) − ∇_xW(θ, x_θ)·Δx_θ ] }.   (5.24)

If we take the infimum over all those P ∈ R_{s,x} such that P( v_θ = 0, s ≤ θ ≤ T ) = 1, we get

  inf E^P ∫_s^t (LW + f)(θ, x_θ, μ_θ) dθ ≥ 0.

Letting t ↓ s, we have

  inf_{u∈U} (LW + f)(s, x, u) ≥ 0.

Moreover, from Remark 5.5, we can conclude that on [0, T] × ℝ^d,

  ( g*∇_x W )_i + c_i ≥ 0,  i = 1, 2, ..., k.
(5.25)

Therefore we can expect that W satisfies formally the following variational inequality, or Hamilton–Jacobi–Bellman equation,

  min { inf_{u∈U} (LW + f)(t, x, u), ( g*∇_xW(t, x) )_i + c_i(t), i = 1, 2, ..., k } = 0   (5.26)

on [0, T] × ℝ^d. For simplicity of notation, we write (5.26) as

  min { inf_{u∈U} (LW + f), g*∇_xW + c } = 0.

5.3.2 Viscosity solution

As is well known in classical control problems, the value function is a solution to the corresponding Hamilton–Jacobi–Bellman equation when it has sufficient regularity (cf. Fleming and Rishel [22], Krylov [50]). If it is only known that the value function is continuous, it was observed by Lions [55] that the value function is a solution to the Hamilton–Jacobi–Bellman equation in the viscosity sense, which we define in the following.

Definition 5.6 A function ψ ∈ C([0, T] × ℝ^d) is a viscosity solution of (5.26) if

1. for every φ ∈ C^{1,2} and each local maximum point (t₀, x₀) of ψ − φ in (0, T) × ℝ^d, we have

  min { inf_{u∈U} (Lφ + f), g*∇_xφ + c } ≥ 0   (5.27)

at (t₀, x₀), i.e., ψ is a subsolution;

2. for every φ ∈ C^{1,2} and each local minimum point (t₀, x₀) of ψ − φ in (0, T) × ℝ^d, we have

  min { inf_{u∈U} (Lφ + f), g*∇_xφ + c } ≤ 0   (5.28)

at (t₀, x₀), i.e., ψ is a supersolution.

For an introduction to viscosity solutions and their applications to stochastic optimal control problems, see Fleming and Soner [23]. For an extensive bibliography on viscosity solutions, cf. Crandall, Ishii and Lions [15].

Theorem 5.7 The value function W(·,·) is a viscosity solution of (5.26).

Proof. By Theorem 5.3, we know W ∈ C([0, T] × ℝ^d). We first show that W is a subsolution. For φ ∈ C^{1,2}, if (t₀, x₀) ∈ (0, T) × ℝ^d is a local maximum point of W − φ, then there is a neighborhood O₁(t₀, x₀) of (t₀, x₀) such that

  W(t, x) − φ(t, x) ≤ W(t₀, x₀) − φ(t₀, x₀),  (t, x) ∈ Ō₁(t₀, x₀),

or

  W(t, x) − W(t₀, x₀) ≤ φ(t, x) − φ(t₀, x₀),  (t, x) ∈ Ō₁(t₀, x₀),   (5.29)

where Ō₁(t₀, x₀) denotes the closure of O₁(t₀, x₀).
If (5.27) fails, then one of the following will 1 be true, inf( + f)(to, o, u) < 0, (5.30) 0 ( 5 (g*Vq t x)) + e (to) < 0 for some 1 t If (5.30) is true, then from the assumption (5.1), and gi c i k. (E), we can find u 2 C” 0 (5.31) e U and a neighborhood 0 (t so) of (t 2 , , so) such that 0 (4 + f)(t, x, u0) < for (t, s) 0 ( to, so). Take F e fl e2 , such that 0 = The existence of such a P e 6{uO}, Vt. = 0, 0 r T) = is obvious. Define r = inf{t > to, (t, xi) 0 O(to, xo)}, 1. (5.32) Chapter 5. The Value Function where 0 OQ , xo) (to,xo) 1 0 = 74 Qto,xo). Since the state process x. is 2 nO continuous a.s.(F), we can see immediately that F(r > to) = 1, 0 < r, and for to (4+ f)(O, zo, p) < 0, a.s.(F). Therefore, f)(O,xg,o)d0 < 0. By the definition of control rules, (r, x) where Mçb e = (to, xo) + f 4(0, x, 9 )d9 + M , a.s.(F), 7 M, i.e., a continuous square integrable martingale with respect to F. Note that Stroock and Varadhan [73], Theorem 4.2.1 allows us to replace £ by v = L in (2.9) at least when 0. Hence E(r, Noticing that (r, x) XT) e O(to, xo), E”W(r, x) — (to, xo) = E j 4(0, x, 9 )d0. we have — W(to, xo) E’(r, x ) 7 — (to, xo) = EFf4(0,xO,e)d0 < —E L f(0, Xe, o)d0, which, by (5.32), can be rewritten as W(to, x ) 0 > E {j f(0, xo,o)d0 + W(r, XT)} = This contradicts the dynamic programming principle, cf. (4.9). Next, if (5.31) holds at (t ) for some i, then we can take h 0 ,x 0 t > 0 small enough such that (to, xo + gh ) 1 — (to, xo) < —c(to)h , t Chapter 5. The Value Function 75 where g 1 denotes the i-th column of the d x Ic matrix g. Therefore by (5.29) we have l’V(to, x + gh) — , xo) < —c(to)h 0 W(t , 3 and therefore W(to, xo) > W(to, xo + gh) + c(to) where h = (0,.. , hi,. , 0). This is a contradiction to Theorem 5.4, and thus we have shown that W is a subsolution to (5.26) e Now we show that W is also a supersolution of (5.26). 
If W − φ, with φ ∈ C^{1,2}, has a local minimum point at (t₀, x₀) ∈ (0, T) × ℝ^d, then there exists a neighborhood O′(t₀, x₀) of (t₀, x₀) satisfying

  W(t, x) − W(t₀, x₀) ≥ φ(t, x) − φ(t₀, x₀),  (t, x) ∈ O′(t₀, x₀).   (5.33)

If (5.28) fails, then

  inf_{u∈U} (Lφ + f) > 0,  ( g*∇_xφ )_i + c_i > 0

at (t₀, x₀) for i = 1, ..., k. From the assumption (5.1) and the fact φ ∈ C^{1,2}, we can find a constant δ > 0 and a neighborhood O₂(t₀, x₀) of (t₀, x₀) such that

  inf_{u∈U} (Lφ + f) > δ,  ( g*∇_xφ )_i + c_i > δ

on O₂(t₀, x₀) for i = 1, ..., k. Let O(t₀, x₀) = O′(t₀, x₀) ∩ O₂(t₀, x₀); then for (t, x) ∈ O(t₀, x₀) we have, for small h ∈ ℝ₊^k, h ≠ 0,

  φ(t, x + gh) − φ(t, x) > −c(t)·h.

Therefore by (5.33),

  W(t, x + gh) − W(t, x) > −c(t)·h,

or x ∈ A_t. Hence for P ∈ R*_{t₀,x₀} the state has no jumps while it stays in O(t₀, x₀), by Theorem 5.4. Define

  τ = inf { t ≥ t₀ : (t, x_t) ∉ O(t₀, x₀) };   (5.34)

then from (5.34) we see that P(τ > t₀) = 1 for P ∈ R*_{t₀,x₀}, and it can be seen that

  (t, x_t) ∈ O(t₀, x₀),  x_t ∈ A_t,  t₀ ≤ t < τ.

Therefore we have

  E^P { ∫_{t₀}^τ (Lφ + f)(θ, x_θ, μ_θ) dθ + ∫_{t₀}^τ ( g*∇_xφ(θ, x_θ) + c(θ) )·dv_θ } ≥ δ E^P (τ − t₀).

Applying Itô's formula, noticing that the state process x. is continuous a.s. (P) when t₀ ≤ t ≤ τ, we have

  E^P φ(τ, x_τ) = φ(t₀, x₀) + E^P { ∫_{t₀}^τ (Lφ)(θ, x_θ, μ_θ) dθ + ∫_{t₀}^τ ∇_xφ(θ, x_θ)·g dv_θ },

which may be rewritten as

  E^P [ φ(τ, x_τ) − φ(t₀, x₀) ] ≥ E^P { −∫_{t₀}^τ f(θ, x_θ, μ_θ) dθ − ∫_{t₀}^τ c(θ)·dv_θ } + δ E^P (τ − t₀).

By (5.33) and the fact P(τ > t₀) > 0 we have

  E^P [ W(τ, x_τ) − W(t₀, x₀) ] > E^P { −∫_{t₀}^τ f(θ, x_θ, μ_θ) dθ − ∫_{t₀}^τ c(θ)·dv_θ },

or

  W(t₀, x₀) < E^P { ∫_{t₀}^τ f(θ, x_θ, μ_θ) dθ + ∫_{t₀}^τ c(θ)·dv_θ + W(τ, x_τ) },

which contradicts the dynamic programming principle. The proof of this theorem is therefore complete. □

5.4 The uniqueness of viscosity solution to the HJB equation

We now consider the uniqueness of viscosity solution to the dynamic programming equation

  min { inf_{u∈U} (LW + f), g*∇_xW + c } = 0   (5.35)

with the boundary condition

  W(T, x) = g(x),   (5.36)
where g is a bounded Lipschitz continuous function on ℝ^d. The equation (5.35) may be rewritten as

  min { ∂W/∂t + H(t, x, D_xW, D_x²W), g*D_xW + c } = 0,   (5.37)

where

  H(t, x, p, A) = inf_{u∈U} { ½ tr( σ(t, x, u)σ(t, x, u)* A ) + ⟨ b(t, x, u), p ⟩ + f(t, x, u) }

for (t, x) ∈ [0, T] × ℝ^d, p ∈ ℝ^d, and A ∈ S^d, the set of symmetric d × d matrices. Note that in (5.37), D_xW and D_x²W denote the gradient and Hessian of the function W respectively.

Let us introduce some notation: if w : (0, T) × ℝ^d → ℝ, then we define P^{2,+}w(t, x) to be the collection of all the points (c, p, X) ∈ ℝ × ℝ^d × S^d such that

  w(s, z) ≤ w(t, x) + c(s − t) + ⟨ p, z − x ⟩ + ½ ⟨ X(z − x), z − x ⟩ + o( |s − t| + ‖z − x‖² ) as (s, z) → (t, x).

With the above notation, for (t, x) ∈ (0, T) × ℝ^d we set

  P̄^{2,+}w(t, x) = { (c, p, X) : ∃ (s_n, z_n) ∈ (0, T) × ℝ^d and (c_n, p_n, X_n) ∈ P^{2,+}w(s_n, z_n) such that (s_n, z_n) → (t, x), w(s_n, z_n) → w(t, x), (c_n, p_n, X_n) → (c, p, X) },

and

  P^{2,−}w = −P^{2,+}(−w),  P̄^{2,−}w = −P̄^{2,+}(−w).

Obviously we have, for any constant λ > 0,

  P^{2,+}(λw) = λ P^{2,+}w,  P̄^{2,+}(λw) = λ P̄^{2,+}w.

The following lemma can be proved easily by the method in Crandall, Ishii, and Lions [15].

Lemma 5.8 A continuous function W defined on [0, T] × ℝ^d is a viscosity subsolution (supersolution, respectively) of (5.37) if and only if for all (t, x) ∈ (0, T) × ℝ^d,

  min { c + H(t, x, p, X), g*p + c(t) } ≥ 0,  ∀ (c, p, X) ∈ P̄^{2,+}W(t, x)

  ( ≤ 0, ∀ (c, p, X) ∈ P̄^{2,−}W(t, x), respectively ).

We write down the next lemma, which is crucial for the proof of the main theorem of this section. It is a special case of Theorem 8.3 in Crandall, Ishii, and Lions [15].

Lemma 5.9 Let V₁ and V₂ be continuous functions defined on (0, T) × ℝ^d, and φ ∈ C^{1,2}( (0, T) × ℝ^{2d} ).
Suppose that t̄ ∈ (0, T), x̄₁, x̄₂ ∈ ℝ^d, and that

  w(t, x₁, x₂) ≡ V₁(t, x₁) + V₂(t, x₂) − φ(t, x₁, x₂),  0 < t < T, x₁, x₂ ∈ ℝ^d,

attains its maximum at (t̄, x̄₁, x̄₂). Assume, moreover, that there exists an r > 0 such that for every M > 0 there exists a constant C satisfying

  c_i ≤ C whenever (c_i, q_i, X_i) ∈ P^{2,+}V_i(t, x_i), ‖x_i − x̄_i‖ + |t − t̄| ≤ r, |V_i(t, x_i)| + ‖q_i‖ + ‖X_i‖ ≤ M, i = 1, 2.   (5.38)

Then for each ε > 0 there are c_i ∈ ℝ, X_i ∈ S^d, i = 1, 2, such that

(i) ( c_i, D_{x_i}φ(t̄, x̄₁, x̄₂), X_i ) ∈ P̄^{2,+}V_i(t̄, x̄_i), i = 1, 2;

(ii) −( 1/ε + ‖A‖ ) I ≤ ( X₁ 0 ; 0 X₂ ) ≤ A + εA²;

(iii) c₁ + c₂ = ∂φ/∂t (t̄, x̄₁, x̄₂),

where A = D_x²φ(t̄, x̄₁, x̄₂) ∈ S^{2d}. Observe that the condition (5.38) is automatically satisfied if the V_i are subsolutions of (5.37).

Let us define the function space

  C_L = { W(·,·) : W ∈ C([0, T] × ℝ^d; ℝ), W bounded, |W(t, x) − W(t, y)| ≤ C‖x − y‖ for all t, x, y and some C ≥ 0 }.

Now we state the main theorem of this section.

Theorem 5.10 The dynamic programming equation (5.35) has at most one viscosity solution in the space C_L satisfying the boundary condition (5.36).

Proof. Let W, V ∈ C_L be two viscosity solutions of the equation (5.35) with the boundary condition (5.36). We will first show that W ≤ V. For any given ε, 0 < ε < 1, define on (ε, T] × ℝ^d

  W_ε(t, x) = (1 − ε) W(t, x) − ε/(t − ε);   (5.39)

then, since ∂/∂t [ −ε/(t − ε) ] = ε/(t − ε)² > 0, W_ε is a viscosity subsolution of the dynamic programming equation. For any given α, β > 0, 0 < ε < 1, define an auxiliary function on (ε, T] × ℝ^d × ℝ^d:

  Φ(t, x, y) = W_ε(t, x) − V(t, y) − ‖x − y‖²/α + β(t − T).

It is easy to see that the function Φ is bounded above. There exists (t_α, x_α, y_α) such that

  Φ(t_α, x_α, y_α) > sup Φ − α.   (5.40)

Define

  σ_α(t, x, y) = Φ(t, x, y) − α [ |t − t_α|² + ‖x − x_α‖² + ‖y − y_α‖² ].

It is obvious that the maximum of the function σ_α is attained at some point (t̂, x̂, ŷ), which depends implicitly on α, ε, and β, and, since σ_α(t̂, x̂, ŷ) ≥ σ_α(t_α, x_α, y_α) = Φ(t_α, x_α, y_α) > sup Φ − α ≥ Φ(t̂, x̂, ŷ) − α,

  |t̂ − t_α|² + ‖x̂ − x_α‖² + ‖ŷ − y_α‖² < 1.   (5.41)

We first show that, for any given ε > 0, when α is small, ‖x̂ − ŷ‖ ≤ Cα.
(5.42)

Since σ_α attains its maximum at (t̂, x̂, ŷ), we have

  σ_α(t̂, x̂, x̂) + σ_α(t̂, ŷ, ŷ) ≤ 2 σ_α(t̂, x̂, ŷ).   (5.43)

Recall that from the definition of σ_α we have

  σ_α(t̂, x̂, ŷ) = W_ε(t̂, x̂) − V(t̂, ŷ) + β(t̂ − T) − ‖x̂ − ŷ‖²/α − α [ |t̂ − t_α|² + ‖x̂ − x_α‖² + ‖ŷ − y_α‖² ],   (5.44)

  σ_α(t̂, x̂, x̂) = W_ε(t̂, x̂) − V(t̂, x̂) + β(t̂ − T) − α [ |t̂ − t_α|² + ‖x̂ − x_α‖² + ‖x̂ − y_α‖² ],   (5.45)

  σ_α(t̂, ŷ, ŷ) = W_ε(t̂, ŷ) − V(t̂, ŷ) + β(t̂ − T) − α [ |t̂ − t_α|² + ‖ŷ − x_α‖² + ‖ŷ − y_α‖² ].   (5.46)

Using (5.41) and the facts

  ‖x̂ − y_α‖² ≤ 2‖x̂ − ŷ‖² + 2‖ŷ − y_α‖²,  ‖ŷ − x_α‖² ≤ 2‖x̂ − ŷ‖² + 2‖x̂ − x_α‖²,

we can get

  ‖x̂ − y_α‖² + ‖ŷ − x_α‖² ≤ 4‖x̂ − ŷ‖² + 4.   (5.47)

Now (5.43) and (5.44), (5.45), (5.46) and (5.47) lead to

  (2/α)‖x̂ − ŷ‖² ≤ ( W_ε(t̂, x̂) − W_ε(t̂, ŷ) ) + ( V(t̂, x̂) − V(t̂, ŷ) ) + α [ ‖x̂ − y_α‖² + ‖ŷ − x_α‖² ] ≤ C‖x̂ − ŷ‖ + 4α‖x̂ − ŷ‖² + 4α,

where we used the Lipschitz continuity of W_ε and V in the space variable. For α small this gives (1/α)‖x̂ − ŷ‖² ≤ C‖x̂ − ŷ‖ + 4α, which implies (5.42).

Now let us assume for the moment that the maximum of the function σ_α is always attained at a point with t̂ = T, i.e.,

  σ_α(t, x, y) ≤ σ_α(T, x̂, ŷ),  ∀ (t, x, y).

Note that the boundary condition (5.36) implies that W(T, ·) = g(·), V(T, ·) = g(·). Recall that g is bounded and Lipschitz continuous on ℝ^d, and by (5.42) we have lim_{α→0} ‖x̂ − ŷ‖ = 0. Thus

  limsup_{α→0} ( W_ε(T, x̂) − V(T, ŷ) ) ≤ limsup_{α→0} ( (1 − ε) g(x̂) − g(ŷ) ) ≤ (1 − ε) limsup_{α→0} ( g(x̂) − g(ŷ) ) + ε limsup_{α→0} ( −g(ŷ) ) ≤ Cε,

and therefore, for some constant C,

  W_ε(t, x) − V(t, x) + β(t − T) = Φ(t, x, x) ≤ sup Φ < Φ(t_α, x_α, y_α) + α ≤ σ_α(t̂, x̂, ŷ) + α ≤ W_ε(T, x̂) − V(T, ŷ) + α,

so that, letting α → 0,

  W_ε(t, x) − V(t, x) + β(t − T) ≤ Cε,

where we have used the fact (5.40). Recall that W_ε is defined by (5.39), so we have

  W(t, x) − V(t, x) = ( W_ε(t, x) + ε/(t − ε) )/(1 − ε) − V(t, x) ≤ C′ε + ( β(T − t) + ε/(t − ε) )/(1 − ε),   (5.48)

where C′ is a constant, using the boundedness of V. Since ε, β > 0 are arbitrary, we can let ε, β → 0 to get

  W(t, x) ≤ V(t, x)   (5.49)

for any (t, x) ∈ (0, T] × ℝ^d, and from the continuity of the functions W and V we can conclude that (5.49) is true on [0, T] × ℝ^d. Interchanging the roles of the two functions and repeating the proof, we have V(t, x) ≤ W(t, x), and therefore W = V.
Therefore the proof of the theorem will be complete if we can show that t̂ = T when ε, α are small and β > 0. If t̂ < T, we can apply Lemma 5.9 with V₁ = W_ε, V₂ = −V, and

  φ(t, x, y) = ‖x − y‖²/α + α [ |t − t_α|² + ‖x − x_α‖² + ‖y − y_α‖² ] − β(t − T)

to get c₁, c₂ ∈ ℝ and X₁, X₂ ∈ S^d such that Lemma 5.9 (i), (ii), and (iii) are satisfied. From the definition of φ we have

  D_xφ(t̂, x̂, ŷ) = (2/α)(x̂ − ŷ) + 2α(x̂ − x_α),
  D_yφ(t̂, x̂, ŷ) = −(2/α)(x̂ − ŷ) + 2α(ŷ − y_α),
  ∂φ/∂t (t̂, x̂, ŷ) = 2α(t̂ − t_α) − β,

and

  A = D²_{(x,y)}φ(t̂, x̂, ŷ) = (2/α) ( I −I ; −I I ) + 2α I_{2d},   (5.50)

where I is the d × d unit matrix and I_{2d} the 2d × 2d unit matrix. Therefore, by Lemma 5.9 (iii),

  c₁ + c₂ = −β + 2α(t̂ − t_α).   (5.51)

Notice that Lemma 5.9 (i) can be rewritten as

  ( c₁, D_xφ(t̂, x̂, ŷ), X₁ ) ∈ P̄^{2,+}W_ε(t̂, x̂),  ( −c₂, −D_yφ(t̂, x̂, ŷ), −X₂ ) ∈ P̄^{2,−}V(t̂, ŷ).

From the definition of viscosity solutions (cf. Lemma 5.8), we have

  min { c₁ + (1 − ε) H( t̂, x̂, (1−ε)^{−1} D_xφ(t̂, x̂, ŷ), (1−ε)^{−1} X₁ ), g*D_xφ(t̂, x̂, ŷ) + (1 − ε) c(t̂) } ≥ 0,   (5.52)

  min { −c₂ + H( t̂, ŷ, −D_yφ(t̂, x̂, ŷ), −X₂ ), −g*D_yφ(t̂, x̂, ŷ) + c(t̂) } ≤ 0.   (5.53)

The inequality (5.52) may be rewritten as

  c₁ + (1 − ε) H( t̂, x̂, (1−ε)^{−1} D_xφ(t̂, x̂, ŷ), (1−ε)^{−1} X₁ ) ≥ 0,   (5.54)

  g*D_xφ(t̂, x̂, ŷ) + (1 − ε) c(t̂) ≥ 0,   (5.55)

where in (5.55) the inequality holds componentwise, and we use the same convention in what follows. Now we show that

  −g*D_yφ(t̂, x̂, ŷ) + c(t̂) > 0.   (5.56)

If (5.56) is not true, then for some i,

  ( −g*D_yφ(t̂, x̂, ŷ) )_i + c_i(t̂) ≤ 0,   (5.57)

where (·)_i denotes the i-th coordinate. Noting that

  D_xφ(t̂, x̂, ŷ) + D_yφ(t̂, x̂, ŷ) = 2α(x̂ − x_α) + 2α(ŷ − y_α),

add the i-th inequality in (5.55) to (5.57) to get

  2α ( g*( x̂ − x_α + ŷ − y_α ) )_i ≥ ε c_i(t̂),

which is a contradiction from the facts (5.41) and c_i(t̂) ≥ c₀ > 0 when we make α small. So the inequality (5.53) is equivalent to

  −c₂ + H( t̂, ŷ, −D_yφ(t̂, x̂, ŷ), −X₂ ) ≤ 0   (5.58)

when α > 0 is small. The inequality (5.54) can be rewritten as

  c₁ + (1 − ε) H( t̂, x̂, (1−ε)^{−1} D_xφ(t̂, x̂, ŷ), (1−ε)^{−1} X₁ ) ≥ 0.   (5.59)

Now we combine (5.58) and (5.59) to get

  c₁ + c₂ + (1 − ε) H( t̂, x̂, (1−ε)^{−1} D_xφ(t̂, x̂, ŷ), (1−ε)^{−1} X₁ ) − H( t̂, ŷ, −D_yφ(t̂, x̂, ŷ), −X₂ ) ≥ 0.   (5.60)

Notice that c₁ + c₂ = −β + 2α(t̂ − t_α); therefore (5.60) can be written as

  β − 2α(t̂ − t_α) ≤ (1 − ε) H( t̂, x̂, (1−ε)^{−1} D_xφ(t̂, x̂, ŷ), (1−ε)^{−1} X₁ ) − H( t̂, ŷ, −D_yφ(t̂, x̂, ŷ), −X₂ ).
The difference (1 − ε) H( t̂, x̂, (1−ε)^{−1} D_xφ(t̂, x̂, ŷ), (1−ε)^{−1} X₁ ) − H( t̂, ŷ, −D_yφ(t̂, x̂, ŷ), −X₂ ) can be estimated as follows:

  (1 − ε) H( t̂, x̂, (1−ε)^{−1} D_xφ, (1−ε)^{−1} X₁ ) − H( t̂, ŷ, −D_yφ, −X₂ )
  = inf_{u∈U} { ½ tr( σ(t̂, x̂, u)σ(t̂, x̂, u)* X₁ ) + ⟨ D_xφ(t̂, x̂, ŷ), b(t̂, x̂, u) ⟩ + (1 − ε) f(t̂, x̂, u) } − inf_{u∈U} { −½ tr( σ(t̂, ŷ, u)σ(t̂, ŷ, u)* X₂ ) − ⟨ D_yφ(t̂, x̂, ŷ), b(t̂, ŷ, u) ⟩ + f(t̂, ŷ, u) }
  ≤ sup_{u∈U} { ½ tr( σ(t̂, x̂, u)σ(t̂, x̂, u)* X₁ + σ(t̂, ŷ, u)σ(t̂, ŷ, u)* X₂ ) + ⟨ D_xφ(t̂, x̂, ŷ), b(t̂, x̂, u) ⟩ + ⟨ D_yφ(t̂, x̂, ŷ), b(t̂, ŷ, u) ⟩ + (1 − ε) f(t̂, x̂, u) − f(t̂, ŷ, u) }
  ≡ sup_{u∈U} { I + II + III }.   (5.61)

Now we break the rest of the proof into two steps.

Step 1. We first estimate the terms on the right-hand side of (5.61). Recall that the matrix A in Lemma 5.9 is given by (5.50). Since ( I −I ; −I I )² = 2 ( I −I ; −I I ), taking the ε of Lemma 5.9 (ii) to be α/8, a direct computation gives

  A + (α/8) A² = (3/α + α) ( I −I ; −I I ) + (2α + α³/2) I_{2d}.

Hence, writing σ = σ(t̂, x̂, u), σ̃ = σ(t̂, ŷ, u), and applying Lemma 5.9 (ii) column by column to the vectors (σ_j, σ̃_j), we have

  I = ½ tr( σσ*X₁ + σ̃σ̃*X₂ ) = ½ Σ_j [ ⟨X₁σ_j, σ_j⟩ + ⟨X₂σ̃_j, σ̃_j⟩ ]
    ≤ (3/α + α) · ½ tr( (σ − σ̃)(σ − σ̃)* ) + (α + α³/4) tr( σσ* + σ̃σ̃* )
    ≤ C (1/α) ‖x̂ − ŷ‖² + Cα ≤ Cα = o(1)   (5.62)

when α is small, where C is a constant that changes from time to time. We have used (5.42) (so that (1/α)‖x̂ − ŷ‖² ≤ Cα) and the assumption that σ(t, x, u) is bounded and Lipschitz continuous in x.

Now let us look at the second term on the right-hand side of (5.61). We have

  II = ⟨ D_xφ(t̂, x̂, ŷ), b(t̂, x̂, u) ⟩ + ⟨ D_yφ(t̂, x̂, ŷ), b(t̂, ŷ, u) ⟩
     = (2/α) ⟨ x̂ − ŷ, b(t̂, x̂, u) − b(t̂, ŷ, u) ⟩ + 2α ⟨ x̂ − x_α, b(t̂, x̂, u) ⟩ + 2α ⟨ ŷ − y_α, b(t̂, ŷ, u) ⟩
     ≤ (2/α) C ‖x̂ − ŷ‖² + Cα ( ‖x̂ − x_α‖ + ‖ŷ − y_α‖ ) ≤ Cα = o(1)   (5.63)

when α is small, where we have used (5.41), (5.42) and the fact that b(t, x, u) is bounded and Lipschitz continuous in x.
The third term on the right-hand side of (5.61) can be bounded by

  III = (1 − ε) f(t̂, x̂, u) − f(t̂, ŷ, u) ≤ (1 − ε) | f(t̂, x̂, u) − f(t̂, ŷ, u) | + ε | f(t̂, ŷ, u) | ≤ C‖x̂ − ŷ‖ + Cε ≤ o(1) + Cε,   (5.64)

since f(t, x, u) is bounded and Lipschitz continuous in x.

Step 2. Now from (5.60), (5.61), (5.62), (5.63) and (5.64) we have

  β ≤ 2α(t̂ − t_α) + o(1) + Cε ≤ o(1) + Cε

when α is small. By letting α, ε → 0 we get β ≤ 0, which is a contradiction. Therefore we have proved that t̂ = T when ε, α are small. The proof of the theorem is thus complete. □

Corollary 5.11 There exists a unique viscosity solution, in the class of bounded functions which are Lipschitz continuous in the space variable, to the dynamic programming equation (5.35) with the boundary condition g = 0, and it can be identified as the value function.

Proof. This is a consequence of Theorem 5.3, Theorem 5.7 and Theorem 5.10. □

Remark 5.12 By Theorem 5.7 we may characterize the value function as the unique viscosity solution of (5.35). □

Remark 5.13 The proof of Theorem 5.10 is based on a modification of the methods used in Fleming and Soner [23], Chapters 2 and 4, where the HJB equations for classical control problems are considered. The assumption that b, σ, f are bounded may be removed by modifying the proof given in Ishii [36]. □

Chapter 6

The Continuation Region

As the examples in Section 2.2 have shown, there exists a region, called the continuation region, or inaction region, such that if the optimal state process starts from outside this region, then the singular control is applied to bring it to the region immediately, and the state is then kept inside this region from then on. In the interior of the region the optimal singular control should not produce any jumps.

Recall from Theorem 5.4 that for (t, x) ∈ [0, T] × ℝ^d,

  W(t, x) ≤ W(t, x + gh) + c(t)·h,  ∀ h ∈ ℝ₊^k.

We define the continuation region of the problem as

  A ≡ { (t, x) ∈ [0, T] × ℝ^d : W(t, x) < W(t, x + gh) + c(t)·h, ∀ h ∈ ℝ₊^k, h ≠ 0 }.
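For intuition, a toy numerical illustration (ours, not from the thesis) of this definition in dimension d = k = 1 with g = 1: given a candidate cost W₀ and a constant cost rate c, the set where no immediate jump h > 0 lowers the total cost W₀(x + h) + c·h plays the role of the (time-t section of the) continuation region. The function W₀, the cost rate and the grid below are purely illustrative assumptions.

```python
# Toy 1-d illustration: with g = 1, k = 1 and a constant jump cost rate c, a
# candidate cost W0 improves under an immediate jump h > 0 exactly when
# W0(x + h) + c*h < W0(x).  The analogue of the continuation region is
#   A = { x : W0(x) < W0(x + h) + c*h  for every h > 0 }.

c = 0.5
n = 601
xs = [-3.0 + 6.0 * i / (n - 1) for i in range(n)]   # grid on [-3, 3]
W0 = [(x - 1.0) ** 2 for x in xs]                   # candidate cost

def best_jump_value(i):
    """min over grid jumps h > 0 of W0(x_i + h) + c*h (inf if none exist)."""
    vals = [W0[j] + c * (xs[j] - xs[i]) for j in range(i + 1, n)]
    return min(vals) if vals else float("inf")

in_A = [W0[i] < best_jump_value(i) for i in range(n)]

# For W0(x) = (x - 1)^2 a jump is profitable iff W0'(x) < -c, i.e. x < 1 - c/2,
# so the boundary of the region should sit at x* = 0.75.
boundary = xs[in_A.index(True)]
assert abs(boundary - 0.75) < 1e-9
```

The computed region is the half-line [0.75, 3]: to its left the state is pushed (jumped) to the boundary, mirroring the initial-jump behaviour described above.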
It is obvious that

  A = ∪_{0≤t≤T} {t} × A_t,

where A_t is defined by (5.18), i.e.,

  A_t = { x : W(t, x) < W(t, x + gh) + c(t)·h, ∀ h ∈ ℝ₊^k, h ≠ 0 }.

We have shown in Theorem 5.4 that the optimal state process x_t is continuous when it is in A_t, i.e., for P ∈ R*_{s,x}, s < t ≤ T,

  P( Δx_t ≠ 0, x_t ∈ A_t ) = 0.

In this chapter, we will show that there exists an optimal control rule P ∈ R*_{s,x} such that if (s, x) ∉ A, then the control brings the state (t, x_t) immediately to Ā and keeps (t, x_t) inside Ā from then on. If (s, x) ∈ A, the optimal state process will be kept inside Ā. We first give a series of lemmas.

Lemma 6.1 There exists a function h(·,·) : [0, T] × ℝ^d → ℝ₊^k such that h(t, ·) is Borel measurable for each t fixed. Moreover, for each (t, x),

  W(t, x) = W(t, x + g h(t, x)) + c(t)·h(t, x),   (6.1)

and for P ∈ R*_{t, x+gh(t,x)},

  P( |Δx_t| > 0 ) = 0,   (6.2)

or equivalently,

  P( x_{t+} = x + g h(t, x) ) = 1.

Proof. For (t, x) ∈ [0, T] × ℝ^d, let h₁* be the maximum h₁ ≥ 0 such that h = (h₁, 0, ..., 0) satisfies

  W(t, x) = W(t, x + gh) + c(t)·h.   (6.3)

Let h₂* be the maximum h₂ ≥ 0 such that (6.3) holds for h = (h₁*, h₂, 0, ..., 0). Continuing this procedure we get an h* = (h₁*, h₂*, ..., h_k*). The existence of the h_i* follows easily from the Lipschitz continuity of W(t, ·) and the facts that W(t, ·) is bounded from below and c(t) > 0 componentwise. Let h(t, x) = h*; the Borel measurability of h(t, ·) for t fixed is proved in Appendix A.

Now we prove (6.2) for P ∈ R*_{t, x+gh(t,x)}. To simplify our notation, let x′ = x + g h(t, x). Then ∀ h′ ∈ ℝ₊^k, h′ ≠ 0,

  W(t, x′) < W(t, x′ + gh′) + c(t)·h′,   (6.4)

i.e., x′ ∈ A_t. Otherwise (6.4) would be an equality, and

  W(t, x) = W(t, x′) + c(t)·h(t, x) = W(t, x′ + gh′) + c(t)·( h(t, x) + h′ ) = W(t, x + g( h(t, x) + h′ )) + c(t)·( h(t, x) + h′ ).

This contradicts the maximality by which we constructed h(t, x). Hence x′ ∈ A_t, and the result now follows from Theorem 5.4. □

Lemma 6.2 For each fixed t ∈ [0, T], there exists a measurable selector H_t(·)
: ℝ^d → M₁(Ω) such that H_t(x) ∈ R*_{t,x} and

  H_t(x)( x_{t+} ∈ A_t ) = 1.   (6.5)

Proof. Let H(·,·) be the measurable selector in Chapter 3. Define

  H_t(x) = H(t, x + g h(t, x)) ∘ Θ_t^{−1},

where Θ_t : Ω → Ω is defined, for ω = (x., μ., v.), by

  Θ_t(ω) = (x̄., μ., v̄.),  (x̄_θ, v̄_θ) = ( x_θ − g h(t, x), v_θ ) for 0 ≤ θ ≤ t,  (x̄_θ, v̄_θ) = ( x_θ, v_θ + h(t, x) ) for t < θ ≤ T.

It can be verified easily that H_t(x) ∈ R*_{t,x}. Now we show that H_t(·) : ℝ^d → M₁(Ω) is Borel measurable. It is sufficient to show that for any F ∈ bF, the function

  Ψ(x) = E^{H_t(x)} F

is in bB(ℝ^d). We write

  Ψ(x) = E^{H(t, x+gh(t,x))} ( F ∘ Θ_t ),

so it will be enough to show that

  ℝ^d × Ω ∋ (x, ω) ↦ F( Θ_t(ω) )

is Borel measurable, Θ_t depending on x through h(t, x). But this can be easily seen from the Borel measurability of h(t, ·). Finally, (6.5) follows from (5.19) and the fact x + g h(t, x) ∈ A_t. □

Lemma 6.3 For 0 ≤ s ≤ t ≤ T, x ∈ ℝ^d, there exists a P ∈ R*_{s,x} such that

  P( x_{t+} ∈ A_t ) = 1.   (6.6)

Proof. Let t be fixed, and let Q_ω = H_t( x_t(ω) ). From Lemma 6.2 we know that for each B ∈ F, the mapping ω ↦ Q_ω(B) is F_t-measurable. Therefore Q is an F_t-transition probability. Let P = H_s(x) ⊗_t Q; by the same proof as for Proposition 4.8 we can conclude that P ∈ R*_{s,x}. Now we show that P satisfies (6.6). In fact, from the definition of P we know the r.c.p.d. of P with respect to F_t is δ_ω ⊗_t Q_ω; thus

  P( x_{t+} ∈ A_t ) = ∫ Q_ω( x_{t+} ∈ A_t ) dP.

But from the definition of Q and Lemma 6.2 it follows that

  Q_ω( x_{t+} ∈ A_t ) = H_t( x_t(ω) )( x_{t+} ∈ A_t ) = 1;

therefore (6.6) is proved. □

We introduce the following notation: for n ∈ ℕ, let

  S_n = { s + k(T − s)/2^n ; k = 0, 1, ..., 2^n − 1 }.   (6.7)

It is obvious that S_n ⊂ S_{n+1} for n ≥ 1.

Lemma 6.4 For (s, x) ∈ [0, T] × ℝ^d and n ≥ 1, there exists P_n ∈ R*_{s,x} such that

  P_n( x_{r+} ∈ A_r, ∀ r ∈ S_n ∩ [s, T] ) = 1.   (6.8)

Proof. For simplicity of notation, we assume s = 0. The proof for the general case is similar. Take P⁰ = H₀(x), and let

  P¹ = P⁰ ⊗_{T/2^n} H_{T/2^n}( x_{T/2^n} ),
  P^k = P^{k−1} ⊗_{kT/2^n} H_{kT/2^n}( x_{kT/2^n} ),  k = 1, 2, ..., 2^n − 1,
  P_n = P^{2^n − 1}.
Then from the proof of Lemma 6.3 we can see that P_n satisfies (6.8) and P_n ∈ R*_{0,x}. □

Now we make the following assumption:

Assumption A. Ā_· is continuous in the following sense: ∀ (t, x) ∈ [0, T] × ℝ^d, if x ∉ Ā_t, the closure of A_t, then there exist ε, δ > 0 such that S(x, ε) ∩ A_u = ∅ whenever |u − t| < δ, where S(x, ε) is the d-dimensional ball with radius ε and center x.

Proposition 6.5 Assumption A is equivalent to the following:

  Ā = ∪_{0≤t≤T} {t} × Ā_t,   (6.9)

where Ā, Ā_t are the closures of A, A_t as subsets of [0, T] × ℝ^d and ℝ^d respectively.

Proof. Assume first that (6.9) holds. For (t, x) ∈ [0, T] × ℝ^d, t ≠ 0, T, if x ∉ Ā_t, then (t, x) ∉ Ā. So there exists an ε > 0 such that [ (t − ε, t + ε) × S(x, ε) ] ∩ A = ∅, which means that S(x, ε) ∩ A_u = ∅ for |u − t| < ε. Thus Assumption A holds. The cases t = 0, T can be treated similarly.

Now we assume Assumption A. If (t, x) ∉ ∪_{0≤t≤T} {t} × Ā_t, then x ∉ Ā_t, and Assumption A means exactly that (t, x) ∉ Ā. Therefore, under Assumption A,

  Ā ⊂ ∪_{0≤t≤T} {t} × Ā_t.   (6.10)

But the reverse inclusion in (6.10) is obvious, and therefore (6.9) holds. □

Remark 6.6 Note that Assumption A is similar in nature to the assumption made in the work of Chow, Menaldi and Robin [14], i.e., the moving boundary is continuous with respect to the time variable; cf. Example 2 in Section 2.2. □

Remark 6.7 If we define a set-valued map t ↦ Ā_t, 0 ≤ t ≤ T, then by definition (cf. Aubin and Cellina [1], p. 41) it is upper semicontinuous if and only if for each t₀ ∈ [0, T] and each open set M containing Ā_{t₀} there exists a neighborhood N of t₀ such that Ā_t ⊂ M, t ∈ N. It is easily seen that the upper semicontinuity of Ā_· implies Assumption A. □

Lemma 6.8 Under Assumption A, if P ∈ M₁(Ω) is such that for some s ∈ (0, T],

  P( x_s ∉ Ā_s ) > 0,   (6.11)

then there exist y ∈ ℝ^d, γ > 0, and δ ∈ (0, s) such that

  R(y, 2γ) ∩ A_u = ∅, u ∈ [s − δ, s],  and  P( x_u ∈ R(y, γ) : u ∈ [s − δ, s] ) > 0,

where R(y, r) ≡ { x : |x_i − y_i| < r, 1 ≤ i ≤ d }.
Proof. It can be easily seen that (6.11) implies that there exists an ε > 0 such that

  P( x_s ∉ (Ā_s)^ε ) > 0,

where A^ε ≡ { x : |x − y| < ε for some y ∈ A } for ε > 0. By dividing the complement of (Ā_s)^ε into a countable number of small parts, we can conclude that there exist y ∈ ℝ^d and γ > 0 such that

  P( x_s ∈ R(y, γ) ) > 0,  R(y, 2γ) ⊂ ( (Ā_s)^ε )^c.   (6.12)

We will show that there exists δ₁ > 0 such that

  R(y, 2γ) ∩ A_u = ∅,  u ∈ [s − δ₁, s].

If not, then we can find a sequence of points {x_n} ⊂ R(y, 2γ) such that x_n ∈ A_{u_n}, with u_n ↑ s. Since {x_n} is bounded, there is a subsequence, still denoted by {x_n}, converging to some x in the closure of R(y, 2γ); by (6.12), x ∉ Ā_s. It can be easily seen that this contradicts Assumption A. From the left continuity of the sample paths and (6.12), we can find a δ₂ > 0 such that

  P( x_u ∈ R(y, γ), u ∈ [s − δ₂, s] ) > 0.

Let δ = min{δ₁, δ₂}. Then y, γ, and δ satisfy the conditions of the lemma. □

Now we state the main theorem of this chapter. It ensures the existence of an optimal control rule P ∈ R*_{s,x} such that if (s, x) is not in A, then the optimal state jumps to Ā immediately and then stays in Ā from then on. Recall that the optimal state process has no jumps when it is in A.

Theorem 6.9 For (s, x) ∈ [0, T] × ℝ^d, there exists P ∈ R*_{s,x} such that

  P( x_t ∈ Ā_t, ∀ s < t ≤ T ) = 1.   (6.13)

Equivalently, (6.13) can be stated as

  P( (t, x_t) ∈ Ā, ∀ s < t ≤ T ) = 1.

Proof. Without loss of generality, we assume that s = 0. From Lemma 6.4, we can get a sequence of probabilities {P_n} ⊂ R*_{0,x} satisfying (6.8). Recall that R*_{0,x} is a compact subset of M₁(Ω); therefore there exists a subsequence, still denoted by {P_n}, and a control rule P ∈ R*_{0,x} such that P_n → P weakly. It is enough to show that P satisfies (6.13). Note that (6.13) is equivalent to

  P( x_r ∈ Ā_r, r ∈ ℚ ∩ (0, T] ) = 1,   (6.14)

where ℚ denotes the set of all rationals. This can be easily seen from the left continuity of the canonical process x and Assumption A. Therefore it will be sufficient to show that

  P( x_t ∈ Ā_t ) = 1,  ∀ t ∈ (0, T].   (6.15)

If (6.15) is not true, i.e., there is a t ∈
(0, T] such that P( x_t ∉ Ā_t ) > 0, then by Lemma 6.8 we can find y ∈ ℝ^d, γ > 0, ε > 0, and δ > 0 satisfying

  R(y, 2γ) ∩ A_u = ∅, u ∈ [t − δ, t],  P( x_u ∈ R(y, γ), u ∈ [t − δ, t] ) > ε > 0.

Recall that P_n → P weakly on M₁(Ω), with Ω = V_d[0, T] × U × A_k[0, T], x_t(ω) = x_t for ω = (x., μ., v.), and V_d[0, T] endowed with the pseudo-path topology; therefore, by Theorem 2.11, there exists a subsequence of {P_n}, still denoted by {P_n}, and a set I ⊂ [0, T] of full Lebesgue measure such that the finite dimensional distributions of (x_t)_{t∈I} under the control rules P_n converge weakly to the finite dimensional distributions of (x_t)_{t∈I} under P.

For an arbitrary K ∈ ℕ, take n₁ ∈ ℕ large enough such that

  card { S_{n₁} ∩ (t − δ, t) } > K,   (6.16)

where S_{n₁} is defined by (6.7). We denote those points by s₁ < s₂ < ... < s_{K′} (K′ > K), with t − δ < s₁ and s_{K′} < t. Since I has full Lebesgue measure, we can take t₁, ..., t_{K′} such that t_i ∈ I, 1 ≤ i ≤ K′, and

  t − δ < t₁ < s₁ < t₂ < ... < t_{K′} < s_{K′}.

Using the weak convergence of the finite dimensional distributions and the fact that R(y, γ) is open, we have

  liminf_n P_n( x_{t_i} ∈ R(y, γ), 1 ≤ i ≤ K′ ) ≥ P( x_{t_i} ∈ R(y, γ), 1 ≤ i ≤ K′ ) ≥ P( x_u ∈ R(y, γ), u ∈ [t − δ, t] ) > ε.

So we can take N ≥ n₁ large enough such that whenever n ≥ N,

  P_n( x_{t_i} ∈ R(y, γ), 1 ≤ i ≤ K′ ) > ε.

Note that s_i ∈ S_n for n ≥ N, so for n ≥ N,

  P_n( x_{s_i+} ∈ A_{s_i}, i = 1, 2, ..., K′ ) = 1.

Let

  C_{K′} = { ω : x_{t_i} ∈ R(y, γ), x_{s_i+} ∈ A_{s_i}, 1 ≤ i ≤ K′ }.

Then P_n(C_{K′}) > ε. Write, for 1 ≤ i ≤ d,

  u^i = y_i + γ,  v^i = y_i + 2γ,  ū^i = y_i − 2γ,  v̄^i = y_i − γ,

and denote by N_u^v(x^i) the number of upcrossings of the i-th coordinate process x^i between the levels u and v in the time interval [0, T]. Since A_{s_i} ∩ R(y, 2γ) = ∅, each passage of the state from R(y, γ) at time t_i to A_{s_i} at time s_i+ forces some coordinate to cross one of the above bands, and it can be easily seen that if ω ∈ C_{K′},

  Σ_{i=1}^d ( N_{u^i}^{v^i}(x^i(ω)) + N_{ū^i}^{v̄^i}(x^i(ω)) ) ≥ K′ − 1.

Therefore, when n ≥ N,

  Σ_{i=1}^d ( E^{P_n} N_{u^i}^{v^i}(x^i) + E^{P_n} N_{ū^i}^{v̄^i}(x^i) ) ≥ (K′ − 1) P_n(C_{K′}) > (K′ − 1) ε.

Since K′ can be taken arbitrarily large (with ε fixed), we have

  limsup_n Σ_{i=1}^d ( E^{P_n} N_{u^i}^{v^i}(x^i) + E^{P_n} N_{ū^i}^{v̄^i}(x^i) ) = ∞.   (6.17)
But, as we know, the standard upcrossing estimate for a path of bounded variation gives

E^{P_n} N_{u_i,v_i}(x^i) ≤ (1/γ)[|u_i| + E^{P_n} Var(x^i)],  E^{P_n} N_{ū_i,v̄_i}(x^i) ≤ (1/γ)[|ū_i| + E^{P_n} Var(x^i)],

since v_i − u_i = v̄_i − ū_i = γ, where Var(x^i) denotes the total variation of x^i on [0, T]. Hence

LHS of (6.17) ≤ (1/γ) Σ_{i=1}^d [|u_i| + |ū_i| + 2 limsup_n E^{P_n} Var(x^i)].  (6.18)

Note that under the control rule P_n,

x_t = x + ∫_0^t b(θ, x_θ, μ_θ)dθ + G(v_t) + M_t,

where M is an F_t-martingale under P_n. From the boundedness of b(·,·,·) and the positivity of c(·), there are constants C, c such that ‖b(·,·,·)‖ ≤ C and ‖c(·)‖ ≥ c. Since f ≥ −K, for any P ∈ R_{s,x} we have

W(0, x) = E^P{∫_0^T f(θ, x_θ, μ_θ)dθ + ∫_{[0,T)} c(θ)·dv_θ} ≥ c E^P{|v|_T} − KT.

Moreover, for any P ∈ R_{s,x}, from Lemma 3.2 we know

E^P|M_T| ≤ (E^P|M_T|²)^{1/2} = (E^P ∫_0^T tr(a(θ, x_θ, μ_θ))dθ)^{1/2} ≤ C′ < ∞.

Therefore

E^{P_n} Var(x^i) ≤ ∫_0^T C dθ + (1/c)(W(0, x) + KT) + 2 sup_{0≤t≤T} E^{P_n}|M_t| ≤ CT + (W(0, x) + KT)/c + C″ ≤ C₀ < ∞,

where C₀ is a constant independent of n. By (6.18), the left-hand side of (6.17) is bounded by (1/γ) Σ_{i=1}^d [|u_i| + |ū_i| + 2C₀], uniformly in n. But this contradicts (6.17). □

Remark 6.10 We have shown that the subset A has some of the features of the continuation region found in specific problems (see the examples in Section 2.2), but the following problem remains to be solved:

• (5.24) and the examples in Section 2.2 suggest that it might be true that for any optimal control rule P ∈ R_{s,x},

E^P ∫_0^T 1_{A_θ°}(x_θ) d|v|_θ = 0,

where A_θ° is the interior of the subset A_θ ⊂ ℝ^d and |v| denotes the total variation of the k-dimensional process v. In other words, the singular control variable acts only when the optimal state process is on the boundary ∂A_θ of the continuation region A_θ, after a possible initial jump.

Chapter 7

The Existence of Optimal Control Laws

7.1 Introduction

In this chapter we study a special case of the singular stochastic control problem formulated in Chapter 2, in which the singular control variable enters the state as a process of bounded variation: the bounded variation control problem.
We introduce a random time change which stretches out the time scale. Under this new time scale, the problem is transformed into a new control problem involving only classical controls. The new problem has been studied extensively, cf. Fleming and Rishel [22], Krylov [50], and especially Haussmann [29] and Haussmann and Lepeltier [30], which ensure the existence of an optimal Markovian control for the new problem under mild continuity conditions on the coefficients of the state process. Applying this result and transforming the optimal control back to the original singular control problem, we show that an optimal control exists. Moreover, it is shown that there exists an optimal control of the following form: the control variable u is in Markovian form, and the increments of the singular controls during any time interval depend only on the state process in that interval. A control of this type will be called a control law. This method gives an explicit way to construct the optimal control for the singular control problem once the optimal control for the new problem, which is relatively well understood, is known. However, it does not exhibit the special features of the optimal state process; i.e., we do not know whether the optimal state process is a reflected diffusion in some region. A similar approach has been used by Martins and Kushner [59], Kurtz [52], and Zhu [81] to study the weak convergence of probability distributions.

This chapter is organized as follows: both the singular control and the classical control problems are introduced in Section 7.2. In Section 7.3 the equivalence of the two problems is established and the existence of an optimal singular control is proved. Section 7.4 studies the existence of an optimal control law, and some comments on the approach of this chapter are given in Section 7.5.
7.2 Formulation of the problem

Throughout this chapter, we assume U is a compact metric space, T > 0 is fixed, and the following functions are given:

• σ : D̄ × U → ℝ^{d×d} and b : D̄ × U → ℝ^d are bounded continuous functions;

• f : D̄ × U → ℝ is lower semicontinuous in (t, x, u) and satisfies

−K ≤ f(t, x, u) ≤ C(1 + |x|^m),  (t, x, u) ∈ D̄ × U,

for some constants m, K, C ≥ 0;

• c = (c_i), c̃ = (c̃_i) : [0, T] → ℝ^d with c_i, c̃_i > 0, 1 ≤ i ≤ d, are bounded and lower semicontinuous functions on [0, T].

Moreover we let a = σσ′, and we assume:

• for (t, x) ∈ D̄,

K(t, x) = {(a(t, x, u), b(t, x, u), z) : z ≥ f(t, x, u), u ∈ U} ⊂ S^d × ℝ^d × ℝ

is convex.

The last condition, sometimes called the Roxin condition, allows us to replace relaxed controls by ordinary controls u_t. It is satisfied, for example, if a, b are linear in u, f is convex in u, and U is convex.

7.2.1 The singular control problem

In this chapter we study singular stochastic control problems in which the singular control variable enters the state as a process of bounded variation, i.e., the bounded variation control problem. More precisely, we study the optimal control problem in which the state evolves according to the d-dimensional stochastic differential equation

x(t) = x + ∫_s^t b(θ, x(θ), u(θ))dθ + ∫_s^t σ(θ, x(θ), u(θ))dB_θ + v_t¹ − v_t²,  (7.1)

where (B_t, t ≥ 0) is a d-dimensional Brownian motion, x is the initial state at time s, and u., v¹, v² stand for the control variables, with v¹, v² nondecreasing componentwise. Note that this is a special case of the problem formulated in Chapter 2, in which we let k = 2d and g(·) = (I, −I) with I the unit d × d matrix. For convenience, we state the definition of controls corresponding to this problem as follows; it is consistent with the definition in Chapter 2.

Definition 7.1 A control is a term α = (Ω, F, F_t, P, x_t, u_t, v_t¹, v_t², s, x) such that

1. 0 ≤ s ≤ T is the starting time and x ∈ ℝ^d is the initial state;

2.
(Ω, F, P) is a probability space with a filtration {F_t}_{t≥0}, and there is a standard d-dimensional Brownian motion B on it;

3. u_t is a U-valued process, progressively measurable with respect to {F_t}_{t≥0};

4. v¹, v² are progressively measurable with respect to F_t with v_s^i = 0, i = 1, 2, and the sample paths of v^i are in A^d[0, T], i.e., for each ω ∈ Ω, v^i(ω) ∈ A^d[0, T] (i = 1, 2);

5. x_t, the state process, is an F_t-adapted process such that (7.1) is satisfied.

We call (s, x) the initial condition of the control α. The cost corresponding to the control α is defined by

J(α) = E{∫_s^T f(t, x_t, u_t)dt + ∫_{[s,T)} c(t)·dv_t¹ + ∫_{[s,T)} c̃(t)·dv_t²}.  (7.2)

A control α is called admissible if J(α) < ∞. The collection of admissible controls with initial condition (s, x) is denoted by A_{s,x}. It is well known from the theory of stochastic differential equations that, under the above conditions, the set A_{s,x} is nonempty for each fixed (s, x) (e.g., take u_t and v_t^i, i = 1, 2, to be constants). The value function of this control problem is defined by

W(s, x) = inf_{α ∈ A_{s,x}} J(α).

A control α* ∈ A_{s,x} is called optimal if

J(α*) = inf_{α ∈ A_{s,x}} J(α).  (7.3)

We say that (v⁺, v⁻) is the minimal decomposition of the process v of bounded variation if v⁺ and v⁻ are the positive and negative variations of v respectively, i.e.,

v = v⁺ − v⁻  and  v̄ = v⁺ + v⁻,

where v̄_t is the total variation of v on [s, t).

Proposition 7.2 For each α = (Ω, F, F_t, P, x_t, u_t, v_t¹, v_t², s, x) ∈ A_{s,x}, we have

ᾱ = (Ω, F, F_t, P, x_t, u_t, v_t⁺, v_t⁻, s, x) ∈ A_{s,x}  and  J(ᾱ) ≤ J(α),

where (v⁺, v⁻) is the minimal decomposition of the process v = v¹ − v². Moreover, if α is optimal, then v¹ = v⁺ and v² = v⁻.

Proof. The first part of the proposition is obvious, and the second part follows from the strict positivity of c(·) and c̃(·). □
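Proposition 7.2 replaces (v¹, v²) by the positive and negative variations of v = v¹ − v². For a discrete-time sample path the minimal (Jordan) decomposition can be sketched as follows (the helper name and the list representation are illustrative assumptions, not part of the thesis):

```python
def minimal_decomposition(v):
    """Return the running positive and negative variations (v_plus,
    v_minus) of the sequence v, so that
    v[t] - v[0] == v_plus[t] - v_minus[t]
    and v_plus[t] + v_minus[t] is the running total variation."""
    v_plus, v_minus = [0.0], [0.0]
    for prev, cur in zip(v, v[1:]):
        d = cur - prev
        v_plus.append(v_plus[-1] + max(d, 0.0))
        v_minus.append(v_minus[-1] + max(-d, 0.0))
    return v_plus, v_minus
```

At each step at most one of the two components increases, so dv⁺·dv⁻ = 0: the decomposed controls are mutually orthogonal, which is exactly the normalization adopted after Proposition 7.2.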
From Proposition 7.2 we know that if an optimal control for the problem exists, then we can always find an optimal control α* such that the two (componentwise) increasing processes v¹ and v² are mutually orthogonal, i.e., dv_t¹ · dv_t² = 0.

7.2.2 The classical control problem

Following Martins and Kushner [59] (see also Zhu [81]) we now define a (d + 1)-dimensional control problem which uses only classical controls. The state of this problem evolves according to

x̂⁰(t) = s + ∫₀^t û⁰(θ)dθ,

x̂(t) = x + ∫₀^t [b(x̂⁰(θ), x̂(θ), û(θ))û⁰(θ) + (n¹(θ) − n²(θ))(1 − û⁰(θ))]dθ + ∫₀^t σ(x̂⁰(θ), x̂(θ), û(θ))√(û⁰(θ)) dB̂_θ,  t ≥ 0, (s, x) ∈ D̄.  (7.4)

The control variable of this problem has the form ū_t = (û⁰_t, û_t, n_t), where

• û⁰ : ℝ₊ → I = [0, 1],
• û : ℝ₊ → U,
• n = (n¹, n²) : ℝ₊ → S = {z ∈ ℝ₊^{2d} : Σ_i z_i = 1}, and we write n̄ = n¹ − n².

Notice that the control set Ê = I × U × S is compact. Control problems of this type are discussed in Haussmann and Lepeltier [30]. Using the notation of Definition 2.2 of [30], we denote a control of this problem by α̂ = (Ω̂, F̂, F̂_t, P̂, (x̂⁰_t, x̂_t), ū_t, s, x). The cost associated with α̂ is defined by

Ĵ(α̂) = Ê{∫₀^τ̂ [f(x̂⁰(θ), x̂(θ), û(θ))û⁰(θ) + (c(x̂⁰(θ))·n¹(θ) + c̃(x̂⁰(θ))·n²(θ))(1 − û⁰(θ))] dθ}  (7.5)

with

τ̂ = inf{t : x̂⁰(t) = T}.  (7.6)

A control α̂ is called admissible if Ĵ(α̂) < ∞, and the collection of all admissible controls is denoted by Â_{s,x}. Let Ŵ(s, x) be the value function of this problem, i.e.,

Ŵ(s, x) = inf_{α̂ ∈ Â_{s,x}} Ĵ(α̂).

Lemma 7.3 If α̂ is admissible, then Ê τ̂ < ∞.

Proof. From our assumption that c_i(·), c̃_i(·), i = 1, …, d, are strictly positive and lower semicontinuous on [s, T], there exists a constant c₀ > 0 such that

c_i(t) ≥ c₀,  c̃_i(t) ≥ c₀,  s ≤ t ≤ T, 1 ≤ i ≤ d.

Noticing that n(t) = (n¹(t), n²(t)) ∈ S for s ≤ t ≤ T, we have

c(t)·n¹(t) + c̃(t)·n²(t) ≥ c₀.

If Ĵ(α̂) < ∞, then

∞ > Ê{∫₀^τ̂ [c(x̂⁰(θ))·n¹(θ) + c̃(x̂⁰(θ))·n²(θ)](1 − û⁰(θ))dθ} − KT ≥ c₀ Ê{∫₀^τ̂ (1 − û⁰(θ))dθ} − KT = c₀(Ê τ̂ − T + s) − KT,

where K is the constant in the definition of f.
Therefore Ê τ̂ < ∞, which implies that τ̂ < ∞ a.s. □

It now follows that if we define

r(t) = inf{θ : x̂⁰(θ) ≥ t},

then r(t) < ∞ a.s. for each t ≤ T.

We define, for (t, x) ∈ D̄,

K̂(t, x) = {(a(t, x, u)u⁰, b(t, x, u)u⁰ + (n¹ − n²)(1 − u⁰), z) : z ≥ f(t, x, u)u⁰ + [c(t)·n¹ + c̃(t)·n²](1 − u⁰), (u⁰, u, n) ∈ Ê},  (7.7)

with the convention that the corresponding (d + 1)-dimensional diffusion matrix is

â(t, x, u) = ( 0 0 ; 0 a(t, x, u) ) ∈ ℝ^{(d+1)×(d+1)}.

It is convex, according to the next result, since K(t, x) is convex.

Lemma 7.4 If K(t, x) is convex, then K̂(t, x) is also a convex set.

Proof. It is clear that K̂(t, x) is a subset of S^d × ℝ^d × ℝ, and it suffices to show that a convex combination of two of its points is again such a point. Let ū_i = (u_i⁰, u_i, n_i) ∈ Ê, i = 1, 2, and 0 ≤ λ ≤ 1; we want to show that the convex combination with weights λ and 1 − λ of the corresponding points gives rise to a point of K̂(t, x). The result is obvious if u⁰ := λu₁⁰ + (1 − λ)u₂⁰ = 0, so we assume u⁰ > 0. Let λ′ = λu₁⁰/u⁰; note that 0 ≤ λ′ ≤ 1. From the assumption that K(t, x) is convex we know that there exists ū ∈ U such that

λ′a(t, x, u₁) + (1 − λ′)a(t, x, u₂) = a(t, x, ū),
λ′b(t, x, u₁) + (1 − λ′)b(t, x, u₂) = b(t, x, ū),
λ′f(t, x, u₁) + (1 − λ′)f(t, x, u₂) ≥ f(t, x, ū).

Therefore

λ a(t, x, u₁)u₁⁰ + (1 − λ) a(t, x, u₂)u₂⁰ = a(t, x, ū)u⁰,  (7.8)
λ b(t, x, u₁)u₁⁰ + (1 − λ) b(t, x, u₂)u₂⁰ = b(t, x, ū)u⁰,  (7.9)
λ f(t, x, u₁)u₁⁰ + (1 − λ) f(t, x, u₂)u₂⁰ ≥ f(t, x, ū)u⁰.  (7.10)

Moreover,

λ(n₁¹ − n₁²)(1 − u₁⁰) + (1 − λ)(n₂¹ − n₂²)(1 − u₂⁰) = (n̄¹ − n̄²)(1 − u⁰),  (7.11)
λ(c(t)·n₁¹ + c̃(t)·n₁²)(1 − u₁⁰) + (1 − λ)(c(t)·n₂¹ + c̃(t)·n₂²)(1 − u₂⁰) = (c(t)·n̄¹ + c̃(t)·n̄²)(1 − u⁰),  (7.12)

where

n̄^i = [λ(1 − u₁⁰)n₁^i + (1 − λ)(1 − u₂⁰)n₂^i]/(1 − u⁰),  i = 1, 2,

and n̄ = (n̄¹, n̄²) ∈ S (if u⁰ = 1, take n̄ ∈ S arbitrary). Now from (7.8)–(7.12) we can see that the convex combination lies in K̂(t, x), and therefore K̂(t, x) is convex. □

Finally, we recall a result from Haussmann and Lepeltier [30] (cf. Haussmann [29] for the autonomous case).
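The key step in the proof of Lemma 7.4 above is the reweighting λ′ = λu₁⁰/u⁰ with u⁰ = λu₁⁰ + (1 − λ)u₂⁰, which turns a convex combination of points of K̂ into one inside K. The algebraic identity behind (7.8)–(7.10) can be checked numerically (a sketch; all names are illustrative):

```python
def reweight(lam, u01, u02):
    """Given the weight lam in [0,1] and dilation components u01, u02,
    return (lam_prime, u0) with u0 = lam*u01 + (1-lam)*u02 and
    lam_prime = lam*u01/u0 (assumes u0 > 0)."""
    u0 = lam * u01 + (1.0 - lam) * u02
    return lam * u01 / u0, u0

def check_identity(lam, u01, u02, x1, x2):
    """Verify u0*(lam' x1 + (1-lam') x2) == lam u01 x1 + (1-lam) u02 x2,
    i.e. the scalar form of (7.8)-(7.10)."""
    lam_p, u0 = reweight(lam, u01, u02)
    return abs(u0 * (lam_p * x1 + (1.0 - lam_p) * x2)
               - (lam * u01 * x1 + (1.0 - lam) * u02 * x2)) < 1e-12
```

The identity holds for arbitrary scalars x1, x2, which is why a single ū representing the reweighted combination in K suffices.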
Theorem 7.5 (Haussmann and Lepeltier [30], 1990) There exists an optimal Markovian control α̂ = (Ω̂, F̂, F̂_t, P̂, (x̂⁰(t), x̂(t)), ū(t), s, x) for the problem (7.4). More precisely, we can write

û⁰(t, ω) = Û⁰(x̂⁰(t, ω), x̂(t, ω)),  û(t, ω) = Û(x̂⁰(t, ω), x̂(t, ω)),  n(t, ω) = N̂(x̂⁰(t, ω), x̂(t, ω)),

where Û⁰ : ℝ₊ × ℝ^d → [0, 1], Û : ℝ₊ × ℝ^d → U, and N̂ = (N̂¹, N̂²) : ℝ₊ × ℝ^d → S are deterministic Borel measurable functions. □

7.3 Equivalence of the two problems

In this section we will show that the value functions for (7.1) and (7.4) are the same, i.e.,

W(s, x) = Ŵ(s, x),  (s, x) ∈ D̄.

More precisely, for any α ∈ A_{s,x} (α̂ ∈ Â_{s,x}, respectively), there exists a control α̂ ∈ Â_{s,x} (α ∈ A_{s,x}, respectively) with cost Ĵ(α̂) ≤ J(α) (J(α) = Ĵ(α̂), respectively). For this section we do not require the Roxin condition, although we made it part of the standing assumptions for ease of exposition.

Take a control α = (Ω, F, F_t, P, x(t), u(t), v¹(t), v²(t), s, x) ∈ A_{s,x}. By definition, there is a d-dimensional Brownian motion B. on (Ω, F, F_t, P) such that

x_t = x + ∫_s^t b(θ, x_θ, u_θ)dθ + ∫_s^t σ(θ, x_θ, u_θ)dB_θ + v_t¹ − v_t².  (7.13)

We may assume F_t to be right continuous. In fact, note that B. is still a Brownian motion on (Ω, F, F_{t+}, P), where F_{t+} = ∩_{σ>t} F_σ; therefore α′ = (Ω, F, F_{t+}, P, x(t), u(t), v¹(t), v²(t), s, x) ∈ A_{s,x}, with J(α′) = J(α).

In view of Proposition 7.2, we may assume that v¹ and v² are mutually orthogonal. Let v(t) = v¹(t) − v²(t) and

v̄(t) = Σ_{k=1}^d (v_k¹(t) + v_k²(t));

then it is obvious that v_k^i(·) ≪ v̄(·), i = 1, 2, k = 1, …, d. Applying the Radon–Nikodym theorem (cf. Dellacherie and Meyer [17], Theorem VI-68), there exist processes n_k^i, i = 1, 2, k = 1, …, d, on [s, T], progressively measurable with respect to F_t, such that

v_k^i(t) = ∫_s^t n_k^i(θ) dv̄(θ),  i = 1, 2, k = 1, …, d.
It can easily be verified that

Σ_{i=1}^2 Σ_{k=1}^d n_k^i = 1  (7.14)

a.e. (dv̄ dP) on [s, T]. Let n^i = (n_1^i, …, n_d^i), i = 1, 2. We may redefine n^i on a dv̄ dP-null set so that (7.14) holds everywhere. The cost function of the problem can then be written as

J(α) = E{∫_s^T f(θ, x_θ, u_θ)dθ + ∫_{[s,T)} (c(θ)·n¹(θ) + c̃(θ)·n²(θ)) dv̄(θ)}.  (7.15)

Define

r(t) = t − s + v̄(t),  s ≤ t ≤ T;  (7.16)

then r(·) is strictly increasing and left continuous on [s, T], with r(s) = v̄(s) = 0. We denote the inverse function of r(·) by T(·), i.e.,

T(t) = inf{θ ≤ T : r(θ) ≥ t},  0 ≤ t < ∞.  (7.17)

Note that T ∘ r(t) = t, 0 ≤ t < ∞.

Lemma 7.6 (a) T(·) is nondecreasing and, for each t, T(t) is a stopping time with s ≤ T(t) ≤ T.

(b) T(·) is Lipschitz continuous with constant 1, i.e.,

|T(t₁) − T(t₂)| ≤ |t₁ − t₂|,  0 ≤ t₁, t₂ < ∞.

Proof. (a) is obvious since F_t is right continuous. Now we show (b). Take t₁ < t₂; we need only consider the case T(t₁) < T(t₂). We have

r(T(t₂)−) − r(T(t₁)+) = T(t₂) − T(t₁) + v̄(T(t₂)−) − v̄(T(t₁)+).

Note that v̄ is nondecreasing, so v̄(T(t₁)+) ≤ v̄(T(t₂)−). Therefore, since r(T(t₂)−) ≤ t₂ and r(T(t₁)+) ≥ t₁,

T(t₂) − T(t₁) ≤ r(T(t₂)−) − r(T(t₁)+) ≤ t₂ − t₁,

i.e., T(·) is Lipschitz continuous with constant 1. □

From Lemma 7.6(b) we can write

T(t) = s + ∫₀^t T′(θ)dθ  (7.18)

with 0 ≤ T′(θ) ≤ 1, 0 ≤ θ < ∞. Define

D_t = ∫_s^t b(θ, x(θ), u(θ))dθ,  M_t = ∫_s^t σ(θ, x(θ), u(θ))dB(θ),  s ≤ t ≤ T.

Then from (7.13) we have

x(T(t)) = x + D_{T(t)} + M_{T(t)} + v¹(T(t)) − v²(T(t)).  (7.19)

Note that

D_{T(t)} = ∫₀^t b(T(θ), x(T(θ)), u(T(θ)))T′(θ)dθ.

As we know, M is a continuous square integrable martingale on the filtered probability base (Ω, F, F_t, P) with

⟨M⟩_t = ∫_s^t a(θ, x(θ), u(θ))dθ.

Let

M̂_t = M_{T(t)},  F̂_t = F_{T(t)};

then the optional sampling theorem implies that M̂ is a continuous P-martingale with

⟨M̂⟩_t = ∫_s^{T(t)} a(θ, x(θ), u(θ))dθ = ∫₀^t a(T(θ), x(T(θ)), u(T(θ)))T′(θ)dθ.

Therefore, there exists a standard extension
(Ω̂, F̂, F̂_t, P̂) of (Ω, F, F_{T(t)}, P), and a d-dimensional Brownian motion B̂ on this probability base, such that

M̂_t = ∫₀^t σ(T(θ), x(T(θ)), u(T(θ)))√(T′(θ)) dB̂_θ;

cf. Karatzas and Shreve [45]. Define

v̂^i(t) = ∫₀^t n^i(T(θ))(1 − T′(θ))dθ,  i = 1, 2.

Since r ∘ T(t) = min{θ : T(θ) = T(t)}, it can be verified easily that

v^i(T(t)) = v̂^i(t),  i = 1, 2,  for t ∈ [0, r(T)] \ ∪_{s≤θ≤T}(r(θ), r(θ+)],

and, for θ with r(θ) < r(θ+) and r(θ) ≤ t ≤ r(θ+),

v̂^i(t) = v^i(θ) + n^i(θ)(t − r(θ)),  i = 1, 2.

Define

x̂(t) = x(T(t)),  t ∈ [0, r(T)] \ ∪_{s≤θ≤T}[r(θ), r(θ+)),
x̂(t) = x(θ) + n̄(θ)(t − r(θ)),  r(θ) ≤ t < r(θ+), s ≤ θ ≤ T,

with n̄ = n¹ − n², and û(t) = u(T(t)), 0 ≤ t < ∞. Note that T′(t) = 0 a.e. on each interval [r(θ), r(θ+)), so we have

D_{T(t)} = ∫₀^t b(T(θ), x̂(θ), û(θ))T′(θ)dθ,  M̂_t = ∫₀^t σ(T(θ), x̂(θ), û(θ))√(T′(θ)) dB̂_θ,  0 ≤ t < ∞.  (7.20)

Then we can write (7.19) as

x̂(t) = x + ∫₀^t [b(T(θ), x̂(θ), û(θ))T′(θ) + n̄(T(θ))(1 − T′(θ))] dθ + ∫₀^t σ(T(θ), x̂(θ), û(θ))√(T′(θ)) dB̂_θ.

Therefore we have a control

α̂ = (Ω̂, F̂, F̂_t, P̂, (x̂⁰(t), x̂(t)), (û⁰(t), û(t), n̂¹(t), n̂²(t)), s, x)

for the problem (7.4), with

x̂⁰(t) = T(t),  û⁰(t) = T′(t),  n̂^i(t) = n^i(T(t)),  0 ≤ t < ∞, i = 1, 2,  (7.21)

where T is defined by (7.16) and (7.17). Now we can state the following result.

Proposition 7.7 For any admissible control α = (Ω, F, F_t, P, x(t), u(t), v¹(t), v²(t), s, x) ∈ A_{s,x}, there exists an admissible control α̂ ∈ Â_{s,x} such that

Ĵ(α̂) ≤ J(α).  (7.22)

Proof. We first assume that v¹ and v² are mutually orthogonal. Then we have obtained a control α̂ as in (7.21), and we need only show (7.22). It is easy to see that

∫_s^T f(θ, x(θ), u(θ))dθ = ∫₀^τ̂ f(x̂⁰(θ), x̂(θ), û(θ))û⁰(θ)dθ,

where τ̂ is defined by (7.6). Noticing that τ̂ < ∞ P̂-a.s., we can apply Lemma B.1 in the Appendix to get

∫_s^T [c(θ)·n¹(θ) + c̃(θ)·n²(θ)] dv̄(θ) = ∫₀^τ̂ [c(x̂⁰(θ))·n̂¹(θ) + c̃(x̂⁰(θ))·n̂²(θ)](1 − û⁰(θ))dθ.

Since (Ω̂, F̂, F̂_t, P̂) is a standard extension of (Ω, F, F_{T(t)}, P), we can conclude, by the definitions of J(·) and Ĵ(·), that J(α) = Ĵ(α̂); cf. (7.15).
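The time stretching r(t) = t − s + v̄(t) of (7.16) and its inverse T of (7.17) can be illustrated on a discrete grid; Lemma 7.6(b) says T is 1-Lipschitz. A minimal sketch (the grid representation and helper names are illustrative assumptions):

```python
import bisect

def stretch(times, vbar, s=0.0):
    """r(t) = t - s + vbar(t), evaluated on the grid `times`."""
    return [t - s + vb for t, vb in zip(times, vbar)]

def inverse_time(r_vals, times, t):
    """Discrete stand-in for T(t) = inf{theta : r(theta) >= t};
    r_vals must be increasing (r is strictly increasing)."""
    i = bisect.bisect_left(r_vals, t)
    return times[min(i, len(times) - 1)]
```

A unit jump of v̄ at time 0.5 stretches that instant into a unit-length interval of the new clock; every stretched instant is mapped by T back to the jump time, and T ∘ r is the identity, as noted after (7.17).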
If v¹, v² are not mutually orthogonal, then by Proposition 7.2 there exists a control ᾱ = (Ω, F, F_t, P, x(t), u(t), v̄¹(t), v̄²(t), s, x) ∈ A_{s,x} such that v̄¹, v̄² are mutually orthogonal and J(ᾱ) ≤ J(α). Therefore, by what we have shown, there exists a control α̂ ∈ Â_{s,x} such that Ĵ(α̂) = J(ᾱ) ≤ J(α). □

A converse to Proposition 7.7 is the following.

Proposition 7.8 For any admissible control

α̂ = (Ω̂, F̂, F̂_t, P̂, (x̂⁰(t), x̂(t)), (û⁰(t), û(t), n̂¹(t), n̂²(t)), s, x) ∈ Â_{s,x},

there exists an admissible control α ∈ A_{s,x}, given below by (7.24) and (7.27), such that

J(α) = Ĵ(α̂).  (7.23)

Proof. Define

r(t) = inf{θ : x̂⁰(θ) ≥ t},  (7.24)
D̂_t = ∫₀^t b(x̂⁰(θ), x̂(θ), û(θ))û⁰(θ)dθ,  (7.25)
M̂_t = ∫₀^t σ(x̂⁰(θ), x̂(θ), û(θ))√(û⁰(θ)) dB̂(θ).  (7.26)

Note that û⁰(·) is nonnegative, and therefore x̂⁰(·) is nondecreasing. It can be seen easily that r(·) is nondecreasing and left continuous, r(s) = 0, and, for fixed t, r(t) is an {F̂_θ}-stopping time since x̂⁰(·) is continuous. Define, for s ≤ t ≤ T,

x(t) = x̂(r(t)),  u(t) = û(r(t)),  v^i(t) = V̂^i(r(t)), i = 1, 2,  (7.27)

with

V̂^i(t) = ∫₀^t n̂^i(θ)(1 − û⁰(θ))dθ.

By Lemma 7.3 we know that x(·), u(·), v^i(·), i = 1, 2, are well defined. Since n̂(t) = (n̂¹(t), n̂²(t)) ∈ S and r(·) is nondecreasing and left continuous, we can conclude that v^i ∈ A^d[0, T], i = 1, 2. Moreover, we have

x(t) = x + D̂_{r(t)} + M̂_{r(t)} + v¹(t) − v²(t).  (7.28)

It can be verified easily that the change of variable θ′ = x̂⁰(θ) implies

D̂_{r(t)} = ∫_s^t b(θ′, x(θ′), u(θ′))dθ′.

By the optional sampling theorem, we know that M_t = M̂_{r(t)} is an F̂_{r(t)}-martingale. Moreover, for s ≤ t ≤ T,

⟨M⟩_t = ⟨M̂⟩_{r(t)} = ∫₀^{r(t)} a(x̂⁰(θ), x̂(θ), û(θ))û⁰(θ)dθ = ∫_s^t a(θ, x(θ), u(θ))dθ.

Applying the martingale representation theorem, there exist a standard extension (Ω, F, F_t, P) of the probability base (Ω̂, F̂, F̂_{r(t)}, P̂) and a d-dimensional Brownian motion B on (Ω, F, F_t, P) such that

M_t = ∫_s^t σ(θ, x(θ), u(θ))dB(θ).
Therefore we have, for s ≤ t ≤ T,

x(t) = x + ∫_s^t b(θ, x(θ), u(θ))dθ + ∫_s^t σ(θ, x(θ), u(θ))dB(θ) + v¹(t) − v²(t);  (7.29)

in other words, α = (Ω, F, F_t, P, x(t), u(t), v¹(t), v²(t), s, x) is a control for (7.13). It remains to show J(α) = Ĵ(α̂). In fact, by the change of variable θ′ = x̂⁰(θ),

∫_s^T f(θ, x(θ), u(θ))dθ = ∫₀^τ̂ f(x̂⁰(θ), x̂(θ), û(θ))û⁰(θ)dθ,

and by Lemma B.2(b) in the Appendix we have

∫_s^T c(θ)·dv¹(θ) = ∫₀^τ̂ c(x̂⁰(θ))·n̂¹(θ)(1 − û⁰(θ))dθ,
∫_s^T c̃(θ)·dv²(θ) = ∫₀^τ̂ c̃(x̂⁰(θ))·n̂²(θ)(1 − û⁰(θ))dθ,

and therefore J(α) = Ĵ(α̂). □

Evidently we now have

Corollary 7.9 For (t, x) ∈ D̄, W(t, x) = Ŵ(t, x). □

7.4 Existence of optimal control laws

A control α = (Ω, F, F_t, P, x(t), u(t), v¹(t), v²(t), s, x) ∈ A_{s,x} is called a control law if there exists a Borel measurable function û : D̄ → U such that

u(t, ω) = û(t, x(t, ω)),

and for any s ≤ t₁ ≤ t₂ ≤ T,

v^i(t₂) − v^i(t₁) ∈ σ(x(θ), t₁ ≤ θ ≤ t₂),  i = 1, 2.  (7.30)

Now we can state the main theorem of this chapter.

Theorem 7.10 There exists an optimal control law for the problem (2.1).

Proof. By Lemma 7.4 we know that K̂(t, x) is convex for any (t, x) ∈ D̄. Applying Theorem 7.5, there exists an optimal Markovian control α̂ = (Ω̂, F̂, F̂_t, P̂, (x̂⁰(t), x̂(t)), ū(t), s, x) for the control problem (7.4), with ū(t) = (û⁰(t), û(t), n̂(t)) and

û⁰(t) = Û⁰(x̂⁰(t), x̂(t)),  û(t) = Û(x̂⁰(t), x̂(t)),  n̂(t) = (N̂¹(x̂⁰(t), x̂(t)), N̂²(x̂⁰(t), x̂(t))),

where Û⁰ : ℝ₊ × ℝ^d → [0, 1], Û : ℝ₊ × ℝ^d → U, and N̂ = (N̂¹, N̂²) : ℝ₊ × ℝ^d → S are deterministic Borel measurable functions. As we have seen in the proof of Proposition 7.8, through an extension of the filtered probability space (Ω̂, F̂, F̂_t, P̂) and the transformations (7.27) we can get an admissible control α = (Ω, F, F_t, P, x(t), u(t), v¹(t), v²(t), s, x). By Proposition 7.8 and Corollary 7.9, we know that α is optimal, i.e., J(α) = W(s, x).

It remains to show that α is a control law. Note that v¹, v² are orthogonal to each other by Proposition 7.2. Since x̂⁰(·)
is continuous, we have x̂⁰(r(t)) = t for s ≤ t ≤ T. Therefore

u(t) = û(r(t)) = Û(x̂⁰(r(t)), x̂(r(t))) = Û(t, x(t)).

To show (7.30), let

J = {t : x(t) ≠ x(t+), s ≤ t ≤ T}.  (7.31)

Then from the equations (7.4) and the definitions (7.27) we can see that

J ⊂ {t : r(t) < r(t+), s ≤ t ≤ T}.  (7.32)

Moreover, r(t) < r(t+) implies that û⁰ = 0 a.e. on [r(t), r(t+)), so

x̂(r(t+)) = x̂(r(t)) + ∫_{r(t)}^{r(t+)} [n̂¹(θ) − n̂²(θ)] dθ.

By the orthogonality of v¹ and v², dv¹(t)·dv²(t) = 0, so on each such interval [r(t), r(t+)) either n̂¹ = 0 a.e. or n̂² = 0 a.e. Recall also that

r(t) ≤ τ̂,  s ≤ t ≤ T,  (7.33)

where τ̂ is defined by (7.6). So we can write, for s ≤ t ≤ T,

v^i(t) = V̂^i(r(t)) = ∫₀^{r(t)} N̂^i(x̂⁰(θ), x̂(θ))(1 − Û⁰(x̂⁰(θ), x̂(θ)))dθ
= ∫₀^{r(t)} 1_{{x̂⁰(θ) ∉ J}}(θ) N̂^i(x̂⁰(θ), x̂(θ))(1 − Û⁰(x̂⁰(θ), x̂(θ)))dθ + ∫₀^{r(t)} 1_{{x̂⁰(θ) ∈ J}}(θ) N̂^i(x̂⁰(θ), x̂(θ))(1 − Û⁰(x̂⁰(θ), x̂(θ)))dθ
=: I^i(t) + II^i(t).

It is obvious that if x̂⁰(θ) ∈ J, then x̂⁰(θ′) = x̂⁰(θ) for θ′ ∈ [r(x̂⁰(θ)), r(x̂⁰(θ)+)]. By the definition,

x̂⁰(θ) = s + ∫₀^θ Û⁰(x̂⁰(θ′), x̂(θ′))dθ′,

and 0 ≤ Û⁰ ≤ 1, so we must have

Û⁰(x̂⁰(θ′), x̂(θ′)) = 0 a.e. on [r(x̂⁰(θ)), r(x̂⁰(θ)+)].  (7.34)

Therefore we can rewrite

II^i(t) = Σ_{θ: r(θ) < r(θ+), s ≤ θ < t} ∫_{r(θ)}^{r(θ+)} N̂^i(x̂⁰(θ′), x̂(θ′))dθ′.

Hence II^i(·) (i = 1, 2) is a pure jump process, and from the definitions of x̂(·) and x(·) we can see that

II¹(t) = Σ_{s ≤ θ < t} (Δx_θ)⁺,  II²(t) = Σ_{s ≤ θ < t} (Δx_θ)⁻,

where

(Δx_θ)⁺ = max{x(θ+) − x(θ), 0},  (Δx_θ)⁻ = −min{x(θ+) − x(θ), 0}.

Therefore

II(t) := II¹(t) − II²(t) = Σ_{s ≤ θ < t} Δx_θ,  Δx_θ = x(θ+) − x(θ).

We now consider the process I^i(·). By Lemma B.2 and (7.33), noticing that r(t) > θ if and only if x̂⁰(θ) < t, we can write
I^i(t) = ∫₀^{r(t)} 1_{{x̂⁰(θ) ∉ J}}(θ) N̂^i(x̂⁰(θ), x̂(θ))(1 − Û⁰(x̂⁰(θ), x̂(θ)))dθ
= ∫₀^{r(t)} 1_{{x̂⁰(θ) ∉ J}}(θ) N̂^i(x̂⁰(θ), x̂(r(x̂⁰(θ))))(1 − Û⁰(x̂⁰(θ), x̂(r(x̂⁰(θ)))))dθ
= ∫_s^t N̂^i(θ, x(θ))(1 − Û⁰(θ, x(θ))) dr^c(θ),

where r^c(·) denotes the continuous part of the increasing process r(·). Define

R = {θ : Û⁰(θ, x(θ)) = 0, s ≤ θ ≤ T}.

From the definition of r(·), we know that

t = s + ∫₀^{r(t)} Û⁰(x̂⁰(θ), x̂(θ))dθ,  s ≤ t ≤ T.  (7.35)

As above, we can get

t − s = ∫_s^t Û⁰(θ, x(θ)) dr^c(θ),  s ≤ t ≤ T,

since Û⁰ = 0 a.e. on each interval [r(θ), r(θ+)) with x̂⁰(θ) ∈ J, by (7.34). Therefore, for any nonnegative Borel function k(·) defined on [s, T],

∫_{[s,T] ∩ R^c} k(θ) dr^c(θ) = ∫_{[s,T] ∩ R^c} (k(θ)/Û⁰(θ, x(θ))) dθ.  (7.36)

Hence

I(t) := I¹(t) − I²(t) = ∫_s^t 1_{R^c}(θ) N̂(θ, x(θ)) ((1 − Û⁰(θ, x(θ)))/Û⁰(θ, x(θ))) dθ + ∫_s^t N̂(θ, x(θ)) 1_R(θ) dr^c(θ) =: I^{ac}(t) + I^{sc}(t),

where N̂ = N̂¹ − N̂². Notice that from (7.36) and (7.35) we can conclude that m(R) = 0, where m denotes the Lebesgue measure on the real line. Therefore I^{ac}(·) and I^{sc}(·) are the absolutely continuous and singular parts, respectively, of the continuous bounded variation process I(·).

Now we consider I^{sc}(t). Recall that D̂ is defined by (7.25), and by (7.28) and m(R) = 0 we have

∫_s^t 1_R(θ) dD̂_{r(θ)} = 0,  s ≤ t ≤ T.  (7.37)

From (7.29) we know that M_t is an F̂_{r(t)}-martingale, and for s ≤ t ≤ T,

⟨M⟩_t = ∫_s^t a(θ, x(θ), u(θ))dθ.

Therefore

⟨∫_s^· 1_R(θ) dM_θ⟩_t = ∫_s^t 1_R(θ) a(θ, x(θ), u(θ))dθ = 0,  s ≤ t ≤ T,

and hence

∫_s^t 1_R(θ) dM_θ = 0,  s ≤ t ≤ T.  (7.38)

Moreover, from the definition of I^{ac} it is obvious that

∫_s^t 1_R(θ) dI^{ac}(θ) = 0,  s ≤ t ≤ T.  (7.39)

From (7.27) we can write

x^c(t) = x + D̂_{r(t)} + M_t + I^{ac}(t) + I^{sc}(t),  s ≤ t ≤ T,

where x^c(·) is the continuous part of the semimartingale x(·). Therefore, by (7.37), (7.38), (7.39) and the definition of I^{sc}, we have

∫_s^t 1_R(θ) dx^c(θ) = ∫_s^t 1_R(θ) dI^{sc}(θ) = I^{sc}(t).
Let v = v¹ − v²; we have shown that

v(t) = I(t) + II(t) = ∫_s^t N̂(θ, x(θ)) ((1 − Û⁰(θ, x(θ)))/Û⁰(θ, x(θ))) 1_{{θ: Û⁰(θ, x(θ)) > 0}}(θ) dθ + ∫_s^t 1_{{θ: Û⁰(θ, x(θ)) = 0}}(θ) dx^c(θ) + Σ_{s ≤ θ < t} Δx_θ,

and the three parts are the absolutely continuous, singular continuous and jump parts of the bounded variation process v(·), respectively. Thus (7.30) is obviously satisfied by v. Since v¹, v² are orthogonal to each other, we have v̄ = v¹ + v². From the definition of total variation we can conclude that (7.30) is also satisfied by v̄. The proof of the theorem is therefore completed by noticing that v¹ = ½(v̄ + v) and v² = ½(v̄ − v). □

7.5 Some comments

(a) The method we have used can be applied to the following more general problem, which is similar to the one formulated in Fleming and Soner [23], Chapter 8. Define

B = {n ∈ ℝ^k : ‖n‖ ≤ 1},  S = {n ∈ ℝ^k : ‖n‖ = 1},

and let C be a closed cone in ℝ^k, i.e., n ∈ C, λ > 0 ⟹ λn ∈ C. Define S⁰ = S ∩ C. Assume that c(·,·) : [0, T] × C → ℝ₊ is continuous and c > 0 on S⁰. We also assume that c(·,·) satisfies the condition

c(t, λx) = λ c(t, x)  for 0 ≤ λ ≤ 1, x ∈ B ∩ C.

Instead of the controls v¹, v² as in (7.1), we use

v(t) = ∫_s^t n(θ) dξ(θ).

The state equation thus becomes

x_t = x + ∫_s^t b(θ, x_θ, u_θ)dθ + ∫_s^t σ(θ, x_θ, u_θ)dB_θ + ∫_s^t n(θ)dξ(θ),  (s, x) ∈ D̄,

and the singular control is expressed by a pair of processes n(·) : [s, T] → S⁰ and ξ : [s, T] → ℝ₊, progressively measurable with respect to F_t, with ξ(s) = 0 and sample paths of ξ(·) nondecreasing, in V[0, T]. The corresponding cost function is defined by

J(α) = E{∫_s^T f(t, x_t, u_t)dt + ∫_{[s,T)} c(t, n(t))dξ(t)}.

We can get the existence of an optimal control law for this problem. First let us relax the constraint n(θ) ∈ S⁰ to n(θ) ∈ B⁰ := B ∩ C. If v(·) = ∫_s^· n(θ)dξ(θ) with n(θ) ∈ B⁰, let n₀ ∈ S⁰ be arbitrary, and set

ñ(t) = n(t)/‖n(t)‖ if n(t) ≠ 0,  ñ(t) = n₀ if n(t) = 0,
ξ̃(t) = ∫_s^t ‖n(θ)‖ dξ(θ).

Then

v(·) = ∫_s^· ñ(θ)dξ̃(θ),  ∫_s^· c(t, n(t))dξ(t) = ∫_s^· c(t, ñ(t))dξ̃(t),

and ñ(θ) ∈ S⁰.
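The normalization step above, replacing a ball-valued direction by a unit direction while rescaling the increments of ξ, can be sketched in discrete time (function and variable names are illustrative assumptions; the positive homogeneity c(t, λx) = λc(t, x) is what keeps the cost unchanged):

```python
import math

def normalize_control(directions, dxi, n0):
    """Replace directions n in the unit ball by unit vectors and rescale
    the nondecreasing increments dxi so that the vector measure n dxi
    is unchanged; n0 is an arbitrary fixed unit vector used when n = 0."""
    out_n, out_dxi = [], []
    for n, d in zip(directions, dxi):
        norm = math.sqrt(sum(c * c for c in n))
        if norm == 0.0:
            out_n.append(n0)
            out_dxi.append(0.0)   # d * ||n|| = 0
        else:
            out_n.append(tuple(c / norm for c in n))
            out_dxi.append(d * norm)
    return out_n, out_dxi
```

Because each rescaled increment equals the old increment times ‖n‖, the products n dξ and c(t, n) dξ are preserved term by term.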
The corresponding classical control problem is then (7.4) with n̂(θ) ∈ B⁰ and

Ĵ(α̂) = Ê ∫₀^τ̂ [f(x̂⁰(θ), x̂(θ), û(θ))û⁰(θ) + c(x̂⁰(θ), n̂(θ))(1 − û⁰(θ))] dθ.

The corresponding K̂(t, x) defined in (7.7) can be written as

K̂(t, x) = {(a(t, x, u)u⁰, b(t, x, u)u⁰ + n(1 − u⁰), z) : z ≥ f(t, x, u)u⁰ + c(t, n)(1 − u⁰), u⁰ ∈ [0, 1], u ∈ U, n ∈ B⁰},

and it can be shown that Lemma 7.4 is still true.

Now we outline the changes needed in the proof of Theorem 7.10. As before, we may assume F_t to be right continuous. Again Theorem 7.5 gives us Borel functions Û⁰, Û and N̂. Let the S⁰-valued function Ñ(t, x) be defined from N̂(t, x) just as ñ from n. Define

n(θ) = (x̂(r(θ+)) − x̂(r(θ)))/(r(θ+) − r(θ)) if r(θ) < r(θ+),
n(θ) = Ñ(x̂⁰(θ), x̂(θ)) otherwise.

Then n(·) is progressively measurable with respect to F_t. We also define

ξ(t) = ∫₀^{r(t)} ‖N̂(x̂⁰(θ), x̂(θ))‖ (1 − Û⁰(x̂⁰(θ), x̂(θ))) dθ.

The proof of the property (7.30) is similar, for v(t) = ∫_s^t n(θ)dξ(θ). Unfortunately, we are unable to show that there exists a function N(·,·) such that n(t) = N(t, x(t)) and ξ satisfies (7.30).

(b) It is well known that under some conditions the value function W(t, x) for the classical control problem in Section 2.2 satisfies the following Hamilton–Jacobi–Bellman equation

inf_{ū ∈ Ê} {L^{ū} W(t, x) + f^{ū}(t, x)} = 0  (7.40)

in some generalized sense (e.g., in the viscosity sense), cf. Krylov [50], Lions [55], where, for ū = (u⁰, u, n¹, n²) ∈ Ê,

L^{ū} φ = Σ_{i,j} â_{ij}(t, x, ū) ∂²φ/∂x_i∂x_j + b̂(t, x, ū)·∇φ,

f^{ū}(t, x) = f(t, x, u)u⁰ + [c(t)·n¹ + c̃(t)·n²](1 − u⁰),

with

â(t, x, ū) = ( 0 0 ; 0 a(t, x, u)u⁰ ),  b̂(t, x, ū) = ( u⁰ ; b(t, x, u)u⁰ + (n¹ − n²)(1 − u⁰) ).

We can rewrite (7.40) as
•1 = 1,2,••,d) = 0, which is derived in Section 5.3 by the dynamic programming principle. (7.42) Appendix A Set-valued functions In this part, we will finish the proof of Lemma 6.1. We first recall recall some results from the set-valued mapping theory. Let X be a separable metric space with a metric 7. We denote by Comp(X) the space of all the compact subsets of X and define a metric p(K ,K 1 ) between two points of Comp(X) 2 by ,K 1 p(K ) 2 where for any set A C X, A = = inf{e> 0,1(1 C I( and “2 C I(}, {y, 7(x,y) < e for some x E A}, in other words, A° is the sphere around A of radius e. Then (Comp(X), p) is a separable metric space. The following results were used in Chapter 6, we write them down here without proofs. For details, see Stroock and Varadhan [73]. Proposition A.1 (a) Let f(x) be a real valued upper (lower) semicontinuous function on X. Consider the maps .7(K) (f(K), respectively) and f(K) induced by f that map Gomp(X) into R and Comp(X), respectively as follows: f(1() = sup fix), ccEK f(K) Then the maps K = ‘— (f(K) {y eK: f(y) J(I() (K = inf f(x)), — i.-÷ = J(K)}, (f(K) = {y C K: f(y) f(K), respectively) and K —÷ = f(K) are Borel maps of Comp(X) into 18 and Comp(X), respectively. (b) Let Y be a metric space and B its Borel a-field. Let y i—* K be a map of Y Comp(X) for some separable metric space X. Suppose for any sequence y 120 —÷ y, x,-, e into , 3 K, Appendix A. Set-valued functions 121 it is true that x. has a limit point x in K, then the map y ‘- K,, is a Borel map of Y into Comp(X). (c) Let (F, .1) be any measurable space and q There is a measurable map q ‘—p ‘—÷ K, a measurable map of F into Gomp(X). h(q) of F into X such that h(q) C K for every q C F. U The map h is called a measurable selector of K.. With these results we can prove the following propositiou which is used in Chapter 6. Recall that the function h(.,.) is defined in the proof of Lemma 6.1. 
Proposition A.2 The function h(t, ·) : ℝ^d → ℝ^k is Borel measurable for each fixed t.

Before proving Proposition A.2, we define a map F : Comp(ℝ^k) → ℝ^k as follows: for K ∈ Comp(ℝ^k), let

F¹(K) = max{h¹ : (h¹, h̃) ∈ K for some h̃ ∈ ℝ^{k−1}},
F²(K) = max{h² : (F¹(K), h², h̃) ∈ K for some h̃ ∈ ℝ^{k−2}},
…,
F^k(K) = max{h^k : (F¹(K), …, F^{k−1}(K), h^k) ∈ K},

and F(K) = (F¹(K), F²(K), …, F^k(K)).

Lemma A.3 F : Comp(ℝ^k) → ℝ^k is Borel measurable.

Proof. For h = (h¹, …, h^k) ∈ ℝ^k, define f^i(h) = h^i (1 ≤ i ≤ k), which are real valued continuous functions on ℝ^k. In the notation of Proposition A.1, it is easy to see that

F¹(K) = f̄¹(K),

and hence F¹ : Comp(ℝ^k) → ℝ is Borel measurable. Also, it is obviously true that

F²(K) = f̄² ∘ f̂¹(K),

and more generally,

F^i(K) = f̄^i ∘ f̂^{i−1} ∘ ⋯ ∘ f̂¹(K)

for 1 ≤ i ≤ k, where ∘ denotes the composition of functions. By Proposition A.1 we get the Borel measurability of F^i(·), and therefore F : Comp(ℝ^k) → ℝ^k is Borel measurable. □

Next, we define a map K. : ℝ^d → Comp(ℝ^k) by

K_x = {h : h ∈ ℝ₊^k, W(t, x) = W(t, x + gh) + c(t)·h}  (A.1)

for x ∈ ℝ^d. From the Lipschitz continuity of the function W(t, ·), the positivity of c(t), and the fact that W is bounded below, it can easily be seen that K_x ∈ Comp(ℝ^k).

Lemma A.4 The map K. : ℝ^d → Comp(ℝ^k) is Borel measurable.

Proof. By Proposition A.1(b) it is enough to show that for any x_n → x in ℝ^d and h_n ∈ K_{x_n}, the sequence h_n has a limit point h ∈ K_x. In fact, since W is bounded and the c_i(·) are strictly positive, it is obvious that {h_n} is bounded. Therefore there is a point h ∈ ℝ₊^k such that h_n → h along a subsequence, and from the continuity of W(t, ·) we know that h ∈ K_x. □

Proof of Proposition A.2. The proof follows easily from Lemmas A.3 and A.4. In fact, for t fixed, h(t, x) = F(K_x), i.e., h(t, ·) is the composition of the maps K. : ℝ^d → Comp(ℝ^k) and F(·) : Comp(ℝ^k) → ℝ^k. The measurability of h(t, ·) follows from that of K. and F. □

Appendix B

Some results from real analysis

The following result is used in Proposition 7.7. Recall that τ̂ and r(·)
are defined by (7.6) and (7.24), respectively.

Lemma B.1 For any nonnegative Borel measurable function k(·) on R_+,

∫_s^T k(θ) dϑ(θ) = ∫_0^τ k(t̂°(θ)) (1 − u̇°(θ)) dθ.   (B.2)

Proof. First we assume k(·) = 1_{[s,t)}(·), with t ≤ T. Note that θ < τ(t) if and only if t̂°(θ) < t; therefore

∫_0^τ 1_{t̂°(θ)<t}(θ) (1 − u̇°(θ)) dθ = ∫_0^τ 1_{0≤θ<τ(t)}(θ) (1 − u̇°(θ)) dθ
= ∫_0^{τ(t)} (1 − u̇°(θ)) dθ
= τ(t) − (t̂°(τ(t)) − s)
= τ(t) − t + s
= ϑ(t),

which is exactly the left-hand side of (B.2) for k(·) = 1_{[s,t)}(·). Taking differences of two such functions, we see that (B.2) also holds for functions of the form k = 1_{[u,v)}. Now applying the monotone class argument we obtain the general case. □

Let u°(·), k(·) be nonnegative Borel measurable functions on R_+, with u°(·) ≤ 1. Define

f(t) = s + ∫_0^t u°(θ) dθ,   0 ≤ t < ∞,
g(t) = inf{θ : f(θ) ≥ t}   (inf ∅ = ∞),
v(t) = ∫_0^{g(t)} k(θ) dθ.

The following results were used in the proofs of Proposition 7.8 and Theorem 7.10.

Lemma B.2 (a) For any nonnegative Borel measurable function l(·) on [s, T],

∫_0^{g(T)} l(f(θ)) dθ = ∫_s^T l(θ) dg(θ).   (B.3)

(b) For any nonnegative Borel measurable function l(·) defined on [s, T],

∫_s^T l(θ) dv(θ) = ∫_0^{g(T)} l(f(θ)) k(θ) dθ.   (B.4)

Proof. (a) As in the proof of Lemma B.1, we need only show the case l(θ) = 1_{[s,t)}(θ) for t ∈ [s, T]. Note that g(t) > θ if and only if f(θ) < t; therefore

∫_0^{g(T)} 1_{[s,t)}(f(θ)) dθ = m(θ : f(θ) < t) = m(θ : 0 ≤ θ < g(t)) = g(t),

which is exactly the right-hand side of (B.3) (here m is the Lebesgue measure on R).

(b) As before, we may assume l(θ) = 1_{[s,t)}(θ) for some s ≤ t ≤ T. Then it follows that

∫_0^{g(T)} 1_{s≤f(θ)<t}(θ) k(θ) dθ = ∫_0^{g(T)} 1_{0≤θ<g(t)}(θ) k(θ) dθ = ∫_0^{g(t)} k(θ) dθ = v(t),

which, by definition, is equal to the left-hand side of (B.4). □

Remark B.3 Dellacherie and Meyer [17], VI.54-55, give a slightly different but more general form of Lemma B.2(a). □

Bibliography

[1] J. P. Aubin and A. Cellina, Differential Inclusions, Springer-Verlag, 1984
[2] F. M.
Baldursson, Singular stochastic control and optimal stopping, Stochastics, 21(1987), pp. 1-40
[3] J. A. Bather and H. Chernoff, Sequential decisions in the control of a spaceship, Proc. Fifth Berkeley Symp. on Math. Stat. and Probab., Vol. 3, Univ. of California Press, Berkeley, 1967, pp. 181-207
[4] V. E. Beneš, L. A. Shepp and H. S. Witsenhausen, Some solvable stochastic control problems, Stochastics, 4(1980), pp. 39-83
[5] A. Bensoussan and J. L. Lions, Applications des Inéquations Variationnelles en Contrôle Stochastique, Dunod, Paris, 1978
[6] P. Billingsley, Convergence of Probability Measures, Wiley, New York, 1968
[7] M. I. Borodowski, A. S. Bratus and F. L. Chernous'ko, Optimal impulse correction under random perturbation, J. Appl. Math. Mech., 39(1975), pp. 797-805
[8] A. S. Bratus, Solution of certain optimal correction problems with error of execution of the control action, J. Appl. Math. Mech., 38(1974), pp. 433-440
[9] D. S. Bridge and S. E. Shreve, Multi-dimensional finite-fuel singular stochastic control, preprint, 1991
[10] F. L. Chernous'ko, Optimum correction under active disturbances, J. Appl. Math. Mech., 32(1968), pp. 203-208
[11] F. L. Chernous'ko, Self-similar solution of the Bellman equation for optimal correction of random disturbances, J. Appl. Math. Mech., 35(1971), pp. 333-342
[12] M. Chiarolla and U. G. Haussmann, The free boundary of the monotone follower, to appear in SIAM J. Control and Opt.
[13] M. Chiarolla and U. G. Haussmann, The optimal control of the cheap monotone follower, to appear in Stochastics
[14] P. L. Chow, J. L. Menaldi and M. Robin, Additive control of stochastic linear systems with finite time horizon, SIAM J. Control and Opt., 23(1985), pp. 858-899
[15] M. G. Crandall, H. Ishii and P. L. Lions, A user's guide to viscosity solutions, Bulletin of A.M.S., N.S. 27(1992), pp. 1-67
[16] M. H. A. Davis and A. R. Norman, Portfolio selection with transaction costs, Math. Oper. Res., 15(1990), pp. 676-713
[17] C. Dellacherie and P. A.
Meyer, Probabilities and Potential, North-Holland Mathematics Studies 29, 1978
[18] N. El Karoui, D. Huu Nguyen and M. Jeanblanc-Picqué, Compactification methods in the control of degenerate diffusions: existence of an optimal control, Stochastics, 20(1987), pp. 169-219
[19] N. El Karoui, D. Huu Nguyen and M. Jeanblanc-Picqué, Existence of an optimal Markovian filter for the control under partial observations, SIAM J. Control and Opt., 26(1988), pp. 1025-1061
[20] N. El Karoui and I. Karatzas, Probabilistic aspects of finite-fuel, reflected follower problem, Acta Appl. Math., 11(1988), pp. 223-258
[21] S. N. Ethier and T. G. Kurtz, Markov Processes: Characterization and Convergence, John Wiley & Sons, 1986
[22] W. H. Fleming and R. W. Rishel, Deterministic and Stochastic Optimal Control, Springer-Verlag, New York, 1975
[23] W. H. Fleming and H. M. Soner, Controlled Markov Processes and Viscosity Solutions, Springer-Verlag, New York, 1993
[24] A. Friedman, Variational Principles and Free Boundary Problems, John Wiley & Sons, New York, 1982
[25] J. M. Harrison, T. M. Sellke and A. J. Taylor, Impulse control of Brownian motion, Math. Oper. Res., 8(1983), pp. 454-466
[26] J. M. Harrison and M. I. Taksar, Instantaneous control of Brownian motion, Math. Oper. Res., 8(1983), pp. 439-453
[27] J. M. Harrison and A. J. Taylor, Optimal control of a Brownian storage system, Stoch. Proc. Appl., 6(1978), pp. 179-194
[28] U. G. Haussmann, A Stochastic Maximum Principle for Optimal Control of Diffusions, Pitman Research Notes in Math. Series 151, 1986
[29] U. G. Haussmann, Existence of optimal Markovian controls for degenerate diffusions, Lecture Notes in Control and Information Science 78(1986), pp. 171-186
[30] U. G. Haussmann and J. P. Lepeltier, On the existence of optimal controls, SIAM J. Control and Opt., 28(1990), pp. 851-902
[31] U. G. Haussmann and W. Suo, Singular stochastic controls I: Existence of optimal controls, to appear in SIAM J.
on Control and Optimization
[32] U. G. Haussmann and W. Suo, Singular optimal controls II: The dynamic programming principle and applications, to appear in SIAM J. on Control and Optimization
[33] U. G. Haussmann and W. Suo, Existence of singular optimal control laws for stochastic differential equations, to appear in Stochastics
[34] A. C. Heinricher and V. J. Mizel, A stochastic control problem with different value functions for singular and absolutely continuous control, Proc. of 25th Conf. on Decision and Control, 1986
[35] N. Ikeda and S. Watanabe, Stochastic Differential Equations and Diffusion Processes, North-Holland, Amsterdam, 1981
[36] H. Ishii, Uniqueness of unbounded viscosity solutions of Hamilton-Jacobi equations, Indiana U. Math. J., 26(1984), pp. 721-748
[37] S. D. Jacka, A finite fuel stochastic control problem, Stochastics, 10(1983), pp. 103-113
[38] J. Jacod and J. Mémin, Sur un type de convergence intermédiaire entre la convergence en loi et la convergence en probabilité, Séminaire de Probabilités XV, Lect. Notes in Math. 850, Springer-Verlag, 1981, pp. 529-540
[39] J. Jacod and A. Shiryaev, Limit Theorems for Stochastic Processes, Springer-Verlag, Berlin, 1987
[40] I. Karatzas, The monotone follower problem in stochastic decision theory, Appl. Math. Opt., 7(1981), pp. 175-189
[41] I. Karatzas, A class of singular control problems, Adv. Appl. Probab., 15(1983), pp. 225-254
[42] I. Karatzas, Stochastic control under finite fuel constraints, The IMA Volumes in Math. and Its Appl., Vol. 10, 1988
[43] I. Karatzas, J. P. Lehoczky, S. P. Sethi and S. E. Shreve, Explicit solution of a general consumption/investment problem, Math. Oper. Res., 11(1986), pp. 261-294
[44] I. Karatzas, J. P. Lehoczky and S. E. Shreve, Optimal portfolio and consumption decisions for a small investor, SIAM J. Control and Opt., 25(1987), pp. 1557-1586
[45] I. Karatzas and S. E.
Shreve, Brownian Motion and Stochastic Calculus, Springer-Verlag, New York, 1987
[46] I. Karatzas and S. E. Shreve, Connections between optimal stopping and stochastic control I: Monotone follower problems, SIAM J. Control and Opt., 22(1984), pp. 856-877
[47] I. Karatzas and S. E. Shreve, Connections between optimal stopping and stochastic control II: Reflected follower problems, SIAM J. Control and Opt., 23(1985), pp. 433-451
[48] I. Karatzas and S. E. Shreve, Equivalent models for finite-fuel stochastic control, Stochastics, 18(1986), pp. 245-276
[49] E. V. Krichagina and M. I. Taksar, Diffusion approximation for GI/G/1 controlled queue, Queueing Systems, 12(1992), pp. 333-368
[50] N. Krylov, Controlled Diffusion Processes, Springer-Verlag, New York, 1980
[51] T. G. Kurtz, Random time changes and convergence in distribution under Meyer-Zheng conditions, The Annals of Probab., 19(1991), pp. 1010-1034
[52] H. J. Kushner and K. M. Ramachandran, Nearly optimal singular controls for wide band noise driven systems, SIAM J. Control and Opt., 26(1988), pp. 569-591
[53] R. Larsen, Functional Analysis: An Introduction, Marcel Dekker, Inc., New York, 1973
[54] J. P. Lehoczky and S. E. Shreve, Absolutely continuous and singular stochastic control, Stochastics, 17(1986), pp. 91-109
[55] P. L. Lions, Optimal control of diffusion processes and HJB equations, Part 1: The dynamic programming principle and applications; Part 2: Viscosity solutions and uniqueness, Comm. in PDE, 8(1983), pp. 1101-1174; pp. 1229-1276
[56] P. L. Lions and J. L. Menaldi, Optimal control of stochastic integrals and Hamilton-Jacobi-Bellman equations I, II, SIAM J. Control and Opt., 20(1982), pp. 58-81, pp. 82-95
[57] P. L. Lions and A. S. Sznitman, Stochastic differential equations with reflecting boundary conditions, Comm. Pure Appl. Math., 37(1984), pp. 511-537
[58] J. Ma, On the principle of smooth fit for a class of singular stochastic control problems for diffusions, SIAM J. Control and Opt., 30(1992), pp. 975-999
[59] L. F. Martins and H. J.
Kushner, Routing and singular control for queueing networks in heavy traffic, SIAM J. Control and Opt., 28(1990), pp. 1209-1233
[60] J. L. Menaldi, On the optimal stopping time problem for degenerate diffusions, SIAM J. Control and Opt., 18(1980), pp. 697-721
[61] J. L. Menaldi, On the optimal impulse control problems for degenerate diffusions, SIAM J. Control and Opt., 18(1980), pp. 722-739
[62] J. L. Menaldi and M. Robin, On some cheap control problems for diffusion processes, T.A.M.S., 278(1983), pp. 771-802
[63] J. L. Menaldi and M. Robin, On singular stochastic control problems for diffusions with jumps, IEEE Trans. Auto. Control, AC-29(1984), pp. 991-1004
[64] J. L. Menaldi and E. Rofman, On stochastic control problems with impulse cost vanishing, Proc. International Symposium on Semi-Infinite Programming and Appl., Lect. Notes in Econom. and Math. Sys. 215, Springer-Verlag, New York, 1983, pp. 281-294
[65] J. L. Menaldi and M. I. Taksar, Optimal correction problem of a multidimensional stochastic system, Automatica, 23(1989), pp. 223-232
[66] P. A. Meyer and W. A. Zheng, Tightness criteria for laws of semimartingales, Ann. Inst. Henri Poincaré, 20(1984), pp. 353-372
[67] S. E. Shreve, An introduction to singular stochastic control, The IMA Volumes in Math. and Its Appl., Vol. 10, 1988
[68] S. E. Shreve, J. P. Lehoczky and D. P. Gaver, Optimal consumption for general diffusions with absorbing and reflecting barriers, SIAM J. Control and Opt., 22(1984), pp. 55-75
[69] S. E. Shreve and H. M. Soner, Optimal investment and consumption with transaction costs, Res. Report No. 92-SA-001, Dept. of Math., Carnegie Mellon Univ.
[70] A. V. Skorohod, Limit theorems for stochastic processes, Theor. Prob. Appl., 1(1956), pp. 261-284
[71] H. M. Soner and S. E. Shreve, Regularity of the value function for a two-dimensional singular stochastic control problem, SIAM J. Control and Opt., 27(1989), pp. 876-907
[72] H. M. Soner and S. E. Shreve, A free boundary problem related to singular stochastic control: the parabolic case, Comm. Partial Diff.
Equations, 16(1991), pp. 373-424
[73] D. W. Stroock and S. R. S. Varadhan, Multidimensional Diffusion Processes, Springer-Verlag, 1979
[74] M. Sun, Singular control problems in bounded intervals, Stochastics, 10(1983), pp. 103-113
[75] M. Sun and J. L. Menaldi, Monotone control of a damped oscillator under random perturbations, IMA Journal of Math. Control and Infor., 5(1988), pp. 169-186
[76] M. I. Taksar, Storage model with discontinuous holding cost, Stoch. Proc. and Their Appl., 18(1984), pp. 291-300
[77] M. I. Taksar, Average optimal singular control and a related stopping problem, Math. Oper. Res., 10(1985), pp. 63-81
[78] P. van Moerbeke, On optimal stopping and free boundary problems, Arch. Rational Mech. Anal., 60(1976), pp. 101-148
[79] S. R. S. Varadhan and R. J. Williams, Brownian motion in a wedge with oblique reflection, Comm. Pure Appl. Math., 38(1985), pp. 405-443
[80] S. A. Williams, P. L. Chow and J. L. Menaldi, Regularity of the free boundary in singular stochastic control, to appear in J. Diff. Equations
[81] H. Zhu, Variational inequalities and dynamic programming for singular stochastic control, Ph.D. Thesis, Brown University, 1991