THE EXISTENCE OF OPTIMAL SINGULAR CONTROLS FOR STOCHASTIC DIFFERENTIAL EQUATIONS

By
Wulin Suo
B. Sc., M. Sc. (Mathematics), Hebei University, China

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE STUDIES
DEPARTMENT OF MATHEMATICS AND INSTITUTE OF APPLIED MATHEMATICS

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
February 1994
© Wulin Suo, 1994

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Mathematics and Institute of Applied Mathematics
The University of British Columbia
Vancouver, Canada
V6T 1Z2

Date:

Abstract

We study a singular control problem where the state process is governed by an Itô stochastic differential equation allowing both classical and singular controls. By reformulating the state equation as a martingale problem on an appropriate canonical space, we show, under mild continuity conditions on the data, that an optimal control exists. The dynamic programming principle for the problem is established through the method of conditioning and concatenation. Moreover, it is shown that there exists a family of optimal controls such that the corresponding states form a Markov process.

When the data are Lipschitz continuous, the value function is shown to be uniformly continuous and to be the unique viscosity solution of the corresponding Hamilton-Jacobi-Bellman variational inequality.
We also provide a description of the continuation region, the region in which the optimal state process is continuous, and we show that there exists a family of optimal controls which keeps the state inside the region after a possible initial jump.

The last part is independent of the rest of the thesis. Through stretching of time, the singular control problem is transformed into a new problem that involves only classical control. Such problems are relatively well understood. As a result, it is shown that there exists an optimal control where the classical control variable is in Markovian form and the increment of the singular control variable on any time interval is adapted to the state process on the same time interval.

Table of Contents

Abstract
Acknowledgement
1 Introduction
2 Formulation of the Problem
  2.1 Introduction
  2.2 Model problems
  2.3 Statement of the problem
  2.4 Control rules
  2.5 The topology on the canonical space
3 Existence of Optimal Controls
  3.1 Introduction
  3.2 An equivalent definition for control rules
  3.3 Existence of optimal controls
  3.4 Some comments
4 The Dynamic Programming Principle
  4.1 Introduction
  4.2 Some preparations
  4.3 The dynamic programming principle
  4.4 Markov property
5 The Value Function
  5.1 Introduction
  5.2 Continuity of the value function
  5.3 The dynamic programming equation
    5.3.1 Heuristic derivation of the dynamic programming equation
    5.3.2 Viscosity solution
  5.4 The uniqueness of viscosity solution to the HJB equation
6 The Continuation Region
7 The Existence of Optimal Control Laws
  7.1 Introduction
  7.2 Formulation of the problem
    7.2.1 The singular control problem
    7.2.2 The classical control problem
  7.3 Equivalence of the two problems
  7.4 Existence of optimal control laws
  7.5 Some comments
A Set-valued functions
B Some results from real analysis
Bibliography

Acknowledgement

First and foremost I am grateful to Professor Ulrich Haussmann.
Throughout the years of my graduate study at UBC, I have benefited enormously from his ideas, enthusiasm and overall guidance. I thank him not only for serving as my research supervisor and spending countless hours in sharing his broad knowledge of mathematics with me, but also for his encouragement, patience and consideration during the years. I also thank Professors Philip Loewen and Ed Perkins for being on the supervisory committee, and for giving those wonderful courses which I have enjoyed taking.

I am also indebted to many of my fellow graduate students, whose friendship has helped to make the student life tolerable, and to the staff in the department, especially Ms. Tina Tang, for their help.

In addition, I would like to take this opportunity to thank my best friends Yan-qun Liu and Xue-feng Zhang of Hebei University for the favors they have done me since I left China to pursue my graduate study here.

Finally, I thank my wife Jane and my son Charles for their love, understanding and tolerance. To them this thesis is dedicated.

Chapter 1

Introduction

The class of singular stochastic control problems, which has been studied extensively in recent years, deals with systems described by a stochastic differential equation in which one restricts the cumulative displacement of the state caused by control to be of an additive nature, or in other words, of bounded variation on finite intervals. In classical control problems, this cumulative displacement is the integral of some function of the state (see Fleming and Rishel [22], Krylov [50]) and so is absolutely continuous. In impulsive control problems (see Bensoussan and Lions [5]), this cumulative displacement has jumps, between which it is either constant or absolutely continuous.
Singular control problems admit both of these possibilities and also the possibility that the displacement of the state caused by the optimal control is singularly continuous with respect to the Lebesgue measure on the time interval.

More precisely, in singular control problems the state process is governed by the following $d$-dimensional stochastic differential equation
\[
\begin{cases}
dx_t = b(t, x_t, u_t)\,dt + \sigma(t, x_t, u_t)\,dB_t + g(t)\,dv_t, & s \le t \le T, \\
x_s = x, & x \in \mathbb{R}^d,
\end{cases}
\tag{1.1}
\]
on some filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$, where $b(\cdot,\cdot,\cdot): [0,T] \times \mathbb{R}^d \times U \to \mathbb{R}^d$, $\sigma(\cdot,\cdot,\cdot): [0,T] \times \mathbb{R}^d \times U \to \mathbb{R}^{d \times l}$ and $g(\cdot): [0,T] \to \mathbb{R}^{d \times k}$ are given deterministic functions, $(B_t, t \ge 0)$ is an $l$-dimensional Brownian motion, $x$ is the initial state at time $s$, and $u: [0,T] \to U$, $v: [0,T] \to \mathbb{R}^k$, with $v$ nondecreasing componentwise, stand for controls. We call $u$ the classical control variable, and $v$ the singular control variable. When $k = d$ and $g(\cdot) = I$, the $d \times d$ unit matrix, the problem is often referred to as a monotone follower problem, and when $k = 2d$ and $g(\cdot) = (I, -I)$, it is also called in the literature a bounded variation follower problem. Moreover, we should point out that in most of the literature about singular control problems, there is no classical control variable $u$ involved.

The expected cost has the form
\[
J \equiv E\Big\{ \int_s^T f(t, x_t, u_t)\,dt + \int_{[s,T)} c(t) \cdot dv_t \Big\}
\tag{1.2}
\]
where $f(\cdot,\cdot,\cdot): [0,T] \times \mathbb{R}^d \times U \to \mathbb{R}$ and $c(\cdot): [0,T] \to \mathbb{R}^k$ are given; $f$ stands for the running cost rate of the problem and $c$ for the cost rate of applying the singular control.
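
As an illustration only, the dynamics (1.1) and the cost (1.2) can be approximated numerically in the scalar case $d = l = k = 1$. The Python sketch below is not part of the thesis: the function name `simulate_cost`, the Euler time-stepping, the Monte Carlo averaging, and the assumption that both controls are given in feedback form `u_policy(t, x)` and `dv_policy(t, x)` are all choices made for the example. The running cost is evaluated at the state before the singular increment is applied, which mimics the left-continuous convention used for the state process.

```python
import math
import random


def simulate_cost(x0, b, sigma, g, f, c, u_policy, dv_policy,
                  s=0.0, T=1.0, n_steps=200, n_paths=500, seed=0):
    """Monte Carlo estimate of the cost (1.2) for a scalar version of (1.1).

    All arguments are hypothetical inputs for this sketch:
      b, sigma, f : functions of (t, x, u)
      g, c        : functions of t
      u_policy    : classical control in feedback form, (t, x) -> u
      dv_policy   : singular control increment over one step, (t, x) -> dv
    """
    rng = random.Random(seed)
    dt = (T - s) / n_steps
    total = 0.0
    for _ in range(n_paths):
        x, cost = x0, 0.0
        for i in range(n_steps):
            t = s + i * dt
            u = u_policy(t, x)
            # The singular control is nondecreasing, so increments are clipped at 0.
            dv = max(0.0, dv_policy(t, x))
            # Running cost at the pre-jump state, plus the cost of the increment.
            cost += f(t, x, u) * dt + c(t) * dv
            dB = rng.gauss(0.0, math.sqrt(dt))
            # Euler step for dx = b dt + sigma dB + g dv.
            x += b(t, x, u) * dt + sigma(t, x, u) * dB + g(t) * dv
        total += cost
    return total / n_paths
```

With $g \equiv -1$ and a policy that returns the positive part of the state, for instance, the first step removes the whole initial displacement at cost $c(s)$ times its size, which is the discrete analogue of an initial jump.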
If we let the value function of the problem be $W(t, x)$, i.e., the infimum of $J$ over all admissible controls, then a heuristic application of the dynamic programming principle will lead to the following variational inequality, or Hamilton-Jacobi-Bellman equation,
\[
\begin{aligned}
&\inf_{u \in U} (\mathcal{L}W + f)(t, x, u) \ge 0, \\
&(g^* \nabla_x W(t, x))_i + c_i(t) \ge 0, \quad i = 1, 2, \ldots, k, \quad \text{and} \\
&\inf_{u \in U} (\mathcal{L}W + f) \prod_{i=1}^{k} \{ (g^* \nabla_x W(t, x))_i + c_i(t) \} = 0
\end{aligned}
\tag{1.3}
\]
on $[0,T] \times \mathbb{R}^d$, where
\[
\mathcal{L} \equiv \frac{\partial}{\partial t} + \frac{1}{2} \sum_{i,j} a_{ij} \frac{\partial^2}{\partial x_i \partial x_j} + \sum_i b_i \frac{\partial}{\partial x_i}.
\]
The problem reduces to the classical control problem if $g = 0$, which has been studied extensively in the literature, cf. Fleming and Rishel [22], Krylov [50], Lions [55] among others. Let $A$ be the subset of $[0,T] \times \mathbb{R}^d$ on which
\[
\inf_{u \in U} (\mathcal{L}W + f)(t, x, u) = 0,
\tag{1.4}
\]
and on its complement $A^c$ one of the other inequalities in (1.3) becomes an equality. If we can show that the value function is convex and in $C^{1,2}([0,T] \times \mathbb{R}^d)$, and that the boundary $\partial A$ of the subset $A$ is smooth enough, then it can easily be verified that the optimal control exists and has the following form: if the state process starts outside of $A$, then the optimal control will make it jump to some point on the boundary $\partial A$; thereafter $v$ acts only when the state process is on $\partial A$, and pushes it back into $A$ (along the direction $-\nabla W$). The classical control variable $u$ acts when the optimal state process is inside $A$ in such a way that (1.4) holds; in other words, the optimal classical control variable $u$ acts optimally as if there were no singular control $v$ in the problem.
The optimal state process is thus a reflected diffusion in the set $A$, which is called the inaction region, and the singular optimal control is like the local time of the reflected diffusion at the boundary $\partial A$.

This kind of problem was first studied in the late 1960s by Bather and Chernoff [3], who considered the problem of controlling the motion of a spaceship on some finite time horizon with a quadratic terminal cost and a cumulative cost for the singular control, and was taken up by Borodowski, Bratus and Chernous'ko [7], [8], [10], [11]. However, not much progress had been made until the seminal work of Beneš, Shepp and Witzenhausen [4] (1980), who studied several specific singular control problems, all of which were 1-dimensional with quadratic running costs and simple state processes (e.g., $b = 0$ and $\sigma = 1$ in (1.1)). The solutions constructed were among a long list of solutions to singular stochastic control problems in which the value functions are twice continuously differentiable, even across the free-boundary $\partial A$ where the optimal singular control acts. This so-called principle of smooth fit has played an important role in constructing the solutions of 1-dimensional problems (e.g., Harrison, Sellke and Taylor [25], Harrison and Taksar [26], Harrison and Taylor [27], Jacka [37], Karatzas [40], [41], Karatzas and Shreve [48], Lehoczky and Shreve [54], Shreve, Lehoczky and Gaver [68], Sun [74], and more recently, Ma [58]). In all these results, the diffusion part of the state process is either a Brownian motion or a diffusion with linear drift and constant diffusion coefficients (i.e., $b$, $\sigma$ = constant). Moreover, the running cost function $f$ is always assumed to be convex in the state variable $x$, and it can then be shown that the value function is convex in the state variable. In this case, the subset $A_t$, i.e., the $t$-section of the set $A$, is an interval on the real line.
The optimal state process is thus a reflected Brownian motion (or a linear diffusion) on this interval, and the optimal singular control is the local time of the state at the endpoints.

It is now clear that extending the principle of smooth fit to multi-dimensional cases will face problems like the unknown smoothness of the value function and the free-boundary, and the lack of knowledge of the direction of reflection. Moreover, even if intuitively we may guess $-\nabla_x W$ to be the optimal direction of reflection (since this is the direction of least increase of the value function $W$), the gradient $\nabla_x W$ may be zero at some points on the boundary, and thus leave the direction of reflection indeterminate. These are essentially the main difficulties that prevent researchers from applying the principle of smooth fit, which has been very successful for specific 1-dimensional problems, to construct the optimal control for higher dimensional problems, and from showing the singular feature of the optimal control. However, this method is successfully used by Soner and Shreve [71] for a 2-dimensional singular control problem for the Brownian motion with infinite time horizon and discounted convex running cost rate. Their approach uses the gradient flow of the value function $W$ (on $\mathbb{R}^2$) to change to a more convenient pair of coordinates, and to obtain a more standard free-boundary problem. By using this ingenious device, they characterize the value function $W$ as the unique $C^2$-solution of the corresponding Hamilton-Jacobi-Bellman equation. They also show the free boundary $\partial A$ to be of class $C^{2,\alpha}$ for any $\alpha \in (0, 1)$. Such smoothness of $\partial A$ is essential in their construction of the optimal process, which is a 2-dimensional Brownian motion reflected along $\partial A$ in the direction $-\nabla W$, and is obtained as the unique solution of a Skorohod problem satisfying the conditions of Lions and Sznitman [57].
As the authors point out, the proof of Soner and Shreve's result makes critical use of the 2-dimensional nature of the problem, and it cannot be extended to more general problems in higher dimensions. However, by analytic methods Soner and Shreve [72] obtain similar results in higher dimensions if the singular control can be exerted only in one direction.

A 2-dimensional cheap monotone follower problem is solved in Chiarolla and Haussmann [12], [13] by geometric methods. It is shown there that the optimal control exists and is unique, and the free-boundary is $C^2$ except at one point. Again the method is specific to dimension two.

Karatzas and Shreve [48] considered the singular control of a 1-dimensional Brownian motion under a constraint on the singular control variable (a finite-fuel constraint), i.e., the total variation of the singular control variable cannot exceed a given constant. The fuel remaining constitutes a second state variable, and the value function is shown analytically to be $C^2$ jointly in both state variables. This property enables them to construct an optimal control in the following form: act optimally as if there were no constraint, or in other words, with infinite fuel, until the fuel remaining is down to zero, and then leave the state uncontrolled. Problems of this type can also be found in Beneš, Shepp and Witzenhausen [4], Jacka [37], and Bridge and Shreve [9] (a more recent work on a multi-dimensional problem).

Singular control can also be approached as the limit of either impulse controls or absolutely continuous controls. Menaldi and Rofman [64] study a so-called cheap control problem (i.e., $c(\cdot) = 0$) for an $n$-dimensional diffusion process with infinite time horizon where only impulse controls are allowed. They obtain the optimal cost as a limit of impulse control problems having a cost for each impulse.
The existence of an optimal control is proved only after restricting the problem to a particular subset of impulse controls, which suggests that an optimal control for the problem has to be sought in a much larger set of admissible controls. In fact, Menaldi and Robin [62] prove that the value function of the problem is continuous and is the same over sets of absolutely continuous controls, impulse controls, or pure jump controls as long as they are all of bounded variation on finite intervals. Moreover, in the 1-dimensional case, they prove the existence of a singular optimal control in the class of monotone controls for a nondegenerate diffusion. The value function is shown to be $C^2$ and is the maximum solution of the corresponding Hamilton-Jacobi-Bellman equation, when $f$ is assumed to be convex in the state variable, and the state process is assumed to have constant drift and diffusion coefficients.

Menaldi and Taksar [65] study a multi-dimensional problem under the same hypothesis on $b$ and $\sigma$, with a positive cost $c(\cdot)$ = constant for the singular control variable, which enters the state process additively and is of bounded variation. They approximate the value function through absolutely continuous controls by means of penalization, and prove that $W$ is a generalized solution of the corresponding Hamilton-Jacobi-Bellman equation. The existence and uniqueness of the optimal control is also established without requiring regularity of the free-boundary.

It should be pointed out that this approach is used by many authors to establish the regularity of the value function and to show that it is a generalized solution (in a certain sense) to the corresponding variational inequality, or the Hamilton-Jacobi-Bellman equation; see Chow, Menaldi and Robin [14], Williams, Chow and Menaldi [80], Zhou [81] among others.
The reader may consult Baldursson [2] for conditions under which an approximation by absolutely continuous controls is possible, and Heinricher and Mizel [34] for a counterexample. In Heinricher and Mizel's model the value function obtained by minimizing over the set of controls with bounded variation is strictly smaller than the value function corresponding to absolutely continuous controls.

It was first observed by Bather and Chernoff [3] that the 1-dimensional singular control problem has a close connection with an optimal stopping problem. In fact, it was shown that the space derivative of the value function $W$ coincides with the optimal risk of an appropriate stopping problem, whose optimal continuation region is precisely the region of inaction $A$ of the control problem. This connection between singular stochastic control and optimal stopping was developed rigorously by Karatzas [41], mostly by analytical methods based on properties of solutions to free-boundary problems and variational inequalities. Subsequently, Karatzas and Shreve [46] established the connection between the two problems by using only direct probabilistic arguments. In particular, they proved the existence of an optimal control for the problem of controlling a Brownian motion, in the setting of the monotone follower problem, and they showed that the optimal stopping time for the associated stopping problem is exactly the first time the singular control acts, in other words, the first time the singular control is positive. Karatzas and Shreve [47] obtained similar results for bounded variation follower problems and optimal stopping for a Brownian motion with absorption at the origin (see also Baldursson [2]). Other results concerning the control of a Brownian motion and its relationship to optimal stopping problems can be found in El Karoui and Karatzas [20] and Taksar [77].
It should be noticed that control processes are more easily topologized than stopping times, and therefore the proof of existence of an optimal stopping time through this connection usually requires fewer conditions than the approach used in Friedman [24] and van Moerbeke [78].

The equivalence between singular control and optimal stopping problems is also used by Chow, Menaldi and Robin [14] to determine the free boundary $\partial A$ for a problem whose 1-dimensional state process is governed by a linear stochastic differential equation, possibly degenerate, with time dependent coefficients and finite time horizon. They approach the control problem by a sequence of absolutely continuous control problems, which enables them to prove that the value function is the unique solution of the corresponding variational inequality. Then they construct the optimal control, which is, in fact, Markovian and whose input produces a reflected diffusion process as the optimal state process. To construct the reflected diffusion, they assume some regularity of the free-boundary.

Singular optimal controls have been used to study various types of storage problems, cf. Harrison and Taksar [26], Harrison and Taylor [27], and Taksar [76]. In Martins and Kushner [59] (and also Kushner and Ramachandran [52], Krichagina and Taksar [49] among others) singular control problems for a Brownian motion are used to approximate queuing systems in heavy traffic. We should especially mention the work of Davis and Norman [16] (cf. Shreve and Soner [69] for the same model with fewer hypotheses). In this work they formulate an optimal investment/consumption problem with transaction costs as a 2-dimensional singular control problem.
The inaction region (which is called the no-transaction region) is found to be a wedge, hence the results of Varadhan and Williams [79] establish the existence of the optimal control policy, which is a linear mapping of the local time of the reflected diffusion, i.e., of the optimal state process.

In this thesis we use probabilistic methods to study the general $d$-dimensional control problem with the state process satisfying (1.1) and the expected cost function (1.2). We will show that the optimal control exists under some very mild conditions. The dynamic programming principle will be established. Moreover, the value function is shown to be uniformly continuous and to be the unique viscosity solution of (1.3). We will define a continuation region for the problem and show that it has some of the features found in the specific problems. The adaptedness of the optimal control to the state will be investigated.

As we have pointed out, in the literature singular control problems usually involve only the singular control variable $v$ in (1.1) (exceptions can be found in [16], [54] and [69]). The present work, however, places both classical and singular control problems in a common framework. It includes both problems as special cases by letting $g(\cdot) = 0$ or by letting $U$ be a singleton.

An outline of this thesis is as follows:

In Chapter 2 we first recall some model problems which arise from different applications to show the particular features of optimal singular controls. After we formulate the problem, the concept of relaxed control is introduced. The problem is reformulated as an equivalent martingale problem on a canonical space, which simplifies taking limits when we apply the compactification method. The control rules defined in Section 2.4 enable us to consider the cost as a function on the collection of probabilities (i.e., control rules) on the canonical space. A topology for the canonical space is given that makes it a metrizable separable space.
In Chapter 3, we apply the compactification method to show that the optimal control for the singular control problem exists. The cost function on the canonical space is defined and is shown to be lower semicontinuous. Moreover, the set of control rules starting from initial points that are in a bounded subset of $[0,T] \times \mathbb{R}^d$ is shown to be compact, and as a result the existence of an optimal control is established. The value function is also shown to be Borel measurable.

In Chapter 4, we apply the method of conditioning and concatenation, used by Stroock and Varadhan [73] in the construction of solutions to stochastic differential equations, and by Haussmann [29], Haussmann and Lepeltier [30], and El Karoui et al. [18] in the setting of optimal (classical) control problems, to establish an abstract dynamic programming principle. We also show that there exists a family of optimal controls such that the corresponding optimal state process forms a Markov process. Assuming Lipschitz conditions on the coefficients of the state process, we show in Chapter 5 that the value function is uniformly continuous on $[0,T] \times \mathbb{R}^d$. Motivated by the work of Lions [55] on classical control problems, we characterize the value function as the unique viscosity solution to the Hamilton-Jacobi-Bellman equation (1.3).

Applying the results we obtained in previous chapters, we define in Chapter 6 the continuation region (or the inaction region), and show that there exists an optimal control in the following form: if the state starts outside the region $A$, then the singular control variable $v$ brings the state immediately to the closure $\bar{A}$ by a jump. The state stays in $\bar{A}$ thereafter, and is continuous when it is inside $A$. This result coincides with the properties of optimal solutions of the specific problems solved in the literature.

Chapter 7 is rather independent of the rest of this thesis. In this chapter, we consider the bounded variation control problem, i.e., $k = 2d$ and $g(\cdot)$
$= (I, -I)$, with $I$ the $d \times d$ unit matrix. We introduce a random time change which stretches out the time scale. Under this new time scale, the problem is transformed to a new control problem involving only classical controls. The new problem has been studied extensively, cf. Fleming and Rishel [22], Krylov [50], and especially Haussmann [29], Haussmann and Lepeltier [30], where the existence of an optimal Markovian control for the new problem is established under some mild continuity conditions on the coefficients of the state process. Applying this result and transforming the optimal control back to the original singular control problem, we show that an optimal control exists. Moreover, it is shown that there exists an optimal control in the following form: the control variable $u$ is in Markovian form, i.e., it depends only on the current state of the problem, and the increments of the singular control $v$ during any time interval depend only on the state process in that interval. This type of control will be called a control law.

This method gives an explicit way to construct the optimal control for the singular control problem when the optimal control to the new problem, which is relatively well understood, is known, e.g., by the maximum principle, dynamic programming, etc. (see Fleming and Rishel [22], Haussmann [28], Krylov [50]). However, we do not know whether the optimal state process is a reflected diffusion in some region.

Finally, we list some notation that will be used throughout this thesis:

• $\mathbb{R}^d$, $\mathbb{R}$ denote the $d$-dimensional Euclidean space and the real line respectively. $\mathbb{R}_+ = \{x \in \mathbb{R} : x \ge 0\}$, and $\mathbb{R}^k_+$ is defined similarly. For $x = (x_i)$, $y = (y_i) \in \mathbb{R}^d$, $x \cdot y = (x, y) = \sum_{i=1}^{d} x_i y_i$, and unless otherwise defined, $\|\cdot\|$ denotes the Euclidean norm.

• $T > 0$ is the fixed horizon, and $\Sigma = [0,T] \times \mathbb{R}^d$.
$U$ is a compact metric space.

• $C[0,T]$ denotes the collection of real-valued continuous functions defined on $[0,T]$, and $C^d[0,T]$ denotes the collection of $\mathbb{R}^d$-valued continuous functions defined on $[0,T]$.

• $V^d[0,T]$ denotes the collection of $\mathbb{R}^d$-valued functions defined on $[0,T]$ that are left continuous and have right limits (i.e., lcrl functions).

• $A^k[0,T]$ denotes the collection of functions $a: [0,T] \to \mathbb{R}^k$ such that $a = (a_i) \in V^k[0,T]$ and $a_i$ is nondecreasing with $a_i(0) = 0$, $i = 1, \ldots, k$.

• $S^{l \times k}$ is the space of $l \times k$ matrices with the $l \times k$-dimensional Euclidean norm.

• If $Y$ is a metric space and $\mathcal{B}(Y)$ denotes the corresponding Borel $\sigma$-field, then $f \in \mathcal{B}(Y)$ and $f \in b\mathcal{B}(Y)$ mean that $f$ is a $\mathcal{B}(Y)$-measurable and a bounded $\mathcal{B}(Y)$-measurable real-valued function respectively. We denote by $\mathbb{M}_+(Y)$, $\mathbb{M}_+^1(Y)$ the space of nonnegative Radon measures and the space of probabilities on $Y$, respectively.

• For any bounded function $\phi: Y \to \mathbb{R}$, we can extend $\phi$ to $\mathbb{M}_+^1(Y)$ by
\[
\phi(\mu) \equiv \int_Y \phi(y)\,\mu(dy)
\]
for each $\mu \in \mathbb{M}_+^1(Y)$.

• If $X$ is a random variable on a probability space $(\Omega, \mathcal{F}, P)$, the expectation of $X$ will be denoted by $E^P(X)$. $\mathcal{M}_c^2$ ($\mathcal{M}_c^{loc}$) is the family of continuous square integrable martingales (local martingales, respectively) on some given probability space $(\Omega, \mathcal{F}, P)$ with a filtration $\{\mathcal{F}_t\}$.

Chapter 2

Formulation of the Problem

2.1 Introduction

In this chapter, we first give three model problems that arise from different applications, including stochastic decision and finance, and that have been solved explicitly. These problems suggest the general formulation of the stochastic control problem, which is presented in Section 2.3 and studied in this thesis. We also prove that the control problem is equivalent to the relaxed control problem; this simplifies taking limits when applying the compactification method to show the existence of optimal controls. As a consequence, the control problem is reformulated as a martingale problem in Section 2.4.
When an appropriate canonical space is chosen, the concept of control rule is introduced. This enables us to consider the expected cost as a function defined on the space of probabilities on the canonical space. A topology on the canonical space is given in Section 2.5 to make it a separable metrizable space.

2.2 Model problems

We will present in this section some model problems which have been solved explicitly in the literature. The solutions to these problems provide us with an intuitive idea about the basic features of singular control problems.

Example 1. (Karatzas [40]) Monotone follower in stochastic decision.

The problem is to optimally track a 1-dimensional Brownian motion $B_t$ by a nondecreasing process $v_t$, adapted to the past of $B$ on a probability space $(\Omega, \mathcal{F}, P)$. The state is defined as
\[
x_t = x + B_t - v_t, \quad 0 \le t \le T,
\]
and the aim is to minimize the expected cost
\[
J(v) = E\Big\{ \int_0^T f(x_t)\,dt \Big\}.
\]
This decision problem can be reduced, through formal dynamic programming, to a free-boundary problem in partial differential equations.

Let the value function be defined as
\[
W(t, x) \equiv \inf_v E_x\Big\{ \int_0^t f(x_\theta)\,d\theta \Big\},
\]
where $t$ is the time-to-go. Then the formal Hamilton-Jacobi-Bellman equation is a variational inequality:
\[
\max\Big\{ W_t - \tfrac{1}{2} W_{xx} - f(x),\; W_x \Big\} = 0, \quad \text{for } (t, x) \in \Sigma,
\]
and $W(0, x) = 0$ for $x \in \mathbb{R}$. It turns out that under the assumption
\[
k x^m \le f''(x) \le K x^m, \quad \forall x \in \mathbb{R},
\]
for some constants $0 < k < K$ and integer $m \ge 0$, there is a classical solution $W \in C^{1,2}$ and a moving free-boundary, which is a $C^1$-curve on $[0,T]$ and separates the do-nothing, or inactive, region from the active region in $\Sigma$. The optimal control $v$ has the following form: if the initial state $x$ is outside the inactive region, apply a jump to bring the state immediately to the free-boundary, and apply control thereafter only when the state hits the free-boundary to keep it inside the inactive region.
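
The "jump, then reflect" structure just described can be sketched numerically. The Python fragment below is an illustration under simplifying assumptions that are not in the source: the free-boundary is replaced by a constant level `barrier`, the inactive region is taken to be the half-line below it, and the Brownian path is sampled on a uniform grid. The names `brownian_path` and `reflect_below` are hypothetical. The control is computed as the running maximum of the overshoot of the uncontrolled path $x + B_t$ above the barrier, which is the smallest nondecreasing process keeping $x + B_t - v_t$ at or below the barrier.

```python
import math
import random


def brownian_path(x0, T=1.0, n_steps=1000, seed=0):
    """Sample the uncontrolled path x0 + B_t on a uniform grid (illustrative)."""
    rng = random.Random(seed)
    dt = T / n_steps
    path = [x0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def reflect_below(path, barrier):
    """Return (v, state) with state = path - v kept at or below `barrier`.

    v is the running maximum of max(path - barrier, 0): it is nondecreasing,
    its first entry records the initial jump (if any), and it grows only at
    grid times when the controlled state sits on the barrier.
    """
    v, state = [], []
    running_max = 0.0
    for y in path:
        running_max = max(running_max, y - barrier)
        v.append(running_max)
        state.append(y - running_max)
    return v, state
```

Starting from `x0 = 1.5` with `barrier = 1.0`, the first entry of `v` equals 0.5, the discrete counterpart of the initial jump to the free-boundary; afterwards `v` increases only at grid times where the state is at the barrier, which is the discrete analogue of a local time.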
The optimal control $v$ (after time 0) is thus the local time of the optimal state, which is a reflected Brownian motion, at the free-boundary, and is thus singularly continuous with respect to Lebesgue measure.

Example 2. (Chow, Menaldi and Robin [14]) Additive control for a linear system.

Consider the 1-dimensional linear stochastic differential equation, with the control $v$, which is a process of bounded variation, entering additively into the system,
\[
\begin{cases}
dx_t = (a(t) x_t + b(t))\,dt + \sigma(t)\,dB_t + dv_t, & t > s, \\
x_s = x.
\end{cases}
\]
The expected cost takes the form
\[
J_{s,x}(v) \equiv E_x\Big\{ \int_s^T f(\theta, x_\theta)\,d\theta + \int_{[s,T)} c(\theta)\,d|v|_\theta \Big\}
\]
where $|v|_\cdot$ denotes the total variation function of $v_\cdot$.

Under certain assumptions, especially the convexity of $f(t, \cdot)$, the value function $W(s, x)$ is characterized as the unique generalized solution of the variational inequality
\[
\min\{ LW + f(s, x),\; -|W_x| + c(s) \} = 0, \quad \text{on } \Sigma,
\]
where $LW = W_s + \tfrac{1}{2}\sigma^2(s) W_{xx} + (a(s) x + b(s)) W_x$. The moving-boundary consists of two branches $x^+(s)$ and $x^-(s)$, which are given by
\[
x^-(s) = \inf\{x : W_x(s, x) + c(s) > 0\}, \qquad
x^+(s) = \sup\{x : W_x(s, x) - c(s) < 0\}.
\]
Moreover, if it is assumed that $x^\pm(\cdot)$ are finite and continuous on $[0,T]$, then the optimal state process is a reflected diffusion in the moving interval $(x^-(s), x^+(s))$, except possibly for an initial jump.

Example 3. (Davis and Norman [16], Shreve and Soner [69]) Investment and consumption model with transaction costs.

Davis and Norman [16] (see also Shreve and Soner [69] for the same problem under more relaxed conditions) consider an optimal investment/consumption model in which a single agent consumes and distributes his wealth in two assets: a bond and a stock, with transaction charges equal to a fixed percentage of the amount transacted. The agent's portfolio $(s_0(t), s_1(t))$, where $s_0(t)$, $s_1(t)$ stand for the amount of bond and stock respectively, evolves according to
\[
\begin{aligned}
ds_0(t) &= (r s_0(t) - u(t))\,dt - (1 + \lambda)\,dL_t + (1 - \mu)\,dU_t, \\
ds_1(t) &= \alpha s_1(t)\,dt + \sigma s_1(t)\,dW_t + dL_t - dU_t, \\
s_0(0) &= x, \quad s_1(0) = y,
\end{aligned}
\]
where $0 < \lambda, \mu < 1$ and $\sigma, \alpha > 0$ are constants, $x + (1 - \mu) y > 0$ and $x + (1 + \lambda) y \ge 0$, $u(t) \ge 0$, $t \ge 0$. Here $(u, L, U)$ is called a control policy, with consumption rate $u$, cumulative stock acquisition $L$, and cumulative stock sales $U$. $L$, $U$ are nondecreasing processes.

The problem is to choose $(u, L, U)$ to maximize the utility function
\[
J(u, L, U) \equiv E \int_0^\infty e^{-\delta t} f(u(t))\,dt,
\]
subject to
\[
(s_0(t), s_1(t)) \in S \equiv \{(x, y) \in \mathbb{R}^2 : x + (1 - \mu) y > 0,\; x + (1 + \lambda) y > 0\}, \quad 0 < t < \infty,
\]
where $\delta > 0$ is a constant, and $f$ is assumed to be in the form $f(u) = u^\gamma / \gamma$ ($\gamma < 1$, $\gamma \ne 0$) or $f(u) = \log u$. The problem is reduced to the corresponding free-boundary problem, which can be shown to have a unique solution, and the optimal state process is found explicitly as follows: if the state starts from $(x, y) \in NT$ ($NT$, a wedge that can be found explicitly, is called the non-transaction region), do not use $L$, $U$ and solve the problem as in the classical control problem (the control variable is $u(t)$); when the state reaches the boundary $\partial NT$, apply $L$, $U$ as much as necessary to prevent it from leaving $NT$. When the state starts from $S \setminus NT$, use $L$, $U$ immediately to make the state jump to the boundary of $NT$ along appropriate directions. Hence the optimal process is a diffusion reflected in a wedge (after time 0), and the optimal singular controls $L$ and $U$ are the local times of the optimal state process at the lower and upper boundary of the wedge $NT$.

2.3 Statement of the problem

We consider in this thesis the following optimal control problem in which we allow both classical control and singular control to act at the same time. The dynamics are in the form
\[
x_t = x + \int_s^t b(\theta, x_\theta, u_\theta)\,d\theta + \int_s^t \sigma(\theta, x_\theta, u_\theta)\,dB_\theta + \int_{[s,t)} g(\theta)\,dv_\theta, \quad \text{a.s.},
\tag{2.1}
\]
for $(s, x) \in \Sigma$, $s \le t \le T$, where
• $u_\theta \in U$, $s \le \theta \le T$, and $U$, called the control set, is a compact metric space;
• $(\sigma, b) : \Sigma \times U \to \mathbb{R}^{d \times d} \times \mathbb{R}^d$, $g : [0, T] \to \mathbb{R}^{d \times k}$ ($k > 0$ is a fixed integer); $\sigma(t, x, u)$, $b(t, x, u)$ are measurable, bounded and continuous with respect to $(x, u)$; $g(t)$ is continuous on $[0, T]$;
• $(B_t, 0 \le t \le T)$ is a $d$-dimensional Brownian motion on some probability space;
• $v \in \mathcal{A}^k[0, T]$.

We introduce the concept of controls for the stochastic differential equation (2.1).

Definition 2.1 A control is a term $\alpha = (\Omega, \mathcal{F}, \mathcal{F}_t, P, B_t, x_t, u_t, v_t, s, x)$ such that
(C1) $(s, x) \in \Sigma$;
(C2) $(\Omega, \mathcal{F}, P)$ is a probability space with the filtration $\{\mathcal{F}_t\}_{t \ge 0}$;
(C3) $u_t$ is a $U$-valued process, progressively measurable with respect to $\{\mathcal{F}_t\}_{t \ge 0}$;
(C4) $v_t$ is an $\mathbb{R}^k$-valued process, progressively measurable with respect to $\{\mathcal{F}_t\}_{t \ge 0}$. The sample paths of $v$ are in $\mathcal{A}^k[0, T]$, i.e., for each $\omega \in \Omega$, $v_\cdot(\omega) \in \mathcal{A}^k[0, T]$;
(C5) $B_t$ is a standard $d$-dimensional Brownian motion on $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ and $x_t$, the state process, is $\mathcal{F}_t$-adapted with sample paths in $\mathcal{D}^d[0, T]$, and such that (2.1) is satisfied. We assume that $x_r = x$ for $0 \le r \le s$.

We call $(s, x)$ the initial condition of the control $\alpha$.

The collection of controls with initial condition $(s, x)$ is denoted by $\mathcal{A}_{s,x}$. It is well known from the theory of stochastic differential equations that, under the above conditions, the set $\mathcal{A}_{s,x}$ is nonempty for each fixed $(s, x)$ (e.g., take $u$ and $v$ to be constant). The cost corresponding to the control $\alpha$ is defined to be

$$J(\alpha) \equiv E\Big\{\int_s^T f(t, x_t, u_t)\,dt + \int_{[s,T)} c(t) \cdot dv_t\Big\}, \tag{2.2}$$

where
• $f : \Sigma \times U \to \mathbb{R}$ is a measurable function and is lower semicontinuous in $(x, u)$, satisfying
$$-K \le f(t, x, u) \le C(1 + \|x\|^m), \quad (t, x, u) \in \Sigma \times U,$$
for some constants $m \ge 0$, $K \ge 0$ and $C \ge 0$;
• $c = (c_i) : [0, T] \to \mathbb{R}^k$ is lower semicontinuous and $c_i > 0$, $1 \le i \le k$.

Throughout this work we write

$$\int_{[s,t)} k(\theta) \cdot da(\theta) = \sum_{i=1}^{k} \int_{[s,t)} k_i(\theta)\,da_i(\theta)$$

for any $\mathbb{R}^k$-valued Borel measurable function $k = (k_i)$ and $a = (a_i) \in \mathcal{A}^k[0, T]$.
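The half-open interval in this notation matters when the integrator jumps. The following Python sketch is an illustration only, not part of the thesis: it assumes a pure-jump, left-continuous path whose components are stored as lists of (jump time, jump size) atoms, and the names `Atoms` and `stieltjes_sum` are hypothetical. It evaluates the componentwise sum above and shows that a jump at the left endpoint $s$ is counted while a jump at the right endpoint $t$ is not.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical representation (an assumption for illustration): each component a_i of a
# left-continuous, nondecreasing pure-jump path is stored as the atoms of the measure da_i.
Atoms = Sequence[Tuple[float, float]]  # (jump_time, jump_size)


def stieltjes_sum(k: Sequence[Callable[[float], float]],
                  a: Sequence[Atoms], s: float, t: float) -> float:
    """Evaluate sum_i int_{[s,t)} k_i(theta) da_i(theta) for a pure-jump path a."""
    total = 0.0
    for k_i, atoms_i in zip(k, a):
        for time, size in atoms_i:
            if s <= time < t:  # half-open: an atom at s counts, an atom at t does not
                total += k_i(time) * size
    return total


# Two components: k_1 = 1, k_2(theta) = theta.
k_funcs = [lambda theta: 1.0, lambda theta: theta]
path = [[(0.0, 1.0), (0.5, 2.0)],   # atoms of da_1
        [(0.5, 4.0), (1.0, 3.0)]]   # atoms of da_2 (the atom at 1.0 lies outside [0, 1))

full = stieltjes_sum(k_funcs, path, 0.0, 1.0)
tail = stieltjes_sum(k_funcs, path, 0.5, 1.0)
```

With these toy atoms, `full` sums the contributions at times 0 and 0.5 only, and `tail` keeps the atoms at 0.5 because the interval is closed on the left.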
For $v \in \mathcal{A}^k[0, T]$, define

$$G_t(v) = \begin{cases} \displaystyle\int_{[s,t)} g(\theta)\,dv(\theta), & s < t \le T, \\ 0, & 0 \le t \le s. \end{cases} \tag{2.3}$$

It can be verified easily that $G_\cdot(v) \in \mathcal{D}^d[0, T]$.

The value function of the problem is defined, for $(s, x) \in \Sigma$, by

$$W(s, x) = \inf_{\alpha \in \mathcal{A}_{s,x}} J(\alpha). \tag{2.4}$$

A control $\alpha \in \mathcal{A}_{s,x}$ is called an optimal control if $W(s, x) = J(\alpha)$.

Remark 2.2 This model includes the classical control problem as a special case, i.e., $g(\cdot) = 0$. When $k = 2d$, $g(\cdot) = (I, -I)$, with $I$ the $d \times d$ unit matrix, it is called the bounded variation control problem. When $k = d$, $g(\cdot) = I$, it is often called a monotone follower problem. □

Remark 2.3 In our model, the function $g(\cdot)$ is fixed. A different formulation of the problem can be found in Fleming and Soner [23], Chapter 8. For a comparison, see the discussion in Section 7.5.

Remark 2.4 Note that there is no terminal cost in our model. Due to the way we choose the topology on the canonical space, this method cannot treat the case with terminal cost. We will return to this point at the end of Chapter 3. □

Remark 2.5 We take the singular control variable, and thus the state process, to be left continuous. It seems that this is a natural choice for singular control problems because the dynamic programming principle (Theorem 4.10) takes a simple form. The reader may compare Theorem 4.10 and Menaldi and Robin [63], Theorem 1.3, which gives the dynamic programming principle for a one-dimensional problem with right continuous singular control variable. □

In order to apply the compactification method, we now reformulate the problem. Since the Brownian motion in the definition of controls is unknown in advance, we can reformulate the control problem as an equivalent martingale problem. This simplifies taking limits. In fact, let

$$\mathcal{L} \equiv \frac{1}{2}\sum_{i,j=1}^{d} a_{ij}\frac{\partial^2}{\partial x_i \partial x_j} + \sum_{i=1}^{d} b_i \frac{\partial}{\partial x_i},$$

where $a = \sigma\sigma^*$.
Then we can show that $\alpha \in \mathcal{A}_{s,x}$ if and only if $\alpha$ satisfies (C1), (C2), (C3), (C4), and

(C5') $x_t$ is an $\mathcal{F}_t$-adapted process with sample paths in $\mathcal{D}^d[0, T]$ such that
• $P(x_r = x,\ u_r = u^0,\ v_r = 0,\ 0 \le r \le s) = 1$ for some arbitrary but fixed $u^0 \in U$,
• $\forall \phi \in C_b^2(\mathbb{R}^d)$, $M_t\phi$ ($0 \le t \le T$) is in $\mathcal{M}_2^c$, i.e., $M_t\phi$ is a continuous square integrable martingale on the filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ ($t \ge s$), where

$$M_t\phi(\omega) \equiv \phi(x_t) - \int_s^t \mathcal{L}\phi(\theta, x_\theta, u_\theta)\,d\theta - \int_s^t \nabla\phi(x_\theta) \cdot g(\theta)\,dv_\theta - \sum_{s \le \theta < t} \big[\phi(x_{\theta+}) - \phi(x_\theta) - \nabla\phi(x_\theta) \cdot \Delta x_\theta\big]. \tag{2.5}$$

Therefore we can delete the term $B$ from the notation of a control. The proof of the equivalence of the existence of weak solutions to a stochastic differential equation and the existence of solutions to the corresponding martingale problem, given in Proposition IV-2.1 of Ikeda and Watanabe [35], also works here despite the extra term $G_\cdot(v)$.

Next we introduce the concept of relaxed controls, which gives a more suitable topological structure when applying the compactification method. In a relaxed control problem, the $U$-valued process $\{u_t\}$ is replaced by an $\mathbb{M}_1(U)$-valued process $\{\mu_t\}$, where $\mathbb{M}_1(U)$ is the space of probability measures on $U$ endowed with the topology of weak convergence. $\mathbb{M}_1(U)$ is also a compact metrizable space. If $\phi : U \to \mathbb{R}$ is a bounded measurable function, then we extend $\phi$ to $\mathbb{M}_1(U)$ by letting

$$\phi(\mu) = \int_U \phi(u)\,\mu(du).$$

Definition 2.6 $\alpha = (\Omega, \mathcal{F}, \mathcal{F}_t, P, x_t, \mu_t, v_t, s, x)$ is called a relaxed control if it satisfies the conditions (C1), (C2), (C4), (C5'), and
(C3') $\mu_t$ is $\mathbb{M}_1(U)$-valued, progressively measurable with respect to $\{\mathcal{F}_t\}_{t \ge 0}$.

Remark 2.7 Note that we never work with $\sigma(t, x, \mu_t)$. Instead, we work with $a(t, x, \mu_t)$ through the above martingale formulation. It is not true in general that $\sigma(t, x, \mu)\sigma(t, x, \mu)^* = a(t, x, \mu)$, cf. El Karoui et al [18]. □

The collection of relaxed controls starting from time $s$ with the initial state $x$ is denoted by $\bar{\mathcal{A}}_{s,x}$. Note that $\mathcal{A}_{s,x}$ can be imbedded into $\bar{\mathcal{A}}_{s,x}$ by letting $\mu_t(du) \equiv \delta_{u_t}(du)$. Here $\delta_u$ denotes the Dirac measure at the point $u$.
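Hence $\forall \alpha \in \mathcal{A}_{s,x}$ there exists an $\bar\alpha \in \bar{\mathcal{A}}_{s,x}$ such that

$$J(\bar\alpha) = J(\alpha),$$

and therefore

$$\inf_{\bar\alpha \in \bar{\mathcal{A}}_{s,x}} J(\bar\alpha) \le \inf_{\alpha \in \mathcal{A}_{s,x}} J(\alpha). \tag{2.6}$$

In order to get the reverse inequality, we define for each $(t, x) \in \Sigma$ a subset of $S_d \times \mathbb{R}^d \times \mathbb{R}$:

$$K(t, x) \equiv \{(a(t, x, u), b(t, x, u), z) : z \ge f(t, x, u),\ u \in U\}. \tag{2.7}$$

Proposition 2.8 Assume that $K(t, x)$ is convex for each $(t, x) \in \Sigma$. Then $\forall \bar\alpha \in \bar{\mathcal{A}}_{s,x}$, there exists an $\alpha \in \mathcal{A}_{s,x}$ such that

$$J(\alpha) \le J(\bar\alpha).$$

Proof. We first show that $K(t, x)$ is closed for each $(t, x) \in \Sigma$. For $u_n \in U$, $z_n \ge f(t, x, u_n)$, assume

$$(a(t, x, u_n), b(t, x, u_n), z_n) \to (a, b, z)$$

as $n \to \infty$. Since $U$ is compact, there exists $u^0 \in U$ such that (possibly extracting a subsequence) $u_n \to u^0$. Note that $a(t, x, \cdot)$, $b(t, x, \cdot)$ are continuous on $U$, therefore

$$(a(t, x, u_n), b(t, x, u_n)) \to (a(t, x, u^0), b(t, x, u^0))$$

as $n \to \infty$. From the assumption that $f(t, x, \cdot)$ is lower semicontinuous on $U$, we have

$$z = \lim_{n \to \infty} z_n \ge \liminf_{n \to \infty} f(t, x, u_n) \ge f(t, x, u^0).$$

Therefore $(a, b, z) \in K(t, x)$, i.e., $K(t, x)$ is closed.

Let

$$\bar\alpha = (\Omega, \mathcal{F}, \mathcal{F}_t, P, x_t, \mu_t, v_t, s, x)$$

and define $l : [0, T] \times \Omega \to S_d \times \mathbb{R}^d \times \mathbb{R}$ by

$$l(t, \omega) = \big(a(t, x_t(\omega), \mu_t(\omega)),\ b(t, x_t(\omega), \mu_t(\omega)),\ f(t, x_t(\omega), \mu_t(\omega))\big) = \int_U (a, b, f)(t, x_t(\omega), u)\,\mu_t(\omega, du). \tag{2.8}$$

Since $K(t, x)$ is closed and convex, it follows that $l(t, \omega) \in K(t, x_t(\omega))$ for all $(t, \omega)$. It can be easily verified that $l$ is progressively measurable with respect to $\mathcal{F}_t$. By a measurable selection theorem (see Theorem A.9 of Haussmann and Lepeltier [30]) there exist progressively measurable processes $u_t$ and $d_t$, which are $U$- and $\mathbb{R}_+$-valued respectively, such that for each $(t, \omega)$,

$$l(t, \omega) = \big(a(t, x_t(\omega), u_t(\omega)),\ b(t, x_t(\omega), u_t(\omega)),\ f(t, x_t(\omega), u_t(\omega)) + d_t(\omega)\big).$$

Now we define

$$\alpha = (\Omega, \mathcal{F}, \mathcal{F}_t, P, x_t, u_t, v_t, s, x).$$

By definition we know that $\alpha \in \mathcal{A}_{s,x}$ (use (C5') rather than (C5)). Moreover,

$$\begin{aligned} J(\bar\alpha) &= E\Big\{\int_s^T f(t, x_t, \mu_t)\,dt + \int_{[s,T)} c(t) \cdot dv_t\Big\} \\ &= E\Big\{\int_s^T f(t, x_t, u_t)\,dt + \int_{[s,T)} c(t) \cdot dv_t\Big\} + E\int_s^T d_t\,dt \\ &\ge E\Big\{\int_s^T f(t, x_t, u_t)\,dt + \int_{[s,T)} c(t) \cdot dv_t\Big\} = J(\alpha). \end{aligned}$$

The proof is thus complete.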
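The selection step in the proof can be seen in a one-dimensional toy case. The Python sketch below is an illustration under assumed data, not taken from the thesis: $U = [-1, 1]$, $a(u) = 1$, $b(u) = u$, $f(u) = u^2$, so that $K = \{(1, u, z) : z \ge u^2,\ u \in U\}$ is convex. The helper names `averaged` and `ordinary_control` are hypothetical. Averaging $(a, b, f)$ against a relaxed control gives a point of $K$, and an ordinary control value $u$ with slack $d \ge 0$ reproduces it, as in the decomposition of $l(t, \omega)$ above.

```python
# Toy data (assumptions for illustration only): U = [-1, 1], a(u) = 1, b(u) = u, f(u) = u**2.
def a(u):
    return 1.0


def b(u):
    return u


def f(u):
    return u * u


def averaged(mu):
    """Average (a, b, f) against a finitely supported relaxed control mu = [(u, weight), ...]."""
    return tuple(sum(w * h(u) for u, w in mu) for h in (a, b, f))


def ordinary_control(l):
    """Selection step: return (u, d) with (a(u), b(u), f(u) + d) == l and d >= 0."""
    _, b_bar, z_bar = l
    u = b_bar            # b(u) = u, so the averaged drift is matched by u = b_bar
    d = z_bar - f(u)     # slack in the cost coordinate
    # Both conditions hold here because K is convex (Jensen's inequality for f).
    assert -1.0 <= u <= 1.0 and d >= 0.0
    return u, d


mu = [(-1.0, 0.5), (1.0, 0.5)]   # relaxed control mixing u = -1 and u = +1 equally
l = averaged(mu)
u, d = ordinary_control(l)
```

In this toy case the relaxed control has the same drift as the ordinary control $u = 0$ but a running cost larger by $d$, which is the inequality $J(\alpha) \le J(\bar\alpha)$ in miniature.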
UBy (2.6) and Proposition 2.8 we know that when K(t, x) is convex,inf J(d) inf J(a).cxeA8,Moreover, if the infimum over relaxed controls is attained, then so is the infimum over ordinarycontrols. Hence we can restrict our attention to the case of relaxed controls if K(t, x) is convex.If the convexity fails, our existence results pertain only to relaxed controls.2.4 Control rulesNow we choose a canonical space to simplify the arguments in the compactification method.DefineU E {p: [0, Tj i—* IA/Ii(U) is Borel measurable}.Consider the canonical spaceXE vd[o,Tj x U x A”j0,T].All the above spaces are equipped with appropriate topologies which we will discuss later. Let73, U, A denote their Borel u-fields, and,U, A the u-fields up to time 1, i.e., the Borel a-fields generated by all the functions in 73, U, A which are constants on [t, TJ respectively. Theirprecise definition will be given in the next section. LetXE V xU x A,E x x A.Definition 2.9 A control rule is a probability R on the measurable space (X, X) such that& =is a relaxed control, whereXtQQ) = Xt, i-ztQo) = [tj, Vt = Vt(W)for w = (x.,z., V.) C X, i.e.,Chapter 2. Formulation of the Problem 211. (X, X, R) is a probability space with th€ filtration {}, (s, x) E andR(r — , / = 6, v,. = 0, 0 < r s) = 1;2. Vçb e C(1Rd), (s t T) is in M on the filtered probability space (X, X, X, R)(s < t T), wheret tMth(w)— f (9, x, te)d9— j V(xe) . g(O)dv0- [(xe+)-e)-Vth(x0).txe]. (2.9)s<O<tLet J(s, R) E J(&).We denote by the space of control rules such that the above relaxed control starts fromtime .s with initial state x. We will suppress s in J(.s, R) when it is clear that R E R.. Nowthe control problem can be described completely in terms of control rules.Proposition 2.10 Let a = (Q,F,Ft, F, Xt,[tt, Vt, s, x) be a relaxed control; then there exists acontrol rule R E such thatJ(R) = J(a).Proof. 
The proof of this result is standard: define a map P : X byEThis map is measurable, and F’’’ C ‘(X) C F, where F’’ is the ifitration generatedby x, , v. We can show thata=(Q, F, _1(), F, Xt, JUt, Vt, .5, x)is also a relaxed control and such thatJ(a) = J(a),Chapter 2. Form ulation of the Problem 22see Haussmann and Lepeltier [30], Theorem 3.13. Let 1? = F a which is a probabilitymeasure on (X,.It is easily seen that(X, A?, ;, R, Xt, l-’t, vj, s, x)is a relaxed control satisfying the requirements in the Proposition. U2.5 The topology on the canonical spaceIn this section, we define the topologies on the spaces Vd[0,T], U and Ak[0,T].The space Vd[O, T]. We first give a topology on Vd[0, oo), the collection of lcrl functions(i.e., left continuous and having right limits) on-+ taking values in Then we can takeVd[0,T] as a subset of V1[0,oo) by extending each xc V’[0,T] to an x’ c V9O,co) throughx(t), 0t<T,x(T), tT,and consider the induced topology on Vd[0,T].Define a measure ?(.) on the Borel subsets of 18+ by A(dt) = etdt. For a Borel measurablefunction f 18 ‘-÷ 18d, the image of the measure A(.) under the mapping t i.—+ (t, f(t)) is acalled the pseudo-path of the function f. It is a probability law on [0, co] x 18d and is denotedby W(f). Note that JR° [—00, +oo]”. It is clear that 4! identifies two functions if and only ifthey are equal almost everywhere in the Lebesgue sense, and in particular, 4! is one-to-one onVd[O, oc). Thus it provides us with an imbedding of V90, oc) into the compact Polish space Pof all the probabilities on [0, oo] x 1& (with the topology of weak convergence). The topologythat P induces on Vc[O, oc) via the mapping 4! is called the pseudo-path topology, and makesVd[O, oc) a separable metrizable space. The associated Borel u-algebra on V°[O, oc) is the sameas we get from the Skorohod topology. In fact, Lemma 1 in [66] tells us that convergence inthe pseudo-path topology is just convergence in measure. 
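To see how weak convergence in measure is compared with uniform convergence, consider a spike of height one on a shrinking interval. The Python sketch below is an illustration only; the family $x_n$ and all function names are assumptions, not objects from the thesis. It uses the measure $\lambda(dt) = e^{-t}dt$ from the text and shows that the $\lambda$-mass of the set where $x_n$ differs from the zero path shrinks with $n$, while the uniform distance to the zero path stays equal to one.

```python
import math


def lam(a: float, b: float) -> float:
    """lambda((a, b]) for the measure lambda(dt) = exp(-t) dt."""
    return math.exp(-a) - math.exp(-b)


def x_n(n: int, t: float) -> float:
    """Left-continuous spike: the indicator of the interval (1, 1 + 1/n]."""
    return 1.0 if 1.0 < t <= 1.0 + 1.0 / n else 0.0


def measure_of_deviation(n: int) -> float:
    """lambda{t : |x_n(t) - 0| > 1/2}, i.e. the lambda-mass of the spike."""
    return lam(1.0, 1.0 + 1.0 / n)


def sup_distance(n: int) -> float:
    """sup_t |x_n(t) - 0|, evaluated on a few sample points that hit the spike."""
    samples = [0.0, 1.0, 1.0 + 0.5 / n, 1.0 + 1.0 / n, 2.0]
    return max(abs(x_n(n, t)) for t in samples)


masses = [measure_of_deviation(n) for n in (10, 100, 1000)]
sups = [sup_distance(n) for n in (10, 100, 1000)]
```

The list `masses` decreases toward zero, so $x_n \to 0$ in $\lambda$-measure, whereas every entry of `sups` equals one.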
Let V denote the Borel u-field onV![O, co), and i5 be the Borel u-field up to time t, i.e., V1 is the c-field generated by all theChapter 2. Formulation of the Problem 23functions in V[0, cc) that are constant after t. Then we know that15t=a{x(O): 09t}, 0tT.Now we introduce some notation. For x e Vd[0, cc), define x = supt IIx(t)II. For u = (ui),v = (vi) e ifid, u < v means < v (1 i d). Let N(x) = ZNuth)t(xi), where N’t(x1)denotes the number of upcrossings of x(.) on [0, cc) between the levels u and v. Then a subsetA C Vd[O, cc) such thatsup x’ < cc, sup N’(x) < cc (2.10)zEA xEAfor any u < v is relatively compact in Vd[O, cc) with the pseudo-path topology. For details, seeMeyer and Zheng [66].Assume (12, .1, F) is a probability space with filtration {F}g>o and X is a lcrl adaptedprocess, with EIIXtII < cc for each t < cc. By the pseudo-law of the process X we mean theimage law of F under the mapping w i—* ‘I’(X.(w)). Let r be a finite subdivision: 0 = to <t1 <= cc (define X = 0), define the conditional variationsVarT(X) = — XjJFt]II,i<nVar(X) = sup 14(X).if Var(X) < cc, then X is said to be a quasimartingale. For a martingale X, Var(X) =supt EIIXtII. For a quasimartingale X, it can be shown that for u = (ui), v = (v1) E 111d, u < vcF(X* c) Var(X), ENUV(X)du1 + Var(X)We state the main results about the pseudo-path topology in the following theorem.Theorem 2.11 (a). Let {F} be a sequence of probability laws on the Borel subsets of Vd[O, cc)such that under each F the coordinate process X is a quasimartingale with conditional variationVar(X) uniformly bounded in n. Then there exists a subsequence flk which converges weaklyon V”{0, cc) to a law F, and X is a quasimartingale under F.Chapter 2. Formulation of the Problem 24(b). Let (xi’), (Xe) be measurable processes on the probability space (12, F, F) such that thepseudo-law of X” converges to that of X. 
Then there exists a subsequence (Xr) and a set Iof full Lebesgue measure, such that the finite dimensional distributions of (Xi”Qi converge tothose of (X)€i. Let f be a bounded continuous function on (lR’, then the function(ti,t2,. ..,tk) i: E [f(x,x,. . .,X)]converges in measure to the corresponding function relative to (Xe) as n —÷ 00.Proof. See Meyer and Zheng [66]. 0Remark 2.12 The corresponding results on Vd[O, T] can be stated similarly. 0Remark 2.13 As pointed out by Meyer and Zheng [66], Vd[O, co) with the pseudo-path topology is not a Polish space. But from the definition we know that it is homeomorphic to asubspace of the Polish space 2, and hence is a separable metric space. ElThe Space ii. LI is the space of measurable transformations p [0, T] ‘-÷ ]Jl/11(U) endowedwith the stable topology, which is defined as follows. Define for A C B([0, T]), B C 13(U)(A x B) E Ldt.Then j can be extended uniquely to an element in ]M+([0, T] x U), the space of nonnegativeRadon measures on [0, T] x U. The stable topology on U is the weakest topology which renderscontinuous the mappingspT pJ J qS(t, u)/i(dt, du)0 Ufor all bounded measurable functions which are continuous in u.Under this topology, we know that M+([0,T] x U) is a compact separable metrizable space.U is also endowed with its Borel cr-field U, which is the smallest a-field such that the mappingsTF- f pt(du)f(t,u)dtJo JUChapter 2. Formulation of the Problem 25are measurable, where f is a bounded measurable function continuous with respect to thevariable u. The filtration 7-4 is the c-field generated by {1[ot]fL, t e M}. From the definitionof the stable topology, we know that 14 is generated by the sets of the form{:J08MdOEB}with .s t and B a Borel set in JM(U).For more details, see Haussmann and Lepeltier [30].The Space Ak[0, T]. Let V be the collection of functions v: [0, T] ‘—÷ 18 such that each v(’)is of bounded variation and left continuous. We assume v(0) = 0. 
We first consider a topologyon V.Let JiW[0, T] be the collection of signed Radon measures on [0, T]. Then there is a one-to-onecorrespondence between V and JM[0, T]): for a E V, letva([.5,t))Ea(t)a(s), V 0stT.So we need only consider a topology on M[0,T], and can then get the induced topology on V.Denote by C[0, T] the collection of real valued continuous functions on [0, T]. It is wellknown that IAJ[0,T], and therefore V, is the dual space of C[0,T] with the supremum norm:forf C C[0,T],lUll = sup lf(t),O<iXTand the corresponding weak*topology on M[0, T] is the topology induced by the weak convergence of measures. It can be easily seen that the measures of the form zc ab1, with Nfinite, a, x rational, comprise a countable dense subset of 11V1[0,T]; therefore M[0,T], withthe weak-topology, is separable.Let A° = {a C V : a is nondecreasing}. Then it is a closed subset of V under the weakconvergence topology, and the corresponding closed subset in IM[0, T] is IM+[0, T]. We considerthe induced topology on M+[0, T].Chapter 2. Formulation of the Problem 26Let {q5k,k 1} be a dense subset of C[O,T], and 4o = 1. For Ajt E 1M[O,T], defined(A,=fo kdA loT kd (2.11)It is easy to verify that d defines a metric on IM+[O,T], and hence a metric on A°.Theorem 2.14 For A, A E M+[O, T], d(A, A) ,‘ 0 if and only if A ‘ A in M[0, T], i.e.,çTI f(t)dA,(t)— I f(t)dA(t), (2.12)Jo Jofor each f e C[0,T]. Therefore IJIJ+[0,T] is metrizable.Proof. The sufficiency is obvious. To show it is also necessary, take f E C[0, T], f 0. Notethat ç’o = 1, so there exist a constant C > 0 such that A([0,T]),A([0,T]) C. For a givene > 0, since {k} is dense in G[0,T], we can find 4 such thatII!—E. (2.13)Now if d(A, A) ‘, 0, then there exists N such that when n > N,1 IOTIdATh_Id<dA A <IIiII - ‘ -Therefore when n N/ q51dA,,— / jdA e,Jo JOandpTJ fdAn_J fdA < J (f_çb1)dA0 0 0+ fT1dA - fT1dA + f(f -CIIf— ‘iI + e + CIIf— iIl(2C--1).Therefore (2.12) holds for f. 
Now we can conclude that under the weak convergence topology $\mathcal{A}^0$ is a separable metric space. We state the following theorem which will be used later.

Theorem 2.15 For any constant $C > 0$,

$$\{\lambda \in \mathbb{M}_+[0, T] : \lambda([0, T]) \le C\} \tag{2.14}$$

is a compact subset of $\mathbb{M}_+[0, T]$.

Proof. Note that the set (2.14) is a closed subset in $\mathbb{M}[0, T]$ in both the (variation) norm topology and the weak* topology. Also notice that if $\lambda \in \mathbb{M}_+[0, T] \subset \mathbb{M}[0, T]$, then the norm of $\lambda$ will be

$$\|\lambda\| = \sup_{\|f\| \le 1} \int_0^T f\,d\lambda = \lambda([0, T]),$$

i.e., the set (2.14) is a norm-bounded subset of $\mathbb{M}[0, T]$. Therefore by the Banach-Alaoglu Theorem (see Larsen [53] Theorem 9.4.1) we can conclude that (2.14) is a compact subset of $\mathbb{M}[0, T]$, and therefore a compact subset of $\mathbb{M}_+[0, T]$. □

Finally, observe that $\mathcal{A}^k[0, T] = (\mathcal{A}^0)^k$ and consider the product topology on $\mathcal{A}^k[0, T]$ inherited from the weak topology of $\mathcal{A}^0$. We can state the following result.

Corollary 2.16 $\mathcal{A}^k[0, T]$ is metrizable and separable. For $a_n, a \in \mathcal{A}^k[0, T]$, $a_n \to a$ if and only if

$$\int_0^T f(t) \cdot da_n(t) \to \int_0^T f(t) \cdot da(t)$$

for any $f \in C([0, T], \mathbb{R}^k)$. Moreover, the set

$$V_M = \{a \in \mathcal{A}^k[0, T] : \|a(T)\| \le M\} \tag{2.15}$$

is compact for any constant $M > 0$. □

We make the following observation for later use; its proof is obvious from the relative compactness criterion for the pseudo-path topology on $\mathcal{D}^d[0, T]$. Recall that the map $G : \mathcal{A}^k[0, T] \to \mathcal{D}^d[0, T]$ is defined by (2.3).

Lemma 2.17 For any constant $M > 0$,

$$G(V_M) = \{G_\cdot(v) : v \in V_M\}$$

is a relatively compact subset of $\mathcal{D}^d[0, T]$, where $V_M$ is defined by (2.15). □

Remark 2.18 Note that Lemma 2.17 is the main reason why we choose the pseudo-path topology for $\mathcal{D}^d[0, T]$. This result is critical for the proof of Theorem 3.8. It is obvious that the Skorohod topology (i.e., the $J_1$-topology) is too strong for Lemma 2.17 to be true.

We should point out that the Skorohod $M_1$-topology for $\mathcal{D}^d[0, T]$ (cf. Skorohod [70]) also gives Lemma 2.17.
However, we choose to use the pseudo-path topology because it makes theproof of the main result (Theorem 6.9) of Chapter 6 easier. DWe let A be the Borel u-field of Ac[0, T], and A the Borel u-field up to time t, i.e., A is theu-field generated by all the functions in .A”[0, T] that are constants after t. It can be verifiedthatAt=u{v(O): 0Ot}, 0tT.From now on we will always use the notation fi = X, F = A’ and F = X. It is well-knownthat M1(!2), endowed with the Prohorov weak convergence topology, is a separable metrizablespace. Denote the collection of all the control rules with initial condition (s, x) by whichis a subset of M1(Q). For any real number A, define{P e J(s, F) A},Proposition 2.19 There exists a constant C 0 such that forA(x) = C(1 + lix urn), x C (2.16)we have 0 0 for each (s, x) C Y. Recall that in is given in the definition of f.Proof. Under our assumptions it is known from the theory of stochastic differential equations(cf. Stroock and Varadhan [73]) that there exists a F0 C R.s,r withPo(vj = 0, p = otto, 0 t T) = 1.Then from the definition of control rule we know that under Fo,t= x + b(O, X9, u°)d9 + j u(9, x9, u°)dBo,Chapter 2. Formulation of the Problem 29therefore from the boundedness of b, a and the Burkholder-Davis-Gundy inequality we havemEFOI,XtIIm c { fim + EPO J b(O,x9u°)dOmSUP f u(’,e’,u°)dBeiQ9t Jsc {i + Ix II + EPO (j tr(a(O, x, u0))dO) }C(1+IIxIjm),where C is a constant independent of .s, x. Now we have by definitionJ(s, Fo) = E°{1Tf(8, x, u0)dO} c (i +1TE° IIzeIImds)< C(+IIxIIm).The proposition is thus proved by letting A(x) = C(1 + fh). LiChapter 3Existence of Optimal Controls3.1 IntroductionThis chapter studies the existence of optimal controls by the compactification method. We firstconsider an equivalent definition of control rules. 
After showing that the cost fnnction is lowersemicontinnous on the canonical space Q and the collection of control rules is a compact subsetof 111(Q), we conclude the existence of an optimal control by the well-known fact that a lowersemicontinuous function attains its minimum on compact sets. The value function is also shownto be Borel measurable. Some comments about the model we use and possible generalizationsare presented in Section 3.4.3.2 An equivalent definition for control rulesIn order to show the existence of optimal controls let us reformulate the problem as follows.Consider the stochastic differential equation (2.1), and lety = x — G(v),then x1= y + Gj(v) (0 t T), and (2.1) becomest tyj = x + f b(8, z, ue)dO + j u(9, Xe, ue)dBe. (3.1)Similarly, we can state the following result for control rules;Proposition 3.1 F e fl3, if and only if there exists an Fe-adapted process y such that1) y. is continuous w.p. 1, and FQr. = y. + 0(v)) = 1,2) FQr = X, /1 = Vr = 0, 0 r s) = 1,30Chapter 3. Existence of Optimal Controls 313) M E M for every q E C(IRd), where= (yt(w))— j (O, xe, y,0)d9, (3.2)and1(6,x,y,u) aj(O,x,u)3 +31 i1Proof. First, we assume that such a y exists. To show P e 7?, we need only to verifyMtq e M for e C(YD). Recall that Mb is defined by (2.5), which by 1) may be rewrittenas= (y + Gj(v))— f £(O, ye + Go(v), [Le)dO— f V(yo + Ge(v)) dGe(v)— [cb(y6 + Ge(v))— c5(Yo + Go(v)) — Vq(ye + Go(v)) AG6(v)]. (3.3)sO<tBy letting (x) = x’, x, and respectively for x = (x1) IRd, 1 i, j < d we canconclude that y is a continuous local semimartingale, withy1(t) = x + A + j b(O,xo,i9)dOclocwhere M , wfth= j a(O,xo,o)dO, 1 , i d.Therefore by Ito’s formula (cf. Ikeda and Watanabe [35], Chapter III)(yj + Gt(v)) = (x) + j V(y0 + Ge(v)) . dy0+ f V(ye + Ge(v)) . dGo(v) +f82(Yo+Ge(v))d(yi ya)+ [(yo + Ge+(v))— (yo + G9(v)) — V(yo + Ge(v)) AG9(v)].sO<tt t=(x) + j ye + Go(v),1te)d + j V(y9 + Ge(v)) . 
dGe(v)+ {q(ye + Go(v))—q(yo + Ge(v)) — Vq(ye + Ge(v)) AG0(v)].+ f V(y0 + G9(v)) . d9 (3.4)Chapter 3. Existence of Optimal Controls 32Note that the last integral in (3.4) is in M. Now from (3.3) we can see that= (x)+ f V(xe) due,therefore we have shown that Mçb€M, i.e., F€R.The other half of the proof is similar. cThe following result is used in the proof of Proposition 3.1 and will be used again. We writeit down as a lemma for convenience.Lemma 3.2 Assume F€For any ,€ C?() we have(i, iI)=ft a(9, Xe, )8e) 3e) dO (3.5)under the probability F. In particular, if we define, for 1 i d,(t) = y(t)—— f b1(O, xe,0)dO, (3.6)i.e., iQI with b(y) = y — x, then= jtaij(O,Xs,e)dO, 1 i, j d. URecall that C°{O, T} is the space of 1R’-va1ued continuous functions on [0, T]. We giveCd[0, T] the uniform topology, i.e., for x, y€C”[O, T], the distance between x and y is definedbyp(x, y) IIx(t) — y(1)II.This makes C![0, T] a Polish space, cf. Bifflngsley [6].For a sequence F’€with (sn, x)€E, the probability law of the process y, definedin Proposition 3.1, under Ftm is defined byP(C) FTh(w: y.(w)€C)for C€d, where d is the Borel a-field of Cd[0, T].Chapter 3. Existence of Optimal Controls 33Proposition 3.3 If the sequence (sn, Zn) is bounded in E, then {F’} are relatively compact.Moreover, for any e > 0, there exists a compact subset K C Cd[0, T] such thatP’1’(K) 1— &, V n. (3.7)Proof. To show that {FTh} is relatively compact, we need only verify the following:(a) limA_ infn P”-)y(0))) A) = 1, and(b) for any 7> 0,limlimsupF’2 sup y(t)— y(s) 7 = 0. (3.8)6O Os<tTt—s<6Note that (a) is obvions from the fact P’2(y(0) = Zn) = 1. By Billingsley [6], Theorem 12.3,(b) is implied byFfly(2)— y(ti)4 Cit2 — (3.9)for any n 1, 0 t1,t2 T, where C is a constant.Recall from the definition of y that F’1(y(t) = x, 0 t s,) = 1, and for t s,,y(t) = Zn + j b(8, x, e)dO + in(t)with Al,,. 
e J4lQC under P, and(In)j=tr(a(0,xo,pe))dO, t> s.It can be easily verified that (3.9) follows from the boundedness of the coefficients u(.,.,.) andb(.,.,.) and the Burkholder-Davis-Gundy inequality.We have thus shown that the probability laws of y under pn are relatively compact. Theexistence of a compact set K such that (3.7) holds is a consequence of Prohorov’s theorem. DProposition 3.4 If A is a bounded subset of 18d thenlim inf P{w : iivTii M} = 1, (3.10)M—*cc PE7?’,X(s,x) e[O,T] xAwhere ii denotes the Euclidean norm in J11kChapter 3. Existence of Optimal Controls 34Proof. If P e R, then J(P) A. Since the function f is assnmed to be bounded frombelow, there exists a constant K > 0 such that f —K. From the assumption that c(t) isstrictly positive and lower semicontinuous, there exists a constant c0 > 0 such that ct(t) c0(1 i k, 0 t T). ThusA > J(F) = E{1Tf(O, xo,po)dO + j c(O) . dvo}E”{—KT + coil vTii}.Therefore EIivTii I? E (A + KT)/co, andF(iivTiJ M)The proposition is obvious now. U3.3 Existence of optimal controlsDefine a function on Q byT TFs(w)Ej f(Oxoito)d8+j c(8).dvg (3.11)forw= (x.,u.,v.). Then we haveLemma 3.5 f(.) is lower sernicontinuous on fi, i.e.,liminfF3(w) r3Q4) (3.12)if w— win Q.Proof. We show the case when s = 0; for general .s the proof is similar. We can also assumethat f(t,.,.), c(.) are continuous. In fact, since they are lower semicontinuous, we can findsequences of continuous functions {fm(t,, )}, {cm()} such thatfm(t,.,.) I f(t, ., .), c(.) I c(.), 1 i k,Chapter 3. Existence of Optimal Controls 35and thus if the lemma is true for continuous functions, then Vm> 1,lirninf0(w) liminf{JTfm(0, x(o), t(o))do + f Cm(O) dv(O)}J fm(9,X(0),(6))d8+f Cm(O)’dV(O).0 0Let m —* oo, and use the monotone convergence theorem to conclude the result.We assume that f(t,.,.), and c(.) are continuous. From v — v in Ak[O, T] we havefTI c(9) dv(9) — I c(O) dv(O).JO JoNow we show that as n —+1T L f(9, x(9), u)(du)d9 j j f(9, x(9),u)8(du)d9. 
(3.13)Let dQ9 p(du)d9, dQ 0(du)dO, the right hand side of (3.13) can be rewritten asJ f(O,x(O),u)dQ = f[0,T]xU [0,TIxU+ f [f(O, x(9), u) - f(O, x(O), u)]dQ. (3.14)[O,T]xUFrom the definition of stable topology we know as n —* 00,J f(9,x(8),u)dQ _*J[O,T]xU [0,T]xUNow we show that the limit of the second integral in (3.14) is zero. For any positive integerm and constant 7> 0, letg(t, x, u) = f(t, x, u) — f(t, z(t), u),Am = (t,u): sup Ig(t,x,u) 7I IIx—r(t)I1/mThen each t-section of the set Am is a closed subset of U from the continuity of the functionf(t.).AlsoAiDA2D”DAmDAm+iD”.andflAmø.Chapter 3. Existence of Optimal Controls 36Applying the results in Jacod and Mémin [38] Proposition 2.11 we can getlim sup Qn(Am) Q(Am), limQ(Am) = 0. (3.15)mLet B,, = {(t, u) : g(t, x(t), u) In order to show the limit of the last integral in (3.14)is zero, we need only showlim Q(B) = 0 (3.16)by Jacod and Mémin [38] Cor 2.18. For a given e > 0, from (3.15) there exists an M > 0such that Q(AM) a Recall that convergence in pseudo-path topology is equivalent to theconvergence in Lebesgue measure, therefore x(.)—÷ x(.) in the Lebesgue measure 1, and thereexists N such that when n N,i {t:- x(t)II><aLet C = {t IIx(t)— x(t)II > 1/m}, then it is obviousB,, \ (C’ x U) C AM,and therefore we haveQ(B) Q(AM) + Q(C x U).But Q(C,,” x U) = l(C”) <e, hencelimsupQfl(B) limsupQ(AM)+EQ(AM)-f-e<2aSince e is arbitrary, we have shown (3.16), and the lemma is thus proved. 0For F e we have by definition thatJ(s, F) =i.e., the cost corresponding to the control rule F. We can state the following result now.Chapter 3. Existence of Optimal Controls 37Theorem 3.6 The mapping (s, x, F) —* J(s, P) is lower semicontinuous on {(s, x, F) : (s, z) é, F E with the induced topology of[O,T] x M+(Q), i.e., if(s,x) e , F e— .s, x,— x, and P —f F e R. weakly, thenJ(s, F) liminf J(8n, Pa). (3.17)Proof. Assume (s,, x,2, F,) —÷ (s, x, F) with F e F e It suffices to considertwo cases: s, .s or s, s. 
When s T s,E (r—r8) = E {j f(O, x, o)d9 + j c(O) . dvo}E (f f(8z/Le)dO) —K(s—s)because we have assumed that c 0. It followslirninf (r3 — r8) 0. (3.18)When s,, j s, (3.18) follows fromE’ (1— =— j f(o, Xn,where we have used the fact F(vo = 0, 0 0 s) = 1.In either case, from (3.18) and the lower semicontinuity of I(.), we havelirninf J(s, F) lirninf ET8 + liminf (F — F8)liminfE1F3 EFS = J(s,P),i.e., J(.,.) is lower semicontinuous. oTheorem 3.7 For any A> 0, if A is a compact set in , then u{1: s E {0,T], x A}is compact.Proof. Since Pvti() is metrizable, we need only to show that each sequence {F’} Chas a subsequence {Fnk} such that pflk— F for some s [0,T], x E A. Because A scompact and T < , we may assume s, —* s, x —b x for some s e [0,Tj, x E A.Chapter 3. Existence of Optimal Controls 38By Proposition 3.1, the process y, defined byy.(w) = x.(w) —for w = (x, p, v), has continuous sample paths under the probability F, and is such thati’IIt c M (Mth is defined in Proposition 3.1) for each E C?(ffld). Now we introduce thefollowing auxiliary spacez clXCd[O,TJ,Z FxLDefine a probability fr on (Z, Z) by the probability law of (x, z, v, y) with respect to Ftm, i.e.,for Z CFTh(Z) F (w: (x.(w), .(w), v.(w), y.(w)) e Z).In other words, for F E F, C C C,P(F x C) =We will show that the sequence F’ is tight.For a positive constant M, let VM = {v e Av[O,T] IIVTII M}. From Corollary 2.16 weknow that VM is a compact Borel subset of Ak[O, T]. Proposition 3.4 implies that for any givene > 0, there exists an M such thatF(Vd[O, T] >< U X Vw) 1 — (3.19)for every F C .s e [0,T], x e A. Therefore for the corresponding P, we haveE(vd[0,Tj x U x VM x Cd[0,TJ) 1— e. (3.20)By Proposition 3.3 we know that the probability laws of y with respect to Ftm, denoted byare relatively compact, and there exists a compact subset K c C°[o, T] withFtm(K) 1—Chapter 3. Existence of Optimal Controls 39or equivalently,F(w: y.Qu) e K) 1 — e, Vn. (3.21)From Proposition 3.1, (3.21) may be written asx K) 1— r, Vn. 
(3.22)We now consider the coordinate process x. LetD = K + G(VM) = {y + G(v), y E K, v E VM}.By Lemma 2.17 we know that GQIKw) is a relatively compact subset of Vd[0, T] under thepseudo-path topology. Since the uniform topology is stronger than the pseudo-path topology,K is also a compact subset in V’[O, T] and hence so is D. From Proposition 3.1 we havePTh(D x i-I x VM x K) = pTh(vd[o, T] x U x VM x K). (3.23)Let S = D x U x VM x K. Since U is a compact space, we know that S is a relatively compactsubset in Z. Moreover, from (3.20), (3.22) and (3.23) we haveFTh(S) 1 — 2efor every ii. Thus {FTh} is a tight sequence of probability measures on Z. By the Prohorovtheorem there is a subsequence {FThk } and a probability F on (Z, Z) such that Fnk —* F weakly.DefineFFIc2,i.e., in the terminology of Jacod and Mémin [38], F is the Q-marginal of F, then it is easyto see that Ftk F weakly. The proof of the theorem will be completed if F e R. ByProposition 3.1 we need only to show that there exists a continuous process Y on (Q, F, F, F)such that1. F(Y = x. — G.(v)) = 1;2. F(xT = X, v = 0, 0 r s) = 1;Chapter 3. Existence of Optimal Controls 403. Mt4 e M, Vç& C C(]Rd), and4. J(s,F) A.Note that 4) is obvious from Theorem 3.6. To show 1), note that the set {(w, y): x.(w) =y. + G.(v(w))} is a closed subset of S = Q x Cd[0, T], and thereforeP(x. = y. + G.(v)) lim sup P(x. = y. + G,(v)) = L (3.24)If we define Y(w) = x.(w)— G.(v(w)), then P((w, y): Y.(w)= y.) = 1. Thus Y is a continuousprocess on (Q,F,J,F), and 1) follows from (3.24). Moreover, {y(O) = x} is closed in 5, soF{y(0) = x} lim sup Eflk{y(0) = x} = 1,or F{w: Y(0) = x} = 1. It follows that x(0) = x a.s.(F).For 2), letBm = (x.,&,v.): Ix(t)—zII , ½ = 0, 0< t s—It is easy to see that Bm is closed in Q and B1 D B2 D , andflBm ={w=(x,p,v): Xt=X, vt=0, 0<ts}.Since we know for each in, Fflk(Bm) = 1 for large k since Xnk —* x. 
Butp2k F weakly, soF(Bm) limsupF’(Bm) = 1,kand thereforeF{w = X, Vt = 0, 0 t .s}= F(flBmfl{x(0) = x}) =llrnF(Bm) = 1.Finally, we prove 3). For any bounded continuous function H(.) on fl, if we define= H(w), Vc = (w,y.) C 12 xChapter 3. Existence of Optimal Controls 41then H is a continuous function on .Z. For any fixed I .s, the function= (x,p,v,y)e Z Mj(z) = (Yt)— jtL(oxy)dois continuous on Z. In fact, it can be shown as in the proof of Lemma 3.5 that the integral partof the function M,çb is continuous on Z, and the continuity of the functionz = (x,t,v,y)-÷ b(y)on Z follows from the fact that C![O, T] is endowed with the uniform topology. Thus foro < u < I T the functionz = (x, i, v, y) - ft(z) [Mtb(z) - Mgl(z)]is a bounded and continuous function on 2’, and since Pnk —* F weakly, we havelirn E— = E {ft[M—j}. (3.25)Again, by (3.24) and the definition of 1’, we have P(ilith = M.) = 1, where £ is defined by(3.2) with y replaced by Y. Thus (3.25) can be rewritten aslim EPnk {H[&th—= E” {H[&c5—M4]}. (3.26)For every bounded continuous function H on 12 that is Ft-measurable, the left hand side of(3.26) is zero since MgI e M on the filtered probability space (12, F, F,, F). By a routine limitprocedure we have— MugS)] = 0for each bounded Ft-measurable function H. Thus (MgI, F) is a martingale under F. Thecontinuity of this martingale follows from that of Y. The proof is therefore complete. DWe can now prove the main theorem of this chapter.Theorem 3.8 The control problem has an optimal solution, i.e., there exists a F* e fl8, suchthatJ(s,F*)= inf J(s,F).PER8,1Chapter 3. Existence of Optimal Controls 42Proof. By Proposition 2.19 and Theorem 3.7 we know that is nonempty and compact.Moreover, it is obviousmi J(s, F) = inf J(s, F).FE1Z8,xNow J(s,.) is a lower semicontinuous function on so it attains its minimum on the compactset c 1M(1), i.e., there exists a P e C R such thatJ(s, F*) = inf J(s, F) = inf J(s, F),which completes the proof. 
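In outline, this is the direct method: compactness supplies a convergent minimizing subsequence, and lower semicontinuity passes the infimum to the limit. As a sketch only, write $\mathcal{K}$ for the nonempty compact subset of $\mathcal{R}_{s,x}$ used above and $(P^n) \subset \mathcal{K}$ for a minimizing sequence; both symbols are placeholders introduced here for illustration and are not the notation of the text.

```latex
% Sketch of the existence argument (placeholder symbols \mathcal{K}, P^n, P^{n_k}).
% Step 1: compactness of \mathcal{K} gives a subsequence P^{n_k} \to P^* weakly, with P^* \in \mathcal{K}.
% Step 2: lower semicontinuity of J(s,\cdot) (Theorem 3.6) gives the first inequality below.
\[
  J(s,P^*) \;\le\; \liminf_{k\to\infty} J(s,P^{n_k})
          \;=\; \inf_{P\in\mathcal{K}} J(s,P)
          \;=\; \inf_{P\in\mathcal{R}_{s,x}} J(s,P) .
\]
% Since P^* \in \mathcal{K} \subset \mathcal{R}_{s,x}, the reverse inequality is trivial,
% so P^* attains the infimum.
```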
□

Recall that the value function is defined by
\[ W(s,x) = \inf_{P \in \mathcal{R}_{s,x}} J(s,P). \]
Let us define
\[ \mathcal{R}^0_{s,x} = \{P \in \mathcal{R}_{s,x} : W(s,x) = J(s,P)\}. \]
By Theorem 3.8, $\mathcal{R}^0_{s,x} \neq \emptyset$ for any $(s,x) \in \Sigma$. It can be easily verified that it is a compact subset of $\mathbb{M}_1(\Omega)$.

Before we prove the measurability of the value function $W$, we give a result which will be used later. We adopt the notation of Stroock and Varadhan [73], Chapter 12.

Lemma 3.9 The map $\mathcal{R}^0 : \Sigma \to \mathrm{Comp}(\mathbb{M}_1(\Omega))$ is Borel measurable. Moreover, there exists a measurable selector $H$ of $\mathcal{R}^0$, i.e., $H(s,x) \in \mathcal{R}^0_{s,x}$ for all $(s,x) \in \Sigma$ and $H : \Sigma \to \mathbb{M}_1(\Omega)$ is Borel measurable.

Proof. By Stroock and Varadhan [73], Lemma 12.1.8, we need only show the following: for $(s_n,x_n) \in \Sigma$, $s_n \to s$, $x_n \to x$, $P^n \in \mathcal{R}^0_{s_n,x_n}$, there exist a subsequence $n_k$ and $P \in \mathcal{R}^0_{s,x}$ such that $P^{n_k} \to P$.

Since $x_n \to x$, we may assume $A(x_n) \le A$ for some constant $A$. Therefore $\{P^n\} \subset \bigcup\{\mathcal{R}^A_{s,x} : (s,x) \in [0,T] \times \Lambda\}$ with $\Lambda = \{x, x_1, x_2, \dots\}$ a compact set. By Theorem 3.7 there exist a subsequence $n_k$ and $P \in \mathcal{R}_{s,x}$ such that $P^{n_k} \to P$. From Theorem 3.6 it can be seen easily that $P \in \mathcal{R}^0_{s,x}$. The measurability of $\mathcal{R}^0$ is thus proved.

The existence of a measurable selector $H$ is a consequence of Stroock and Varadhan [73], Theorem 12.1.10. □

Corollary 3.10 $W(\cdot,\cdot)$ is a Borel measurable function.

Proof. From Theorem 3.6 we know that the map $(s,x,P) \mapsto J(s,P)$ is lower semicontinuous and thus Borel measurable. The corollary follows from the fact that $W(s,x) = J(s,H(s,x))$ is the composition of two Borel measurable mappings. □

3.4 Some comments

(a) The model studied in this chapter includes the case of the monotone follower problem as formulated in Karatzas [40], Karatzas and Shreve [40], [46], by letting $k = d$ and $g(\theta) = I$, $0 \le \theta \le T$, where $I$ denotes the $d \times d$ unit matrix. Moreover, if we take $k = 2d$ and $g(\theta) = (I, -I)$, $0 \le \theta \le T$, then the model reduces to the bounded variation problem as discussed in Chow, Menaldi and Robin [14], among others.

We have assumed that $c^i(\cdot) > 0$.
This condition is necessary for the existence of optimal controls for the general problem formulated in this chapter (see the proof of Proposition 3.4). Thus it excludes the case of the so-called cheap control problems, i.e., $c(\cdot) = 0$. This type of problem is discussed in Chiarolla and Haussmann [12], [13], Menaldi and Robin [62]. However, our method works for problems with finite fuel constraints, because for any $y \in \mathcal{D}^k[0,T]$, the set
\[ \{v \in \mathcal{A}^k[0,T] : v^i(t) \le y^i(t),\ \forall t,\ 1 \le i \le k\} \]
is closed in $\mathcal{A}^k[0,T]$. This follows from the fact that if $v^n \to v$ in $\mathcal{A}^k[0,T]$ then $v^{n,i}(t) \to v^i(t)$ ($1 \le i \le k$) at all the continuity points of $v^i$. For problems with finite fuel constraints, see Karatzas [42], Karatzas and El Karoui [20] and a more recent work by Bridge and Shreve [9], among others.

(b) Now we explain why the method used in this chapter is not suitable for the problem where terminal costs are allowed. If we define $\Gamma^s(\omega) = \phi(x_T(\omega))$ for $\omega = (x,\mu,v)$, with $\phi$ a continuous function defined on $\mathbb{R}^d$, we cannot get the lower semicontinuity of $\Gamma^s$. The reason is that $x^n \to x$ in the pseudo-path topology only ensures that $x^n(\cdot) \to x(\cdot)$ in Lebesgue measure, and therefore $x^n(T) \to x(T)$ may not hold.

But we can modify the formulation of the problem to allow a terminal cost, i.e., let
\[ J(\alpha) = E\left\{\int_s^T f(t,x_t,u_t)\,dt + \int_{[s,T]} c(t)\cdot dv_t + \phi(x_{T+})\right\}. \]
Note the additional term $c(T)\cdot\Delta v_T$ in the second integral, and the relation $x_{T+} = x_T + g(T)\Delta v_T$. We can replace $\mathcal{D}^d[0,T]$ by $\mathcal{D}^d[0,T] \times \mathbb{R}^d$ and $\mathcal{A}^k[0,T]$ by $\mathcal{A}^k[0,T] \times \mathbb{R}^k$ in the canonical space to obtain the results if $\phi$ is lower semicontinuous.

(c) For the same reason as explained in (b) we cannot allow pointwise constraints of the type $\Phi(x_{t_0}) = 0$ a.s. for some lower semicontinuous function $\Phi$ on $\mathbb{R}^d$ (which may take the value $+\infty$) and fixed $t_0 \in [0,T]$. But it is easily seen that the following kind of integral constraint may be added to the problem:
\[ \int_s^T f_0(t,x_t,u_t)\,dt + \int_s^T c_0(t)\cdot dv_t \le 0, \quad \text{a.s.,} \]
where $f_0$, $c_0$ satisfy the same conditions as $f$ and $c$, except that the positivity of $c_0(\cdot)$
is relaxed to boundedness from below. Moreover, from the proof of Lemma 3.5 we can conclude that the following kind of constraint may also be added to the model:
\[ \int_s^T f_1(t,x_t,u_t)\,dt + \int_s^T c_1(t)\cdot dv_t = 0, \]
with $f_1(t,\cdot,\cdot)$, $c_1(\cdot)$ continuous on $\mathbb{R}^d \times U$ (for each $0 \le t \le T$) and on $[0,T]$ respectively. Of course, we must now assume the existence of an admissible control. See Haussmann and Lepeltier [30] for constraints of these types in classical control problems.

Chapter 4
The Dynamic Programming Principle

4.1 Introduction

In this chapter we will apply the method used in Haussmann [29], Haussmann and Lepeltier [30] and El Karoui et al [18] to establish the dynamic programming principle. In Section 4.2 we will present some techniques that are used by Stroock and Varadhan [73] to show the existence of a Markovian solution to a stochastic differential equation. The main results of that section are that a control rule remains a control rule for the problem starting at a later time from the point reached at that time, and that if we take a control rule and at some later time switch to another one that has the initial value reached by the first control rule, the object we obtain is also a control rule. Section 4.3 is devoted to an abstract form of the dynamic programming principle, which has played an essential role in the setting of classical control problems. In the last section, Section 4.4, we show that there exists a family of optimal control rules such that the associated optimal state process has the strong Markov property. Note that this approach to the dynamic programming principle does not require any regularity of the value function $W$ other than Borel measurability.

4.2 Some preparations

This section presents some preparations for proving the dynamic programming principle. For $\bar\omega = (\bar x, \bar\mu, \bar v) \in \Omega$, define $\theta^{\bar\omega}_s : \Omega \to \Omega$ by
\[ \theta^{\bar\omega}_s(\omega)_t = \begin{cases} \bar\omega_t, & 0 \le t \le s, \\ (x_t,\ \mu_t,\ \bar v_s + v_t - v_s), & s < t \le T, \end{cases} \tag{4.1} \]
for $\omega = (x,\mu,v)$. Note that if $\tilde\omega = \theta^{\bar\omega}_s(\omega)$, then $\tilde\omega = \bar\omega$ on $[0,s]$, $(x_\cdot(\tilde\omega), \mu_\cdot(\tilde\omega)) = (x_\cdot(\omega), \mu_\cdot(\omega))$ on $(s,T]$ and $v_\cdot(\tilde\omega) - v_s(\tilde\omega) = v_\cdot(\omega) - v_s(\omega)$ on $[s,T]$.
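The map in (4.1) glues two canonical paths together at time s while preserving the increments of the singular control after s. A minimal discrete-time sketch of that gluing is given below; it is purely illustrative and assumes paths sampled on a common finite time grid, with the relaxed control stored as an opaque label per grid point. The names `Path` and `concatenate` are hypothetical and do not appear in the text.

```python
from typing import List, NamedTuple, Tuple

Vector = Tuple[float, ...]


class Path(NamedTuple):
    """A canonical path omega = (x, mu, v) sampled on a common time grid."""
    x: List[Vector]    # state x_t in R^d
    mu: List[object]   # classical (relaxed) control at each grid time (opaque here)
    v: List[Vector]    # cumulative singular control v_t in R^k


def concatenate(omega_bar: Path, omega: Path, s_idx: int) -> Path:
    """Discrete analogue of (4.1): follow omega_bar up to grid index s_idx,
    then follow omega, re-basing v so that its increments after s are kept."""
    n = len(omega.x)
    if not (len(omega_bar.x) == n and 0 <= s_idx < n):
        raise ValueError("paths must share the grid and s_idx must lie on it")
    v_bar_s = omega_bar.v[s_idx]
    v_s = omega.v[s_idx]
    x, mu, v = [], [], []
    for i in range(n):
        if i <= s_idx:
            # On [0, s] the new path coincides with omega_bar.
            x.append(omega_bar.x[i])
            mu.append(omega_bar.mu[i])
            v.append(omega_bar.v[i])
        else:
            # On (s, T] keep (x, mu) from omega and shift v: v_bar_s + v_t - v_s.
            x.append(omega.x[i])
            mu.append(omega.mu[i])
            v.append(tuple(b + c - a for a, b, c in zip(v_s, v_bar_s, omega.v[i])))
    return Path(x, mu, v)
```

With this convention the glued path agrees with `omega_bar` up to the switching index, and the increment of `v` between any two grid times after the switch equals the corresponding increment of `omega.v`, which is the discrete counterpart of the note following (4.1).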
Define F8 to be the a-field generated by the canonicalpaths after time .s.Lemma 4.1 1fF is a probability on (Q,F8) (0 s T) and C 2, then there exists a uniqueprobability measure, denoted by 6 ® F, on (Q, F) such that6 ® F(w :wj = “j,O t s) = 1, (4.2)6 ® F(A) = F(Oj(A)), VA c F8, (4.3)where by Wj = 0 t s, we mean Xt = , Vt = t, 0 t s, and,u = Jt a.e. on [0,s].Proof. The uniqneness of such a probability measure is obvious, so we need only to showits existence.if I is a subinterval of [0, T], we write V(I) for the set of measurable functionsI i—> ]Mi(U).Let us recall an equivalent definition of the stable topology on U = V([0, T]). For p C U,define a mapping i: U’--> C([0,T],]M(U)) byi(p)(.)= f p dO.The topology on U induced by i is exactly the stable topology we have introduced in Chapter 2.For a discussion of this fact, see Haussmann and Lepeltier [30] §3.10. Similarly, we can considerthe topology on V(I) induced by the mapping= j 1i(O)p8d c C([0,T],M(U)), p.C V(I).Write A for the 1R-valued nondecreasing functions on I with the inherited topology fromA’[0,T]. LetVd[O,s] )< i[o,8]V([0,5]) xX V°[s,T] xi18,V([s,T]) <XEX0X,Chapter 4. The Dynamic Programming Principle 47and define o: Q i—* Xo, : i—+ X by= (xt,jtdo,vt), o t= (x,,jtIdo,v,), s i’ T,where w = (x, v) !. Define : $1 by=where T0 : X0 i’ : X= (tAs,z31(iz)(t A s), itAs),T(W) = (Xt8,i1(1L)(i V s), Vt8),with=(,fl,) E X0, = (x,t,v) e X.We define a probability F on X by F 6 o x P o and letp=o—1.Now we verify that F satisfies the conditions (4.2) and (4.3). Note that r0 o = , 0t < .s, so we haveP(w : = , 0 I s) = i((Wl,L4’2) : c’t, 0 I s)= P(w1,w2): = Lt, 0 t s)= TOOO(W1)t=L2t, 0t<t)= 1.For A e F8,P(A) = P(Q.41,w2): (wi,w2)E A)= ((‘i,w2):O3,(1)(T(W2)e A)= ((wi,w2): T(W2) EChapter 4. The Dynamic Programming Principle 48= J Fe C’(w2; r(w2) COST ()(A))6o a=r a (w) C O$T(A)) 6(O)(&o1)= F(w: r a (w) C= F(O;(A))since r a ‘(w)t = Wj holds on [s,T]. The lemma is proved by letting 6 Ø. F = F. 
0Remark 4.2 Notice that for F C there exists a F-null set N0 such that if 0 N0,A C F8, then F(A) = F a O(A), andE{jth(8) . dvo} = EPOS$ {jt h(S) . tfor any bounded 1Rkva1ued Borel measurable function h(.). These properties will be usedrepeatedly in the rest of this section. 0Assume that r is an Fe-stopping time, 0 r T. A r-transition probability (or r-t.p.) isa family {Q w C } of probability measures on (Q, F) such thatw F—* Q(A) is FT-measurable, VA C F.Note that FT is the collection of sets A such that Afl{r t} C F, Vt < T. For a fixed rand w such that rQo) T, we denote by FT(W) the u-field generated by the sets of the formC Q: f odOCB, CC , (4.4)t T(W)where A C B(JR’), B C 13(IM+(U)), C C B(lRk), r(w) t T, so FT(W) is the collectionof events that occur after the time r(w). Notice that the topology on Q is separable, so F iscountably generated, and for a given probability F on (Q, F), the regular conditional probabilitydistribution (r.c.p.d.) of F for given FT exists, and will be denoted by FT,W.Given a stopping time r (0 r T) and a r-t.p. Q, if &‘ C fi, then from Lemma 4.1 weknow that there exists a unique 6 ®T Q C 1216(Q) such that6 OT Q{cZ’ :3 = Z’,O t r(w)} = 1,Chapter 4. The Dynamic Programming Principle 49bCJ ®T Q(A) = Q(E;J0(A)), VA e FT(c0).We write ® = Q. When = (x(r(w)), 6°, 0), we write0£ n flTU vrIt can be seen easily that for s T, and a stopping time r s r T,0= Q(rT(W)) =Q (I’) (4.5)where r is defined by (3.11).IS F e M1(Q), r is a stopping time, and {Q} is a r-t.p., then we have the following resultwhich is analogous to Theorem 6.1.2 in Stroock and Varadhan [73].Lemma 4.3 There exists a unique probability, denoted by F ØT Q, such that1. FØTQ(A)=F(A), ifAEFT,2. The r.c.p.d. of P ØT Q with respect to F,- is Q.Proof. The uniqueness of such a probability F®,- Q is obvious. Thus we need only to provethe existence. In fact, it is enough to check that w Q(A) = 6®()Q(A) is FT-measurable,and then set®T Q(A) = E13(6. ®,-(.) 
Q.(A)), A E F.Once it is known that w ‘-÷ Q,(A) is F,--measurable, the proof that P ®,- Q has the desiredproperties is easy. But if‘iiA=w=(xjt,v): x(tj)eX, j 19d8eB, vQtj)cC, i=1,•,nwith0ti<...<tT,XjeB(]Rd),BjcB(1lkt+(U)),C,eB(]Rj,1<in,thenQ(A) 1[o,t1)(T(W))Qw (e;)(A))+ E1[tk,tk+l )(T(W))k(i (x(ti))1B1j od6)1c (v(ti)))Chapter 4. The Dynamic Programming Principle 50/ rtzxQ(&: (ti)eX1,‘jlodO+ / podOEBi,Ji-(w)(tj)—(r(w))+v(r(w))ECi, k+1 zfl)+1[t.T](T(W)) II1x(x(ti))lBe(J btedO)1c1(v(t)),and this is clearlyF7-measurable. The lemma is thus proved by recaffing that the collection ofsets in the form of A generates the a-field F. ULemma 4.4 Assume P C r is an Fe-stopping time: s r T, and iL C 1!. Then(Mj4, F, P o O) is a rnartingaie after rz, for e C?(1R°).Proof. Since P C fl, we know that (Mjq!’, F, P) is a martingale, i.e.,IA IA VACF, su<tT, (4.6)orE’ {1(.) [M(.)- Mth(.)]} =0. (4.7)When t r0, we havept pi=— ] £(,x6,ue)dO— J v(xo) .g(O)dv- [(x) - (x)]- [(xT)-r<O<t—/£15(O,±e,Jio)dO_j— Z [(Eo+)—(te)J.Therefore we can get, for r, u t T, A e F,EFOOTL {1A() [M(’) - M(.)]}= E’3 {1A(O(.)) [Mb(O(.))— Mb(O,0(.))]}= E {1A(OrQ,)) [Mth(.) - M)]}=0Chapter 4. The Dynamic Programming Principle 51because it is obvious that O;’(A) e F. The lemma is thus proved. ElRecall that for a stopping time r, FT,W denotes the regular conditional probability distribution of F given F,-. is obviously a r-t.p., and so FE 6LZ ®T FT,W is defined, where= (xQr(w)), 6°, 0).Lemma 4.5 If(Mj, F,, F) is a martingale and r is an Ft-stopping time. Then there is a F-nullset N C FT such that for w g N, (Mi, F,, FT,W) is a martingale for t ru,.Proof. cf. Stroock and Varadhan [73] Theorem 1.2.10. 0Corollary 4.6 If F e and r is an Fe-stopping time, then there is a F-null set N C FTsuch that for w N, (Mjq5, F,, F,-, o °;‘) is a martingale for t when e C?(JR’1).Froof. The proof is obvious from Lemmas 4.4 and 4.5. ElThe next two results are important for the rest of the chapter. 
The first one states thata control rule remains a control rule for the problem starting at a later time from the pointreached at that time. The second one says that if we take a control rule and at some later timeswitch to another coutrol rule, then this concatenated object is still a control rule.Proposition 4.7 (Closure under Conditioning) If F e R-3, and r is a stopping time,s r T, then there exists a F-null set Ne FT such that Pse flTw,xr forw g N.Proof. Let {q5m} be a dense subset ofC0°°(]R’). Then for each m, (Mtqlm — MtAT’m,Ft,F)(t s) is a martingale. For’ C Q, define= (xt) — (xtAT)—tATWJ (xo).g(8)dv-tAT&tAT0Then by Lemma 4.5 there exists a F-null set Nm€ FT such that (M°q5m, F,, FT,) is a martingale for N,,. By Lemma 4.4 we know that (M?m, F,, FT,o o EJj) is a martingaleChapter 4. The Dynamic Programming Principle 52for t N,,. Certainly, is a martingale, where &‘ =Therefore (MEAT c&rn, F, P) is a martingale. It is obvious from the definition thatF (w: X, = 12r = 60, Vr = 0, 0 r r0) = 1.Let N = Urn Nm, then P(N) = 0. Through a limit procedure we can show that FeforwgN. UProposition 4.8 (Closure under Concatenation) Let P e R and r be a stopping timesuch that s r T. If Q is a transition probability such that Q e thenP®7QProof. It is obvious that we need only to show that for each e C2(lRd), .F, P®7 Q)is a martingale after time s. From Remark 4.2 and the fact Q e R(w),qr(w)) it can be easilyverified that (Mth, F, ®T Q) is a martingale after time r. By definition we know that= 6 ® Q, equals the regular conditional probability distribution of P ® Q given F,-. Theproof of the proposition now follows Theorem L2.l0 in Stroock and Varadhan [73]. USet Q = H(r(w), x(r(w))), where H is a measurable selector of fl°, and by definition, it isa r-t.p.. We denoteP®7H=P®Qw.Corollary 4.9 For P in we have P ®- H C R. 
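□

4.3 The dynamic programming principle

With the preparations of Section 4.2, we can now establish the dynamic programming principle. For a given probability $P \in \mathcal{R}_{s,x}$, define $\mathcal{R}^\tau_{s,x}(P)$ to be the set of probabilities in $\mathcal{R}_{s,x}$ which coincide with $P$ up to time $\tau$, i.e.,
\[ \mathcal{R}^\tau_{s,x}(P) = \{P \otimes_\tau Q : Q_\omega \in \mathcal{R}_{\tau(\omega),x_\tau(\omega)} \text{ such that } Q \text{ is a } \tau\text{-t.p.}\} \subset \mathcal{R}_{s,x}. \]
We introduce the following notation: for a measurable function $\phi \in B(\Sigma)$, let
\[ \Gamma^s(t,\phi)(\omega) = \int_s^t f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^t c(\theta)\cdot dv_\theta + \phi(t,x_t), \]
where $\omega = (x,\mu,v) \in \Omega$.

Theorem 4.10 (Dynamic Programming Principle) (a) If $\tau$ is an $\mathcal{F}_t$-stopping time, $s \le \tau \le T$, and $P \in \mathcal{R}_{s,x}$, then
\[ E^P\left\{\int_s^\tau f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^\tau c(\theta)\cdot dv_\theta + W(\tau,x_\tau)\right\} = \inf\{\tilde P(\Gamma^s) : \tilde P \in \mathcal{R}^\tau_{s,x}(P)\}, \tag{4.8} \]
where $\Gamma^s(\cdot)$ is defined by (3.11).
(b) For $P \in \mathcal{R}_{s,x}$, $(\Gamma^s(t,W), \mathcal{F}_t, P)$ is a submartingale.
(c) If $s \le \tau \le T$, then
\[ W(s,x) = \inf\{P\,\Gamma^s(\tau,W) : P \in \mathcal{R}_{s,x}\}. \tag{4.9} \]
(d) $(\Gamma^s(t,W), \mathcal{F}_t, P)$ is a martingale under $P$ if and only if $P \in \mathcal{R}_{s,x}$ is optimal.

Proof. (a) Recall that $H$ denotes a measurable selector of $\mathcal{R}^0$; therefore by (4.3) and Remark 4.2 the LHS of (4.8) is
\begin{align*}
& E^P\left\{\int_s^\tau f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^\tau c(\theta)\cdot dv_\theta + H(\tau(\cdot),x_\tau(\cdot))(\Gamma^\tau)\right\} \\
&\quad = E^P\left\{\int_s^\tau f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^\tau c(\theta)\cdot dv_\theta + \delta_\cdot \otimes_\tau H(\Gamma^\tau)\right\} \\
&\quad = P \otimes_\tau H(\Gamma^s),
\end{align*}
and by Proposition 4.8 we know that $P \otimes_\tau H \in \mathcal{R}^\tau_{s,x}(P)$. On the other hand, for $\tilde P = P \otimes_\tau Q \in \mathcal{R}^\tau_{s,x}(P)$,
\begin{align*}
\tilde P(\Gamma^s) &= P \otimes_\tau Q(\Gamma^s) \\
&= E^P\left\{\int_s^\tau f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^\tau c(\theta)\cdot dv_\theta + Q_\cdot(\Gamma^\tau)\right\} \\
&\ge E^P\left\{\int_s^\tau f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^\tau c(\theta)\cdot dv_\theta + W(\tau(\cdot),x_\tau(\cdot))\right\} \\
&= \text{LHS of (4.8)},
\end{align*}
so the proof of (a) is completed.

(b) For any $s \le t < t+h$, writing $\hat P_\omega = \delta_{\omega'} \otimes_t P_{t,\omega}$, we have
\begin{align*}
& E^P\{\Gamma^s(t+h,W) - \Gamma^s(t,W) \mid \mathcal{F}_t\} \\
&\quad = E^P\left\{\int_s^{t+h} f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^{t+h} c(\theta)\cdot dv_\theta + W(t+h,x_{t+h}) \right. \\
&\qquad\qquad \left. - \left[\int_s^t f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^t c(\theta)\cdot dv_\theta + W(t,x_t)\right] \,\Big|\, \mathcal{F}_t\right\} \\
&\quad = E^P\left\{\int_t^{t+h} f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_t^{t+h} c(\theta)\cdot dv_\theta + W(t+h,x_{t+h}) \,\Big|\, \mathcal{F}_t\right\} - W(t,x_t) \\
&\quad = P_{t,\omega}\,\Gamma^t(t+h,W) - W(t,x_t) \\
&\quad = \hat P_\omega\,\Gamma^t(t+h,W) - W(t,x_t)
\end{align*}
by Lemma 4.3(2). Now we apply (4.8) and the fact $\mathcal{R}^{t+h}_{t,x_t}(\hat P_\omega) \subset \mathcal{R}_{t,x_t}$, which rests on Proposition 4.7, to get
\begin{align*}
E^P\{\Gamma^s(t+h,W) - \Gamma^s(t,W) \mid \mathcal{F}_t\}
&= \inf\{\tilde P\,\Gamma^t : \tilde P \in \mathcal{R}^{t+h}_{t,x_t}(\hat P_\omega)\} - W(t,x_t) \\
&\ge \inf\{\tilde P\,\Gamma^t : \tilde P \in \mathcal{R}_{t,x_t}\} - W(t,x_t) \\
&= 0.
\end{align*}
Therefore $(\Gamma^s(t,W), \mathcal{F}_t, P)$ is a submartingale.

(c) Let $\nu(d\theta\,dz)$ be the distribution of $(\tau,x_\tau)$ under $P$.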
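Then
\begin{align*}
E^P W(\tau,x_\tau) &= \int W(\theta,z)\,\nu(d\theta\,dz) \\
&= \int J(\theta,H(\theta,z))\,\nu(d\theta\,dz) \\
&= E^P J(\tau,H(\tau,x_\tau)) \\
&= J(\tau, P \otimes_\tau H).
\end{align*}
Note that
\begin{align*}
E^P\Gamma^s(\tau,W) &= E^P\left\{\int_s^\tau f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^\tau c(\theta)\cdot dv_\theta\right\} + E^P W(\tau,x_\tau) \\
&= E^P\left\{\int_s^\tau f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^\tau c(\theta)\cdot dv_\theta\right\} + J(\tau, P \otimes_\tau H) \\
&= J(s, P \otimes_\tau H) \\
&\ge \inf\{J(s,Q) : Q \in \mathcal{R}_{s,x}\} \\
&= W(s,x).
\end{align*}
On the other hand, $E^P W(\tau,x_\tau) \le E^P J(\tau, \delta_{\omega'} \otimes_\tau P_{\tau,\omega}) = E^P J(\tau, P_{\tau,\omega}) = J(\tau,P)$, and therefore by (b),
\begin{align*}
W(s,x) \le E^P\Gamma^s(\tau,W) &= E^P\left\{\int_s^\tau f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^\tau c(\theta)\cdot dv_\theta\right\} + E^P W(\tau,x_\tau) \\
&\le E^P\left\{\int_s^\tau f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^\tau c(\theta)\cdot dv_\theta\right\} + J(\tau,P) \\
&= J(s,P), \tag{4.10}
\end{align*}
and thus (c) is proved if we take the infimum over $P \in \mathcal{R}_{s,x}$ on the right hand side.

(d) If $\Gamma^s(t,W)$ is a $P$-martingale, then
\[ W(s,x) = E^P\Gamma^s(s,W) = E^P\Gamma^s(T,W) = E^P\Gamma^s(T,0) = E^P\Gamma^s = J(s,P), \]
because from our assumptions, $W(T,\cdot) = 0$. So $P$ is optimal.

If we assume that $P \in \mathcal{R}_{s,x}$ is optimal, then by (4.10), Proposition 4.7 and Corollary 4.6,
\[ W(s,x) \le E^P\Gamma^s(t,W) \le E^P\Gamma^s = W(s,x). \tag{4.11} \]
Therefore $(\Gamma^s(t,W), \mathcal{F}_t, P)$ is a submartingale with constant mean value, so it is indeed a martingale. □

4.4 Markov property

For $(s,x) \in \Sigma$, recall that $\mathcal{R}^0_{s,x}$ denotes the collection of optimal control rules with initial condition $(s,x)$, i.e.,
\[ \mathcal{R}^0_{s,x} = \{P \in \mathcal{R}_{s,x} : J(s,P) = W(s,x)\}. \]
We have shown that $\mathcal{R}^0_{s,x}$ is nonempty, compact and convex. Similarly to Propositions 4.7 and 4.8 for $\mathcal{R}_{s,x}$, we state the following proposition.

Proposition 4.11 (a) $\mathcal{R}^0_{s,x}$ is closed under conditioning, i.e., if $P \in \mathcal{R}^0_{s,x}$ and $\tau$ is an $\mathcal{F}_t$-stopping time, $s \le \tau \le T$, then there exists a $P$-null set $N \in \mathcal{F}_\tau$ such that $\delta_{\omega'} \otimes_\tau P_{\tau,\omega} \in \mathcal{R}^0_{\tau(\omega),x_\tau(\omega)}$ for $\omega \notin N$.
(b) $\mathcal{R}^0_{s,x}$ is closed under concatenation, i.e., if $P \in \mathcal{R}^0_{s,x}$, then $P \otimes_\tau H \in \mathcal{R}^0_{s,x}$, where $H(\cdot,\cdot)$ is a measurable selector of $\mathcal{R}^0$.

Proof. (a) From Proposition 4.7, there exists a $P$-null set $N' \in \mathcal{F}_\tau$ such that for $\omega \notin N'$, $\delta_{\omega'} \otimes_\tau P_{\tau,\omega} \in \mathcal{R}_{\tau(\omega),x_\tau(\omega)}$ and therefore
\[ W(\tau,x_\tau) \le (\delta_{\omega'} \otimes_\tau P_{\tau,\omega})\,\Gamma^\tau, \quad \forall \omega \notin N'. \tag{4.12} \]
From our assumption $P \in \mathcal{R}^0_{s,x}$ and Theorem 4.10 (d), we know that $(\Gamma^s(t,W), \mathcal{F}_t, P)$ is a martingale. Thus
\[ P\,\Gamma^s(\tau,W) = P\,\Gamma^s(T,W). \tag{4.13} \]
By the definition of $\Gamma^\cdot(\cdot,\cdot)$, (4.13) is equivalent to
\[ P\,\Gamma^\tau = P\,W(\tau,x_\tau). \tag{4.14} \]
Let $N'' = \{\omega : W(\tau,x_\tau) < (\delta_{\omega'} \otimes_\tau P_{\tau,\omega})\,\Gamma^\tau\}$, then $N'' \in \mathcal{F}_\tau$.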
From (4.12) and (4.14), and the fact thatEF(]T FT) = EFTW we have P(N”) = 0. Let T = N’UN”, then P(N) = 0, and for w N,W(r,x) =F or ‘E(b) If P E R, then by definition and Proposition 4.10 (d),W(s, x) = J(s, F) = EF6 = Ef3(r,W)= E {f f(9, Xe, 10)de + jT c(O) . dv9+H(r, XT)[JTf( xo, e)dO + f c(O) . dv0] }E {f f(O, xo, to)dO + fT c(O)dvo + H(r, XT) [FT ° OT,W] }= P/r/H(f3)= J(s,P/r/H).Moreover, we know that P/T/H e R,r, and therefore P/r/H E R. UChapter 4. The Dynamic Programming Principle 57Let {?4} be a countable dense subset of Kf, 0 i k, and {g,} be a countable densesubset of Cb(1R°). So {(A0,. . ‘,nk,l 1} is a countable set. LetIc NI‘1V”n’be an enumeration of this set. Now for (s, x) e 111+ x d defineV1 —— san= arginfEP{Je_0gi(xo)dO+ ZJe0dv}aithen D D .... Exactly as we have shown for R, we can prove that for each n,is nonempty and compact. Let= ifl s,r,then R 0, and is compact. Further, we can state the following theorem;Lemma 4.12 (a) fl is stable under conditioning: if F C R and r is an Fe-stopping time,s < r <T, then there exists a F-null set N e F7 such that FE for w g N.(b) R.°X is stable under concatenation: if F e R, and H(.,.) is a measurable selector ofthen for an Fe-stopping time ‘r: s r T, we have F Ø H CFroof. The proof for Proposition 4.11 also works here without any change. UNow we can state the main theorem of this section. It ensures that there exists a measurableselection of whose marginal distribution on V V’[0, T] is a strong Markov family.Theorem 4.13 There exists afamily of control rules{5F} such that 5F C RT for(s, x) C ,and{3Fv} satisfies the strong Markov property on V, where 5Fjv is the marginal of 5F onV E V”[0,T].Proof. We first show that all elements in fl’° have the same marginal distribution on V.Assume F, Q e R’°, then for any integers n0, ., n,ç-, I,E{ jTe_?ioOgj(x8)d9+ Lfei0dv}Chapter 4. 
The Dynamic Programming Principle 58= E{ jTe_oOgj(xo)do+jTe_i0dv}.Through a limit procedure we can conclude thatE’ { jTe_O6g(xo)do + Zje_t0dv}= E{jTe_A9g(xe)d9+jTe_At9dv}for any A°, •, Ak C lR and g C Cb(JRd). By letting A1 —* cc, 1 k we haveE’{fTe_Aeg(xe)do}= E{jTe_Aeg(xs)do}for any A e 18+. From the uniqueness of Laplace transforms and the left continuity of thefunction g(x.),E”g(x9)= E’g(xe), VO 0 T. (4.15)It can be easily seen through a routine limit procedure that (4.15) is true for any boundedmeasurable function g.In order to show Fly = QIv we need only to show that Fly and QIv have the same finitedimensional distribution, i.e.,EPg1()2x).. gm(xt) =E9g1(x)g22’“gm(xt) (4.16)for any bounded measurable function g, m, g,, S t1 <t2 < <im T. By (4.15) weknow that (4.16) is true for m = 1. Suppose it is true for in. Let F and Q be the r.c.p.d. ofF and Q given = c(xt, 1 j in) respectively. We will show that there is a N e Q,F(N) = Q(N) = 0 such that for w N, öw’/tm/Fw and &u’/tm/Qw are in whereS = (Xt, 60,0) for w = (x., it., v.).For this purpose, let F1, Q be the r.c.p.d. of F and Q given .Fjm respectively. It is easy tocheck thatJ Fc(.)1(d&)Chapter 4. The Dynamic Programming Principle 59is a r.c.p.d. of P, and thus= fP&(.)F(d) (4.17)for not in a P-null set A E . Next, we can choose a P-null set N’ E F such thatF, P) is a martingale after time tm for w N’. Then there exists a P-null set B esuch that F(N’) = 0 for w B. Now from (4.17) we know that if w N0 = AUE , then(Mo, F, P) is a martingale after time tm for E C(1Rd). Similarly, there exists an M0 Esuch that Q) is a martingale after time tm when w M0. Note that by inductionhypothesis, P= Q on . Let No = N0UM,we will have N0 e g and P(No) = Q(No) = 0.Moreover, (Mtq,F-t) is a martingale after time tm with respect to F and Q for w N0.Similar to the proof of Proposition 4.11 we can show for each k, there exists a P-null setNk E such that &u’/tm/-w E for w Nk. 
Thus if we set N = N0 UN1then ic, F(N) = 0, and for L’ 1, 6w’/tm/Fw E We can have the sameconclusion for Q, i.e., there exists an M e such that Q(M) = 0, and if w M, thent5w’/tm/Qw e Let iV = MUN, then N E Q and P(N) = Q(N) = 0. For w N, wehave &,j/tm/Pw, 6wu/tm/Qw E tfll,tm Thus by (4.15),Egm+i(Xtm+i) = w’g (x ) ====for w N. Since P= Q on Q, we can concludeEg1(x)2..gm(xi1)= EQg1(x)g2and thus (4.16) is proved, i.e.,= QIv.Finally, let 3P be a measurable selector of 1 D ‘-+ ]M1(). We show that SPD satisfiesthe strong Markov property. Let r be a=u(x8 0 0 t) stopping time, .s r T, andChapter 4. The Dynamic Programming Principle 60F C (FjT, then(8F)7(F) = (8F) o O;,(F)= F(F).By Lemma 4.12 (a), there exists N C FT such that P(N) = 0, and Fe R. Therefore, forw g N, since all the elements in R°Z have the same marginal distribution on V, we havek iv = TPQJID.In other words,(SPX)T,W (F) = TPW(F). (4.18)Recall that ($Px)T,w is the regular conditional probability distribution of sr given F,., and thus(4.18) means that{5F,jV} is a strong Markovian family. ElChapter 5The Value Function5.1 IntroductionIt is well-known in the classical control theory that when the value function has sufficientregularity it satisfies the corresponding Hamilton-Jacobi-Bellman equation, cf. Fleming andRishel [22], Krylov [50], and Lions [55]. We will show that this is still true in the case ofsingular stochastic control problems, where the Hamilton-Jacobi-Bellman equation becomes avariational inequality. In Section 5.2 we prove that the value function is uniformly continuouswhen the coefficients are Lipschitz continuous . Note that the proof of this fact is nontrivialcompared with the same result in the classical control problems, due to the presence of thesingular control variable. Applying the dynamic programming principle we have established inChapter 4, we derive the Hamilton-Jacobi-Bellman equation for our control problem heuristically in Section 5.3. 
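Motivated by the work of Lions [55], we will prove that the value function is the unique viscosity solution of the corresponding Hamilton-Jacobi-Bellman equation.

5.2 Continuity of the value function

In the next two chapters, we add the following assumptions:
• $c(\cdot)$ is Lipschitz continuous, $f(\cdot,\cdot,\cdot)$ is bounded, and $g = (g_{ij})$ is a constant $d \times k$-matrix,
• $f(\cdot,\cdot,\cdot)$, $b(\cdot,\cdot,\cdot)$, $\sigma(\cdot,\cdot,\cdot)$ satisfy the following conditions:
\begin{align*}
|f(t,x,u) - f(s,y,u)| &\le C(|t-s| + \|x-y\|), \\
\|b(t,x,u) - b(s,y,u)\| &\le C(|t-s| + \|x-y\|), \tag{5.1} \\
\|\sigma(t,x,u) - \sigma(s,y,u)\| &\le C(|t-s| + \|x-y\|)
\end{align*}
uniformly for $0 \le s,t \le T$, $x,y \in \mathbb{R}^d$, $u \in U$.

We will prove that under these conditions, the value function $W(\cdot,\cdot)$ is uniformly continuous on $\Sigma$. In fact, there exists a constant $C > 0$ such that
\[ |W(t,x) - W(s,y)| \le C\left(|t-s|^{1/2} + \|x-y\|\right), \quad 0 \le s,t \le T,\ x,y \in \mathbb{R}^d. \]
Note that the constancy of $g$ is only required in the proof of Theorem 5.2.

Theorem 5.1 The value function $W(s,x)$ is uniformly Lipschitz continuous in the state variable $x$, i.e., there exists a constant $C > 0$ such that
\[ |W(s,x') - W(s,x)| \le C\|x'-x\|, \quad \forall\, 0 \le s \le T,\ x,x' \in \mathbb{R}^d. \tag{5.2} \]

Proof. In the following, we use the same notation $C$ to denote constants, which may change from line to line. For any $0 \le s \le T$, $x,x' \in \mathbb{R}^d$,
\[ W(s,x') - W(s,x) \le E^Q\Gamma^s - E^P\Gamma^s \]
for each $Q \in \mathcal{R}_{s,x'}$, $P \in \mathcal{R}^0_{s,x}$, where $\Gamma^s$ is the cost function defined by (3.11).

Take an arbitrary $P \in \mathcal{R}^0_{s,x}$. By the definition of control rules, there exists a standard extension $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathcal{F}}_t, \tilde P)$ of $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$, i.e., there exists another probability space $(\Omega', \mathcal{F}', \mathcal{F}'_t, P')$ such that $\tilde\Omega = \Omega \times \Omega'$, $\tilde{\mathcal{F}} = \mathcal{F} \times \mathcal{F}'$, $\tilde{\mathcal{F}}_t = \mathcal{F}_t \times \mathcal{F}'_t$, and $\tilde P = P \times P'$. We can extend the processes $x_\cdot, \mu_\cdot, v_\cdot$ to $\tilde\Omega$ as follows: for $\tilde\omega = (\omega,\omega') \in \tilde\Omega$,
\[ x_\cdot(\tilde\omega) = x_\cdot(\omega), \quad \mu_\cdot(\tilde\omega) = \mu_\cdot(\omega), \quad v_\cdot(\tilde\omega) = v_\cdot(\omega). \]
On $(\tilde\Omega, \tilde{\mathcal{F}}_t, \tilde P)$ there exists a standard $d$-dimensional Brownian motion $B_\cdot$ such that for $s \le t \le T$,
\[ x_t = x + \int_s^t b(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^t \sigma(\theta,x_\theta,\mu_\theta)\,dB_\theta + g v_t \quad \text{a.s.} \tag{5.3} \]
Consider the same equation (5.3) with the initial state $x'$, i.e.,
\[ y_t = x' + \int_s^t b(\theta,y_\theta,\mu_\theta)\,d\theta + \int_s^t \sigma(\theta,y_\theta,\mu_\theta)\,dB_\theta + g v_t \tag{5.4} \]
on the stochastic basis $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathcal{F}}_t, \tilde P)$. The strong solution of (5.4) exists from the assumptions on $b(\cdot,\cdot,\cdot)$ and $\sigma(\cdot,\cdot,\cdot)$, and so the control $\alpha$ consisting of this stochastic basis, the processes $(y_t,\mu_t,v_t)$ and the initial condition $(s,x')$ belongs to $\mathcal{A}_{s,x'}$. Therefore there exists a control rule $Q \in \mathcal{R}_{s,x'}$ such that
\[ J(\alpha) = J(s,Q) = E^Q\Gamma^s. \tag{5.5} \]
By definition,
\begin{align*}
E^P\Gamma^s &= E^P\left\{\int_s^T f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^T c(\theta)\cdot dv_\theta\right\} \\
&= E^{\tilde P}\left\{\int_s^T f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^T c(\theta)\cdot dv_\theta\right\},
\end{align*}
and by (5.5),
\[ E^Q\Gamma^s = E^{\tilde P}\left\{\int_s^T f(\theta,y_\theta,\mu_\theta)\,d\theta + \int_s^T c(\theta)\cdot dv_\theta\right\}; \]
therefore from the Lipschitz continuity of $f$ we have
\begin{align*}
E^Q\Gamma^s - E^P\Gamma^s &\le E^{\tilde P}\left\{\int_s^T |f(\theta,y_\theta,\mu_\theta) - f(\theta,x_\theta,\mu_\theta)|\,d\theta\right\} \\
&\le C\,E^{\tilde P}\int_s^T \|y_\theta - x_\theta\|\,d\theta \\
&\le C\left(\int_s^T E^{\tilde P}\|y_\theta - x_\theta\|^2\,d\theta\right)^{1/2}.
\end{align*}
Now from the equations (5.3), (5.4) and the Lipschitz continuity conditions on $b$, $\sigma$, we have for $s \le \theta \le T$,
\begin{align*}
E^{\tilde P}\|y_\theta - x_\theta\|^2 &\le C\|x'-x\|^2 + C\,E^{\tilde P}\left(\int_s^\theta \|b(h,y_h,\mu_h) - b(h,x_h,\mu_h)\|\,dh\right)^2 \\
&\quad + C\,E^{\tilde P}\sup_{s \le \theta' \le \theta}\left\|\int_s^{\theta'} \big(\sigma(h,y_h,\mu_h) - \sigma(h,x_h,\mu_h)\big)\,dB_h\right\|^2 \\
&\le C\|x'-x\|^2 + C\,E^{\tilde P}\int_s^\theta \|b(h,y_h,\mu_h) - b(h,x_h,\mu_h)\|^2\,dh \\
&\quad + C\,E^{\tilde P}\int_s^\theta \|\sigma(h,y_h,\mu_h) - \sigma(h,x_h,\mu_h)\|^2\,dh \\
&\le C\left(\|x'-x\|^2 + \int_s^\theta E^{\tilde P}\|y_h - x_h\|^2\,dh\right).
\end{align*}
We have used the Burkholder-Davis-Gundy inequality to get the second inequality. By Gronwall's inequality,
\[ E^{\tilde P}\|y_\theta - x_\theta\|^2 \le C\|x'-x\|^2 e^{C(\theta-s)} \le C\|x'-x\|^2. \]
Hence we have
\[ E^Q\Gamma^s - E^P\Gamma^s \le C\|x'-x\|, \]
and therefore $W(s,x') - W(s,x) \le C\|x'-x\|$. The proof of the theorem is thus complete since $x, x' \in \mathbb{R}^d$ are arbitrary. □

Next we consider the continuity of the value function in the time variable $t$.

Theorem 5.2 The value function $W(t,x)$ is uniformly continuous in the time variable $t$. In fact, there exists a constant $C > 0$ such that
\[ |W(s,x) - W(s',x)| \le C|s-s'|^{1/2} \tag{5.6} \]
for all $0 \le s,s' \le T$, $x \in \mathbb{R}^d$.

Proof. As in the proof of Theorem 5.1, we use the same notation $C$ to denote constants. First we assume $s \le s'$, so that
\[ W(s',x) - W(s,x) \le \sup_{P \in \mathcal{R}_{s,x}} \left(E^Q\Gamma^{s'} - E^P\Gamma^s\right) \tag{5.7} \]
for each $Q \in \mathcal{R}_{s',x}$. From the strict positivity of $c(\cdot)$, we may actually take the supremum in (5.7) over a subset $\mathcal{R}_{s,x}(A)$ for some $A > 0$, where
\[ \mathcal{R}_{s,x}(A) = \{P \in \mathcal{R}_{s,x} : E^P\|v(T)\| \le A\}. \tag{5.8} \]
For $P \in \mathcal{R}_{s,x}(A)$, as in the proof of Theorem 5.1, there exists a standard extension $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathcal{F}}_t, \tilde P)$ of $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ such that $x(\cdot)$ is a solution to
\[ x_t = x + \int_s^t b(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^t \sigma(\theta,x_\theta,\mu_\theta)\,dB_\theta + g v_t \tag{5.9} \]
on the stochastic basis $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathcal{F}}_t, \tilde P)$, where $B_\cdot$ is a $d$-dimensional Brownian motion. Define $\mathcal{G}_t = \tilde{\mathcal{F}}_{t-s'+s}$, $\tilde\mu_t = \mu_{t-s'+s}$, $\tilde v_t = v_{t-s'+s}$ and $\tilde B_t = B_{t-s'+s}$ for $t \ge s'$. Consider the following stochastic differential equation
\[ y_t = x + \int_{s'}^t b(\theta,y_\theta,\tilde\mu_\theta)\,d\theta + \int_{s'}^t \sigma(\theta,y_\theta,\tilde\mu_\theta)\,d\tilde B_\theta + g\tilde v_t, \quad t \ge s', \tag{5.10} \]
on the stochastic basis $(\tilde\Omega, \tilde{\mathcal{F}}, \mathcal{G}_t, \tilde P)$. We know that under assumption (5.1), there exists a unique strong solution $y_\cdot$, and by definition the control $\alpha$ consisting of this stochastic basis, the processes $(y_t,\tilde\mu_t,\tilde v_t)$ and the initial condition $(s',x)$ belongs to $\mathcal{A}_{s',x}$. Therefore there exists a control rule $Q \in \mathcal{R}_{s',x}$ such that
\[ J(\alpha) = J(s',Q) = E^Q\Gamma^{s'}. \tag{5.11} \]
Recall that $\Gamma$ is defined by (3.11). Thus by definition
\[ E^P\Gamma^s = E^{\tilde P}\left\{\int_s^T f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_s^T c(\theta)\cdot dv_\theta\right\}, \]
and by (5.11),
\begin{align*}
E^Q\Gamma^{s'} &= E^{\tilde P}\left\{\int_{s'}^T f(\theta,y_\theta,\tilde\mu_\theta)\,d\theta + \int_{s'}^T c(\theta)\cdot d\tilde v_\theta\right\} \\
&= E^{\tilde P}\left\{\int_s^{T-s'+s} f(\theta+s'-s,\, y_{\theta+s'-s},\, \mu_\theta)\,d\theta + \int_s^{T-s'+s} c(\theta+s'-s)\cdot dv_\theta\right\}.
\end{align*}
Therefore, noticing that $f$ is bounded below by a constant $-K$ and $P \in \mathcal{R}_{s,x}(A)$, we get
\begin{align*}
E^Q\Gamma^{s'} - E^P\Gamma^s &\le E^{\tilde P}\left\{\int_s^{T-(s'-s)} |f(\theta,x_\theta,\mu_\theta) - f(\theta+s'-s,\, y_{\theta+s'-s},\, \mu_\theta)|\,d\theta \right. \\
&\qquad\quad \left. + \int_s^{T-(s'-s)} \|c(\theta) - c(\theta+s'-s)\|\,d\|v_\theta\| + K|s'-s|\right\} \\
&\le E^{\tilde P}\left\{\int_s^{T-(s'-s)} |f(\theta,x_\theta,\mu_\theta) - f(\theta+s'-s,\, y_{\theta+s'-s},\, \mu_\theta)|\,d\theta\right\} + C|s'-s|\,E^P\|v(T)\| + K|s'-s| \\
&\le E^{\tilde P}\left\{\int_s^{T-(s'-s)} |f(\theta,x_\theta,\mu_\theta) - f(\theta+s'-s,\, y_{\theta+s'-s},\, \mu_\theta)|\,d\theta\right\} + C|s'-s|.
\end{align*}
From the Lipschitz continuity of the function $f$, we have
\[ |f(\theta,x_\theta,\mu_\theta) - f(\theta+s'-s,\, y_{\theta+s'-s},\, \mu_\theta)| \le C\left(|s'-s| + \|x_\theta - y_{\theta+s'-s}\|\right). \]
Therefore
\begin{align*}
E^Q\Gamma^{s'} - E^P\Gamma^s &\le C\left(|s'-s| + E^{\tilde P}\int_s^{T-(s'-s)} \|x_\theta - y_{\theta+s'-s}\|\,d\theta\right) \tag{5.12} \\
&\le C\left\{|s'-s| + \left(\int_s^{T-(s'-s)} E^{\tilde P}\|x_\theta - y_{\theta+s'-s}\|^2\,d\theta\right)^{1/2}\right\}.
\end{align*}
By (5.10), we have for $\theta \ge s$,
\begin{align*}
y_{\theta+s'-s} &= x + \int_{s'}^{\theta+s'-s} b(h,y_h,\tilde\mu_h)\,dh + \int_{s'}^{\theta+s'-s} \sigma(h,y_h,\tilde\mu_h)\,d\tilde B_h + g\tilde v_{\theta+s'-s} \tag{5.13} \\
&= x + \int_s^\theta b(h+s'-s,\, y_{h+s'-s},\, \mu_h)\,dh + \int_s^\theta \sigma(h+s'-s,\, y_{h+s'-s},\, \mu_h)\,dB_h + g v_\theta.
\end{align*}
So from (5.9), (5.13) and the Burkholder-Davis-Gundy inequality we have for $\theta \ge s$,
\begin{align*}
E^{\tilde P}\|x_\theta - y_{\theta+s'-s}\|^2 &\le 2E^{\tilde P}\left(\int_s^\theta \|b(h,x_h,\mu_h) - b(h+s'-s,\, y_{h+s'-s},\, \mu_h)\|\,dh\right)^2 \\
&\quad + 2E^{\tilde P}\sup_{s \le \theta' \le \theta}\left\|\int_s^{\theta'} \big(\sigma(h,x_h,\mu_h) - \sigma(h+s'-s,\, y_{h+s'-s},\, \mu_h)\big)\,dB_h\right\|^2 \\
&\le 2T\,E^{\tilde P}\int_s^\theta \|b(h,x_h,\mu_h) - b(h+s'-s,\, y_{h+s'-s},\, \mu_h)\|^2\,dh \\
&\quad + C\,E^{\tilde P}\int_s^\theta \|\sigma(h,x_h,\mu_h) - \sigma(h+s'-s,\, y_{h+s'-s},\, \mu_h)\|^2\,dh \\
&\le C\,E^{\tilde P}\int_s^\theta \left(|s'-s|^2 + \|x_h - y_{h+s'-s}\|^2\right)dh \\
&\le C\left(|s'-s|^2 + \int_s^\theta E^{\tilde P}\|x_h - y_{h+s'-s}\|^2\,dh\right).
\end{align*}
Gronwall's inequality implies
\[ E^{\tilde P}\|x_\theta - y_{\theta+s'-s}\|^2 \le C|s'-s|^2 e^{C(\theta-s)} \le C|s'-s|^2. \]
Hence from (5.12) we have
\[ E^Q\Gamma^{s'} - E^P\Gamma^s \le C|s'-s|, \]
and therefore $W(s',x) - W(s,x) \le C|s'-s|$ for $s' > s$.

Now we assume $s' < s$. By the dynamic programming principle (cf. Theorem 4.10),
\[ W(s',x) = \inf_{P \in \mathcal{R}_{s',x}} E^P\left\{\int_{s'}^s f(\theta,x_\theta,\mu_\theta)\,d\theta + \int_{s'}^s c(\theta)\cdot dv_\theta + W(s,x_s)\right\}. \]
Take $P^0 \in \mathcal{R}_{s',x}$ such that $P^0(\mu_\theta = \delta_{u^0},\ v_\theta = 0,\ 0 \le \theta \le T) = 1$ for some arbitrary but fixed $u^0 \in U$. Then
\begin{align*}
W(s',x) &\le E^{P^0}\left\{\int_{s'}^s f(\theta,x_\theta,u^0)\,d\theta + W(s,x_s)\right\} \\
&\le C|s'-s| + E^{P^0}W(s,x_s),
\end{align*}
by the boundedness of $f$, and by Theorem 5.1, there exists a constant $C$ such that
\[ W(s,x_s) \le W(s,x) + C\|x_s - x\|. \]
Hence
\begin{align*}
W(s',x) - W(s,x) &\le C\left(|s'-s| + E^{P^0}\|x_s - x\|\right) \tag{5.14} \\
&\le C\left(|s'-s| + \left(E^{P^0}\|x_s - x\|^2\right)^{1/2}\right).
\end{align*}
From the definition of control rules, we know that under $P^0$,
\[ x_s - x = \int_{s'}^s b(\theta,x_\theta,u^0)\,d\theta + M_s, \tag{5.15} \]
where $M$ is a continuous square integrable martingale with
\[ \langle M \rangle_s = \int_{s'}^s \mathrm{tr}\big(a(\theta,x_\theta,u^0)\big)\,d\theta. \]
Therefore by (5.15) and the Burkholder-Davis-Gundy inequality, we have
\[ E^{P^0}\|x_s - x\|^2 \le C\left(|s'-s|^2 + |s'-s|\right). \tag{5.16} \]
Combining (5.14) and (5.16) we have
\[ W(s',x) - W(s,x) \le C|s'-s|^{1/2}, \]
where we have used the inequality $\sqrt{a+b} \le \sqrt{a} + \sqrt{b}$ for $a,b \ge 0$, and the fact that $0 \le s,s' \le T$ with $T$ finite. The theorem is thus proved. □

Now combining Theorems 5.1 and 5.2 we can state the main result of this section.

Theorem 5.3 The value function $W$ is uniformly continuous on $\Sigma$. Moreover, there exists a constant $C > 0$ such that
\[ |W(t,x) - W(s,y)| \le C\left(|t-s|^{1/2} + \|x-y\|\right), \quad 0 \le s,t \le T,\ x,y \in \mathbb{R}^d. \]

5.3 The dynamic programming equation

Before we derive the dynamic programming equation heuristically, we prove a result which shows that there exists a set such that the optimal state process is continuous when it is in this set.

Theorem 5.4 (a) Assume $(t,x) \in \Sigma$, then
\[ W(t,x) \le W(t,x+gh) + c(t)\cdot h \tag{5.17} \]
for each $h \in \mathbb{R}^k_+$. Moreover, if equality holds for some $h = (h^i) \in \mathbb{R}^k_+$, then the same equality holds when we replace $h$ by $\tilde h$ with $\tilde h = (\tilde h^i) \in \mathbb{R}^k_+$, $\tilde h^i \le h^i$ ($1 \le i \le k$).
(b) Define, for $0 \le t \le T$,
\[ A_t = \{x : W(t,x) < W(t,x+gh) + c(t)\cdot h,\ \forall h \in \mathbb{R}^k_+,\ h \neq 0\}. \tag{5.18} \]
Then the optimal state process $x_t$ is continuous when it is in $A_t$. To be precise, we have
\[ P(\Delta x_t \neq 0,\ x_t \in A_t) = 0, \quad s < t < T, \tag{5.19} \]
for every $P \in \mathcal{R}^0_{s,x}$, $(s,x) \in \Sigma$.

Proof.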
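(a) If (5.17) fails for some $h\in\mathbb R^k_+$, then
\[ W(t,x)>W(t,x+gh)+c(t)\cdot h. \tag{5.20} \]
Take $P\in\mathcal R^{o}_{t,x+gh}$. We define $\Theta:\Omega\to\Omega$ by
\[ \Theta(\omega)_s=\begin{cases}(x_s-gh,\ \mu_s,\ 0), & 0\le s\le t,\\ (x_s,\ \mu_s,\ v_s+h), & t<s\le T,\end{cases} \]
for $\omega=(x_\cdot,\mu_\cdot,v_\cdot)$, and let $\bar P=P\circ\Theta^{-1}(\cdot)$. As in the proof of Lemma 4.4, we can show that $\bar P\in\mathcal R_{t,x}$. From the definition of $\bar P$ we have
\[ \begin{aligned}
J(t,\bar P)&=E^{\bar P}\Big\{\int_t^T f(\theta,x_\theta,\mu_\theta)\,d\theta+\int_{[t,T)}c(\theta)\cdot dv_\theta\Big\}\\
&=E^{P}\Big\{\int_t^T f(\theta,x_\theta,\mu_\theta)\,d\theta+\int_{[t,T)}c(\theta)\cdot dv_\theta+c(t)\cdot h\Big\}\\
&=J(t,P)+c(t)\cdot h\\
&=W(t,x+gh)+c(t)\cdot h.
\end{aligned} \]
Therefore $J(t,\bar P)<W(t,x)$ from (5.20), a contradiction.

Next, if (5.17) holds as an equality for $h$, then for $\bar h$ with $\bar h_i\le h_i$, $1\le i\le k$,
\[ \begin{aligned}
W(t,x)-c(t)\cdot h&=W(t,x+gh)\\
&\ge W(t,x+g\bar h)-c(t)\cdot(h-\bar h)\\
&\ge W(t,x)-c(t)\cdot\bar h-c(t)\cdot(h-\bar h)\\
&=W(t,x)-c(t)\cdot h.
\end{aligned} \]
Therefore
\[ W(t,x)=W(t,x+g\bar h)+c(t)\cdot\bar h. \]

(b) For $P\in\mathcal R^{o}_{s,x}$ we know that $P$-a.s.,
\[ x_t=x+\int_s^t b(\theta,x_\theta,\mu_\theta)\,d\theta+gv_t+\text{a continuous local martingale}. \]
Hence $\Delta x_t=g\Delta v_t$, and $x_{t+}=x_t+\Delta x_t=x_t+g\Delta v_t$. Since $W(\cdot,\cdot)$ is continuous on $\Sigma$, for $s\le t<T$,
\[ W(t,x_{t+})=\lim_{t'\downarrow t}W(t',x_{t'}). \tag{5.21} \]
Assume $P\in\mathcal R^{o}_{s,x}$; then by the dynamic programming principle (cf. Theorem 4.10) we know that $(\Gamma_s(t,W),\mathcal F_t,P)$ is a martingale. Hence for $t'>t$,
\[ W(s,x)=E^P\Big\{\int_s^{t'}f(\theta,x_\theta,\mu_\theta)\,d\theta+\int_{[s,t')}c(\theta)\cdot dv_\theta+W(t',x_{t'})\Big\}. \tag{5.22} \]
Let $t'\downarrow t$; note that from our assumption we know that $c(\cdot)$ is continuous on $[0,T]$. Therefore
\[ \int_s^{t'}f(\theta,x_\theta,\mu_\theta)\,d\theta\to\int_s^{t}f(\theta,x_\theta,\mu_\theta)\,d\theta,\qquad
\int_{[s,t')}c(\theta)\cdot dv_\theta\to\int_{[s,t)}c(\theta)\cdot dv_\theta+c(t)\cdot\Delta v_t. \]
Thus if (5.19) fails, then (5.21) and (5.22) imply
\[ \begin{aligned}
W(s,x)&=E^P\Big\{\int_s^{t}f(\theta,x_\theta,\mu_\theta)\,d\theta+\int_{[s,t)}c(\theta)\cdot dv_\theta+c(t)\cdot\Delta v_t+W(t,x_t+g\Delta v_t)\Big\}\\
&>E^P\Big\{\int_s^{t}f(\theta,x_\theta,\mu_\theta)\,d\theta+\int_{[s,t)}c(\theta)\cdot dv_\theta+W(t,x_t)\Big\}\\
&=E^P\Gamma_s(t,W),
\end{aligned} \]
which contradicts the fact $P\in\mathcal R^{o}_{s,x}$. The inequality follows from the facts that
\[ W(t,x_t)\le W(t,x_t+g\Delta v_t)+c(t)\cdot\Delta v_t \]
and that the strict inequality holds if $x_t\in A_t$ and $\Delta x_t\ne0$. □

Remark 5.5 From Theorem 5.4 (a) we can see that if the value function $W\in C^{1,2}(\Sigma)$, then
\[ (g^*\nabla_xW(t,x))_i+c_i(t)\ge0,\qquad i=1,2,\ldots,k, \]
where $*$ means transpose and $(\cdot)_i$ denotes the $i$-th coordinate of the point in $\mathbb R^k$. For $x\notin A_t$ there exists $h=(h_i)\in\mathbb R^k_+$ such that for $\bar h=(\bar h_i)\in\mathbb R^k_+$, $\bar h_i\le h_i$, $1\le i\le k$,
\[ W(t,x)=W(t,x+g\bar h)+c(t)\cdot\bar h. \]
Therefore we have
\[ (g^*\nabla_xW(t,x))_i+c_i(t)=0 \]
for those $i$ such that $h_i>0$. □

5.3.1 Heuristic derivation of the dynamic programming equation

Recall Ito's formula: for $\phi\in C^{1,2}(\Sigma)$, $(s,x)\in\Sigma$, $t\ge s$,
\[ \begin{aligned}
\phi(t,x_t)=\phi(s,x)&+\int_s^t\Big(\frac{\partial}{\partial\theta}+\mathcal L\Big)\phi(\theta,x_\theta,\mu_\theta)\,d\theta
+\int_s^t\nabla_x\phi(\theta,x_\theta)\cdot\sigma(\theta,x_\theta,\mu_\theta)\,dB_\theta\\
&+\int_{[s,t)}\nabla_x\phi(\theta,x_\theta)\cdot g\,dv_\theta
+\sum_{s\le\theta<t}\big[\phi(\theta,x_{\theta+})-\phi(\theta,x_\theta)-\nabla_x\phi(\theta,x_\theta)\cdot\Delta x_\theta\big].
\end{aligned} \tag{5.23} \]
Let
\[ L=\frac{\partial}{\partial t}+\mathcal L; \]
then for $P\in\mathcal R_{s,x}$, (5.23) may be written as
\[ E^P\phi(t,x_t)=\phi(s,x)+E^P\Big\{\int_s^t L\phi(\theta,x_\theta,\mu_\theta)\,d\theta+\int_{[s,t)}\nabla_x\phi(\theta,x_\theta)\cdot g\,dv_\theta
+\sum_{s\le\theta<t}\big[\phi(\theta,x_{\theta+})-\phi(\theta,x_\theta)-\nabla_x\phi(\theta,x_\theta)\cdot\Delta x_\theta\big]\Big\}. \]
By the dynamic programming principle (cf. Theorem 4.10),
\[ W(s,x)=\inf_{P\in\mathcal R_{s,x}}E^P\Big\{\int_s^t f(\theta,x_\theta,\mu_\theta)\,d\theta+\int_{[s,t)}c(\theta)\cdot dv_\theta+W(t,x_t)\Big\}, \]
so if we assume $W\in C^{1,2}(\Sigma)$, then
\[ 0=\inf_{P\in\mathcal R_{s,x}}E^P\Big\{\int_s^t(LW+f)(\theta,x_\theta,\mu_\theta)\,d\theta+\int_{[s,t)}\big(c(\theta)+g^*\nabla_xW(\theta,x_\theta)\big)\cdot dv_\theta
+\sum_{s\le\theta<t}\big[W(\theta,x_{\theta+})-W(\theta,x_\theta)-\nabla_xW(\theta,x_\theta)\cdot\Delta x_\theta\big]\Big\}. \tag{5.24} \]
If we take the infimum over all those $P\in\mathcal R_{s,x}$ such that $P(v_t=0,\ 0\le t\le T)=1$, then from (5.24) we get
\[ \inf_P E^P\int_s^t(LW+f)(\theta,x_\theta,\mu_\theta)\,d\theta\ge0. \]
Letting $t\downarrow s$ we have
\[ \inf_P E^P(LW+f)(s,x_s,\mu_s)\ge0. \]
Moreover, from Remark 5.5, we can conclude that on $\Sigma$
\[ (g^*\nabla_xW)_i+c_i\ge0,\qquad i=1,2,\ldots,k. \tag{5.25} \]
Therefore we can expect that $W$ satisfies formally the following variational inequality, or Hamilton-Jacobi-Bellman equation,
\[ \min\Big\{\inf_{u\in U}(LW+f)(t,x,u),\ (g^*\nabla_xW(t,x))_i+c_i(t),\ i=1,2,\ldots,k\Big\}=0 \tag{5.26} \]
on $\Sigma$. For simplicity of notation, we write (5.26) as
\[ \min\Big\{\inf_{u\in U}(LW+f),\ g^*\nabla_xW+c\Big\}=0. \]

5.3.2 Viscosity solution

As is well known in classical control problems, the value function is a solution to the corresponding Hamilton-Jacobi-Bellman equation when it has sufficient regularity (cf. Fleming and Rishel [22], Krylov [50]). If it is only known that the value function is continuous, it is observed by Lions [55] that the value function is a solution to the Hamilton-Jacobi-Bellman equation in the viscosity sense, which we will define in the following.

Definition 5.6 A function $\psi$ is a viscosity solution of (5.26) if $\psi\in C(\Sigma)$ and for every $\phi\in C^{1,2}(\Sigma)$:

1. for each local maximum point $(t_0,x_0)$ of $\psi-\phi$ in the interior of $\Sigma$, we have
\[ \min\Big\{\inf_{u\in U}(L\phi+f),\ g^*\nabla_x\phi+c\Big\}\ge0 \tag{5.27} \]
at $(t_0,x_0)$, i.e., $\psi$ is a subsolution;

2. for each local minimum point $(t_0,x_0)$ of $\psi-\phi$ in the interior of $\Sigma$, we have
\[ \min\Big\{\inf_{u\in U}(L\phi+f),\ g^*\nabla_x\phi+c\Big\}\le0 \tag{5.28} \]
at $(t_0,x_0)$, i.e., $\psi$ is a supersolution.

For an introduction to viscosity solutions and their applications to stochastic optimal control problems, see Fleming and Soner [23]. For an extensive bibliography on viscosity solutions, cf. Crandall, Ishii and Lions [15].

Theorem 5.7 The value function $W(\cdot,\cdot)$ is a viscosity solution of (5.26).

Proof. By Theorem 5.3, we know $W\in C(\Sigma)$. We first show that $W$ is a subsolution. For $\phi\in C^{1,2}(\Sigma)$, if $(t_0,x_0)\in\operatorname{int}(\Sigma)$ is a local maximum point of $W-\phi$, then there is a neighborhood $O^1(t_0,x_0)$ of $(t_0,x_0)$ in $\Sigma$ such that
\[ W(t,x)-\phi(t,x)\le W(t_0,x_0)-\phi(t_0,x_0),\qquad (t,x)\in\bar O^1(t_0,x_0), \]
or
\[ W(t,x)-W(t_0,x_0)\le\phi(t,x)-\phi(t_0,x_0),\qquad (t,x)\in\bar O^1(t_0,x_0), \tag{5.29} \]
where $\bar O^1(t_0,x_0)$ denotes the closure of $O^1(t_0,x_0)$. If (5.27) fails, then one of the following will be true:
\[ \inf_{u\in U}(L\phi+f)(t_0,x_0,u)<0, \tag{5.30} \]
\[ (g^*\nabla_x\phi(t_0,x_0))_i+c_i(t_0)<0\quad\text{for some }1\le i\le k. \tag{5.31} \]
If (5.30) is true, then from the assumption (5.1) and $\phi\in C^{1,2}(\Sigma)$, we can find $u^0\in U$ and a neighborhood $O^2(t_0,x_0)$ of $(t_0,x_0)$ such that
\[ (L\phi+f)(t,x,u^0)<0 \]
for $(t,x)\in O^2(t_0,x_0)$. Take $P\in\mathcal R_{t_0,x_0}$ such that
\[ P(\mu_t=\delta_{\{u^0\}},\ v_t=0,\ 0\le t\le T)=1. \tag{5.32} \]
The existence of such a $P$ is obvious. Define
\[ \tau=\inf\{t>t_0:\ (t,x_t)\notin O(t_0,x_0)\}, \]
where $O(t_0,x_0)=O^1(t_0,x_0)\cap O^2(t_0,x_0)$. Since the state process $x_\cdot$ is continuous a.s. $(P)$, we can see immediately that
\[ P(\tau>t_0)=1, \]
and for $t_0\le\theta<\tau$,
\[ (L\phi+f)(\theta,x_\theta,\mu_\theta)<0,\qquad\text{a.s. }(P). \]
Therefore,
\[ E^P\int_{t_0}^{\tau}(L\phi+f)(\theta,x_\theta,\mu_\theta)\,d\theta<0. \]
By the definition of control rules,
\[ \phi(\tau,x_\tau)=\phi(t_0,x_0)+\int_{t_0}^{\tau}L\phi(\theta,x_\theta,\mu_\theta)\,d\theta+M_\tau\phi,\qquad\text{a.s. }(P), \]
where $M\phi$ is a continuous square integrable martingale with respect to $P$. Note that Stroock and Varadhan [73], Theorem 4.2.1 allows us to replace $\mathcal L$ by $L$ in (2.9) at least when $v=0$.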
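Hence
\[ E^P\phi(\tau,x_\tau)-\phi(t_0,x_0)=E^P\int_{t_0}^{\tau}L\phi(\theta,x_\theta,\mu_\theta)\,d\theta. \]
Noticing that $(\tau,x_\tau)\in\bar O(t_0,x_0)$, we have
\[ \begin{aligned}
E^PW(\tau,x_\tau)-W(t_0,x_0)&\le E^P\phi(\tau,x_\tau)-\phi(t_0,x_0)\\
&=E^P\int_{t_0}^{\tau}L\phi(\theta,x_\theta,\mu_\theta)\,d\theta\\
&<-E^P\int_{t_0}^{\tau}f(\theta,x_\theta,\mu_\theta)\,d\theta,
\end{aligned} \]
which, by (5.32), can be rewritten as
\[ W(t_0,x_0)>E^P\Big\{\int_{t_0}^{\tau}f(\theta,x_\theta,\mu_\theta)\,d\theta+W(\tau,x_\tau)\Big\}=E^P\Gamma_{t_0}(\tau,W). \]
This contradicts the dynamic programming principle, cf. (4.9).

Next, if (5.31) holds at $(t_0,x_0)$ for some $i$, then we can take $h_i>0$ small enough such that
\[ \phi(t_0,x_0+g_ih_i)-\phi(t_0,x_0)<-c_i(t_0)h_i, \]
where $g_i$ denotes the $i$-th column of the $d\times k$ matrix $g$. Therefore by (5.29) we have
\[ W(t_0,x_0+g_ih_i)-W(t_0,x_0)<-c_i(t_0)h_i, \]
and therefore
\[ W(t_0,x_0)>W(t_0,x_0+gh)+c(t_0)\cdot h, \]
where $h=(0,\ldots,h_i,\ldots,0)$. This is a contradiction to Theorem 5.4, and thus we have shown that $W$ is a subsolution to (5.26).

Now we show that $W$ is also a supersolution of (5.26). If $\phi\in C^{1,2}(\Sigma)$ is such that $W-\phi$ has a local minimum point at $(t_0,x_0)\in\operatorname{int}(\Sigma)$, then there exists a neighborhood $O^1(t_0,x_0)$ of $(t_0,x_0)$ satisfying
\[ W(t,x)-W(t_0,x_0)\ge\phi(t,x)-\phi(t_0,x_0),\qquad (t,x)\in O^1(t_0,x_0). \tag{5.33} \]
If (5.28) fails, then
\[ \inf_{u\in U}(L\phi+f)>0,\qquad (g^*\nabla_x\phi)_i+c_i>0 \]
at $(t_0,x_0)$ for $i=1,\ldots,k$. From the assumption (5.1) and the fact $\phi\in C^{1,2}(\Sigma)$, we can find a neighborhood $O^2(t_0,x_0)$ of $(t_0,x_0)$ such that for some $\varepsilon>0$,
\[ \inf_{u\in U}(L\phi+f)>\varepsilon,\qquad (g^*\nabla_x\phi)_i+c_i>\varepsilon \]
on $O^2(t_0,x_0)$ for $i=1,\ldots,k$. Let $O(t_0,x_0)=O^1(t_0,x_0)\cap O^2(t_0,x_0)$; then for $(t,x)\in O(t_0,x_0)$, we have for small $h\in\mathbb R^k_+$, $h\ne0$,
\[ \phi(t,x+gh)-\phi(t,x)>-c(t)\cdot h. \]
Therefore by (5.33),
\[ W(t,x+gh)-W(t,x)>-c(t)\cdot h, \]
or $x\in A_t$. Hence for $P\in\mathcal R^{o}_{t_0,x_0}$,
\[ P(x_{t_0+}=x_{t_0})=1 \tag{5.34} \]
by Theorem 5.4. Define
\[ \tau=\inf\{t\ge t_0:\ (t,x_t)\notin O(t_0,x_0)\}; \]
then from (5.34) we see that $P(\tau>t_0)=1$ for $P\in\mathcal R^{o}_{t_0,x_0}$, and it can be seen that
\[ (t,x_t)\in O(t_0,x_0),\quad x_t\in A_t,\qquad t_0\le t\le\tau. \]
Therefore we have
\[ E^P\Big\{\int_{t_0}^{\tau}(L\phi+f)(\theta,x_\theta,\mu_\theta)\,d\theta+\int_{[t_0,\tau)}\big(g^*\nabla_x\phi(\theta,x_\theta)+c(\theta)\big)\cdot dv_\theta\Big\}\ge\varepsilon E^P(\tau-t_0). \]
Applying Ito's formula, and noticing that the state process $x_\cdot$ is continuous a.s. $(P)$ when $t_0\le t\le\tau$, we have
\[ E^P\phi(\tau,x_\tau)=\phi(t_0,x_0)+E^P\Big\{\int_{t_0}^{\tau}L\phi(\theta,x_\theta,\mu_\theta)\,d\theta+\int_{[t_0,\tau)}\nabla_x\phi(\theta,x_\theta)\cdot g\,dv_\theta\Big\}, \]
which may be rewritten as
\[ E^P[\phi(\tau,x_\tau)-\phi(t_0,x_0)]\ge E^P\Big\{-\int_{t_0}^{\tau}f(\theta,x_\theta,\mu_\theta)\,d\theta-\int_{[t_0,\tau)}c(\theta)\cdot dv_\theta\Big\}+\varepsilon E^P(\tau-t_0). \]
By (5.33) and the fact $P(\tau>t_0)>0$ we have
\[ E^P[W(\tau,x_\tau)-W(t_0,x_0)]>E^P\Big\{-\int_{t_0}^{\tau}f(\theta,x_\theta,\mu_\theta)\,d\theta-\int_{[t_0,\tau)}c(\theta)\cdot dv_\theta\Big\}, \]
or
\[ W(t_0,x_0)<E^P\Big\{\int_{t_0}^{\tau}f(\theta,x_\theta,\mu_\theta)\,d\theta+\int_{[t_0,\tau)}c(\theta)\cdot dv_\theta+W(\tau,x_\tau)\Big\}, \]
which contradicts the dynamic programming principle.

The proof of this theorem is therefore complete. □

5.4 The uniqueness of viscosity solution to the HJB equation

We now consider the uniqueness of the viscosity solution to the dynamic programming equation
\[ \min\Big\{\inf_{u\in U}(LW+f),\ g^*\nabla_xW+c\Big\}=0 \tag{5.35} \]
with the boundary condition
\[ W(T,x)=g(x), \tag{5.36} \]
where $g$ is a bounded Lipschitz continuous function on $\mathbb R^d$.

The equation (5.35) may be rewritten as
\[ \min\Big\{\frac{\partial W}{\partial t}+H(t,x,D_xW,D_x^2W),\ g^*D_xW+c\Big\}=0, \tag{5.37} \]
where
\[ H(t,x,p,A)=\inf_{u\in U}\Big\{\tfrac12\operatorname{tr}\big(\sigma(t,x,u)\sigma(t,x,u)^*A\big)+b(t,x,u)\cdot p+f(t,x,u)\Big\} \]
for $(t,x)\in\Sigma$, $p\in\mathbb R^d$, and $A\in\mathcal S^{d\times d}$. Note that in (5.37), $D_xW$ and $D_x^2W$ denote the gradient and Hessian of the function $W$ respectively.

Let us introduce some notation: if $w:\Sigma\to\mathbb R$, then we define $\mathcal P^{2,+}w(s,z)$, $(s,z)\in(0,T)\times\mathbb R^d$, to be the collection of all the points $(c,p,X)\in\mathbb R\times\mathbb R^d\times\mathcal S^{d\times d}$ such that
\[ w(t,x)\le w(s,z)+c(t-s)+\langle p,x-z\rangle+\tfrac12\langle X(x-z),x-z\rangle+o\big(|t-s|+\|x-z\|^2\big). \]
With the above notation, for $(s,z)\in(0,T)\times\mathbb R^d$, we set
\[ \bar{\mathcal P}^{2,+}w(s,z)=\big\{(c,p,X):\ \exists\,(s_n,z_n)\in(0,T)\times\mathbb R^d,\ (c_n,p_n,X_n)\in\mathcal P^{2,+}w(s_n,z_n),\ (s_n,z_n)\to(s,z),\ (c_n,p_n,X_n)\to(c,p,X)\big\}, \]
and
\[ \mathcal P^{2,-}w(s,z)=-\mathcal P^{2,+}(-w)(s,z),\qquad \bar{\mathcal P}^{2,-}w=-\bar{\mathcal P}^{2,+}(-w). \]
Obviously we have for any constant $\lambda>0$
\[ \lambda\mathcal P^{2,+}w=\mathcal P^{2,+}(\lambda w),\qquad \lambda\bar{\mathcal P}^{2,+}w=\bar{\mathcal P}^{2,+}(\lambda w). \]
The following lemma can be proved easily by the method in Crandall, Ishii, and Lions [15].

Lemma 5.8 A continuous function $W$ defined on $\Sigma$ is a viscosity subsolution (supersolution, respectively) of (5.37) if and only if for all $(t,x)\in(0,T)\times\mathbb R^d$,
\[ \min\{c+H(t,x,p,X),\ g^*p+c(t)\}\ge0,\qquad\forall(c,p,X)\in\bar{\mathcal P}^{2,+}W(t,x) \]
($\le0$, $\forall(c,p,X)\in\bar{\mathcal P}^{2,-}W(t,x)$, respectively). □

We write down the next lemma which is crucial for the proof of the main theorem of this section.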
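This is a special case of Theorem 8.3 in Crandall, Ishii, and Lions [15].

Lemma 5.9 Let $V_1$ and $V_2$ be continuous functions defined on $(0,T)\times\mathbb R^d$, and $\phi\in C^{1,2}((0,T)\times\mathbb R^d\times\mathbb R^d)$. Suppose that $\bar t\in(0,T)$, $\bar x_1,\bar x_2\in\mathbb R^d$, and
\[ w(t,x_1,x_2)\equiv V_1(t,x_1)+V_2(t,x_2)-\phi(t,x_1,x_2)\le w(\bar t,\bar x_1,\bar x_2) \]
for $0<t<T$ and $x_1,x_2\in\mathbb R^d$, i.e., $w(\cdot,\cdot,\cdot)$ attains its maximum at $(\bar t,\bar x_1,\bar x_2)$. Assume, moreover, that there exists an $r>0$ such that for every $M>0$ there exists a constant $C$ satisfying
\[ c_i\le C\ \text{ whenever }\ (c_i,q_i,X_i)\in\mathcal P^{2,+}V_i(t,x_i),\ i=1,2,\quad |t-\bar t|+\|x_i-\bar x_i\|\le r,\quad |V_i(t,x_i)|+\|q_i\|+\|X_i\|\le M. \tag{5.38} \]
Then for each $\varepsilon>0$ there are $c_i\in\mathbb R$, $X_i\in\mathcal S^{d\times d}$ such that

(i) $(c_i,\ D_{x_i}\phi(\bar t,\bar x_1,\bar x_2),\ X_i)\in\bar{\mathcal P}^{2,+}V_i(\bar t,\bar x_i)$ for $i=1,2$,

(ii) $-\Big(\dfrac1\varepsilon+\|A\|\Big)I\le\begin{pmatrix}X_1&0\\0&X_2\end{pmatrix}\le A+\varepsilon A^2$,

(iii) $c_1+c_2=\dfrac{\partial\phi}{\partial t}(\bar t,\bar x_1,\bar x_2)$,

where $A=D_x^2\phi(\bar t,\bar x_1,\bar x_2)$.

Observe that the condition (5.38) is automatically satisfied if the $V_i$ are subsolutions of (5.37).

Let us define the function space
\[ C_{b,\mathrm{Lip}}(\Sigma)\equiv\big\{W(\cdot,\cdot):\ W\in C(\Sigma;\mathbb R),\ \text{with } W \text{ bounded, and } |W(t,x)-W(t,y)|\le C\|x-y\|\ \text{for some } C\ge0\big\}. \]
Now we state the main theorem of this section.

Theorem 5.10 The dynamic programming equation (5.35) has at most one viscosity solution in the space $C_{b,\mathrm{Lip}}(\Sigma)$ satisfying the boundary condition (5.36).

Proof. Let $W,V\in C_{b,\mathrm{Lip}}(\Sigma)$ be two viscosity solutions of the equation (5.35) with the boundary condition (5.36). We will first show that $W\le V$ on $\Sigma$.

For any given $\varepsilon$ with $0<\varepsilon<1$, define $\Sigma^\varepsilon\equiv(\varepsilon,T]\times\mathbb R^d$, and for $(t,x)\in\Sigma^\varepsilon$,
\[ W^\varepsilon(t,x)\equiv(1-\varepsilon)W(t,x)-\frac{\varepsilon}{t-\varepsilon}; \tag{5.39} \]
then
\[ \frac{d}{dt}\Big(-\frac{\varepsilon}{t-\varepsilon}\Big)=\frac{\varepsilon}{(t-\varepsilon)^2}>0. \]
Therefore, by Lemma 5.8, $W^\varepsilon$ is a viscosity subsolution to the dynamic programming equation.

For any given $\alpha,\beta>0$, $0<\varepsilon<1$, define an auxiliary function
\[ \Phi(t,x,y)\equiv W^\varepsilon(t,x)-V(t,y)-\frac1{2\alpha}\|x-y\|^2+\beta(t-T). \]
It is easy to see that the function $\Phi$ is bounded above on $(\varepsilon,T]\times\mathbb R^d\times\mathbb R^d$. There exists $(t_\alpha,x_\alpha,y_\alpha)\in(\varepsilon,T]\times\mathbb R^d\times\mathbb R^d$ such that
\[ \Phi(t_\alpha,x_\alpha,y_\alpha)>\sup\Phi-\alpha. \tag{5.40} \]
Define
\[ \Phi_\alpha(t,x,y)=\Phi(t,x,y)-\frac\alpha2\big[|t-t_\alpha|^2+\|x-x_\alpha\|^2+\|y-y_\alpha\|^2\big]. \]
It is obvious that the maximum of the function $\Phi_\alpha$ is attained on $(\varepsilon,T]\times\mathbb R^d\times\mathbb R^d$ at some point $(\bar t,\bar x,\bar y)$, which depends implicitly on $\alpha,\beta,\varepsilon$, and
\[ |\bar t-t_\alpha|^2+\|\bar x-x_\alpha\|^2+\|\bar y-y_\alpha\|^2\le2. \tag{5.41} \]
We first show that for any given $\varepsilon>0$, when $\alpha$ is small,
\[ \|\bar x-\bar y\|^2=o(\alpha). \tag{5.42} \]
Since $\Phi_\alpha$ attains its maximum on $(\varepsilon,T]\times\mathbb R^d\times\mathbb R^d$ at $(\bar t,\bar x,\bar y)$, we have
\[ \Phi_\alpha(\bar t,\bar x,\bar x)+\Phi_\alpha(\bar t,\bar y,\bar y)\le2\Phi_\alpha(\bar t,\bar x,\bar y). \tag{5.43} \]
Recall that from the definition of $\Phi_\alpha$, we have
\[ \Phi_\alpha(\bar t,\bar x,\bar x)=W^\varepsilon(\bar t,\bar x)-V(\bar t,\bar x)+\beta(\bar t-T)-\frac\alpha2\big[|\bar t-t_\alpha|^2+\|\bar x-x_\alpha\|^2+\|\bar x-y_\alpha\|^2\big], \tag{5.44} \]
\[ \Phi_\alpha(\bar t,\bar y,\bar y)=W^\varepsilon(\bar t,\bar y)-V(\bar t,\bar y)+\beta(\bar t-T)-\frac\alpha2\big[|\bar t-t_\alpha|^2+\|\bar y-x_\alpha\|^2+\|\bar y-y_\alpha\|^2\big], \tag{5.45} \]
\[ \Phi_\alpha(\bar t,\bar x,\bar y)=W^\varepsilon(\bar t,\bar x)-V(\bar t,\bar y)-\frac1{2\alpha}\|\bar x-\bar y\|^2+\beta(\bar t-T)-\frac\alpha2\big[|\bar t-t_\alpha|^2+\|\bar x-x_\alpha\|^2+\|\bar y-y_\alpha\|^2\big]. \tag{5.46} \]
Using (5.41) and the fact
\[ \|\bar x-y_\alpha\|^2\le2\|\bar x-\bar y\|^2+2\|\bar y-y_\alpha\|^2,\qquad \|\bar y-x_\alpha\|^2\le2\|\bar x-\bar y\|^2+2\|\bar x-x_\alpha\|^2, \]
we can get
\[ \|\bar x-y_\alpha\|^2+\|\bar y-x_\alpha\|^2\le4\|\bar x-\bar y\|^2+4. \tag{5.47} \]
Now (5.43), (5.44), (5.45), (5.46) and (5.47) lead to
\[ \begin{aligned}
\frac1\alpha\|\bar x-\bar y\|^2&\le(1-\varepsilon)\big(W(\bar t,\bar x)-W(\bar t,\bar y)\big)+V(\bar t,\bar x)-V(\bar t,\bar y)+\frac\alpha2\big[\|\bar x-y_\alpha\|^2+\|\bar y-x_\alpha\|^2\big]\\
&\le C\|\bar x-\bar y\|+2\alpha\|\bar x-\bar y\|^2+2\alpha,
\end{aligned} \]
which implies (5.42).

Now let us assume the maximum of the function $\Phi_\alpha$ is always attained at the point $(T,\bar x,\bar y)$, i.e., $\bar t=T$. Then
\[ \Phi_\alpha(t,x,y)\le\Phi_\alpha(T,\bar x,\bar y),\qquad\forall\,\varepsilon<t\le T,\ x,y\in\mathbb R^d. \]
Note that the boundary condition (5.36) implies that $W(T,\cdot)=g(\cdot)$, $V(T,\cdot)=g(\cdot)$. Recall that $g$ is bounded and Lipschitz continuous on $\mathbb R^d$, and by (5.42) we have $\lim_{\alpha\to0}\|\bar x-\bar y\|=0$. Thus
\[ \begin{aligned}
\limsup_{\alpha\to0}\big(W^\varepsilon(T,\bar x)-V(T,\bar y)\big)&\le\limsup_{\alpha\to0}\big((1-\varepsilon)W(T,\bar x)-V(T,\bar y)\big)\\
&\le(1-\varepsilon)\limsup_{\alpha\to0}\big(g(\bar x)-g(\bar y)\big)-\varepsilon\liminf_{\alpha\to0}g(\bar y)\\
&\le C\varepsilon,
\end{aligned} \]
and therefore for some constant $C$,
\[ \begin{aligned}
W^\varepsilon(t,x)-V(t,x)+\beta(t-T)&=\Phi(t,x,x)\\
&\le\sup_{\varepsilon<t\le T,\ x,y\in\mathbb R^d}\Phi(t,x,y)=\lim_{\alpha\to0}\Phi(t_\alpha,x_\alpha,y_\alpha)\\
&\le\limsup_{\alpha\to0}\Phi_\alpha(T,\bar x,\bar y)\\
&\le\limsup_{\alpha\to0}\big(W^\varepsilon(T,\bar x)-V(T,\bar y)\big)\\
&\le C\varepsilon,
\end{aligned} \]
where we have used the fact (5.40). Recall that $W^\varepsilon$ is defined by (5.39), so we have
\[ \begin{aligned}
W(t,x)-V(t,x)&=\frac1{1-\varepsilon}W^\varepsilon(t,x)-V(t,x)+\frac{\varepsilon}{(1-\varepsilon)(t-\varepsilon)}\\
&=W^\varepsilon(t,x)-V(t,x)+\frac{\varepsilon}{1-\varepsilon}W^\varepsilon(t,x)+\frac{\varepsilon}{(1-\varepsilon)(t-\varepsilon)}\\
&\le C\varepsilon+\frac{C'\varepsilon}{1-\varepsilon}+\frac{\varepsilon}{(1-\varepsilon)(t-\varepsilon)}+\beta(T-t),
\end{aligned} \tag{5.48} \]
where $C'$ is a constant. Since $\varepsilon,\beta>0$ are arbitrary, we can let $\varepsilon,\beta\to0$ to get
\[ W(t,x)\le V(t,x) \tag{5.49} \]
for any $(t,x)\in(0,T]\times\mathbb R^d$, and from the continuity of the functions $W$ and $V$ we can conclude that (5.49) is true on $\Sigma$.

Interchanging the order of the functions and repeating the proof, we have
\[ V(t,x)\le W(t,x) \]
on $\Sigma$, and therefore $W=V$ on $\Sigma$.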
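Therefore the proof of the theorem will be complete if we can show that $\bar t=T$ when $\varepsilon,\alpha$ are small and $\beta>0$.

If $\bar t<T$, we can apply Lemma 5.9 with $V_1=W^\varepsilon$, $V_2=-V$, and
\[ \phi(t,x,y)=\frac1{2\alpha}\|x-y\|^2+\frac\alpha2\big[|t-t_\alpha|^2+\|x-x_\alpha\|^2+\|y-y_\alpha\|^2\big]-\beta(t-T) \]
to get $c_i\in\mathbb R$, $X_i\in\mathcal S^{d\times d}$, $i=1,2$, such that Lemma 5.9 (i), (ii), and (iii) are satisfied. From the definition of $\phi$ we have
\[ \frac{\partial\phi}{\partial t}(t,x,y)=\alpha(t-t_\alpha)-\beta, \]
\[ D_x\phi(t,x,y)=\frac1\alpha(x-y)+\alpha(x-x_\alpha),\qquad D_y\phi(t,x,y)=-\frac1\alpha(x-y)+\alpha(y-y_\alpha), \]
\[ D_x^2\phi(\bar t,\bar x,\bar y)=\Big(\frac1\alpha+\alpha\Big)I,\qquad D_{xy}^2\phi(\bar t,\bar x,\bar y)=-\frac1\alpha I,\qquad D_y^2\phi(\bar t,\bar x,\bar y)=\Big(\frac1\alpha+\alpha\Big)I, \]
where $I$ is the $d\times d$ unit matrix. Therefore we can write
\[ A=\begin{pmatrix}\big(\frac1\alpha+\alpha\big)I&-\frac1\alpha I\\[2pt]-\frac1\alpha I&\big(\frac1\alpha+\alpha\big)I\end{pmatrix}, \tag{5.50} \]
\[ c_1+c_2=-\beta+\alpha(\bar t-t_\alpha). \tag{5.51} \]
Notice that Lemma 5.9 (i) can be rewritten as
\[ (c_1,\ D_x\phi(\bar t,\bar x,\bar y),\ X_1)\in\bar{\mathcal P}^{2,+}W^\varepsilon(\bar t,\bar x) \]
and
\[ (-c_2,\ -D_y\phi(\bar t,\bar x,\bar y),\ -X_2)\in\bar{\mathcal P}^{2,-}V(\bar t,\bar y). \]
From the definition of viscosity solution, we have
\[ \min\Big(\frac{c_1}{1-\varepsilon}+H\Big(\bar t,\bar x,\frac{D_x\phi(\bar t,\bar x,\bar y)}{1-\varepsilon},\frac{X_1}{1-\varepsilon}\Big),\ \frac1{1-\varepsilon}g^*D_x\phi(\bar t,\bar x,\bar y)+c(\bar t)\Big)\ge0, \tag{5.52} \]
\[ \min\Big(-c_2+H\big(\bar t,\bar y,-D_y\phi(\bar t,\bar x,\bar y),-X_2\big),\ -g^*D_y\phi(\bar t,\bar x,\bar y)+c(\bar t)\Big)\le0. \tag{5.53} \]
The inequality (5.52) may be rewritten as
\[ \frac{c_1}{1-\varepsilon}+H\Big(\bar t,\bar x,\frac{D_x\phi(\bar t,\bar x,\bar y)}{1-\varepsilon},\frac{X_1}{1-\varepsilon}\Big)\ge0, \tag{5.54} \]
\[ \frac1{1-\varepsilon}g^*D_x\phi(\bar t,\bar x,\bar y)+c(\bar t)\ge0, \tag{5.55} \]
where in (5.55) the inequality holds componentwise, and we use the same convention in what follows. Now we show that
\[ -g^*D_y\phi(\bar t,\bar x,\bar y)+c(\bar t)>0. \tag{5.56} \]
If (5.56) is not true, then for some $1\le i\le k$,
\[ -\big(g^*D_y\phi(\bar t,\bar x,\bar y)\big)_i+c_i(\bar t)\le0, \tag{5.57} \]
where $x_i$ denotes the $i$th coordinate of $x$. Now multiply the $i$th inequality in (5.55) by $1-\varepsilon$, then subtract (5.57) to get
\[ \alpha\big(g^*(\bar x-x_\alpha+\bar y-y_\alpha)\big)_i\ge\varepsilon c_i(\bar t), \]
which is a contradiction, by (5.41) and the fact $c_i(t)\ge c_0>0$, when we make $\alpha$ small.

So the inequality (5.53) is equivalent to
\[ -c_2+H\big(\bar t,\bar y,-D_y\phi(\bar t,\bar x,\bar y),-X_2\big)\le0 \tag{5.58} \]
when $\alpha>0$ is small. The inequality (5.54) can be rewritten as
\[ c_1+(1-\varepsilon)H\Big(\bar t,\bar x,\frac{D_x\phi(\bar t,\bar x,\bar y)}{1-\varepsilon},\frac{X_1}{1-\varepsilon}\Big)\ge0. \tag{5.59} \]
Now we combine (5.58) and (5.59) to get
\[ c_1+c_2+(1-\varepsilon)H\Big(\bar t,\bar x,\frac{D_x\phi(\bar t,\bar x,\bar y)}{1-\varepsilon},\frac{X_1}{1-\varepsilon}\Big)-H\big(\bar t,\bar y,-D_y\phi(\bar t,\bar x,\bar y),-X_2\big)\ge0. \tag{5.60} \]
Notice that $c_1+c_2=\frac{\partial\phi}{\partial t}(\bar t,\bar x,\bar y)=-\beta+\alpha(\bar t-t_\alpha)$; therefore (5.60) can be written as
\[ \begin{aligned}
\beta-\alpha(\bar t-t_\alpha)&\le(1-\varepsilon)H\Big(\bar t,\bar x,\frac{D_x\phi(\bar t,\bar x,\bar y)}{1-\varepsilon},\frac{X_1}{1-\varepsilon}\Big)-H\big(\bar t,\bar y,-D_y\phi(\bar t,\bar x,\bar y),-X_2\big)\\
&=(1-\varepsilon)\inf_{u\in U}\Big\{\tfrac12\operatorname{tr}\Big(\sigma(\bar t,\bar x,u)\sigma(\bar t,\bar x,u)^*\frac{X_1}{1-\varepsilon}\Big)+\Big\langle\frac{D_x\phi(\bar t,\bar x,\bar y)}{1-\varepsilon},b(\bar t,\bar x,u)\Big\rangle+f(\bar t,\bar x,u)\Big\}\\
&\quad-\inf_{u\in U}\Big\{-\tfrac12\operatorname{tr}\big(\sigma(\bar t,\bar y,u)\sigma(\bar t,\bar y,u)^*X_2\big)+\big\langle-D_y\phi(\bar t,\bar x,\bar y),b(\bar t,\bar y,u)\big\rangle+f(\bar t,\bar y,u)\Big\}\\
&=\inf_{u\in U}\Big\{\tfrac12\operatorname{tr}\big(\sigma(\bar t,\bar x,u)\sigma(\bar t,\bar x,u)^*X_1\big)+\big\langle D_x\phi(\bar t,\bar x,\bar y),b(\bar t,\bar x,u)\big\rangle+(1-\varepsilon)f(\bar t,\bar x,u)\Big\}\\
&\quad+\sup_{u\in U}\Big\{\tfrac12\operatorname{tr}\big(\sigma(\bar t,\bar y,u)\sigma(\bar t,\bar y,u)^*X_2\big)+\big\langle D_y\phi(\bar t,\bar x,\bar y),b(\bar t,\bar y,u)\big\rangle-f(\bar t,\bar y,u)\Big\}\\
&\le\sup_{u\in U}\Big\{\tfrac12\operatorname{tr}\big(\sigma(\bar t,\bar x,u)\sigma(\bar t,\bar x,u)^*X_1\big)+\tfrac12\operatorname{tr}\big(\sigma(\bar t,\bar y,u)\sigma(\bar t,\bar y,u)^*X_2\big)\\
&\qquad\qquad+\big\langle D_x\phi(\bar t,\bar x,\bar y),b(\bar t,\bar x,u)\big\rangle+\big\langle D_y\phi(\bar t,\bar x,\bar y),b(\bar t,\bar y,u)\big\rangle\\
&\qquad\qquad+(1-\varepsilon)f(\bar t,\bar x,u)-f(\bar t,\bar y,u)\Big\}\\
&=\sup_{u\in U}\{\mathrm I+\mathrm{II}+\mathrm{III}\},
\end{aligned} \tag{5.61} \]
where I, II and III denote the trace terms, the drift terms and the running-cost terms respectively.

Now we break the rest of the proof into two steps.

Step 1. We first estimate the terms on the right hand side of (5.61). Recall that the matrix $A$ in Lemma 5.9 is given by (5.50), which can be written as
\[ A=\frac1\alpha\begin{pmatrix}I&-I\\-I&I\end{pmatrix}+\alpha\begin{pmatrix}I&0\\0&I\end{pmatrix}. \]
Therefore we have
\[ A+\alpha A^2=\Big(\frac3\alpha+2\alpha\Big)\begin{pmatrix}I&-I\\-I&I\end{pmatrix}+(\alpha+\alpha^3)\begin{pmatrix}I&0\\0&I\end{pmatrix}. \]
Now by Lemma 5.9 (ii), with the parameter $\varepsilon$ of that lemma taken equal to $\alpha$, and the assumptions that $\sigma(t,x,u)$ is bounded and Lipschitz continuous in $(t,x)$, we have (with $\sigma\equiv\sigma(\bar t,\bar x,u)$, $\hat\sigma\equiv\sigma(\bar t,\bar y,u)$)
\[ \begin{aligned}
\operatorname{tr}\big(\sigma(\bar t,\bar x,u)\sigma(\bar t,\bar x,u)^*X_1\big)+\operatorname{tr}\big(\sigma(\bar t,\bar y,u)\sigma(\bar t,\bar y,u)^*X_2\big)
&=\operatorname{tr}\left(\begin{pmatrix}\sigma\sigma^*&\sigma\hat\sigma^*\\\hat\sigma\sigma^*&\hat\sigma\hat\sigma^*\end{pmatrix}\begin{pmatrix}X_1&0\\0&X_2\end{pmatrix}\right)\\
&\le\Big(\frac3\alpha+2\alpha\Big)\operatorname{tr}\left(\begin{pmatrix}\sigma\sigma^*&\sigma\hat\sigma^*\\\hat\sigma\sigma^*&\hat\sigma\hat\sigma^*\end{pmatrix}\begin{pmatrix}I&-I\\-I&I\end{pmatrix}\right)
+(\alpha+\alpha^3)\operatorname{tr}\left(\begin{pmatrix}\sigma\sigma^*&\sigma\hat\sigma^*\\\hat\sigma\sigma^*&\hat\sigma\hat\sigma^*\end{pmatrix}\begin{pmatrix}I&0\\0&I\end{pmatrix}\right)\\
&=\Big(\frac3\alpha+2\alpha\Big)\operatorname{tr}\big(\sigma\sigma^*-\sigma\hat\sigma^*-\hat\sigma\sigma^*+\hat\sigma\hat\sigma^*\big)+(\alpha+\alpha^3)\operatorname{tr}\big(\sigma\sigma^*+\hat\sigma\hat\sigma^*\big)\\
&=\Big(\frac3\alpha+2\alpha\Big)\operatorname{tr}\big((\sigma-\hat\sigma)(\sigma-\hat\sigma)^*\big)+(\alpha+\alpha^3)\operatorname{tr}\big(\sigma\sigma^*+\hat\sigma\hat\sigma^*\big)\\
&\le\Big(\frac3\alpha+2\alpha\Big)\|\sigma-\hat\sigma\|^2+C(\alpha+\alpha^3)\\
&\le C^2\Big(\frac3\alpha+2\alpha\Big)\|\bar x-\bar y\|^2+C(\alpha+\alpha^3)\\
&=o(1)
\end{aligned} \tag{5.62} \]
when $\alpha$ is small, where $C$ is a constant that changes from line to line. We have used (5.42) and the assumption that $\sigma(t,x,u)$ is bounded and Lipschitz continuous in $x$. Now let us look at the second term on the right hand side of (5.61). We have
\[ \begin{aligned}
\big\langle D_x\phi(\bar t,\bar x,\bar y),b(\bar t,\bar x,u)\big\rangle+\big\langle D_y\phi(\bar t,\bar x,\bar y),b(\bar t,\bar y,u)\big\rangle
&=\Big\langle\frac1\alpha(\bar x-\bar y),b(\bar t,\bar x,u)\Big\rangle+\alpha\big\langle\bar x-x_\alpha,b(\bar t,\bar x,u)\big\rangle\\
&\quad-\Big\langle\frac1\alpha(\bar x-\bar y),b(\bar t,\bar y,u)\Big\rangle+\alpha\big\langle\bar y-y_\alpha,b(\bar t,\bar y,u)\big\rangle\\
&\le\frac1\alpha\big\langle\bar x-\bar y,\ b(\bar t,\bar x,u)-b(\bar t,\bar y,u)\big\rangle+C\alpha\big[\|\bar x-x_\alpha\|+\|\bar y-y_\alpha\|\big]\\
&\le\frac1\alpha\|\bar x-\bar y\|\,\|b(\bar t,\bar x,u)-b(\bar t,\bar y,u)\|+C\alpha\\
&\le\frac C\alpha\|\bar x-\bar y\|^2+C\alpha\\
&=o(1)
\end{aligned} \tag{5.63} \]
when $\alpha$ is small, where we have used (5.41), (5.42) and the fact that $b(t,x,u)$ is bounded and Lipschitz continuous in $x$.

The third term on the right hand side of (5.61) can be bounded by
\[ \begin{aligned}
|(1-\varepsilon)f(\bar t,\bar x,u)-f(\bar t,\bar y,u)|&\le(1-\varepsilon)|f(\bar t,\bar x,u)-f(\bar t,\bar y,u)|+\varepsilon|f(\bar t,\bar y,u)|\\
&\le(1-\varepsilon)C\|\bar x-\bar y\|+C\varepsilon\\
&=o(\sqrt\alpha)+C\varepsilon,
\end{aligned} \tag{5.64} \]
since $f(t,x,u)$ is bounded and Lipschitz continuous in $x$.

Step 2. Now from (5.61), (5.62), (5.63) and (5.64) we have
\[ \beta\le o(1)+C\varepsilon \]
when $\alpha$ is small. By letting $\alpha,\varepsilon\to0$ we get
\[ \beta\le0, \]
which is a contradiction. Therefore we have proved that $\bar t=T$ when $\varepsilon,\alpha$ are small.

The proof of the theorem is thus complete. □

Corollary 5.11 There exists a unique viscosity solution in $C_{b,\mathrm{Lip}}(\Sigma)$ to the dynamic programming equation (5.35) with the boundary condition $g=0$, which can be identified as the value function.

Proof. This is a consequence of Theorem 5.3, Theorem 5.7 and Theorem 5.10. □

Remark 5.12 By Theorem 5.7 we may characterize the value function as the unique viscosity solution of (5.35). □

Remark 5.13 The proof of Theorem 5.10 is based on a modification of the methods used in Fleming and Soner [23], Chapter 2 and 4, where the HJB equations for the classical control problems are considered. The assumption that $b$, $\sigma$, $f$ are bounded may be removed by modifying the proof given in Ishii [36]. □

Chapter 6

The Continuation Region

As the examples in Section 2.2 have shown, there exists a region, called the continuation region or inaction region, such that if the optimal state process starts from outside this region, then the singular control is applied to bring it to the region immediately and to keep it inside this region from then on.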
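The behaviour just described can be illustrated with a toy one-dimensional sketch. It assumes, purely for illustration, that the continuation region is a fixed interval `[lo, hi]` and that the uncontrolled increments of the state are given numbers; neither assumption comes from the thesis, and the name `push_into_interval` is hypothetical. The control first produces a jump that brings the state to the interval and afterwards adds only the minimal pushes needed at the boundary, so that it never acts while the state is in the interior.

```python
def push_into_interval(x0, increments, lo, hi):
    """Keep a 1-D path inside [lo, hi] with minimal pushes (toy sketch).

    Returns (states, up, down): states[0] is the state right after the
    possible initial jump, and up/down are the cumulative upward and
    downward pushes, i.e. two nondecreasing control components.
    """
    def project(z):
        # Nearest point of [lo, hi] to z.
        return min(max(z, lo), hi)

    x = project(x0)                      # initial jump, if x0 is outside
    up = [max(x - x0, 0.0)]
    down = [max(x0 - x, 0.0)]
    states = [x]
    for dz in increments:
        free = x + dz                    # where the uncontrolled state would go
        x = project(free)                # minimal push back into the interval
        up.append(up[-1] + max(x - free, 0.0))
        down.append(down[-1] + max(free - x, 0.0))
        states.append(x)
    return states, up, down


states, up, down = push_into_interval(
    x0=2.5, increments=[0.25, -0.5, -1.0, -0.75, 0.5], lo=0.0, hi=1.0)
print(states)             # controlled path, always inside [0, 1]
print(up[-1], down[-1])   # total upward and downward control used
```

In this sketch the controlled state equals the initial state plus the accumulated increments plus `up` minus `down`, which mimics a state equation with $k=2$ and $g=(1,-1)$; the general problem treated in this chapter does not assume that the region is an interval or that it is constant in time.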
In the interior of the region the optimal singular control should not produce any jumps.

Recall from Theorem 5.4 that for $(t,x)\in\Sigma$,
\[ W(t,x)\le W(t,x+gh)+c(t)\cdot h,\qquad\forall h\in\mathbb R^k_+. \]
We define the continuation region of the problem as
\[ A\equiv\{(t,x)\in\Sigma:\ W(t,x)<W(t,x+gh)+c(t)\cdot h,\ \forall h\in\mathbb R^k_+,\ h\ne0\}. \]
It is obvious that
\[ A=\bigcup_{0\le t\le T}\{t\}\times A_t, \]
where $A_t$ is defined by (5.18), i.e.,
\[ A_t=\{x:\ W(t,x)<W(t,x+gh)+c(t)\cdot h,\ \forall h\in\mathbb R^k_+,\ h\ne0\}. \]
We have shown in Theorem 5.4 that the optimal state process $x_t$ is continuous when it is in $A_t$, i.e., for $P\in\mathcal R^{o}_{s,x}$, $s\le t\le T$,
\[ P(\Delta x_t\ne0,\ x_t\in A_t)=0. \]
In this chapter, we will show that there exists an optimal control rule $P\in\mathcal R^{o}_{s,x}$ such that if $(s,x)\notin\bar A$, then the control brings the state $(t,x_t)$ immediately to $\bar A$ and keeps $(t,x_t)$ inside $\bar A$ from then on. If $(s,x)\in\bar A$, the optimal state process will be kept inside $\bar A$. We first give a series of lemmas.

Lemma 6.1 There exists a function $h(\cdot,\cdot):\Sigma\to\mathbb R^k_+$ on $\Sigma$ such that $h(t,\cdot)$ is Borel measurable for each $t$ fixed. Moreover for each $(t,x)\in\Sigma$,
\[ W(t,x)=W(t,x+gh(t,x))+c(t)\cdot h(t,x), \tag{6.1} \]
and for $P\in\mathcal R^{o}_{t,x+gh(t,x)}$,
\[ P(\|\Delta x_t\|>0)=0, \tag{6.2} \]
or equivalently,
\[ P(x_{t+}=x+gh(t,x))=1. \]

Proof. For $(t,x)\in\Sigma$, let $h_1^*$ be the maximum $h_1\ge0$ such that $h=(h_1,0,\ldots,0)$ satisfies
\[ W(t,x)=W(t,x+gh)+c(t)\cdot h. \tag{6.3} \]
Let $h_2^*$ be the maximum $h_2\ge0$ such that (6.3) holds for $h=(h_1^*,h_2,0,\ldots,0)$. Continuing this procedure we get an $h^*=(h_1^*,\ldots,h_k^*)$. The existence of $h_i^*$ follows easily from the Lipschitz continuity of $W(t,\cdot)$ and the fact that $W(t,\cdot)$ is bounded from below and $c(t)>0$ componentwise. Let $h(t,x)=h^*$; the Borel measurability of $h(t,\cdot)$ for $t$ fixed is proved in Appendix A.

Now we prove (6.2) for $P\in\mathcal R^{o}_{t,x+gh(t,x)}$. To simplify our notation, let $x'=x+gh(t,x)$. Then $\forall h'\in\mathbb R^k_+$, $h'\ne0$,
\[ W(t,x')<W(t,x'+gh')+c(t)\cdot h', \tag{6.4} \]
i.e., $x'\in A_t$. Otherwise (6.4) would be an equality, and
\[ \begin{aligned}
W(t,x)&=W(t,x')+c(t)\cdot h(t,x)\\
&=W(t,x'+gh')+c(t)\cdot(h(t,x)+h')\\
&=W\big(t,x+g(h(t,x)+h')\big)+c(t)\cdot(h(t,x)+h').
\end{aligned} \]
This contradicts the maximality by which we construct $h(t,x)$. Hence $x'\in A_t$ and the result now follows from Theorem 5.4. □

Lemma 6.2 For each fixed $t\in[0,T]$, there exists a measurable selector $\Pi_t(\cdot):\mathbb R^d\to\mathbb M(\Omega)$ such that $\Pi_t(x)\in\mathcal R^{o}_{t,x}$, $x\in\mathbb R^d$, and
\[ \Pi_t(x)(x_{t+}\in A_t)=1. \tag{6.5} \]

Proof. Let $H(\cdot,\cdot)$ be the measurable selector in Chapter 3. Define
\[ \Pi_t(x)=H(t,x+gh(t,x))\circ\Theta_{t,x}^{-1}, \]
where $\Theta_{t,x}:\Omega\to\Omega$ is defined by
\[ \Theta_{t,x}(\omega)_s=\begin{cases}(x_s-gh(t,x),\ \mu_s,\ 0), & 0\le s\le t,\\ (x_s,\ \mu_s,\ v_s+h(t,x)), & t<s\le T,\end{cases} \]
for $\omega=(x_\cdot,\mu_\cdot,v_\cdot)$. It can be verified easily that $\Pi_t(x)\in\mathcal R^{o}_{t,x}$. Now we show that $\Pi_t(\cdot):\mathbb R^d\to\mathbb M(\Omega)$ is Borel measurable. It is sufficient to show that for any $F\in b\mathcal F$, the function $\varphi(x)=E^{\Pi_t(x)}F$ is in $b\mathcal B(\mathbb R^d)$. We write
\[ \varphi(x)=E^{\Pi_t(x)}F=E^{H(t,x+gh(t,x))}\big(F\circ\Theta_{t,x}\big), \]
so it will be enough to show that $F(\Theta_{t,x}(\omega)):\mathbb R^d\times\Omega\to\mathbb R$ is Borel measurable, or that
\[ \Theta_{t,x}(\omega):\mathbb R^d\times\Omega\to\Omega \]
is Borel measurable. But the Borel measurability of $\Theta_{t,\cdot}(\cdot)$ can be easily seen from the Borel measurability of $h(t,\cdot)$.

Finally, (6.5) follows from (5.19) and the fact $x+gh(t,x)\in A_t$. □

Lemma 6.3 For $0\le s\le t\le T$, $x\in\mathbb R^d$, there exists a $P\in\mathcal R^{o}_{s,x}$ such that
\[ P(x_{t+}\in A_t)=1. \tag{6.6} \]

Proof. Let $t$ be fixed, and $Q_\omega=\Pi_t(x_t(\omega))$. From Lemma 6.2 we know that for each $B\in\mathcal F$, the mapping
\[ \omega\mapsto Q_\omega(B) \]
is $\mathcal F_t$-measurable. Therefore $Q$ is a $t$-transition probability. Let $P=\Pi_s(x)\otimes_tQ$; by the same proof as for Proposition 4.8 we can conclude that $P\in\mathcal R^{o}_{s,x}$.

Now we show that $P$ satisfies (6.6). In fact, from the definition of $P$ we know the r.c.p.d. of $P$ with respect to $\mathcal F_t$ is $\tilde Q_\omega=\delta_\omega\otimes_tQ_\omega$; thus
\[ P(x_{t+}\in A_t)=\int\tilde Q_\omega(x_{t+}\in A_t)\,dP. \]
But from the definition of $Q$ and Lemma 6.2 it follows that
\[ \tilde Q_\omega(x_{t+}\in A_t)=Q_\omega(x_{t+}\in A_t)=\Pi_t(x_t(\omega))(x_{t+}\in A_t)=1; \]
therefore (6.6) is proved. □

We introduce the following notation: for $n\in\mathbb N$ let
\[ S_n=\Big\{s+\frac{k(T-s)}{2^n};\ k=0,1,\ldots,2^n-1\Big\}. \tag{6.7} \]
It is obvious that $S_n\subset S_{n+1}$ for $n\ge1$.

Lemma 6.4 For $(s,x)\in\Sigma$ and $n\ge1$, there exists $P_n\in\mathcal R^{o}_{s,x}$ such that
\[ P_n(x_{r+}\in A_r,\ \forall r\in S_n\cap[s,T])=1. \tag{6.8} \]

Proof. For simplicity of notation, we assume $s=0$.
The proof for the general case is similar.

Take $P^0=\Pi_0(x)$, and let
\[ P^k=P^{k-1}\otimes_{kT/2^n}\Pi_{kT/2^n}\big(x_{kT/2^n}\big),\qquad k=1,2,\ldots,2^n-1, \]
\[ P_n=P^{2^n-1}=P^{2^n-2}\otimes_{(2^n-1)T/2^n}\Pi_{(2^n-1)T/2^n}\big(x_{(2^n-1)T/2^n}\big). \]
Then from the proof of Lemma 6.3 we can see that $P_n$ satisfies (6.8) and $P_n\in\mathcal R^{o}_{0,x}$. □

Now we make the following assumption:

Assumption A. $A_t$ is continuous in the following sense: for all $(t,x)\in\Sigma$, if $x\notin\bar A_t$, the closure of $A_t$, then there exist $\varepsilon,\delta>0$ such that $S(x,\varepsilon)\cap A_u=\emptyset$ whenever $|u-t|<\delta$, where $S(x,\varepsilon)$ is a $d$-dimensional ball with radius $\varepsilon$ and center at $x$.

Proposition 6.5 Assumption A is equivalent to the following:
\[ \bar A=\bigcup_{0\le t\le T}\{t\}\times\bar A_t, \tag{6.9} \]
where $\bar A$, $\bar A_t$ are the closures of $A$, $A_t$ as subsets of $\Sigma$ and $\mathbb R^d$ respectively.

Proof. Assume first that (6.9) holds. For $(t,x)\in\Sigma$, $t\ne0,T$, if $x\notin\bar A_t$, then $(t,x)\notin\bar A$. So there exists an $\varepsilon>0$ such that $[(t-\varepsilon,t+\varepsilon)\times S(x,\varepsilon)]\cap A=\emptyset$, which means that for $|u-t|<\varepsilon$, $S(x,\varepsilon)\cap A_u=\emptyset$. Thus Assumption A holds. The cases $t=0,T$ can be treated similarly.

Now we assume Assumption A. If $(t,x)\notin\bigcup\{\{t\}\times\bar A_t,\ 0\le t\le T\}$, then $x\notin\bar A_t$, and Assumption A means exactly that $(t,x)\notin\bar A$. Therefore under Assumption A,
\[ \bar A\subset\bigcup_{0\le t\le T}\{t\}\times\bar A_t. \tag{6.10} \]
But the reverse inclusion in (6.10) is obvious, and therefore (6.9) holds. □

Remark 6.6 Note that Assumption A is similar in nature to the assumption made in the work of Chow, Menaldi and Robin [14], i.e., the moving boundary is continuous with respect to the time variable, cf. Example 2 in Section 2.2. □

Remark 6.7 If we define a set-valued map $t\mapsto A_t$, $0\le t\le T$, then by definition (cf. Aubin and Cellina [1], p. 41) it is upper semicontinuous if and only if for each $t_0\in[0,T]$ and each open set $M$ containing $A_{t_0}$, there exists a neighborhood $N_{t_0}$ of $t_0$ such that $A_t\subset M$, $t\in N_{t_0}$. It is easily seen that the upper semicontinuity of $A_\cdot$ implies Assumption A. □

Lemma 6.8 Under Assumption A, if $P\in\mathbb M(\Omega)$ is such that for some $s\in(0,T]$,
\[ P(x_s\notin\bar A_s)>0, \tag{6.11} \]
then there exist $y\in\mathbb R^d$, $\gamma>0$, and $\delta\in(0,s)$ such that
\[ R(y,2\gamma)\cap A_u=\emptyset,\qquad u\in[s-\delta,s], \]
\[ P(x_u\in R(y,\gamma):\ u\in[s-\delta,s])>0, \]
where $R(y,r)\equiv\{x:\ |x_i-y_i|<r,\ 1\le i\le d\}$.

Proof. It can be easily seen that (6.11) implies that there exists an $\varepsilon>0$ such that
\[ P(x_s\notin A_s^\varepsilon)>0, \]
where $A_s^\varepsilon\equiv\{x:\ \|x-y\|<\varepsilon\text{ for some }y\in A_s\}$ for $\varepsilon>0$. By dividing $(A_s^\varepsilon)^c$ into a countable number of small parts we can conclude that there exist $y\in(A_s^\varepsilon)^c$ and $\gamma>0$ such that
\[ P(x_s\in R(y,\gamma))>0,\qquad R(y,2\gamma)\subset(A_s^\varepsilon)^c. \tag{6.12} \]
We will show that there exists $\delta_1>0$ such that
\[ R(y,2\gamma)\cap A_u=\emptyset,\qquad u\in[s-\delta_1,s]. \]
If not, then we can find a sequence of points $\{x_n\}\subset R(y,2\gamma)$ such that $x_n\in A_{u_n}$, with $u_n\uparrow s$. Since $\{x_n\}$ is bounded, there is a subsequence, still denoted by $\{x_n\}$, which converges to some $x\in\mathbb R^d$ with $x\in\bar R(y,2\gamma)$, and therefore $x\notin\bar A_s$. It can be easily seen that this contradicts Assumption A.

From the left continuity of the sample paths and (6.12), we can find a $\delta_2>0$ such that
\[ P(x_u\in R(y,\gamma),\ u\in[s-\delta_2,s])>0. \]
Let $\delta=\min\{\delta_1,\delta_2\}$. Then $y$, $\gamma$ and $\delta$ satisfy the conditions in the lemma. □

Now we state the main theorem of this chapter. It ensures the existence of an optimal control rule $P\in\mathcal R^{o}_{s,x}$ such that if $(s,x)\notin\bar A$, then the optimal state jumps to $\bar A$ and stays in $\bar A$ from then on. Recall that the optimal state process has no jumps when it is in $A$.

Theorem 6.9 For $(s,x)\in\Sigma$, there exists $P\in\mathcal R^{o}_{s,x}$ such that
\[ P(x_t\in\bar A_t,\ \forall s<t\le T)=1. \tag{6.13} \]
Equivalently, (6.13) can be stated as
\[ P((t,x_t)\in\bar A,\ \forall s<t\le T)=1. \]

Proof. Without loss of generality, we assume that $s=0$. From Lemma 6.4, we can get a sequence of probabilities $\{P_n\}\subset\mathcal R^{o}_{0,x}$ satisfying (6.8). Recall that $\mathcal R^{o}_{0,x}$ is a compact subset of $\mathbb M(\Omega)$; therefore there exist a subsequence, still denoted by $\{P_n\}$, and a control rule $P\in\mathcal R^{o}_{0,x}$ such that $P_n\to P$ weakly. It is enough to show that $P$ satisfies (6.13). Note that (6.13) is equivalent to
\[ P(x_r\in\bar A_r,\ r\in\mathbb Q\cap(0,T])=1, \tag{6.14} \]
where $\mathbb Q$ denotes the set of all rationals.
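This can be easily seen from the left continuity of the canonical process $x_t$ and Assumption A. Therefore it will be sufficient to show
\[ P(x_t\in\bar A_t)=1,\qquad\forall t\in(0,T]. \tag{6.15} \]
If (6.15) is not true, i.e., there is a $t\in(0,T]$ such that
\[ P(x_t\notin\bar A_t)>0, \]
then by Lemma 6.8 we can find $y\in\mathbb R^d$, $\gamma>0$, and $\delta>0$ satisfying
\[ R(y,2\gamma)\cap A_u=\emptyset,\qquad u\in[t-\delta,t], \]
\[ P(x_u\in R(y,\gamma),\ u\in[t-\delta,t])>\varepsilon>0. \]
Recall that $P_n\to P$ weakly on $\mathbb M(\Omega)$, with $\Omega\equiv\mathcal D^d[0,T]\times\mathcal U\times\mathcal A^k[0,T]$, $x_t(\omega)=x_t$ for $\omega=(x_\cdot,\mu_\cdot,v_\cdot)$, and $\mathcal D^d[0,T]$ is endowed with the pseudo-path topology. Therefore by Theorem 2.11 there exist a subsequence of $\{P_n\}$, still denoted by $\{P_n\}$, and a set $I$ of full Lebesgue measure such that the finite dimensional distributions of $(x_t)_{t\in I}$ under the control rules $P_n$ converge weakly to the finite dimensional distributions of $(x_t)_{t\in I}$ under $P$.

For an arbitrary $K\in\mathbb N$, take $N_1\in\mathbb N$ large enough such that
\[ \operatorname{Cardinality}\{S_{N_1}\cap(t-\delta,t)\}>K, \tag{6.16} \]
where $S_n$ is defined by (6.7). We denote those points by $s_1,\ldots,s_{K'}$ $(K'>K)$ and
\[ t-\delta<s_1<s_2<\cdots<s_{K'}<t. \]
Since $I$ has full Lebesgue measure, we can take $t_1,\ldots,t_{K'}$ such that $t_i\in I$, $1\le i\le K'$, and
\[ t-\delta<t_1<s_1<t_2<\cdots<t_{K'}<s_{K'}. \]
Using the weak convergence of the finite dimensional distributions and the fact that $R(y,\gamma)$ is open, we have
\[ \begin{aligned}
\liminf_{n\to\infty}P_n(x_{t_i}\in R(y,\gamma),\ 1\le i\le K')&\ge P(x_{t_i}\in R(y,\gamma),\ 1\le i\le K')\\
&\ge P(x_u\in R(y,\gamma),\ u\in[t-\delta,t])\\
&>\varepsilon.
\end{aligned} \]
So we can take $N>N_1$ large enough such that whenever $n\ge N$,
\[ P_n(x_{t_i}\in R(y,\gamma),\ 1\le i\le K')>\varepsilon. \]
Note that $S_{N_1}\subset S_n$, so for $n\ge N$,
\[ P_n(x_{s_i+}\in A_{s_i},\ i=1,2,\ldots,K')=1. \]
Let
\[ C_{K'}=\{\omega:\ x_{s_i+}\in A_{s_i},\ x_{t_i}\in R(y,\gamma),\ 1\le i\le K'\}. \]
Then $P_n(C_{K'})>\varepsilon$. Write $u^i=y^i+\gamma$, $v^i=y^i+2\gamma$, $\bar u^i=y^i-2\gamma$, $\bar v^i=y^i-\gamma$, $1\le i\le d$, and denote by $N^{u^i,v^i}(x^i)$ the number of upcrossings of $x_t^i$ between the levels $u^i$ and $v^i$ in the time interval $[0,T]$. It can be easily seen that if $\omega\in C_{K'}$,
\[ \sum_{i=1}^d\Big(N^{u^i,v^i}(x^i(\omega))+N^{\bar u^i,\bar v^i}(x^i(\omega))\Big)\ge K'-1. \]
Therefore when $n\ge N$,
\[ \sum_{i=1}^d\Big(E^{P_n}N^{u^i,v^i}(x^i)+E^{P_n}N^{\bar u^i,\bar v^i}(x^i)\Big)\ge(K'-1)P_n(C_{K'})>(K'-1)\varepsilon. \]
Since $K'$ can be taken arbitrarily large (with $\varepsilon$ fixed), we have
\[ \limsup_{n\to\infty}\sum_{i=1}^d\Big(E^{P_n}N^{u^i,v^i}(x^i)+E^{P_n}N^{\bar u^i,\bar v^i}(x^i)\Big)=\infty. \tag{6.17} \]
But as we know,
\[ E^{P_n}N^{u^i,v^i}(x^i)\le\frac1{v^i-u^i}\big[|u^i|+\operatorname{Var}_{P_n}(x^i)\big]=\frac1\gamma\big[|u^i|+\operatorname{Var}_{P_n}(x^i)\big], \]
\[ E^{P_n}N^{\bar u^i,\bar v^i}(x^i)\le\frac1{\bar v^i-\bar u^i}\big[|\bar u^i|+\operatorname{Var}_{P_n}(x^i)\big]=\frac1\gamma\big[|\bar u^i|+\operatorname{Var}_{P_n}(x^i)\big], \]
where $\operatorname{Var}_{P_n}(x)$ denotes the conditional variation with respect to $P_n$. Hence
\[ \text{LHS of (6.17)}\le\frac1\gamma\sum_{i=1}^d\Big[|u^i|+|\bar u^i|+2\limsup_{n\to\infty}\operatorname{Var}_{P_n}(x^i)\Big]. \tag{6.18} \]
Note that under the control rule $P_n\in\mathcal R^{o}_{0,x}$,
\[ x_t=x+\int_0^tb(\theta,x_\theta,\mu_\theta)\,d\theta+gv_t+M_t, \]
where $M$ is an $\mathcal F_t$-martingale under $P_n$. From the boundedness of $b(\cdot,\cdot,\cdot)$ and the positivity of $c(\cdot)$, there are constants $C$, $c>0$ such that $\|b(\cdot,\cdot,\cdot)\|\le C$ and $c_i(\cdot)\ge c$, $1\le i\le k$. Since $P_n\in\mathcal R^{o}_{0,x}$, we have
\[ W(0,x)=E^{P_n}\Big\{\int_0^Tf(\theta,x_\theta,\mu_\theta)\,d\theta+\int_{[0,T)}c(\theta)\cdot dv_\theta\Big\}\ge c\,E^{P_n}\{\|v_T\|\}. \]
Moreover, for any $P\in\mathcal R_{0,x}$, from Lemma 3.2 we know
\[ E^P|M_t^i|\le\big(E^P|M_t^i|^2\big)^{1/2}=\big(E^P\langle M^i\rangle_t\big)^{1/2}\le\Big(E^P\int_0^T\operatorname{tr}\big(a(\theta,x_\theta,\mu_\theta)\big)\,d\theta\Big)^{1/2}\le C'<\infty. \]
Therefore we have
\[ \operatorname{Var}_{P_n}(x^i)\le\int_0^TC\,d\theta+\frac1cW(0,x)+\sup_{0\le t\le T}E^{P_n}|M_t^i|\le CT+\frac1cW(0,x)+C'\equiv C_0<\infty, \]
where $C_0$ is a constant independent of $n$. By (6.18) we can see
\[ \text{LHS of (6.17)}\le\frac1\gamma\sum_{i=1}^d\big[|u^i|+|\bar u^i|+2C_0\big], \]
so the left hand side of (6.17) is uniformly bounded in $n$. But this contradicts (6.17). □

Remark 6.10 We have shown that the subset $A$ has some of the features of the continuation region found in specific problems (see the examples in Section 2.2), but the following problem remains to be solved:

• (5.24) and the examples in Section 2.2 suggest that it might be true that for any optimal control rule $P\in\mathcal R^{o}_{s,x}$,
\[ E^P\int_{(s,T)}1_{A^o_\theta}(x_\theta)\,d\hat v_\theta=0, \]
where $A^o_\theta$ is the interior of the subset $A_\theta\subset\mathbb R^d$, $\hat v\equiv\sum_i\hat v^i$, and $\hat v^i$ denotes the total variation of the 1-dimensional process $v^i$. In other words, the singular control variable acts only when the optimal state process is on the boundary $\partial A_\theta$ of the continuation region $A_\theta$, after a possible initial jump.

Chapter 7

The Existence of Optimal Control Laws

7.1 Introduction

In this chapter we study a special case of the singular stochastic control problem formulated in Chapter 2, in which the singular control variable enters the state as a process of bounded variation; we call it the bounded variation control problem. We introduce a random time change which stretches out the time scale.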
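A rough numerical sketch of the time-stretching idea follows. It assumes one common choice of stretched clock, namely elapsed time plus accumulated variation of the singular control; this choice and the helper name `stretch_time` are illustrative assumptions, not the construction given later in this chapter. On such a clock both the original time and the singular control advance at a rate of at most one, so a jump of the control is spread over an interval whose length equals the jump size, during which the original time stands still.

```python
def stretch_time(times, variation):
    """Return the stretched clock tau_i = t_i + V_i (illustrative choice).

    `times` is a nondecreasing list of sample times t_i (a repeated time
    marks a jump of the control) and `variation` is the nondecreasing
    cumulative variation V_i of the singular control at those samples.
    """
    if len(times) != len(variation):
        raise ValueError("times and variation must have equal length")
    return [t + v for t, v in zip(times, variation)]


# Sampled control: flat, then a jump of size 2 at t = 1, then slow growth.
times = [0.0, 0.5, 1.0, 1.0, 1.5, 2.0]
variation = [0.0, 0.0, 0.0, 2.0, 2.25, 2.5]
tau = stretch_time(times, variation)

# Rates with respect to the stretched clock, interval by interval.
time_rate = [(t1 - t0) / (s1 - s0)
             for t0, t1, s0, s1 in zip(times, times[1:], tau, tau[1:])]
control_rate = [(v1 - v0) / (s1 - s0)
                for v0, v1, s0, s1
                in zip(variation, variation[1:], tau, tau[1:])]
print(tau)
print(time_rate)      # 0 on the interval that carries the jump
print(control_rate)   # 1 on the interval that carries the jump
```

Because the two rates are bounded and sum to one, the pair can play the role of an ordinary (classical) control in the stretched problem, which is the reason such a change of clock removes the singular part of the control.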
Under this new time scale, the problem is transformed to a newcontrol problem involving only classical controls. The new problem has been studied extensively, cf. Fleming and Rishel [22], Krylov [50], and especially Haussmann [29], Haussmann andLepeltier [30], which ensure the existence of an optimal Markovian control for the new problemunder some mild continuity conditions on the coefficients of the state process. Applying thisresult and transforming the optimal control back to the original singular control problem, weshow that the optimal control exists. Moreover, it is shown that there exists an optimal controlin the following form: the control variable u is in Markovian form, and the increments of thesingular controls during any time interval depend only on the state process in that interval.This type of control will be called a control law.This method gives an explicit way to construct the optimal control to the singular controlproblem when the optimal control to the new problem, which is relatively well understood, isknown. However, it does not exhibit the special features of the optimal state process, i.e., wedo not know whether the optimal state process is a reflected diffusion in some region.A similar approach has been used by Martins and Kushner [59], Kurtz [52], and Zhu [81] tostudy the weak convergence of probability distributions.This chapter is organized as follows: both the singular control and the classical control97Chapter 7. The Existence of Optimal Control Laws 98problems are introduced in Section 7.2. In Section 7.3 the equivalence of the two problems isestablished and the existence of the optimal singular control is proved. Section 7.4 studies theexistence of the optimal control law, and some comments about the approach in this chapterare given in Section 7.5.7.2 Formulation of the problemThroughout this chapter, we assume U is a compact metric space, T > 0 is fixed, and thefollowing functions are given:• u:>x• f: >D x U ‘—* 11? 
is a lower semicontinuous function in $(t, x, u)$ and such that
$$-K \le f(t,x,u) \le C(1 + \|x\|^m), \qquad (t,x,u) \in \Sigma \times U,$$
for some constants $m, K, C \ge 0$, and

• $c = (c_i)$, $\bar c = (\bar c_i) : [0,T] \to \mathbb{R}^d$ are bounded and lower semicontinuous functions on $[0,T]$, and $c_i, \bar c_i > 0$, $1 \le i \le d$.

Moreover we let $a = \sigma\sigma'$ and we assume

• for $(t,x) \in \Sigma$,
$$K(t,x) = \{(a(t,x,u), b(t,x,u), z) : z \ge f(t,x,u),\ u \in U\} \subset S_d \times \mathbb{R}^d \times \mathbb{R}$$
is convex.

The last condition, sometimes called the Roxin condition, allows us to replace relaxed controls by ordinary controls $u_t$. It is satisfied for example if $a$, $b$ are linear in $u$, $f$ is convex in $u$ and $U$ is convex.

7.2.1 The singular control problem

In this chapter we study singular stochastic control problems in which the singular control variable enters the state as a process of bounded variation, or the bounded variation control problem. More precisely, in this chapter we study the optimal control problem in which the state evolves according to the $d$-dimensional stochastic differential equation
$$x(t) = x + \int_s^t b(\theta, x(\theta), u(\theta))\,d\theta + \int_s^t \sigma(\theta, x(\theta), u(\theta))\,dB_\theta + v^1_t - v^2_t, \tag{7.1}$$
where $(B_t, t \ge 0)$ is a $d$-dimensional Brownian motion, $x$ is the initial state at time $s$, and $u_\cdot$, $v^1_\cdot$, $v^2_\cdot$ stand for the control variables with $v^1$, $v^2$ nondecreasing componentwise.

Note that this is a special case of the problem we formulate in Chapter 2, in which we let $k = 2d$, and $g(\cdot) = (I, -I)$ with $I$ the unit $d \times d$-matrix.

For convenience, we state the definition of controls corresponding to this problem as follows, which is consistent with the definition in Chapter 2.

Definition 7.1 A control is a term $\alpha = (\Omega, \mathcal{F}, \mathcal{F}_t, P, x_t, u_t, v^1_t, v^2_t, s, x)$ such that

1. $s$, $0 \le s \le T$, is the starting time and $x \in \mathbb{R}^d$ is the initial state;

2. $(\Omega, \mathcal{F}, P)$ is a probability space with the filtration $\{\mathcal{F}_t\}_{t \ge 0}$, and there is a standard $d$-dimensional Brownian motion $B$ on it;

3. $u_t$ is a $U$-valued process, progressively measurable with respect to $\{\mathcal{F}_t\}_{t \ge 0}$;

4. $v^1$, $v^2$ are progressively measurable with respect to $\mathcal{F}_t$ with $v^i_s = 0$, $i = 1, 2$.
The sample paths of the processes are in $\mathcal{A}^d[0,T]$, i.e., for each $\omega \in \Omega$, $v^i(\omega) \in \mathcal{A}^d[0,T]$ $(i = 1, 2)$;

5. $x_t$, the state process, is an $\mathcal{F}_t$-adapted process such that (7.1) is satisfied.

We call $(s, x)$ the initial condition of the control $\alpha$.

The cost corresponding to the control $\alpha$ is defined in the form
$$J(\alpha) = E^P\Big\{\int_s^T f(t, x_t, u_t)\,dt + \int_{[s,T)} c(t)\cdot dv^1_t + \int_{[s,T)} \bar c(t)\cdot dv^2_t\Big\}. \tag{7.2}$$

A control $\alpha$ is called admissible if
$$J(\alpha) < \infty.$$
The collection of admissible controls with initial condition $(s,x)$ is denoted by $\mathcal{A}_{s,x}$. It is well known from the theory of stochastic differential equations that, under the above conditions, the set $\mathcal{A}_{s,x}$ is nonempty for each fixed $(s,x)$ (e.g., take $u$ and $v^i$, $i = 1, 2$, to be constants).

The value function of this control problem is defined by
$$W(s,x) = \inf_{\alpha \in \mathcal{A}_{s,x}} J(\alpha).$$
A control $\alpha^* \in \mathcal{A}_{s,x}$ is called optimal if
$$J(\alpha^*) = \inf_{\alpha \in \mathcal{A}_{s,x}} J(\alpha). \tag{7.3}$$

We say that $(v^+, v^-)$ is the minimal decomposition of the process $v$ of bounded variation if $v^+$ and $v^-$ are the positive and negative variations of $v$ respectively, i.e., $v = v^+ - v^-$ and $\check v = v^+ + v^-$, where $\check v_t$ is the total variation of $v$ on $[s,t)$.

Proposition 7.2 For each $\alpha = (\Omega, \mathcal{F}, \mathcal{F}_t, P, x_t, u_t, v^1, v^2, s, x) \in \mathcal{A}_{s,x}$, we have $\bar\alpha = (\Omega, \mathcal{F}, \mathcal{F}_t, P, x_t, u_t, v^+, v^-, s, x) \in \mathcal{A}_{s,x}$ and
$$J(\bar\alpha) \le J(\alpha),$$
where $(v^+, v^-)$ is the minimal decomposition for the process $v = v^1 - v^2$. Moreover, if $\alpha$ is optimal, then $v^1 = v^+$ and $v^2 = v^-$.

Proof. The first part of the proposition is obvious, and the second part follows from the strict positivity of $c(\cdot)$ and $\bar c(\cdot)$. □

From Proposition 7.2 we know that if the optimal control for the problem exists, then we can always find an optimal control $\alpha^*$ such that the two increasing (componentwise) processes $v^1$ and $v^2$ are mutually orthogonal, i.e., $dv^1 \cdot dv^2 = 0$.

7.2.2 The classical control problem

Following Martins and Kushner [59] (see also Zhu [81]) we now define a $(d+1)$-dimensional control problem which uses only the classical controls.
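The state of the problem evolves according to
$$\begin{aligned}
\hat x^0(t) &= s + \int_0^t \hat u^0(\theta)\,d\theta,\\
\hat x(t) &= x + \int_0^t \big[b(\hat x^0(\theta), \hat x(\theta), \hat u(\theta))\,\hat u^0(\theta) + \hat n(\theta)(1 - \hat u^0(\theta))\big]\,d\theta\\
&\quad + \int_0^t \sigma(\hat x^0(\theta), \hat x(\theta), \hat u(\theta))\sqrt{\hat u^0(\theta)}\,d\hat B_\theta, \qquad t \ge 0,\ (s,x) \in \Sigma.
\end{aligned} \tag{7.4}$$
The control variable of this problem has the following form:
$$\hat\mu_t = (\hat u^0_t, \hat u_t, \hat n_t),$$
where

• $\hat u^0 : \mathbb{R}_+ \to I = [0,1]$,
• $\hat u : \mathbb{R}_+ \to U$,
• $\hat n = (\hat n^1, \hat n^2) : \mathbb{R}_+ \to S = \{z \in \mathbb{R}^{2d}_+ : \sum_i z_i = 1\}$, and $\hat n = \hat n^1 - \hat n^2$.

Notice that the control set $E = I \times U \times S$ is compact. Control problems of this type are discussed in Haussmann and Lepeltier [30]. Using the notation of Definition 2.2 of [30], we denote the control of this problem by
$$\hat\alpha = (\hat\Omega, \hat{\mathcal{F}}, \hat{\mathcal{F}}_t, \hat P, (\hat x^0_t, \hat x_t), \hat\mu_t, s, x). \tag{7.5}$$
The cost associated with $\hat\alpha$ is defined by
$$\hat J(\hat\alpha) = E^{\hat P}\Big\{\int_0^\tau \Big[f(\hat x^0(\theta), \hat x(\theta), \hat u(\theta))\,\hat u^0(\theta) + \big(c(\hat x^0(\theta))\cdot \hat n^1(\theta) + \bar c(\hat x^0(\theta))\cdot \hat n^2(\theta)\big)(1 - \hat u^0(\theta))\Big]\,d\theta\Big\}$$
with
$$\tau = \inf\{t : \hat x^0(t) = T\}. \tag{7.6}$$
A control $\hat\alpha$ is called admissible if
$$\hat J(\hat\alpha) < \infty,$$
and the collection of all admissible controls is denoted by $\hat{\mathcal{A}}_{s,x}$. Let $\hat W(s,x)$ be the value function of this problem, i.e.,
$$\hat W(s,x) = \inf_{\hat\alpha \in \hat{\mathcal{A}}_{s,x}} \hat J(\hat\alpha).$$

Lemma 7.3 If $\hat\alpha$ is admissible, then $\tau < \infty$.

Proof. From our assumption that $c_i(\cdot)$, $\bar c_i(\cdot)$, $i = 1, \ldots, d$, are strictly positive and lower semicontinuous on $[s,T]$, there exists a constant $c_0 > 0$ such that
$$c_i(t) \ge c_0, \quad \bar c_i(t) \ge c_0, \qquad s \le t \le T,\ 1 \le i \le d.$$
Noticing that $\hat n(t) = (\hat n^1(t), \hat n^2(t)) \in S$ for $s \le t \le T$, we have
$$c(t)\cdot \hat n^1(t) + \bar c(t)\cdot \hat n^2(t) \ge c_0.$$
If $\hat J(\hat\alpha) < \infty$, then
$$\begin{aligned}
\infty &> E^{\hat P}\Big\{\int_0^\tau \big[c(\hat x^0(\theta))\cdot \hat n^1(\theta) + \bar c(\hat x^0(\theta))\cdot \hat n^2(\theta)\big](1 - \hat u^0(\theta))\,d\theta\Big\} - KT\\
&\ge c_0\,E^{\hat P}\Big\{\int_0^\tau (1 - \hat u^0(\theta))\,d\theta\Big\} - KT\\
&= c_0\,(E^{\hat P}\tau - T + s) - KT,
\end{aligned}$$
where $K$ is the constant in the definition of $f$. Therefore $E^{\hat P}\tau < \infty$, which implies that $\tau < \infty$ a.s. □

It now follows that if we define
$$r(t) = \inf\{\theta : \hat x^0(\theta) \ge t\},$$
then $r(t) < \infty$ a.s. for each $s \le t \le T$.

We define, for $(t,x) \in \Sigma$,
$$\hat K(t,x) = \Big\{\Big(\hat a(t,x,u)u^0,\ \hat b(t,x,u)u^0 + \begin{pmatrix} 0 \\ n^1 - n^2 \end{pmatrix}(1 - u^0),\ z\Big) : z \ge f(t,x,u)u^0 + [c(t)\cdot n^1 + \bar c(t)\cdot n^2](1 - u^0),\ \mu = (u^0, u, n) \in E\Big\}, \tag{7.7}$$
where
$$\hat a(t,x,u) = \begin{pmatrix} 0 & 0 \\ 0 & a(t,x,u) \end{pmatrix} \in \mathbb{R}^{(d+1)\times(d+1)}, \qquad \hat b(t,x,u) = \begin{pmatrix} 1 \\ b(t,x,u) \end{pmatrix} \in \mathbb{R}^{d+1},$$
and $a = \sigma\sigma'$. It is convex according to the next result since $K(t,x)$ is convex.

Lemma 7.4 If $K(t,x)$ is convex, then $\hat K(t,x)$ is also a convex set.

Proof.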
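It is clear that we need only to show the set
$$\tilde K(t,x) = \big\{\big(a(t,x,u)u^0,\ b(t,x,u)u^0 + (n^1 - n^2)(1 - u^0),\ z\big) : z \ge f(t,x,u)u^0 + [c(t)\cdot n^1 + \bar c(t)\cdot n^2](1 - u^0),\ \mu = (u^0, u, n) \in E\big\}$$
is a convex subset of $S_d \times \mathbb{R}^d \times \mathbb{R}$. Let $\mu_i = (u_i^0, u_i, n_i) \in E$, and
$$z_i \ge f(t,x,u_i)u_i^0 + \big(c(t)\cdot n_i^1 + \bar c(t)\cdot n_i^2\big)(1 - u_i^0), \qquad i = 1, 2.$$
For $0 < \lambda < 1$, we want to show that $\bar\mu = \lambda\mu_1 + (1-\lambda)\mu_2$ gives rise to a point in $\tilde K(t,x)$; here $\bar u^0 = \lambda u_1^0 + (1-\lambda)u_2^0$. The result is obvious if $u_1^0 = u_2^0 = 0$, so we assume $u_1^0 + u_2^0 > 0$. Let $\lambda' = \lambda u_1^0/\bar u^0$. Note that $0 < \bar u^0 \le 1$, and $0 \le \lambda' \le 1$. From the assumption that $K(t,x)$ is convex we know that there exists $\bar u \in U$ such that
$$\begin{aligned}
\lambda' a(t,x,u_1) + (1-\lambda')a(t,x,u_2) &= a(t,x,\bar u),\\
\lambda' b(t,x,u_1) + (1-\lambda')b(t,x,u_2) &= b(t,x,\bar u),\\
\lambda' f(t,x,u_1) + (1-\lambda')f(t,x,u_2) &\ge f(t,x,\bar u).
\end{aligned}$$
Therefore
$$\lambda a(t,x,u_1)u_1^0 + (1-\lambda)a(t,x,u_2)u_2^0 = a(t,x,\bar u)\bar u^0, \tag{7.8}$$
$$\lambda b(t,x,u_1)u_1^0 + (1-\lambda)b(t,x,u_2)u_2^0 = b(t,x,\bar u)\bar u^0, \tag{7.9}$$
$$\lambda f(t,x,u_1)u_1^0 + (1-\lambda)f(t,x,u_2)u_2^0 \ge f(t,x,\bar u)\bar u^0. \tag{7.10}$$
Moreover,
$$\lambda(n_1^1 - n_1^2)(1 - u_1^0) + (1-\lambda)(n_2^1 - n_2^2)(1 - u_2^0) = (\bar n^1 - \bar n^2)(1 - \bar u^0), \tag{7.11}$$
$$\lambda\big(c(t)\cdot n_1^1 + \bar c(t)\cdot n_1^2\big)(1 - u_1^0) + (1-\lambda)\big(c(t)\cdot n_2^1 + \bar c(t)\cdot n_2^2\big)(1 - u_2^0) = \big(c(t)\cdot \bar n^1 + \bar c(t)\cdot \bar n^2\big)(1 - \bar u^0), \tag{7.12}$$
where
$$\bar n^i = \frac{\lambda(1 - u_1^0)}{1 - \bar u^0}\,n_1^i + \frac{(1-\lambda)(1 - u_2^0)}{1 - \bar u^0}\,n_2^i, \qquad i = 1, 2,$$
and $\bar n = (\bar n^1, \bar n^2) \in S$. Now from (7.8), (7.9), (7.10), (7.11), and (7.12) we can see that $\tilde K(t,x)$, and therefore $\hat K(t,x)$, is convex. □

Finally, we recall a result from Haussmann and Lepeltier [30] (cf. Haussmann [29] for the autonomous case).

Theorem 7.5 (Haussmann and Lepeltier [30], 1990) There exists an optimal Markovian control $\hat\alpha = (\hat\Omega, \hat{\mathcal{F}}, \hat{\mathcal{F}}_t, \hat P, (\hat x^0(t), \hat x(t)), \hat\mu(t), s, x)$ to the problem (7.4). More precisely, we can write
$$\hat u^0(t,\omega) = \hat U^0(\hat x^0(t,\omega), \hat x(t,\omega)), \qquad \hat u(t,\omega) = \hat U(\hat x^0(t,\omega), \hat x(t,\omega)),$$
$$\hat n(t,\omega) = \big(\hat N^1(\hat x^0(t,\omega), \hat x(t,\omega)),\ \hat N^2(\hat x^0(t,\omega), \hat x(t,\omega))\big),$$
where $\hat U^0(\cdot,\cdot) : \mathbb{R}_+ \times \mathbb{R}^d \to [0,1]$, $\hat U(\cdot,\cdot) : \mathbb{R}_+ \times \mathbb{R}^d \to U$, and $\hat N(\cdot,\cdot) = (\hat N^1(\cdot,\cdot), \hat N^2(\cdot,\cdot)) : \mathbb{R}_+ \times \mathbb{R}^d \to S$ are deterministic Borel measurable functions. □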
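The reweighting used in the proof of Lemma 7.4 can be checked numerically. The following is a minimal sketch, not part of the thesis: the scalar values `a1`, `a2` stand in for $a(t,x,u_1)$, $a(t,x,u_2)$, and all sample numbers and function names are hypothetical. It verifies the arithmetic behind (7.8) and (7.11), namely that with $\lambda' = \lambda u_1^0/\bar u^0$ the mixture factors through $\bar u^0$, and that the reweighted direction $\bar n$ stays in the simplex $S$.

```python
import math

def reweight(lam, u0_1, u0_2):
    """Return (u0_bar, lam_prime) with u0_bar = lam*u0_1 + (1-lam)*u0_2
    and lam_prime = lam*u0_1/u0_bar (requires u0_bar > 0)."""
    u0_bar = lam * u0_1 + (1.0 - lam) * u0_2
    lam_prime = lam * u0_1 / u0_bar
    return u0_bar, lam_prime

def mix_direction(lam, u0_1, u0_2, n_1, n_2):
    """Convex combination n_bar of two simplex points with the weights
    lam*(1-u0_1)/(1-u0_bar) and (1-lam)*(1-u0_2)/(1-u0_bar)
    (requires u0_bar < 1)."""
    u0_bar = lam * u0_1 + (1.0 - lam) * u0_2
    w1 = lam * (1.0 - u0_1) / (1.0 - u0_bar)
    w2 = (1.0 - lam) * (1.0 - u0_2) / (1.0 - u0_bar)
    return [w1 * p + w2 * q for p, q in zip(n_1, n_2)]

# Hypothetical sample data (d = 2, so points of S have 2d = 4 coordinates).
lam, u0_1, u0_2 = 0.3, 0.8, 0.2
a1, a2 = 2.0, 5.0
n_1 = [0.5, 0.5, 0.0, 0.0]
n_2 = [0.0, 0.25, 0.25, 0.5]

u0_bar, lam_prime = reweight(lam, u0_1, u0_2)
# Left and right sides of the identity behind (7.8).
lhs = lam * a1 * u0_1 + (1.0 - lam) * a2 * u0_2
rhs = u0_bar * (lam_prime * a1 + (1.0 - lam_prime) * a2)
n_bar = mix_direction(lam, u0_1, u0_2, n_1, n_2)
```

The point of the reweighting is that the convexity assumption on $K(t,x)$ is applied with weight $\lambda'$ rather than $\lambda$, which is what absorbs the factors $u_i^0$.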
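7.3 Equivalence of the two problems

In this section we will show that the value functions for (2.1) and (7.4) are the same, i.e., $W(s,x) = \hat W(s,x)$, $(s,x) \in \Sigma$. More precisely, for any $\alpha \in \mathcal{A}_{s,x}$ ($\hat\alpha \in \hat{\mathcal{A}}_{s,x}$) with the cost $J(\alpha)$ ($\hat J(\hat\alpha)$, respectively), there exists a control $\hat\alpha \in \hat{\mathcal{A}}_{s,x}$ ($\alpha \in \mathcal{A}_{s,x}$, respectively) such that $\hat J(\hat\alpha) \le J(\alpha)$ ($J(\alpha) = \hat J(\hat\alpha)$, respectively). For this section we do not require the Roxin condition although we made it part of the standing assumptions for ease of exposition.

Take a control $\alpha = (\Omega, \mathcal{F}, \mathcal{F}_t, P, x(t), u(t), v^1(t), v^2(t), s, x) \in \mathcal{A}_{s,x}$. By definition, there is a $d$-dimensional Brownian motion $B_\cdot$ on $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ such that
$$x_t = x + \int_s^t b(\theta, x_\theta, u_\theta)\,d\theta + \int_s^t \sigma(\theta, x_\theta, u_\theta)\,dB_\theta + v^1_t - v^2_t. \tag{7.13}$$
We may assume $\mathcal{F}_t$ to be right continuous. In fact, note that $B_\cdot$ is still a Brownian motion on $(\Omega, \mathcal{F}, \mathcal{F}_{t+}, P)$, where $\mathcal{F}_{t+} = \bigcap_{\theta > t}\mathcal{F}_\theta$. Therefore $\alpha' = (\Omega, \mathcal{F}, \mathcal{F}_{t+}, P, x(t), u(t), v^1(t), v^2(t), s, x) \in \mathcal{A}_{s,x}$, with $J(\alpha') = J(\alpha)$.

In view of Proposition 7.2, we may assume that $v^1$ and $v^2$ are mutually orthogonal; in other words, letting
$$v(t) = v^1(t) - v^2(t),$$
we have
$$\check v_k(t) = v^1_k(t) + v^2_k(t), \qquad k = 1, \ldots, d.$$
Let
$$\check v(t) = \sum_{k=1}^d \check v_k(t),$$
then it is obvious that
$$v^i_k(\cdot) \ll \check v(\cdot), \qquad i = 1, 2,\ k = 1, \ldots, d.$$
Apply the Radon-Nikodym theorem (cf. Dellacherie and Meyer [17], Theorem VI-68): there exist processes $n^i_k$, $i = 1, 2$, $k = 1, \ldots, d$, on $[0,T]$ which are progressively measurable with respect to $\mathcal{F}_t$ and such that
$$v^i_k(t) = \int_s^t n^i_k(\theta)\,d\check v(\theta), \qquad i = 1, 2,\ k = 1, \ldots, d.$$
It can be easily verified that
$$\sum_{i=1}^2 \sum_{k=1}^d n^i_k(\theta) = 1 \tag{7.14}$$
a.e. $(d\check v_\theta)$ on $[s,T]$. Let $n^i = (n^i_k)$, $i = 1, 2$. We may redefine $n^i$ on a $d\check v\,dP$-null set such that (7.14) holds everywhere.

The cost function of the problem can be written as
$$J(\alpha) = E^P\Big\{\int_s^T f(\theta, x_\theta, u_\theta)\,d\theta + \int_s^T \big(c(\theta)\cdot n^1(\theta) + \bar c(\theta)\cdot n^2(\theta)\big)\,d\check v(\theta)\Big\}. \tag{7.15}$$
Define
$$r(t) = t - s + \check v(t), \qquad s \le t \le T, \tag{7.16}$$
then $r(\cdot)$ is strictly increasing and left continuous on $[s,T]$, with $r(s) = \check v(s) = 0$. We denote the inverse function of $r(\cdot)$ by $T(\cdot)$, i.e.,
$$T(t) = \inf\{\theta \le T : r(\theta) \ge t\}, \qquad 0 \le t < \infty. \tag{7.17}$$
Note that $T \circ r(t) = t$, $0 \le t < \infty$.

Lemma 7.6 (a) $T(\cdot)$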
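is nondecreasing, and for each $t$, $T(t)$ is an $\mathcal{F}_t$-stopping time, $s \le T(t) \le T$.

(b) $T(\cdot)$ is Lipschitz continuous with constant 1, i.e.,
$$|T(t_1) - T(t_2)| \le |t_1 - t_2|, \qquad 0 \le t_1, t_2 < \infty.$$

Proof. (a) is obvious since $\mathcal{F}_t$ is right continuous. Now we show (b). Take $t_1 < t_2$, and we need only consider the case $T(t_1) < T(t_2)$. Note that $\check v$ is nondecreasing, so we have $\check v(T(t_1)+) \le \check v(T(t_2)-)$. Therefore
$$\begin{aligned}
T(t_2) - T(t_1) &\le T(t_2) - T(t_1) + \check v(T(t_2)-) - \check v(T(t_1)+)\\
&= r(T(t_2)-) - r(T(t_1)+)\\
&\le t_2 - t_1,
\end{aligned}$$
i.e., $T(\cdot)$ is Lipschitz continuous with constant 1. □

From Lemma 7.6(b) we can write
$$T(t) = s + \int_0^t T'(\theta)\,d\theta \tag{7.18}$$
with $0 \le T'(\theta) \le 1$, $0 \le \theta < \infty$.

Define
$$D_\lambda = \int_s^\lambda b(\theta, x(\theta), u(\theta))\,d\theta, \qquad M_\lambda = \int_s^\lambda \sigma(\theta, x(\theta), u(\theta))\,dB(\theta), \qquad s \le \lambda \le T.$$
Then from (7.13) we have
$$x(T(t)) = x + D_{T(t)} + M_{T(t)} + v^1(T(t)) - v^2(T(t)). \tag{7.19}$$
Note that
$$D_{T(t)} = \int_0^t b(T(\theta), x(T(\theta)), u(T(\theta)))\,T'(\theta)\,d\theta.$$
As we know, $M_\lambda$ is a continuous square integrable martingale on the filtered probability base $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ with
$$\langle M\rangle_\lambda = \int_s^\lambda a(\theta, x(\theta), u(\theta))\,d\theta.$$
Let
$$\hat M_t = M_{T(t)}, \qquad \tilde{\mathcal{F}}_t = \mathcal{F}_{T(t)},$$
then the optional sampling theorem implies that $\hat M_t$ is a continuous $\tilde{\mathcal{F}}_t$-martingale with
$$\langle \hat M\rangle_t = \int_s^{T(t)} a(\theta, x(\theta), u(\theta))\,d\theta = \int_0^t a(T(\theta), x(T(\theta)), u(T(\theta)))\,T'(\theta)\,d\theta.$$
Therefore, there exists a standard extension $(\hat\Omega, \hat{\mathcal{F}}, \hat{\mathcal{F}}_t, \hat P)$ of $(\Omega, \mathcal{F}, \tilde{\mathcal{F}}_t, P)$, and a $d$-dimensional Brownian motion $\hat B_\cdot$ on this probability base such that
$$\hat M_t = \int_0^t \sigma(T(\theta), x(T(\theta)), u(T(\theta)))\sqrt{T'(\theta)}\,d\hat B_\theta,$$
cf. Karatzas and Shreve [45]. Define
$$\hat v^i(t) = \int_0^t n^i(T(\theta))(1 - T'(\theta))\,d\theta, \qquad i = 1, 2.$$
Since $r \circ T(t) = \min\{\theta : T(\theta) = T(t)\}$, it can be verified easily that
$$v^i(T(t)) = \hat v^i(t), \qquad t \in \mathbb{R}_+ \setminus \Big(\bigcup_{s \le \theta \le T} (r(\theta), r(\theta+)]\Big), \quad i = 1, 2,$$
and for $s \le \theta \le T$, $i = 1, 2$,
$$\hat v^i(t) = v^i(\theta) + n^i(\theta)(t - r(\theta)), \qquad r(\theta) \le t \le r(\theta+).$$
Define
$$\hat x(t) = \begin{cases} x(T(t)), & t \in \mathbb{R}_+ \setminus \big(\bigcup_{s \le \theta \le T} [r(\theta), r(\theta+)]\big),\\ x(\theta) + n(\theta)(t - r(\theta)), & r(\theta) \le t < r(\theta+),\ s \le \theta < T, \end{cases}$$
with $n = n^1 - n^2$, and
$$\hat u(t) = u(T(t)), \qquad 0 \le t < \infty.$$
Note that $T'(t) = 0$ a.e. for $t \in [r(\theta), r(\theta+)]$, $s \le \theta \le T$, so we have
$$D_{T(t)} = \int_0^t b(T(\theta), \hat x(\theta), \hat u(\theta))\,T'(\theta)\,d\theta, \qquad \hat M_t = \int_0^t \sigma(T(\theta), \hat x(\theta), \hat u(\theta))\sqrt{T'(\theta)}\,d\hat B_\theta, \qquad 0 \le t < \infty. \tag{7.20}$$
Then we can write (7.19) as
$$\hat x(t) = x + \int_0^t \big[b(T(\theta), \hat x(\theta), \hat u(\theta))\,T'(\theta) + n(T(\theta))(1 - T'(\theta))\big]\,d\theta + \int_0^t \sigma(T(\theta), \hat x(\theta), \hat u(\theta))\sqrt{T'(\theta)}\,d\hat B_\theta.$$
Therefore we have a control
$$\hat\alpha = (\hat\Omega, \hat{\mathcal{F}}, \hat{\mathcal{F}}_t, \hat P, (\hat x^0(t), \hat x(t)), (\hat u^0(t), \hat u(t), \hat n^1(t), \hat n^2(t)), s, x) \tag{7.21}$$
for the problem (7.4), with
$$\hat x^0(t) = T(t), \qquad \hat u^0(t) = T'(t), \qquad \hat n^i(t) = n^i(T(t)), \qquad 0 \le t < \infty,\ i = 1, 2,$$
where $T$ is defined by (7.16) and (7.17). Now we can state the following result.

Proposition 7.7 For any admissible control
$$\alpha = (\Omega, \mathcal{F}, \mathcal{F}_t, P, x(t), u(t), v^1(t), v^2(t), s, x) \in \mathcal{A}_{s,x},$$
there exists an admissible control $\hat\alpha \in \hat{\mathcal{A}}_{s,x}$ such that
$$\hat J(\hat\alpha) \le J(\alpha). \tag{7.22}$$

Proof. We first assume that $v^1$ and $v^2$ are mutually orthogonal. Then we have obtained a control $\hat\alpha$ as in (7.21), and we need only to show (7.22). It is easy to see that
$$\int_s^T f(\theta, x(\theta), u(\theta))\,d\theta = \int_0^\tau f(\hat x^0(\theta), \hat x(\theta), \hat u(\theta))\,\hat u^0(\theta)\,d\theta, \qquad \hat P\text{-a.s.},$$
where $\tau$ is defined by (7.6). Noticing that $\tau < \infty$ $\hat P$-a.s., we can apply Lemma B.1 in the Appendix to get
$$\int_s^T \big[c(\theta)\cdot n^1(\theta) + \bar c(\theta)\cdot n^2(\theta)\big]\,d\check v(\theta) = \int_0^\tau \big[c(\hat x^0(\theta))\cdot \hat n^1(\theta) + \bar c(\hat x^0(\theta))\cdot \hat n^2(\theta)\big](1 - \hat u^0(\theta))\,d\theta.$$
Since $(\hat\Omega, \hat{\mathcal{F}}, \hat{\mathcal{F}}_t, \hat P)$ is a standard extension of $(\Omega, \mathcal{F}, \tilde{\mathcal{F}}_t, P)$ we can conclude, by the definition of $J(\cdot)$ and $\hat J(\cdot)$, that $J(\alpha) = \hat J(\hat\alpha)$, cf. (7.15).

If $v^1$, $v^2$ are not mutually orthogonal, then by Proposition 7.2 there exists a control $\bar\alpha = (\Omega, \mathcal{F}, \mathcal{F}_t, P, x(t), u(t), \bar v^1(t), \bar v^2(t), s, x) \in \mathcal{A}_{s,x}$ such that $\bar v^1$, $\bar v^2$ are mutually orthogonal and $J(\bar\alpha) \le J(\alpha)$. Therefore by what we have shown, there exists a control $\hat\alpha \in \hat{\mathcal{A}}_{s,x}$ such that $\hat J(\hat\alpha) = J(\bar\alpha)$, and hence $\hat J(\hat\alpha) \le J(\alpha)$. □

A converse to Proposition 7.7 is the following.

Proposition 7.8 For any admissible control
$$\hat\alpha = (\hat\Omega, \hat{\mathcal{F}}, \hat{\mathcal{F}}_t, \hat P, (\hat x^0(t), \hat x(t)), (\hat u^0(t), \hat u(t), \hat n^1(t), \hat n^2(t)), s, x) \in \hat{\mathcal{A}}_{s,x},$$
there exists an admissible control $\alpha \in \mathcal{A}_{s,x}$, which is given below by (7.24) and (7.27), such that
$$J(\alpha) = \hat J(\hat\alpha). \tag{7.23}$$

Proof. Define
$$r(t) = \inf\{\theta : \hat x^0(\theta) \ge t\}, \tag{7.24}$$
$$\hat D_t = \int_0^t b(\hat x^0(\theta), \hat x(\theta), \hat u(\theta))\,\hat u^0(\theta)\,d\theta, \tag{7.25}$$
$$\hat M_t = \int_0^t \sigma(\hat x^0(\theta), \hat x(\theta), \hat u(\theta))\sqrt{\hat u^0(\theta)}\,d\hat B(\theta). \tag{7.26}$$
Note that $\hat u^0(\cdot)$ is nonnegative, and therefore $\hat x^0(\cdot)$ is nondecreasing. It can be seen easily that $r(\cdot)$ is nondecreasing, left continuous, $r(s) = 0$ and $\hat x^0 \circ r(t) = t$. Moreover, for a fixed $t$, $r(t)$ is an $\{\hat{\mathcal{F}}_\theta\}$-stopping time since $\hat x^0(\cdot)$ is continuous.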
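The properties of the inverse clock $r(t) = \inf\{\theta : \hat x^0(\theta) \ge t\}$ from (7.24) can be seen on a small deterministic example. The following is a minimal sketch under assumptions that are not in the thesis: the rate $\hat u^0$ is taken piecewise constant on a uniform grid, and the helper names `clock`, `x0` and `r` are hypothetical. Stretches where the rate is 0 (the clock $\hat x^0$ stands still) become jumps of $r$, which is where the singular control acts.

```python
import bisect

def clock(s, u0_rates, dt):
    """Grid values of x0(theta) = s + integral_0^theta u0, for a rate u0
    that is constant on consecutive intervals of length dt."""
    xs = [s]
    for rate in u0_rates:
        assert 0.0 <= rate <= 1.0
        xs.append(xs[-1] + rate * dt)
    return xs

def x0(theta, xs, dt):
    """Evaluate the piecewise linear clock at theta (within the grid)."""
    j = min(int(theta // dt), len(xs) - 2)
    frac = theta / dt - j
    return xs[j] + frac * (xs[j + 1] - xs[j])

def r(t, xs, dt):
    """Left-continuous inverse r(t) = inf{theta : x0(theta) >= t}."""
    j = bisect.bisect_left(xs, t)  # first grid index with xs[j] >= t
    if j == 0:
        return 0.0
    if j == len(xs):
        raise ValueError("t is never reached by the clock on this grid")
    # Here xs[j-1] < t <= xs[j], so the slope on this cell is positive.
    frac = (t - xs[j - 1]) / (xs[j] - xs[j - 1])
    return (j - 1 + frac) * dt

# Hypothetical example: the clock runs, stops for two units, then runs slowly.
dt = 1.0
xs = clock(0.0, [1.0, 0.0, 0.0, 0.5, 0.5], dt)
```

In this example `r` is continuous up to level 1 and then jumps by the length of the flat stretch, while composing `x0` with `r` returns the original level, matching $\hat x^0 \circ r(t) = t$.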
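Define, for $s \le t \le T$,
$$x(t) = \hat x(r(t)), \qquad u(t) = \hat u(r(t)), \qquad v^i(t) = V^i(r(t)), \quad i = 1, 2, \tag{7.27}$$
with
$$V^i(\lambda) = \int_0^\lambda \hat n^i(\theta)(1 - \hat u^0(\theta))\,d\theta.$$
By Lemma 7.3 we know that $x(\cdot)$, $u(\cdot)$, $v^i(\cdot)$, $i = 1, 2$, are well defined. Since $\hat n(t) = (\hat n^1(t), \hat n^2(t)) \in S$, $0 \le \hat u^0 \le 1$, and $r(\cdot)$ is nondecreasing and left continuous, we can conclude that $v^i \in \mathcal{V}^d[0,T]$, $i = 1, 2$. Moreover, we have
$$x(t) = x + \hat D_{r(t)} + \hat M_{r(t)} + v^1(t) - v^2(t).$$
It can be verified easily that the change of variable $\theta' = \hat x^0(\theta)$ implies
$$\hat D_{r(t)} = \int_s^t b(\theta', x(\theta'), u(\theta'))\,d\theta'. \tag{7.28}$$
By the optional sampling theorem, we know that $M_t \equiv \hat M_{r(t)}$ is an $\hat{\mathcal{F}}_{r(t)}$-martingale. Moreover, we have
$$\langle M\rangle_t = \langle \hat M\rangle_{r(t)} = \int_0^{r(t)} a(\hat x^0(\theta), \hat x(\theta), \hat u(\theta))\,\hat u^0(\theta)\,d\theta = \int_s^t a(\theta, x(\theta), u(\theta))\,d\theta.$$
Applying the martingale representation theorem, there exists a standard extension $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ of the probability base $(\hat\Omega, \hat{\mathcal{F}}, \hat{\mathcal{F}}_{r(t)}, \hat P)$ and a $d$-dimensional Brownian motion $B$ on $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ such that
$$M_t = \int_s^t \sigma(\theta, x(\theta), u(\theta))\,dB(\theta). \tag{7.29}$$
Therefore we have, for $s \le t \le T$,
$$x(t) = x + \int_s^t b(\theta, x(\theta), u(\theta))\,d\theta + \int_s^t \sigma(\theta, x(\theta), u(\theta))\,dB(\theta) + v^1(t) - v^2(t),$$
in other words, $\alpha = (\Omega, \mathcal{F}, \mathcal{F}_t, P, x(t), u(t), v^1(t), v^2(t), s, x)$ is a control for (7.13). It remains to show $J(\alpha) = \hat J(\hat\alpha)$. In fact, by the change of variable
$$\int_s^T f(\theta, x(\theta), u(\theta))\,d\theta = \int_0^\tau f(\hat x^0(\theta), \hat x(\theta), \hat u(\theta))\,\hat u^0(\theta)\,d\theta,$$
and by Lemma B.2(b) in the Appendix, we have
$$\int_s^T c(\theta)\cdot dv^1(\theta) = \int_0^\tau c(\hat x^0(\theta))\cdot \hat n^1(\theta)(1 - \hat u^0(\theta))\,d\theta,$$
$$\int_s^T \bar c(\theta)\cdot dv^2(\theta) = \int_0^\tau \bar c(\hat x^0(\theta))\cdot \hat n^2(\theta)(1 - \hat u^0(\theta))\,d\theta,$$
and therefore
$$J(\alpha) = \hat J(\hat\alpha). \qquad \Box$$

Evidently we now have

Corollary 7.9 For $(t,x) \in \Sigma$,
$$W(t,x) = \hat W(t,x). \qquad \Box$$

7.4 Existence of optimal control laws

A control
$$\alpha = (\Omega, \mathcal{F}, \mathcal{F}_t, P, x(t), u(t), v^1(t), v^2(t), s, x) \in \mathcal{A}_{s,x}$$
is called a control law if there exists a Borel measurable function $\bar u : \Sigma \to U$ such that $u(t,\omega) = \bar u(t, x(t,\omega))$, and for any $s \le t_1 \le t_2 \le T$,
$$v^i(t_2) - v^i(t_1) \in \sigma\big(x(\theta),\ t_1 \le \theta \le t_2\big), \qquad i = 1, 2. \tag{7.30}$$
Now we can state the main theorem of this chapter.

Theorem 7.10 There exists an optimal control law for the problem (2.1).

Proof. By Lemma 7.4 we know that $\hat K(t,x)$ is convex for any $(t,x) \in \Sigma$.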
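Applying Theorem 7.5, there exists an optimal Markovian control
$$\hat\alpha = (\hat\Omega, \hat{\mathcal{F}}, \hat{\mathcal{F}}_t, \hat P, (\hat x^0(t), \hat x(t)), \hat\mu(t), s, x)$$
for the control problem (7.4), with $\hat\mu(t) = (\hat u^0(t), \hat u(t), \hat n(t))$, and
$$\hat u^0(t) = \hat U^0(\hat x^0(t), \hat x(t)), \qquad \hat u(t) = \hat U(\hat x^0(t), \hat x(t)),$$
$$\hat n(t) = \big(\hat N^1(\hat x^0(t), \hat x(t)),\ \hat N^2(\hat x^0(t), \hat x(t))\big),$$
where $\hat U^0(\cdot,\cdot) : \mathbb{R}_+ \times \mathbb{R}^d \to [0,1]$, $\hat U(\cdot,\cdot) : \mathbb{R}_+ \times \mathbb{R}^d \to U$, and $\hat N(\cdot,\cdot) = (\hat N^1(\cdot,\cdot), \hat N^2(\cdot,\cdot)) : \mathbb{R}_+ \times \mathbb{R}^d \to S$ are deterministic Borel measurable functions. As we have seen in the proof of Proposition 7.8, through an extension of the filtered probability space $(\hat\Omega, \hat{\mathcal{F}}, \hat{\mathcal{F}}_t, \hat P)$ and the transforms (7.27) we can get an admissible control
$$\alpha = (\Omega, \mathcal{F}, \mathcal{F}_t, P, x(t), u(t), v^1(t), v^2(t), s, x).$$
By Proposition 7.8 and Corollary 7.9, we know that $\alpha$ is optimal, i.e., $J(\alpha) = W(s,x)$. It remains to show that $\alpha$ is a control law. Note that $v^1$, $v^2$ are orthogonal to each other by Proposition 7.2. Since $\hat x^0(\cdot)$ is continuous, we have $\hat x^0(r(t)) = t$ for $s \le t \le T$. Therefore
$$u(t) = \hat u(r(t)) = \hat U(\hat x^0(r(t)), \hat x(r(t))) = \hat U(t, x(t)).$$
To show (7.30), let
$$J = \{t : x(t) \ne x(t+),\ s \le t \le T\}. \tag{7.31}$$
Then from the equations (7.4) and the definitions (7.27) we can see
$$J \subset \{t : r(t) < r(t+),\ s \le t \le T\}. \tag{7.32}$$
Moreover $r(t) < r(t+)$ implies that $\hat u^0 = 0$ a.e. on $[r(t), r(t+))$. Hence
$$\hat x(r(t+)) = \hat x(r(t)) + \int_{r(t)}^{r(t+)} \big[\hat n^1(\theta) - \hat n^2(\theta)\big]\,d\theta.$$
By the orthogonality of $v^1$, $v^2$ it follows that $dv^1(t)\cdot dv^2(t) = 0$, i.e., $\hat n^1\cdot\hat n^2 = 0$ a.e. on $[r(t), r(t+))$. As $(\hat n^1(\theta), \hat n^2(\theta)) \in S$ a.e., then $\hat n^1(\theta) \ne \hat n^2(\theta)$, so
$$J = \{t : r(t) < r(t+),\ s \le t \le T\}. \tag{7.33}$$
Recall that $r(\cdot)$ is strictly increasing and left continuous. Moreover, if $\hat x^0(\theta) \notin J$, then $r(\hat x^0(\theta)) = \theta$ for $0 \le \theta \le \tau$, where $\tau$ is defined by (7.6), i.e., $\tau = r(T)$. So we can write
$$\begin{aligned}
v^i(t) = V^i(r(t)) &= \int_0^{r(t)} \hat N^i(\hat x^0(\theta), \hat x(\theta))\big(1 - \hat U^0(\hat x^0(\theta), \hat x(\theta))\big)\,d\theta\\
&= \int_0^{r(t)} 1_{\{\hat x^0(\theta) \notin J\}}(\theta)\,\hat N^i(\hat x^0(\theta), \hat x(\theta))\big(1 - \hat U^0(\hat x^0(\theta), \hat x(\theta))\big)\,d\theta\\
&\quad + \int_0^{r(t)} 1_{\{\hat x^0(\theta) \in J\}}(\theta)\,\hat N^i(\hat x^0(\theta), \hat x(\theta))\big(1 - \hat U^0(\hat x^0(\theta), \hat x(\theta))\big)\,d\theta\\
&= I^i(t) + II^i(t).
\end{aligned}$$
It is obvious that if $\hat x^0(\theta) \in J$, then $\hat x^0(\theta') = \hat x^0(\theta)$ for $\theta' \in [r(\hat x^0(\theta)), r(\hat x^0(\theta)+)]$. By definition,
$$\hat x^0(\theta) = s + \int_0^\theta \hat U^0(\hat x^0(\theta'), \hat x(\theta'))\,d\theta',$$
and $0 \le \hat U^0 \le 1$, so we will have $\hat U^0(\hat x^0(\theta'), \hat x(\theta')) = 0$ a.e. on $[r(\hat x^0(\theta)), r(\hat x^0(\theta)+)]$.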
(7.34)Therefore we can rewriteIP(t) = J j\T(50(8), (O’))d8’.0<O<T(t) TQV (0))Hence II(.) (i = 1,2) is a pure jump process, and from the definition of (.) and x(.) we cansee thatII’(t) = ()+, 112(t) = (x),s0<t s0<tChapter 7. The Existence of Optimal Control Laws 114where (Ax0)+ = max{x(O+) — x(9), O}, (Ax0) = —min{x(8+) — x(O), O}. Therefore11(t) Ax0,where Ax0 x(9+) — x(9).We consider the process I(.). By Lemma B.2, and (7.33), noticing that r(i) > 9 if and onlyif O(9) < t, we can write(t)I(t)= j 1{O(O)J}(9)N(X°(9), (9))(1 — U°(°(9), (O)))d9= f 1{o<o<(i)}(9)1{o(o)gJ}(O)N(x°(9), (O))(1 — U°(°(9), (O)))dO= f8<O(O)<t}(9)1{O(6)J}(9)N(X°(O), (9))(1 — U°(°(O), (9)))d9= f1{3O(O)< O(O)J}(9)N(X°(O), (r(°(9)))(1 — U°(°(9), (r(°(9)))))dO= f {s0<t, SØJ}(9) (9, (r(O)))(1 — U°(O, (r(0))))dr(O)= j 1jc(O)T(9,x(O))(1 - U°(9,x(9)))dr(0)= j jT x(O))(1 - U°(9, x(O)))drc(9),where TC(.) denotes the continuous part of the increasing process r(.). DefineR {9: U°(O,x(O)) = 0, s 9 T}.From the definition of T(.), we knowPT(t)t = s + I U°(°(9), (9))dO, s < t < T. (7.35)JoAs above, we can gett — = f U°(9, x(9))drc(9), s <t <T, (7.36)since 0(9) = 0 if 0(9) e J by (7.34). Therefore for any nonnegative Borel function k(.) definedon [s,T],I k(9)drc(9) = f 1 k(9)d0.J[s,T]flRc J[s,T]nRc U°(O, x(9))Chapter 7. The Existence of Optimal Control Laws 115Hence1(t) = I’(t)- 12(t)=1Rc0d0+ j (O, x(O))1R()drC(O)= I°(t) + 1(t),where ]T = J/1 — jçT2• Notice that from (7.36) we can conclude m(R) = 0, where m denotesthe Lebesgue measure on the real line. Therefore I”(.) and Isc(.) are the absolute and singularparts of the continuous bounded variation process 1(.) respectively. Now we consider 18c(t).Recall that B. is defined by (7.25), and by (7.28) we havef 1(9)dD(0)= 0, s t T. (7.37)From (7.29) we know that M E M) is an F 7()-martmga1e, and for s t < T,= (M)T(t) = ja(O,x(O,u(0))dO.Therefore(J 1R(O)dlI/1e) = j 1j(8)a(O, x(O), u(O))d = 0, s t <and hencej 1(O)dM = 0, s t T. 
(7.38)Moreover, from the definition of I” it is obvious thatj 1R()dI(8) = 0, t T. (7.39)From (7.27), we can write= x + DT() + M + Pc(t) + 1sc(t), s twhere rc(.) is the continuous part of the semimartingale x(.). Therefore by (7.37), (7.38), (7.39)and the definition I we havetL 1R(8)dXc(O) = f 1R(9)dI(8) =Chapter 7. The Existence of Optimal Control Laws 116Let v = — v2, we have shown thatvQ) = 1(1) + .UQ) = rcQ) + IQ) + 11(t)= jt(’(° x(9)) - x(8))) U00+ J 1{9: U°(O,x(0))-_O}()+ Ax0,Ss0<tand the three parts are the absolute continuous, singular continuous and jump parts of thebounded variation process v(’) respectively. (7.30) is obviously satisfied by v.Since v1, v2 are orthogonal to each other, we have 1’ = v1 + v2. From the definition of totalvariation we can conclude that (7.30) is satisfied by 13. The proof of the theorem is therefore1 1completed by noticing that v1 = (i3 + v), v2 = $13 — v). U7.5 Some comments(a) The method we have used can be applied to the following more general problem, which issimilar to the one formulated in Fleming and Soner [23], Chapter 8. DefineB—{nE: tInII1}, S={neffl: IInII=1}and let C be a closed cone in lFt’, i.e.,n E C, A> 0 =t An e C.Define S° = S fl C. Assume that c(.,.): [0, T] x C ‘—* 114 is continuous and c > 0 on S. Wealso assume that c(.,.) satisfies the following conditionc(t, Ax) = Ac(t, x)for 0 A 1, x e B n C. Instead of the controls vtm, v2 as in (7.1), we usev(t) = n(0)dE(0).Chapter 7. The Existence of Optimal Control Laws 117The state equation thus becomest t t= x + j b(O, x, uo)dO + j u(8, x, uo)dBo + j n(6)de(6),(s,x)e>D, stT,and the singular control is expressed by a pair of processes n(.) [0, Tj —÷ S° and E: [0, T] —÷1&, progressively measurable with respect to F, with a(s) = 0 and sample paths of inV[0, T]. The corresponding cost function is defined in the formJ E E{1Tf(t, Xt, ut)dt +1sT)c(t, n(t))dut)}.We can get the existence of an optimal control law for this problem. 
First let us relax theconstraint n(9) e S° to n(O) e B° = B n C. Ifv(.)=n(O)d5with n(O) e B°, let n0 C S° be arbitrary, andnQ)if n(t) $ 0n(t) = In(t)IIno if n(t) = 0,(t)= f IIn(O)IIdE(O).Then--v(.)= I ñ(O)dE(O), 1 cQ, n(t))de(t) = I cQt, n(t))d(t),Jo Jo Joand n(O) e S°. The corresponding classical control problem would be (7.4) with ñ(9) C B° and3(a) = Ej [f(,e,ue)2 + c(,ñe)(1 — ü)]d9.The corresponding k defined in (7.7) can be written asEQ, X)= { (a(t, X, u)u°, b(t, x, u)u° + (0)(1 — u°), z)X f(t, X, u)u° + c(t, n)(1 — u°), u0 e [0,1], u C U, n C B0}Chapter 7. The Existence of Optimal Control Laws 118and it can be shown that Lemma 7.4 is still true.Now we outline the changes needed for the proof of Theorem 7.10. As before, we mayassume F to be right continuous. Again Theorem 7.5 gives us Borel functions U°, CT, and N.Let the S°-valued function AT(t, x) be defined from KT(t, .z) just as ñ from n. DefineN(O, X7(O)), if r(O) =n(O)= (r(9+)) - (r(8))otherwise.r(O+)—r(O)Then n(.) is progressively measurable with respect to .F. We also defineE(t)= j IIJcT(10(9), (°II(’ — tT°(±°(9), ft(8))d&.0The proof for the property (7.30) is similar forv(t) E J n(8)d(9).Unfortunately, we are unable to show that there exists a pair (N(.,.),.f)) such that nQ) =N(t, x(t)) and satisfies (7.30).(b) It is well known that under some conditions the value function W(t, x) for the classicalcontrol problem in Section 2.2 satisfies the following flamilton-Jacobi-Bellman equationinf {LMw(t, x) + fr(t, x)} = 0 (7.40)in some generalized sense (e.g., in the viscosity sense), cf. Krylov [50], Lions [55], where t =(u°,a,n1,n2)c E, andci 82 ciA? = at(t,x)88+â(t,x)-r,frQ, x) = f(t, x, u)u° + [cQ) . n1 + e(t) . n2] (1 — u°)with 0 = , = xt, 1 j d, andIo 0â(t,x)= I I0 aQ,x,u)u° )(t,x)= (b(t,x,u)u0+ (n1t) —n2(t))(1 — U0))Chapter 7. The Existence of Optimal Control Laws 119We can rewrite (7.40) asinf{uo [CUW(t, x) + f(i, x, u)] + (1 — u°)[n’ . 
(VW(t, x) + c(t)) (7.41)(—VW(t, x) + ë(t))]} = 0,whered 82 dLu= aij(t,r,u)8 +b(t,x,u)—. (7.42)ij1 t=1Note that (7.41) is exactly the same asmin{ inf (LuW(t, x) + f(, x, u)),uEUow ow . •1--—-(t,x)+ (t), —--—-(t,x)+ e(), = 1,2,••,d) = 0,which is derived in Section 5.3 by the dynamic programming principle.Appendix ASet-valued functionsIn this part, we will finish the proof of Lemma 6.1. We first recall recall some results from theset-valued mapping theory.Let X be a separable metric space with a metric 7. We denote by Comp(X) the space ofall the compact subsets of X and define a metric p(K1,K2) between two points of Comp(X)byp(K1,K2)= inf{e> 0,1(1 C I( and “2 C I(},where for any set A C X, A = {y, 7(x,y) < e for some x E A}, in other words, A° is thesphere around A of radius e. Then (Comp(X), p) is a separable metric space.The following results were used in Chapter 6, we write them down here without proofs. Fordetails, see Stroock and Varadhan [73].Proposition A.1 (a) Let f(x) be a real valued upper (lower) semicontinuous function on X.Consider the maps .7(K) (f(K), respectively) and f(K) induced by f that map Gomp(X) intoR and Comp(X), respectively as follows:f(1() = sup fix), (f(K) = inf f(x)),ccEK —f(K) = {y eK: f(y) = J(K)}, (f(K) = {y C K: f(y) =Then the maps K ‘— J(I() (K i.-÷ f(K), respectively) and K —÷ f(K) are Borel maps ofComp(X) into 18 and Comp(X), respectively.(b) Let Y be a metric space and B its Borel a-field. Let y i—* K be a map of Y intoComp(X) for some separable metric space X. Suppose for any sequence y —÷ y, x,-, e K,3,120Appendix A. Set-valued functions 121it is true that x. has a limit point x in K, then the map y ‘- K,, is a Borel map of Y intoComp(X).(c) Let (F, .1) be any measurable space and q ‘—÷ K, a measurable map of F into Gomp(X).There is a measurable map q ‘—p h(q) of F into X such that h(q) C K for every q C F. 
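The metric $\rho$ on $\mathrm{Comp}(X)$ introduced at the start of this appendix is the Hausdorff metric. As a rough illustration only, and under the assumption (not made in the thesis) that the compact sets are finite subsets of Euclidean space, it can be computed directly. The function name `hausdorff` and the sample sets are hypothetical.

```python
import math

def hausdorff(K1, K2):
    """Hausdorff distance between two nonempty finite subsets of R^n.

    For finite sets this equals inf{eps > 0 : K1 is contained in the
    eps-neighbourhood of K2 and K2 in the eps-neighbourhood of K1}.
    """
    def directed(A, B):
        # Largest distance from a point of A to its nearest point in B.
        return max(min(math.dist(a, b) for b in B) for a in A)
    return max(directed(K1, K2), directed(K2, K1))

# Hypothetical example in the plane.
K1 = [(0.0, 0.0), (1.0, 0.0)]
K2 = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
distance = hausdorff(K1, K2)
```

Here every point of `K1` already lies in `K2`, so the distance is determined entirely by how far the extra point of `K2` is from `K1`; both directions are needed for $\rho$ to be symmetric.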
The map $h$ is called a measurable selector of $K_\cdot$.

With these results we can prove the following proposition which is used in Chapter 6. Recall that the function $h(\cdot,\cdot)$ is defined in the proof of Lemma 6.1.

Proposition A.2 The function $h(t,\cdot) : \mathbb{R}^d \to \mathbb{R}^k$ is Borel measurable for each fixed $t$.

Before proving Proposition A.2, we define a map $F : \mathrm{Comp}(\mathbb{R}^k) \to \mathbb{R}^k$ by the following: for $K \in \mathrm{Comp}(\mathbb{R}^k)$, let
$$\begin{aligned}
F^1(K) &= \max\{h^1 : (h^1, h) \in K \text{ for some } h \in \mathbb{R}^{k-1}\},\\
F^2(K) &= \max\{h^2 : (F^1(K), h^2, h) \in K \text{ for some } h \in \mathbb{R}^{k-2}\},\\
&\;\;\vdots\\
F^k(K) &= \max\{h^k : (F^1(K), \ldots, F^{k-1}(K), h^k) \in K\},
\end{aligned}$$
and $F(K) = (F^1(K), F^2(K), \ldots, F^k(K))$.

Lemma A.3 $F : \mathrm{Comp}(\mathbb{R}^k) \to \mathbb{R}^k$ is Borel measurable.

Proof. For $h = (h^i) \in \mathbb{R}^k$, define $f_i(h) = h^i$ $(1 \le i \le k)$, which are real valued continuous functions on $\mathbb{R}^k$. Under the notations of Proposition A.1, it is easy to see that $F^1(K) = \bar f_1(K)$ and hence $F^1 : \mathrm{Comp}(\mathbb{R}^k) \to \mathbb{R}$ is Borel measurable. Also, it is obviously true that
$$F^2(K) = \bar f_2 \circ \hat f_1(K),$$
and more generally,
$$F^i(K) = \bar f_i \circ \hat f_{i-1} \circ \cdots \circ \hat f_1(K)$$
for $1 < i \le k$, where $\circ$ denotes the composition of functions. By Proposition A.1 we get the Borel measurability of $F^i(\cdot)$, and therefore $F : \mathrm{Comp}(\mathbb{R}^k) \to \mathbb{R}^k$ is Borel measurable. □

Next, we define a map $K_\cdot : \mathbb{R}^d \to \mathrm{Comp}(\mathbb{R}^k)$ by
$$K_x = \{h : h \in \mathbb{R}^k_+,\ W(t,x) = W(t, x + gh) + c(t)\cdot h\} \tag{A.1}$$
for $x \in \mathbb{R}^d$. From the Lipschitz continuity of the function $W(t,\cdot)$, the positivity of $c(t)$, and the fact that $W$ is bounded below, it can be easily seen that $K_x \in \mathrm{Comp}(\mathbb{R}^k)$.

Lemma A.4 The map $K_\cdot : \mathbb{R}^d \to \mathrm{Comp}(\mathbb{R}^k)$ is Borel measurable.

Proof. By Proposition A.1(b) it is enough to show that for any $x_n \to x$ in $\mathbb{R}^d$, $h_n \in K_{x_n}$, it is true that $h_n$ has a limit point $h \in K_x$. In fact, since $W$ is bounded and $c(\cdot)$ are strictly positive, it is obvious that $h_n$ is bounded. Therefore there is a point $h \in \mathbb{R}^k_+$ such that $h_n \to h$ (along a subsequence), and from the continuity of $W(t,\cdot)$ we know that $h \in K_x$. □

Proof of Proposition A.2. The proof follows from Lemma A.3 and A.4 easily. In fact, for $t$ fixed, $h(t,x) = F(K_x)$, i.e., $h(t,\cdot)$ is the composition of the maps $K_\cdot : \mathbb{R}^d \to \mathrm{Comp}(\mathbb{R}^k)$ and $F(\cdot) : \mathrm{Comp}(\mathbb{R}^k) \to \mathbb{R}^k$. The measurability of $h(t,\cdot)$
follows from that of $K_\cdot$ and $F$. □

Appendix B
Some results from real analysis

The following result is used in Proposition 7.7. Recall that $\tau$, $r(\cdot)$ are defined by (7.6) and (7.24) respectively.

Lemma B.1 For any nonnegative Borel measurable function $k(\cdot)$ on $\mathbb{R}_+$,
$$\int_s^T k(\theta)\,d\check v(\theta) = \int_0^\tau k(\hat x^0(\theta))(1 - \hat u^0(\theta))\,d\theta. \tag{B.2}$$

Proof. First we assume $k(\cdot) = 1_{[s,t)}(\cdot)$, with $t \le T$. Note that $r(t) > \theta$ if and only if $\hat x^0(\theta) < t$, therefore
$$\begin{aligned}
\int_0^\tau 1_{\{\hat x^0(\theta) < t\}}(\theta)(1 - \hat u^0(\theta))\,d\theta &= \int_0^\tau 1_{\{0 \le \theta < r(t)\}}(\theta)(1 - \hat u^0(\theta))\,d\theta\\
&= \int_0^{r(t)} (1 - \hat u^0(\theta))\,d\theta\\
&= r(t) - (\hat x^0(r(t)) - s)\\
&= r(t) - t + s\\
&= \check v(t),
\end{aligned}$$
which is exactly the left hand side of (B.2) for $k(\cdot) = 1_{[s,t)}(\cdot)$. Using the difference of two such functions we know that (B.2) is also true for the functions of the form $k = 1_{[t_1,t_2)}$. Now applying the monotone class argument we can show the general case. □

Let $u^0(\cdot)$, $k(\cdot)$ be nonnegative Borel measurable functions on $\mathbb{R}_+$, and $0 \le u^0(\cdot) \le 1$. Define
$$f(t) = s + \int_0^t u^0(\theta)\,d\theta, \qquad 0 \le t < \infty,$$
$$g(t) = \inf\{\theta : f(\theta) \ge t\} \qquad (\inf\emptyset = \infty),$$
$$v(t) = \int_0^{g(t)} k(\theta)\,d\theta.$$
The following results were used in the proofs of Proposition 7.8 and Theorem 7.10.

Lemma B.2 (a) For any nonnegative Borel measurable function $l(\cdot)$ on $[s,T]$,
$$\int_0^{g(T)} l(f(\theta))\,d\theta = \int_s^T l(\theta)\,dg(\theta). \tag{B.3}$$
(b) For any nonnegative Borel measurable function $l(\cdot)$ defined on $[s,T]$,
$$\int_s^T l(\theta)\,dv(\theta) = \int_0^{g(T)} l(f(\theta))k(\theta)\,d\theta. \tag{B.4}$$

Proof. (a) As in the proof of Lemma B.1, we need only to show the case when $l(\theta) = 1_{[s,t)}(\theta)$ for any $t \in [0,T]$. Note that $g(t) > \theta$ if and only if $f(\theta) < t$, therefore
$$\int_0^{g(T)} 1_{[s,t)}(f(\theta))\,d\theta = m(\theta : f(\theta) < t) = m(\theta : 0 \le \theta < g(t)) = g(t),$$
which is exactly the right hand side of (B.3) ($m$ is the Lebesgue measure on $\mathbb{R}$).

(b) As before, we may assume $l(\theta) = 1_{[s,t)}(\theta)$ for some $s \le t \le T$. Then it follows that
$$\int_0^{g(T)} 1_{\{s \le f(\theta) < t\}}(\theta)k(\theta)\,d\theta = \int_0^{g(T)} 1_{\{0 \le \theta < g(t)\}}(\theta)k(\theta)\,d\theta = \int_0^{g(t)} k(\theta)\,d\theta = v(t),$$
which, by definition, is equal to the left hand side of (B.4). □

Remark B.3 Dellacherie and Meyer [17] VI 54-55 gives a slightly different but more general form of Lemma B.2(a). □

Bibliography

[1] J. P. Aubin and A.
Cellina, Differential Inclusions, Springer-Verlag, 1984
[2] F. M. Baldursson, Singular stochastic control and optimal stopping, Stochastics, 21(1987), pp. 1-40
[3] J. A. Bather and H. Chernoff, Sequential decision in the control of a spaceship, Proc. Fifth Berkeley Symp. of Math. Stat. and Probab., Vol. 3, Univ. of California Press, Berkeley, 1967, pp. 181-207
[4] V. E. Beneš, L. A. Shepp and H. S. Witsenhausen, Some solvable stochastic control problems, Stochastics, 4(1980), pp. 39-83
[5] A. Bensoussan and J. L. Lions, Applications des Inéquations Variationnelles en Contrôle Stochastique, Dunod, Paris, 1978
[6] P. Billingsley, Convergence of Probability Measures, Wiley, New York, 1968
[7] M. I. Borodowski, A. S. Bratus and F. L. Chernous'ko, Optimal impulse correction under random perturbation, J. Appl. Math. Mech., 39(1975), 797-805
[8] A. S. Bratus, Solution of certain optimal correction problems with error of execution on the control action, J. Appl. Math. Mech., 38(1974), 433-440
[9] D. S. Bridge and S. E. Shreve, Multi-dimensional finite-fuel singular stochastic control, Preprint, 1991
[10] F. L. Chernous'ko, Optimum correction under active disturbances, J. Appl. Math. Mech., 32(1968), 203-208
[11] F. L. Chernous'ko, Self-similar solution of the Bellman equation for optimal correction of random disturbances, J. Appl. Math. Mech., 35(1971), 333-342
[12] M. Chiarolla and U. G. Haussmann, The free boundary of the monotone follower, to appear in SIAM J. Control and Opt.
[13] M. Chiarolla and U. G. Haussmann, The optimal control of the cheap monotone follower, to appear in Stochastics
[14] P. L. Chow, J. L. Menaldi and M. Robin, Additive control of stochastic linear system with finite time horizon, SIAM J. Control and Opt., 23(1985), pp. 858-99
[15] M. G. Crandall, H. Ishii and P. L. Lions, A user's guide to viscosity solutions, Bulletin of A.M.S., N.S. 27(1992), pp. 1-67
[16] M. H. A. Davis and A. R. Norman, Portfolio selection with transaction costs, Math. Oper. Res., 15(1990), pp. 676-713
[17] C. Dellacherie and P. A.
Meyer, Probabilities and Potential, North-Holland, Mathematics Studies 29, 1978
[18] N. El Karoui, Huu Nguyen and M. Jeanblanc-Picqué, Compactification methods in the control of degenerate diffusions: Existence of an optimal control, Stochastics, 20(1987), pp. 169-219
[19] N. El Karoui, Huu Nguyen and M. Jeanblanc-Picqué, Existence of an optimal Markovian filter for the control under partial observations, SIAM J. Control and Opt., 26(1988), pp. 1025-61
[20] N. El Karoui and I. Karatzas, Probabilistic aspects of finite-fuel, reflected follower problem, Acta Appl. Math., 11(1988), pp. 223-258
[21] S. N. Ethier and T. G. Kurtz, Markov Processes, Characterization and Convergence, John Wiley & Sons, 1986
[22] W. H. Fleming and R. W. Rishel, Deterministic and Stochastic Optimal Control, Springer-Verlag, New York, 1975
[23] W. H. Fleming and H. M. Soner, Controlled Markov Processes and Viscosity Solutions, Springer-Verlag, New York, 1993
[24] A. Friedman, Variational Principles and Free Boundary Problems, John Wiley & Sons, New York, 1982
[25] J. M. Harrison, T. M. Sellke and A. J. Taylor, Impulse control of Brownian motion, Math. Oper. Res., 8(1983), pp. 454-466
[26] J. M. Harrison and M. I. Taksar, Instantaneous control of Brownian motion, Math. Oper. Res., 8(1983), pp. 439-53
[27] J. M. Harrison and A. J. Taylor, Optimal control of a Brownian storage system, Stoch. Proc. Appl., 6(1978), pp. 179-94
[28] U. G. Haussmann, A Stochastic Maximum Principle for Optimal Control of Diffusions, Pitman Research Notes in Math. Series 151, 1986
[29] U. G. Haussmann, Existence of optimal Markovian controls for degenerate diffusions, Lecture Notes in Control and Information Science 78(1986), pp. 171-86
[30] U. G. Haussmann and J. P. Lepeltier, On the existence of optimal control, SIAM J. Control and Opt., 28(1990), pp. 851-902
[31] U. G. Haussmann and W. Suo, Singular stochastic controls I: Existence of optimal controls, to appear in SIAM J.
on Control and OptimizationBibliography 127[32] , Singular optimal controls H: The dynamic programming principle and applications, to appear in SIAM J. on Control and Optimization[33] , Existence of singular optimal control laws for stochastic differential equations,to appear in Stochastics[34] A. C. Heinricher and V. J. Mizel, A stochastic control problem with different value functions for singular and absolutely continuous control, Proc. of 25th Conf. on Decision andControl, 1986[35] N. Ikeda and S. Watanabe, Stochastic Differential Equations and Diffusion Processes,North-Holland, Amsterdam, 1981[36] II. Ishii. Uniqueness of unbounded viscosity solutions of Hamilton-Jacob equations, IndianaU. Math. J., 26(1984), pp721-48[37] 5. D. Jacka, A finite fuel stochastic control problem, Stochastics, 10(1983), pp103-1[38] J. Jacod and J. Mémin, Stir un type de convergence intermédiaire entre la convergence enloi et Ia convergence en probabilitd, Séminaire de Probabilité XV, Lect. Notes in Math850, Spinger-Verlag, 1981, pp. 529-540[39] J. Jacod and A. Shiryaev, Limit Theorems for Stochastic Processes, Springer-Verlag,Berlin, 1987[40] I. Karatzas, The monotone follower problem in stochastic decision theory, Appl. Math.Opt., 7(1981) pp175-89[41] , A class of singular control problems, Adv. Appl. Probab., 15(1983),pp225-54[42] , Stochastic control under finite fuel constraints, The IMA Volumes in Math. and ItsAppl. Vol. 10, 1988[43] I. Karatzas, J. P. Lehoczky, S. P. Sethi and S. E. Shreve, Explicit solution of a generalconsumption/investment problem, Math. Oper. Res., 11(1986), pp261-94[44] I. Karatzas, J. P. Lehoczky and S. E. Shreve, Optimal portfolio and consumption decisionfor a small investor, SIAM J. Control and Opt., 25(1987),pp1557-86[45] I. Karatzas and S. E. Shreve, Brownian Motion and Stochastic Calculus, Springer-Verlag,NY, 1987[46] , Connections between optimal stopping and stochastic control: L Monotonefollower problems, SL&M J. 
Control and Opt., 22(1984),pp856-77[47] , Connections between optimal stopping and stochastic control: IL Reflectedfollower problems, SIAM J. Control and Opt., 22(1985), pp433-51[48] , Equivalent models for finite-fuel stochastic control, Stochastics, 18(1986),pp24S-276Bibliography 128[49] E. V. Krichagina and M. I. Taksar, Diffusion approximation for GI/G/1 controlled queue,Queuing System, 12(1992), pp333-68[50] N. Krylov, Controlled Diffusion Processes, Springer-Verlag, New York, 1980[51] T. G. Kurtz, Random time changes and convergence in distribution under Meyer-Zhengconditions, The Annals of Probab., 19(1991), pplOlO-1034[52] II. J. Kushner and K. M. Ramachandran, Nearly optimal singular controls for wide bandnoise driven systems, SIAM J. Control and Opt., 26(1988),pp569-91[53] R. Larsen, Functional Analysis, An Introduction, Marcel Dekker, Inc. New York, 1973[54] J. P. Lehoczky and S. E. Shreve, Absolutely continuous and singular stochastic control,Stochastics, 17(1986), pp91-109[55] P. L. Lions, Optimal control of diffusion processes and HJB equations, Part 1: The dynamic programming principle and applications; Part 2: Viscosity solutions and uniqueness, Comm. in PDE, 8(1983), ppllOl-l174; pp1129-1276[56] P. L. Lions and J. L. Menaldi, Optimal control of stochastic integrals and Hamilton-Jacob-Bellman equations I, IL SIAM J. Control and Opt., 20(1982), ppS8-81, pp82-9S[57] P. L. Lions and A. S. Sznitman, Stochastic differential equations with reflecting boundaryconditions, Comm. Pure Appl. Math., 37(1984), ppSll-537[58] J. Ma, On the principle of smooth fit for a class of singular stochastic control problemsfor diffusion, SIAM J. Control and Opt., 30(1992), pp. 975-999[59] L. F. Martins and H. J. Kushner, Routing and singular control for queuing network inheavy traffic, SIAM J. Control and Opt., 28(1990),pp1209-33[60] J. L. 
Menaldi, On the optimal stopping time problem for degenerate diffusions, SIAM J.Control and Opt., 6(1980), pp697-721[61] , On the optimal impulse control problems for degenerate diffusions, SIAM J. Controland Opt., 18(1980), pp722-39[62] J. L. Menaldi and M. Robin, On some cheap control problems for diffusion processes,T.A.M.S., 278(1983), pp771-802[63] , On singular stochastic control problems for diffusions with jumps, IEEE Trans.Auto. Control, AC-29(1984), pp991-1004[64] J. L. Menaldi and E. Rofman, On stochastic control problems with impulse cost vanishing,Proc. International Symposium on Semi-Infinite Programming and Appl., Lect. Notes inEconom. and Math. Sys. 215, Springer-Verlag, New York, 1983, pp281-94[65] J. L. Menaldi and M. I. Taksar, Optimal correction problem of a multidimensional stochastic system, Automatica, 23(1989),pp223-32Bibliography 129[66] P. A. Meyer and W. A. Zheng, Tightness criteria for laws of semimartingales, Ann. Inst.Henri Poincaré, 20(1984), pp353-72[67] S. E. Shreve, An introduction to singular stochastic control, The IMA Volumes in Math.and Its AppL Vol. 10, 1988[68] S. E. Shreve, J. P. Lohoczky and D. P. Gayer, Optimal consumption for general diffusionwith absorbing and reflecting barriers, SIAM J. Control and Opt., 22(1984), pp55-7[69] 5. E. Shreve and H. M. Soner, Optimal investment and consumption with transactioncosts, Res. Report No. 92-SA-001, Dept. of Math., Carnegie Mellon Univ.[70] A. V. Skorohod, Limit theorems for stochastic processes, Theor. Prob. Appl., 1(1956),pp26l-284[71] H. M. Soner and S. E. Shreve, Regularity of the value function for a two-dimensionalsingular stochastic control problem, SIAM J. Control and Opt., 27(1989), pp876907[72] , A free boundary problem related to singular stochastic control: the paraboliccase, Comm. Partial Duff. Equations, 16(1991),pp373-424[73] D. ‘N. Stroock and S.R.S. Varadhan, Multidimensional Diffusion Processes, Springer-Verlag, 1979[74] M. 
Sun, Singular control problems in bounded intervals, Stochastics, 10(1983), pplO3-ll3[75] M. Sun and J. L. Menaldi, Monotone control of a damped oscillator under random perturbations, IMA Journal of Math. Control and Infor. 5(1988), pp169-86[76] M. I. Taksar, Storage model with discontinuous holding cost, Stochastic Pro. and TheirAppl., 18(1984), pp291-300[77] , Average optimal singular control and a related stopping prolem, Math. Oper. Res.,10(1985), pp63-81[78] P. van Moerbeke, On optimal stopping and free boundary problems, Arch. RationalMech. Anal., 60(1976), pp101-48[79] S.R.S. Varadhan and R. J. Wiffiams, Brownian motion in a wedge with oblique reflection,Comm. Pure Appl. Math., 38(1985), pp405-43[80] 5. A. Williams, P. L. Chow and J. L. Menaldi, Regularity of the free boundary in singularstochastic control, to appear in J. Duff. Equations[81] H. Zhu, Variational inequalities and dynamic programming for singular stochastic control,Ph.D. Thesis, Brown University, 1991
Title | The existence of optimal singular controls for stochastic differential equations |
Creator | Suo, Wulin |
Date Issued | 1994 |
Description | We study a singular control problem where the state process is governed by an Ito stochastic differential equation allowing both classical and singular controls. By reformulating the state equation as a martingale problem on an appropriate canonical space, it is shown, under mild continuity conditions on the data, that an optimal control exists. The dynamic programming principle for the problem is established through the method of conditioning and concatenation. Moreover, it is shown that there exists a family of optimal controls such that the corresponding states form a Markov process. When the data is Lipschitz continuous, the value function is shown to be uniformly continuous and to be the unique viscosity solution of the corresponding Hamilton-Jacobi-Bellman variational inequality. We also provide a description of the continuation region, the region in which the optimal state process is continuous, and we show that there exists a family of optimal controls which keeps the state inside the region after a possible initial jump. The last part is independent of the rest of the thesis. Through stretching of time, the singular control problem is transformed into a new problem that involves only classical control. Such problems are relatively well understood. As a result, it is shown that there exists an optimal control where the classical control variable is in Markovian form and the increment of the singular control variable on any time interval is adapted to the state process on the same time interval. |
Extent | 2186410 bytes |
Genre | Thesis/Dissertation |
Type | Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-04-08 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0080009 |
URI | http://hdl.handle.net/2429/6966 |
Degree | Doctor of Philosophy - PhD |
Program | Mathematics |
Affiliation | Science, Faculty of; Mathematics, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1994-05 |
Campus | UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |