Open Collections

UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Implementable algorithms for stochastic nonlinear programs with applications to portfolio selection and… Kallberg, Jarl Gunnar 1979

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
831-UBC_1979_A1 K34.pdf [ 7.21MB ]
Metadata
JSON: 831-1.0094923.json
JSON-LD: 831-1.0094923-ld.json
RDF/XML (Pretty): 831-1.0094923-rdf.xml
RDF/JSON: 831-1.0094923-rdf.json
Turtle: 831-1.0094923-turtle.txt
N-Triples: 831-1.0094923-rdf-ntriples.txt
Original Record: 831-1.0094923-source.json
Full Text
831-1.0094923-fulltext.txt
Citation
831-1.0094923.ris

Full Text

I M P L E M E N T A B L E A L G O R I T H M S F O R S T O C H A S T I C N O N L I N E A R P R O G R A M S W I T H A P P L I C A T I O N S T O P O R T F O L I O S E L E C T I O N A N D R E V I S I O N by J A R L G U N N A R K A L L B E R G B.Sc.(Hons.)., The U n i v e r s i t y of B r i t i s h C o l u m b i a , 1972 M.Sc., Simon F r a s e r U n i v e r s i t y , 1974 A T H E S I S S U B M I T T E D I N P A R T I A L F U L F I L M E N T O F O T H E R E Q U I R E M E N T S F O R T H E D E G R E E O F D O C T O R O F P H I L O S O P H Y i n T H E F A C U L T Y O F G R A D U A T E S T U D I E S (Commerce & Business A d m i n i s t r a t i o n ) We accept t h i s t h e s i s as conforming to the r e q u i r e d standard THE UNIVERSITY OF BRITISH COLUMBIA J u l y .19 79 © JARL GUNNAR KALLBERG, 19 79 I n p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t o f t h e r e q u i r e m e n t s f o r a n a d v a n c e d d e g r e e a t t h e U n i v e r s i t y o f B r i t i s h C o l u m b i a , I a g r e e t h a t t h e L i b r a r y s h a l l m a k e i t f r e e l y a v a i l a b l e f o r r e f e r e n c e a n d s t u d y . I f u r t h e r a g r e e t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y p u r p o s e s m a y b e g r a n t e d b y t h e H e a d o f my D e p a r t m e n t o r b y h i s r e p r e s e n t a t i v e s . I t i s u n d e r s t o o d t h a t c o p y i n g o r p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l n o t b e a l l o w e d w i t h o u t my w r i t t e n p e r m i s s i o n . D e p a r t m e n t o f T h e U n i v e r s i t y o f B r i t i s h C o l u m b i a 2 0 7 5 w e s b r o o k P l a c e V a n c o u v e r , C a n a d a V 6 T 1W5 D E - 6 B P 75-51 1 E ABSTRACT This d i s s e r t a t i o n has two main o b j e c t i v e s : f i r s t , to develop e f f i c i e n t algorithms f o r the s o l u t i o n of one and two pe r i o d c o n s t r a i n e d o p t i m i z a t i o n problems, and second, to apply these methods to the s o l u t i o n of p o r t f o l i o s e l e c t i o n and r e v i s i o n problems. The algorithms developed are based upon the Frank-Wolfe method. A convergent a l g o r i t h m i s developed which modofies t h i s approach to allow f o r sequences of approximations to the o b j e c t i v e and to the gradient of the o b j e c t i v e , as w e l l as ine x a c t l i n e a r searches. By u t i l i z i n g v a r y i n g degrees of accuracy (with i n c r e a s i n g p r e c i s i o n as the optimum i s app-roached) , the method w i l l be computationally more t r a c t a b l e than f i x e d t o l e r a n c e methods without s a c r i f i c i n g the conver-gence p r o p e r t i e s . This a l g o r i t h m i s then a p p l i e d to a s t a t i c p o r t f o l i o s e l e c t i o n problem. Here the i n v e s t o r has a wealth allotment to be a l l o c a t e d to a number of p o s s i b l e r i s k y investments wi t h the o b j e c t i v e of maximizing the expected u t i l i t y of t e r m i n a l wealth. The i n v e s t o r ' s preferences are assumed to be given by a (von-Neumann-Morgenstern) u t i l i t y f u n c t i o n . For the e m p i r i c a l s t u d i e s seven c l a s s e s of u t i l i t y f u n c t i o n s and ten j o i n t normally d i s t r i b u t e d assets are used. One quest-i o n i n v e s t i g a t e d i s the degree t o which the Arrow-Pratt r i s k a version measure determines p o r t f o l i o composition. The emp-i r i c a l r e s u l t s are augmented by a theorem showing (for normally d i s t r i b u t e d s e c u r i t y returns) t h a t the Rubinstein g l o b a l r i s k a v e rsion measure i s s u f f i c i e n t t o determine p r t f o l i o composition. The second p a r t of t h i s d i s s e r t a t i o n deals w i t h two p e r i o d problems. An alg o r i t h m , based on the method of Hogan f o r extremal value f u n c t i o n s , i s developed. The extensions ( i i ) (and subsequent advantages) are analogous to those developed f o r the one p e r i o d problem. This method i s used t o solve a p o r t f o l i o r e v i s i o n problem u t i l i z i n g f i v e j o i n t normally d i s t r i b u t e d assets and prop-o r t i o n a l t r a n s a c t i o n c o s t s . E m p i r i c a l l y , i t i s shown th a t s i g n i f i c a n t e r r o r s are generated by i g n o r i n g the r e v i s i o n aspect of the problem, even wi t h s e r i a l l y u n c o r r e l a t e d r e t u r n s . ( i i i ) TABLE OF CONTENTS Introduction 1 § 1. The Frank-Wolfe Algorithm with Function and Gradient Approximations 11 §1.1 The Frank-Wolfe Algorithm and Some Extensions 12 §1.2 A Method of Feasible Directions Using Function Approximations 19 §1.3 Related Concepts of Approximation 25 §1.4 A Frank-Wolfe Algorithm Using Approximations 28 Appendix: Convergence Theories 42 §A.l Point-to-Set Maps and the Convergence Theory of Zangwill 43 §A.2 Algorithm Models 48 §A.3 Other Approaches to Convergence Theory 51 §2. The Solution of P o r t f o l i o Selection Problems 54 §2.1 P o r t f o l i o Selection Models 55 §2.2 Measures of Risk Aversion 61 §2.3 The P o r t f o l i o Selection Problem 72 §2.4 Summary 113 Appendix: Implementation 115 § 3. Frank-Wolfe Algorithms Using Approximate D i r e c t i o n a l Derivatives 121 §3.1 A Frank-Wolfe Algorithm Using D i r e c t i o n a l Derivatives 122 §3.2 Approximate D i r e c t i o n a l Derivatives 130 Appendix: Topological Concepts and Related Theorems 140 (iv) §4. P o r t f o l i o Revision Problems 143 §4.1 Dynamic Programming Formulations 144 §4.2 Extensions of Mean-Variance Analysis and Other Models 1 4 8 §4.3 A General P o r t f o l i o Revision Model Including Consumption, Borrowing and Taxes 155 §4.4 The Expected U t i l i t y Model 1 5 9 §4.5 Results 1 6 6 Appendix 1: Linear Programs with One Constraint and Upper Bounded Variables ^5 Appendix 2: Computing the Conditional D i s t r i b u t i o n 1^ 6 Appendix 3: The Expression for the Di r e c t i o n a l Derivative Appendix 4: Transaction Costs / J References 1^1 (v) L i s t of Tables 2.1 U t i l i t y Function P r o p e r t i e s 73 2.2 S e c u r i t y Means and Variances 78 2.3 S e c u r i t y C o r r e l a t i o n M a t r i x 78 2.4 Monthly, Q u a r t e r l y and Yea r l y Covariance Matrices 79 2.5 U t i l i t y Functions and Parameter Ranges 81 2.6 P o r t f o l i o Composition f o r Various U t i l i t y Functions when R A = .5 99 2.7 P o r t f o l i o Composition f o r Various U t i l i t y Functions when R = 1 • 100 2.8 P o r t f o l i o Composition f o r Various U t i l i t y Functions when R A = 2 101 2.9 P o r t f o l i o Composition f o r Various U t i l i t y Functions when R A = 4 10 3 2.10 P o r t f o l i o Composition f o r Various U t i l i t y Functions when R A = 10 . 1 0 4 2.11 E r r o r Associated w i t h the Approximate Measure w QR A f o r Changing I n i t i a l Wealth 106 2.12 E r r o r Associated w i t h the Approximate Measure W Q R a f o r Changing I n i t i a l Wealth f o r Negative Power U t i l i t y Function w i t h $ 6 = .5 106 4.1 Re v i s i o n S o l u t i o n w i t h Q u a r t e r l y Data and Exponential U t i l i t y 170 (vi) L i s t of Figures 1 . 1 An Example of Wolfe's Away Step 1 6 1 . 2 A Flowchart for Kless i g and Polak's Feasible Direction Algorithm with Function Approximations 2 2 1 , 3 A Flowchart for the Extended Algorithm 4 0 A l . 1 Jamming 5 1 2 . 0 Risk Premium and Certainty Equivalent 6 2 2 . 1 Quadratic Monthly 8 2 2 . 2 Exponential Monthly 8 2 2 . 3 Log Monthly 8 3 2 . 4 Special Exponential Monthly 8 3 2 . 5 Power Monthly 8 4 2 . 6 Negative Power Monthly 8 4 2 . 7 Arctan Monthly 8 5 2 . 8 Quadratic Quarterly 8 5 2 . 9 Exponential Quarterly 8 6 2 . 1 0 Log Quarterly 8 6 2 . 1 1 Special Exponential Quarterly 8 7 2 . 1 2 Power Quarterly 8 7 2 . 1 3 Negative Power Quarterly 8 8 2 . 1 4 Arctan Quarterly 8 8 2 . 1 5 Quadratic Yearly 8 9 2 . 1 6 Exponential Yearly 8 9 (vii) 2.17 Log Y e a r l y 90 2.18 Power Y e a r l y 90 2.19 Negative Power Yearly 91 2.20 S p e c i a l E x p o n e n t i a l Yearly 91 2.21 Arctan Y e a r l y 92 2.22 Percentage of Maxim-urn Variance versus 95 2.2 3 D i f f e r e n c e s i n Cash Equiv a l e n t Using the Measure w R, w i t h Changing I n i t i a l Wealth 107 o A ^ 3 2.24. Covariance E r r o r s Only 112 2.25 Mean and Covariance E r r o r s 112 4.1 Optimal Stock Shares 150 4.2 Kamin's No Transaction Region 152 4.3 Revision and S t a t i c S o l u t i o n s Q u a r t e r l y Data, Exponential U t i l i t y 171 4.4 The Transaction Cost Function and Approximation 179 ( v i i i ) Acknowledgements Sincere thanks go to Prof. W. T. Ziemba f o r h i s s k i l l e d d i r e c t i o n throughout my s t u d i e s as a d o c t o r a l student. His advice, support and knowledge have been major i n f l u e n c e s i n my work. The other members of my d i s s e r t a t i o n committee have a l s o s u b s t a n t i v e l y improved t h i s work and s p e c i a l mention i s due Professors D. Wehrung and A. Kraus. As w e l l , my g r a t i t u d e goes to Pr o f . P. Weinstein who f o r many years has been an important i n f l u e n c e on me. I would l i k e to thank the other graduate students at UBC who contibuted much to my l i f e here. These inc l u d e A. Amershi, E. Choo, Y. Gerchak, D. K i r a , E. Mak, E. M. Matsumura, M. Newman, M. Shin, G. Sick and B. S i n c l a i r . S p e c i a l thanks go to D. Hausch and M. Kusy. F i n a l l y , my acknowledgements end c l o s e r to home. For t h e i r help, encouragement and s a c r i f i c e s I thank my wife Peggy and e s p e c i a l l y my parents, to whom t h i s work i s dedicated. (ix) NOTATION x£A means x i s a member of the set A. x^A means x i s not a member of the set A. (a,b) = {x: a < x < b> . [a,bl = (x: a < x < b> . R n denotes E u c l i d i a n n-space. R + = {x: x >_ 0> . R* = l- 0 0,-*- 0 0] . A c B means t h a t A i s a subset of B. A x B i s the C a r t e s i a n product of the sets A and B = {(a,b): a£A, b£B> . A 2 i s the set of a l l subsets of A. A/B = (a: a£A, a£B> . UA^ . = (a: there e x i s t s k such t h a t af^ -A^ } . k ||x|| i s the E u c l i d i a n norm on R n 2 2 X -j "T* • • • T " X » 1 n |[x||^ i s the sup norm on R n, = max |x.| . 1<j<n 3 Vx means f o r each value of x. x'y i s the s c a l a r product of x and y ( i n R n) = x i y i + . . . + x n y n • ( x ) + = max(0,x) . U ) f:A -*• B means tha t f maps the set A i n t o the set B. fee 1 1 means tha t f i s n times continuously d i f f e r e n t i a b l e . Vf i s the gradient of f, i . e . , the vector of p a r t i a l d e r i v a t i v e s of f. V 2 f i s " t h e Hessian of f, i . e . , the matrix of second d e r i v a t i v e s of f. E^ means expectation with respect t o £. £ ^ N[£,£] means that £ i s normally d i s t r i b u t e d w i t h mean % and covariance matrix E. (xi) 1 INTRODUCTION This d i s s e r t a t i o n has two main o b j e c t i v e s : t o develop e f f i c i e n t methods f o r the s o l u t i o n of one and two p e r i o d constrained o p t i m i z a t i o n problems ( i n chapters one and three) and to apply these methods to the s o l u t i o n of p o r t f o l i o s e l e c t i o n and r e v i s i o n problems ( i n chapters two and f o u r ) . The algorithms to be presented here are m o d i f i c a t i o n s of the Frank-Wolfe method (FW) [36] .. To i l l u s t r a t e , consider the problem (0.1) max f (x) X£K where f i s concave. The general form of FW i s : Step 0. I n i t i a l i z a t i o n : choose a f e a s i b l e p o i n t x i n K. Step 1. D i r e c t i o n F i n d i n g : choose the best 'usable* d i r e c t i o n , d, by l i n e a r i z i n g f at x. Step 2. L i n e a r search problem: determine an x* i n K which maximizes f i n the d i r e c t i o n d from x. Step 3. O p t i m a l i t y t e s t : i f no stopping r u l e i s a p p l i c -a ble, r e t u r n to step 1 w i t h x=x*. (Here there e x i s t a number of p o s s i b l e t e r m i n a t i o n c r i t e r i a depending upon the assumptions.) 2 The FW a l g o r i t h m was chosen as the a l g o r i t h m i c frame-work f o r t h i s d i s s e r t a t i o n f o r a number of reasons: II the a l g o r i t h m i s s t r u c t u r a l l y simple and each of the steps i s u s u a l l y e a s i l y implementable; 1F i t s convergence behavior (asymptotic) has been e x t e n s i v e l y s t u d i e d (Canon and Cullum [14], Wolfe [114], and Amor [ 4 ] ) ; II s e v e r a l p o s s i b l e extensions have been developed (Wolfe [114], Best [12], Holloway [45], Hogan [43], Dyer [25], E v e r i t t and Ziemba [30 ] ) ; 11 the d i r e c t i o n f i n d i n g step i s s i m p l i f i e d i n our a p p l i c a t i o n s ; IF i t has a r a p i d i n i t i a l r a t e of convergence (Amor [ 4 ] , Dyer [25]). FW i s a foundation from which there e x i s t many p o s s i b l e d i r e c t i o n s f o r improvement and m o d i f i c a t i o n . One which i s fundamental to t h i s d i s s e r t a t i o n i s by Hogan [ 4 3 ] r who r e l a x e s the assumption t h a t the gradient of f e x i s t . The u n derlying idea i s t h a t FW (and other f e a s i b l e d i r e c t i o n algorithms) o f t e n r e q u i r e existence of the g r adient , when a c t u a l l y i t s only use i s i n computing f e a s i b l e d i r e c t i o n s ( v i a the d i r e c t i o n a l d e r i v a t i v e ) . There are many i n t e r e s t i n g problems where the g r adient of f f a i l s to e x i s t but where the d i r e c t i o n a l d e r i v a t i v e does e x i s t . One important c l a s s i n t h i s category are 'extremal value' problems. For example, 3 consider the problem (0.2) max f(x,y ) yeY xeX(y) where Y i s the f e a s i b l e set f o r y and X ( y ) , the f e a s i b l e set f o r x, may depend upon y. I t i s o f t e n advantageous to consider the p r o j e c t e d (or p a r t i t i o n e d ) problem represented by maximizing the extremal value f u n c t i o n (0.3) v(y) = max f(x,y) xeX(y) This r e p r e s e n t a t i o n could be u s e f u l i f : y i s in t e g e r - v a l u e d and x i s u n r e s t r i c t e d ( t h i s i s the p r o j e c t i o n used i n Benders decomposition [ 1 0 ] ) , f i s concave i n x f o r f i x e d y (but not j o i n t l y concave), or the problem i s n a t u r a l l y decomposable i n t h i s f a s h i o n . The l a s t p o i n t i s r e l e v a n t f o r chapter three where we decompose the two p e r i o d problem by s e t t i n g x equal to the f i r s t p e r i o d s o l u t i o n and x equal to the second p e r i o d s o l u t i o n . For problems of the form (0.3), Hogan g e n e r a l i z e s FW by a l l o w i n g use of the d i r e c t i o n a l d e r i v a t i v e (and approximations t o the d i r e c t i o n a l d e r i v a t i v e ) i n place of the gra d i e n t . In chapter one we develop a FW type a l g o r i t h m which uses a g e n e r a l i z a t i o n s i m i l a r to t h a t used by Hogan. This adapt-a t i o n i s motivated by K l e s s i g and Polak [52], who used successive approximations to the o b j e c t i v e f u n c t i o n and to the gradient of the o b j e c t i v e f u n c t i o n , to modify a f e a s i b l e 4 d i r e c t i o n a l g o r i t h m of Polak [83]. The a l g o r i t h m we present i s a p p l i c a b l e t o problem (0.1) and allows f o r i n e x a c t comp-u t a t i o n of the o b j e c t i v e and of the gradient of the o b j e c t i v e as w e l l as i n e x a c t s o l u t i o n of the l i n e a r search problem. Using these approximations of v a r y i n g accuracies has a number of advantages: 11 i t leads to s u b s t a n t i a l decreases i n computational e f f o r t ( as compared to f i x e d accuracy methods), and 11 i t enables us to solve problems where the o b j e c t i v e (or i t s gradient) does not have an e a s i l y computed form (e.g. an expected cost o b j e c t i v e which must be evaluated n u m e r i c a l l y ) . A l s o , when we use t h i s technique on a problem w i t h an expected cost o b j e c t i v e , numerical quadrature techniques can be a p p l i e d t o generate the appropriate sequences of approximations; i n p a r t i c u l a r , one may use Gaussian quad-rat u r e formulas w i t h i n c r e a s i n g numbers of f u n c t i o n e v a l -uations . We prove the convergence of t h i s modified FW a l g o r i t h m using the convergence theory of Z a n g w i l l [117-118]. Chapter three uses very s i m i l a r ideas t o those discussed above, to g e n e r a l i z e the aforementioned method of Hogan to allow the i n e x a c t computation of the o b j e c t i v e and i t s grad-i e n t and approximate l i n e searches. This a l g o r i t h m i s f o r problems of the form (0.3), e s p e c i a l l y w i t h a s t o c h a s t i c o b j e c t i v e f u n c t i o n . 5 In chapter two the a l g o r i t h m developed i n the p r e v i o u s chapter i s a p p l i e d to a p o r t f o l i o s e l e c t i o n problem. Here the i n v e s t o r ' s p r e f e r e n c e s are assumed to be given by a (von Neumann-Morgenstern) u t i l i t y f u n c t i o n . He has an i n i t i a l wealth a l l o t m e n t (w ) t o be a l l o c a t e d to a number o of p o s s i b l e r i s k y investments w i t h the o b j e c t i v e of maxim-i z i n g the expected u t i l i t y of f i n a l wealth. T h i s i s w r i t t e n as n (0.4) max z (x) = max E J u t g ' j x w )] s u b j e c t t o J x . = 1. x>0 ? ° 1 1 where u i s the i n v e s t o r ' s u t i l i t y f u n c t i o n , E, i s the random ve c t o r of one p l u s tne r a t e of r e t u r n on the p o s s i b l e i n v e s t -ments, x i s the v e c t o r which giv e s the p r o p o r t i o n s of i n i t -i a l wealth to be a l l o c a t e d to each of the p o s s i b l e r i s k y investments and r e p r e s e n t s the e x p e c t a t i o n w i t h r e s p e c t to K. In p o r t f o l i o s e l e c t i o n (as i n more ge n e r a l d e c i s i o n making under u n c e r t a i n t y ) a major d i f f i c u l t y i s determin-i n g the r i s k p r e f e r e n c e s of the i n v e s t o r , i n p a r t i c u l a r , h i s t r a d e o f f between r i s k and r e t u r n . P r a t t [ 85] and Arrow [ 5] developed two measures which attempt to ( l o c a l l y ) c h a r a c t e r i z e these a t t i t u d e s : the absolute r i s k a v e r s i o n index (0.5) R A(w) = -u" (w)/u' (w) 6 and the r e l a t i v e r i s k aversion index (0.6) RD(w) = -wu"(w)/u'(w) . A t h i r d measure treated i n t h i s chapter i s the Rubin-stein global r i s k aversion measure (0.7) RR(w) = - W Q E u" (w) / E u' (w) In (0.7) the expectations are taken over the relevant range of possible f i n a l wealth. Pratt has proven a number of theorems which relate these two measures (0.6)- (.0.7) to i n t u i t i v e interpretations of r i s k behavior. One question which i s addressed i n t h i s chapter i s the robustness of the R A measure i n a normal d i s t r i b u t i o n , p o r t f o l i o selection s e t t i n g . Stated more simply: i f two investors have the same average R , do they have the same optimal p o r t f o l i o s ? We w i l l say that two u t i l i t y functions have the same average Ra i f t h e i r expected R (over the relevant range of f i n a l wealth) i s the same. We measure closeness of two p o r t f o l i o s by measuring the percentage difference i n t h e i r certainty equivalents ( i . e . the cash equivalent of the uncertain returns). To test t h i s proposition empirically, we focus on (0.4) with E, d i s t r i b u t e d normally with mean \ and variance E, (£ ^ N[1,E]) and u concave with posi t i v e f i r s t d erivative. We estimate % and £ for monthly, quarterly and yearly data 7 using ten New York Stock Exchange s e c u r i t i e s and data from the years 1965-1970. We then solve (0.4) f o r seven classes of u t i l i t y function. By classes of u t i l i t y function we mean that we parametrize each u t i l i t y function so that i t s math-ematical properties can be varied. To i l l u s t r a t e , for the logarithmic u t i l i t y function we use log(w+B) and by changing the values of 3 we can change measures (0.5)-(0.7) and (theoretically) change the r i s k attitudes modeled by these functions. We f i n d that for monthly and quarterly data the robustness of the measure i s very strongly supported (the largest error i s less than .02%) and for yearly data the proposition i s s t i l l quite strongly support-ed (the largest error i s less than 1%). A major problem i n any p o r t f o l i o selection strategy i s the forecasting of future returns or the s p e c i f i c a t i o n of the d i s t r i b u t i o n of future p r i c e s . To p a r t i a l l y address the question of forecasting inaccuracies, we b r i e f l y examine the e f f e c t s of errors i n the estimation of f and E. This i s achieved by comparing the cash equivalents of the 'true' solution and of the solution of a randomly pert-urbed problem. We f i n d that the technique i s i n s e n s i t i v e to errors i n 2 , but quite sensitive to errors i n X. 8 The e m p i r i c a l r e s u l t s o f t h i s chapter are augmented by a t h e o r e t i c a l r e s u l t (a s p e c i a l case was p r e v i o u s l y proven by M. R u b i n s t e i n ) . Assuming j o i n t n o r m a l i t y o f r e t u r n s , we show t h a t two i n v e s t o r s , one with u t i l i t y f u n c t i o n u^ and i n i t i a l wealth l e v e l w^ , the other with u t i l i t y f u n c t i o n ^ and i n i t i a l wealth l e v e l W 2 , have the same op t i m a l p o r t f o l i o s i f w.jE[ u" (w) ] /E[ u' (w) ] = w 2E[ u" (w) ] /E[ u' (w) ] . Chapter two concludes with a d i s c u s s i o n of the implement-a t i o n of our m o d i f i e d FW a l g o r i t h m . In p a r t i c u l a r , we d e s c r i b e the use o f Gauss-Hermite quadrature formulas (see H i l d e b r a n d [ 41 ]) to generate sequences of s u c c e s s i v e approximations t o the o b j e c t i v e and to the g r a d i e n t o f the o b j e c t i v e . Chapter f o u r concerns the s o l u t i o n o f p o r t f o l i o r e v -i s i o n problems. One c r i t i c i s m o f p o r t f o l i o s e l e c t i o n app-roaches i s t h a t they maximize the u t i l i t y f o r f i n a l wealth without c o n s i d e r i n g the composition of wealth. For example, optimal s o l u t i o n s may have a number of very s m a l l h o l d i n g s i n a v a r i e t y of s e c u r i t i e s . In p r a c t i c e the composition o f the p o r t f o l i o i s very important because any changes i n p o r t -f o l i o h o l d i n g s i n c u r t r a n s a c t i o n c o s t s . We survey the t r a d i t i o n a l approaches, which are t y p i c -a l l y based on dynamic programming and we note t h a t the e x i s t -ence of t r a n s a c t i o n c o s t s renders t h i s approach i n f e a s i b l e 9 f o r r e a l i s t i c p r o b l e m s . N o t e h o w e v e r t h a t t h e p o r t f o l i o r e v i s i o n p r o b l e m f i t s n a t u r a l l y i n t o t h e t w o p e r i o d s t o c h a s -t i c p r o g r a m m i n g f r a m e w o r k . T h a t i s , a t t h e s t a r t o f p e r i o d o n e c h o o s e x ^ ( t h e i n i t i a l p o r t f o l i o o f a s s e t s ) , o b s e r v e £^ ( t h e r a n d o m v e c t o r o f s e c u r i t y r e t u r n s ) a n d t h e n a t t h e s t a r t o f p e r i o d t w o , c h o o s e x 2 ( t h e p o r t f o l i o r e v i s i o n ) c o n d i t i o n a l u p o n £^ a n d x ^ . A f t e r d e s c r i b i n g a g e n e r a l m o d e l t o w h i c h t h e a l g o r i t h m o f c h a p t e r t h r e e i s a p p l i c a b l e , we f o r m u l a t e t h e t w o p e r i o d e x p e c t e d u t i l i t y m o d e l u p o n w h i c h we f o c u s . n ( 0 . 8 ) max E r { max E . r u ( 2 t ; £ p y ) Y i i o y 2l° 1 s u b j e c t t o n ^ i y l i = W l n n f ^ l i p i y 2 i + a ' y 2 i ~ y l i ! 3 = W 2 + ^ l i P i y l i w h e r e ( f o r j = l , 2 a n d i = l , . . . , n ) i s t h e n u m b e r o f s h a r e s o f s e c u r i t y i h e l d i n p e r i o d j , i s o n e p l u s t h e r a t e o f r e t u r n o n s e c u r i t y i i n p e r i o d j , u i s t h e i n v e s t o r ' s u t i l i t y f u n c t i o n o n t e r m i n a l w e a l t h , p ^ i s t h e i n i t i a l p r i c e p e r s h a r e o f s e c u r i t y i , w.. i s t h e w e a l t h a l l o c a t i o n i n p e r i o d j , a n d a i s t h e c o s t o f t r a n s a c t i n g o n e s h a r e . 10 We assume ' Z Z IT ( 5 ) ^ N [ U ,Z ) r 1 2 1 2 11 12 Z Z 2 1 2 2 U u has p o s i t i v e f i r s t d e r i v a t i v e and i s concave, If t r a n s a c t i o n costs are l i n e a r i n the number of shares traded; i . e . a i s the cost of t r a d i n g one share of s e c u r i t y j . We i n d i c a t e the d e t a i l s of the a l g o r i t h m as implemented f o r t h i s problem. F o r t u n a t e l y , there e x i s t a number of devices which g r e a t l y d i m i n i s h the computational burden and these are e l u c i d a t e d . F i n a l l y , we use a subset of the data used f o r the s t a t i c problem of chapter two (the f i v e s e c u r i t i e s w i t h the highest return) t o perform some p r e l i m i n a r y e m p i r i c a l s t u d i e s on t h i s problem. The s e r i a l covariance matrix i s estimated f o r quart-e r l y data and we f i n d t h a t almost a l l e n t r i e s are i n s i g n i f -i c a n t . Thus, i n the case of zero s e r i a l c o r r e l a t i o n , we deter-mine the r e v i s i o n s o l u t i o n s f o r the e x p o n e n t i a l u t i l i t y f u n c t i o n over a range of parameter values. We compare the f i r s t p e r i o d r e v i s i o n s o l u t i o n (y^) w i t h the s o l u t i o n of the s t a t i c problem and f i n d t h a t the e r r o r a s s o c i a t e d w i t h the s t a t i c case i s on the order of . 5 % . Although the e r r o r s generated are r e l -a t i v e l y s m a l l , i t i s easy to see t h a t i f we introduce s i g -n i f i c a n t s e r i a l c o r r e l a t i o n or higher t r a n s a c t i o n c o s t s , we can generate a much l a r g e r d i f f e r e n c e . 11 §1; The Frank-Wolfe Algorithm with Function and  Gradient Approximations This chapter i s concerned with an extension of the FW algorithm. This method w i l l be applied to the p o r t f o l i o selection problem i n chapter two, and a s i m i l a r approach to the one developed here w i l l be u t i l i z e d i n chapter three. We begin by describing the basic FW algorithm (Frank and Wolfe [ 3 6 ] ) and some of the possible extensions to the d i r e c t i o n finding step. §1.2 elucidates, on a conceptual l e v e l , the idea (due to Kles s i g and Polak [52]) of u t i l i z i n g sequences of approximations to the objective function and to i t s gradient to modify feasible d i r e c t i o n algorithms. The following section deals with t h i s type of approximation on a more imp Lamentable l e v e l . §1.4 describes the FW extension which u t i l i z e s inexact gradients and objective function evaluations as well as approximate l i n e searches. The appendix describes the major unifying approaches to convergence theory, i n p a r t i c u l a r , the approaches of Zangwill [117], Polak [83 ] and Meyer [ 68 ] . 12 §1 .1 The Frank-Wolfe Algorithm and Some Extensions This chapter i s concerned with algorithms based on the o r i g i n a l Frank-Wolfe (FW) algorithm [36]. While th i s paper was largely concerned with quadratic programs, the general problem to which i t i s applicable i s max f(x) xeX with X = {xeRn: Ax < b, x ^ 0 } . Here f i s strongly, boundedly concave^" and real-valued; A i s an m by n matrix; beR m such that X i s bounded. The basic procedure i s as follows. Step 0: I n i t i a l i z a t i o n (choose a feasible point) Choose x°e X, set k = 0. Step 1: Direction Finding Problem (solve the l i n e a r i z e d problem) Let y eX solve k k maxVf(x ) 1 ( y - x ) . yeX t f i s boundedly concave i f f i s concave, twice d i f f e r e n t i a b l e , and there exists a uniform lower bound on the eigenvalues of the Hessian; f i s strongly concave i f there e x i s t s a uniform upper bound on the eigenvalues of the Hessian; see Wolfe [ 114]. 13 Stop i f the optimal value i s zero, otherwise k k k set d = y - x Step 2: Linear Search Problem (determine the optimal step i n the chosen direction) Let x e[0,l] s a t i s f y max f ( x k + rd k) xe [0,1] Let x k + 1 = x k + Tkdk , k = k + 1, go to step 1. Other authors (Zangwill [117]) have dubbed th i s algorithm the li n e a r approximation method or the conditional gradient method (Demyanov and Rubinov [ 2 2 ] ) . This nomenclature i s motivated by step 1. Here we have replaced f(y) by i t s f i r s t order Taylor series approximation about x . f (y) = f (x k) + Vf (x k) ' (y - x k) . k ~ Dropping the constant term, f(x ), step 1 then maximizes f ( y ) . Of course the greatest benefits from this- l i n e a r i z a t i o n are achieved i f the feasible set i s polyhedral, since this implies step 1 i s merely the solution of a l i n e a r program. However, there e x i s t extensions to nonlinear constraints, see e.g. Martos [ 6 3 ] . Further, Holloway [45] has applied the FW approach to more general convex sets X, using sequences of inner l i n e a r i z a t i o n s (of X) and r e s t r i c t i o n s of these l i n e a r i z a t i o n s . 14 FW i s very appealing i n i t s s i m p l i c i t y , and t y p i c a l l y each of the steps are e a s i l y implementable (for example step 1 i s a l i n e a r program and step 2 i s a one-dimensional l i n e a r search). However, Canon and Cullum [14], showed that the order t of convergence of FW i s zero. Later Wolfe [114] showed that the slowest and fastest rates of convergence were also of order zero. Roughly stated, the slowness of convergence i s due to the fact that near the optimum, the only usable directions may be nearly orthogonal to the gradient. See example 1 f o r an i l l u s t -r a t i o n . This phenomenon motivated the improvement given i n Wolfe [114], which allows movement away from a vertex, whereas the o r i g i n a l FW only allowed movement towards a vertex ( i . e . y i s t y p i c a l l y a vertex of X). With th i s modification, (but retaining the optimality test) we would replace step 1 with t I f M = sup f(x) and {x } i s a sequence produced by the xex algorithm, then i f lim (log|f(x k + 1)-M|)/(log|f(x k)-M|) k->-°° e x i s t s , we c a l l t h i s l i m i t the order of convergence; see Luenberger [57 ] for related concepts. 15 Solve Z f = max V f ( x k ) ' ( y - x k ) yeX f f f k to o b t a i n y , s e t d = y - x Solve Z b = max - V f ( x k ) • ( y - x k) yeX to o b t a i n y b , s e t d b = x k - y b . I f | z f | . <_ | z b | and T* = sup{x: x k + x*d beX} > 1 , T then s e t d k = x k + t * d b , k -F otherwise d = d This adaptation gives convergence of order one. We i l l u s t r a t e the two versions on the f o l l o w i n g simple example, adapted from Wolfe [ 114] . Example 1 max - x 2 - x 2 1 2 s u b j e c t to x + x < 1 1 2 — - X + X < 1 1 2 — X > -1 2 — S t a r t i n g w i t h x° = ( l , - . 6 ) we can t r a c k the two algorithms w i t h the f o l l o w i n g diagram. 16 Of course the objective here i s to reach zero. Using the 1 2 basic FW algorithm we generate the points x , x , etc. However, with Wolfe's modification, we could move away from the vertex ( 2 , - 1 ) to generate the point x*. 17 Note that this example manifests some of the general FW properties: a f a i r l y rapid i n i t i a l rate of improvement i n the objective, followed by a lack of 'good' directions near the optimum. The l a t t e r property leads to the rather slow asymptotic rate of convergence which we have described. The former property can also be formalised. In p a r t i c u l a r , Wolfe [114] has developed the following bounds f o r single steps i n the FW algorithm. Assume f i s C ( i . e . f i s twice continuously d i f f e r e n t i a b l e ) and denote by V 2 f the Hessian matrix. Define M = sup f(x) x e X and the 'error' e k = M - f(x f c) . Let L s a t i s f y L < i n f { (y - y ) ' V 2 f (y ) (y - y ) }, y ,y ,y eX 1 2 3 1 2 1 2 3 observe that L < 0. (Amor [4 ] characterizes L as a lower bound on f's "departure from l i n e a r i t y " . ) a. i f f (x^ .) < M + L then e k + l ± e k / 2 • b. i f f ( x k ) >_ M •+ L then ek+l 1 e k ( 1 + e k / 2 L ) * E s s e n t i a l l y a. says that i f we are ' f a r 1 from the optimum we can expect the error to be at least halved at each i t e r a t i o n . Amor extends the results of Wolfe to develop a bound applicable to a l l 'large-step' methods of feasi b l e d i r e c t i o n s . In the case 18 of FW, the bounds are sharper than Wolfe's bounds. This s u b j e c t i s a l s o discussed by Dyer [25 ] . He derives a bound analogous t o a., i f the gradient i s measured w i t h an unbiased e r r o r . This reference w i l l be subsequently discussed i n more d e t a i l . While t h i s concept i s seldom discussed, i n terms of measuring the e f f i c i e n c y of algorithms f o r p r a c t i c a l purposes t h i s i n i t i a l r a te of convergence i s of obvious importance. Other authors have sought to improve the performance of FW by using more elaborate schemes i n the d i r e c t i o n f i n d i n g step. For example, Meyer [67 ] e s s e n t i a l l y considers o p t i m i z i n g over the l a s t q d i r e c t i o n s . Best [12 - 13] i n c l u d e s const-r a i n t s t o force suceeding d i r e c t i o n s to be conjugate to the l a s t d d i r e c t i o n s used. While these improvements are important, (and improve the asymptotic convergence rate) they-may not be p a r t i c -u l a r l y u s e f u l f o r problems w i t h very simple c o n s t r a i n t s e t s . For example, Meyers' m o d i f i c a t i o n i s u s e f u l when the l i n e a r search problem i s much simpler than the d i r e c t i o n f i n d i n g problem; Best's technique w i l l introduce a number of a d d i t i o n a l c o n s t r a i n t s i n t o the d i r e c t i o n f i n d i n g problem. One f i n a l method of g e n e r a l i z a t i o n i s by Hogan [43 ] , who relax e s the assumption of d i f f e r e n t i a b i l i t y of f t o d i r e c t i o n a l d i f f e r e n t i a b i l i t y . This w i l l be of fundamental importance i n chapter t h r e e , where i t w i l l be discussed i n d e t a i l . 19 §1.2 A Method of F e a s i b l e D i r e c t i o n s Using  Function Approximations The b a s i c m o d i f i c a t i o n we make i n the FW a l g o r i t h m i s mot-i v a t e d by K l e s s i g and Polak [ 52] - They develop a method of f e a s i b l e d i r e c t i o n s f o r the problem (1.1) max f (x) s . t . g(x) >_ 0 . The key feature i s the use of adaptive ( i . e . s u c c e s s i v e l y accurate) approximations t o f ( x ) and t o V f ( x ) . This approach should be u s e f u l i n problems where the p r e c i s e determination of f ( x ) and Vf(x) i s d i f f i c u l t . A l s o s i n c e the approximations are of v a r y i n g t o l e r a n c e s , the alg o r i t h m should be s u p e r i o r t o a f i x e d t o l e r a n c e a l g o r i t h m . We b r i e f l y summarize K l e s s i g and Polak's a p p l i c a t i o n of t h i s concept t o a f e a s i b l e d i r e c t i o n a l g o r i t h m of Polak t 83 ]. We make the f o l l o w i n g assumptions and d e f i n i t i o n s . a. f:R n •+ R and g:Rn -+ R m are both C 1; b. the f e a s i b l e set X = {xeR n: g(x) >_ 0} i s a compact subset of Rn; c. f o r xeX and E>0 define the index s e t I(x,e) a subset of {1,...,m} by I ( x , e ) = ' { j e { l , . . . ,m}: g^ (x) < - e } which i s the s e t of e s s e n t i a l l y i n f e a s i b l e c o n s t r a i n t s at x; 20 d. , l e t S be the u n i t cube i n R n, S = {h=(h,,...,h )eR n: ||h|| = max |h.| < 1} ; e. l e t 9:X x R ={x£R: x^O} -*• R be defined by 8(x,e) = max[min{Vf(x)'h;Vg.(x)'h f o r j e l ( x , e ) } ] . heS 3 Note t h a t 8(x,e)>_0 VxeX and e>_0. I f x* maximizes (1.1) then Ve>0, 0(x*,e)=O i s a w e l l known o p t i m a l i t y c o n d i t i o n ; see K l e s s i g and Polak [52] . The method requires two subprocedures which generate the approximating sequences f.. (x) f(x) and V_.f(x) ->- V f ( x ) . They are assumed to have the f o l l o w i n g p r o p e r t i e s : f. F j : R n -> 2 R and V_.F:Rn + 2 R are fun c t i o n s such th a t f o r p o s i t i v e A there e x i s t s a nonnegative i n t e g e r such that j_>M and xeX i m p l i e s | f j ( x ) - f (x) | < A Vf.. (x) eF_. (x) and | |Vjf(x) - Vf (x) | | <_ A VV_.f (x) eV_.F(x) . (While Fj and V..F are defined as functions by K l e s s i g and Polak, they are more ac c u r a t e l y i n t e r p r e t e d as po i n t to set maps; see appendix t o chapter two.) 2 1 <3- Given a po s i t i v e integer j there exists a W j > - ° ° such that f.(x) > w. Vf.eF.(x) VxeX . 3 - D 3 3 Roughly, t h i s says that the sequences w i l l converge pointwise and that there i s a uniform lower bound on the approximating function values. Some f i n a l d e f i n i t i o n s : h - S : X x R n x R + - * R JS defined by B(x,u , e ) = max[min{u'h;Vg.(x)'h for j e l ( x , e ) } ] , heS 3 i - H:X x R n x R + 2 S i s defined by H(x,u ,e ) = {heS:B(x,u,e)=min[u*h;Vgj(x)'h for j e l ( x , e ) ] } . B i s 6 replacing Vf(x) with a more general vector u. The procedure i s outlined i n the following flowchart. Here the fundamental idea i s to replace the exact objective function and exact gradient with the approximations gener-ated by Fj and V^F respectively. As well, we use B as the surrogate optimality i n d i c a t o r with H giving the corresp-onding m u l t i p l i e r s . The method allows for f l e x i b i l i t y i n choosing the rate at which the approximating sequences w i l l converge. 22 24. 25. 26. S e l e c t X c ( 0 , l ) m 0,0(0,1)' i = l , 2 , 3 ; and j Q a nonnegative i n t e g e r . Figure 1.2 A Flowchart for Klessig and Polak's Feasible Direction Algorithm with Function Approximations 2. F i n d x°cX; s e t k,i=0; j=j Compute a f j (x 1) EF.. ( x 1) and V.f ( x 1 ) c V . F ( x 1 ) . 3 3 Compute S U ^ V ^ f (x1) . E 1 ) and heHfxSv^ (x1) ,c1) . 6. yes |Compute M x ^ V . f (x 1) ,0) yes ^ 2 1 ) II. 12. X=l x*=x 1+Xh < i i s \ . g(x * ) > o 14. X/2 no X-yes f j ( x ' ) = f j ( x 1 ) 16. i+1 v . x = x *v \ yes i = i + l . 15. Compute a f j (X*) EF . (x*) . T — ' Set D = £ j < x * ) _ f j ( x l ) " X / 2 [ ? j f (x 1) -h] . -> 23.. no j = j + l 18. f j l x ' l - f ^ x M l Compute V .f (x 1) eV .F(x') and S i x ' , V . f ( x ' J . c 5 ) . 3 © 20. X=A/2 27. z =x' ! k=k+l . >•( / ^ yes 23 We may m i t i g a t e some of the complexity of the above f l o w c h a r t by the f o l l o w i n g h e u r i s t i c summary. 1. Choose the i n i t i a l t o l e r a n c e s e 1 and the ' s h r i n k i n g ' parameters . 2. Find a f e a s i b l e p o i n t x 1 . 3. Compute an approximate f u n c t i o n value and an approximate gradient at x 1 . 4. Compute 9 and determine a ' m u l t i p l i e r ' h. 5. I f x 1 not e 1 - o p t i m a l then go to 9. 6. Compute S w i t h e = 0 . 7. I f B^O then go to 10, 8. x ^ x 1 ; go to 22. 9. I f B ^ e 1 then go to 11. 10. Shrink e 1 ; go t o 5. 11. Set the step s i z e (A) to 1. 12,13,14. Shrink A by 1/2 u n t i l x*=x1+Ah i s f e a s i b l e . 15. Compute an approximate f u n c t i o n value at x*. 16. Compute D (a measure of the improvement of x* over x 1 ) . 17. I f D negative then go to 19. 18. x ^ x 1 ; go to 22. 19. I f ^<^m/2 then go t o 21. 20. Shrink the step s i z e by 1/2; go to 12. 21. x ^ x 1 . 24 22, 23. I f f j ( x ' ) < f j ( x 1 ) - e 2 then increase the e 2 t o l e r a n c e ; increment j ; go to 3. 24. x1+"'"=x1 ; increment i . 25. Compute an approximate gradient at x' and B w i t h e 3 . 26. I f B > e 3 then go to 3. 27. z =x' ; shrink e3 ; increment k; go to 3. Observe t h a t steps 24-2 7 form a 'sieve' t o e x t r a c t a convergent subsequence {z k} from {x 1} . L a t e r i n t h i s chapter we w i l l compare t h i s a l g orithm w i t h a FW algorithm using approximations to the gradient and t o the o b j e c t i v e f u n c t i o n i n a manner analogous to th a t presented here. 25 § 1 . 3 Related Concepts of Approximation In the b a s i c FW one can recognize at l e a s t three p o s s i b i l i t i e s f o r u t i l i z i n g approximations i n place of exact procedures. In p a r t i c u l a r we may s u b s t i t u t e : II an approximate o b j e c t i v e f u n c t i o n f o r f , TT an approximate gradient f o r Vf, and U an approximate l i n e a r search f o r the exact l i n e a r search r e q u i r e d i n step 2. Before d e t a i l i n g our v e r s i o n of FW (which w i l l allow a l l three replacements), we w i l l survey some of the references which allow (to some extent) t h i s type of approximation. The most preve l a n t examples may be found i n the v a r i e t y of papers d e a l i n g w i t h i n e x a c t l i n e a r searches. At l e a s t i n t h e i r n a t a l form many algorithms r e q u i r e an exact l i n e search. This i s true not only of f e a s i b l e d i r e c t i o n algorithms l i k e FW, but of a much more general c l a s s of popular t e c h n i q u e s , - f o r example, steepest descent (Cauchy [ 1 5 ] ) , v a r i a b l e m e t r i c methods (Davidon [ 2 l ] ) / g r a d i e n t p r o j e c t i o n methods (Rosen [ 87]), and conjugate gradient methods (F l e t c h e r and Reeves [ 3 3 ] ) . To the more pragmatic, the e x i s t e n c e of a (generally) i n f i n i t e subprocedure w i t h i n an a l g o r i t h m i s d i s t u r b i n g and s e v e r a l authors have confronted t h i s problem. Mavrides [ 64] has developed an implementation of FW 26 without exact l i n e searches (see as w e l l the paper of Huard [ 46] discussed i n the appendix to t h i s c h a pter); P o w e l l [ 84 ] has d e r i v e d p r o p e r t i e s of the (Fletcher-Broyden-Goldfarb-Shanno) v e r s i o n of the v a r i a b l e m e t r i c a l g o r i t h m without l i n e searches; Shanno [9 5 ] has considered the e f f e c t s of i n e x a c t l i n e a r searches on conjugate gradient methods. More g e n e r a l l y , Wolfe 1113] ( f o r unconstrained methods) has s t u d i e d c o n d i t i o n s on the step s i z e t o guarantee convergence. A v e r s i o n of FW which allows f o r inexactness i n the gradient i s developed i n Dyer [25]. Dyer's study r e l a t e s to an i n t e r a c t i v e context where a d e c i s i o n maker i s expected t o provide estimates of a v e c t o r c o l i n e a r w i t h the gradient of h i s u t i l i t y f u n c t i o n . Thus, r a t h e r than assuming t h a t Vf(x) i s known, he assumed t h a t k k k at each i t e r a t i o n k, a s c a l e d estimate a Vf (x ) + r\ i s k k a v a i l a b l e . Here a i s a p o s i t i v e s c a l i n g f a c t o r and n i s a vector of unknown e r r o r s i n e s t i m a t i o n . Using the p o i n t - t o - s e t map convergence theory of Z a n g w i l l , he was able to modify Zangwill's [ 117] proof of the convergence of FW , to prove t h a t k k t h i s modified a l g o r i t h m w i l l converge i f a -»• 1 and r\ -*-T\ ( p o s s i b l y not equal t o z e r o ) . 27 F i n a l l y , S c h i t t k o w s k i [ 9 3 ] has an elegant treatment of an adaptive p r e c i s i o n method f o r the general m i n i m i z a t i o n problem i n f <j> (u) ueC where <f>:E R and C i s a subset of the normed l i n e a r space E. The key i s the assumption of the exis t e n c e of a sequence {<J>n} of continuous and continuously G a t e a u x - d i f f e r -e n t i a b l e penalty f u n c t i o n s E -»- R. He proves convergence of a prototype a l g o r i t h m ( i n the sense of Polak [83 ]; see appendix) which uses a f i x e d (Armijo [ 5]) step length (thus avoiding a l i n e a r search problem). 28 §1.4 A Frank-Wolfe A l g o r i t h m Using Approximations In t h i s s e c t i o n we develop an extension of the Frank-Wolfe algorithm which allows f o r approximations t o the o b j e c t i v e and to i t s gradient, as w e l l as i n e x a c t l i n e a r searches. The b a s i c approach i s to adapt the convergence theory of Zangwill to include a parameter which w i l l c o n t r o l the l e v e l s of approximation. The problem i s max f ( x ) xeX where X i s a nonempty, compact, convex set c R n and f:X •+ R i s concave and C''". Further, we assume a. There are sequences if^}^ a n d j 1 t l l a t converge uniformly on X to f and t o i t s gradient. More f o r m a l l y , given e>0 there e x i s t s an i n t e g e r M (m) such t h a t VxeX and Vj>M (m) we have | f j ( x ) - f ( x ) | < e ( I I V f (x) - Vf(x) I I < e)) . b. An i n i t i a l t o l e r a n c e ese[0,T] i s given (T«=°) , as i s a 's h r i n k i n g ' parameter ae(0,l) and a f i n a l t o l e r a n c e e e [0,T]. c. We assume we have s c a l e d the sequences { f j ) a n (l {Vjf} so t h a t f j and V j f are (ft).Je° approximations. Thus, the tolerance at i t e r a t i o n k i s e k= (..a)-'e0. 29 The f o l l o w i n g diagram i s u s e f u l i n i n t e r p r e t i n g each of the steps of the al g o r i t h m and i n understanding the purpose of each of the component psms. [0,T] x X G + [0 ,T] x x x Xr [0,T] x X x X (estimate the gradient) (solve the augmented d i r e c t i o n f i n d i n g problem) D [0,T] x x x x I M (take the d i f f e r e n c e as d i r e c t i o n ) (with tolerance ( l - a ) e s o l v e the l i n e a r search problem, shrink e ) [0,T] x x 30 Consider the p o i n t t o set map (psm) G: [0,T] x x [0,T] x X x X defined by G(e,x) = {(e,x,z):e = e, x = x, zeX s a t i s f y i n g y | |z - Vf (x) | | < e } . I.e. we have an i d e n t i t y map on the p o i n t (e,x) and z i s an e-approximation to the gradient of the o b j e c t i v e at x. Let [0,T] be the i n t e r v a l i n which the tolera n c e s e must l i e CT < co ). Here X = {zeR n: there i s an xeX such th a t | |z - Vf (x) | | <_ T} . Thus X c o n s i s t s of a l l p o i n t s i n R n w i t h i n T ( i n terms of y E u c l i d i a n distance) of Vf (x) f o r some xeX. Since the s e t {V f ( x ) : xeX} i s compact, ( i t i s the image of a compact set under a continuous map) X^ i s as w e l l . Lemma i . j :G i s closed on [0,T] x x . Proof: For keK, an i n f i n i t e s et of i n d i c e s , assume k k 0 0 0 0 (e ,x^) -*• (e ,x ) , and . k nk k . 00 noo c» (a , 6 , y ) •+ ( 0 1 , 6 ,y ) We need to show 31 k k k k Using the i d e n t i f i c a t i o n s a = e and 6 = x , i t s u f f i c e s to show tha t ||y - V f ( x )|| < e , but t h i s i s a simple consequence of the c o n t i n u i t y of the norm and of Vf, s i n c e VkeK , ||Yk - V f ( x k ) | | < e , we may take l i m i t s , to o b t a i n || Y" - V f ( x ~ ) | | < e~ . //. Now consider the psm Y: [0,T] x X x R n + [0,T] x X x X , defined by Y(e,x,z) = {(e,x,d): e = e, x = x, deX s a t i s f y i n g max max [ z' (<j - x) , - | | d - x M + e H • d£X The f i r s t two components of Y are merely the i d e n t i t y on (e,x). The l a s t component chooses a deX i n the f o l l o w i n g way. F i n d deX to maximize the i n n e r product p = z'(d - x) . I f p < e then set d = x. Here d i s a usable d i r e c t i o n unless p i s 'near' 0, i n which case we keep the p o i n t x and increase the t o l e r a n c e . Note th a t p < e may i n d i c a t e that x i s e-optimal or merely that the gradient approximation i s not s u f f i c i e n t l y accurate. 32 Lemma 1.2: ¥ i s closed on [0,T] x x x R n. Proof: For keK, an i n f i n i t e s e t of i n d i c e s , assume / k k V , . 0 0 0 0 0 0 , ( a k , 3 k , Y k ) e Y ( e k , x k , z k ) and (a ,£ ,y ) * (a ,B ,y ) • We need to show oo 0 0 0 0 OO CO CO (a , 3 ,Y ) e Y(e ,x ,z ) . Using the i d e n t i f i c a t i o n s a k=e k and 6 k=x k , i t s u f f i c e s t o show Y°° solves max maxiz 0 0' (y - x°°) , -| |y - x°°| | + e°°} yeX Now by d e f i n i t i o n , y solves max max { z k " ( y - x k) , -||y - x k | | + e k} yeX In p a r t i c u l a r , f o r each yeX max{z k'(Y k - x k) , -||y k - x k|| + e k} > max{z k'(y - x k) , -I|y - x k|I + e k} . Taking l i m i t s VkeK (using c o n t i n u i t y of the i n n e r product and of the norm) 0 0 0 0 0 0 . . 0 0 0 0 1 1 0 0 -, max{z 1 (Y - X ) , - | | Y — x | [ + e } > max{z0°' (y - x°°) , -||y - x°°| |+ e°°} VyeX //. 33 The t h i r d psm we w i l l consider i s D: [0,T] x x x x •* [0,T] x x x x ( X = {xeR n: there are y and z i n X such that x = y - z}. Note that X i s compact i f X i s . ) defined by D(e,x,d) = (e,x,d - x) . Lemma J_. 3; D i s closed on [0,T] x x x x Proof: T r i v i a l , since D i s a continuous function on the compact set [0,T] x x x x . //. The l a s t component of the algorithmic map i s M: [0,T] x X x X -> [0,T] x X defined by M(e,x,d) = { ( e , y ) : e = oce, y = x + xd where xe [0,1] s a t i s f i e s f(y) >_ max f (x+xd) - (1-a) e }. xe [0,1] Here ae(0,l) i s a user s p e c i f i e d constant. This psm 'shrinks' the tolerance and generates a (1-a)e-optimal solution to the l i n e a r search problem.* We can generate y=x+xd ei t h e r by exactly solving the l i n e a r search problem using an approximate f, or by approximately solv-ing the problem with an exact f, or even by an appropriate combination of an approximate search with an approximate f. * S t r i c t l y speaking, we should have the algorithm parametrized by a. 34 Lemma 1.4:M i s closed on [0,T] x x x X Proof: For keK, an i n f i n i t e set of indices, assume k k k oo oo oo (e ,x ,d ) -> (e ,x ,d ) , ( 6 k , y k ) e M(e k,x k,d k) and k k oo oo (5 ,y ) -»• (6 ,y ) It s u f f i c e s to show (6 ,y ) e M(e ,x ,d ) . F i r s t l y , VkeK, k k 6 = ae V oo k 0 0 and since e -»- e , then 6 -*• ae So i t remains only to v e r i f y that OO CO oo c f(y ) >_ max f (x + xd ) - (1 - a) e T£[0,1] Let us define T v i a , k = X k + T k d k (Such T e x i s t since f i s continuous on the compact set X.) k 1 By compactness of [0,1], from the sequence IT }, we k EK k may extract the subsequence {x } k e K* such that lim T K = T°°e [0,l] .. keK* OO 00 00 oo y = x + T d By d e f i n i t i o n , Vie[0,1] k k k k f ( y K ) = f(x + x d ) > f ( x k + x d k ) - (1 - a ) e k . 35 Now, t a k i n g l i m i t s (using the c o n t i n u i t y of f) Vie[0,1] f(y°°) > f (x°° + Td°°) - (1 - a)e°° . //. F i n a l l y , consider the composite map A = MDYG: [0,T] x X -> [0,T] x x . Lemma 1.5:A i s closed on [0,T] x x Proof:Lemmas 1.1 to 1.5 have demonstrated t h a t each of the component maps i s closed w i t h range on a compact s e t . By C o r o l l a r y 4.2.1 of Z a n g w i l l [H7], we conclude t h a t A i s closed on [0,T] x x . //. Theorem l.l:Consider the alg o r i t h m generated by the psm A, given the i n i t i a l p o i n t (e°,x°) e [0,T] x x , where X i s a compact subset o f R n and T < °° . The sequence of p o i n t s on the a l g o r i t h m i c map ( e k , x k ) i s an i n f i n i t e sequence, any accumulation p o i n t of which i s e-optimal. Proof: We proceed by v e r i f y i n g the co n d i t i o n s of Zangwill*s Convergence Theorem A * . To do t h i s we con s t r u c t the ascent f u n c t i o n Z: [0,T] x x -> R defined by Z(e,x) = f(x) - e . 0 The e-optimal s e t Q j _ s n = {xeX: max V f ( x ) ' ( y - x) < e} . yeX * See appendix. 3 6 /Condition 1: A l l p o i n t s (e,x) are contained on a compact s e t . [This i s c l e a r s i n c e [0,T] and X are compact.] Condition 2a: I f (e,x)/fi then V(e,x)eA(e,x) Z(e,x) > Z(e,x) . [To v e r i f y t h i s we need to show f(x) - e > f(x) - e, o r e q u i v a l e n t l y , t h a t f ( x ) > f ( x ) - (1 - a)e . Since (e,x)£°. we know there e x i s t s yeX such t h a t Vf (x) ' (y - x) > e > 0 . By Theorem 2.1 of Zangwill-[117], there e x i s t s p o s i t i v e 6 such t h a t V t e [ 0 , 6 ] f ( x + T ( y - x)) > f(x) . This i m p l i e s t h a t max f ( x + x ( y - x) ) > f (x) . T E [ 0 , 1 ] By d e f i n i t i o n , f (x) >_ max f ( x + x ( y - x) ) - (1 - a) e T E [ 0 , 1 ] > f ( x ) - (1 - a)e .] Condition 2b: I f (e,x)eQ, then V(e ,x) eA(e,x) Z (e,x) >^  Z (e,x) . 37 [To v e r i f y t h i s we need to show f (x) - e > f (x) - e, or t h a t f (x) > f (x) - (1 - a) e . By d e f i n i t i o n ( f o r y some d i r e c t i o n generated by Y) f(x) _> max f (x + T (y - x) ) - (1 - a) e TE[0,1] > f (x) - (1 - a) e . ] Condition 3: A i s c l o s e d at (e,x) V ( e,x)^ fi. [This f o l l o w s from Lemma 1.5] / 38 Remarks We have defined the optimal set fi, by 0, = {x: max Vf (x) 1 (y - x) < e] . yeX Note t h a t f o r concave f (see Z a n g w i l l [117]) Vy.eX Vf (x) ' (y - x) > f (d) - f (x) . This i m p l i e s f o r xeQ, f (x) > f ( x ) - e yyzX. So that t h i s c h a r a c t e r i z a t i o n i s the same as the usual e-optimal d e f i n i t i o n . Our d e f i n i t i o n i s more u s e f u l i n an implementation sense. I t i s e a s i l y seen t h a t A reduces t o the usual FW map i f e=0. In p a r t i c u l a r , G w i l l become the map which determines the g r a d i e n t , Y w i l l s olve the usual l i n e a r i z e d d i r e c t i o n f i n d i n g problem, and M w i l l be the exact l i n e a r search map. To obtain a f i n i t e t e r m i n a t i n g c o n d i t i o n we need only add the c o n d i t i o n : d- 2) stop w i t h xeX approximately optimal i f YyeX V-f (x) • ( y - x) < 1/2 , where i =e/2K w i t h K=sug||y|| y eX 39 Lemma 1.6: I f (1.2) holds then xeQ . Proof: V f ( x ) ' ( y - x ) = (Vf (x)-Vgf (x) +V gf (x) ) ' ( y-x) = (Vf (x)-Vgf (x) ) ' (y-x)+V.f (x) ' (y-x) < | | Vf (x)-Vgf (x) | | • | |y-x| |+e/2 < (e/2K)K + e/2 = e • .//• There are many p o s s i b l e implementations of the a l g o r i t h m o u t l i n e d i n t h i s s e c t i o n . Figure 1.2 i l l u s t r a t e s one such v e r s i o n . To o b t a i n a f i n i t e e-stopping r u l e , one modifies the d e c i s i o n block a f t e r step 2 (as given above). 40 Figure 1.3; A Flowchart of the Extended A l g o r i t h m Step 0 Choose e°e(0 fT], ee[0,T], ae (0,1), x°eX. k=-l. Step 1: Step 2 : Step 3 < — k=k+l, Choose u ex such t h a t g I|u k - V f ( x k ) I I < e k . Choose y eX to solve max {u k' (y-x k) , -| |y-x k| yeX k k k Set d =y -x Choose T K e [ 0 , l ] such that k+1 k k .k _ x =x +T d s a t x s f i e s f ( x k + 1 ) > m a x f ( x k + x d k ) - ( l - a ) e k TE [0,1] c . k+1 k Set e = ae . 41 For n o t a t i o n a l and conceptual s i m p l i c i t y , the a l g o r -ithm presented here uses the sequence of tolerances e k k o given by e =(.d) e . In many cases i t i s more u s e f u l to consider a more general sequence. In t h i s case the con-vergence proof r e q u i r e s only a s l i g h t r e d e f i n i t i o n of the M psm. Thus, suppose there e x i s t s the continuous f u n c t i o n <j>:[0,T] -> [0,T] which determines the sequence of tolerances v i a k j t k-1. e = <f> (e ) . Formerly we had <j>(e) = ae. We assume <j> (e) _< e. We replace M by M*(e,x,d) = { (e,y) :e=(j) (e) , y=x+fd where fe [ 0 ,1 ] s a t i s f i e s f(y) > max f (x+xd) - (e-<J)(e))} . x e [ 0 , l ] The proof of the closedness of M* i s e s s e n t i a l l y the same as the proof of Lemma 4 and the changes i n Figure 1.2 are c l e a r . 42 Appendix: Convergence Theories In the l a s t decade there have emerged two main unifying frameworks for the analysis of algorithms and convergence theory. The f i r s t i s the point-to-set map (psm) concept developed by Zangwill [ 116] and [ 117] . The second i s the algorithmic model approach of Polak [ 8 3 ]. Both e f f e c t a very s i g n i f i c a n t u n i f i c a t i o n i n algorithmic development, furthering the work of Zoutendijk [ 12 3] a n d Topkis and Veinott [ no] , who developed theories f o r feasible d i r e c t i o n algorithms. In this appendix we b r i e f l y survey the main results of these two and other approaches. 43 §A.l P o i n t - t o - S e t Maps and the Convergence  Theory of Z a n g w i l l The main references f o r t h i s s e c t i o n are Z a n g w i l l [117] and Hogan [ 4 4 ] . We a l s o draw from Hogan [ 4 3 ] , Huard [46]/ Jacobsen [4 7] and Luenberger [ 5 7 ] , as w e l l as the works c i t e d below. The type of psm ( e q u i v a l e n t l y , m u l t i f u n c t i o n , g e n e r a l i z e d f u n c t i o n , m u l t i p l e - v a l u e d function) b a s i c to our d i s c u s s i o n w i l l be a mapping A:V -> 2 V ( 2 V i s the power s e t of V) , which associates a subset of V w i t h each p o i n t of V. Maps which depend only on zeV w i l l be c a l l e d autonomous. However, nonautonomous maps are a u s e f u l extension; e.g. the map may depend on the i t e r a t i o n number, k, and perhaps on the p r e v i o u s l y generated p o i n t s . Dependence of the f i r s t type w i l l be denoted by the s u b s c r i p t k. An a l g o r i t h m may now be de f i n e d as an i t e r a t i v e process c o n s i s t i n g of a sequence of psms {A^} (the a l g o r i t h m i c maps), so that given an i n i t i a l p o i n t z°eV, a sequence of p o i n t s k k+1 k {z } i s generated v i a the r e c u r s i o n z eA k(z ). We say the psm A:V •> W i s open at z*eV i f k k z •> z* (z eV Vk) and y*eA(z*) imply there e x i s t s an i n t e g e r m and a sequence {y k} such t h a t Ir k yJS"eA(z ; Vk>m , and y k y* 44 A i s closed at Z*EV, i f z k z* (z keV Vk) , y k e A ( z k ) and it y y* imply that y*eA(z*) . We say A i s continuous at z*eV, i f A i s both open and c l o s e d at z*. In Huard [46], the f i r s t two concepts are c a l l e d upper cont-inuous and lower continuous r e s p e c t i v e l y . Berge [ n ] and others w i t h a more t o p o l o g i c a l o r i e n t a t i o n , use the terms upper and lower semicontinuous. However, the l a t t e r usage may cause confus-i o n , since when these d e f i n i t i o n s are s p e c i a l i z e d t o f u n c t i o n s which are s i n g l e - v a l u e d , they imply ( c l a s s i c a l ) c o n t i n u i t y and not ( c l a s s i c a l ) s e m i c o n t i n u i t y . We assume there e x i s t s a s e t ft i n V , which we c a l l the s o l u t i o n s e t . Z i s an ascent f u n c t i o n f o r ft and A, i f a) Z:V •* R , ZeC° , b) zeV/ft , yeA(z) imply Z (y) > Z (z) , and c) zeft , yeA(z) imply Z (y) >_ Z ( z ) . I n t u i t i v e l y , an a l g o r i t h m seeks a p o i n t i n ft, and f o r p o i n t s outside the s o l u t i o n s e t , one i t e r a t i o n of the a l g o r i t h m y i e l d s an increase i n the value of the ascent f u n c t i o n . Note t h a t ft may be a s e t of minimizing p o i n t s f o r the p a r t i c u l a r problem and Z may correspond t o the o b j e c t i v e f u n c t i o n . Often however each may have a more general d e f i n i t i o n ; t h i s gives Zangwill's approach more f l e x i b i l i t y than c l a s s i c a l approaches. 4 5 As w i l l be shown, i t i s o f t e n necessary to ensure t h a t a sequence of p o i n t s {z } on an a l g o r i t h m i c map contains a convergent subsequence. W compact w i l l of course s u f f i c e , and f o r E u c l i d i a n spaces, the uniform boundedness of A(z) i s s u f f -i c i e n t . In more general spaces, the concept of uniform compact-ness i s u s e f u l . (The map A i s uniformly compact near z* i f there e x i s t s a neighbourhood N of z* such t h a t the closure of u A(z) i s compact.) zeN The theorems of t h i s s e c t i o n p a r a l l e l Z a n g w i l l [117], although we w i l l use s l i g h t l y d i f f e r e n t terminology i n order t o s t a t e the r e s u l t s more s u c c i n c t l y . V Convergence Theorem A; Let the psm A:V -*• 2 denote the algorithm which given z°eV, generates the sequence {z k} . Let a s o l u t i o n set Q be given. Further assume, (i) z keX Vk, where X i s a compact subset of V, ( i i ) there e x i s t s an ascent f u n c t i o n Z f o r Q and A, and ( i i i ) A i s c l o s e d at zeV/fi Then the algorithm i s convergent. (I.e. e i t h e r the a l g o r i t h m k k k stops at a s o l u t i o n ( f o r some z , A(z )=<j> and z eQ) , or any accumulation p o i n t of the sequence {z k} i s a s o l u t i o n . ) 46 Note t h a t i n t h i s theorem, and i n other contexts, we use the generic v a r i a b l e z, r a t h e r than x (the d e c i s i o n v a r i a b l e ) s i n c e z, as a p o i n t on the a l g o r i t h m i c map, i s not n e c e s s a r i l y i d e n t i f i e d w i t h the v a r i a b l e of the program, x. Assumption ( i ) i m p l i e s t h a t a l l subsequences are w e l l behaved, but i t does not r e q u i r e t h a t the f e a s i b l e set be compact. Condition ( i i i ) i s the crux of the theorem and i s u s u a l l y the most d i f f i c u l t hypothesis t o v e r i f y . On occasion, one may use the theorems of Dubois [ 24] , which show tha t any algorithm which i s an 'improvement' of an a l g o r i t h m which has a closed map, need not s a t i s f y c o n d i t i o n ( i i i ) t o ensure convergence. The theorems of Dubois embody concepts f a m i l i a r i n proving the convergence of s e r i e s ; presumably t h i s i d e a can be developed f u r t h e r . Most i t e r a t i o n s of a p a r t i c u l a r a l g o r i t h m are composed of a number of steps, i . e . a composition of psms. Note t h a t we may not conclude, without f u r t h e r hypotheses, t h a t a compos-i t i o n of c l o s e d maps i s c l o s e d ( c f . Z a n g w i l l [117], p.100). We have the f o l l o w i n g r e s u l t s , analogous to Zangwill's Lemma 4.2 and C o r o l l a r y 4.2.1. Theorem!.2: Let C:W -»• X, B:X + Y b e psms. Suppose C i s c l o s e d at w* and B i s closed on C(w*). Assume tha t C i s r e l a t i v e l y * compact near w*. Then A = BC i s c l o s e d at w*. * I.e. compact i n the r e l a t i v e topology on W; see the appendix to chapter three. 47 (The composite map A i s defined by A(w) = u B(x) VweW.) xeC (w) Corollary 1.2.1 ; c;W X , B : X -> Y are psms, C i s continuous at w* and B i s c l o s e d on C(w*). Then A = BC i s closed at w*. The above r e s u l t s suggest the i n t u i t i v e appeal of Zangwill's approach. I.e. we may segment an a l g o r i t h m i n t o a number of subprocedures, each descibed by a psm. Most algorithms, viewed t h i s way, are composed of psms which are known to be c l o s e d . Thus wi t h each m o d i f i c a t i o n of an alg o r i t h m , we may a l t e r those psms which are to be changed i n a manner analogous t o recoding subroutines which are t o be rec o n s t r u c t e d . We define a mixed a l g o r i t h m to be an algorithm which has an autonomous map B used i n f i n i t e l y o f t e n ( i . e . f o r an i n f i n i t e s e t of i n d i c e s K ) ; f o r kgfK we use other maps. Convergence Theorem B:Suppose the a l g o r i t h m i c map B s a t i s f i e s hypotheses ( i ) - ( i i i ) of Convergence Theorem A. We de f i n e the mixed a l g o r i t h m by A^:V ->- 2 V such t h a t A k = B V k e K and Vk^ K Z ( z K x ) >_ Z(z ) . Assume t h a t z e X , a compact subset of V, Vk, and z*eQ, and Z (y) >_ Z (z*) i m p l i e s yeft. Then the mixed a l g o r i t h m i s convergent. I n t u i t i v e l y , i f we use a convergent a l g o r i t h m i n f i n i t e l y o f t e n , we may use other maps (as long as they do not decrease the ascent function) at other i t e r a t i o n s . 48 §A.2 Alg o r i t h m Models To i l l u s t r a t e Polak's approach, l e t us define a very general problem as f o l l o w s : given T (a c l o s e d subset of a Banach space B wi t h norm ||•{| ) the o b j e c t i v e i s to f i n d a zeft={zeT: z has property P} , where P i s some ' d e s i r a b l e ' property. We can char-a c t e r i z e Z a ngwill's approach as an a l g o r i t h m i c model given by the psm A:T -* 2 T. Model Z: Step 0: F i n d z°eT; set k=0. Step 1: Choose z k + 1 e A ( z k ) . Step 2: I f z k + 1 e f t then stop. otherwise set k=k+l; go to step 1. Polak [ 82] has the f o l l o w i n g theorem which p a r a l l e l s Conver-gence Theorem A. Theorem A2.1:Suppose t h a t ( i ) there e x i s t s c:T -> R such t h a t e i t h e r ceC° on T/ft or c i s bounded from below on T, and suppose t h a t ( i i ) VzeT/ft there e x i s t s e(z) > 0 and 6(z) < 0 such t h a t c(z") - c(z') < 6(z) < 0 Vz'eB(z,e) where B(z,e) = {z'eT:||z' - z|| B < e(z)} Vz"eA(z') , then the algorithm i s convergent. Polak focuses on the d i s t i n c t i o n between a conceptual and an implementable algorithm. Model Z i s conceptual because there 49 i s no guarantee t h a t step 1 can be determined by a f i n i t e procedure. Hence he introduces approximations to A ( z k ) . For examp the parameters a>0, Be(0,1) and e >0 + T l e l e t A:R x T 2 , c:T -»- R and assume we are given Model P: Step 0: Choose z°eT; k=0. Step 1: e=e o Step 2: Choose y e A ( e , z k ) . Step 3: I f c(y) - c ( z k ) <_ -ae then z k + 1 = y , k=k+l, go t o step 1; otherwise set •. e = Be , z k + 1 = z k , k=k+l, go to step 2. Theorem A2.2: Model P i s convergent i f (i) c(z)eC° at a l l zeT/ft or c(z) i s bounded from below on T, and ( i i ) VzeT/ft there e x i s t s e(z)>0, 6(z)<0 and y(z)>0 such t h a t c(z") - c(z') < 6(z) < 0 Vz 1eT , I l z ' _ z I lB - e ( z ) Vz"eA(Y,z') Vye[0,Y(z) ] . Polak uses a model s i m i l a r to Model P to prove the convergence of FW [82 ] a l l o w i n g n o n l i n e a r convex i n e q u a l -i t i e s t o define X, and u t i l i z i n g an approximate s o l u t i o n t o the d i r e c t i o n f i n d i n g step (which of course i s much more d i f f i c u l t i n t h i s case than the case of p o l y h e d r a l X). 50 Huard [ 46] uses these ideas to develop approximate algor-ithms s i m i l a r to those of K l e s s i g and Polak; i n p a r t i c u l a r an application of his prototype for 1unidimensional sub-optimization' methods yiel d s a convergence proof for FW with approximate l i n e searches. Nevertheless, surveying the examples of t h i s approach (e.g. Huard [46 ] , K l e s s i g [50 - 51] , K l e s s i g and Polak \s2 ] 1 Polak [81 - 82] . ) ,we f i n d that the p o t e n t i a l generality has not been exploited. Each author develops his own algorithm model and i t i s unclear that any s i g n i f i c a n t benefits can be achieved other than for very s i m i l a r algorithms. A Useful survey of the connections between the theory of Polak and that of Zang-w i l l can be found i n Tishyadgigama, Polak and Klessig [ 1 0 8 ] . 51 §A.3 Other Approaches to Convergence Theory There are some other approaches to convergence and a l g o r -i t h m i c theory which consider the l e v e l of g e n e r a l i t y t h a t the t h e o r i e s of Z a n g w i l l and Polak w i l l a l l o w . One such i s the f o r c i n g f u n c t i o n approach of Meyer [ 68 - 69 ] . Meyer shows tha t t h i s approach can handle the c l a s s of algorithms t h a t the psm theory encompasses, but i s more r e a d i l y adaptable to a l g o r -ithms which r e q u i r e 'antijamming' procedures. [Jamming i s e s s e n t i a l l y the r e s u l t of s e l e c t i n g d i r e c t i o n s without c o n s i d e r i n g boundaries which are near the current p o i n t . For example (adapted from Z a n g w i l l [ 1 1 7 ] ) we have the f o l l o w i n g diagram. Diagram A l . 1 : Jamming 52 I f one only considers the c o n s t r a i n t s at x which are a c t i v e (binding) then d i s the optimal d i r e c t i o n . However, because of the p r o x i m i t y of the boundary (g^(x)=0), i t can lead to only a very small improvement. Thus, one technique i s to determine an e-tolerance and consider any c o n s t r a i n t as a c t i v e which i s e - a c t i v e . For diagram A l . l we may generate the d i r e c t i o n d*.] Meyer's ide a i s to determine an ' o p t i m a l i t y i n d i c a t o r ' or f o r c i n g f u n c t i o n 6: G ->- R ( f o r G a closed subset of R ) . Define ftL = {x: there e x i s t s {y 1} i n G w i t h yX->-x, 6(y-"-)-*0}, Q* = {x: xeG, 6(x)=0} and ft+ = {x: xeG, 6 (x) >0} . I n t u i t i v e l y , 6 i s 0 at 'optimal' p o i n t s and p o s i t i v e elsewhere. We can o b t a i n the f o l l o w i n g easy lemma. Lemma A3:Let <J>:G -»• R be lower semi-continuous on G and l e t {x 1} be a sequence such t h a t ^ ( x 1 ) - <Mx 1 + 1) > Mx 1) i = 0 , l , 2 , . . . . I f {x 1} has an accumulation p o i n t x* then 6 (x 1) -> 0 and x*eft L. This lemma i s i n t i m a t e l y r e l a t e d t o Convergence Theorem A and t o Theorems A2.1 and A2.2, but i s s l i g h t l y sharper because the d e f i n i t i o n of the 'goal s e t ' ftL i s t i e d to the a c t u a l f u n c t i o n i n g of the a l g o r i t h m , r a t h e r than being independent of 53 i t . The technique has the other advantage of dispensing w i t h the ideas of psm convergence, but, conversely, i t r e q u i r e s the determination of the f u n c t i o n 6 which may pose a problem. K l e s s i g [ 5 1 ] extends some of the r e s u l t s of Polak so as to a l s o t r e a t the c l a s s of algorithms which u t i l i z e antijamming p r o v i s i o n s more d i r e c t l y . He d e r i v e d r e s u l t s s i m i l a r to those which Meyer l a t e r generated i n a s l i g h t l y more general s e t t i n g . K l e s s i g [ 5 1 ] makes the f o l l o w i n g statement: "... i n order t o compute x 1 + ^ both x 1 and e 1 (the antijamming parameter) must be known. Consequently, a p o i n t - t o - s e t map A such t h a t x^+^"eA(x^) does not e x i s t f o r c h a r a c t e r i z i n g t h i s algorithm. This means t h a t the Polak and Z a n g w i l l t h e o r i e s do not apply i n a s t r a i g h t f o r w a r d manner. In f a c t we are not aware of any way of s u c c e s s f u l l y applying these t h e o r i e s . " As we have seen i n t h i s chapter, i t i s p o s s i b l e to apply the Z a n g w i l l theory (and i n the same way, Polak's) to a l l o w f o r x 1 + 1 to depend upon x 1 and upon e 1 w i t h e 1 a t o l e r a n c e , simply by l e t t i n g the p o i n t on the a l g o r i t h m i c map be ( x ^ e 1 ) r a t h e r than j u s t x 1 . In an analogous manner we can proceed using e 1 as an antijamming parameter. §2: The Solution of P o r t f o l i o Selection Problems In t h i s chapter we apply the method developed i n chapter one to the solution of p o r t f o l i o s election problems. After a b r i e f survey of the l i t e r a t u r e i n §2.1, we describe some of the measures of r i s k aversion; those of Arrow-Pratt, of Tobin and of Rubinstein. We develop a relationship between the l a s t two measures and prove a theorem^ which e s s e n t i a l l y implies that Rubinstein's measure determines p o r t f o l i o composition i f the returns are normally d i s t r i b u t e d . For the remainder of §2 we use data estimated from ten NYSE s e c u r i t i e s over monthly, quarterly and yearly horizons. Diagrams 1 to 21 graph p o r t f o l i o proportions (for each of the seven u t i l i t y functions over the three periods) over the relevant range of the u t i l i t y function's parameter. Tables 6 to 10 examine the v a l i d i t y of the hypothesis that u t i l i t y functions with the same average R A have the same p o r t f o l i o composition. We conclude with an examination of the impact of errors i n the solution due to errors i n the estimation of % and E . t This theorem was communicated by M. Rubinstein together with a proof for a s p e c i a l case. The proof i n thi s chapter follows Rubinstein's paradigm. 5 5 §2.1 P o r t f o l i o Selection Models In t h i s section we survey some of the more important papers i n the area of p o r t f o l i o s e l e c t i o n . We w i l l have a bias towards the computational aspects of the l i t e r a t u r e . As we s h a l l attempt to cover an extensive f i e l d of l i t e r a t u r e i n a very b r i e f s e c t i o some s i g n i f i c a n t contributions w i l l be omitted and others w i l l be treated i n a cursory fashion. Mean-Variance Models and Their Manifold Extensions The seminal work i n p o r t f o l i o theory i s by Markowitz [ 61 ] and [62 J a n& Tobin [ 109] / who originated the mean-variance approach to p o r t f o l i o selection, thereby developing a formal method to incorporate r i s k . One formulation of t h i s approach can be given by the following quadratic program: max 8(£'x) - x'Ex n subject to £x. = 1 , x >_ 0 . i = l 1 Here, and subsequently, x i s a vector representing the propor-tions of i n i t i a l wealth invested i n each of the n s e c u r i t i e s . Security returns are random with mean \ and variance-covariance matrix £ . 8 i s a parameter which depends on the investor's 'risk preferences'. 56 The c r i t i c a l assumption of t h i s model i s t h a t i n v e s t o r s e v a l u a t e s e c u r i t i e s only i n terms of means, v a r i a n c e s and c o v a r i a n c e s . For example, t h i s i s v a l i d i f we assume s e c u r i t y r e t u r n s are normally d i s t r i b u t e d or t h a t the i n v e s t o r has a q u a d r a t i c u t i l i t y f u n c t i o n . By a l l o w i n g a l l p o s s i b l e values of 3 , we generate the e f f i c i e n t s e t , those p o r t f o l i o s which minimize x'Ex ( r i s k ) f o r a g i v e n l e v e l of I'x ( r e t u r n ) , or ( e q u i v a l e n t l y ) those p o r t f o l i o s which maximize r e t u r n f o r a given l e v e l of r i s k . The i d e a i s t h a t any reasonable p r e f e r -ence o r d e r i n g w i l l y i e l d an o p t i m a l p o r t f o l i o which i s e f f i c -i e n t . Thus we s o l v e p a r a m e t r i c a l l y the program to o b t a i n the e f f i c i e n t s e t and then,dependent upon the i n v e s t o r ' s p a r t i c -u l a r r i s k p r e f e r e n c e s ( u t i l i t y f u n c t i o n ) , a p o r t f o l i o i n the e f f i c i e n t s e t i s chosen. One drawback of t h i s approach i s p o i n t e d out by Mossin [ 7 3 ] : "...most of the computational work t h a t goes i n t o d e t e r m i n a t i o n of the e f f i c i e n c y l o c u s w i l l be wasted; once a p a r t i c u l a r p o i n t on the l o c u s i s s e l e c t e d , the r e s t of i t i s u s e l e s s knowledge." Another d i f f i c u l t y i s ( f o r l a r g e n) the l a r g e amount of data r e q u i r e d , i n p a r t i c u l a r the c o v a r i a n c e s . More fundament-a l l y , the v a l i d i t y of such data ( t r a d i t i o n a l l y d e r i v e d from 57 past p r i c e t r a j e c t o r i e s ) i s questionable; t h i s remains the most s e r i o u s o b s t a c l e to the e f f e c t i v e n e s s of such models. To a l l e v i a t e the data problem, Sharpe [ 96] suggested the 'si n g l e index' model. Here i t i s assumed t h a t a given secur-i t y ' s v a r i a t i o n s are b i p a r t i t e : one component (a.. + S^M) c o r r e l a t e d w i t h the market (or some other i n d e x ) , and the other (Uj), unique to each s e c u r i t y and independent of a l l other u^. Thus we have x . = a . + B M + u D 3 j j where M i s the index. This i s of course, the foundation f o r the c a p i t a l asset p r i c i n g model; f o r development see Sharpe [ 98 ]. For n>4, the Sharpe model r e q u i r e s l e s s data than the f u l l covariance model, and f o r large n the divergence i s very l a r g e . A recent study by F r a n k f u r t e r , P h i l l i p s and Seagle [ 3 7] u t i l -i z e s s i m u l a t i o n t o conclude t h a t the Sharpe model, i n a d d i t i o n to i t s s u p e r i o r t r a c t a b i l i t y , a l s o y i e l d s ' b e t t e r ' e f f i c i e n t s e t s . Sharpe [ 9 7 ] has f u r t h e r s i m p l i f i e d t h i s model using an assumption t h a t guarantees s e p a r a b i l i t y of the variance - cov-ariance terms, thus enabling s o l u t i o n by separable l i n e a r programming. More recent papers t h a t a l s o d e a l w i t h determin-a t i o n of the e f f i c i e n t set are Bawa [ 8-9]. 58 Under certain conditions i t can be shown that the mean-variance approach i s equivalent to the more general maximization of expected u t i l i t y approach (as mentioned above for example). See Ziemba and Vickson [±22] and references contained therein for d e t a i l s . Samuelson [ 99] and Ohlson [76 ] give conditions for the consistency of the mean-variance solution with the general expected u t i l i t y solution by considering asymptotic quadratic u t i l i t y approximations as the trading i n t e r v a l becomes a r b i t r a r i l y small. These results have been extended by Ohlson [ 7 8] to the case where the means and variances of returns may be i n f i n i t e . One consequence of t h i s i s that i t shows the consistency with more general return d i s t r i b u t i o n s , e.g. stable and logstable d i s t r i b u t i o n s . Other Computational Models An alternative to the mean-variance model i s the safety-f i r s t approach, f i r s t suggested by Roy [ 8 8 ] , which e s s e n t a i l l y seeks to minimize the p r o b a b i l i t y of large losses. Approaches along t h i s l i n e include applications of chance-constrained programming; Telser [107] and l a t e r Sengupta [ 9 4 ] . Also s i m i l a r i n s p i r i t i s Katoaka [49 ] , where the objective i s maximizing the largest return l e v e l that can be attained by (at least) a given p r o b a b i l i t y . 59 An approach which u t i l i z e s the separation property and f r a c t i o n a l programming i s Ziemba, Parkan and Brooks-Hill i^l^• Other authors have dealth with s p e c i f i c forms of the u t i l i t y function and/or s p e c i f i c d i s t r i b u t i o n s of returns. For example, Ziemba [ n s ] and Ohlson [ 7 7 ]. The goal programming approach in a mean-variance setting i s considered by Lee and Lerro [133] a stochastic dominance approach by Fishburn and Porter [331 i s the framework for analysis of changes i n optimal holdings of a risky asset and a safe asset under changes i n returns. F i n a l l y h e u r i s t i c methods are developed by Elton, Gruber and Padberg [27 - 29] . In summary, i t i s clear that h i s t o r i c a l l y , t h i s approach has dominated the p o r t f o l i o s e l ection l i t e r a t u r e . However, the tractable methods that have thusfar been developed f a l l (more or less) into one of four categories: a. approximation methods (e.g. Smith [ 1 0 1 J / Sharpe [ 97], Elton et a l [27 - 29]) , b. models assuming p a r t i c u l a r forms of the u t i l i t y function (e.g. Hakansson [129] Latane [132]), c. models assuming s p e c i f i c return d i s t r i b u t i o n s (e.g. Ziemba [ 1 1 8 ] , Ohlson [ 77]), or d. models u t i l i z i n g very limited numbers of assets (e.g. Fishburn and Porter [ 33] ) . 60 It i s perhaps most d i f f i c u l t to dispense with the assumption embodied i n category c. The main computational d i f f i c u l t y being that i f one i s not r e s t r i c t e d to a d i s t r i b u t i o n which i s closed on addition (viz. stable Paretian d i s t r i b u t i o n s ) , then the expected u t i l i t y with n assets (and dependency) would be expressed as an n-fold i n t e g r a l . Using stable Paretian d i s t r i b -utions, the objective function reduces to a univariate i n t e g r a l . For example, i n the normal case, i f £, the random vector corr-esponding to security returns has a N[f,E] d i s t r i b u t i o n , then £'x (the random variable corresponding to f i n a l wealth) is distributed N[£'x,x'£x] which i s univariate normal. 61 §2.2 Measures of Risk Aversion For the remainder of t h i s chapter we suppose th a t a d e c i s i o n maker's preferences f o r a p a r t i c u l a r set of random outcomes can be represented by a (von Neumann-Morgenstern) u t i l i t y f u n c t i o n . Thus we may assume there e x i s t s a mapping (u) from f i n a l wealth (w) i n t o R such that u(w) i s the u t i l i t y f o r f i n a l wealth. C l e a r l y u i s i n c r e a s i n g . We w i l l a l s o r e s t r i c t ourselves to r i s k averse u t i l i t y f u n c t i o n s . I.e., we assume tha t the dec-i s i o n maker p r e f e r s the expected value of any random gamble to t h a t gamble. This i m p l i e s (see f o r example, Keeney and R a i f f a [131]) t h a t the d e c i s i o n maker has a concave u t i l i t y f u n c t i o n . The converse i s a l s o t r u e . Risk aversion i s o f t e n c h a r a c t e r i z e d v i a the concept of a r i s k premium. The r i s k premium, i r , of the gamble z i s defined by TT = E (z) - u _ 1{Eu(z) } , where u 1{Eu(z)} i s the c e r t a i n t y e q u i v a l e n t of the gamble z. We may i n t e r p r e t TT as the amount the d e c i s i o n maker i s w i l l i n g to pay t o avoid the gamble. For example, i f z has the e q u a l l y l i k e l y outcomes 0 and z, we can represent TT and the c e r t a i n t y e q u i v a l e n t of z i n the f o l l o w i n g diagram. 62 u(w) u(E(z)) E(u(z) ) A ~ / 1 \ / \/ \ 1 ( / i i i u _ 1 ( E u ( z ) ) z > E(z) Diagram 2.0: Risk premium and c e r t a i n t y e q u i v a l e n t I f u i s i n c r e a s i n g , the d e c i s i o n maker i s r i s k averse i f and only i f the r i s k premium i s p o s i t i v e f o r a l l random gambles. A number of r i s k aversion measures have been proposed to rank a d e c i s i o n maker's aversion t o r i s k . Arrow [5] and P r a t t [85] developed the f o l l o w i n g two i n d i c e s which r e l a t e the mathematical p r o p e r t i e s of u to the r i s k a t t i t u d e s of the i n v e s t o r . The absolute r i s k aversion index 63 i s R (w) = -u" (w)/u' (w) and the r e l a t i v e r i s k aversion index i s given by RD(w) = -wu" (w)/u' (w) . RA ^RR^ c a n b e i n t e r P r e t e c ^ a s a l o c a l measure of the aversion to r i s k as a function of (the proportion of) t o t a l wealth. Note that both measures are invariant under li n e a r transformations of u, and thus are superior to other possible measures - e.g. u". Pratt [85 1 demonstrated that these measures have a correspondence with a number of ' i n t u i t i v e ' character-izations of r i s k aversion. He showed that (under suitable regularity conditions) that R^(w) i s twice the r i s k prem-ium per unit of variance for i n f i n i t e s i m a l r i s k s . Thus one may i n t e r p r e t RA(w) as the l o c a l propensity to insure at wealth l e v e l w. A number of important propositions follow; one such shows that i f R* (w) < R2(w) then an investor with RA(w) i s more ris k averse than an investor with RA(w) in a neighbourhood of w. See Pratt [85] for further discussion. An alternative characterization of r i s k aversion, developed by Tobin P-09] , i s derived from the investor's indifference curves i n (y,a) (return-risk) space. This concept has l i t t l e a p p l i c a b i l i t y without the assumption of normal (or some other two parameter) return d i s t r i b -utions. Here, an investor i s r i s k averse,if 2 2 du/da > 0 and d u/da > 0 (I.e. the indifference curves are convex to the a-axis.) To i l l u s t r a t e , suppose we have , for pos i t i v e 3 , u1(w) = w - B^w2 It i s easy to show that CO / u (w)f(w)dw = u - B y 2 - 3 a 2 -co 1 1 1 = u(y,a) , where f(w) i s the density of w, which has a N[y,a] d i s t r i b -ution. The slopes of the indifference curves (u=k a constant) are calculated from dy/da = - {du/da) / Ou/3y) which, for u-, i s (23 a ) / ( l - 23 y) . Now d 2 y / d a 2 = 9 / 9 a(dy/da) which i s ( 4 a 3 2 ) / ( l - 2 3 ^ ) * > 0. Thus, to obtain r i s k aversion i n Tobin's sense, we need only specify that 3 < l/(2y) (so that dy/da > 0); t h i s i s analogous to the Arrow-Pratt r e s t r i c t i o n that 3 < l/(2w) to ensure u'(w) > 0. 65 Adler [ 2 ] notes that these two characterizations are not equivalent. This i s clear, since Tobin's d e f i n -i t i o n depends upon the d i s t r i b u t i o n of returns and u, while the Pratt-Arrow measure depends only on u. The r e l a t i o n s h i p between the two i s further studied by M i l l e r [70] and Amihud [ 3 ] . i n the l a t t e r paper, i t i s proven that (under rather r e s t r i c t i v e assumptions) that (locally) 3/3y(du/da) evaluated along an indifference curve i s p o s i t i v e , zero or negative, when d/dw(RA(w)) i s p o s i t i v e , zero, or negative. While these three papers seek (unsucessfully) for a relationship between the Arrow-Pratt measure and the Tobin conditions, a rather strong relationship between Tobin's d e f i n i t i o n and Rubinstein's measure (discussed i n the next section, R ^ = - W Q E ( U " ) /E (u')) can be derived. M i l l e r has shown that Tobin's f i r s t condition i s 0 < dy/da = -E[zu'(w)]/E[u'(w)] where z = (w - u)/a. Suppose that z ^ N[0,1]. Observe that E[zu'(w)] = cov(z,u'(w)) + zE(u'(w)) = cov(z,u' (w) ) . 66 Now by Rubinstein [ 89 ], i f u'(w) i s bounded above and ueC , cov(z,u'(w)) = E(u"(w))cov(z,w) . cov(z,w) = E[zw] - z = E [ zw ] -1 2 = (a) E [w - uw] Thus = a R1? = -o-E [u" (w) ]/E [u' (w) ] Note the relationship between t h i s measure and Rubinstein's measure. 67 Rubinstein's Measure A measure of (global) r i s k aversion can be defined v i a -w E(u"(w)) R = ° M E[u'(w)] This measure i s due to Rubi n s t e i n . We b r i e f l y describe i t s development. I f we have a p e r f e c t l y competitive s e c u r i t i e s market where i n d i v i d u a l i ( i = l , . . . , I ) maximizes u?(c) + E [u(w) ] Here u? i s the u t i l i t y f o r current consumption and u . i s the o u t i l i t y f o r fut u r e wealth. Both u^and u^ are assumed t o be s t r i c t l y concave and i n c r e a s i n g . Further we assume a r i s k f r e e asset e x i s t s , there are no r e s t r i c t i o n s on short s a l e s or borrowing, and w i s normally d i s t r i b u t e d (or u i s q u a d r a t i c ) . ThenRubinstein [89 ] shows th a t a c o n d i t i o n f o r market e q u i l -i b r i u m i s E ( r p ) = r f + ( 0 / I ) c o v ( r p , w m ) where Vp i s the present value of p o r t f o l i o p, v i s the (random) fut u r e value of p o r t f o l i o p, P r = v /v i s one plus the ra t e of r e t u r n on p o r t f o l i o p, P P P w = Iw i s the (random) future value of the t o t a l number of m s e c u r i t i e s i n the market, r f i s one plus the r i s k l e s s rate of r e t u r n , and 0 = -E[u"(w)]/E[u'(w)] i s an i n d i v i d u a l measure of r i s k a v e r s i o n . 68 This valuation formula i s a s p e c i a l case of Rubinstein's 'fundamental theorem of parameter preference security valuation' [ 89 ]. This paper extends the mean-variance valuation model of Sharpe [98 ] and others. S p e c i f i c a l l y , for i n d i v i d u a l i E.(r ) = r f + Xe.na <r fw.) Vp 2 with 0 i n = - u | n ) / [ ( n - 1) !E[u! (wj ] ] an{rp'™i] = E [ r p " E [ r p ] ] (™i " E l W i l ) 1 1 " 1 .. Here 0. r e f l e c t s the i n d i v i d u a l measure of r i s k aversion, in Note that 0. i s invariant under p o s i t i v e a f f i n e transformations i n of u^ and that i t allows for comparison of r i s k aversion at d i f f e r e n t wealth l e v e l s , as well as global comparisons. A further j u s t i f i c a t i o n for t h i s measure i s described i n Litzenberger [56]. He observes that (under the conditions outlined at the s t a r t of t h i s section) that the marginal rate of substitution between expected rate of return and standard deviation i s related to Rubinstein's measure. I.e. dE(r p)/da p|E(u) = E [ r m - r f ] a m = C*a . P Here C* i s the investor's global r i s k aversion, C* = -w 0E[u"(w 1)]/E[u'(w 1)] and wQ (w-j_) i s the i n i t i a l (final) wealth. The following theorem shows that assuming normal d i s t r i b -utions, Rubinstein's measure can determine uniquely p o r t f o l i o composition. 6 9 Theorem 2.1: Assume that 5 i s j o i n t normally d i s t r i b u t e d 2 and that u. e C , u! > 0, uV < 0 i = 1,2 . Consider the f o l l o w i n g two problems (PI) {max u^w^'x) : Ex = 1, x > 0} (P2) {max E^ u^w^'x) : Ex = 1, x > 0}. Assume t h a t x* i s an optimal s o l u t i o n to (PI)and t h a t w ^ u ^ w ^ ' x * ) w 2E ?u^ (w 2 C'x*) E^u^w^'x*) E 5u^(w 2 C'x*) Then x* i s an optimal s o l u t i o n to (P2). PROOF: The necessary and s u f f i c i e n t Kuhn-Tucker co n d i t i o n s f o r (PI) are given by the system ( K T l ) : (al) E ^ u ^ w ^ ' x j w ^ } - X - \i± = 0 (bl) y i x i = 0 (cl) ]i± > 0 (dl) Ex = 1, x > 0 r i ( i = 1,...,n) . 70 Now, since cov(a,b) + E(a)E(b) = E(ab), we have E 5 { u l ( w l 5 ' x ) w l ? i } (2.1) = cov(w 1? i , u^(w 1C'x)) + w^E^u^w^ *x) Note that cov(x,g(y)) = E(g'(y))cov(x,y) i f g e C"*" , a l l expectations are f i n i t e , and x and y have a b i v a r i a t e normal d i s t r i b u t i o n . See Rubinstein [91] or Stein [10 2] for proof. I t now follows that covfw^. , u^(w 15'x)) = E^ (u^(w1C*x) ) c o v ( w 1 5 I , w-^'x) (2.2) = w^E ?(u^(w 1 5 ,x))2 C o v ( C i , ^ . ) x ^ since cov(u,2v i) = 2cov(u,v^) . Combining (2.1) and (2.2) , we may rewrite (al) as \ + \i± = w ^ E ^ (u^w^'x) + w ^ t u J l w ^ ' x n S c o v U ^ S j ) Observe that the Kuhn-Tucker conditions for (P2),(KT2) can be given by (KT1) with (al) replaced by (a2) and y i replaced by y\ i n (bl) and (cl) . 71 (a2) X + y ± = w 2I iE ?(u 2(w 2c: ,x) ) + w ^ (uJ (w2 5 ' x) ) Scov ( 5 ± , 5 j ) x_. Now, by assumption, f o r x* a s o l u t i o n t o (PI) w xE uj^w^'x*! w 2E uj^w^'x*) = = w © = w 0 E u j l w ^ 1 * ) E ? u ^ (w 2?'x*] ^ 1 thus, (al) i s e q u i v a l e n t t o ( X + y i ) / ( w 1 E ? (uj_(w 1? ,x*) )) = lL + w ^ E c o v f ^ . ^ . l x * and (a2) i s e q u i v a l e n t t o (X + ' y i ) / ( w 2 E ? ( u ^ w ^ ' x * ) ) ) = ? i + w 28Ecov(q,?.)x* Note t h a t the r i g h t hand s i d e s f o r the above two equations are i d e n t i c a l , and t h a t the denominators of the l e f t hand s i d e terms are constants (nonnegative). Thus given a s o l -u t i o n (x*,A,u) t o (KT1), we can generate a s o l u t i o n (x*, X,y) t o (KT2) by m u l t i p l y i n g X and y by a p p r o p r i a t e cons-t a n t s . //. 72 §2.3 The P o r t f o l i o S e l e c t i o n Problem The s t a t i c p o r t f o l i o problem under c o n s i d e r a t i o n i s n (2.3) max E {u(£*w x) } subject t o Ex. = 1, x _> 0 . x ^ ° 1 1 We assume: a. the u t i l i t y f u n c t i o n ueC"*", u'>0 and u i s concave; b. the r e t u r n d i s t r i b u t i o n i s normal with mean vector t and covariance matrix E; c. we have s c a l e d the budget c o n s t r a i n t so t h a t we may i n t e r p r e t x^ as the pr o p o r t i o n of i n i t i a l wealth i n v e s t e d i n s e c u r i t y i ; d. w i s the i n i t i a l wealth l e v e l . o Table 2.1 l i s t s the seven c l a s s e s of u t i l i t y f u n c t i o n s we w i l l consider along w i t h some of t h e i r most important prop-e r t i e s . These functions chosen represent the most f r e q u e n t l y discussed u t i l i t y f u n ctions and the group i n c l u d e s examples of i n c r e a s i n g , constant and decreasing absolute and r e l a t i v e r i s k a v ersion. In t h i s and the f o l l o w i n g w=l means that the given f u n c t i o n i s evaluated at a wealth l e v e l of one. T a b l e 2.1 U t i l i t y Function P r o p e r t i e s r n U t i l i t y Function u(w) Absolute Risk-Aversion R -R A U' 3 RA. aw Re l a t i v e Risk-Aversion RR ~ 3RR 3w (at w - .1) T h e o r e t i c a l Parameter Range 3 R A 38 (at w - 1) Quadratic « 2 2 8 1 > 0 i n c r . R, A 28xw > 0 > 0 W - PjW 1 - 2 8 ^ 1 - 28xw i n c r . . ( 0 . |) R A i n c r . i n (5 Exponential i " 6 2 w 1 - e 02 - 0 const. R. A 82w > 0 i n c r . R R ( 0 ,» ) .. > 0 Logarithmic i Log(6 3 + w) 1 < 0 w < 0 i f 0 < 0 ( -w , » ) B 3+w deer. R. A 8 3 + w .- i f B - 0 > 0 i f 8 > 0 < 0 Spe c i a l Exponential +8. /w e 4 ft. 2w+64 2 w < 0 deer. R, A 2w + S 4 w < 0 deer. Rjj (max(0,2w) fa>) . > 0 Power (w-w ) < 0 deer. R w-w (w > w QJ < 0 deer. i f wQ> 0 (0 , 1.0 ) > 0 Arctan arctan (w+g^) 1 + (w+B?) 74 Extensions of the U t i l i t y F u n c t i o n s Of the seven c l a s s e s of u t i l i t y f u n c t i o n s c o n s i d e r e d here, o n l y the e x p o n e n t i a l i s well-behaved over a l l p o s s i b l e values of w. For the o t h e r s , e i t h e r (i) the domain i s r e s t r i c t e d (e.g. l o g has domain w>0, s p e c i a l e x p o n e n t i a l has domain w^O, and the power and negative power have domain W > W Q ) , or ( i i ) m a rginal u t i l i t y i s d e c r e a s i n g (e.g. q u a d r a t i c f o r w_> ( 2 3 ) _ 1 ) , or ( i i i ) the f u n c t i o n may be convex (e.g. a r c t a n f o r w<0, s p e c i a l e x p o n e n t i a l f o r (approx.) w<0). The u s u a l technique f o r extending the domain u t i l i z e s a l i n e a r segment near the boundary of the domain, see Ziemba, Parkan and B r o o k s - H i l l [121],For a l l but the a r c t a n , t h i s l e a d s t o a ( t y p i c a l l y ) l a r g e jump d i s c o n t i n u i t y i n the f u n c t i o n . This, u s u a l l y leads to i n s t a b i l i t y i n the o p t i m a l p o r t f o l i o composition (and to d i f f i c u l t i e s i n equating the risk aversion see (2.5)). T h i s o b s t a c l e can be circumvented by appending an e x p o n e n t i a l segment. For l o g , s p e c i a l e x p o n e n t i a l , power, and negative power, we append the segment from W q = .01,.25, * .8 and .6 r e s p e c t i v e l y . T n u s we o b t a i n the f o l l o w i n g extended u t i l i t y f u n c t i o n s , d e f i n e d by * These points are chosen to be close to the singularity. 75 u (w) = u (w) w>w u(w ) + K ( l - e " r ( w W o } ) w<w , o — o w i t h r = -u" (w )/u' (w ) and o o K = u' (w Q)/r . Using t h i s extension we have t h a t u, u', and u" (thus R, as w e l l ) are continuous at w . A o 76 Data The following data were obtained from the University of Chicage Center f o r Research i n S e c u r i t y P r i c e s tape (CRSP) and c o n s i s t e d of monthly observations of p r i c e r e l a t i v e s * from January 1965 through December 1969; t h i s time span was chosen as a recent example of a r e l a t i v e l y s t a b l e p e r i o d . I n i t i a l l y 60 NYSE s e c u r i t i e s were s e l e c t e d and from t h i s s e t , 10 were chosen i n an attempt to provide d i v e r s i f i c a t i o n , smooth t r a n s i t i o n s i n means and v a r i a n c e s , and to avoid obvious dominations. From the 60 observations f o r each s e c u r i t y , we estimate the mean vector ( I ) , the variance-covariance matrix ( I ), and the c o r r e l a t i o n matrix (R). To generate the f i g u r e s f o r q u a r t e r l y and y e a r l y data, we f i r s t compute the corresponding compoufid (1+) rat e s of r e t u r n , then the means and variances are estimated as usual. We determine the r e s t of £ v i a the formula k k k G. . = P..°. a. k = q u a r t e r l y , y e a r l y . k k Here a^ f a_. are estimated from the compound ra t e s of r e t u r n * The p r i c e r e l a t i v e f o r p e r i o d t i s defined by R = P t +]_/P t where p i s the p r i c e at the s t a r t of p e r i o d t , and P t + 1 i s the p r i c e at the end of p e r i o d t , adjusted to r e f l e c t r e i n v e s t -ment of div i d e n d s , stock s p l i t s , e t c . 77 and cr i s determined from the monthly data. This device was used because the f i v e year period did not generate s u f f i c i e n t observations of quarterly and yearly rates to y i e l d reasonable estimates using the standard estimators. 78 Table 2.2 Security Means and Variances Monthly Quarterly Yearly Security Mean Variance Mean Variance Mean Variance 1. Cunningham Drug 1. 0194 .0105 1. 0559 .0290 1. 2852 . 3276 2. National Cash R. 1. 0173 • .0050 1. 0538 .0190 1. 2549 .2152 3. M. G.M. 1. 0162 .0148 1. 0463 .0386 1. 1819 .1711 4. G i l l e t t e Co. 1. 0137 .0044 1. 0408 .0126 1. 1694 .0666 5. H.F.C. 1. 0128 .0052 1. 0396 .0190 1. 1484 .0517 6. H.J.Heinz Co. 1. 0112 .0049 1. 0 35 3 .0190 1. 1608 . 1413 7. Anaconda Co. 1. 0111 .0081 1. 0357 .0286 1. 1711 . 1865 8. Kaiser Al.&C. 1. 0095 .0073 1. 0278 .0215 1. 0861 .0324 9 . Maytag Co. 1. 0084 .0048 1. 0243 .0128 1. 1292 .1676 10 . Firestone Tire 1. 0064 .0036 1. 0180 .0089 1. 0626 .0155 Table 2.3 Security Correlation Matrix Security Number 1 2 3 4 5 6 7 8 9 10 1 1.00 .2843 . 1313 .3049 .1739 . 3457 .4556 . 3033 .4229 .2822 2 1.00 .1333 .3859-.0101 .2334 .1800 . 4155 .2680 . 1332 3 1.00 .2882 .1251 . 3243 .2974 .2010 . 1762 . 1272 4 1.00 .2063 .2750 .4110 .3557 .2088 .2616 5 1.00 .2847 .2757 .2670 .2528 . 3167 6 1.00 . 3031 .1259 . 3178 .2501 7 1.00 .4546 . 3445 . 3020 8 1.00 . 3315 . 1968 9 1.00 . 189 8 10 1.00 79 Table 2.4 Monthly, Qu a r t e r l y and Yea r l y Covariances Monthly 1 2 3 4 5 6 7 8 9 10 1 .0105 .0021 .0016 .0021 .0013 .0025 .0042 .0026 .0030 .0017 2 .0050 .0012 .0018-.0001 .0012 .0012 .0025 .0013 .0006 3 .0148 .0023 .0011 .0028 .0033 .0021 .0015 .0009 4 .0044 .0010 .0013 .0025 .0020 .0010 .0010 5 .0051 .0014 .0018 .0016 .0013 .0014 6 .0049 .0019 .0008 .0015 .0011 7 .0081 .0035 .0021 .0016 8 .0073 .0020 .0010 9 .0048 .0008 10 .0036 Quarterly 1 2 3 4 5 6 7 8 9 10 1 .0290 .0067 .0044 .0058 .0040 .0081 .0131 .0076 .0081 .0045 2 .0190 .0036 .0060-.0002 .0044 .0042 .0084 .0042 .0017 3 .0386 .0064 .0034 .0088 .0099 .0058 .0039 .0024 4 .0126 .0032 .0043 .0078 .0059 .0027 ,0028 5 .0190 .0054 .0064 .0054 .0039 .0041 6 .0191 .0071 .0025 .0050 .0033 7 .0286 .0113 .0066 .0048 8 .0215 .0055 .0027 9 .0128 .0020 10 .0089 Year l y 1 2 3 4 5 6 7 8 9 10 1 .3276 .0755 .0311 .0451 .0226 .0746 .1126 .0313 .0091 .0201 2 .2152 .0256 .0462-.0011 .0407 .0361 .0347 .0509 .0077 3 .1711 .0318 .0118 .0504 .0531 .0150 .0298 .0066 4 .0666 .0121 .0267 .0458 .0165 .0221 .0084 5 .0517 .0243 .0271 .0109 .0235 .0090 6 .1413 .0492 .0085 .0489 .0117 7 .1865 .0354 .0609 .0162 8 .0324 .0245 .0044 9 .1676 .0097 10 .0155 80 Results For the seven c l a s s e s of u t i l i t y f u n c t i o n s given, we now use an implementation of the alg o r i t h m of chapter one to solve (2.3). The d e t a i l s of the implementation are given i n the appendix. Table 2.5 gives the u t i l i t y f u n c t i o n s and the parameter ranges used. Diagrams 2.1-2.21 p l o t optimal p o r t f o l i o prop-o r t i o n s versus parameter values f o r the seven u t i l i t y func-t i o n s using monthly, q u a r t e r l y and y e a r l y data. The para-meters chosen f o r each u t i l i t y f u n c t i o n are a subset (of 21 e q u a l l y spaced values) of the t h e o r e t i c a l parameter range. The numbers on each of the diagrams i n d i c a t e the s e c u r i t y number. Table 2.5 U t i l i t y Functions and Parameter Ranges U t i l i t y f u n c t i o n u (w) Absolute r i s k a version T h e o r e t i c a l Parameter T h e o r e t i c a l Corresponding parameter range range used R_ range with R range w i t h w=l A given parameters n 2 2Bn quadratic w-gw 1 1 (1-2BJW) (0,w/2) (0,.5) exponential 1-e ^ 2 W 6 . (0,~) -1 l o g a r i t h m i c log(B 3+w) (B3+w) (-w,°°) s p e c i a l _ e P 4 / w (2w+BJw - 2 ((2w) +,~) e x p o n e n t i a l 4 power (w-.75) 35 (l-B 5)/(w-.75) (0,1) negative _{w_h)-$6 (i+g )/( w-^) ( 0 f») power o a r c t a n tan 1(w+p ) 2(w+B?) l+(w+B ?) (-w,«>) (0,10) (-.8,1.2) (.1,4.1) (.05,1) (.1,2.1) (.1,4.1) (0,oo (0,oo (0,oo (2,oo (0,4 (2,°o (0,1 (0,oo) (0,10) (.45,5) (2.1, 6.1) (0,3.8) (2.2,6.2) (.38,1) 00 8 2 2 - l . QUADRATIC rnmir a o a. 10 0.0 o.os 0.1 0.2 0.25 term 0.35 0.43 —1 0.5 2 . 2 . E X P O N E N T I A L MONTHLY trt o 84 2 . 5 . P 0 W E R MONTHLY 3J 2 . 6 . N E G . POWER MONTHLY in B 88 2 . 1 3 . N E G . POWER Q U A R T E R L Y 89 2 - 1 5 . Q U R D R R T I C Y E R R L T 9 1 2 . 1 9 . NEC POWER YEARLY 2 . 2 0 . SPEC EXP YEARLY 92 93 We b r i e f l y (and i n f o r m a l l y ) summarize the r e s u l t s f o r each of the seven u t i l i t y f u n c t i o n s . 1. At 6-^ =0 u^ i s r i s k n e u t r a l and thus y i e l d s the h ighest r e t u r n p o r t f o l i o . As 3 ^ approaches .5 (and R A °°) / the p o r t f o l i o s chosen approach the minimum variance p o r t f o l i o . This i s c o n s i s t e n t w i t h the theory that i n c r e a s i n g i m p l i e s i n c r e a s i n g l y r i s k averse behavior. 2. The p o r t f o l i o s generated by u 2 are s i m i l a r to those generated by u^. As 3 2 i n c r e a s e s , the p r o p o r t i o n h e l d i n s e c u r i t y one decreases and the holdings i n s e c u r i t i e s 2-5 i n c r e a s e . The l a r g e s t changes occur f o r 3 2 e ( 0 , 4 ) . 3. As 3^ ranges from -.6 to 1.2, there i s r e l a t i v e l y l i t t l e change i n the p o r t f o l i o s . S e c u r i t y two always represents about one h a l f of the t o t a l investment. 4. Here there i s a l s o l i t t l e change i n the optimal s o l u t i o n s . This suggests t h a t t h i s f u n c t i o n cannot represent g r e a t l y v a r y i n g r i s k a t t i t u d e s . 5. U,- i n i t i a l l y y i e l d s low variance p o r t f o l i o s and as 3 ^ approaches one, the r i s k aversion decreases and the corresponding optimal p o r t f o l i o s have higher r e t u r n s . 94 6. Observe the very close s i m i l a r i t y between the graphs generated by u^ and Ug. 7. u 7 over the parameter range always has low r i s k aver-sion and consequently the p o r t f o l i o s generated have high returns. Diagram 2.22 demonstates how the variance of the optimal p o r t f o l i o decreases with R for monthly, quarterly and yearly data. For example at RA=0, the optimal p o r t f o l i o s are the maximum variance p o r t f o l i o s (because the highest return security happens to have the highest variance). As R approaches 10, the p o r t f o l i o s have only about 20% of the maximum variance. The curves are obtained by computing the variance of each of the p o r t f o l i o s generated by the expo-nen t i a l u t i l i t y function. The diagram also shows that the greatest chages i n p o r t f o l i o variance occurs between R values of 0 and 4. 96 From diagrams 1-21 we are led to conjecture that u t i l i t y functions with 'similar' values generate 'similar' port-f o l i o s . In thi s section we attempt to formalize t h i s state-ment and then to investigate i t s v a l i d i t y . For the following, l e t R^(8,w) represent the R function A A for u t i l i t y function u^, parameter 3 and i n i t i a l wealth w. Some possible ways to equate the R values of u t i l i t y functions u. and u. are the following: a. R^(B.,1) = R ^ ( 3.,l) : choose 3• and 6 - to equate R A 1 A 1 ^ A at wealth equal to one; b. R^(3-/W) = R^(3-/W) : choose 3 - and 3 . to equate R at expected f i n a l wealth; (2.4)c. /™R^(e i fw) f 1(w)dw = /~R A ( 3 j,w) f 1 5 (w)dw where f 1 (w) (f-*(w)) i s the density of a univariate normal with mean ^'x 1* {g'x3*) and variance x 1*'! x 1* (x^ ' I x 1 * ) with x 1* ( x 3 * ) the solution of (2.3) with u t i l i t y function u. (u.) : choose B- and B-1 j I D to equate expected R^ . The f i r s t two measures are less adequate than measure c. because they measure r i s k aversion at a single wealth l e v e l rather than over a range of possible wealth. (This state-ment has also been v e r i f i e d by performing calculations s i m i l a r to those i n the following tables 6-10.) 97 The use of measure c. i s s i m p l i f i e d by choosing the expo-nen t i a l u t i l i t y function to induce the parameters of another u t i l i t y function (since R^B-^w) = B-^  . Here (2.4) becomes 00 (2.5) 8 . = / R^(B.,W)f x(w)dw . Z -oo A 1 Equation (2.5) was solved (to an accuracy of .000001) using an adaptation of a hybrid bisection-secant method, originated by Wijngaarden, Dekker et a l and coded i n Forsythe, Malcolm and Moler [34]. I t uses a b i s e c t i o n method combined with quadratic and linear i n t e r p o l a t i o n . To compare p o r t f o l i o s there e x i s t also a number of poss-i b l e alternatives. For example, one may use any of the norms on Rn. However, using the p o r t f o l i o weights as a basis for comparison i s unsatisfactory because two very d i s t i n c t weightings may y i e l d i d e n t i c a l objective function values. The measure used here i s the percentage cash equivalent difference as defined i n Dexter, Yu and Ziemba [23]. This i s defined by u 1(z(x*)) - u 1(z(x)) , n n (2.6) , x ±uu , u" X(z(x*)) where z(x) i s the expected u t i l i t y with solution x. (Note that u ^"(z(x)) i s the cash equivalent of the random return Vx.) 98 In the f o l l o w i n g we use the e x p o n e n t i a l u t i l i t y f u n c t i i n the computation of (2.5). Noting from Freund [35] t h a t z(x) = E^{1 - exp(-B5'x)} = 1 - exp(-BC'x + h$zoz) we can compute the expected u t i l i t y from the given means and variances. A l s o , 6 can be chosen to correspond t o any R value. The f o l l o w i n g t a b l e s compare the d i f f e r e n t u t i l i t y f u nctions on each of the three sets of data f o r R = .5, 1, 2, 4 and 10. u t i l i t y f u n c t i o n T a b l e 2.6 P o r t f o l i o c o m p o s i t i o n f o r v a r i o u s u t i l i t y f u n c t i o n s when R =.5 and w =1 A o e x p o n e n t i a l q u a d r a t i c l o g power a r c t a n e x p o n e n t i a l q u a d r a t i c l o g power a r c t a n e x p o n e n t i a l q u a d r a t i c l o g power a r c t a n parameter 1 2 3 4 s e c u r i t y 7 8 9 10 .5 .618500 .381500 .165483 .618500 . 381500 .984259 .621151 .378849 879183 2.714659 611083 388917 626174 373826 .5 . 475958 .524042 163271 .952776 .475958 .524042 . 477149 .522851 882988 12.680433 .464452 .535548 .480267 .519733 .511442 .488558 .148687 .518551 .481449 . 821859 .502147 .497853 .907691 2.298559 . 480496 .509913 .009590 .509709 . 490291 imean v a r i a n c e jexpected u t i l i t y % e r r o r 1.018596 .005704 . 398654 .018596 .005704 . 398654 0 1.018602 .005725| . 39865. 0 1.018581 .005644 . 398654 0 1.018612 .005767 . 39 8654 0 1.054812 .015119 .408750 1.054812 .015119 . 408750| 0 1.054815 .015129 .408751 0 1.054789 .015028 .408750 0 1.054821 .015156 .408750 0 1.270399 .174774 .458468 1.270614 .175657 .458466 .0006 1.270117 .173680 .458465 .0008 1.268761 .169126 .458407 .0185 1.270346 . 174565! .458468 0 Remarks: 1. 2 . 3. 4. Rfl=.5 i s u n a t t a i n a b l e f o r Ug and u^, hence t h e y are e x c l u d e d . A l l f i g u r e s a r e a c c u r a t e t o s i x d e c i m a l p l a c e s . B l a n k s and z e r o e s i n d i c a t e f i g u r e s l e s s t h a n 10 E x p e c t e d u t i l i t y f i g u r e s a c r o s s p e r i o d s and a c r o s s R A v a l u e s such as i n the f o l l o w i n g t a b l e s , are n o t comparable. F o r c o m p a r i s o n , the h i g h e s t r e t u r n p o r t f o l i o had means and v a r i a n c e s o f (1.016216,.014840), (1.046334,.038602) and (1.285196,.327557) and the e q u a l i n v e s t m e n t p o r t f o l i o had means and v a r i a n c e s o f (1.012607 ,.002232), (1.037764,.007527) and (1.164958,.042719) f o r monthly, q u a r t e r l y and y e a r l y d a t a , r e s p e c t i v e l y . -2i U t i l i t y F u n c t i o n P a r a m e t e r S e c u r i t y 1 2 3 4 5 6 7 8 9 10 M e a n V a r i a n c e E x p e c t e d U t i l i t y Z E r r o r M O N T H L Y D A T A Table 2.7 P o r t f o l i o C o m p o s i t i o n f o r V a r i o u s U t i l i t y P u n c t i o n s W h e n R^ Q U A R T E R L Y D A T A i arwluV 3! E x p o n e n t i a l Q u a d r a t i c L o g P o w e r 1.0 .247237 -.013835 .750988 .414406 .418291 .418190 .410246 .506770 .508365 .508198 .507673 .078824 .073344 .073612 .082082 1.018094 1.018108 1.018108 1.018082 .004241 .004269 .004268 .004218 .637950 .637950 .637950 .637950 - 0 0 0 E x p o n e n t i a l Q u a d r a t i c L o g P o w e r 1.0 .241759 -.040319 .754962 .396072 .397829 .397514 .403243 .550361 .551144 .546582 .596757 .053567 .051026 .055903 1.054246 1.054269 1.054232 1.054663 .013727 .013773 .013699 .014798 .649145 .649145 .649145 .649121 - 0 0 .0065 Y E A R L Y D A T A E x p o n e n t i a l Q u a d r a t i c L o g P o w e r 1.0 .210483 -.112980 .765732 .342082 .381592 .287004 .245520 .421457 .454136 .373936 .339830 .146545 .146148 .163999 .141238 .006089 .089916 .018123 .175061 .267322 1.244992 1.253864 1.232981 1.223041 .110314 .129135 .088336 .073005 .695729 .695565 .695417 .694724 - .0454 .0861 .2772 H O O U t i l i t y F u n c t i o n P a r a m e t e r S e c u r i t y Mean V a r i a n c e E x p e c t e d U t l l i c y I E r r o r H O N T H L Y D A T A T a b l e 2.8 P o r t f o l i o C o m p o s i t i o n f o r V a r i o u s U t i l i t y F u n c t i o n s W h e n R A - 2 Q U A R T E R L Y D A T A E x p o n e n t i a l Q u a d r a t i c L o g S p e c i a l E x p o n e n t i a l P o w e r N e g a t i v e P o w e r 2.0 .328093 -.510706 .283320 .492853 .021701 .282218 .285479 .279667 .283320 .283888 .298170 .512462 .513005 .510997 .513247 .50178 7 .510228 .103576 .103887 .103271 .108219 .103719 .101422 .101745 .097629 .095213 .110606 .090181 1.017332 1.017357 1.017307 1.017359 1.017295 1.017420 .003268 .003294 .003244 .003296 .003232 .003360 .868418 .868418 .868418 .868418 .868418 .868417 - 0 0 0 0 0 E x p o n e n t i a l Q u a d r a t i c L o g S p e c i a l E x p o n e n t i a l P o w e r N e g a t i v e P o w e r 2.0 .317898 -.530359 .086256 .598896 .063542 .279549 .286024 .273713 .271465 .281605 .278865 .472942 .477334 .466487 .466941 .472449 .457115 .104567 .103563 .105836 .118487 .093868 .122138 .142942 .133079 .153964 .143108 .152078 .141881 1.051596 1.051757 1.051418 1.051472 1.051551 1.051478 .010109 .010274 .009934 .009993 .010070 .010005 .875441 .875440 .875440 .875439 .875390 .875437 - .0004 .0004 .0008 .0197 .0016 Table 2. 8(continued) YEARLY D A T A U t i l i t y F u n c t i o n E x p o n e n t i a l Q u a d r a t i c L o g S p e c i a l E x p o n e n t i a l P o w e r N e g a t i v e Power P a r a m e t e r 2.0 .275157 -.587446 .350128 .413450 .213501 1 .174457 .208418 .116236 .149547 .081275 .120428 2 .262884 .300979 .202592 .240565 .167586 .205139 3 .126734 .133008 .115598 .129158 .099687 .113806 4 .090320 .048109 .159015 .116267 .184367 .157239 S e c u r i t y 5 .341605 .309487 .406559 '.364464 .417025 .403389 6 7 8 9 1 0 .050061 M e a n 1.206489 1.214433 ' 1.193087 1.201245 1.180277 1.193834 .053081 .061813 .041833 .048167 .035125 .042347 V a r i a n c e E x v e c t e d .900422 .900265 .899992 .900356 .898764 .900039 U t i l i t y - .0684 .1868 .0286 .7158 .1665 Z E r r o r o T a b l e 2 . 9 P o r t f o l i o c o m p o s i t i o n f o r v a r i o u s u t i l i t y f u n c t i o n s w h e n R f l = 4 a n d w = 1 f u n c t i o n e x p o n e n t i a l q u a d r a t i c ^ i . ^ i v B e x p o n e n t i a l , q u a d r a t i c * ? ^ . ^ * » e x p o n e n t i a l q u a d r a t i c ^ P e c i a l " ^ a t i v e p a r a m e t e r 1 2 3 4 s e c u r i t y ^ 7 8 9 1 0 m e a n v a r i a n c e e x p e c t e d u t i l i t y % e r r o r 4 . 0 . 1 7 9 6 8 4 . 4 3 8 3 6 3 . 0 8 6 3 0 1 . 0 5 1 7 0 8 . 2 4 3 9 4 4 . 0 1 6 3 0 3 . 0 0 2 5 0 9 e x p o n e n . p o w e r . 3 9 1 5 9 1 2 D 6 1 4 8 3 1 . 0 4 5 3 8 9 . 1 7 7 5 6 6 . 4 4 7 7 5 5 . 0 8 8 7 8 3 . 0 4 5 0 6 1 . 2 4 0 8 3 5 . 1 6 9 2 7 5 . 1 7 7 5 3 1 . 4 4 0 8 7 8 . 4 3 6 8 6 6 . 0 8 9 4 3 3 . 0 8 5 5 5 0 . 0 5 4 5 0 9 . 0 5 4 0 7 6 . 2 4 5 9 7 8 . 2 4 5 9 7 8 1 . 0 1 6 3 3 5 1 . 0 1 6 2 6 0 1 . 0 1 6 2 8 2 . 0 0 2 5 2 5 . 0 0 2 4 8 7 . 0 0 2 4 9 8 4 . 0 . 1 8 7 2 6 0 . 3 7 4 6 6 7 . 0 9 8 3 4 0 . 1 0 7 4 4 7 . 2 3 2 2 8 6 1 . 0 4 8 7 8 1 . 0 0 8 0 9 7 . 9 8 2 4 9 3 . 9 8 2 4 9 3 . 9 8 2 4 9 3 . 9 8 2 4 9 3 . 9 8 3 9 2 5 0 0 0 e x p o n e n . p o w e r . 3 7 3 2 0 6 2 . 2 3 5 2 7 6 1 . 1 3 2 5 3 2 . 2 0 2 7 6 7 . 3 9 6 4 5 5 . 1 0 1 7 2 8 . 0 7 7 1 6 4 . 2 2 1 8 8 6 1 . 0 4 9 3 3 0 . 0 0 8 3 7 9 . 9 8 2 9 2 2 . 0 0 4 6 . 1 7 8 8 8 8 . 3 6 8 6 9 0 . 0 9 7 6 2 7 . 1 1 8 6 5 8 . 2 3 6 1 3 6 . 0 4 8 5 6 8 . 0 0 7 9 7 5 . 1 8 1 7 5 8 . 3 6 3 5 1 6 . 0 9 6 6 2 8 . 1 2 1 9 8 4 . 2 3 6 1 1 3 . 0 4 8 5 3 8 . 0 0 7 9 6 1 . 9 8 3 9 2 5 . 9 8 3 9 2 5 0 0 4 . 0 . 0 8 8 2 3 9 . 1 6 9 4 5 5 . 1 0 6 8 9 4 . 1 9 4 0 2 6 . 4 4 1 3 8 5 1 . 1 8 6 1 7 0 . 0 3 7 7 4 3 . 9 8 8 2 3 6 e x p o n e n . p o w e r . 3 5 1 4 4 7 2 . 8 8 4 4 0 0 1 . 4 4 3 3 5 7 . 0 8 2 9 9 1 . 0 2 1 2 2 4 . 1 6 5 9 8 2 . 1 8 5 2 7 4 . 1 0 6 6 6 3 . 1 9 8 8 3 0 . 4 4 5 5 3 3 1 0 4 0 6 4 1 4 8 5 2 2 4 4 1 1 8 2 . 1 9 9 7 3 3 . 0 4 7 6 2 1 . 1 1 2 7 9 4 . 0 7 9 6 0 0 . 1 5 4 4 7 4 . 3 2 8 9 5 8 . 2 5 8 2 32 1 . 1 8 5 1 7 5 1 . . 0 3 7 4 2 7 . 1 5 8 3 9 7 1 . 1 4 9 5 2 7 0 2 7 8 0 2 . 0 2 3 7 5 6 . 9 8 8 2 3 6 . 9 8 7 8 5 9 . 9 8 7 8 2 1 0 . 7 0 9 9 . 7 8 2 7 M O N T H L Y D A T A T a b l e 2 • 10 P o r t f o l i o C o m p o s i t i o n f o r V a r i o u s U t i l i t y F u n c t i o n s W h e n R A - 10 (\Y)tji Ul^-" I U t i l i t y F u n c t i o n E x p o n e n t i a l Q u a d r a t i c S p e c i a l E x p o n e n t i a l N e g a t i v e P o w e r P a r a m e t e r 10.0 .440442 8.156772 4.109086 1 .118851 .095302 .079358 .080020 2 .342845 .364688 .336837 .333334 3 .062053 .068848 .053716 .053053 4 .120721 .129129 .129992 .130616 S e c u r i t y 5 .264604 .278670 .259364 .260044 6 .044528 .043953 .062782 .062214 7 8 9 .008197 .009482 10 .646398 .019411 .069754 .071237 M e a n 1.015080 1.015229 1.014559 1.014531 V a r i a n c e .002113 .002141 .001994 .001989 E x p e c t e d U t i l i t y .999957 .999957 .999957 .999957 X U t i l i t y - 0 0 0 Q U A R T E R L Y D A T A E x p o n e n t i a l Q u a d r a t i c S p e c i a l E x p o n e n t i a l N e g a t i v e P o w e r 10.0 .425157 8.616831 4.303717 .121244 .124070 .066857 .077372 .255242 .298659 .241709 .237311 .070965 .082464 .066857 .064084 .167511 .179232 .175207 .173416 .205289 .237913 .191142 .188466 .013380 .013907 .025845 .024110 .035740 .068875 .067213 .130629 .063754 .163507 .168048 1.042470 1.045211 1.040104 1.040127 .006262 .006891 .005773 .005780 .999959 .999959 .999959 .999959 - 0 0 0 Y E A R L Y D A T A E x p o n e n t i a l Q u a d r a t i c N e g a t i v e P o w e r 10.0 .423749 4.935791 .031553 .017099 .075474 .050898 .044526 .056237 .050898 .041265 .111908 .094011 .054966 .237341 .190958 .135844 .087302 .143023 .184158 .400185 .470213 .522144 1.125199 1.108230 1.101730 .017605 .014485 .013892 .999969 .999968 .999967 - .2261 .4728 . R e m a r k : T h e S p e c i a l E x p o n e n t i a l l a n o t i n c l u d e d f o r y e a r l y d a t a b e c a u s e o f p o t e n t i a l o v e r f l o w p r o b l e m s . 105 Changing Wealth Levels For a l l but the arctangent u t i l i t y function (u^), a simple transformation of the parameter 3^ allows us to obtain results for a l l positive w ( i n i t i a l wealth level) from the o results for w =1. To i l l u s t r a t e for u n : o 1 2 u^(w) = w - B j W = (£'xwo) - B 1 ( C ' x w o ) 2 which i s equivalent to the u t i l i t y function ?'x - B 1 ( ? ' x ) 2 with 3, = 3nw 1 1 o Thus, the solution for w and 3, can be determined from the o 1 results with w = 1 and parameter B-j_-To further test the e f f e c t s of changing wealth l e v e l s , we considered the following two measures: -w u"(w )/u'(w ) o o o (relative r i s k aversion at i n i t i a l wealth) and w E(R ) =w R.. o A o A The f i r s t measure i s unsatisfactory since i t i s based only on one l e v e l of wealth. To test the second, we vary W q to y i e l d parameter values at approximately the midrange of values used i n the W q=1 computations. Given these fixed values of the parameter, the i n i t i a l wealth l e v e l was varied to obtain the ranges of w R„ as shown i n table 2.11. Also, 3. ^ o A 1 and Wq were chosen to generate comparable ranges of W 0 R A across u t i l i t y functions. The results are shown i n tables 2.11 and 2.12 and diagram 2.23. Recall that i s the base or 'exact' solution case. The measure i s quite accurate (errors less than 106 .5%) and the error is not sensitive to changes in w T a b l e 2.11 E r r o r A s s o c i a t e d w i t h A p p r o x i m a t e Measure w R. f o r C h a n g i n g I n i t i a l W e a l t h U s i n g Y e a r l y D a t a o A U t i l i t y F u n c t i o n P a r a m e t e r I n i t i a l W e a l t h Range Ranj>e o f w R, o A Range o f P e r c e n t a g e Cash E q u i v a l e n t E r r o r Average P e r c e n t a g e Cash E q u i v a l e n t E r r o r Q u a d r a t i c 0.2 (.6,1.7) (.345523,3.602835) (0,.0470) 0. 0091 S p e c i a l E x p o n e n t i a l 1.0 (-6,2.4) (2.051729,3.027538) (.0287,.0426) 0. 0360 Power 0.5 (.85,1.3) (1.001502,2.861930) (.4387,.9576) 0. 7966 N e g a t i v e Power 0.5 (.85,1.5) (1.870776,2.903902) (.2449,.4207) 0. ,3230 A r c t a n g e n t 0 (.9,2.) (.85437,1.423139) (0,.0065) 0. .0030 L o g a r i t h m i c -0.2 (.8,2.8) (.934752,1.205693) (.0669,.5296) 0. .2116 Table 2.12 E r r o r A s s o c i a t e d w i t h A p proximate Measure W Q R a f o r Changing I n i t i a l W e a l t h U s i n g Y e a r l y D ata f o r N e g a t i v e Power U t i l i t y F u n c t i o n w i t h 8 6 = 0 . 5 . t r u e P e r c e n t a g e w w R„ w w Cash E q u i v a l e n t o o A E r r o r 0.85 2.903902 1.127707 1.132348 0.4099 0.90 2.707062 1.131786 1.136568 0.4207 0.95 2.554579 1.135917 1.139949 0.3537 1.00 2.433971 1.139092 1.142703 0.3160 1.10 2.149399 1.145640 1.149555 0.3406 1.20 2.114404 1.147181 1.150439 0.2832 1.30 2.059525 1.149025 1.151846 0.2449 1.40 1.933805 1.152041 1.155178 0.2716 1.50 1.870776 1.153843 1.156923 0.2662 10 7 cash e q u i v a l e n t 19 18 17 16 .15 .14 ..13 L 2.5 3.0 w o R A Diagram 1. 1.5 2.0 2.23:Differences i n Cash E q u i v a l e n t s Using the Measure -w R w i t h Changing I n i t i a l Wealth o A 108 E r r o r s i n E s t i m a t i o n In applying any p o r t f o l i o model u t i l i z i n g d i s t r i b u t i o n a l assumptions, there e x i s t s the fundamental problem of the v a l i d -i t y of the chosen d i s t r i b u t i o n . The reasons f o r the choice of the normal d i s t r i b u t i o n f o r these s t u d i e s has been o u t l i n e d elsewhere, however, i t i s necessary to note t h a t other d i s t -r i b u t i o n s have been proposed which seem t o r e p l i c a t e the p a t t e r n of s e c u r i t y returns more c l o s e l y than the normal. The fundamental observations by Mandelbrot [ 134-135], Fama 1127] and others, i n d i c a t e t h a t a more l e p t o k u r t i c ( ' f a t t e r -t a i l e d ' ) d i s t r i b u t i o n than the normal i s r e q u i r e d . Fama, f o r one, has suggested the s t a b l e P a r e t i a n f a m i l y . This f a m i l y i s most general f a m i l y of d i s t r i b u t i o n s which i s cl o s e d on a d d i t i o n , thus (as mentioned p r e v i o u s l y ) g r e a t l y s i m p l i f y i n g the comput-a t i o n a l burden. On the other hand, Markowitz [62] has suggested that the proper d i s t r i b u t i o n ought to be skewed t o the r i g h t . The r a t i o n a l e i s t h a t i n v e s t o r s tend to be conservative when l o s i n g and l e s s r i s k averse when winning. A d i s t r i b u t i o n poss-e s s i n g t h i s property (and a l s o more t a i l observations) i s the lognormal. This has some l o g i c a l appeal i f the e f f e c t s generating r a t e s of r e t u r n are independent and m u l t i p l i c a t i v e . While a sum of lognormals i s not lognormal (thus rendering the o p t i m i z a t i o n much more complex) , Dexter, Yu and Ziemba [ 2 3] 109 and Ohlson and Ziemba [136] have e m p i r i c a l and t h e o r e t i c a l work which suggests t h a t approximating a sum of lognormals by a lognormal can be reasonable. B l a t t b e r g and Gonedes[126J have suggested the Student (t) d i s t r i b u t i o n as a model and there are the compound Poisson models of Press [138] as w e l l . Again, w h i l e perhaps s u p e r i o r interms of goodness of f i t , these d i s t r i b u t i o n s are not considered here because of the much greater computational requirements. Even r e s t r i c t i n g our a t t e n t i o n t o the normal d i s t r i b u t i o n , the problem of f o r e c a s t i n g the means and covariances r e q u i r e d remains a very serious o b s t a c l e . While e s t i m a t i n g the covar-iances from h i s t o r i c a l data may be p l a u s i b l e , t h e e f f i c i e n t markets l i t e r a t u r e i n s i s t s t h a t h i s t o r i c a l rates of r e t u r n cannot be used t o p r e d i c t f u t u r e r a t e s of r e t u r n . In gener a l , t h i s problem has no s o l u t i o n which seems adequate. 110 In t h i s s e c t i o n , we attempt to assess the impact of e r r o r s i n the es t i m a t i o n of £ and E . In the f i r s t e x p e r i -ment, we assume that 1 i s the true mean v e c t o r , but tha t a . . = a . . + ea..z ID ID ID w i t h z ~ N[ 0 ,1] , e = .05,.10,.15,.20,.25 . I.e., we assume that we have an unbiased estimate &. ., and ID the true covariance term i s the estimate plus a normally d i s t r i b u t e d e r r o r term w i t h zero mean and standard d e v i a t i o n ea^j , where e i s a parameter which allows us to vary the s i z e of the e r r o r term. The presence of 8.. i n the e r r o r ID term w i l l avoid negative variances and w i l l help to keep the s c a l i n g reasonable. In the second experiment, we assume, i n a d d i t i o n , t h a t I. = K- + e£.z I I I w i t h e and z as before. So as above, we assume the true mean i s the sum of an unbiased estimate and a normal e r r o r I l l Diagrams 2.24 - 2.25 compare the average cash equivalent errors caused by using the estimated data rather than the true data. We use exponential u t i l i t y with yearly data and g 2 = l / . . . , 6 . Diagram 2.24 assumes only covariance errors and each data point i s the average of four runs with d i f f e r e n t values of z. Diagram 2.25 i s the average of eight runs and assumes, in addition to covariance errors, that there are errors i n the mean vector. 112 percentage cash e q u i v a l e n t e r r o r .3 -.2 .1 rt .25 = e 20 .15 .10 1 3 5 Diagram 2.2 4 .Covariance E r r o r s Only percentage cash eq u i v a l e n t e r r o r Diagram 2.25: Mean and Covariance E r r o r s 113 §2.4 Summary Diagrams 1 to 22 suggest the f o l l o w i n g : H the q u a d r a t i c u t i l i t y f u n c t i o n , despite i t s l i m i t -a t i o n s ( r e s t r i c t e d domain and i n c r e a s i n g absolute r i s k aversion) y i e l d s the l a r g e s t range of R values and i s thus able to represent the widest spectrum of ' r i s k a t t i t u d e s ' ; 11 a v a r i a t i o n i n R from 5 to 10 or from 10 t o 20 produces very l i t t l e change i n p o r t f o l i o composition, most changes o c c u r r i n g i n the region from 0 to 4. (cf . diagram 2.22); H the p o r t f o l i o compositions f o r d i f f e r e n t u t i l i t y f u n c t i o n s are s i m i l a r when they have s i m i l a r average R A values (cf. diagrams 2.4 and 2.6 or 2.11 and 2.13). The l a s t p o i n t i s very important from a methodological p o i n t of view since i t suggests t h a t i t may be p o s s i b l e t o s a f e l y replace t r a c t a b l e u t i l i t y f u n c t i o n s ( l i k e the q u a d r a t i c or expon e n t i a l which have d e t e r m i n i s t i c e quivalents) f o r more ' p l a u s i b l e ' u t i l i t y f u n c t i o n s . Tables 2.6 and 2. . 1 0 attempt to v e r i f y the hypothesis t h a t i f two u t i l i t y f u n c t i o n s have s i m i l a r average r i s k aversion then they have s i m i l a r p o r t f o l i o s . This i s very s t r o n g l y supported f o r monthly and q u a r t e r l y data, and even f o r y e a r l y data, the 114 approximation i s s t i l l quite adequate (error less than 1%). Diagram 2,23 and table 2.11 (in a very l i m i t e d case) suggest that the use of average R with changing i n i t i a l wealth levels A. s t i l l supports the above hypothesis. Diagrams 2.24-2.25 demonstrate that the solutions are robust with respect to unbiased errors i n the estimates of the covar-iance terms, but that errors i n the estimation of the expected rate of return can lead to much larger cash equivalent e r r o r s . 115 Appendix: Implementation Our problem has the form max E [u(S'xw )] = max z(x) . xeX 5 xeX wi t h X = {xeR 1 ; Ex ± = 1, x > 0} , i S ~ N [ 5 , Z ] , u i s concave, u'>0, and w Q i s the i n i t i a l wealth l e v e l Before g i v i n g each of the steps of the a l g o r i t h m (numbered to correspond to f i g u r e 1.2), we note the s i m p l i f i c a t i o n s used i n computing z(x) and Vz(x). The o b j e c t i v e value can be w r i t t e n as z(x) = / u(w Qt) • ( / 2 ? a ) " x exp (-3s ( ( t - S ' x ) / a ) 2 ) d t With the change of v a r i a b l e y = (t - I'x)/(/2a) we obtain ,°° 2 z(x) = (TT) / u(( 2ay + C ' X ) W q ) exp (-y ) dy . 116 The j t h p a r t i a l of the o b j e c t i v e f u n c t i o n can be w r i t t e n as OO 00 3z/3x. = / / wJ,u'(ww o)f(L,w) dwd£. •J —oo —co 3 -* where w = £'xeR , u' i s the marginal u t i l i t y f u n c t i o n and f(£.,w) i s the density of a b i v a r i a t e normal w i t h mean 3 [ f . , f . x ] and covariance matrix I a.. x'Z 3 3 33 x'Z 3. x' Ex 3. w i t h Z . the j t h column of Z . We introduce the notation' 3- _ #5 a = (x'Zx)" and p = ( x ' Z ) / ( a a ) j • J We now have 00 oo 8 Z / 8 X J = / /w 0Lu' (wwQ) [ ( l / ( 2 w a / B 7 ) ) .exp[ ( l / ( 2 ( l - p 2 ) ) 00 oo J ((C.-^,)/a.) 2-(2p(^-I.) (w-£'x))/(a.a) + ((w-rx)/o) 2 d^jdw . Using the change of v a r i a b l e = a j t 1 / 2 ( l - p * ) + I w = ot2V2 (1 - p 2 ) + I'x , we o b t a i n the expression: CO oo (2.8) 3z/9x. 3 (w ov /(l - P 2)/ T T)/ / { { a - t ^ U - p 2 ) + £ } —00 —00 • u ' ( ( a t 2 / 2 ( l - p z ) + f'x)w 0) •exp(2pt 1t 2) e x p ( - t f - t 2 ) } d t x dt . 117 The main advantage of the r e p r e s e n t a t i o n s given i n (2.7) and (2.8) i s that we may apply the Gauss-Hermite quadrature formulas to'approximate the i n t e g r a l s . These formulas have the s t r u c t u r e °° - n / f ( x ) e x p ( - x )dx = I A.f(x.) -co i = l They are discussed i n K r y l o v [53]and extensive t a b l e s of Gaussian formulas can be found i n Stroud and Secrest [105]. Gaussian quadrature formulas have degree (2n-l) ( i . e . w i t h n e v a l u a t i o n s of f(x) they are exact f o r polynomial f of degree not greater than ( 2 n - l ) ) . This i s the h i ghest degree of p r e c i s i o n p o s s i b l e f o r any n p o i n t formula. The nodes x^ correspond t o the zeroes of a Hermite polynomial. Some of the other reasons f o r using the Gauss-Hermite formulas f o r e v a l u a t i n g (2.7) and (2.8) are t h a t i t i s unnecessary to truncate the i n t e g r a l , and t h a t these formulas give us a mechanism f o r generating the sequences ofapproximations we use. In p a r t i c u l a r , by i n c r e a s i n g n we can o b t a i n a formula which i s exact f o r a polynomial of a r b i t r a r i l y high degree, and consequently an a r b i t r a r i l y good approximation t o (the continuous functions) z (x) or 3z/3x_.. For c a l c u l a t i n g the l a t t e r , we use a product form of the formulas; see Stroud [104]. E s s e n t i a l l y t h i s i n v o l v e s using a Gauss-Hermite formula 118 i n both the w and the dimensions, to generate a g r i d of p o i n t s on which the integrand i s evaluated. Thus we can generate the sequences e i n a number of ways. F i r s t , i f using a quad-rat u r e scheme which allows f o r e r r o r t o l e r a n c e s ( as most commercially a v a i l a b l e algorithms do) we s p e c i f y tolerance e at i t e r a t i o n k. A l t e r n a t i v e l y , i f using a Gaussian formula we increase n (the number of integrand evaluations) at each i t e r -a t i o n t o i m p l i c i t l y determine the sequence of tol e r a n c e s ( i . e . we i m p l i c i t l y define the f u n c t i o n <j> described i n chapter one) . For example, i f we begin w i t h n=6, e would be the tolerance from 6+k integrand e v a l u a t i o n s . The code was w r i t t e n i n FORTRAN-IV using double p r e c i s i o n a r i t h m e t i c and was implemented on an IBM 370/16 8. S o l u t i o n times were between two and f i v e seconds. The accuracy of the code was v e r i f i e d by comparing the s t o c h a s t i c a l g o r i t h m (using numerical quadrature) w i t h the d e t e r m i n i s t i c form of FW which may be a p p l i e d to the e x p o n e n t i a l and q u a d r a t i c u t i l i t y f u n c t i o n s . For example, i n the former case, the stoch-a s t i c problem i s e q u i v a l e n t t o the d e t e r m i n i s t i c problem of maximizing I'x - ^ B - j X'Zx ; see Freund [37], 119 We now i l l u s t r a t e one of the implementations of the algorithm. Step 0: Set e°=.001 , £=.000001 and a=.5 o x. = J 1 i f j = i 0 i f DVi, where i s a t i s f i e s £. = max . i t" l<t<10 I.e. the i n i t i a l p o r t f o l i o i s the highest r e t u r n p o r t f o l i o . Set k=0. Step 1: Approximately evaluate (2.8) using t o l e r a n c e e to o b t a i n u the approximate gradient. Step 2: The simple s t r u c t u r e of the c o n s t r a i n t set allows us to express the optimal y as k 1 i f i = j 0 i f i ^ j , k k where u. = max u . 1 Kt<10 I f u k ' ( y k - x k) < -||y - x k|| + e k k+1 k then set e =% e and k = k+1, go to step 1, otherwise go t o step 3. 120 Step 3: The l i n e a r search problem was solved using an adapt-a t i o n of the (cubic i n t e r p o l a t i o n ) a l g o r i t h m of Fox, Lasdon, Tamir and Ratner [35]. This method was used because no gradient c a l c u l a t i o n s are r e q u i r e d . We solve the l i n e a r search problem to an accuracy of -10 k 10 and we use a tolerance of e i n the e v a l u a t i o n of (2.7). Set e*+^~ = he* k = k + 1 . Go to step 1. 121 §3 Frank-Wolfe Algorithms Using Approximate  D i r e c t i o n a l D e r i v a t i v e s In t h i s chapter we begin by d e s c r i b i n g Hogan's [43] ve r s i o n of FW. This adaptation i s intended t o be a p p l i e d t o o p t i m i z a t i o n of extremal-value f u n c t i o n s and t o other s i t u a t i o n s where d e r i v a t i v e s are u n a v a i l a b l e , but d i r e c t -i o n a l d e r i v a t i v e s e x i s t . In §3.2 we modify Hogan's approach by a l l o w i n g the use of approximate p a r t i a l d e r i v a t i v e s i n the c h a r a c t e r i z a t i o n of the d i r e c t i o n a l d e r i v a t i v e s f o r a completely convex extremal-value problem. The appendix contains the r e l e v a n t t o p o l o g i c a l concepts. 122 §3.1 A Frank-Wolfe Algorithm Using D i r e c t i o n a l Derivatives_ We begin by not i n g a few d e f i n i t i o n s ; Stoer and W i t z g a l l [10 3] and r e l a t e d t e x t s c o n tain f u r t h e r d e t a i l . A f u n c t i o n f:R n ->- R* i s d i r e c t i o n a l l y d i f f e r e n t i a b l e at the p o i n t xeR n i n the d i r e c t i o n deR n , i f | f (x) | < °° and f'(x;d) = l i m [ f ( x + Ad) - f ( x ) ] / X e x i s t s . A+0 + We c a l l t h i s l i m i t the d i r e c t i o n a l d e r i v a t i v e (dd) at x i n the d i r e c t i o n d . Two important p r o p e r t i e s are: 11 a l l concave fu n c t i o n s are d i r e c t i o n a l l y d i f f e r -e n t i a b l e wherever they are f i n i t e , and 11 i f f i s C 1, then f'(x;d) = Vf ( x ) ' d . The l a t t e r property i s e x p l o i t e d i n many f e a s i b l e d i r e c t i o n algorithms; i n f a c t many such algorithms r e q u i r e f t o be C^ " when indeed they merely use t h i s property to compute the dd. The former property i s c r u c i a l to the development i n Hogan [43] which we now d e t a i l . 12 3 Hogan's discussion centers on extremal-value functions. For example, consider the problem max f (x,y) xeX,yeY Suppose that this problem i s r e l a t i v e l y easy to solve i n x for fixed y. We can project (or p a r t i t i o n ) the problem on the y variable by considering the equivalent problem max[ v(y) = sup f(x,y) ] yeY xeX of maximizing the extremal-value function v over the variable y. An example of t h i s technique i s Benders [10] decomposition. For t h i s example, xeR n but y i s constrained to integer values. Usually the extremal-value function i s the solution of a l i n e a r programming problem, and, by dualizing, optimizing v(y) i s often much easier than ta c k l i n g the problem j o i n t l y i n x and y. For other examples of p a r t i t i o n i n g techniques see Lasdon [54]. Another natural projection i s when we have a two period prob lem; we have y equal to the f i r s t period decision and x equal to the second period decision ; see Geoffrion [128]. In general, v w i l l not be d i f f e r e n t i a b l e , but i t w i l l be d i r e c t i o n a l l y d i f f e r e n t i a b l e i f i t i s concave, which w i l l be the case i n a number of important problems. 124 Hogan derives the expression f o r the dd i n a number of cases. We w i l l focus on the one to be used i n our development, the completely convex case. Let us assume the o r i g i n a l problem may be w r i t t e n as (3.1) max f(x,y) s . t . g(x,y) £ 0 xeX,yeY h(x,y) = 0. Here f:R p x R n R* , (the n o n l i n e a r c o n s t r a i n t s ) g:Rp x R n -> R*m, (the l i n e a r c o n s t r a i n t s ) h:R p x R n ->- R*q, Y i s a convex subset of R n and X i s a convex subset of RP. Assume f and -g are concave ( j o i n t l y i n (x,y)) . The p r o j e c t e d problem can be w r i t t e n (3.2) max [ v(y) = sup f (x,y) s . t . g(x,y) <_ 0 yeY xeX h(x,y) = 0 ] . Hogan u t i l i z e s the psm approach of Z a n g w i l l (see the appendix t o chapter one), so the d e f i n i t i o n of a number of psms i s r e q u i r e d . Let H define the set of f e a s i b l e d i r e c t i o n s f o r the in n e r problem H (d) = {weRP: g(x+Aw ,y+Ad)<0 , h (x+Aw,y+Ad) =0, (x,y) and x+AweX, VAe(0,A(x,y,d,w)) } ; l e t M define the set of optimal s o l u t i o n s f o r the inner problem M(y) = {xeX; g(x,y)<0, h(x,y)=0, and v (y) <f (x,y) } . * This n o t a t i o n i n d i c a t e s the dependence of A on (x,y,d,w). 125 Hogan proves Theorem 3.1: Suppose t h a t X i s a nonempty convex s e t , f and -g are concave on X x (N^ A i s a neighbourhood of y * ) , h i s l i n e a r , v(y*) i s f i n i t e and x*eM(y*). Then V (y*;d) = sup f'(x*,y*;w,d) weH (d) This theorem i s u s e f u l i f we can c h a r a c t e r i z e H and f' i n a simple manner. One approach i s to r e q u i r e t h a t a c o n s t r a i n t q u a l i f i c a t i o n (see e.g. Z a n g w i l l [ 1 1 7 ] ) hold. For example, S l a t e r ' s C o n d i t i o n ( S l a t e r [99 ]) i s e s s e n t i a l l y t h a t the f e a s i b l e set have an i n t e r i o r ( i . e . i t would s a t i s f y each i n e q u a l i t y s t r i c t l y i f the feasible set i s determined by inequality relations). I f a c o n s t r a i n t q u a l i f i c a t i o n holds, then the c l o s u r e of the set of f e a s i b l e d i r e c t i o n s at a p o i n t equals the l o c a l l i n e a r -i z a t i o n of the b i n d i n g c o n s t r a i n t s . Formally, i f we define the l i n e a r i z a t i o n by the psm L (x,y) VJi.(x,y)'w=-V h.(x,y)»d Vj } 3 y 3 (here V g(x,y) denotes the gradient w i t h respect t o x ) , then a c o n s t r a i n t q u a l i f i c a t i o n w i l l guarantee t h a t t c l ( X ) i s the c l o s u r e of the set X. 126 We may now c h a r a c t e r i z e the d i r e c t i o n a l d e r i v a t i v e as a s o l u t i o n t o a l i n e a r program. C o r o l l a r y 3.1; Suppose we have no l i n e a r c o n s t r a i n t s h, and t h a t the hypotheses of Theorem 3,1 h o l d . F u r t h e r , assume f and g are d i f f e r e n t i a b l e on X x N and y* C l ( H ( x * , y * ) ( d ) ) = L ( x * , y * ) ( d ) ' then ) v' (v*;d) = \ y (3.3 y V„f(x*,y*)'d + sup V xf(x*,y*)'w w subject t o V xg^ (x* ,y*) 'w <^  -V^g^(x*,y*)'d Vk such t h a t g v(x*,y*) = 0 I f we have l i n e a r c o n s t r a i n t s we would modify (3.3) as f o l l o w s : (3.4) V (y*;d) = V f ( x * , y * ) ' d + sup V f(x*,y*)'w y w subject to V xg k(x*,y*)'w < - V y g k ( x * , y * ) ' d V h. (x*,y*) 'w = -V h. (x*,y*) 'd x j , J f y 3 Vj and Vk such t h a t g k(x*,y*) = 0. C o r o l l a r y 3.2: Assume the hypotheses of Theorem 3.1, A l s o assume f f and g are d i f f e r e n t i a b l e , h i s l i n e a r and x*eint(X) . Then a. I f y* + AdeV Q = {y: g(x,y) < 0 and h(x,y)=0 f o r some xeX} f o r some A>0, then (3.4) i s v a l i d . t i n t ( X ) i s the i n t e r i o r of the s e t X, 12 7 b. I f y * e r i ( V Q ) + , then (3.4) i s v a l i d Vd such t h a t y* + d e a f f ( V ).* o c. I f y * e i n t ( V Q ) , then (3.4) i s v a l i d VdeR n. Hogan [ 42] gives examples to demonstrate t h a t a naive g e n e r a l i z a t i o n of FW to u t i l i z e dds i n s t e a d of g r a d i e n t s , w i l l run a f o u l of the jamming phenomenon (see the appendix to chapter one). However, one can approximate the dd i n a manner which i s analogous to an antijamming p r o v i s i o n , i n p a r t i c u l a r an e-perturbation technique. This approximation developed by Hogan, v 1 (y;d), i s defined by r, x (3.5) v 1 (y;d) = V f ( x , y ) ' d + sup V f(x,y)'w r ' x y w subject to rg(x,y) + V vg(x,y)'w < -V g(x,y)'d x — y V h(x,y)'w = -V h(x,y)'d x y I M L £ r a Here r >1 i s a r e l a x a t i o n parameter and a i s -chosen to be a la r g e p o s i t i v e number. The l a s t c o n s t r a i n t avoids unbounded-ness i n the d i r e c t i o n f i n d i n g problem and the £ m (sup ) norm i s chosen merely to ensure t h a t the f e a s i b l e set i s p o l y h e d r a l . The f o l l o w i n g lemma v e r i f i e s t h a t t h i s approximation t , * r i ( X ) i s the r e l a t i v e i n t e r i o r of the s e t X , a f f ( X ) the a f f i n e h u l l of X, 12 8 has the r e q u i r e d p r o p e r t i e s . Lemma 3.1: Suppose f and -g are d i f f e r e n t i a b l e concave f u n c t i o n s , h i s l i n e a r , y*eY, Y i s a concave subset of r i ( V ), M(y*) i s bounded, r ^ l and a> diameter of M(y*) . Then the f o l l o w i n g are v a l i d : a. v ^ x * ( y * ; y - y*) < v'(y*;y - y*) Vyeaf f (V Q) . b» v ' *(Y*;d) i s nondecreasing i n r and Vyeaff(V ) r,x* J • J o there e x i s t s a f i n i t e r(y) such t h a t v ^ x * ( y * ; y - y*) = V (y*;y - y*) Vr>r(y). c. The f o l l o w i n g are e q u i v a l e n t : ( c . l ) sup v' v * ( y * ; y - y*) = 0 f o r r > l . yeY ' (c.2) sup v 1 ( y * ; y - y*) = 0. yeY (c.3) y* solves (3.2) . I t i s often more economical to generate only e-optimal s o l u t i o n s to the inner problem. We then u t i l i z e the psm M*(y,e) = {x: g(x,y)<0, h(x,y)=0 and v(y)<f(x,y) + e} . Hogan's v e r s i o n of FW (HFW) can be described by the f o l l o w i n g flowchart. 129 Choose y ^ Y ^ ^ O j r ^ l ; set k=l. I — > Choose x keM*(y,e k) k max v' __k (y ; z - 5 zeY and f i n d z^eY to solve rk) . Set d k = z k - y k . Choose T e [ 0 , l ] to solve max v ( y k + T d k ) . Set y k + 1 = y k + x k d k ; xe [0,1] £ k + l = £ k / 2 . k = k + i . Convergence i s given by the f o l l o w i n g theorem. Theorem 3.2: Suppose f and -g are d i f f e r e n t i a b l e and concave f u n c t i o n s , h i s l i n e a r , Y i s a nonempty convex subset of r i ( V ), r > l , the set of f e a s i b l e s o l u t i o n s t o (3.1) i s compact; a> the diameter of the s e t of f e a s i b l e s o l u t i o n s to (3.1). Then every accumulation p o i n t of the sequence {y k} generated by HFW i s a s o l u t i o n t o (3.2). 130 §3.2 Approximate Di r e c t i o n a l Derivatives v i a Approximate  Gradients » The key characterization i n §3.1 i s given by (3.5). In t h i s section we consider the following approximation to (3.5): (3.6) v' k(y;d) = vj cf(x,y) ,d + sup V*f(x,y)'w r , x ' e y weW x where w = { w e R P . r g ( X / y ) + V g(x,y)'w < -V g(x,y)'d x y V x h ( x / Y ) 'w = -V h(x,y) 'd • llwlL 1 ro } . The feasible set, W, of course depends upon x,y,d and r but for notational s i m p l i c i t y t h i s dependence i s not r e f l e c t e d i n our notation. Observe that W i s the feasible set i n (3.5) also. The only difference between (3.5) and (3.6) i s that i n the objective, we replace the exact gradients V f and V f with the k k approximations V f and V^f respectively. P a r a l l e l i n g the development i n chapter one, we assume the following: (Al) there e x i s t sequences ^ ^ f ( x , y ) ^ and ^ v x f ( x , y ) ^ converging uniformly to ^ f(x,y) and ^ f(x,y) respectively; y x X and Y are compact; the approximating sequences are ordered so that l | V f ( X f y ) - vkf(X/y)|| < ( a )k eo / 2 c^ a n d l | V f(x,y) - vkf(X/y)|| < ( a ) k e o / 2 p ( r a ) 2 f with 1 3 1 c = sup| | d | I . y deY i s f i n i t e since Y i s compact. As i n chapter one, ae ( 0,l) i s a 'shrinking parameter and e° i s the i n i t i a l tolerance so that the tolerance at the k t h i t e r a t i o n i s e k = (a) ke°. We now v e r i f y that expression ( 3 . 6 ) uniformly approximates ( 3 . 5 ) . Theorem 3 . 3 ; Assuming conditions (Al) , V r , x ( Y ; d ) " v r , x , e k ( y ; d ) I - ^ Vy,deY , xeX , r>l with e k=(a) ke° , ae ( 0,l) and e ° > 0 . Proof: |v' x(y;d) - v' k(y;d)| = |V f(x,y)'d + supV f(x,y)'w-V kf(x,y)'d-supV kf(x,y) y weW x y weW x (Henceforth we w i l l write f for f(x,y).) < 1 7 f'd - V k f d | + |supV f'w - supV kf'w| . y y weW x W E W Claim 1 : |V f'd - V kf'd| < e k / 2 . i y y I |V f'd - V kf'd| < |]V f - V k f | | - | | d | l l y y l _ l l y y II II II 1 e k | | a | | / 2 c y < e k / 2 . 1 Claim 2 : |supV f'w - supV kf'w| < e k / 2 . weW weW [ F i r s t note that |supV f'w - supV kf'w| < |sup(V f - V kf)'w| weW x weW weW x x [ Since W i s compact we may assume there e x i s t 132 w1 and w2 i n W such that supV f'w = V f'w, and T T x x 1 weW supV kf»w = vVw, . weW Case 1: V f'w, > V kf'w„ X J. X c. then |supV f'w - supV kfwj = V f'w. - V kf'w. weW x weW x v < v x f W l - v x f W l = (V xf - V k f ) « W l < |sup (V f - V kf)'w| weW x x Case 2: V f'w.. < V kf'w 0 x 1 x 2 then |supV f'w - supV kfw| = V kf'w 0 - V f'w, wsW x weW x x 2 x 1 < V kf'w„ - V f'w_ — x 2 x 2 < |sup(V f - V kf) 'w| weW x x Thus, |sup(V f - V kf)'w| < 1j V f - V k f | |-sup| |w| | weW x x x w g ; W < e ksup||w||/p(ra) 22 < e k/2 Combining the claims we are done. //. 133 We may modify HFW t o obt a i n the f o l l o w i n g a lgorithm, FWT -Wolfe Three) t o solve (3.1). V S t a r t / Choose y°eY,ae (0,1) , e° >0, r>_l; Set k=0 . t k k k ' Choose x eM*(y , e ). Determine z k t o solve max v' k k (y^; z) . Set d k = k k y - z . zeY r,x ,e Choose x k e [ 0 , l ] t o s a t i s f y max xe[0,l] v(y k+T kd k) > v ( y k + T d k ) - e k Set y k + 1 = y k + x k d k , e k + 1 = a e k , k = k + 1. =7-134 We now develop the analog of Theorem 3.2 in§3.1. The method of proof i s t h a t of Hogan, wi t h m o d i f i c a t i o n s n e c e s s i t a t e d by k k using the approximations V x f and 7 y f ' a s w e l l as using an i n e x a c t l i n e search. The l a t t e r m o d i f i c a t i o n can (as observed by Hogan) be used i n HFW without s i g n i f i c a n t changes i n the proof. Notice t h a t we do not a l l o w approximations to the gradients of h or of g. This i s p r i m a r i l y because f o r the a p p l i c a t i o n s we w i l l be c o n s i d e r i n g i n chapter f o u r , the const-r a i n t set i s t r i v i a l . A l s o , modifying the f e a s i b l e set i n t h i s way can run i n t o problems due t o the lack of c o n t i n u i t y o f a l i n e a r program's s o l u t i o n w i t h respect t o the c o e f f i c i e n t s of the technology matrix; see f o r example ParJJkh [137]. The p i e c e -wise c o n t i n u i t y of s o l u t i o n s w i t h respect to p e r t u r b a t i o n s i n the o b j e c t i v e f u n c t i o n c o e f f i c i e n t s i s w e l l known. We proceed by developing the c o n t i n u i t y p r o p e r t i e s of the psms we use. These lemmas are i m p l i c i t i n Hogan's proof of Theorem 3.2. Some necessary theorems from Hogan [43 - 44], are l i s t e d i n the appendix along w i t h some of the t o p o l o g i c a l concepts we have u t i l i z e d . R e c a l l the hypotheses used i n §3.1: (A2) f and -g are d i f f e r e n t i a b l e and concave f u n c t i o n s , h i s l i n e a r , Y i s a nonempty convex set i n r i ( V Q ) , r>_l, K (the set of f e a s i b l e s o l u t i o n s t o (3.1)) i s such that a > the diameter of K. 135 F i r s t we consider the psm F:Y -> 2" which i d e n t i f i e s the set of f e a s i b l e p o i n t s f o r the in n e r problem, F(y) = {xeX: g(x,y)<0, h(x,y)=0} . Lemma 3.2: F i s continuous on r i (Y) . Proof: Both g and h are continuous on X x Y. Thus by Theorem A3. 4, F i s closed on Y. A l s o , g and h are convex on X x Y and from Theorem A3.5 we conclude t h a t F i s open on r i ( Y ) . Hence, F i s continuous on r i ( Y ) . //. Lemma 3.3: v i s continuous on r i (Y) . Proof: By Lemma 3.2 we know tha t F i s continuous on r i ( Y ) . F i s a l s o uniformly bounded si n c e Y i s compact. A l s o , f i s continuous on X x y so tha t by Theorem A3.1 we may conclude t h a t v i s continuous on r i ( Y ) . //. Now consider the psm P:X x Y R n x R n defined by P(x,y) = { (d,w) : y + deY and (d,w)eK L} wit h K L being the set of f e a s i b l e p o i n t s i n (3.4). 136 Lemma 3.4:p i s continuous on K1"1. Proof ;P i s closed on Y from Theorem A3. 4. I t remains t o show P i s open on K L. Assume f o r J an i n f i n i t e index s e t , t h a t {x 3,y 3) •* (x,y) f o r (x3,y3) £K L. Let (d,w) eP (x,y) . We need t o v e r i f y t h a t there e x i s t s a sequence {(d^ ,wl)} w i t h (d3,w3)eP(x 3,y 3) such that (d3,wi ) •+ (d,w) . Let y e r i ( Y ) ( r i ( Y ) i s nonempty s i n c e Y i s ; see R o c k a f e l l a r [86] ) a n d l e t xeX such t h a t g(x,y)<0 and h(x,y)=0. Define the l i n e s w(8) = 8(x - x) + (1 - 6)w and d(6) = 6(y - y) + (1 - 6)w . Now ( x , y ) , ( x , y ) e K L i m p l i e s V h (x,y) ' (x - x) + V h (x,?) ' (y - y) = 0. X y (d,w)eP(x,y) means V h(x,y)'w + V h ( x , f ) ' d = 0. x y By the l i n e a r i t y of h we may conclude Vh(x,y) ' (w(6) ,d(0) ) = 0 V6e[0,l] and ¥(x,y). S i m i l a r l y rg(x,y) + V xg (x,y) ' w (6) + V g(x,y)'d(6) < 0 V9e (0, U . By the c o n t i n u i t y of g and of i t s g r a d i e n t , there e x i s t s 137 j (6) such t h a t r g ( x j , y j ) + V q ( x j , y j ) 'w(0) + V a ( X 3 , y 3 ) " d ( 6) < 0 x y Vj>j (9) and 9e (0, l l . y e r i ( Y ) i m p l i e s there e x i s t s j*(8) such t h a t y 3 + d (9)eY Vj>j (9) and 9e (0,1]... T h i s f o l l o w s from the f a c t t h a t y^ + d ( 0 ) = (1 - 9) (y + d) + 9y + y 3 - y. y 3 + d ( 9 ) e a f f ( Y ) , y j + d ( 0 ) •> y + d ( 0 ) and y + d ( 9 ) e r i ( Y ) 9e (0,1] . F i n a l l y , V 6 e ( 0 , l] and j > max(j(9) , j * (9)) (d(0) ,w(9)) eP(x j ,y j) . Thus a sequence {9"'} can be chosen such t h a t 0^ -»• 0 and ( d ( 0 j ) , w ( 0 j ) ) e P ( x j , y j ) . S e t t i n g (d^,w-') = ( d ( Q J ) , w ( 0 ^ ) ) , we see t h a t (d^,w^)eP(x D,w 3) and (d j,w j) - (d,w) . //. 138 Theorem 3.4: i f (Al) and (A2) hold then any accumulation p o i n t of the sequence {y k} generated by the al g o r i t h m FWT i s a s o l u t i o n t o (3.1). Proof: Observe t h a t FWT i s w e l l - d e f i n e d . Since K i s compact, {y } has an accumulation p o i n t y*. By assumption Vk there e x i s t s T e [ 0 , l ] such t h a t v ( y k + 1 ) = v ( y k + t k d k ) >_ max v ( y k + xd k) - e k. x e [ 0 , l ] By c o n t i n u i t y of v (Lemma 3.3)/ subsequencing i f necessary x k •*> T * , d k -> d* and e k e * = 0. v(y* + x*d*) = v(y*) >_ max v(y* + xd*) - e* xe[ 0,1] = max v(y* + xd*) . xe[ 0,1] Hence, by d e f i n i t i o n of the d i r e c t i o n a l d e r i v a t i v e v'(y*;d*) < 0 . By Lemma 3.1 (a) , Vx*eM*(y*,0) V * (y*;d*) < 0 r , x* ~ since M*(y*,0) i s a subset of K which i s compact. Using Lemma 3.1(c) , i t remains t o show t h a t there e x i s t s x*eM*(y*,0) such t h a t 139 v' x*(y*;d*) > v' (y*;y - y*) VyeY , sin c e y* + d*eY and si n c e v' , converges uniformly r , x * , e K to v i Define the psm D v i a D(x,y) = {d: y+deY, v 1 (y;d) > v' (y;z-y) VzeY} . I t s u f f i c e s t o prove t h a t there e x i s t s x*eM*(y*,0) such that d*eD(x*,y*). Let c(x,y,d,w) = V f ( x , y ) ' d + V f (x,y) 'w . y x Thus d£D(x,y) i f f there e x i s t s w such t h a t (d,w) solves (3.7) max c(x,y,d,w) (d,w)eP(x,y) Now, c i s continuous (by c o n t i n u i t y of the gradient and and of the in n e r product) and the sequence { ( x k , y k , d k , w k ) } i s bounded. Again, subsequencing i f necessary, we know (x k,y k,d k,w k) - (x*,y*,d*,w*). Since (d k,w k) solves (3.7) at ( x k , y k ) and x keM*(y k, k ) , by Theorem A3.3, i t f o l l o w s t h a t x*eM*(y*,0). By Lemma 3.4 and Theorem A2.2, t h i s i m p l i e s (d*,w*) solves (3.7) at (x*,y*) . Hence, d*eD(x*,y*) f o r x*eM*(y*,0). //. 140 Appendix: Topological Concepts and Related Theorems In t h i s section we c o l l e c t some fundamental defin-i t i o n s of the topological terms we have u t i l i z e d as well as some of the results required for the proofs of §3.2. The basic references are Willard [111] , Hogan [43-44] and Rockafellar [86] A topology T on the set X, i s a family of sets which i s closed under f i n i t e intersections and a r b i t r a r y unions. The pair (X,T) i s c a l l e d a topological space; i f the topology i s unambiguous, we may refer to X as the topological space. The members of T are c a l l e d the open sets with respect to T. If (X,T) i s a topological space, and E i s a subset of X, we say that E i s closed i f the complement of E (relative to X) i s open. If E i s an a r b i t r a r y subset of X, the closure of E i n X, c l ( E ) , i s the smallest closed ser containing E; more pre c i s e l y , cl(E) = n{K i n X: K i s a closed set containing E}. The i n t e r i o r of E i n X, i n t ( E ) , i s the largest open set containing E; more formally, int(E) = ^ [G i n X: G i s an open subset of E}. The a f f i n e h u l l of E, a f f ( E ) , i s the smallest a f f i n e set (a set W i s a f f i n e i f for every two points in W, the l i n e through the two points also l i e s in W) containing E; thus 141 aff(E) = {x: x=£w.e. with e.eE and Sw.=l} . 1 1 1 l The r e l a t i v e i n t e r i o r of E, r i ( E ) , i s int(E) with respect to the r e l a t i v e topology on a f f ( E ) . The r e l a t i v e topology here i s the family of a l l intersections of elements of T with a f f ( E ) . The concept of r e l a t i v e i n t e r i o r i s useful when we want to consider i n t e r i o r s of sets which are embedded i n a higher dimension; for example, the i n t e r i o r 3 of a triangle embedded i n R i s empty i n the usual def-i n i t i o n , but not in the r e l a t i v e sense. The diameter of a subset E of a Euclidian space i s sup{||x-y||: x,yeE} . For the following: F i s a psm :Y -*• 2 X , f:X x Y ->• R* , v(y) = sup f(x,y) , xeF(y) M(y) = {xeF(y) : v(y) <_ f(x,y)} and M*(y,e) = {xeF(y) : v(y) <_ f(x,y) + e}. Let g:X x y -»• R*m and for yeY l e t P(y) = {xeX: g(x,y) < 0} . 142 Theorem A3.1:If F i s continuous at y*, uniformly bounded near y* and f i s continuous on F(y*) x y* then v i s contin-uous at y*. Theorem A3.2: I f F i s continuous at y* and f i s continuous on F(y*) x y* then M i s closed at y*. Theorem A3.3: If F i s continuous at y* and f i s continuous on F(y*) x y*, then M* i s closed on y* x R. Theorem A3.4;If each component of - g i s lower semicont-inuous on X x y*, then f> i s closed at y*. Theorem A3.5: If f and g are convex on X x Y, y*eri(V ) and X i s convex and bounded, then F i s open at y* r e l a t i v e to aff(V ). 143 §4 PORTFOLIO REVISION PROBLEMS This chapter begins with a b r i e f survey of the l i t e r a t u r e i n this area. The majority of these papers feature dynamic programming formulations but most of these, as a convenient mathematical f i c t i o n , assume no transaction costs. This i s c l e a r l y unsatisfactory from a normative point of view. In §4.3 we formulate a general consumption-investment model to which the algorithm developed in the preceeding chapter i s applicable. The following section gives a simp-l i f i c a t i o n of this model - the expected u t i l i t y model with proportional transaction costs. As well, d e t a i l s of the implementation are given. The f i n a l section gives results of numerical computations using a subset of the data pres-ented in chapter 2. Here we show that the errors caused by neglecting the revision aspect of the problem are not i n s i g -n i f i c a n t even i f there i s no inter-period covariance. 144 § 4.1 Dynamic Programming Formulations The f i r s t r elated works are those concerned with multi-period consumption-investment decisions. The basic dynamic programming framework was developed by Phelps [ 79 ], who treated the problem of maximizing the expected u t i l i t y for l i f e t i m e comsumption when the investor chooses (in each period) to allocate wealth to current consumption and to investment i n a risky asset. Phelps determined the optimal strategy (as a function of wealth and 'age') for u t i l i t y functions with constant r e l a t i v e r i s k aversion (viz. log(w), w^ 1 , -w~^ for Xe (0,1) ) . This paper was extended by Hakansson [39]/Who introduced an arb i t r a r y number of assets, short sales, borrowing and lending. These two, and most of the subsequent work, e.g. Mossin [ 72] and Hakansson [40] , u t i l i z e s p e c i a l properties of the u t i l i t y functions (especially constant r e l a t i v e r i s k aversion and hyperbolic absolute r i s k aversion (hara)) to obtain properties of the optimal p o l i c y . One fundamental r e s u l t i s that with constant r e l a t i v e r i s k aversion the optimal mix of r i s k y and r i s k l e s s assets depends only on the one period u t i l i t y function and on the one period d i s t r i b u t i o n of returns, 145 not on wealth or on consumption. P o s s i b l y the most s o p h i s t i c a t e d dynamic programming treatment i s by Abrams and Karmarkar [ ]_] . In t h i s paper, f o r general concave u t i l i t y f u n ctions and prop-o r t i o n a l t r a n s a c t i o n c o s t s , they c h a r a c t e r i z e the region of no t r a n s a c t i o n s (as w e l l as proving i t s existence) and derive p r o p e r t i e s of the optimal p o l i c y . One important t h e o r e t i c a l r e s u l t (one of s e v e r a l r e l a t i n g to r e d u c t i o n of m u l t i p e r i o d problems) i s by Fama [31_ 3 2 ] "... i f the consumer i s r i s k averse ( i . e . h i s u t i l i t y f u n c t i o n f o r l i f e t i m e consumption i s s t r i c t l y concave) and markets f o r consumption goods and p o r t f o l i o assets are p e r f e c t , then the consumer's observable behavior i n the market i s i n d i s t i n g u i s h a b l e from t h a t of a r i s k averse expected u t i l i t y maximizer who has a one p e r i o d h o r i z o n . " This r e s u l t , w h i l e t h e o r e t i c a l l y important, i s d e r i v e d i n the p e r f e c t market s e t t i n g . Constantinides [19] explores the very l i m i t e d extent t o which s i m i l a r r e s u l t s can h o l d i n the presence of convex t r a n s a c t i o n c o s t s . See a l s o Ziemba [119-J.20] Another paper which employs the induced u t i l i t y f u n c t i o n / dynamic programming framework i s Neave [ 75], which gives cond-i t i o n s t o guarantee t h a t the p r o p e r t i e s of decreasing absolute r i s k aversion and i n c r e a s i n g r e l a t i v e r i s k aversion (with 146 wealth) are preserved i n p e r f e c t market, m u l t i p e r i o d problems. With respect t o these papers, Zabel [ H5] p o i n t s out: "The character of these s t u d i e s i s s t r o n g l y i n f l u e n c e d by two major assumptions through which the e s s e n t i a l s i m p l i c i t y of Phelps" model i s r e t a i n e d . One i s t h a t the u t i l i t y of each period's consumption i s independent of past and f u t u r e consumption... the second i s t h a t conversion of assets from one form to another i s c o s t l e s s , which i m p l i e s t h a t the composition of wealth i s i r r e l e v a n t i n d e c i s i o n making and, moreover, t h a t i n the presence of a r i s k l e s s investment opportunity the i n d i v i d u a l has no i n c e n t i v e to h o l d cash as an asset." These dynamic programming formulations are u s e f u l i n determining p r o p e r t i e s of optimal p o l i c i e s , however t h e i r normative a p p l i c a t i o n s are se v e r e l y l i m i t e d by the assumption of zero t r a n s a c t i o n costs (except f o r Abrams and Karmarkar). Another p o i n t , discussed by E l t o n and Gruber [26 ] : "The i n t r o d u c t i o n of t r a n s a c t i o n costs and taxes v a s t l y complicates the a n a l y s i s and i n general, makes the dynamic programming 147 formulation i n f e a s i b l e . The problem arises because taxes and transaction costs depend on the amount of s e c u r i t i e s bought and sol d ... thus... the state variable becomes (z - j _ t / • • • i z m t ^ [the d o l l a r investment i n assets l,...,m i n period t] rather than w^  and a dynamic programming problem would have to be solved f o r each l e v e l of each z^ t." Thus the re s u l t s of Fama, Neave et a l cannot be expected to hold i n the presence of transaction costs, and these formulations are of li m i t e d a p p l i c a b i l i t y . 148 §4.2 Extensions of Mean-Variance Analysis  and Other Models Smith [ l o o 1 i n 196 7 wrote the f i r s t paper to use the mean-variance approach. This i s one of the few substantial computational studies i n p o r t f o l i o r e v i s i o n . He b a s i c a l l y compares four possible approaches to r e v i s i o n : a) buy and hold (no revision) , b) adjustment (maintain the same ( i n i t i a l l y optimal) proportions through each period), c) complete t r a n s i t i o n (move to the currently optimal p o r t f o l i o each period), and d) controlled t r a n s i t i o n (consider a l l possible changes in each security i n d i v i d u a l l y , make those changes that, a f t e r consideration of transaction costs and taxes, have a p o s i t i v e gain). E s s e n t i a l l y each of these approaches i s suboptimizing i n that transaction costs and taxes are considered only ex post. An analogous model which attempted to incorporate ex ante consideration of these costs was by Chen,Jen and Zionts [16]; they used a simple two-period zero covariance model which could be solved i n closed form and made suggestions about possible extensions to multiperiod models. As t h e i r model 149 was s i m i l a r to t h a t of Mossin [ 7 2 ] , they were able t o show t h a t the fo r m u l a t i o n i n c o r p o r a t i n g t r a n s a c t i o n costs could y i e l d d i f f e r e n t p o l i c i e s than the l e s s r e a l i s t i c case. One model which eschews both of the techniques we have discussed i s by Zabel [115 ] , who uses a two p e r i o d model assuming one r i s k y asset, p r o p o r t i o n a l t r a n s a c t i o n costs and a power u t i l i t y f u n c t i o n . The problem can be expressed as A A A max { c A + a / [w - c + ( 5 ( l - b ) - l ) Y - b | Y-y | ] Ad$ (O } c, Y>0 a. subject t o w - c - Y - b|Y - y | >^  0 Where u(c)=c A , Ae(0,l) i s the u t i l i t y f o r consumption; w i s the i n i t i a l wealth l e v e l (=x+y); x i s cash; y i s the current value of the r i s k y asset; b i s the p r o p o r t i o n a l t r a n s a c t i o n c o s t ; Y i s the new value of the r i s k y asset; £ i s the random r e t u r n on the r i s k y asset; a_<£<A; a i s the discount r a t e on second p e r i o d u t i l i t y ; $ i s the d i s t r i b u t i o n f u n c t i o n of £. (Note t h a t w h i l e Zabel c a r r i e s out the c a l c u l a t i o n s f o r the power u t i l i t y case, he suggests t h a t r e s u l t s f o r other constant r e l a t i v e r i s k aversion u t i l i t y f u n c t i o n s w i l l be s i m i l a r . ) 150 A f t e r s i m p l i f i c a t i o n we are reduced to c o n s i d e r i n g the i n t e g r a l A i H(V,v) = / [1 + (5(1 - b) -1)V - b|V - v|] Ad<M£) a where V=Y/(w-c) and v=y/(w-c) . This expression represents the expected u t i l i t y of i n v e s t i n g $1 w i t h d e c i s i o n s v and V. I f v e ( l , l / b ) and [ £ A _ X (£-1) ] < 0 , Zabel shows dV*/dv = - b V * / ( l - bv). This i m p l i e s t h a t the optimal stock and cash shares are l i n e a r i n v. A l s o , i f v e [ 0 , l ] , then V* (v) i s uniquely determined and we obtain the f o l l o w i n g diagram. Diagram 4.1; Optimal stock share . 151 Subsequently Mukherjee and Zabel [74 ] considered consumption-investment problems w i t h two a s s e t s , a two p e r i o d h o r i z o n and u t i l i t y f u n c t i o n s w i t h constant r e l a t i v e r i s k a v e r s i o n . They used (independently) two types of t r a n s a c t i o n c o s t s : p r o p o r t i o n a l (as i n Zabel [115]), and a 'lump sum' charge, independent of the amount tr a n s a c t e d . The former case e s s e n t i a l l y r e p l i c a t e s the r e s u l t s l i s t e d above; the l a t t e r case proves to be very unwieldy and i m p r a c t i c a l f o r any r e a l i s t i c problem. Leland [55 ] extends some of the r e s u l t s of Mukherjee and Zabel t o the case of general r i s k averse u t i l i t y f u n c t i o n s assuming only one r i s k y asset. Another extension of Zabel's work i s by Kamin [48], who considered p o r t f o l i o r e v i s i o n w i t h p r o p o r t i o n a l t r a n s a c t i o n costs and a two r i s k y asset p o r t f o l i o , w i t h the o b j e c t i v e of maximizing expected u t i l i t y of end of h o r i z o n wealth (also there i s no sh o r t s e l l i n g , no borrowing or l e n d i n g , no consumption no cash i n f l o w s or outflows, and no t a x e s ) . F i n a l l y , the p a r t i a l d e r i v a t i v e s of the u t i l i t y f u n c t i o n are homogeneous t o some degree i n the values of each of the as s e t s ; thus "... the optimum adjustment s t r a t e g y i s c h a r a c t e r i z e d i n each p e r i o d by two c o n t r o l l i m i t s f o r the r a t i o s of the values of the 152 of the two assets ( i . e . tan.6' and tan9"' i n f i g u r e 4.2). I f the r a t i o i s outside the l i m i t s , then the composition of the p o r t f o l i o should be adjusted along the appropriate f e a s i b l e adjustment l i n e u n t i l the r a t i o equals t h a t at the adjacent c o n t r o l l i m i t . I f the r a t i o i s i n s i d e the c o n t r o l l i m i t s , the composition of the p o r t f o l i o should not be adjusted." Hence, from [48],- we have the f o l l o w i n g diagram. $-amount i n asset 2 1 Diagram 4.2; Kamin's no t r a n s a c t i o n r e g i o n . 153 Constantiriides [18], assuming hara u t i l i t y f u n c t i o n s , exogenous income, one r i s k y and one r i s k l e s s a s s e t, d e r i v e s Kamin's r e s u l t . S i m i l a r work i s a l s o found i n the aforemen-t i o n e d work of Abrams and Karmarkar [ 1 ] . [Our a t t e n t i o n here has been focused upon d i s c r e t e time fo r m u l a t i o n s . Of course, the continuous time models have c e r t a i n advantages ( i n continuous time the sum of lognormal v a r i a b l e s i s s t i l l lognormal, f o r example) but i n c o r p o r a t e very d i f f e r e n t a n a l y t i c techniques t o those discussed here. Merton [ 65 ] i n i t i a l l y developed the s t o c h a s t i c d i f f e r e n t i a l equation approach (under the assumption of a geometric Brownian motion) to the m u l t i p e r i o d consumption-investment model, and d e r i v e d e x p l i c i t consumption and investment p o l i c i e s . The statement of these p o l i c i e s contains a n o n l i n e a r p a r t i a l d i f f e r e n t i a l equation and two a l g e b r a i c equations. For the constant r e l a t i v e r i s k a version c l a s s , these reduce t o a simple form. The extension of Merton*s work to the case of p r o p o r t i o n a l t r a n s a c t i o n costs was done by Constantinides and M a g i l l [20]. This paper i s probably the source of the most acceptable model of an i n v e s t o r ' s p o r t f o l i o behavior t h u s f a r appearing i n the l i t e r a t u r e , i n t h a t i t allows f o r an a r b i t r a r y number of s e c u r i t i e s , an a r b i t r a r y time h o r i z o n , as w e l l as t r a n s a c t i o n c o s t s . The main d i f f i c u l t y w i t h t h i s model i s i t s 154 l a c k of f l e x i b i l i t y (e.g. the d i f f i c u l t y i n i n t r o d u c i n g t a x a t i o n ) . In a l l of these models there i s a problem of i n c o r p o r a t i n g the changing expectations of i n v e s t o r s over time. One paper t r e a t i n g t h i s i s by Pogue [ 80], wherein he i n c o r p o r a t e s a f o r e c a s t i n g e r r o r i n estimates of s e c u r i t y p r i c e s ; i . e . the expected p r i c e i s a f u n c t i o n of past earnings, d i v i d e n d s , growth, r i s k and f o r e c a s t e r r o r s . A model i n c o r p o r a t i n g Bayesian r e v i s i o n of s e c u r i t y p r i c e d i s t r i b u t i o n s i s by Winkler and Barry tll2l» This b r i e f summary should i n d i c a t e t h a t the computational b a r r i e r s t o the implementation of a r e a l i s t i c p o r t f o l i o r e v i s i o n model are severe. As s t a t e d i n Smith [101]: "We hope to soon see progress i n developing m u l t i p e r i o d models f o r p o r t f o l i o r e v i s i o n t h a t are (a) both adaptive and dynamic, (b) i n c l u d e t r a n s a c t i o n costs and other r e a l i t i e s of the s e c u r i t i e s i n d u s t r y , and (c) have b e n e f i t s to the d e c i s i o n process t h a t exceed the a s s o c i a t e d computational c o s t s . " In t h i s chapter we i n v e s t i g a t e the degree to which a p p l i c a t i o n of a s t o c h a s t i c programming technique can a i d t h i s progress. 155 §4.3 A General P o r t f o l i o R e v i s i o n Model I n c l u d i n g  Consumption, Borrowing and Taxes A r a t h e r general formulation to which the al g o r i t h m of chapter three i s a p p l i c a b l e can be given by the f o l l o w i n g expected u t i l i t y f o r consumption and f i n a l wealth model: (4.0) max { v ( c j + E '{ max v 2 ( c 2 ) + E u( yl' Cl-° 5j Y2' C2' L1° X 5 2 i ^ l i p i y 2 i " t{±l1 ^ 2 i ^ l i P i y 2 i + c l + C2 " W l - w 2 - L) - Lr) }} n subject to I P i v l i + c i - w l (4.1) i = l 1 n n n (4.2) h . ( y i i - y 2 i ) + Ji C l i P . y 2 i + c 2 1 w 2 +L + J^^p.y^ (4.3) 9 1 ( Y 1 , c 1 ) > 0 (4.4) g 2 ( c 1 , c 2 , y 1 , y 2 , L , C i ) > 0 Here, f o r i = l , . . . , n (n i s the number of s e c u r i t i e s ) : y l i ^2i^ 1 S t* l e n u m l : > e r o f shares of s e c u r i t y i he l d i n p e r i o d 1 (2) ; c 1 ( c 2 ) i s the $-amount o f consumption i n p e r i o d 1 (2); 156 v ( ( v 2 ) i s the (concave, i n c r e a s i n g ) u t i l i t y f u n c t i o n on p e r i o d 1 (2) consumption; ^ l i ^ 2 i ^ ^ s o n e P^ u s t h e p e r i o d 1 (2) rate of r e t u r n on s e c u r i t y i ; L i s the amount of borrowing at the end of p e r i o d 1; u i s the (concave, i n c r e a s i n g ) u t i l i t y f o r f i n a l wealth; p^ i s the i n i t i a l per share p r i c e of s e c u r i t y i ; t i s the ( l i n e a r ) tax rate on gross p r o f i t s ; r i s one plus the borrowing r a t e ; w^ (w2) i s the exogenous wealth allotment i n p e r i o d 1 (2); i s the (convex) t r a n s a c t i o n cost f u n c t i o n , dependent on the number of shares bought or s o l d *; g^(y^,c^) i s the set of (convex) c o n s t r a i n t s on f i r s t p e r i o d d e c i s i o n s ; g 2 ( c l f c 2 , y 1 / y 2 f L, -^^ ) i s the set of (convex) c o n s t r a i n t s on second p e r i o d d e c i s i o n s (e.g. t h i s could i n c l u d e a c o n s t r a i n t t h a t consumption must exceed a c e r t a i n p r o p o r t i o n (A) of the end of f i r s t p e r i o d wealth, c 2 _^ A{ ^ l i P j Y i i + c2 * e t c - * ' We assume th a t a l l maximizations are f i n i t e w i t h f i n i t e maximizing p o i n t s , t h a t u i s i n t e g r a b l e w i t h respect t o E,^ and t h a t v 2 i s i n t e g r a b l e w i t h repect to £ . * This i s p r e f e r a b l e to having t r a n s a c t i o n costs a f u n c t i o n of the d o l l a r value of the t r a n s a c t i o n ; c f . the model of Zabel i n §4.2. 157 The important features of t h i s f o r m u l a t i o n are: H convex t r a n s a c t i o n c o s t s , II s t o c h a s t i c and dependent rates of r e t u r n i n each p e r i o d , IT general (concave, increasing) u t i l i t y f u n c t i o n s , 11 exogenous income, and U borrowing and taxes. Theorem 4.1: (4.0) - (4. 4) i s a concave program. Proof: We begin by showing t h a t the f e a s i b l e . set i s convex. F i r s t observe th a t (4.1) i s l i n e a r and hence the set of p o i n t s s a t i s f y i n g (4.1) i s convex. Now, w r i t i n g (4.2)-(4.4) i n the form -g (x) •<_ 0 , the l e f t hand si d e i s a concave f u n c t i o n and thus the set {x: -g (x) <_ 0 } i s convex. We conclude t h a t the f e a s i b l e set i s convex since i t i s an i n t e r s e c t i o n of convex s e t s . * I t remains to show tha t the o b j e c t i v e f u n c t i o n i s a concave f u n c t i o n . Consider f i r s t the argument of u. I t i s a l i n e a r f u n c t i o n of the d e c i s i o n v e c t o r s , and the composition of a concave f u n c t i o n w i t h a l i n e a r f u n c t i o n i s concave (Mangasarian [59] ). The sum of two concave •A 158 functions i s concave, the maximum of a concave f u n c t i o n i s concave, and the expectation of a concave f u n c t i o n i s concave. Hence, (4.0) i s concave. //. While t h i s general framework can be t r e a t e d with the techniques we have developed, the r e s t of t h i s chapter w i l l focus on the d e t a i l s of the a p p l i c a t i o n i n a s i m p l e r , but s t i l l important fo r m u l a t i o n . In t h i s way we may i l l u s t r a t e the major p o i n t s without an excess of d e t a i l . 159 §4.4 The Expected U t i l i t y Model We c o n s i d e r the problem of maximizing the u t i l i t y of t e r m i n a l wealth. n ix E [ max E u( I . ?,, P. y~. ) ] >0 5 y >0 ? U i = 1 (4.5) maxy. 1 2- 2 1 s u b j e c t t o n I P i ^ l i = W i i = l n 1 C l i P i y 2 i + . ^ h i ( y 2 i " y l i ) = W2 + ? l i p i y l i i = l J-x 1 Z 1 i = i x ^ i = l We make the f o l l o w i n g assumptions: a. (5 ,£ ) has a normal d i s t r i b u t i o n w i t h mean [£ ,1 ] 1 2 1 2 and c o v a r i a n c e matrix I I 1 2 1 2 2 2 b. u i s C x, nondecreasing and concave, and c. t r a n s a c t i o n c o s t s are l i n e a r i n the number of shares t r a d e d , thus a ( y 2 i " y l i } i f y 2 i l y i i a(y U ~ y2i> i f y 2 i < y l i * h i ( y 2 i " y l i ) = The s t r u c t u r e o f the f e a s i b l e s e t i s s i m p l i f i e d by i n t r o -ducing the v a r i a b l e s b^ ^ and s ^ corresponding t o the number of shares bought and s o l d r e s p e c t i v e l y , of s e c u r i t y i at the end of p e r i o d one. Thus 16 0 y_. = y, . + b• - s. J 2 i J l x i i and we may r e w r i t e (4.5) as n (4.6) max E [ max E u( £ . E . P • (y,+b.-s, ) ) ] y >0 5 b,s>0 ? U i = l 2 1 1 1 1 — 1 — 2 1 subject to n J X P i Y l i = W l n J n K l i P i + a ) b i " U l i p i " a ) s i ] = W2 S l 1 y l l s n £ y m • Let us denote by y 2 the vector (y-^+b-^-s.^, . . . » Y i n + b n ~ s n ) • Now, given y^, the f i r s t stage s o l u t i o n and ^ , a p a r t i c u l a r r e a l i z a t i o n of the f i r s t p e r i o d r e t u r n s , we may r e w r i t e the second stage (recourse) problem as (4.7) max E u ( Z U 2 i £ p y )) b,s>0 ? |e 2 1 — 2 1 subject to Z [ ( ? l i p i + a ) b i ~ ( ^ l i P i - a ) s i l = w 2  s l £ y l l • * sn i y m • We denote by K 2 the f e a s i b l e set to (4.7). 161 To avoid an i n f i n i t e number of subproblems, we generate an approximation t o (4.5) by assuming a d i s c r e t e approximation to £ . I.e., we assume t h a t £ has p o s s i b l e r e a l i z a t i o n s 1 I 5 £ m w i t h p r o b a b i l i t i e s p i / - - ' / p m r e s p e c t i v e l y . We may now summarize the d e t a i l s of the implementation of the alg o r i t h m developed i n chapter three to t h i s problem. Step 0: I n i t i a l i z a t i o n We generate a f e a s i b l e f i r s t p e r i o d d e c i s i o n by assuming b=s=0, ('buy and hold')and thus solve the r e s u l t i n g s t a t i c problem: max E u(Z^ ip.y 1.) subject t o ^ P - i Y l i = w i • y >0 C 1 l — Here £ i s the two-period j o i n t d i s t r i b u t i o n of returns This problem can be solved by any s t a t i c p o r t f o l i o a l g orithm, not s u r p r i s i n g l y we u t i l i z e the a l g o r i t h m developed i n chapter one. Step 1: The D i r e c t i o n F i n d i n g Problem a. S o l v i n g the Second Stage Problem For each value of £ ( i . e . f o r each element i n the l d i s c r e t i z a t i o n of £ ) we must determine an e-optimal I s o l u t i o n to (4.7). This problem can be solved using a simple adaptation of the p o r t f o l i o s e l e c t i o n a l g o r -ithm used i n step 0. I t i s s l i g h t l y more d i f f i c u l t 162 because K 2 i s more r e s t r i c t i v e then the s t a t i c const-r a i n t s e t . However, observing t h a t K 2 contains only one c o n s t r a i n t which i s not an upper bounding or no n n e g a t i v i t y c o n s t r a i n t , i t i s n a t u r a l t o apply a s p e c i a l i z a t i o n of the simplex a l g o r i t h m w i t h upper bounding to t h i s problem (more s p e c i f i c a l l y to the d i r e c t i o n f i n d i n g s t e p ) ; see appendix 1. Once b and s are chosen, the second stage o b j e c t i v e can be w r i t t e n as z (b,s) = E (u(E?.p y . ) ) ^ 1 J- ^ -L w i t h t ^ NlT'y ,y'E y ], The p a r t i a l d e r i v a t i v e s can 2 2 2 2 2 2 be expressed as 8z/9b. = E t u * ( S c . p . y , . K . ] 2D = -8z/3Sj b. Computing the Approximate D i r e c t i o n a l D e r i v a t i v e We o b t a i n (see appendix 2 f o r d e t a i l s ) the f o l l o w i n g expression f o r the d i r e c t i o n a l d e r i v a t i v e : (4.8) v ' ( y i ; d ) = E [ (E V u(-))'d+sup (E V,, xu(-)'v] r ' x 5 ? U Y l veK, 5 |£ ( b ' s ) 1 2 1 a 2 i where the set i s given by 163 K d = {v: Z [ < e i i P i + a)v. - (^.p. - a ) v n + . ] = 0 V n + 1 £ d l v„ < d 2n - n v l 1. r C T v„ < r a }. 2n — Here ra i s a large p o s i t i v e number used only to prevent unbounded d i r e c t i o n s . A l t e r n a t i v e l y , w i t h our d i s c r e t i z a t i o n , we have (here a l s o using an approximate quadrature t o compute the c o n d i t i o n a l e x p e c t a t i o n ; see appendix 3) m (4.9) v ' ( y i ; d ) = J P. [(E . V u ( - ) ) ' d + i = l 1 5 U Y l 2 1 sup (E V,, _xu(-) ) 'v ] . veK £ U ^ ' S ) d 2 1 I f we f i x a r e a l i z a t i o n E,1 (and thus f i x b and s as l w e l l ) , the f i r s t term i n the summation i s [E . ( u ' ( - ) C 1 ? P l ) , . . . , E .(u'fOC 1.? p J H ' d £ [ £i I I 2 I 1 E \E 2 n 2 1 2 ' 1 = [ U 1 p E .(u'(-)£ ) , . . . , S X r > E . <u' (•)£))] 'd i l l 21 m r n r I? 2 2 1 2 1 But E ( u 1 ( • ) ? •) f o r each j , has already been 2 1 computed i n determining the second stage optimal s o l u t i o n s (again, i n the d i r e c t i o n f i n d i n g s t e p ) . 16 4 Furthermore, the second term i n the summation i n (4.9) equals sup [ (E . (u' (•) I ) , ...,E . (u 1 (•)E ) , - E . (u' (•)K ),..,-E (u' (•)K ) } "v . 2 1 2 1 Again, the c o e f f i c i e n t s of v have already been deter-mined, and since v i s the only v a r i a b l e which i s not f i x e d , has only one c o n s t r a i n t which i s not an upper bounding or nonn e g a t i v i t y c o n s t r a i n t . Hence t h i s problem can a l s o be solved by the aforementioned modified simplex method. c. Maximizing the D i r e c t i o n a l D e r i v a t i v e To optimize (an approximation to) the d i r e c t i o n a l d e r i v a t i v e i t i s f i r s t necessary to consider the comp-l e x i t y of one f u n c t i o n e v a l u a t i o n . F i r s t l y , we save the c o e f f i c i e n t s which w i l l appear i n the summand of (4.9), seeing these depend on y ^ b j S , ? ; 1 and not on d. Essen-t i a l l y t h e r e f o r e , one e v a l u a t i o n of v' re q u i r e s the s o l u t i o n of m of our s p e c i a l l y s t r u c t u r e d l i n e a r programs. I t i s c l e a r t h a t a method u t i l i z i n g only f u n c t i o n e v a l u a t i o n s (and not gradients) i s p r e f e r a b l e . 165 The method we use here i s an adaptation of the c o o r d i n -ate ascent method (see e.g. Z a n g w i l l [117]). I.e. we assume we have as d i r e c t i o n s the n u n i t vectors d l ' " " * ' d n " W e kegin w i t h d^=(l,0,...,0)=d Q . We maximize v 1 ( d Q + T (d^ - d Q ) ) and set d Q to the new optimum. We continue w i t h d^=(0,1,0,. . . , 0 ) e t c . We i t e r a t e u n t i l the appropriate stopping r u l e i s obtained ( c f . chapter three) . To r e t a i n the f e a s i b i l i t y Q f d Q at each i t e r a t i o n we r e s t r i c t T to l i e i n the u n i t i n t e r v a l [0,1]. step 2: Determine the Optimal Step Length The problem i s to f i n d x i n [0,1] to maximize k > z(y + x ( d Q - y ) ) . For each choice of T we need t o solve m recourse problems. At t h i s stage the a b i l i t y to u t i l i z e i n e x a c t l i n e searches becomes very important; we employ various h e u r i s t i c s t r a t e g i e s t o determine a 'reasonable' improvement i n z. 166 §4.5 Results and Implementation The input data required f o r the formulation i n §4.3 are the f o l l o w i n g : 1. c h a r a c t e r i s t i c s of the d i s t r i b u t i o n : ( E ^ , ^ ) ' the f i r s t and second pe r i o d mean v e c t o r s , ( , ) ' t n e f i r s t and second period covariance matrices, and (,Y.^0,I^^) the s e r i a l covariance matrices, 2. (p-jy . . - ,Pn) , the i n i t i a l p r i c e of each of the s e c u r i t i e s , 3. (w^,W2)/ the wealth a l l o c a t i o n i n each p e r i o d , 4. a, the estimated t r a n s a c t i o n c o s t , 1 M 5. ( E ^ , . . . , K j ) f the d i s c r e t i z a t i o n of E-^, and 6. u, the i n v e s t o r ' s u t i l i t y f u n c t i o n . Of the above, 1. and 4. req u i r e some e l u c i d a t i o n . Note that the f o l l o w i n g paragraphs use d e f i n i t i o n s and theorems from Hannan [ 130] • In general, we assume t h a t we have observations X i , - . . , X N °f the vector of s e c u r i t y r e t u r n s . (Formally, we re q u i r e t h a t the s e r i e s i s s t a t i o n a r y and er g o d i c . These assumptions are p l a u s i b l e i f , f o r example, s e c u r i t y markets are e f f i c i e n t and our r e v i s i o n p e r i o d i s s u f f i c -i e n t l y short.) S t a t i o n a r i t y w i l l imply t h a t I i = ? 2 a n <^ ^1^ = ^22' U n < ^ e r m i l d r e g u l a r i t y c o n d i t i o n s , the sample 167 mean vector x ' s o that the means may be estimated i n the standard manner. The n a t u r a l estimate f o r the f i r s t order s e r i a l covariance i s given by S = I = (N-2)" 1 Y ( X m - x ) ( X m + 1 - X>'. m=l We set S = = 2-^ 2 t o maintain symmetry. Let Kl = F = E { ( ^ j k " ^ l ) ( * k + l " ^ 1 ) , } • With p r o b a b i l i t y one, s. . -»• T. .. Note that S i s not the ID maximum l i k l i h o o d estimator but i s 'close' to i t and that these s e r i a l covariance estimates are i n general asymmetric. For our a p p l i c a t i o n we use a subset of the q u a r t e r l y data used i n chapter two, the f i v e s e c u r i t i e s w i t h the highest r e t u r n . S e c u r i t y Mean r e t u r n I n i t i a l p r i c e 1. Cunningham Drug Co. 1.0559 29 3/8 2. N a t i o n a l Cash Regist e r 1.0538 140 1/2 3. Metro-Goldwyn-Mayer 1.0 46 3 38 1/2 4. G i l l e t t e Co. 1.0408 44 5. Household Finance Corp. 1.0396 44 1/2 16 8 The f i r s t and second pe r i o d covariance matrices are given by .0290 .0067 .0044 .0058 .0040 .0067 .0190 .0036 .0060 -.0002 .0044 .0036 .0386 .0064 .0034 .0058 .0060 .0064 .0126 .0032 .0040 -.0002 .0034 .0032 .0190 For the s e r i a l covariance matrices, we f i n d that e s s e n t i a l l y a l l of the elements of a r e n o t s t a t i s t -i c a l l y s i g n i f i c a n t (at the .01 l e v e l ) over a monthly h o r i z o n . Hence, i n the f o l l o w i n g experiment we use ^12 = ^21 = ®' -"-s c o n s i s t e n t w i t h the e f f i c i e n t markets hypothesis, We estimate a v i a a l i n e a r r e g r e s s i o n (with zero constant term) using the round l o t f i g u r e s given i n appendix four ( f o r 100 to 600 shares), y i e l d i n g a = .233 . This p a r t i c u l a r d e r i v a t i o n of a i s somewhat a r b i t r a r y and i n p r a c t i c e the determination would depend upon w^ to obtain the range i n expected t r a n s -a c t i o n s . We assume t h a t there are f i v e elements i n the d i s c r e t i z a t i o n of . 1 6 9 P r o b a b i l i t y R e a l i z a t i o n . 0 5 0 0 ( . 7 7 5 7 6 6 . 8 2 7 0 5 2 . 7 2 3 1 0 9 . 8 5 6 1 4 8 . 8 1 2 8 5 2 ) . 2 9 1 3 ( . 8 8 5 6 0 6 . 9 1 5 9 5 9 . 8 4 9 9 3 1 . 9 2 8 5 4 9 . 9 0 1 7 5 9 ) . 3 1 7 4 ( 1 . 0 5 5 9 0 0 1 . 0 5 3 8 0 0 1 . 0 4 6 3 0 0 1 . 0 4 0 7 9 9 1 . 0 3 9 5 9 9 ) . 2 9 1 3 ( 1 . 2 2 6 1 9 3 1 . 1 9 1 6 4 0 1 . 2 4 2 7 6 9 1 . 1 5 3 0 4 9 1 . 1 7 7 4 4 0 ) . 0 5 0 0 ( 1 . 3 3 6 0 3 3 1 . 2 8 0 5 4 7 1 . 3 6 9 4 9 1 1 . 2 2 5 4 5 0 1 . 2 6 6 3 4 7 ) These f i g u r e s are generated by assuming the r e a l i z a t i o n s % ± 2 a w i t h p r o b a b i l i t y . 0 5 , % ± a with p r o b a b i l i t y . 2 9 1 3 and \ wit h p r o b a b i l i t y . 3 1 7 4 . To f a c i l i t a t e comparison w i t h the s t a t i c s o l u t i o n s from chapter two, we assume th a t the second p e r i o d wealth a l l o c a t i o n i s zero. An exponential u t i l i t y f u n c t i o n i s u t i l i z e d w i t h parameters 3 = . 5 , 1 . , 2 . , 3 . , 4 . , 5 . and 6 . Table 4 . 1 shows the s o l u t i o n s and the percentage e r r o r (as defined i n chapter two) obtained by using the s t a t i c (one peri o d s o l u t i o n ) and by using an evenly balanced p o r t f o l i o . Figure 4 . 3 p l o t s the proportions held i n each s e c u r i t y f o r the one and two peri o d cases against the parameter 6 . X . X . Percentage e r r o r s : s t a t i c s o l u t i o n Table 4.1 Revision s o l u t i o n s with q u a r t e r l y data and exponential u t i l i t y . .5 1. 2. 3. 4. 449427 .343517 .291819 .106432 .108379 ,550573 .656483 .708181 .673761 .583551 0000 219807 0731 .2575 .3395 308071 . 4304 ,042575 .068298 ,527895 .434597 429530 5464 497105 6037 o . 1.2304 1.0386 .8816 .7622 .8928 i o 1.0745 1.2214 171 Proportions 4, .9 " . 8 .5 1.0 2.0 3.0 4.0 5.0 6.0 Beta Figure 4.3: Revision and s t a t i c s o l u t i o n s , q u a r t e r l y data, exponential u t i l i t y . revision solutions - - - - static solutions 172 Summary Table 4.1 and f i g u r e 4.3 i n d i c a t e the f o l l o w i n g : - the r e v i s i o n s o l u t i o n has fewer p o s i t i v e holdings than the s t a t i c s o l u t i o n ( i . e . i s l e s s d i v e r s i f i e d ) ; - the percentage cash eq u i v a l e n t e r r o r a s s o c i a t e d w i t h the s t a t i c s o l u t i o n increases as the r i s k aversion increases but i s never very large (between 0 and .6%); - the balanced p o r t f o l i o (x*=(.2,.2,.2,.2,.2)) generates e r r o r s of approximately 1%. The f i r s t c o n d i t i o n i s i n t u i t i v e l y obvious. U n l i k e the r e v i s i o n s o l u t i o n , the s t a t i c s o l u t i o n doesn't consider the composition of f i n a l wealth, thereby i g n o r i n g the p o s s i b l e costs of r e v i s i n g p o r t f o l i o h o l d i n g s . A l s o , the s t a t i c s o l u t i o n s o f t e n have small holdings i n a number of secur-i t i e s , w h i l e the r e v i s i o n case tends to maintain l a r g e r p r o p o r t i o n s . The e r r o r a s s o c i a t e d w i t h the s t a t i c s o l u t i o n i s r e l -a t i v e l y s m a l l , but i t i s c l e a r t h a t the d i f f e r e n c e between the two techniques could be made more dramatic by consider-i n g higher t r a n s a c t i o n costs or s i g n i f i c a n t s e r i a l c o r r e l a -t i o n of s e c u r i t y r e t u r n s . The a l g o r i t h m used approximately 10 seconds of CPU time on an IBM 370/16 8. 173 Conclusions In t h i s chapter we have focused upon some of the s h o r t -comings of past approaches to the p o r t f o l i o r e v i s i o n problem and have t r i e d to show that a p p l i c a t i o n of the al g o r i t h m developed i n chapter three can be a t r a c t a b l e s o l u t i o n method. This a l g o r i t h m was shown to possess a number of computational s i m p l i f i c a t i o n s when a p p l i e d to the expected u t i l i t y problem ( 4 . 3 ) . In a d d i t i o n , the advantages of the MFW al g o r i t h m of chapter one are a l s o present. F i n a l l y , i n a b r i e f numerical experiment, we showed t h a t even i n the absence of s i g n i f i c a n t s e r i a l c o r r e l a t i o n of s e c u r i t y r e t u r n s , the r e v i s i o n s o l u t i o n o f f e r s b e t t e r s o l u t i o n s than the s t a t i c case. 174 Appendix 1: L i n e a r Programs with One C o n s t r a i n t and Upper Bounded V a r i a b l e s He assume our problem has the form max c'x s . t . xX> a' x = b x l * u l x < u n n = { i : xj, = 0 ) R 2 = { i : X i = U i ) —.no 0 - y e s yes Ctop-optima i\ solution.„ S e l e c t e n t e r i n g v a r i a b l e x^ s a t i s f y i n g [ max (-c .) ,1 min (-c .)I 1 e - c - a ' ^ r e c o r d b a s i s change 0 r e c o r d b a s i s change • r — > is \ < b / a > > = ' no XB " U B .. b = (b - u B ) / a ) c = C " C k a yes x k • \ b - b - V k 0~ 175 Appendix 2: The D i r e c t i o n a l D e r i v a t i v e We f i r s t r e w r i t e the expression (4.6) max E {max E u (E£9 . g,. p . (y, . +b .-s . ) } Y l > 0 K l (b,s ) > 0 5zI £ i subject to ^ P i ^ i i ~ w i = ^ E { ( 5 l i P i + a ) b i - ( 5 l i P i - a ) s i > - w2 = 0 s l " y l l £ 0 ' s n " y l n i 0 • We st a t e a theorem of Bertsekas [125] Theorem 4.2: I f f(x,£) i s a f i n i t e concave f u n c t i o n of x V£ , and i f f i s i n t e g r a b l e with respect to £, then the d i r e c t i o n a l d e r i v a t i v e of E ^ f ( x , 5 ) w i t h respect to d, e x i s t s and equals E ?f'(x,c:;d) . U t i l i z i n g t h i s r e s u l t and expression (3.5), the d i r e c -t i o n a l d e r i v a t i v e of (4.6) given i n the d i r e c t i o n d i s 176 (4.10) V' (y ;d) = r , y 2 1 E ^ { ( V y 1 E ^ l ^ U ( ' ) ) ' d + T ( V < b . s ) E 5 2 | ? 1 u ( - ) > , v } subject to V ( b ^ s ) { E ( 5 l i P i + a ) b i - ( C l i p i - a ) s i > -= - V y i { E ( 5 1 . P . + a ) b i - ( 5 l i P i - a ) s i } - * 2) • V ( b , s ) ( s l " y l l ) , v 1 - V Y l ( s l " y l l > ' d v ( b , s ) ( s n " y m ) , v 1 - v y i ( s n " llvIL < ra This s i m p l i f i e s to EE, (' E 5 2| 5,V (" ,' d +-P' E 5 !| 5 l' ( b,s,«'-"^ subject to £{ ( £, • p • +a) v. - (£.. .'p.-a) v ,.)} = 0 J l i ^ i l l i ^ i n+i v n + l i d l v„ < d 2n — n v l 1 R A V2n 1 R A This i s expression (4.8) 177 Appendix 3: Computing the C o n d i t i o n a l D i s t r i b u t i o n In our a p p l i c a t i o n we may consider the j o i n t d i s t r i b u t i o n of s e c u r i t y returns t o be given by E , normally d i s t r i b u t e d w i t h mean vector Xi,Kz and covariance matrix Z 1 l £ 1 2 Z \ 2 E 2 2 where each of the submatrices i s n by n; Z ^ i s the i t h p e r i o d covariance matrix; Z ^ i s the s e r i a l covariance matrix f o r p e r i o d one and two r e t u r n s . We re q u i r e f{Ki\Ki) t the c o n d i t i o n a l d e n s i t y of p e r i o d two r e t u r n s , given a p a r t i c u l a r p e r i o d one r e a l i z a t i o n . The general form i s given by (see Morrison [71] ) fu2|5i) = {{2n)hn\i22 - z i 2 ^ T l s 1 2 | h r 1 •exp{-y 52-!2-ZUZ7i 1 (E22-M2Z1 j z ^ ) " 1 • ( E 2 - E 2 - M 2 i : T l ( E 1 - E i ) ] } . This i s the den s i t y of a normal v a r i a t e with mean vector I2 + 2 1 2Z 1 1'(Ei - l i ) and covariance matrix Z22 - ^12^1 l^iz • Note t h a t f o r the d i f f e r e n t r e a l i z a t i o n s of Ei we need not r e i n v e r t any of the matrices. 178 Appendix 4:Transaction Costs Transaction costs as represented i n the diagram below do not f i t i n t o the usual mathematical programming framework, mainly because of the d i s c o n t i n u i t y at zero, but a l s o because of the nonconvexity. Within our model we use p r o p o r t i o n a l t r a n s a c t i o n costs and thus we f i t a l i n e , bx (x=number of shares t r a d e d ) , through the data as represented here. In p r a c t i c e , one would adjust the slope according to the volume of t r a n s a c t i o n t h a t would be expected from the i n i t i a l wealth l e v e l . I.e., a p o r t f o l i o of s mall s i z e would have the approximation as i n d i c a t e d i n the diagram, wh i l e a l a r g e r p o r t f o l i o would approach the slope at ten cents per share. Figure 4.4 gives a graph of the t r a n s a c t i o n cost versus the number of shares traded. The f i g u r e s are from Sherman, Ralston, Inc., N.Y. f o r May 1979 f o r a stock c o s t i n g $30. For stocks p r i c e d d i f f e r e n t l y the graph i s the same except f o r s p e c i a l discounts given on some small t r a n s -a c t i o n s . D i f f e r e n t firms were allowed to charge d i f f e r e n t rates- as of May 1, 19 75. g u r e 4 ' 4 : T h e t r a n s a c t i o n cost f u n c t i o n and approximati 180 REFERENCES 1. Abrams, R. A. and Karmarkar, U. S.,'Optimal M u l t i p e r i o d Investment-Consumption P o l i c i e s , ' Report 7830, Univ. of Chicago Center f o r Mathematical Studies i n Business and Economics, 1978. 2. A d l e r , M.,'On the Risk-Return Tradeoff i n the V a l u a t i o n of Assets,' JFQA, 4, No.4, 1969, 493-512. 3. Amihud, Y.,'A Note on Risk Aversion and I n d i f f e r e n c e Curves,' JFQA, 12, No.3, 1977, 509-514. 4. Amor, J.-P.,'On the I n i t i a l Rate of Convergence of Large-Step Methods of F e a s i b l e D i r e c t i o n s , ' D i s c u s s i o n Paper UCLA, 1971. 5. Armijo, L.''Minimization of Functions Having Continuous F i r s t P a r t i a l D e r i v a t i v e s , ' P a c i f i c J o u r n a l Of Math., 16, 1966, 1-3. 6. Arrow, K.J.,'Aspects of the Theory of Risk Bearing, Y r j o Jahnsson Foundation, H e l s i n k i , 1965. 7. ,'The Theory of Risk A v e r s i o n , ' i n Essays i n the Theory of Risk Bearing, ed. K.J.Arrow, Markham P u b l i s h i n g Co.,Chicago, 1971. 8. Bawa, V. S.,'Admissible p o r t f o l i o s f o r A l l I n d i v i d u a l s , ' J o u r n a l of Finance, 31, No.4, 1976, 1169-1183. 9. ,'Mathematical Programming of Admissible P o r t -f o l i o s , ' Management Science, 23, No.7, 1977, 779-784. 181 10. Benders, J . F . , ' P a r t i t i o n i n g Methods f o r S o l v i n g Mixed V a r i a b l e s Programming Problems,'Numerische Mathematik, 4, 1962, 238-252. 11. Berge, C.,Topological Spaces, O l i v e r and Boyd, E d i n -burgh, 1963. 12. Best, M.J.,'A F e a s i b l e Conjugate-Direction Method t o Solve L i n e a r l y Constrained M i n i m i z a t i o n Problems,' JOTA, 16, No.1-2, 1975, 25-38. 13. ,'A Method to A c c e l e r a t e the Rate of Convergence of a Class of O p t i m i z a t i o n Algorithms,' Math. Prog., 8, 1975, 139-160. 14. Canon, M.D. and Cullum, C.,'A Tight Upper Bound on the Rate of Convergence of the Frank-Wolfe Algorithm,' SIAM  Jo u r n a l on C o n t r o l , 6, No.2, 1968, 509-516. 15. Cauchy, A.,'Methode Generale pour l a R e s o l u t i o n des Systemes d'Equations Simultanees,' Comptes Rendus Hebdom-ad a i r e s , 25-2, 1847, 536-538. 16. Chen, A.H.Y., Jen, F.C. and Z i o n t s , S.,'The Optimal P o r t -f o l i o R e v i s i o n Strategy,' J o u r n a l of Business, 44, N o . l , 1971, 51-61. 17. Chipman, J.,'The Ordering of P o r t f o l i o s i n Terms of Mean and Variance,' RES, 40, 1973, 167-190. 18. C o n s t a n t i n i d e s , G.M.,'Optimal P o r t f o l i o R e v i s i o n w i t h Prop-o r t i o n a l Transaction Costs: Extensions to HARA U t i l i t y Func-t i o n s and Exogenous D e t e r m i n i s t i c Income,' Management  Science, 22, No.8, 1976, 921-923. I 1 182 19. Constantinides, G.M.,'Multiperiod Consumption and Invest-ment Decisions with Convex Transaction Costs,' Carnegie-Mellon Working Paper #15-76-77, Nov. 1977. 20. Constantinides, G.M. and M a g i l l , M.I.P.,'Portfolio Selec-tion with Transaction Costs,' JET, 13, 1976, 245-263. 21. Davidon, W.C.,'Variable Metric Method for Minimization,' AEC Research and Development Report, ANL-5990, 1959. 22. Demyanov, V.F. and Rubinov, A.M.,'The Minimization of a Smooth Convex Functional on a Convex Set,' SIAM Journal on  Control, 5, 1967, 280-294. 23. Dexter, A.S., Yu, J.N.W. and Ziemba, W.T.,'Portfolio Selec-t i o n i n a Lognormal Market when the Investor has a Power U t i l i t y Function,' i n Proceedings of the International  Conference on Stochastic Programming ed. M.A.H. Dempster, Academic Press, N.Y., 1979. 24. Dubois, J.,'Theorems of Convergence for Improved Nonlinear Programming Algorithms, 1 Operations Research, 21, 1973, 328-332. 25. Dyer, J.S.,'The Ef f e c t s . o f Errors i n the Estimation of the Gradient on the Frank-Wolfe Algorithm, with Implications for Interactive Programming,' Operations Research, 21, 1, 1974, 160-174. 26. Elton, E.J. and Gruber, M.J., Finance as a Dynamic Process, Prentice-Hall, Inc., Englewood C l i f f s , N.J., 1975. 183 27. E l t o n , E.J., Gruber, M.J. and Padberg, M.W.,1 Simple C r i t -e r i a f o r P o r t f o l i o S e l e c t i o n , ' J o u r n a l of Finance, 31, No.5, 1976, 1341-1357. 28. , ' Simple C r i t -e r i a f o r Optimal P o r t f o l i o S e l e c t i o n w i t h Upper Bounds,' Operations Research, 25, No.6, 1977, 952-967. 29. , 'Simple Rules f o r Optimal P o r t f o l i o S e l e c t i o n : the Multigroup Case,' JFQA, 12, No.3, 1977, 329-346. 30. E v e r i t t , R. and Ziemba, W.T.,'Two-Period S t o c h a s t i c Prog-rams wi t h Simple Recourse,' Operations Research, f o r t h -coming 19 79. 31. Fama, E . F . , ' M u l t i p e r i o d Consumption-Investment Decisions,* AER, 60, 1970, 163-174. 32. , ' M u l t i p e r i o d Consumption-Investment D e c i s i o n s : A C o r r e c t i o n , ' AER, 66, 1976, 723-724. 33. F l e t c h e r , R. and Reeves, CM. ,'Function M i n i m i z a t i o n by-Conjugate Gradients,' Computer J o u r n a l , 7, 1964, 149-154. 34. Forsythe, G.E., Malcolm, M.A. and Moler, C.B., Computer  Methods f o r Mathematical Computations, P r e n t i c e - H a l l , Inc., Englewood C l i f f s , N.J., 1977. 35. Fox, R.L., Lasdon, L.S., Tamir, A. and Ratner, M.,'An E f f -i c i e n t One-Dimensional Search Procedure,' Management Science, 22, N o . l , 1975, 42-50. 36. Frank, M. and Wolfe, P.,'An Algorithm f o r Quadratic Prog-ramming,' Naval Res, and Log. Q., 3, 1956, 95-110. 184 37. Freund, J.,'The Introduction of Risk into a Programming Model,* Econometrica, Vol.14, 1956, 253-263. 38. Garfinkel, R. and Nemhauser, G., Integer Programming, John Wiley, N.Y., 1974. 39. Hakansson, N.H.,'Optimal Investment and Consumption Dec-isions for a Class of U t i l i t y Functions,' Econometrica, 38, 1970, 587-607. 40. , ' On Optimal Myopic P o r t f o l i o P o l i c i e s with and without S e r i a l c o r r e l a t i o n of Yields,' Journal of Bus-iness, 44, 1971, 324-334. 41. Hildebrand, F.B., Introduction to Numerical Analysis, 2nd ed., McGraw-Hill, N.Y., 1974. 42. Hogan, W.W.,'Convergence Results for Some Extensions of the Frank-Wolfe Method,' WP16 9, WMSI, 19 71. 43. ,'Directional Derivatives for Extremal-Value Functions with Applications to the Completely Convex Case,' Operations Research, 21, 1973, 188-209. 44. ,'Point-to-Set Maps i n Mathematical Programming,' SIAM Review, 1973, 591-603. 45. Holloway, C.A.,'An Extension of the Frank-Wolfe Method of Feasible Directions,' Math. Prog., 6, 1974, 14-27. 46. Huard, P.,'Optimization Algorithms and Point-to-Set Maps,' Math. Prog., 8, 1975, 308-331. 4 7 . Jacobsen, S.,'On Zangwill's Necessary and S u f f i c i e n t Condit-ions for Convergence,' Management Science, 18, No.l, 1971, 111-112. 185 48. Kamin, J.H.,'Optimal P o r t f o l i o Revision with Proportional Transaction Costs,' Management Science, 21, No.11, 1975, 1263-1271. 49. Katoaka, K.,'A Stochastic Programming Model,' Econometrica, 31, 1963, 181-196. 50. Klessig, R.,'A Convergence Theory for Optimization Algor-ithms using Approximations,' paper presented at the 1973 Decision and Control Conference, San Diego, Ca. 51. ,'A General Theory for Convergence of Constrained Optimization Methods that Use Antizigzagging Provisions,' SIAM Journal on Control, 12, No.4, 1974, 598-608. 52. Kle s s i g , R. and Polak, E.,'A Method of Feasible Directions Using Function Approximations with Applications to Min-Max Problems,' Journal of Mathematical Analysis and Applications, 41, 1973, 583-602. 53. Krylov, V.I., Approximate Calculation of Integrals, Mac-Mi l l a n , N.Y., 1962. 54. Lasdon, L.S., Optimization Theory for Large Systems, Mac-Mi l l a n , N.Y., 1970. 55. Leland, H., "Comment on Mukherjee and Zabel,' i n Essays on  Economic Behavior Under Uncertainty, M.S. Balch, D.L. McFadden and S.Y. Yu eds., North-Holland, 1974, 184-190. 56. Litzenberger, R.H.,'Discussion of Blume and Friend,' Journal  of Finance, 30, No.2, 1975, 624-629. 57. Luenberger, D.G., Introduction to Linear and Nonlinear Prog- ramming, Addison-Wesley, Reading, 1973. 186 58. M a g i l l , M.J.P. and Co n s t a n t i n i d e s , G.M.,'Portfolio S e l e c t i o w i t h Transaction Costs,' JET, 13, No.2, 1976, 245-263. 59. Mangasarian, 0.L.,'Convexity, Pseudo-Convexity and Quasi-Convexity of Composite Functions,* Cahiers du Centre d'  Etudes de Recherche O p e r a t i o n e l l e , 12, 1970, 114-122. 60. , Nonlinear Programming, McGraw-Hill, N.Y., 1969 . 61. Markowitz, H.M.,'Portfolio S e l e c t i o n , ' J o u r n a l of Finance, 7, N o . l , 1952, 77-91. 62. , P o r t f o l i o S e l e c t i o n : E f f i c i e n t D i v e r s i f - i c a t i o n of Investments, John Wiley and Sons, Inc. N.Y., 1959. 63. Martos, M., Nonlinear Programming: Theory and Methods, North-Holland, Amsterdam, 1975. 64. Mavrides, L.P.,'An Implementable Frank-Wolfe Algorithm App-l i e d to a Banking Problem,' INFOR, 16, No.2, 1978, 189-195. 65. Merton, R.C.,'Lifetime P o r t f o l i o S e l e c t i o n Under U n c e r t a i n t y : the Continuous-Time Case,' Review of Economics and S t a t i s t i c s , 51, 1969, 247-257. 66. , 'Optimum Consumption and P o r t f o l i o Rules i n a Continuous-Time Model,' JET, 3, 1971, 373-413. 67. Meyer, G.G.L.,'Accelerated Frank-Wolfe A l g o r i t h m s , 1 SIAM  Jo u r n a l on C o n t r o l , 12, No.4, 1974, 655-663. 68. Meyer, R.R.,'A Comparison of the F o r c i n g Function and P o i n t -to-Set Mapping Approaches to Convergence A n a l y s i s , ' SIAM  J o u r n a l on C o n t r o l , 15, No.4, 1977, 699-715. 69. ,'A Convergence Theory f o r a Class of Anti-Jamming Strategies,'JOTA, 1977. 187 70. M i l l e r , S.M.,'Measures of Risk Aversion: Some C l a r i f y i n g Comments,' JFQA, 10, No.2, 1975, 299-310. 71. Morrison, D.F., M u l t i v a r i a t e S t a t i s t i c a l Methods, 2nd ed., McGraw H i l l , N.Y., 1976. 72. Mossin, J.,'Optimal M u l t i p e r i o d P o r t f o l i o P o l i c i e s , ' J o u r n a l  of Business, 41, No.2, 1971, 215-229. 73. , The Theory of F i n a n c i a l Markets, P r e n t i c e - H a l l , Inc., Englewood C l i f f s , N.J., 1975. 74. Mukherjee, R. and Zabel, E.,'Consumption and P o r t f o l i o Choices w i t h Transaction Costs,' i n Essays on Economic Behavior Under  U n c e r t a i n t y , M.S. Balc h , D.L. McFadden and S.Y. Wu eds., North-Holland, 1974, 157-184. 75. Neave, E.H., ' M u l t i p e r i o d Consumption-Investment Decisions and Risk Preference,' JET, 3, 1971, 40-53. 76. Ohlson, J.A.,'The Asymptotic V a l i d i t y of Quadratic U t i l i t y as the Trading I n t e r v a l Approaches Zero,' i n 122. 77. , ' P o r t f o l i o S e l e c t i o n i n a Log-Stable Market,' JFQA, 10, No.2, 1975, 285-298. 78. , ' Quadratic Approximations of the P o r t f o l i o S e l -e c t i o n Problem when Means and Variances of Returns are I n f i n i t e , ' Management Science, 23, No.6, 1977, 576-584. 79. Phelps, E.S.,'The Accumulation of Risky C a p i t a l : A Sequent-i a l U t i l i t y A n a l y s i s , ' Econometrica, 32, 1964, 122-136. 80. Pogue, G.A.,'An Intertemporal Model f o r Investment Manage-ment,' J o u r n a l of Bank Research, 1, 1970, 17-33. 188 81. Polak, E.,'On the Convergence of Optimization Algorithms,' Revue Francaise d*Information et de Recherche Operationelle, 7 , 1968, 43-59. 82. ,'On the Implementation of Conceptual Algorithms,' in Nonlinear Programming, J.B. Rosen, O.L. Mangasarian and K. R i t t e r eds., Academic Press, N.Y., 1970. 83. , Computational Methods i n Optimization, Academic Press, N.Y., 1971. 84. Powell, M.J.D.,'Some Global Convergence Properties of a V a r i -able Metric Algorithm for Minimization without Exact Line Searches,' i n Nonlinear Programming, R.W. Cottle and C.E. Lemke eds., AMS, Providence, 19 76. 85. Pratt, J.W.,'Risk Aversion i n the Small and i n the Large, 1  Econometrica, 32, No.1-2, 1964, 122-130. 86. Rockafellar, R.T., Convex Analysis, Princeton Univ. Press, 1970. 87. Rosen, J.,'The Gradient Projection Method for Nonlinear Prog-ramming, I. Linear Constraints,* SIAM Journal, 8, 1960, 181-217. 88. Roy, A.,'Safety-First and the Holding of Assets,' Economet-r i c a , 20, 1952, 431-449. 89. Rubinstein, M.E., "The Fundamental Theorem of Parameter Pref-erence,' JFQA, 90. ,*A Comparitive Stat i c s Analysis of Risk Premiums,'Journal of Business, 12, 1975, 605-615. 91. , * The Valuation of Uncertain Income Streams and the P r i c i n g of Options,' B e l l Journal of Economics, 7, No.2, 1976, 407-425. 189 92. Samuelson, P.A.,1 The Fundamental Approximation Theorem of P o r t f o l i o A n a l y s i s i n Terms of Means, Variances and Higher Moments,' RES, 37, No.4, 1970, 537-542. 93. S c h i t t k o w s k i , K.,'An Adaptive P r e c i s i o n Method f o r Nonlinear O p t i m i z a t i o n Problems,' SIAM J o u r n a l on C o n t r o l , 17, N o . l , 1979, 82-98. 94. Sengupta, J . K . , ' S a f e t y - F i r s t Rules Under Chance-Constraint L i n e a r Programming,' Operations Research,17, 1969, 112-132. 95. Shanno, D.F.,1 Conjugate Gradient Methods wi t h Inexact Searches,' Math, of O.R., 3, No.3, 1978, 244-256. 96. Sharpe, W.,'A S i m p l i f i e d Model f o r P o r t f o l i o A n a l y s i s , ' Management Science, 9, 1963, 277-293. 97. ,'A L i n e a r Programming Approximation f o r the General P o r t f o l i o S e l e c t i o n Problem,' JFQA, 6, 1971, 1263-1276. 98. , P o r t f o l i o Theory and C a p i t a l Markets, McGraw H i l l , N.Y., 1973. 99. S l a t e r , M.L.,'Lagrange M u l t i p l i e r s R e v i s i t e d : A C o n t r i b u t i o n to Nonlinear Programming,' Cowles Commision Dis c u s s i o n Paper, Math 403, 1950. 100. Smith, K.V.,'A T r a n s i t i o n Model f o r P o r t f o l i o R e v i s i o n , ' J o u r n a l of Finance, 22, 1967, 425-439. 101. , P o r t f o l i o Management, H o l t , Rinehart and Winston, Inc., 1971. 102. S t e i n , C.,'Estimation of the Mean of a M u l t i v a r i a t e Normal D i s t r i b u t i o n , ' i n Proc. of the Prague Symposium on  Asymptotic S t a t i s t i c s . 190 103. Stoer, J . and W i t z g a l l , C. , Convexity and Opt i m i z a t i o n i n F i n i t e Dimensions I , Spr i n g e r - V e r l a g , B e r l i n , 1970. 104. Stone, B.K.,'A Li n e a r Programming Formulation of the General P o r t f o l i o S e l e c t i o n Problem,' JFQA, 8, No.4, 1973, 621-636. 105. Stroud, A.H., Approximate C a l c u l a t i o n of M u l t i p l e I n t e g r a l s , P r e n t i c e - H a l l , Inc., Englewood C l i f f s , N.J., 1971. 106. Stroud, A.H. and Secres t , D., Gaussian Quadrature Formulas, P r e n t i c e - H a l l Inc., Englewood C l i f f s , N.J., 1966. 107. T e l s e r , L . , ' S a f e t y - F i r s t and Hedging,' RES, 23, 1955, 1-6. 108. Tishyadhigama, S., Polak, E. and K l e s s i g , R.,'A Comparitive Study of Several General convergence Conditions f o r A l g o r -ithms Modeled by P o i n t - t o - S e t Maps,' U.C. Berkeley, UCB/ ERL, M77/78, 1977. 109. Tobin, J . E . , ' L i q u i d i t y Preference as Behavior Towards Risk,' RES, 26, 1958. 110. Topkis, D.M. and V e i n o t t , A.F.,'On the Convergence of Some F e a s i b l e D i r e c t i o n Algorithms f o r Nonlinear Porgramming,' SIAM Jo u r n a l on C o n t r o l , 5, No.2, 1967, 268-279. 111. W i l l a r d , S., General Topology, Addison-Wesley, Reading, 1970. 112. Winkler, R.L. and Barry, C.B.,'A Bayesian Approach to P o r t -f o l i o S e l e c t i o n and R e v i s i o n , ' IASA Working paper, 1974. 113. Wolfe, P.,'Convergence Conditions f o r Ascent Methods,' SIAM Review, 11, 1969, 226-235. 191 114. Wolfe, P.,'Convergence Theory i n Nonlinear Programming,' i n Integer and Nonlinear Programming, J. Abadie ed., North-Holland, Amsterdam, 19 70. 115. Zabel, E.,'Consumer Choice, P o r t f o l i o Decisions and Trans-action Costs,' Econometrica, 41, No.2, 1973, 321-335. 116. Zangwill, W.I.,'Convergence Conditions for Nonlinear Prog-ramming Algorithms,' Management Science, 16, No.l, 1969, 1-13. 117. , Nonlinear Programming: A Unified Approach, Prentice-Hall, Inc., Englewood C l i f f s , N.J., 1972. 118. Ziemba, W.T.,'Choosing Investment P o r t f o l i o s when the Returns Have a Stable D i s t r i b u t i o n , ' i n 122. 119. , ' The Behavior of a Firm Subject to Stochastic Regulatory Review,' B e l l Journal of Economics, 5, 1974, 710-712. 120. ,'Multiperiod Consumption-Investment Decisions: Further Comments,' AER, 67, No.4, 1977, 766-767. 121. Ziemba, W.T., Parkan, C. and Brooks-Hill, R.,'Calculation of Investment P o r t f o l i o s with Risk-Free Borrowing and Lending,' Management Science, 21, 1974, 209-222. 122. Ziemba, W.T. and Vickson, R.G. eds., Stochastic Optimization  Models i n Finance, Academic Press, N.Y., 1975. 123. Zoutendijk, G., Methods of Feasible Directions, E l s e v i e r , Amsterdam, 1960. 124. , Mathematical Programming Methods, North-Holland, Amsterdam, 1976. 19 2 125. Bertsekas, D . P . S t o c h a s t i c O p t i m i z a t i o n Problems w i t h N o n d i f f e r e n t i a b l e Cost F u n c t i o n a l s , ' J o u r n a l of Optimiz- a t i o n Theory and A p p l i c a t i o n s , 12, 1973, 218-231. 126. B l a t t b e r g , R.C., and Gonedes, N.J.,'A Comparison of the Stable and Student D i s t r i b u t i o n s as S t a t i s t i c a l Models f o r Stock P r i c e s , ' J o u r n a l of Business, 1974, 244-280. 12 7. Fama, E.F.,'The Behavior of Stock Market P r i c e s , ' J o u r n a l of Business, 38, 1965, 34-105. 12 8. G e o f f r i o n , A.M.,'Elements of Large-Scale Mathematical Programming Par t I , ' Management Science, 16, 19 70, 652-6 75. 129. Hakansson, N.H.,'Capital Growth and the Mean-Variance Approach to P o r t f o l i o S e l e c t i o n , ' JFQA, 6, 1971, 517-557. 130. Hannan, E.J., M u l t i p l e Time S e r i e s , John Wiley and Sons, N.Y., 1970. 131. Keeney, R.L. and R a i f f a , H., Decisions w i t h M u l t i p l e  O b j e c t i v e s ; Preferences and Value T r a d e o f f s , John Wiley and Sons, N.Y., 1976. 132. Latane, H.A.,'An Optimum Growth P o r t f o l i o S e l e c t i o n Model,' i n Mathematical Methods i n Investment and  Finance, G. Szego and K. S h e l l eds., North-Holland P u b l i s h i n g , Amsterdam, 19 72. 133. Lee, S.L., and L e r r o , A . O p t i m i z i n g P o r t f o l i o S e l e c t i o n f o r Mutual Funds,' J o u r n a l of Finance, 28, 19 73, 1087-1101. 19 3 134. Mandelbrot, B.,'The V a r i a t i o n of C e r t a i n S p e c u l a t i v e P r i c e s , ' J o u r n a l of Business, 36, 1963, 394-419. 135. ,'New Methods i n S t a t i s t i c a l Economics,' J o u r n a l of P o l i t i c a l Economy, 61, 1963. 136. Ohlson, J.A., and Ziemba, W.T.,'Portfolio S e l e c t i o n i n a Lognormal Market when the Investor has a Power U t i l i t y Function,' JFQA, 10, 19 75. 137. P a r i k h , S.C.,'Lecture Notes f o r the Course i n S t o c h a s t i c Programming,' Dept. I.E.O.R., U.C. Berkeley, 1968. 138. Press, S.J.,'A Compound Events Model f o r S e c u r i t y P r i c e s , ' J o u r n a l of Business, 40, 1967, 317-335. 

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0094923/manifest

Comment

Related Items