UBC Theses and Dissertations

Essays in empirical asset pricing Smith, Daniel Robert 2002

Essays in Empirical Asset Pricing

by Daniel Robert Smith

B. Bus. (Hons), Queensland University of Technology, 1996
M. Bus., Queensland University of Technology, 1999

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Doctor of Philosophy in The Faculty of Graduate Studies, Finance Division, Faculty of Commerce and Business Administration

We accept this thesis as conforming to the required standard

The University of British Columbia, July 2002
© Daniel Robert Smith, 2002

In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Finance, Faculty of Commerce and Business Administration, The University of British Columbia, Vancouver, Canada

Abstract

This thesis consists of two essays which contribute to different but related aspects of the empirical asset pricing literature. The common theme is that incorrect restrictions can lead to inaccurate decisions. The first essay demonstrates that failure to account for the Federal Reserve experiment can lead to incorrect assumptions about the explosiveness of short-term interest rate volatility, while the second essay demonstrates that we need to incorporate skewness to develop models that adequately account for the cross-section of equity returns.

Essay 1 empirically compares the Markov-switching and stochastic volatility diffusion models of the short rate. The evidence supports the Markov-switching diffusion model.
Estimates of the elasticity of volatility parameter for single-regime models unanimously indicate an explosive volatility process, whereas the Markov-switching models' estimates are reasonable. We find that either Markov-switching or stochastic volatility, but not both, is needed to adequately fit the data. A robust conclusion is that volatility depends on the level of the short rate. Finally, the Markov-switching model is the best for forecasting. A technical contribution of this paper is a presentation of quasi-maximum likelihood estimation techniques for the Markov-switching stochastic-volatility model.

Essay 2 proposes a new approach to estimating and testing nonlinear pricing models using GMM. The methodology extends the GMM-based conditional mean-variance asset pricing tests of Harvey (1989) and He et al. (1996) to include preferences over moments higher than variance. In particular we explore the empirical usefulness of the conditional coskewness of an asset's return with the market return in explaining the cross-section of equity returns. The methodology is both flexible and parsimonious. We avoid modelling any asset-specific parameters and avoid making restrictive assumptions on the dynamics of co-moments. By using GMM to estimate the models' parameters we also avoid making any assumptions about the distribution of the data. The empirical results indicate that coskewness is useful in explaining the cross-section of equity returns, and that both covariance and coskewness are time varying. We also find that the usefulness of coskewness is robust to the inclusion of Fama and French's (1993) SMB and HML factor returns.

There is an ongoing debate in the empirical asset pricing literature comparing the SDF and beta methodologies. This paper's technique is a conditional version of the beta methodology, which turns out to be directly comparable with the SDF methodology with only minor modifications.
Our SDF version imposes the CAPM's restrictions that the coefficients in the pricing kernel are known functions of the moments of market returns, which are modelled using macro-variables. We find that the SDF implied by the three-moment CAPM provides a better fit in this data set than the current practice of parameterizing the coefficients on market returns in the SDF. This has an interesting application to the current SDF versus beta methodology debate.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Preface
Acknowledgements

Essay 1: MSSV Interest Rate Models
1.1 Introduction
1.2 Models of Interest Rate Volatility Dynamics
1.2.1 Diffusion Models
1.2.2 Generalized ARCH Models
1.2.3 Stochastic Volatility
1.2.4 Markov Switching
1.2.5 Markov-Switching Stochastic Volatility
1.3 Data
1.4 Empirical Results
1.4.1 Basic Diffusion Model
1.4.2 Stochastic Volatility
1.4.3 The Markov-Switching Diffusion Model
1.4.4 A Markov-Switching Stochastic Volatility Diffusion Model
1.5 Comparing the Models
1.6 Conclusions
1.A Appendix: Quasi-Maximum Likelihood Estimation
1.A.1 Stochastic Volatility Model
1.A.2 Markov-Switching Models
1.A.3 Markov-Switching Stochastic Volatility Models
1.A.3.1 Smoothing
References

Essay 2: Conditional Coskewness and Asset Prices
2.1 Introduction
2.2 Model Development and Motivation
2.2.1 The Two-Moment CAPM
2.2.1.1 An Empirical Specification
2.2.1.2 Modelling Mean and Variance, or Mean and Price of Covariance Risk
2.2.2 The Three-Moment CAPM
2.2.2.1 An Empirical Implementation
2.2.2.2 An Alternative Empirical Specification
2.2.3 Multi-Moment Extension
2.3 Data and Estimation Methodology
2.4 Empirical Results
2.5 Multi-Factor Asset Pricing
2.6 Three Factors or Three Moments? or Both??
2.7 Further Specification Tests
2.7.1 Time Varying Alphas
2.7.2 Structural Breaks
2.8 Comparison with Dittmar (2001)
2.9 Conclusions
2.A Appendix: Conditional Asset Pricing
References

List of Tables

1.1 Summary Statistics and Diagnostic Tests on 30-day Treasury Bill yields and Residuals From AR(1) Model
1.2 Parameter Restrictions Imposed by Different Volatility Models
1.3 Parameter Estimates of Mean Equation
1.4 Maximum Likelihood Parameter Estimates of CKLS Model
1.5 Parameter Estimates of CKLS Model
1.6 Parameter Estimates of Stochastic Volatility Model
1.7 Parameter Estimates of Markov-Switching Model
1.8 Parameter Estimates of Markov-Switching Stochastic Volatility Model
1.9 Diagnostic Tests on the Standardized Residuals
1.10 In-Sample and Out-of-Sample Specification Tests
2.1 Portfolio Return Predictability
2.2 Predictability of Conditional Covariance
2.3 Predictability of Conditional Coskewness
2.4 Parameter Estimates of the Two-Moment Model
2.5 Parameter Estimates of Full Three-Moment Model
2.6 Two- and Three-Moment CAPM Parameter Estimates: Fixed Weights
2.7 Parameter Estimates of the Sign-Corrected Three-Moment CAPM
2.8 Parameter Estimates of the Fama-French Three-Factor Asset Pricing Model
2.9 Parameter Estimates of Fama-French's Three-Factor Model with Fixed Weighting Matrix
2.10 Parameter Estimates of the Combined Three-Factor and Three-Moment Asset Pricing Model
2.11 Comparing Various Specifications of the Asset Pricing Models
2.12 Comparing Various Specifications of the Asset Pricing Models: Equally Weighted Moments
2.13 Lagrange Multiplier Tests for Time-Varying Intercepts: Industry Portfolios
2.14 Testing for Structural Breaks
2.15 Stochastic Discount Factor Estimation of Industry Data
List of Figures

1.1 Annualized 30-Day Treasury Bill Yields, July 1962 to December 1996
1.2 Change in Annualized 30-Day Treasury Bill Yields, July 1962 to December 1996
1.3 Conditional Standard Deviation from Diffusion Model, July 1962 to December 1989
1.4 Conditional Standard Deviation from the Stochastic Volatility Diffusion Model, July 1962 to December 1989
1.5 Conditional Standard Deviation from the Markov-Switching Diffusion Model, July 1962 to December 1989
1.6 Plot of Probabilities of Low-Volatility Regime in the Markov-Switching Diffusion Model, July 1962 to December 1989
1.7 Conditional Standard Deviation from the Markov-Switching Stochastic Volatility Diffusion Model, July 1962 to December 1989
1.8 Plot of Probabilities of Low-Volatility Regime in the Markov-Switching Stochastic Volatility Diffusion Model, July 1962 to December 1989

Preface

A paper based on Essay 1 of this thesis, "Markov-Switching and Stochastic Volatility Diffusion Models of Short-Term Interest Rates", was published in April 2002 in the Journal of Business and Economic Statistics (Vol. 20, No. 2, pp. 183-197).

Acknowledgements

I would like to acknowledge the love and support of my wife Nicole. I thank my thesis supervisory committee Murray Carlson, John Cragg, Glen Donaldson, and Allan Kraus for their excellent advice and guidance throughout the program. I thank Kai Li for providing the data used in Essay 1, and Ken French for providing the data used in Essay 2 online at http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html. I thank seminar participants at the University of British Columbia and the Queensland University of Technology for comments on Essay 1, and Simon Fraser University for comments on Essay 2. The comments of Jeff Wooldridge and an anonymous referee greatly improved the paper.
This research would not have been possible without the financial support of the University of British Columbia.

Essay 1: Markov-Switching and Stochastic Volatility Diffusion Models of Short-Term Interest Rates¹

1.1  Introduction

A very important research area of asset pricing is modeling the term structure. Most of the recent term-structure models were developed in continuous time. There are two main approaches to modeling the term structure in continuous time: the no-arbitrage approach and the general equilibrium approach. Vasicek (1977) showed how to price zero-coupon default-free bonds of different maturities using the no-arbitrage approach used in the option pricing of Black and Scholes (1973) for a given stochastic process for the spot interest rate. The approach prices all bonds on the basis of a finite number of state variables. By contrast, the general equilibrium approach, by Cox, Ingersoll, and Ross (1985), prices the term structure in equilibrium. These models specify the dynamic behavior of exogenous production factors (in continuous time) and the preferences of a representative consumer, and they derive the interest rate and prices of all assets, including bonds, endogenously. In both these approaches it is critical to the pricing of bonds to specify the stochastic behavior of short-term, or ideally spot, interest rates.

Most previous term-structure models used diffusion models of the spot interest rate in pricing bonds. However, some recent research, most notably by Ball and Torous (1999) and Gray (1996), indicates that this basic model may be inappropriate. Ball and Torous argued that the coefficient of the lagged interest rate (raised to some power) should be allowed to be time varying. Furthermore, Gray and others argued that U.S.
short-term interest rates should be characterized by a nonlinear regime-shifting model to account for changes in economic regime brought about by factors such as the Federal Reserve experiment in the early 1980s and the OPEC oil crises in the late 1970s.

This essay builds on this research in several directions. It presents a new approach to modeling short-term interest rates: a Markov-switching diffusion model. There are several motivations for this. First, different economic regimes appear to govern the level of interest rate volatility, and the Markov-switching diffusion model is able to account for these explicitly. Supporting this view, Duffee (1993) found evidence of a temporary structural break in the interest rate series during the Federal Reserve experiment of 1979-1982. Second, the point estimates of the elasticity of variance with respect to the level of interest rates from single-regime diffusion models indicate an explosive volatility process. Duffee (1993) found that the very high estimate reported by Chan, Karolyi, Longstaff, and Sanders (1992) (about 1.5 when most other models impose elasticities in the range of 0 to 1) is attributable to their failing to account for the structural break. This essay draws a similar conclusion. The single-regime models have point estimates of volatility elasticity that are comparable with the Chan et al. (1992) estimate, while the regime-switching model estimates are more reasonable, at around 1. The Markov-switching diffusion model examined in this essay takes account of both effects: the change in the level of interest rate volatility and the spuriously high elasticity of volatility.

¹ This essay has been published in the Journal of Business and Economic Statistics, April 2002, Vol. 20, No. 2, pp. 183-197.

An alternative diffusion model that allows the coefficient on lagged interest rates to change through time is the stochastic volatility model of Ball and Torous (1999).
We estimate this model and compare its performance to that of the Markov-switching model. We are unable to reject one model in favor of the other using the nonnested hypothesis testing procedure of Vuong (1989). Similarly, neither model can be rejected in favor of the more general Markov-switching stochastic volatility model that nests them both. Both models performed similarly in forecasting tests; however, the evidence slightly favored the Markov-switching model over the stochastic volatility model. Also, when the standardized residuals of all models were examined for goodness of fit, the two models performed comparably but the Markov-switching diffusion model again outperformed all other models. Finally, the results from the Markov-switching and Markov-switching stochastic volatility models indicate that the stochastic volatility model overestimates the size of the elasticity of volatility with respect to the level of interest rates. Given this body of evidence we favor the Markov-switching model over the stochastic volatility model as a parsimonious representation of the dynamic behavior of U.S. short-term interest rates.

These results are broadly consistent with Naik and Lee (1997). They developed an analytic expression for the term structure of interest rates with time-varying volatility in two contexts: stochastic volatility and regime switching. Naik and Lee (1997) found that the regime-switching model is better able to reproduce the observed term structure than is the stochastic volatility model. The regime-switching and stochastic volatility models analysed by Naik and Lee (1997) exclude a diffusion term (that is, they set $\gamma = 0$). This essay rejects this assumption, and this rejection is robust to the different volatility specifications considered herein.
Our conclusion is that term-structure models that allow volatility to be both regime dependent and a function of the level of the short rate are worth consideration.

On a technical note, this essay demonstrates how to estimate Markov-switching stochastic volatility models using quasi-maximum likelihood techniques. The benefits of quasi-maximum likelihood procedures are that they are straightforward to apply and are less computationally intensive than the current Bayesian estimation procedure. However, the estimates may not be as efficient as those from the Bayesian technique, because the quasi-maximum likelihood procedure involves approximating a log-chi-squared random variable with a normal random variable with the same mean and variance. The Markov-switching stochastic volatility model is used to compare the diffusion models with Markov-switching and stochastic volatility parametrically. It nests both the Markov-switching and the stochastic volatility model. Combining Markov-switching and stochastic volatility adds very little over the simpler models.

The remainder of the essay is organized as follows. Section 1.2 discusses model development. Section 1.3 discusses the data used. Section 1.4 presents the empirical results. Section 1.5 compares the Markov-switching and stochastic volatility models, and section 1.6 concludes.

1.2  Models of Interest Rate Volatility Dynamics

This section discusses the basic types of model that have been used to explain short-term interest rate dynamics. The first type of model is the diffusion model that is predominantly used in building term-structure models. The second type of model is the autoregressive conditional heteroscedasticity (ARCH) model that has proved useful for modeling the dynamics of the second moment of many financial time series. The next three models are extensions of the basic diffusion model: the first allows for stochastic volatility, the second allows for Markov switching in the behavior of volatility, and the third includes both effects simultaneously.
1.2.1  Diffusion Models

Most term-structure models assume that short-term interest rates evolve over time as some type of diffusion process. The beauty of the diffusion model is that the instantaneous change in the short rate can be characterized as a stochastic differential equation (SDE), and Ito calculus can then be utilized to characterize the term structure. This basic approach is used in both the arbitrage pricing and the general equilibrium approaches to pricing the term structure. Chan et al. (1992) (CKLS) showed that many of the specific SDEs used in the literature can be written as special cases of the following general SDE:

$$dr_t = (a + b r_t)\,dt + \sigma r_t^{\gamma}\,dB_t, \tag{1.1}$$

where $B_t$ is a standard Brownian motion.

CKLS were concerned with calibrating this general SDE econometrically to evaluate the appropriateness of these competing models for the short rate. The exact functional form of the short rate SDE is of critical importance for models of the term structure. For example, Vasicek (1977) used an arbitrage argument to derive a partial differential equation for bond prices. His derivation was sufficiently general to allow for any diffusion type of SDE for the short rate, and he then proceeded to derive closed-form bond prices for the special case of an Ornstein-Uhlenbeck process for the short rate. However, the following quote by Vasicek (1977, p. 185) illustrates the importance of estimating the appropriate SDE for the short rate:

    In the absence of empirical results on the character of the spot rate process, this specification serves only as an example.

However, this empirical work was not pursued until much later. Most theoretical term-structure research imposed ad hoc structures on the SDE, and there was no consistency between the different models. It was not until the work of CKLS that financial economists were concerned with empirically calibrating models of the short rate.
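To make the role of the parameters in the general SDE (1.1) concrete, a path of the short rate can be simulated with an Euler scheme. The following is a minimal sketch and not part of the thesis: the function name `simulate_ckls`, the time step, the positivity floor, and every parameter value are assumptions chosen purely for illustration.

```python
import numpy as np

def simulate_ckls(r0, a, b, sigma, gamma, n_steps, dt=1.0, seed=0):
    """Euler path of dr = (a + b r) dt + sigma r^gamma dB (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    r = np.empty(n_steps + 1)
    r[0] = r0
    for t in range(n_steps):
        # Brownian increment over a step of length dt
        shock = rng.standard_normal() * np.sqrt(dt)
        step = (a + b * r[t]) * dt + sigma * r[t] ** gamma * shock
        # Floor at a small positive value so r ** gamma stays defined;
        # this floor is an assumption of the sketch, not of the model.
        r[t + 1] = max(r[t] + step, 1e-8)
    return r

# Illustrative parameter values only (not estimates from the thesis).
path = simulate_ckls(r0=5.0, a=0.5, b=-0.1, sigma=0.05, gamma=1.5, n_steps=400)
print(path[:5])
```

Setting `gamma=0` gives constant-volatility (Ornstein-Uhlenbeck type) dynamics of the kind Vasicek used, while larger values of `gamma` make volatility increasingly sensitive to the level of the rate.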
To empirically calibrate the general SDE, CKLS employed the following simple discretization of (1.1):

$$\Delta r_t = a + b r_{t-1} + \sigma r_{t-1}^{\gamma} \varepsilon_t, \tag{1.2}$$

where $\Delta r_t = r_t - r_{t-1}$ and $\varepsilon_t$ is a standard normal random variable. They estimated the parameters of this model by using the generalized method of moments (GMM) estimation technique of Hansen (1982). They found that the short rate is mean reverting, and that the elasticity of volatility parameter was 1.4999 (the standard error was 0.2519). The elasticity parameter indicates that the volatility of short-term interest rates is explosive.

Other research includes the work of Broze, Scaillet and Zakoian (1995), who used maximum-likelihood-based procedures and the indirect inference technique of Gourieroux, Monfort, and Renault (1993) to account for the discretization bias, which they found to be very small. Another approach by Ait-Sahalia (1996) estimates the density of discrete changes in the spot rate implied by various continuous-time models, and compares these with the empirical distribution of the discrete changes in the spot rate.

1.2.2  Generalized ARCH Models

The ARCH model was introduced by Engle (1982), and later extended by Bollerslev (1986), who developed the generalized ARCH, or GARCH, model. In a GARCH(1,1) model, the conditional mean and conditional variance of a time series process are modeled simultaneously,

$$r_t = a + b r_{t-1} + \varepsilon_t,$$

where the conditional variance of $\varepsilon_t$ is given by $E[\varepsilon_t^2 \mid \psi_{t-1}] = h_t$:

$$h_t = \omega + \alpha \varepsilon_{t-1}^2 + \beta h_{t-1}.$$

GARCH models are able to capture the very important volatility clustering phenomena that have been documented in many financial time series, including short-term interest rates (see Bollerslev, Chou, and Kroner, 1992), as well as their leptokurtosis. Note that in GARCH models the volatility is a deterministic function of lagged volatility estimates and lagged squared forecast errors.
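The GARCH(1,1) variance recursion described above can be sketched in a few lines. This is an illustrative sketch only: the function names, the starting value `h0`, and the numbers used are assumptions, not values from the thesis. The second function uses the standard result that, because the expected squared forecast error equals the conditional variance, multi-step variance forecasts in a GARCH(1,1) model evolve at rate alpha + beta.

```python
import numpy as np

def garch11_variance(resid, omega, alpha, beta, h0):
    """Conditional variances h_1..h_T from residuals e_0..e_{T-1}, given h_0."""
    h = np.empty(len(resid) + 1)
    h[0] = h0
    for t, e in enumerate(resid):
        # h_{t+1} depends only on quantities known at time t
        h[t + 1] = omega + alpha * e ** 2 + beta * h[t]
    return h

def variance_forecast(h_next, omega, alpha, beta, horizon):
    """Forecasts E[h_{t+j} | info at t] for j = 1..horizon."""
    out = [h_next]
    for _ in range(horizon - 1):
        out.append(omega + (alpha + beta) * out[-1])
    return np.array(out)

# Hypothetical parameters: a persistence of 0.9 versus one slightly above 1.
print(variance_forecast(1.0, 0.1, 0.20, 0.70, 5))
print(variance_forecast(1.0, 0.1, 0.33, 0.70, 5))
```

With persistence below one the forecasts settle at omega / (1 - alpha - beta); with persistence above one they grow without bound as the horizon lengthens.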
One problem with GARCH models of the short rate is that the parameter estimates suggest that the volatility process is explosive. Bollerslev (1986) demonstrates that the variance process is covariance stationary when $\left|\sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j\right| < 1$. Note that it is usually assumed that $\alpha_i, \beta_j > 0 \;\forall i, j$ to ensure that the conditional volatility is nonnegative, so the usual case where $\sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j < 1$ is considered. If this inequality is violated, then shocks to the volatility process are persistent or explosive. If the sum of the coefficients equals 1, then the process is integrated GARCH (IGARCH). If the sum of coefficients is strictly greater than 1, then a shock to volatility is explosive, and $\lim_{j \to \infty} E[h_{t+j} \mid \psi_t] = \infty$. Parameter estimates of GARCH(1,1) models fitted to short-term interest rates typically indicate an explosive process for the conditional volatility, or $\alpha + \beta > 1$. For example, Gray (1996) reported, using weekly 30-day treasury-bill data, that $\alpha + \beta = 1.0303$, and Engle, Ng, and Rothschild (1990) found that $\alpha + \beta = 1.0096$ for a portfolio of treasury-bills. It is interesting that Gray (1996) found that the implied persistence of volatility within a regime in his two-state regime-switching GARCH model is much lower than in the single-regime GARCH model. He found that $\alpha + \beta = 0.6586$ in the high-volatility regime and $\alpha + \beta = 0.4340$ in the low-volatility regime, compared with 1.0303 in the single-regime model. This is because there is high persistence in the state of the economy that the single-regime model interprets as persistence in volatility. This is a common finding in the regime-switching interest rate literature.

1.2.3  Stochastic Volatility

The stochastic volatility model used in this essay allows log-volatility to itself evolve stochastically over time.
This is in direct contrast with the GARCH type models, which model volatility as a deterministic function of lagged squared forecast errors and lagged conditional volatility. The stochastic volatility model is parsimonious and yet flexible, and has been successfully applied to a range of financial time series including short-term interest rates (Ball and Torous, 1999), exchange rates (Taylor, 1986; and Harvey, Ruiz, and Shephard, 1994), and stock prices (Hsieh, 1991; Harvey and Shephard, 1996; Sandmann and Koopman, 1998; and So, Lam, and Li, 1998). Most stochastic volatility models are set in discrete time, and this essay follows this convention.

We follow Ball and Torous (1999), who presented their stochastic volatility model as a simple extension of the discrete-time diffusion models of the type presented in (1.2):

$$\Delta r_t = a + b r_{t-1} + \sigma_t r_{t-1}^{\gamma} \varepsilon_t. \tag{1.3}$$

As the time subscript on $\sigma$ in equation (1.3) indicates, the generalization employed allows the volatility to be time varying. The model allows log-volatility to evolve stochastically as a simple first-order autoregressive process:

$$\log \sigma_t^2 = \omega + \phi \log \sigma_{t-1}^2 + \eta_t, \tag{1.4}$$

where $\eta_t \sim \text{i.i.d. } N(0, \sigma_\eta^2)$. The disturbance term $\eta_t$ in (1.4) makes the process stochastic: the variance itself is subject to random shocks. This process is parsimonious and able to capture interesting dynamics. Note that GARCH models can be derived as the discrete time limit of a continuous-time stochastic volatility model but that the discrete-time stochastic volatility model considered here is more direct.

One of the simplest procedures available for estimating stochastic volatility models of this type is the quasi-maximum likelihood procedure of Harvey et al. (1994). This approach is based on a simple transformation of the residual in (1.3), which allows one to write the system in state-space form and apply the Kalman filter to recursively build up the quasi-likelihood function.
The method involves transforming the residual $e_t = \Delta r_t - a - b r_{t-1}$. By taking the log of the squared residual, one obtains

$$\log e_t^2 = \log \sigma_t^2 + 2\gamma \log r_{t-1} + \log \varepsilon_t^2,$$

because $e_t = \sigma_t r_{t-1}^{\gamma} \varepsilon_t$. New notation simplifies this expression: $y_t = \log e_t^2$, which is observable given the observed changes in interest rates and the parameters $a$ and $b$; and $x_t = \log \sigma_t^2$ is the state variable, log-volatility. Using this notation, we can rewrite the system in state-space form as

$$y_t = x_t + 2\gamma \log r_{t-1} + \log \varepsilon_t^2, \tag{1.5}$$

$$x_t = \omega + \phi x_{t-1} + \eta_t. \tag{1.6}$$

The Kalman filter is an iterative procedure that forecasts the state variable one period into the future by a linear projection and then updates this forecast when the observation on the variable $y_t$ becomes available. If the disturbance terms are both Gaussian, the linear projection is also the conditional expectation, and the conditional expectation and its mean squared error are all that is required to describe the conditional density. In this case, the Kalman filter enables the construction of the exact likelihood function, and then full maximum likelihood estimation is possible. However, in this case the disturbance term for the observation equation (1.5) is non-Gaussian. In fact it is distributed as a log-chi-squared random variable with one degree of freedom, which has mean $E[\log \varepsilon_t^2] = -1.2704$ and variance $\mathrm{Var}[\log \varepsilon_t^2] = \pi^2/2$. The quasi-maximum likelihood procedure of Harvey, Ruiz, and Shephard (1994) uses the quasi-likelihood function, which uses the likelihood function of a normal random variable that has the same mean and variance as $\log \varepsilon_t^2$:

$$y_t = x_t + 2\gamma \log r_{t-1} - 1.2704 + \xi_t, \tag{1.7}$$

where $\xi_t$ is treated as a normal disturbance with mean zero and variance $\pi^2/2$.

The Kalman filtering equations and likelihood function are presented in Appendix 1.A. This approach provides a criterion that when maximized results in consistent and asymptotically normal parameter estimates if the model is correctly specified (White, 1982b; and Bollerslev and Wooldridge, 1992).
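The quasi-likelihood construction described above can be sketched with a textbook scalar Kalman filter applied to the state-space system (1.5)-(1.7). This is a hedged illustration rather than the author's implementation (the thesis gives its own filtering equations in Appendix 1.A): the function name `sv_quasi_loglik`, the argument layout, and the choice to start the filter from the stationary moments of the state (which requires |phi| < 1) are all assumptions of this sketch.

```python
import numpy as np

LOG_CHI2_MEAN = -1.2704          # mean of log(eps^2) for standard normal eps
LOG_CHI2_VAR = np.pi ** 2 / 2    # variance of log(eps^2)

def sv_quasi_loglik(y, log_r_lag, gamma, omega, phi, sigma_eta):
    """Gaussian quasi-log-likelihood of the SV state-space form.

    y         : log squared residuals, y_t = log e_t^2
    log_r_lag : log r_{t-1}, aligned with y
    Assumes |phi| < 1 so the state can start from its stationary moments.
    """
    x_pred = omega / (1.0 - phi)                 # stationary mean of x_t
    p_pred = sigma_eta ** 2 / (1.0 - phi ** 2)   # stationary variance of x_t
    loglik = 0.0
    for y_t, lr in zip(y, log_r_lag):
        # One-step prediction error and its variance
        v = y_t - (x_pred + 2.0 * gamma * lr + LOG_CHI2_MEAN)
        f = p_pred + LOG_CHI2_VAR
        loglik += -0.5 * (np.log(2.0 * np.pi * f) + v ** 2 / f)
        # Update the state estimate, then project one period ahead
        k = p_pred / f
        x_filt = x_pred + k * v
        p_filt = (1.0 - k) * p_pred
        x_pred = omega + phi * x_filt
        p_pred = phi ** 2 * p_filt + sigma_eta ** 2
    return loglik
```

In practice the returned value would be passed, with its sign flipped, to a numerical optimizer over the parameters; any such optimizer choice is outside what the text specifies.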
We can use the central limit theorem of Dunsmuir (1979) to establish the consistency and asymptotic normality of the resulting parameter estimates.

A number of alternative estimation techniques to quasi-maximum likelihood exist in the literature. These include the Bayesian technique of Jacquier, Polson, and Rossi (1994), maximum likelihood procedures by Fridman and Harris (1998), the maximum likelihood Monte Carlo method of Sandmann and Koopman (1998), and the efficient method of moments by Andersen, Chung, and Sørensen (1999). However, these techniques are usually very computationally intensive, in contrast to the relatively straightforward quasi-maximum likelihood technique.

1.2.4  Markov Switching

In recent years econometricians have modeled various economic time series as coming from one of a number of regime-switching time series. In these models it is assumed that the distribution of the variable is known, conditional on a particular regime or state occurring. When the economy changes from one regime to another, a substantial change occurs in the series. Hamilton (1989) presented the Markov regime-switching model in which the unobserved regime evolves over time as a first-order Markov process. This type of modeling makes sense in a time-series context.

Markov-switching models have proven to be quite useful in modeling a range of economic time series, including the business cycle (Hamilton, 1989; Goodwin, 1993; Durland and McCurdy, 1994; Filardo, 1994; and Smith, 2001), the stock market (Hamilton and Susmel, 1994; Schaller and van Norden, 1997; and Turner, Startz and Nelson, 1989), exchange rates (Engle and Hamilton, 1990) and short-term interest rates (Cai, 1994; and Gray, 1996). The empirical usefulness of Markov-switching models of macro-economic variables that are related to interest rates, and the significant evidence for the validity of Markov-switching models of short-term U.S.
interest rates, motivate this research.

Cai (1994) fitted a Markov-switching ARCH model to the excess returns of the 3-month treasury-bill over the 30-day treasury-bill and found two periods of excessively high interest rate volatility: the OPEC oil shock in 1974 and the monetarist experiment of the Federal Reserve between 1979 and 1982. Cai (1994) fitted only an ARCH model because regime-switching GARCH models are inherently path dependent. This is because the conditional variance today depends on the conditional variance yesterday, which, in turn, depended on previous conditional variances. Thus the conditional variance today depends explicitly on all previous states. However, both Gray (1996) and Dueker (1997) have since developed two approximations that overcome the path-dependence problem.

Gray (1996) fitted a generalized regime-switching model that allowed for GARCH effects as well as a diffusion term. He found evidence in his weekly 30-day treasury-bill dataset that two regimes, a high-volatility and a normal-volatility regime, are necessary to adequately characterize the dynamics of short-term interest rates. He also found that he needed both GARCH terms as well as the diffusion term (which roughly takes the place of the intercept in the conditional variance equation) in his model. The economy was in the high-volatility regime in three periods: the OPEC oil shock, the Federal Reserve experiment, and a brief period in 1987 corresponding to the stock market crash. Duffee (1993) found evidence of a structural break (associated with the monetarist experiment) and concluded that the high elasticity of variance reported by CKLS was due to their failure to account for this break. As reported in the following, even the stochastic volatility model suffers from this problem.

Regime-switching models allow the economy to be in any of a finite number of distinct regimes at any point in time.
The regime completely governs the dynamic behavior of the series. This implies that once we condition on a particular regime occurring, and assume a particular parameterization of the model, we can write down the density of the variable of interest. In Markov-switching models, the regime is strictly unobservable by the econometrician, who must therefore draw statistical inference regarding the likelihood of each regime occurring at any point in time. In this particular parameterization it is necessary to define the transition probabilities from regime $j$ to regime $i$ at time $t$ as $p_{ij} = \Pr[S_t = i \mid S_{t-1} = j]$. Note that in specifying the parameters of such a model with $K$ regimes, we need to include only $K(K-1)$ parameters because there are $K$ redundant transition probabilities, $\Pr[S_t = K \mid S_{t-1} = j] = 1 - \sum_{i=1}^{K-1} \Pr[S_t = i \mid S_{t-1} = j]$.

It is also possible to allow for time-varying transition probabilities. For example, Durland and McCurdy (1994) allowed the probability of remaining in a regime to decline the longer the economy had been in that regime. In other words, the longer the economy is in one regime, the more likely it is to change out of that regime. Another approach was taken by Filardo (1994) and Gray (1996), who allowed the transition probability to be a function of some other variables. For example, Gray (1996) made the transition probability a function of the log-interest rate. The function is the cumulative distribution function of a standard normal random variable and hence is guaranteed to be contained within the unit interval.

This essay presents a Markov-switching diffusion model of the type in (1.2), but allows the unconditional volatility to change between regimes. This is done by allowing $\sigma$ in the basic discretized diffusion model (1.2) to change between two states:

$$\Delta r_t = a + b r_{t-1} + \sigma_i r_{t-1}^{\gamma} \varepsilon_t, \tag{1.8}$$

where $i \in \{1, 2\}$ is an index of the regime at time $t$.
We choose to fit two regimes since we expect to find, as have previous studies, that the Federal Reserve experiment and the oil shock are fundamentally different in their dynamic behavior from the rest of the sample period. This difference in behavior arises because the experiment resulted in a different economic regime, and a similar explanation can be given for the oil shock. Because we are fitting two regimes, we need to fit two transition probabilities: $p = \Pr[S_t = 1 \mid S_{t-1} = 1]$ and $q = \Pr[S_t = 2 \mid S_{t-1} = 2]$.

As noted, in Markov-switching models the regime is unobserved, and the econometrician must infer which regime occurred at each time period. Hamilton (1989) developed a filter that allows the econometrician to infer the probability of the regime at each point in time iteratively. A useful byproduct of this algorithm is the log-likelihood, which can be numerically maximized to obtain the maximum likelihood estimates of the parameters. Appendix 1.A.2 presents Hamilton's filter along with a brief discussion. Kim (1994) presented a simple backwards recursive filter that allows inference to be made regarding the regime at each point in time using the entire sample of observations. His smoothing algorithm is also discussed in Appendix 1.A.2.

This model should outperform the basic diffusion model because the parameters of the basic single-regime diffusion model are estimated under the assumption that there is only one regime. If there are two regimes, say, a high- and a low-volatility regime, simply assuming a constant volatility during the sample period will systematically overestimate the volatility in the low-volatility regime and systematically underestimate the volatility in the high-volatility regime. Therefore the Markov-switching diffusion model is expected to outperform the basic single-regime diffusion model of CKLS.
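The filtering recursion described above can be sketched as follows. This is a minimal illustration of a Hamilton-type filter for the two-regime diffusion in (1.8), assuming Gaussian shocks, residuals from the mean equation that are already available, and the ergodic distribution as the starting point. The function name and argument layout are hypothetical and are not the implementation of Appendix 1.A.2.

```python
import numpy as np

def hamilton_filter(e, r_lag, sigma, gamma, p, q):
    """Filtered regime probabilities and log-likelihood for two regimes.

    e      : residuals of the mean equation
    r_lag  : lagged interest rate r_{t-1}, same length as e
    sigma  : the two regime volatilities
    p, q   : Pr[S_t=1 | S_{t-1}=1] and Pr[S_t=2 | S_{t-1}=2]
    """
    e = np.asarray(e, dtype=float)
    r_lag = np.asarray(r_lag, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # P[i, j] = Pr[S_t = i | S_{t-1} = j]; each column sums to one.
    P = np.array([[p, 1.0 - q],
                  [1.0 - p, q]])
    # Assumption: initialise with the ergodic regime probabilities.
    prob = np.array([1.0 - q, 1.0 - p]) / (2.0 - p - q)
    filtered = np.empty((e.size, 2))
    loglik = 0.0
    for t in range(e.size):
        pred = P @ prob                      # Pr[S_t = i | data to t-1]
        sd = sigma * r_lag[t] ** gamma       # regime-specific std. deviation
        dens = np.exp(-0.5 * (e[t] / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))
        joint = pred * dens
        lik = joint.sum()                    # density of e_t given the past
        loglik += np.log(lik)
        prob = joint / lik                   # Bayes update
        filtered[t] = prob
    return filtered, loglik
```

The returned log-likelihood is the quantity that would be passed to a numerical optimizer; a smoother in the spirit of Kim (1994) would run a second, backward pass over `filtered`.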
Because we are interested in directly comparing the Markov-switching model with the stochastic volatility model, instead of estimating (1.8) we employ the same transformation used in the stochastic volatility model:

$$y_t = \omega_i + 2\gamma \log r_{t-1} + \log \varepsilon_t^2, \qquad (1.9)$$

where $\omega_i = \log \sigma_i^2$ and $y_t$ is defined previously.

1.2.5 Markov-Switching Stochastic Volatility

The final model estimated in this essay is the Markov-switching stochastic volatility (MSSV) model. This model is a generalization of the stochastic volatility model and the Markov-switching model, which are both special cases of MSSV. This is useful on a number of levels. First, it enables the models to be compared. If the Markov-switching restriction is rejected against the MSSV model but the stochastic volatility restriction is not, then we can conclude that the data support the stochastic volatility model; the converse is also true. Another benefit is that it helps determine which features of the data are specific to which model. Specifically, the Markov-switching and stochastic volatility diffusion models give different estimates of the elasticity of variance $\gamma$. The estimate of $\gamma$ is important because if $\gamma > 1$, then the interest rate process is explosive, while if $\gamma < 1$, it is not. Our regime-switching models of interest rates find estimates of $\gamma$ that are in the stable region, while all the single-regime models indicate that interest rates are explosive. To determine which of these is spurious, we can compare the estimate from the MSSV model with the two estimates and draw some conclusions on which model is preferred based on this. The specific form of the MSSV model estimated is

$$y_t = x_t + 2\gamma \log r_{t-1} + \log \varepsilon_t^2, \qquad x_t = \omega_i + \phi x_{t-1} + \eta_t. \qquad (1.10)$$

This nests the stochastic volatility model when $\omega_1 = \omega_2$, although the transition probabilities $p$ and $q$ (defined as in the Markov-switching model) are not identified.
This model also nests the Markov-switching model when $\sigma_\eta^2 = 0$. However, $\phi$ is not identified under this null hypothesis. When a parameter is unidentified under the null hypothesis, standard hypothesis testing is not valid because the information matrix is singular, and this violates one of the assumptions used to derive the asymptotic distribution of the likelihood ratio, Wald, and Lagrange multiplier test statistics.

A similar model was developed by So et al. (1998) that employed the computationally burdensome Bayesian technique of Jacquier et al. (1994). In this essay, however, we employ quasi-maximum likelihood procedures. This procedure has the benefit of being straightforward to apply and is very simple computationally. The quasi-maximum likelihood procedure, outlined in Appendix 1.A.3, uses the Kalman filtering equations derived by Kim (1994), and the state-space representation above, to calculate the quasi-likelihood function, which is then numerically maximized. Issues related to filtering and smoothing are discussed in the Appendix. The derivation of the filtering equations is more general than the case used in this essay, and allows for all the parameters (that is, $\omega$, $\phi$ and $\sigma_\eta^2$) to be regime dependent. In the model estimated in this essay only the unconditional volatility is modeled as being regime dependent (that is, only $\omega_i$ is different in each regime $i$). This was done to keep the model as parsimonious as possible.

1.3 Data

The data for this study consist of monthly observations on 30-day U.S. treasury bills from June 1964 to December 1996. Duffee (1996) presented an interesting discussion on some of the problems associated with using the treasury bill to proxy for the risk-free short-term interest rate.
Yet, this study uses 30-day treasury-bill data to ensure comparability with most previous studies that attempted to model the short rate (see, for example, CKLS; Ball and Torous, 1999; Gray, 1996). To maintain comparability with CKLS and Ball and Torous (1999) we use data from June 1964 to December 1989 in all in-sample calculations and hold the remaining data (January 1990 to December 1996) for out-of-sample analysis.

The treasury-bill data are plotted in Figure 1.1 and the first difference of the interest rate is plotted in Figure 1.2. Several interesting observations can be made. First, Figure 1.2 shows the well-documented volatility clustering phenomenon in financial time series: large changes, of either sign, are typically followed by large changes, and small changes follow previous small changes. Second, there appear to be two regimes in the volatility. The 1979-1982 period is clearly more volatile than other periods, and the period in late 1974, corresponding to the oil crisis, also seems a little more volatile than other periods. October 1987 also contains a large change in the interest rate, but this is an isolated incident; interest rate changes either side of it are of normal size. This indicates that it may be useful to allow for a two-regime Markov-switching model that allows the volatility in the Federal Reserve experiment and the oil crisis to be different from the volatility in other periods. Also, the periods of higher volatility are related to periods of higher interest rates, so it is possible that a diffusion model, in which volatility is a function of the level of interest rates, accounts for this effect.

Table 1.1 contains some summary statistics and diagnostic tests of the treasury-bill data. There is clear evidence of autocorrelation and ARCH effects in both the level of the interest rate series and in the residual of a first-order autoregressive model (also the residuals later used in the stochastic volatility modeling).
This indicates that we need to take account of time-varying heteroscedasticity. Even accounting for ARCH effects leaves clear evidence of autocorrelation in the residuals (see the Box-Pierce ARCH-adjusted portmanteau statistics).

[Figure 1.2: Change in the treasury-bill yield (%)]

Table 1.1: Summary Statistics and Diagnostic Tests on 30-day Treasury Bill Yields and Residuals From AR(1) Model

Skewness and kurtosis are calculated as the standardized third and fourth central moments, respectively. The standard errors of the skewness and kurtosis statistics are reported below the statistic and are calculated as $\sqrt{6/T}$ and $\sqrt{24/T}$, respectively. The Bera-Jarque statistic tests for normality by checking that the skewness equals 0 and the kurtosis equals 3, and is distributed as a chi-squared variable with two degrees of freedom. $BP_k$ tests for $k$-th order autocorrelation using the Box-Pierce portmanteau test statistic, $BP_k^{ARCH}$ tests for $k$-th order autocorrelation using the Box-Pierce portmanteau test statistic corrected for ARCH following Diebold (1986), and $BP_k^2$ tests for autocorrelation in the squared deviation from the mean (roughly equivalent to testing for ARCH) out to $k$ lags. All portmanteau test statistics are distributed as chi-squared random variables with $k$ degrees of freedom.

Statistic           $r_t$        $e_t = \Delta r_t - a - b r_{t-1}$
Mean                6.6587       -0.0015
Variance            6.9120       1.2533
Skewness            1.2097       -1.1038
(Standard Error)    0.1398       0.1403
Kurtosis            4.3588       11.6429
(Standard Error)    0.2796       0.2805
Bera-Jarque         98.4936      1011.2436
$BP_1$              276.5656     60.7781
$BP_{12}$           2221.6082    91.4565
$BP_1^{ARCH}$       72.0637      15.4548
$BP_{12}^{ARCH}$    786.5749     25.2592
$BP_1^2$            264.5250     23.1056
$BP_{12}^2$         1991.4313    98.7721

1.4 Empirical Results

This section reports the results of fitting a number of different empirical models to short-term interest rates.
To ensure that the playing field is level, all the models are estimated as special cases of the MSSV model. Therefore, all models are of the basic form

$$y_t = x_t + 2\gamma \log r_{t-1} - 1.2704 + \xi_t$$

but each imposes different restrictions on the parameters. These restrictions are summarized in Table 1.2. The reason for this approach is consistency. Our technique for estimating the stochastic volatility and MSSV models requires the model to be written in state-space form, and then the parameters are estimated by quasi-maximum likelihood. The CKLS and Markov-switching models, however, can be estimated by full maximum likelihood. By also writing these two models in state-space form, we use the same approximation as in the stochastic volatility and MSSV models. This guarantees that our results are not biased against either the stochastic volatility or the MSSV model and therefore cannot be attributed to the use of different estimation techniques.

Table 1.2: Parameter Restrictions Imposed by Different Volatility Models

This table presents the restrictions imposed by the various diffusion models considered in this essay. The restrictions are on the parameters of the model $x_t = \omega_i + \phi x_{t-1} + \eta_t$, where $i \in \{1, 2\}$ is the index of the regime occurring at time $t$ and $\eta_t \sim \text{iid } N(0, \sigma_\eta^2)$. Note that NI refers to a parameter that is not identified in this model; these parameters have no effect on the log-likelihood.

Model    $\omega_2$               $\phi$    $\sigma_\eta^2$
CKLS     $\omega_2 = \omega_1$    NI        0
SV       $\omega_2 = \omega_1$    free      free
MS       free                     NI        0
MSSV     free                     free      free

Thus, all models require the residuals from the model $\Delta r_t = a + b r_{t-1} + e_t$, because the observable variable used in the estimation of all models is $y_t = \log e_t^2$. Ordinary least squares is used to obtain consistent estimates of the mean parameters $a$ and $b$. These parameters are reported in Table 1.3. The parameter estimates are broadly consistent with the GMM estimates of CKLS.
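The constant $-1.2704$ in the basic form is the mean of $\log \varepsilon_t^2$ when $\varepsilon_t$ is standard normal, that is, the mean of the logarithm of a chi-squared variable with one degree of freedom, $-\gamma_E - \log 2$ with $\gamma_E$ Euler's constant; the corresponding variance is $\pi^2/2$. The short check below, offered as a sketch under that normality assumption, evaluates the constant and builds the observable $y_t = \log e_t^2$ from a rate series. The helper name `log_squared_residuals` is hypothetical.

```python
import numpy as np

# Mean and variance of log(eps^2) for eps ~ N(0, 1).
mean_log_chi2 = -np.euler_gamma - np.log(2.0)
var_log_chi2 = np.pi ** 2 / 2.0

def log_squared_residuals(r, a, b):
    """Return y_t = log(e_t^2) with e_t = (r_t - r_{t-1}) - a - b * r_{t-1}.

    a and b are the mean-equation estimates (for example from OLS).
    A residual that is exactly zero would give -inf and would need
    separate handling; this sketch does not address that case.
    """
    r = np.asarray(r, dtype=float)
    e = np.diff(r) - a - b * r[:-1]
    return np.log(e ** 2)

# Monte Carlo cross-check of the analytical moments.
rng = np.random.default_rng(0)
draws = np.log(rng.standard_normal(200_000) ** 2)
print(round(mean_log_chi2, 4), round(var_log_chi2, 4))
print(draws.mean(), draws.var())
```

Subtracting this constant leaves a measurement error $\xi_t$ with mean zero, which is what allows the Kalman filter to be applied to the linear state-space form even though $\xi_t$ is not Gaussian.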
Both sets of estimates indicate that short-term interest rates are mean reverting ($b < 0$), although the mean reversion implied by the point estimate of $b$ is much lower than in CKLS. The GMM estimates of CKLS imply that the unconditional mean of the spot treasury-bill interest rate is 0.0689, whereas the ordinary least squares estimates imply 6.8436 percent. The difference is simply due to the use of percentages rather than decimals as in CKLS; therefore, the results are comparable.

Table 1.3: Parameter Estimates of Mean Equation

Ordinary least squares estimates are of the discretized diffusion model $\Delta r_t = a + b r_{t-1} + e_t$. The standard errors are estimated using the procedure of White (1980a) and are therefore consistent in the presence of unspecified heteroscedasticity.

Parameter    Estimate    Standard error
a            0.3476      (0.1842)
b            -0.0508     (0.0322)

1.4.1 Basic Diffusion Model

Table 1.4 reports the maximum likelihood estimates of the CKLS model. These results indicate significantly less mean reversion than implied by the GMM estimates of CKLS. The estimate of the elasticity of volatility with respect to the level of interest rates is very similar to the CKLS GMM estimate, well in excess of unity. Finally, note that the coefficient on lagged volatility is lower when estimated by maximum likelihood than when estimated by GMM, although it is comparable with some of the restricted models. However, neither estimation technique can reject the null hypothesis that $\sigma^2 = 0$.

A striking result is the comprehensive rejection of the nondiffusion model in favor of the diffusion alternative. The robust Lagrange multiplier (RLM) test statistic for this null hypothesis ($\gamma = 0$) is 39.5349, which is distributed as a chi-squared random variable with one degree of freedom. We can thus reject the nondiffusion restriction at any normal significance level. One of the more popular term structure models is by Cox et al.
(1985) (CIR), who set $\gamma = 0.5$, which implies a square-root process for interest rates. Gray (1996), in his generalized regime-switching model, chose to set $\gamma = 0.5$ to conform with CIR. To determine how appropriate this restriction is, a restricted form of the diffusion model is estimated in which this restriction is imposed while all other parameters are estimated freely. This restricted specification is estimated for all the other types of model considered in this essay.

Table 1.4: Maximum Likelihood Parameter Estimates of CKLS Model

Parameter     Estimate    Standard error
a             0.2216      (0.0909)
b             -0.0308     (0.0177)
$\sigma^2$    0.0031      (0.0012)
$\gamma$      1.3066      (0.1001)

Table 1.5: Parameter Estimates of CKLS Model

Parameter estimates and robust standard errors are presented for a basic diffusion and a constant variance model of interest rate volatility. Both models are of the general form $y_t = \omega + 2\gamma \log r_{t-1} - 1.2704 + \xi_t$. In the diffusion model, $\gamma$ is estimated as a free parameter; in the nondiffusion model, $\gamma$ is restricted to equal 0.

Parameter         Diffusion model    Nondiffusion model    Restricted model
$\omega$          -6.7227            -1.4198               -3.2464
                  (0.1540)           (0.1470)              (0.6896)
$\gamma$          1.4515             0                     0.5
                  (0.1788)
Log-Likelihood    -715.2661          -750.3347             -730.3362
Robust LM         -                  39.5349               21.5914

The RLM test statistic for this restriction takes the value 21.5914, which is highly significant given that it has a one-degree-of-freedom chi-squared distribution. Thus the basic model rejects both the no-diffusion and the CIR restrictions on the relationship between the level of interest rates and interest rate volatility. Figure 1.3 plots the conditional standard deviation from the basic diffusion model against the absolute residual.
Although the periods of high volatility associated with the Federal Reserve experiment and the oil crisis are also periods of relatively high interest rates, these are periods when the basic diffusion model underpredicts interest rate volatility. This is likely because there are some periods of high interest rates that are not associated with higher volatility, for example the late 1980s and the early 1990s. The basic diffusion model assumes that the relationship between the level of interest rates and volatility is the same through time, yet this is clearly not the case. For this reason it may be profitable to consider models that allow this relationship to change over time, as do the models considered in the remainder of this section.

1.4.2 Stochastic Volatility

As discussed previously, Ball and Torous (1999) presented an extension of CKLS that allows the log-volatility to evolve stochastically over time. The results of fitting such a stochastic volatility model using the quasi-maximum likelihood method are presented in Table 1.6. The nondiffusion null hypothesis is rejected in the stochastic volatility model. The relevant RLM statistic is 7.9221 (also distributed as chi-squared with one degree of freedom), which indicates a clear rejection of the null hypothesis of no diffusion. The CIR restricted model is also rejected by the data, because the test statistic for this hypothesis is 4.4348, which has a p-value of 0.035.

Table 1.6: Parameter Estimates of Stochastic Volatility Model

Parameter estimates and robust standard errors are presented for three types of stochastic volatility model of interest rate volatility. All models are of the general form $y_t = x_t + 2\gamma \log r_{t-1} - 1.2704 + \xi_t$, where $x_t = \omega + \phi x_{t-1} + \eta_t$. In the diffusion model, $\gamma$ is estimated as a free parameter; in the nondiffusion model, $\gamma$ is restricted to equal 0.
Parameter            Diffusion model    Nondiffusion model    Restricted model
$\omega$             -3.3789            -0.0884               -0.2717
                     (3.9599)           (0.0539)              (0.1808)
$\phi$               0.4943             0.9407                0.9172
                     (0.5612)           (0.0313)              (0.0536)
$\sigma_\eta^2$      0.8418             0.1991                0.1765
                     (1.1635)           (0.1265)              (0.1552)
$\gamma$             1.4407             0                     0.5
                     (0.2153)
Log-Likelihood       -710.3453          -719.8013             -715.5016
Robust LM            -                  7.9221                4.4348

The null hypothesis that is used in comparing the basic diffusion model with the more general stochastic volatility model is that $\sigma_\eta^2 = 0$. However, under this null hypothesis, $\phi$ is no longer identified and can take literally any value without changing the likelihood function. For this reason, the information matrix becomes singular and standard asymptotic distribution theory no longer applies. Therefore, we are no longer able to assert that the Wald and likelihood ratio test statistics are asymptotically chi-squared. There are techniques for handling hypothesis testing when some parameters are not identified under the null hypothesis (Davies 1977, 1987), but these are extremely computationally intensive, so we calculate the empirical p-values of the standard likelihood ratio test.

The log-likelihood of the stochastic volatility model is a sizable improvement over the basic diffusion model: the likelihood ratio statistic, calculated in the usual fashion, is equal to 9.8416. The asymptotic distribution of this statistic is unknown, but the empirical p-value was calculated at 0.043 using a Monte Carlo study with 1000 replications. The basic diffusion model is thus rejected in favor of the stochastic volatility model, which supports the similar conclusion reached in Ball and Torous (1999).

Figure 1.4 plots the conditional standard deviation from the stochastic volatility model against the absolute residual. When we compare this with the basic diffusion model in Figure 1.3, the plots are very similar. Both models underpredict the size of volatility in both the oil crisis and the Federal Reserve experiment. However, although it is difficult to see with the naked eye, it seems that the predicted volatility from the stochastic volatility model is a better fit than that from the basic diffusion model. This issue is explored in more detail in Section 1.5.

1.4.3 The Markov-Switching Diffusion Model

One of the first issues to be established is whether the parameters describing the conditional mean dynamics, $a$ and $b$, should be allowed to be regime dependent. To examine this issue, two nested Markov-switching diffusion models were estimated and a likelihood ratio test was performed on the null hypothesis that $a$ and $b$ are the same in both regimes. The more general model allows $a$ and $b$ to depend on which regime the economy is in, whereas the null model restricts $a$ and $b$ to be the same in both regimes. The likelihood ratio statistic for the null hypothesis of mean parameter regime independence was calculated as 0.6843, which is asymptotically distributed as a chi-squared random variable with two degrees of freedom because two parameter restrictions are imposed under the null hypothesis. Thus, the null hypothesis that the mean parameters are regime independent cannot be rejected at acceptable levels, so we can proceed with estimating the regime-independent mean parameters $a$ and $b$ by ordinary least squares.

Table 1.7 presents the parameter estimates of the Markov-switching diffusion model, the Markov-switching nondiffusion model, and the restricted Markov-switching diffusion model. Clearly, the Markov-switching models of short-term interest rates require a diffusion term to be included. The RLM statistic testing the restriction that $\gamma = 0$ is 10.9254, so the restriction is comprehensively rejected at any reasonable significance level. However, the square-root specification of CIR, which sets $\gamma = 0.5$, cannot be rejected by the data.
The p-value of the RLM test statistic (which is 2.2222) is 0.1360. The reason for the high p-value is the relative imprecision with which the elasticity parameter $\gamma$ is estimated. Its standard error is 0.3443, which is very large when compared with the parameter's point estimate of only 0.9200.

To test the Markov-switching model against the simpler basic diffusion model, we construct a simple likelihood ratio test. The value of this likelihood ratio statistic is 4.1648. Unfortunately, the exact distribution of this statistic is unknown because the transition parameters are unidentified under the null hypothesis. We therefore calculate an empirical p-value by running a simple Monte Carlo study. The empirical p-value is 0.095 and we are thus able to reject the null hypothesis of only one regime at the 10% significance level.

Table 1.7: Parameter Estimates of Markov-Switching Model

Parameter estimates and robust standard errors are presented for three types of Markov-switching model of interest rate volatility. All models are of the general form $y_t = x_t + 2\gamma \log r_{t-1} - 1.2704 + \xi_t$, where $x_t = \omega_i$ and $i \in \{1, 2\}$ refers to the discrete state of the economy at time $t$. In the diffusion model, $\gamma$ is estimated as a free parameter; in the nondiffusion model, $\gamma$ is restricted to equal 0.

Parameter         Diffusion model    Nondiffusion model    Restricted model
p                 0.9928             0.9929                0.9919
                  (0.0052)           (0.0051)              (0.0065)
q                 0.9531             0.9537                0.9483
                  (0.0278)           (0.0270)              (0.0414)
$\omega_1$        -1.9056            -3.6348               -5.0855
                  (0.1759)           (0.1787)              (1.1720)
$\omega_2$        1.4198             -0.9806               -2.9990
                  (0.2887)           (0.2880)              (1.7180)
$\gamma$          0.9200             0                     0.5
                  (0.3443)
Log-Likelihood    -711.6426          -719.1908             -713.1184
Robust LM         -                  10.9254               2.2222

It is interesting that the estimated elasticity parameter $\gamma$ is much lower in the Markov-switching model (0.9200) than in either the stochastic volatility model (1.4407) or the basic diffusion model (1.4515).
This supports Duffee's (1993) conclusion that failing to account for the structural break (a regime shift in our model) due to the Federal Reserve experiment spuriously inflates the estimated elasticity of volatility.

Figure 1.5 plots the conditional standard deviation from the Markov-switching diffusion model against the absolute residual. This figure is very similar to the figures for the basic diffusion and stochastic volatility models, yet it is apparent that the Markov-switching model outperforms the two previous models, especially in the period of the Federal Reserve experiment. This assertion is further borne out by the comparison of the specification tests in Table 1.9 in Section 1.5.

Figure 1.6 plots the smoothed and forecast probability of the low-volatility regime over the sample period. Note that we identify two periods of high volatility, in late 1974 and the early 1980s. These two periods correspond to the OPEC oil crisis and the Federal Reserve experiment. Thus, our model supports the conclusion of Gray (1996) and Cai (1994), who argued for the need of a second regime to adequately model the dynamics of short-term interest rates.

[Figure 1.5: Conditional standard deviation from the Markov-switching diffusion model against the absolute residual]

[Figure 1.6: Smoothed and forecast probability of the low-volatility regime]

1.4.4 A Markov-Switching Stochastic Volatility Diffusion Model

Table 1.8 contains the parameter estimates of the Markov-switching stochastic volatility model. First, the estimate of the elasticity of volatility, $\gamma = 1.0205$, is very similar to that of the Markov-switching model. Second, it is clear that the MSSV model still requires the inclusion of a diffusion term because the RLM statistic testing the no-diffusion null hypothesis is 7.7127.
Also, the restriction imposed by CIR is not rejected by the data. The RLM statistic is 2.3354, which has a p-value of 0.1360, so the restriction cannot be rejected at reasonable significance levels. As in the Markov-switching model, the reason for this is the imprecision with which the diffusion parameter is estimated. Thus the MSSV model suggests that when fitting a model of the short-term interest rate, one should allow for a second regime and include a diffusion term. The final implication is that the elasticity of volatility should be set at unity, although we cannot reject the CIR model.

The next issue is whether we need to include both Markov-switching and stochastic volatility effects to model interest rates. The answer to this question is somewhat ambiguous. First, note that traditional hypothesis testing procedures are not applicable because some of the parameters are not identified under the null hypothesis of only Markov switching or only stochastic volatility. However, the likelihood ratio statistics are not huge in either case: 4.1648 and 1.5702, respectively. One could calculate the empirical distribution of these test statistics using Monte Carlo methods, but the likelihood surface in the MSSV model is extremely bumpy with numerous local maxima, and we require many starting values to be sure that we have attained the global maximum. In a Monte Carlo study, this is not feasible. Second, in estimating the models fitted to these data, the choice of starting value is critical, because many different starting values force the maximization algorithm to look in impossible regions of the parameter space that result in an infinite likelihood, even when restrictions were imposed. For these reasons we choose not to perform such a study. Instead we resort to an informal discussion. For the Markov-switching model the null hypothesis is that $\sigma_\eta^2 = 0$; the corresponding t value is 1.7064, which would be only marginally significant even if there were no unidentified parameters.
Another point to note is that the Markov-switching model outperforms the MSSV model both out of sample and in sample. For these reasons we favor the more parsimonious Markov-switching model. Second, the likelihood ratio statistic for the stochastic volatility model results in a p-value of 0.2102, which would not be significant if we resorted to traditional hypothesis testing procedures.

Table 1.8: Parameter Estimates of Markov-Switching Stochastic Volatility Model

Parameter estimates and robust standard errors are presented for three types of Markov-switching stochastic volatility model of interest rate volatility. All models are of the general form $y_t = x_t + 2\gamma \log r_{t-1} - 1.2704 + \xi_t$, where $x_t = \omega_i + \phi x_{t-1} + \eta_t$ and $i \in \{1, 2\}$ refers to the discrete state of the economy at time $t$. In the diffusion model, $\gamma$ is estimated as a free parameter; in the nondiffusion model, $\gamma$ is restricted to equal 0.

Parameter            Diffusion model    Nondiffusion model    Restricted model
p                    0.9953             0.9947                0.9936
                     (0.0040)           (0.0036)              (0.0056)
q                    0.9639             0.9607                0.9489
                     (0.0253)           (0.0233)              (0.0717)
$\omega_1$           -1.1508            -2.9102               -4.3209
                     (1.4647)           (0.8109)              (2.5425)
$\omega_2$           0.9193             -0.7785               -2.7620
                     (0.4697)           (0.6366)              (2.3400)
$\phi$               0.3809             0.1911                0.1971
                     (0.3966)           (0.4157)              (0.4496)
$\sigma_\eta^2$      0.8901             0.8412                0.8513
                     (0.8894)           (0.7459)              (0.7502)
$\gamma$             1.0205             0                     0.5
                     (0.6151)
Log-Likelihood       -709.5602          -715.7928             -711.0486
Robust LM            -                  7.7127                2.3354

Figure 1.7 plots the conditional standard deviation from the Markov-switching stochastic volatility model against the absolute residual. This plot is very similar to those for the stochastic volatility and Markov-switching models. Finally, Figure 1.8 plots the smoothed and forecast probability of the low-volatility regime over the sample period. Note that, as in the Markov-switching model, we identify two periods of high volatility, in late 1974 and the early 1980s.
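The likelihood ratio statistics quoted in this subsection follow directly from the reported log-likelihoods of the unrestricted (diffusion) versions of each model. The sketch below recomputes them and attaches the conventional chi-squared(1) tail probability, using the identity that this tail equals erfc(sqrt(x/2)). As the text stresses, that reference distribution is not valid here because some parameters are unidentified under the null, so the p-values are only the informal "traditional" benchmark; the dictionary and function names are illustrative.

```python
import math

# Log-likelihoods of the diffusion versions, as reported in Tables 1.5 to 1.8.
loglik = {"CKLS": -715.2661, "SV": -710.3453, "MS": -711.6426, "MSSV": -709.5602}

def lr_stat(restricted, general):
    """Likelihood ratio statistic 2 * (L_general - L_restricted)."""
    return 2.0 * (loglik[general] - loglik[restricted])

def chi2_1_pvalue(x):
    """Upper-tail probability of a chi-squared(1) variable."""
    return math.erfc(math.sqrt(x / 2.0))

for restricted, general in [("CKLS", "SV"), ("MS", "MSSV"), ("SV", "MSSV")]:
    lr = lr_stat(restricted, general)
    print(restricted, "vs", general, round(lr, 4), round(chi2_1_pvalue(lr), 4))
```

The three statistics are 9.8416, 4.1648 and 1.5702, matching the values quoted in the text, and the last has a conventional p-value of about 0.21.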
1.5 Comparing the Models

The first test we use to compare the Markov-switching and the stochastic volatility models is a likelihood ratio test. As noted, these models are clearly non-nested, and traditional hypothesis testing procedures are inapplicable. Instead, we use the non-nested likelihood ratio test of Vuong (1989). Vuong's non-nested likelihood ratio test is valid in situations in which two models are competing to explain some variable, such as $y_t$ in this case. Vuong (1989) showed that under certain regularity conditions

$$n^{-1/2} LR_n / \hat{\omega}_n \xrightarrow{D} N(0, 1),$$

where $LR_n = L_n^{SV} - L_n^{MS}$, $L_n^{SV}$ is the log-likelihood of the stochastic volatility model, $L_n^{MS}$ is the log-likelihood of the Markov-switching model, $n$ is the number of observations, and

$$\hat{\omega}_n^2 = \frac{1}{n} \sum_{t=1}^{n} \left[ \log \frac{f_{SV}(y_t)}{f_{MS}(y_t)} \right]^2 - \left[ \frac{1}{n} \sum_{t=1}^{n} \log \frac{f_{SV}(y_t)}{f_{MS}(y_t)} \right]^2$$

is the variance of the likelihood ratio statistic. Thus, the correctly standardized likelihood ratio statistic converges in distribution to a standard normal random variable. The Vuong likelihood ratio statistic for the stochastic volatility model over the Markov-switching model is 0.2979, which is not significant at traditional significance levels. Thus, although the stochastic volatility model has a slightly higher log-likelihood, the two are not statistically significantly different.

A second way to compare the models is to compare the standardized residuals from each of the models. Table 1.9 presents a range of diagnostic tests on the standardized residuals from the four models considered. The model that performs the best in all these tests is the Markov-switching model. The stochastic volatility model's standardized residuals are leptokurtic and display positive skewness; in the Markov-switching model, although the residuals are still non-normal (as evidenced by the Bera-Jarque statistic), the deviations from normality are less pronounced.

Table 1.9: Diagnostic Tests on the Standardized Residuals

Statistic           Diffusion    Stochastic volatility    Markov-switching    Markov-switching stochastic volatility
Mean                -0.0364      -0.0223                  -0.0334             -0.0340
Variance            1.5419       1.5653                   1.4819              1.4921
Skewness            0.3374       0.6877                   0.0939              0.2413
(Standard error)    0.1400       0.1400                   0.1400              0.1400
Kurtosis            5.7241       7.7514                   5.7578              6.0131
(Standard error)    0.2801       0.2801                   0.2801              0.2801
Bera-Jarque         100.4205     311.9639                 97.4189             118.7231
$BP_1$              0.8778       0.2289                   0.0116              0.0434
$BP_{12}$           16.2448      15.9324                  19.0878             18.0802
$BP_1^{ARCH}$       0.6801       0.2142                   0.0084              0.0335
$BP_{12}^{ARCH}$    16.0252      16.0252                  16.0252             16.0252
$BP_1^2$            1.3669       0.0461                   2.0625              1.1890
$BP_{12}^2$         31.7113      22.3279                  27.3609             30.8865

Another point to note from the tables is that all models account for very-short-term dynamics well (all the portmanteau test statistics at lag one are statistically insignificant) but do not perform nearly so well with longer-term behavior. This is not considered a major issue because we are mainly interested in the short-term behavior.

To determine how well the various models performed in predicting interest rate volatility, a number of specification tests were performed. The results of these tests are presented in Table 1.10. Three metrics were chosen to compare the predictive ability of the various models. The variable being predicted is the absolute value of the forecast error and the predictors are the conditional standard deviations from, respectively, the Markov-switching and stochastic volatility models. The three metrics are:

1. mean absolute error
2. root mean squared error
3. $R^2$

Table 1.10: In-Sample and Out-of-Sample Specification Tests

Sample (No. Obs)                       Statistic    Diffusion    SV        MS        MSSV
In-Sample
July 1964 to December 1989 (307)       MAE          0.3786       0.3729    0.3709    0.3775
                                       RMSE         0.5423       0.5315    0.5443    0.5581
                                       $R^2$        0.5477       0.5656    0.5443    0.5209
July 1964 to June 1974 (120)           MAE          0.2280       0.2340    0.2283    0.2288
                                       RMSE         0.2815       0.2868    0.2803    0.2798
                                       $R^2$        0.5773       0.5613    0.5811    0.5823
July 1974 to June 1984 (120)           MAE          0.5306       0.5118    0.5284    0.5422
                                       RMSE         0.7359       0.7140    0.7495    0.7744
                                       $R^2$        0.5680       0.5933    0.5518    0.5215
July 1984 to December 1989 (66)        MAE          0.3762       0.3728    0.3438    0.3484
                                       RMSE         0.4846       0.4828    0.4578    0.4598
                                       $R^2$        0.4051       0.4095    0.4690    0.4644
Out-of-Sample
January 1990 to December 1996 (84)     MAE          0.2090       0.2090    0.1925    0.1959
                                       RMSE         0.2681       0.2687    0.2517    0.2545
                                       $R^2$        0.4627       0.4603    0.5262    0.5159

In the full sample period, the Markov-switching model outperformed all other models using all three metrics. It is also the best performing model in terms of out-of-sample predictive accuracy. Interestingly, the stochastic volatility model performs the best in the period that includes the two high-volatility periods. The Markov-switching model also outperformed the stochastic volatility model in the other two subperiods. Thus, the general conclusion of the specification tests is that the Markov-switching model is the best at predicting interest rate volatility.

1.6 Conclusions

This essay is concerned with modeling short-term interest rates. Given the link between short-term interest rates and the prices of bonds of various maturities, this essay also could be thought of as empirically testing various term structure models. The specific goal of this essay was to compare ways of allowing the coefficient on lagged interest rates in discrete approximations of the stochastic differential equations of short-term interest rate dynamics to be time varying.
Two models were specifically considered: the Markov-switching model, which allows the economy to switch between periods of high and normal volatility, and the stochastic-volatility model, which allows the volatility parameter ($\sigma_t$) to vary stochastically over time. A second goal of the essay was to present a filtering algorithm that allows Markov-switching stochastic volatility models to be estimated using quasi-maximum likelihood procedures.

² If returns are normally distributed, then it would be appropriate to compare the conditional standard deviation with $\sqrt{\pi/2}$ times the absolute residual. These calculations yield qualitatively similar results and do not alter the conclusions reached.

The major conclusion of this essay is that the Markov-switching model is preferred to the stochastic volatility model, although this conclusion is not definitive. It is based on the Markov-switching model's superior ability to predict the volatility of interest rate changes, and on the bias that is evident when estimating the elasticity of the volatility of interest rate changes with respect to the level of interest rates. One conclusion of this research is that estimated values of this elasticity from models that do not allow for regime shifts are spuriously high. CKLS estimated the elasticity parameter $\gamma$ to be 1.4999. Our estimate of $\gamma$ using the quasi-maximum likelihood methodology employed in this essay is similar at 1.4515; the estimate of $\gamma$ in the single-regime stochastic volatility model is 1.4407.
However, the two models estimated in this essay that account for regime shifts, and in particular for the atypical behavior of interest rate volatility during the Federal Reserve experiment, produce estimates of $\gamma$ much closer to unity: the Markov-switching model estimates $\gamma = 0.9200$, and the Markov-switching stochastic volatility model estimates $\gamma = 1.0205$. This indicates that term-structure models should allow for regime shifts and have an elasticity close to one. Fortunately, an elasticity of one means that the interest rate process will be positive almost surely.

A subsidiary question asked by the research is whether it is important to include the lagged level of interest rates in the volatility function of interest rate models. All four models considered in this essay agree that it is important to allow for a diffusion term: interest rate volatility depends on the level of lagged interest rates. This implies that term-structure models should allow volatility to depend on the level of interest rates if they are to capture the dynamics of interest rates adequately. Our results are broadly consistent with Naik and Lee's (1997) term-structure model, which allows for regime switching. However, our results suggest that, to account adequately for short-rate dynamics, a term-structure model must also allow for diffusion-type effects. Future research could attempt to develop a term-structure model in which interest rate volatility is both regime dependent and a function of the level of interest rates.

1.A  Appendix: Quasi-Maximum Likelihood Estimation

1.A.1  Stochastic Volatility Model

The Kalman filtering equations are presented for the stochastic volatility model discussed in Section 1.2.3. We also show how to construct the quasi-likelihood function for use in parameter estimation. We reproduce the state and observation equations used in the Kalman filter.
This set-up is general and applies to both the diffusion and the nondiffusion model presented in the empirical results. The nondiffusion model is a special case of the diffusion model with $\gamma$ set equal to 0.

$$ y_t = x_t + 2\gamma \log r_{t-1} - 1.2704 + \xi_t, $$
$$ x_t = \omega + \phi x_{t-1} + \eta_t. $$

The Kalman filter has two basic steps: constructing the linear projection and then updating that projection. Let the information set available to the econometrician at time $t$ be denoted by $\psi_t = \{y_t, \ldots, y_0\}$. The inference regarding log-volatility conditional on information available at date $t-1$ is denoted $x_{t|t-1} = E[x_t \mid \psi_{t-1}]$, and the mean squared error of this forecast is denoted $p_{t|t-1} = E[(x_t - x_{t|t-1})^2 \mid \psi_{t-1}]$. In deriving the Kalman filtering equations we assume that the parameters are known with certainty. Issues regarding parameter estimation are deferred until after the filtering equations have been derived. The equations to forecast the state variable and to calculate its mean squared error are as follows.

Step 1. Forecasting log-volatility:

$$ x_{t|t-1} = \omega + \phi x_{t-1|t-1}, \tag{1.11} $$
$$ p_{t|t-1} = \phi^2 p_{t-1|t-1} + \sigma_\eta^2. \tag{1.12} $$

Inference regarding the observed variable $y_t$ can now be made because $E[y_t \mid \psi_{t-1}] = y_{t|t-1} = x_{t|t-1} + 2\gamma \log r_{t-1} - 1.2704$ and $E[(y_t - y_{t|t-1})^2 \mid \psi_{t-1}] = p_{t|t-1} + \pi^2/2$, where $\pi^2/2$ is the variance of $\xi_t$. Along with the assumption that $\xi_t$ is conditionally Gaussian, we can calculate the density as

$$ f(y_t \mid \psi_{t-1}) = \frac{1}{\sqrt{2\pi\,(p_{t|t-1} + \pi^2/2)}} \exp\left[ -\frac{(y_t - y_{t|t-1})^2}{2\,(p_{t|t-1} + \pi^2/2)} \right], $$

and this can be used in quasi-maximum likelihood parameter estimation, where the log-likelihood $\mathcal{L}(y; \theta)$ can be calculated as

$$ \mathcal{L}(y; \theta) = \sum_{t=1}^{T} \log f(y_t \mid \psi_{t-1}), $$

where $\theta$ is the vector containing all relevant parameters.
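The quasi-log-likelihood above can be sketched as a single pass through the data. This is a minimal illustration rather than the code used in the thesis: the function name, the argument layout, and the choice to start the recursion at the stationary mean and variance of the state (which requires |phi| < 1) are all assumptions. The update inside the loop is the standard scalar Kalman update given in equations (1.13) and (1.14).

```python
import numpy as np

LOG_CHI2_MEAN = -1.2704          # mean of log chi-square(1), as in the observation equation
LOG_CHI2_VAR = np.pi ** 2 / 2    # variance of log chi-square(1)

def sv_quasi_loglik(y, log_r_lag, omega, phi, sigma_eta2, gamma):
    """Gaussian quasi-log-likelihood of the linearized stochastic volatility model."""
    y = np.asarray(y, dtype=float)
    log_r_lag = np.asarray(log_r_lag, dtype=float)
    # Assumed initialization: stationary moments of the AR(1) state.
    x_filt = omega / (1.0 - phi)
    p_filt = sigma_eta2 / (1.0 - phi ** 2)
    loglik = 0.0
    for t in range(y.size):
        # Step 1: forecast the state and its mean squared error, (1.11)-(1.12).
        x_pred = omega + phi * x_filt
        p_pred = phi ** 2 * p_filt + sigma_eta2
        # Forecast of the observation and its forecast-error variance.
        y_pred = x_pred + 2.0 * gamma * log_r_lag[t] + LOG_CHI2_MEAN
        f_var = p_pred + LOG_CHI2_VAR
        err = y[t] - y_pred
        loglik += -0.5 * (np.log(2.0 * np.pi * f_var) + err ** 2 / f_var)
        # Step 2: update the forecast, (1.13)-(1.14).
        gain = p_pred / f_var
        x_filt = x_pred + gain * err
        p_filt = p_pred - gain * p_pred
    return loglik
```

In practice the negative of this function would be passed to a numerical optimizer over the parameter vector.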
The maximum likelihood estimate of the parameters is then obtained by maximizing the log-likelihood numerically with respect to the parameter vector $\theta$. Robust standard errors can then be constructed using the Hessian-outer-product-Hessian (sandwich) estimate, where the Hessian and outer-product estimates of the information matrix are calculated numerically.

Step 2. Update the forecast:

$$ x_{t|t} = x_{t|t-1} + p_{t|t-1}\left(p_{t|t-1} + \frac{\pi^2}{2}\right)^{-1}(y_t - y_{t|t-1}), \tag{1.13} $$
$$ p_{t|t} = p_{t|t-1} - p_{t|t-1}\left(p_{t|t-1} + \frac{\pi^2}{2}\right)^{-1} p_{t|t-1}. \tag{1.14} $$

It is also possible to construct a series of smoothed estimates of the state ($x_{t|T}$) and associated mean squared errors ($p_{t|T}$). This is done by iterating backward from $x_{T|T}$ and $p_{T|T}$, which are the terminal values from Step 2 of the basic filter, on the following equations:

$$ x_{t|T} = x_{t|t} + J_t\,(x_{t+1|T} - x_{t+1|t}), \tag{1.15} $$
$$ p_{t|T} = p_{t|t} + J_t\,(p_{t+1|T} - p_{t+1|t})\,J_t \tag{1.16} $$

for $t = T-1, T-2, \ldots, 1$, where $J_t = p_{t|t}\,\phi\,p_{t+1|t}^{-1}$.

1.A.2  Markov-Switching Models

Hamilton's (1989) filter is designed to calculate the probability of observing each state at each point in the sample using a very simple recursive filter. The filter requires the probability of observing each state for the first period, and this essay uses the ergodic probability of the two-state model for this purpose. Alternative choices include arbitrarily selecting the probability and estimating it as a free parameter. If we denote the probability of remaining in each regime from one period to the next by $p = \Pr[S_t = 1 \mid S_{t-1} = 1]$ and $q = \Pr[S_t = 2 \mid S_{t-1} = 2]$, then the ergodic probability of state one occurring is

$$ \Pr[S_0 = 1] = \frac{1 - q}{2 - p - q} $$

if there are only two states. The ergodic probability of state two occurring is calculated in a similar fashion, or simply as $\Pr[S_0 = 2] = 1 - \Pr[S_0 = 1]$. Hamilton (1994, p. 684) provided the formulas for calculating the ergodic probabilities in the more general $K$-regime case.
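The ergodic probabilities can be computed directly, as the following sketch illustrates. The function names are illustrative, and the transition matrix is assumed to follow the column convention P[i, j] = Pr[S_t = i | S_{t-1} = j], so that each column sums to one. The general case solves the stationarity condition together with the adding-up constraint as a least-squares system.

```python
import numpy as np

def ergodic_two_state(p, q):
    """Ergodic probability of state one when p and q are the staying probabilities."""
    return (1.0 - q) / (2.0 - p - q)

def ergodic_probs(P):
    """Ergodic probabilities for a K-state chain with columns of P summing to one."""
    P = np.asarray(P, dtype=float)
    K = P.shape[0]
    # Stack (I - P) pi = 0 on top of the constraint 1' pi = 1.
    A = np.vstack([np.eye(K) - P, np.ones((1, K))])
    e = np.zeros(K + 1)
    e[-1] = 1.0
    return np.linalg.solve(A.T @ A, A.T @ e)
```

For two states the two functions agree, which gives a convenient check on the general routine.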
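Given the probability of the different states occurring at $t = 0$, Hamilton's filter proceeds by iterating forward from $t = 1, \ldots, T$ on the following steps.

Step 1. Calculate a forecast of the regime probabilities:

$$ \Pr[S_t = i \mid \psi_{t-1}] = \sum_{j=1}^{K} \Pr[S_t = i, S_{t-1} = j \mid \psi_{t-1}], \tag{1.17} $$

where $\Pr[S_t = i, S_{t-1} = j \mid \psi_{t-1}] = \Pr[S_t = i \mid S_{t-1} = j] \times \Pr[S_{t-1} = j \mid \psi_{t-1}]$. Note that $\Pr[S_{t-1} = j \mid \psi_{t-1}]$ is the output of the filter from the previous iteration.

Step 2. Calculate the joint density:

$$ f(y_t, S_t = i \mid \psi_{t-1}) = f(y_t \mid S_t = i, \psi_{t-1}) \times \Pr[S_t = i \mid \psi_{t-1}]. \tag{1.18} $$

This calculation requires that the density of $y_t$ is known conditional on a particular regime's realization. For example, suppose $y_t$ is Gaussian conditional on state $i$ occurring, with mean $\mu_{it}$ and variance $\sigma_{it}^2$, where the mean and variance can depend on previous values of $y$ (e.g. $\mu_{it} = \alpha_i + \beta_i y_{t-1}$ with a regime-dependent variance, as is the case for the AR(1) Markov-switching diffusion model). Then the conditional density is given by

$$ f(y_t \mid S_t = i, \psi_{t-1}) = \frac{1}{\sqrt{2\pi\sigma_{it}^2}} \exp\left[ -\frac{(y_t - \mu_{it})^2}{2\sigma_{it}^2} \right]. $$

Step 3. Calculate the unconditional density by integrating out the regimes:

$$ f(y_t \mid \psi_{t-1}) = \sum_{i=1}^{K} f(y_t, S_t = i \mid \psi_{t-1}). \tag{1.19} $$

Step 4. Update the probability of observing the regimes:

$$ \Pr[S_t = i \mid \psi_t] = \frac{f(y_t, S_t = i \mid \psi_{t-1})}{f(y_t \mid \psi_{t-1})}. \tag{1.20} $$

This holds because $\psi_t = \{y_t, \psi_{t-1}\}$ and $\Pr[A \mid B] = \Pr[A, B] / \Pr[B]$.

1.A.3  Markov-Switching Stochastic Volatility Models

In the derivation of the filtering equations presented here, use is made of the following notation: let $x_{t|t-1}^{(i,j)} = E[x_t \mid S_t = i, S_{t-1} = j, \psi_{t-1}]$, $x_{t|t}^{(i,j)} = E[x_t \mid S_t = i, S_{t-1} = j, \psi_t]$, $x_{t|t}^{(i)} = E[x_t \mid S_t = i, \psi_t]$, $p_{t|t-1}^{(i,j)} = E[(x_t - x_{t|t-1}^{(i,j)})^2 \mid S_t = i, S_{t-1} = j, \psi_{t-1}]$, and $p_{t|t}^{(i)} = E[(x_t - x_{t|t}^{(i)})^2 \mid S_t = i, \psi_t]$.

Step 1. Forecast log-volatility:

$$ x_{t|t-1}^{(i,j)} = \omega_i + \phi\, x_{t-1|t-1}^{(j)}, \tag{1.21} $$
$$ p_{t|t-1}^{(i,j)} = \phi^2 p_{t-1|t-1}^{(j)} + \sigma_\eta^2. \tag{1.22} $$

Step 2. Updating the forecast:

$$ x_{t|t}^{(i,j)} = x_{t|t-1}^{(i,j)} + p_{t|t-1}^{(i,j)}\left(p_{t|t-1}^{(i,j)} + \frac{\pi^2}{2}\right)^{-1}\left(y_t - y_{t|t-1}^{(i,j)}\right), \tag{1.23} $$
$$ p_{t|t}^{(i,j)} = p_{t|t-1}^{(i,j)} - p_{t|t-1}^{(i,j)}\left(p_{t|t-1}^{(i,j)} + \frac{\pi^2}{2}\right)^{-1} p_{t|t-1}^{(i,j)}, \tag{1.24} $$

where $y_{t|t-1}^{(i,j)} = x_{t|t-1}^{(i,j)} + 2\gamma \log r_{t-1} - 1.2704$.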
The conditional density can be calculated as

$$ f(y_t \mid S_t = i, S_{t-1} = j, \psi_{t-1}) = \frac{1}{\sqrt{2\pi\,(p_{t|t-1}^{(i,j)} + \pi^2/2)}} \exp\left[ -\frac{(y_t - y_{t|t-1}^{(i,j)})^2}{2\,(p_{t|t-1}^{(i,j)} + \pi^2/2)} \right]. \tag{1.25} $$

It is fairly clear that forecasting with this procedure causes the number of cases to be tracked to multiply at every iteration, so that the process becomes entirely path dependent. Kim (1994) uses a very simple approach to remove this path dependence, and this approach is followed here. At each forecasting step we use the conditional expectation of log-volatility that depends only on the state in that period. This is obtained by taking a weighted average of the output of the previous iteration:

$$ x_{t|t}^{(i)} = \frac{\sum_{j=1}^{K} \Pr[S_t = i, S_{t-1} = j \mid \psi_t]\; x_{t|t}^{(i,j)}}{\Pr[S_t = i \mid \psi_t]}, \tag{1.26} $$
$$ p_{t|t}^{(i)} = \frac{\sum_{j=1}^{K} \Pr[S_t = i, S_{t-1} = j \mid \psi_t]\left\{ p_{t|t}^{(i,j)} + \left(x_{t|t}^{(i)} - x_{t|t}^{(i,j)}\right)^2 \right\}}{\Pr[S_t = i \mid \psi_t]}, \tag{1.27} $$

where $\Pr[S_t = i \mid \psi_t] = \sum_{j=1}^{K} \Pr[S_t = i, S_{t-1} = j \mid \psi_t]$. A slight modification to Hamilton's filter allows the regime probabilities to be calculated iteratively:

Step 1. Calculate a forecast of the regime probabilities:

$$ \Pr[S_t = i, S_{t-1} = j \mid \psi_{t-1}] = \Pr[S_t = i \mid S_{t-1} = j] \times \Pr[S_{t-1} = j \mid \psi_{t-1}]. \tag{1.28} $$

Note that $\Pr[S_{t-1} = j \mid \psi_{t-1}]$ is the output of the filter from the previous iteration.

Step 2. Calculate the joint density:

$$ f(y_t, S_t = i, S_{t-1} = j \mid \psi_{t-1}) = f(y_t \mid S_t = i, S_{t-1} = j, \psi_{t-1}) \times \Pr[S_t = i, S_{t-1} = j \mid \psi_{t-1}], \tag{1.29} $$

where $f(y_t \mid S_t = i, S_{t-1} = j, \psi_{t-1})$ is defined as previously.

Step 3. Calculate the unconditional density by integrating out the regimes:

$$ f(y_t \mid \psi_{t-1}) = \sum_{i=1}^{K} \sum_{j=1}^{K} f(y_t, S_t = i, S_{t-1} = j \mid \psi_{t-1}). \tag{1.30} $$

Step 4. Update the probability of observing the regimes:

$$ \Pr[S_t = i, S_{t-1} = j \mid \psi_t] = \frac{f(y_t, S_t = i, S_{t-1} = j \mid \psi_{t-1})}{f(y_t \mid \psi_{t-1})}. \tag{1.31} $$

1.A.3.1  Smoothing

We next turn our attention to issues relating to smoothing.
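Smoothing takes the filtered regime probabilities as its input, so it may help to see the forward pass first. The sketch below implements Steps 1 to 4 of the basic filter in Section 1.A.2 and is illustrative only: the function name and array layout are assumptions, the regime-conditional densities are supplied by the caller, and the transition matrix uses the column convention P[i, j] = Pr[S_t = i | S_{t-1} = j]. The modified filter of this section follows the same pattern but tracks the pair (S_t, S_{t-1}) instead of S_t alone.

```python
import numpy as np

def hamilton_filter(cond_dens, P, prob0):
    """Forward recursion for filtered regime probabilities.

    cond_dens[t, i] holds f(y_t | S_t = i, psi_{t-1}); prob0 holds Pr[S_0 = i].
    Returns the filtered probabilities and the log-likelihood.
    """
    cond_dens = np.asarray(cond_dens, dtype=float)
    P = np.asarray(P, dtype=float)
    prob = np.asarray(prob0, dtype=float)
    T, K = cond_dens.shape
    filtered = np.empty((T, K))
    loglik = 0.0
    for t in range(T):
        pred = P @ prob                  # Step 1: Pr[S_t = i | psi_{t-1}]
        joint = cond_dens[t] * pred      # Step 2: f(y_t, S_t = i | psi_{t-1})
        marginal = joint.sum()           # Step 3: f(y_t | psi_{t-1})
        prob = joint / marginal          # Step 4: Pr[S_t = i | psi_t]
        filtered[t] = prob
        loglik += np.log(marginal)
    return filtered, loglik
```

Starting the recursion from the ergodic probabilities, as the essay does, means the first forecast in Step 1 simply reproduces those probabilities.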
Define $x_{t|T}^{(k,i)} = E[x_t \mid S_{t+1} = k, S_t = i, \psi_T]$ and $p_{t|T}^{(k,i)} = E[(x_t - x_{t|T}^{(k,i)})^2 \mid S_{t+1} = k, S_t = i, \psi_T]$. Kim (1994) showed how to calculate smoothed regime probabilities using a very straightforward backward-iterative algorithm:

$$ \Pr[S_{t+1} = k, S_t = i \mid \psi_T] = \frac{\Pr[S_{t+1} = k \mid \psi_T] \times \Pr[S_t = i \mid \psi_t] \times \Pr[S_{t+1} = k \mid S_t = i]}{\Pr[S_{t+1} = k \mid \psi_t]}, \tag{1.32} $$
$$ x_{t|T}^{(k,i)} = x_{t|t}^{(i)} + J_t^{(k,i)}\left(x_{t+1|T}^{(k)} - x_{t+1|t}^{(k,i)}\right), \tag{1.33} $$
$$ p_{t|T}^{(k,i)} = p_{t|t}^{(i)} + J_t^{(k,i)}\left(p_{t+1|T}^{(k)} - p_{t+1|t}^{(k,i)}\right)J_t^{(k,i)} $$

for $t = T-1, T-2, \ldots, 1$, where $J_t^{(k,i)} = p_{t|t}^{(i)}\,\phi\,[p_{t+1|t}^{(k,i)}]^{-1}$, and

$$ x_{t|T}^{(i)} = \frac{\sum_{k=1}^{K} \Pr[S_{t+1} = k, S_t = i \mid \psi_T]\; x_{t|T}^{(k,i)}}{\Pr[S_t = i \mid \psi_T]}, $$
$$ p_{t|T}^{(i)} = \frac{\sum_{k=1}^{K} \Pr[S_{t+1} = k, S_t = i \mid \psi_T]\left\{ p_{t|T}^{(k,i)} + \left(x_{t|T}^{(i)} - x_{t|T}^{(k,i)}\right)^2 \right\}}{\Pr[S_t = i \mid \psi_T]}. $$

The regime-independent smoothed estimates of log-volatility and the mean squared error, $x_{t|T}$ and $p_{t|T}$, can be calculated by taking a simple weighted average:

$$ x_{t|T} = \sum_{k=1}^{K} \sum_{i=1}^{K} x_{t|T}^{(k,i)} \Pr[S_{t+1} = k, S_t = i \mid \psi_T], $$
$$ p_{t|T} = \sum_{k=1}^{K} \sum_{i=1}^{K} p_{t|T}^{(k,i)} \Pr[S_{t+1} = k, S_t = i \mid \psi_T]. $$

References

Aït-Sahalia, Yacine, 1996, Testing Continuous-Time Models of the Spot Interest Rate, Review of Financial Studies 9, 385-426.
Andersen, Torben G., Hyung-Jin Chung, and Bent E. Sørensen, 1999, Efficient Method of Moments Estimation of a Stochastic Volatility Model: A Monte Carlo Study, Journal of Econometrics 91, 61-87.
Ball, Clifford A., and Walter N. Torous, 1999, The Stochastic Volatility of Short-term Interest Rates: Some International Evidence, Journal of Finance forthcoming.
Black, Fisher, and Myron Scholes, 1973, The Pricing of Options and Corporate Liabilities, Journal of Political Economy 81, 637-654.
Bollerslev, Tim, 1986, Generalized Autoregressive Conditional Heteroskedasticity, Journal of Econometrics 31, 307-327.
Bollerslev, Tim, Ray Y. Chou, and Kenneth F. Kroner, 1992, ARCH Modeling in Finance: A Review of the Theory and Empirical Evidence, Journal of Econometrics 52, 5-59.
Bollerslev, Tim, and Jeffrey M.
Wooldridge, 1992, Quasi-Maximum Likelihood Estimation and Inference in Dynamic Models with Time Varying Covariances, Econometric Reviews 11, 143-172.
Broze, Laurence, Olivier Scaillet, and Jean-Michel Zakoian, 1995, Testing for Continuous-Time Models of the Short-Term Interest Rate, Journal of Empirical Finance 2, 199-223.
Cai, Jun, 1994, A Markov Model of Switching-Regime ARCH, Journal of Business and Economic Statistics.
Chan, K. C., G. Andrew Karolyi, Francis Longstaff, and Anthony Sanders, 1992, The Volatility of Short-term Interest Rates: An Empirical Comparison of Alternative Models of the Term Structure of Interest Rates, Journal of Finance 47, 1209-1227.
Cox, John, Jonathan Ingersoll, and Stephen Ross, 1985, A Theory of the Term Structure of Interest Rates, Econometrica 53, 385-408.
Davies, R. B., 1977, Hypothesis Testing When a Nuisance Parameter is Present Only Under the Alternative, Biometrika 64, 247-254.
Davies, R. B., 1987, Hypothesis Testing When a Nuisance Parameter is Present Only Under the Alternative, Biometrika 74, 33-43.
Diebold, F. X., 1986, Testing for Serial Correlation in the Presence of ARCH, Proceedings of the Business and Economic Statistics Section of the American Statistical Association pp. 323-328.
Dueker, Michael J., 1997, Markov Switching in GARCH Processes and Mean-Reverting Stock-Market Volatility, Journal of Business and Economic Statistics 15, 26-35.
Duffee, Gregory, 1993, On the Relation Between the Level and Volatility of Short-Term Interest Rates: A Comment on Chan, Karolyi, Longstaff and Sanders, Working paper, Federal Reserve Board, Washington D.C.
Duffee, Gregory, 1996, Idiosyncratic Variation of Treasury Bill Yields, Journal of Finance 51, 527-551.
Dunsmuir, W., 1979, A Central Limit Theorem for Parameter Estimation in Stationary Vector Time Series and its Application to Models for a Signal Observed with Noise, Annals of Statistics 7, 490-506.
Durland, J.
Michael, and Thomas H. McCurdy, 1994, Duration-Dependent Transitions in a Markov Model of U.S. GNP Growth, Journal of Business and Economic Statistics 12, 279-289.
Engel, Charles, and James D. Hamilton, 1990, Long Swings in the Dollar: Are They in the Data and Do Markets Know It?, American Economic Review 80, 689-713.
Engle, Robert F., 1982, Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of U.K. Inflation, Econometrica 50, 987-1008.
Engle, Robert F., Victor Ng, and Michael Rothschild, 1990, Asset Pricing with a Factor ARCH Covariance Structure: Empirical Estimates for Treasury Bills, Journal of Econometrics 45, 213-238.
Filardo, Andrew J., 1994, Business-Cycle Phases and their Transitional Dynamics, Journal of Business and Economic Statistics 12, 299-309.
Fridman, Moshe, and Lawrence Harris, 1998, A Maximum Likelihood Approach for Non-Gaussian Stochastic Volatility Models, Journal of Business and Economic Statistics 16, 284-291.
Goodwin, Thomas H., 1993, Business-Cycle Analysis with a Markov-Switching Model, Journal of Business and Economic Statistics 11, 331-339.
Gourieroux, C., A. Monfort, and E. Renault, 1993, Indirect Inference, Journal of Applied Econometrics 8, S85-S118.
Gray, Stephen F., 1996, Modeling the Conditional Distribution of Interest Rates as a Regime-Switching Process, Journal of Financial Economics 42, 27-62.
Hamilton, James D., 1989, A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle, Econometrica 57, 357-384.
Hamilton, James D., 1994, Time Series Analysis. (Princeton University Press: Princeton, USA).
Hamilton, James D., and Raul Susmel, 1994, Autoregressive Conditional Heteroscedasticity and Changes in Regime, Journal of Econometrics 64, 307-333.
Hansen, Lars Peter, 1982, Large Sample Properties of Generalized Method of Moments Estimators, Econometrica 50, 1029-1054.
Harvey, Andrew, Esther Ruiz, and Neil Shephard, 1994, Multivariate Stochastic Variance Models, Review of Economic Studies 61, 247-264.
Harvey, Andrew, and Neil Shephard, 1996, Estimation of an Asymmetric Stochastic Volatility Model for Asset Returns, Journal of Business and Economic Statistics 14, 429-434.
Hsieh, David A., 1991, Chaos and Nonlinear Dynamics: Application to Financial Markets, Journal of Finance 46, 1839-1877.
Jacquier, Eric, Nicholas G. Polson, and Peter E. Rossi, 1994, Bayesian Analysis of Stochastic Volatility Models, Journal of Business and Economic Statistics 12, 371-389.
Kim, Chang-Jin, 1994, Dynamic Linear Models with Markov-Switching, Journal of Econometrics 60, 1-22.
Layton, Allan P., and Daniel R. Smith, 2000, A Further Note on the Three Phases of the US Business Cycle, Applied Economics 32, 1133-1143.
Naik, Vasant, and Moon Hoe Lee, 1997, Yield Curve Dynamics with Discrete Shifts in Economic Regimes: Theory and Estimation, Working paper, Faculty of Commerce, University of British Columbia, Vancouver, Canada.
Sandmann, Gleb, and Siem Jan Koopman, 1998, Estimation of Stochastic Volatility Models via Monte Carlo Maximum Likelihood, Journal of Econometrics 87, 271-301.
Schaller, Huntley, and Simon van Norden, 1997, Regime Switching in Stock Market Returns, Applied Financial Economics 7, 177-192.
So, Mike K., K. Lam, and W. K. Li, 1998, A Stochastic Volatility Model with Markov-Switching, Journal of Business and Economic Statistics 16, 244-253.
Taylor, S. J., 1986, Modelling Financial Time Series. (John Wiley and Sons: Chichester, UK).
Turner, Christopher M., Richard Startz, and Charles R. Nelson, 1989, A Markov Model of Heteroscedasticity, Risk, and Learning in the Stock Market, Journal of Financial Economics 25, 3-22.
Vasicek, Oldrich, 1977, An Equilibrium Characterization of the Term Structure, Journal of Financial Economics 5, 177-188.
Vuong, Quang H., 1989, Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses, Econometrica 57, 307-333.
White, Halbert, 1980a, A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity, Econometrica 48, 817-838.
White, Halbert, 1980b, Maximum Likelihood Estimation of Misspecified Models, Econometrica 50, 1-26.

Essay 2: Conditional Coskewness and Asset Prices

2.1  Introduction

This essay presents a new approach to empirically estimating and testing nonlinear asset pricing models. The approach is inspired by Harvey's (1989) test of the CAPM and the extension to multi-factor pricing models by He et al. (1996). The failure of linear pricing models has encouraged the consideration of non-linear pricing models. An alternative response to the poor performance of the CAPM is to include conditioning information to allow risk and prices of risk to change over time. Neither of these approaches has been entirely satisfactory in explaining the cross-section of equity returns. This essay combines the two approaches by developing a conditional non-linear pricing model which improves on the performance of both conditional linear and unconditional non-linear pricing models.

The simple, linear single-factor CAPM holds only under very specific conditions. When asset returns are non-normal and investors have non-quadratic preferences, investors will care about all return moments and not only the mean and variance, the assumption that drives the CAPM. There are a number of extensions to the basic two-moment CAPM which predict a CAPM-like linear relationship in which terms like coskewness (Kraus and Litzenberger (1976)) and cokurtosis (Fang and Lai (1997)) risks are priced. To date no true conditional test of such a pricing model has been undertaken.
A recent paper by Dittmar (2002), which is related to the three-moment CAPM, considers a pricing kernel, or stochastic discount factor (SDF), which is a quadratic function of market returns. However, this line of research does not constitute a test of the three-moment CAPM because the model parameters are not tied down to the distribution of market returns as predicted by the CAPM. This essay attempts to fill this gap in the empirical literature.

We develop an empirical framework, in the spirit of the beta method, for testing asset pricing models that incorporate moments higher than variance. The approach is parsimonious: we avoid modelling any asset-specific parameters, while imposing the restrictions implied by asset pricing theory. The approach is also flexible: by a judicious parameterization we avoid placing restrictions on the dynamics of conditional covariance and coskewness, and by using generalized method of moments we avoid placing restrictions on the exact distribution of returns.

This essay can be viewed as either an alternative to, or an extension of, the SDF method which has become very popular in estimating and testing nonlinear asset pricing models. In particular, we compare the results from modelling the higher moments of market returns with the SDF methodology of Dittmar (2002). Both approaches are approximations to reality, since they are derived using Taylor series expansions of either expected utility (in the three-moment CAPM) or the marginal rate of substitution (in the SDF). Which approximation conforms more closely to reality is fundamentally an empirical question.

Although the methodology developed herein can in principle be applied to an asset pricing model that prices any market return moment, the empirical component of the essay truncates consideration of higher order moments at skewness, the third moment.
There are a number of reasons for this limited focus. The first is pragmatic: after mean-variance analysis, mean-variance-skewness seems like a reasonable next step. Second, there is already a reasonably large literature devoted to the role of skewness in asset prices. Lastly, we have theoretical motivations to assert that a representative agent would exhibit a preference for skewness, yet no such motivation exists for higher moments. In fact, the third moment is the highest moment that aggregates: if each investor has a preference for skewness, then the representative agent in a Pareto-efficient allocation will too, yet no higher moments are known to share this property. Since we typically derive asset pricing models in a representative agent framework, it makes sense to consider only preferences for those moments which are known to aggregate.

We apply the framework to address a number of interesting questions, such as: Are covariances and coskewness time varying? Are covariance and coskewness risks priced? Does the price of these risks vary through time? Do investors have a preference for positive skewness, or do they exhibit aversion to negative skewness? Can preference for skewness explain the usefulness of characteristic-based factors such as size and book-to-price in explaining the cross-section of equity returns? Conversely, does coskewness explain part of the returns that these factors do not? Can the performance of the SDF methodology, as a test of factor pricing, be improved by linking the coefficients on factor returns more closely with theory?

The remainder of the essay proceeds as follows. Section 2.2 discusses the conditional two- and three-moment CAPMs and an empirical procedure useful in testing these models. Section 2.3 then discusses the data and presents some evidence on the need for conditioning information. The major results are discussed in Section 2.4.
Section 2.5 presents the results from estimating a multi-factor asset pricing model that includes the SMB and HML factors. Section 2.6 compares the improvement in performance relative to the basic CAPM induced by considering three factors and three moments, while Section 2.7 conducts specification tests to determine whether the models make full use of the conditioning information and whether the parameters are stable. Section 2.8 compares the results reported here, which favour the three-moment CAPM and the usefulness of coskewness, with the contradictory results of a contemporary paper by Dittmar (2002). The essay concludes in Section 2.9 with a summary and a suggestion for future research.

2.2  Model Development and Motivation

Let $\Psi_{t-1}$ denote the information set available to the market at the end of time period $t-1$. An econometrician drawing statistical inference about an asset pricing model is at a disadvantage relative to the market, having available only the information variables $Z_{t-1}$, an $L$-vector, which is a strict subset of the information available to the market. For the purposes of the current analysis, we will ignore any complications that may be caused by omitted information variables. Asset pricing models are subject to a joint testing problem, since any rejection of an asset pricing model may be due to the use of only a subset of the market's conditioning information. A second possible interpretation of any rejection is that the assumed functional form translating information into expectations may be incorrect. For simplicity, this study uses a linear functional form whereby conditional moments are a linear function of lagged information. Theory gives the econometrician virtually no guidance as to the nature of the unknown true functional form. One could view the chosen linear specification as a first-order approximation to this general form.
2.2.1  The Two-Moment CAPM

The conditional specification of the two-moment capital asset pricing model (CAPM) of Sharpe (1964) and Lintner (1965) predicts a very simple linear relationship between the expected return on any risky asset in excess of the conditionally risk-free return and a measure of that asset's systematic risk. This relationship is defined in terms of expected returns:

$$ E_{t-1}(r_{i,t}) = \beta_{i,t}\, E_{t-1}(r_{m,t}) \quad \text{for } i = 1, \ldots, n, \tag{2.1} $$

where $r_{i,t}$ and $r_{m,t}$ refer to the excess returns (over the return on a conditionally risk-free asset) to asset $i$ and the market respectively, and $E_{t-1}(\cdot) = E(\cdot \mid \Psi_{t-1})$ is the expectation operator conditional on information available at time $t-1$. The systematic risk, or beta ($\beta$), of security $i$ at time $t$ is defined as

$$ \beta_{i,t} = \frac{\mathrm{Cov}_{t-1}(r_{i,t}, r_{m,t})}{\mathrm{Var}_{t-1}(r_{m,t})}, \tag{2.2} $$

where $\mathrm{Cov}_{t-1}(\cdot, \cdot)$ and $\mathrm{Var}_{t-1}(\cdot)$ are the conditional covariance and variance operators,

$$ \mathrm{Cov}_{t-1}(r_{i,t}, r_{m,t}) = E_{t-1}\left[(r_{i,t} - E_{t-1}(r_{i,t}))(r_{m,t} - E_{t-1}(r_{m,t}))\right] $$

and

$$ \mathrm{Var}_{t-1}(r_{m,t}) = E_{t-1}\left[(r_{m,t} - E_{t-1}(r_{m,t}))^2\right]. $$

This setup allows an asset's beta to vary over time, as indicated by the time subscript.

One difficulty with the basic two-moment CAPM is its very poor empirical performance. Numerous studies have documented its empirical failure. For recent surveys of the literature on the failure of the CAPM, along with other asset pricing anomalies, see Campbell, Lo, and MacKinlay (1997) and Campbell (2000). There have been a number of responses to the poor empirical performance of the CAPM. The earliest studies that documented some of the CAPM's shortcomings were unconditional. Unconditional tests of the CAPM, such as the seminal studies of Fama and MacBeth (1973) and Black, Jensen, and Scholes (1973), imposed the restriction that betas were constant over time.
However, there is a large literature that documents the time-varying nature of broad-based stock index variance both in the US and internationally. Harvey (1989) and Schwert and Seguin (1990) also present evidence that covariances are time-varying. Finally, Ferson and Harvey (1991, 1993, 1999) demonstrate that betas, both US and international, are also time-varying. If the CAPM only holds conditionally, there is no reason to be surprised when an unconditional test fails.

One response to the failure of unconditional tests was to test the CAPM using methodologies that relaxed the constant-beta assumption. Early conditional studies include Harvey (1989), Bodurtha and Mark (1991), Ng (1991), and Bollerslev, Engle, and Wooldridge (1988). Bodurtha and Mark (1991) and Ng (1991) tested the conditional CAPM using a GARCH-based specification for the variances and covariances that allowed the asset betas to vary through time. Bodurtha and Mark (1991) used Generalized Method of Moments (GMM hereafter), while Ng (1991) pursued a maximum likelihood approach. These studies, along with others such as Schwert and Seguin (1990) and the sequence of studies by Ferson and Harvey (1991, 1993, 1999), allow for time-varying betas using a number of techniques, including rolling-window estimation, betas specified as linear functions of a set of conditioning variables, and the method of Davidian and Carroll (1987). Jagannathan and Wang (1996) present yet another approach: they demonstrate that an unconditional version of a conditional CAPM will include variables that predict variation in the price of risk (such as the default spread) as priced "factors". Still another literature models beta as time-varying in a basic varying-parameter regression model, as used in Fabozzi and Francis (1978), Bos and Newbold (1984) and Thompson (1986).
Note that tests such as those of Bodurtha and Mark (1991) and Ng (1991), which explicitly model both covariances and prices of risk, are the most restrictive in nature. One advantage of the linear beta specification of Ferson and Harvey (1999) is that it does not impose any restrictions on the price of covariance risk. Yet another technique, which is applied in this essay, explicitly models the prices of risk while leaving the covariance dynamics unspecified. Neither of these last two approaches is more general or more restrictive than the other; they are simply different, since we must choose to model either covariances or prices of risk.

We favour modelling prices of risk rather than betas because this approach seems to be supported by the data. Ghysels (1998) demonstrates that the linear beta specification fails Andrews' (1993) test for structural breaks, a failure which suggests that the specification is misspecified. Our GMM procedure requires specifying certain moments of market returns and permits us to test this specification using a number of specification tests. The results reported below find that the three-moment CAPM specification passes the structural break tests, and that the overidentifying moment conditions associated with stock market dynamics are not rejected.

An important caveat is in order here. The strength of our chosen methodology is also a weakness. Because we avoid making restrictive assumptions about covariance dynamics (arguably the most difficult component of the CAPM to specify), we are unable to obtain explicit beta estimates. Without explicit betas we cannot analyze pricing errors in the traditional way: the error terms in our model are doubly ex-post in that they use not only ex-post return realizations but also the ex-post covariance realization (using the term covariance extremely loosely).

Before illustrating the empirical methodology, we should introduce some notation.
The notational convention used here is to use capitals to denote actual returns and lower case to denote excess returns. Recall that we have defined $r_{i,t}$ as the excess return (over the risk-free return $R_{f,t}$) on the $i$-th risky asset over the period $t-1$ to $t$ for $i = 1, \ldots, N$, and the excess return on the market portfolio is similarly defined as $r_{m,t}$. Conditioning is done with respect to an $L$-vector of information variables $Z_{t-1}$. The following notational shorthand is used for the conditional moments of the market returns:

• $\mu_t = E_{t-1}(r_{m,t})$ is the conditional mean market return,

• $\sigma_t^2 = E_{t-1}[(r_{m,t} - E_{t-1}(r_{m,t}))^2]$ is the conditional variance of market returns,

• $\kappa_t^3 = E_{t-1}[(r_{m,t} - E_{t-1}(r_{m,t}))^3]$ is the conditional third central moment of market returns, which we will term the conditional skewness,¹ and

• $\lambda_t = \mu_t/\sigma_t^2$ is the conditional price of covariance risk.

¹ Traditionally, the skewness of a distribution is the third central moment standardized by the cubed standard deviation. However, when we refer to skewness in this essay it should be understood as referring to the unstandardized third central moment. We use this slightly inaccurate simplification because the term skewness is less unwieldy than third central moment.

2.2.1.1 An Empirical Specification

We use the conditional pricing equation implied by the two-moment CAPM in equation (2.1) and the definition of each asset's conditional beta (2.2) to construct a set of testable moment restrictions on asset and portfolio returns. We can use these two equations to rewrite the pricing implications of the two-moment one-factor CAPM in terms of each asset's conditional covariance with the market and the conditional price of covariance risk:

$$E_{t-1}(r_{i,t}) = \mathrm{Cov}_{t-1}(r_{i,t}, r_{m,t})\,\frac{E_{t-1}(r_{m,t})}{\mathrm{Var}_{t-1}(r_{m,t})},$$

or using our simplified notation:

$$E_{t-1}(r_{i,t}) = \mathrm{Cov}_{t-1}(r_{i,t}, r_{m,t})\,\frac{\mu_t}{\sigma_t^2}. \tag{2.3}$$

All market-wide variables are now grouped together. The benefit of this seemingly innocuous rearrangement will soon become apparent.

The basic approach we employ to construct our empirical procedure is similar in spirit to that of Harvey (1989) for the basic single-factor two-moment CAPM, and the extension to multiple-factor two-moment asset pricing models by He et al. (1996). The approach is to construct moment conditions implied by the CAPM without explicitly modelling the covariances. This is done by defining the moment conditions as the expectation of the conditional pricing equation. The relevant parameters are then estimated by minimizing the distance between the theoretically implied moments and the sample moments. The law of iterated expectations allows us to avoid modelling conditional covariances, since

$$E[\mathrm{Cov}_{t-1}(r_{i,t}, r_{m,t})] = E\big[E_{t-1}[(r_{i,t} - E_{t-1}(r_{i,t}))(r_{m,t} - E_{t-1}(r_{m,t}))]\big] = E\big[(r_{i,t} - E_{t-1}(r_{i,t}))(r_{m,t} - E_{t-1}(r_{m,t}))\big].$$

We are able to construct a test of the CAPM by modelling the mean and the variance of the market, leaving the exact dynamics of the conditional covariances unspecified.

One drawback of this procedure, as employed by Harvey (1989) and Ghysels (1998), is that it requires modelling the conditional expected returns on the individual securities. In most empirical studies, including the two previously mentioned, these are modelled as linear in the chosen instrumental variables. Harvey (1989) notes that this linear projection is the conditional expectation if the returns and instruments are jointly spherically invariant. However, this seems fairly restrictive. If the postulates required for the application of this method are met, why do we need the CAPM at all? If the conditions are not met, then we are only using an approximation. Fortunately, we are able to avoid modelling any asset-specific parameters and still construct a conditional test of the CAPM.
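The law-of-iterated-expectations step can be checked with a small simulation. The following is a minimal sketch under assumed dynamics that are not taken from this essay: a single hypothetical instrument `z_lag`, unit conditional variances, and a conditional covariance that moves with the instrument. It only illustrates that the average product of return innovations recovers the average conditional covariance, so the covariance dynamics themselves never have to be specified.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200_000

# Hypothetical lagged instrument and a conditional covariance driven by it
# (illustrative choice; always inside (0.1, 0.5)).
z_lag = rng.standard_normal(T)
cond_cov = 0.3 + 0.2 * np.tanh(z_lag)

# Innovations with unit conditional variances and conditional covariance cond_cov.
n1 = rng.standard_normal(T)
n2 = rng.standard_normal(T)
e_m = n1
e_i = cond_cov * n1 + np.sqrt(1.0 - cond_cov**2) * n2

# Conditional means that are linear in the instrument (assumed coefficients).
mean_m = 0.5 + 0.1 * z_lag
mean_i = 0.4 + 0.2 * z_lag
r_m = mean_m + e_m
r_i = mean_i + e_i

# E[Cov_{t-1}(r_i, r_m)] versus E[(r_i - E_{t-1} r_i)(r_m - E_{t-1} r_m)].
avg_cond_cov = cond_cov.mean()
avg_innovation_product = np.mean((r_i - mean_i) * (r_m - mean_m))
print(avg_cond_cov, avg_innovation_product)
```

The two printed numbers should agree up to sampling error, even though the second is computed without any reference to `cond_cov`.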
The simplification we employ to avoid modelling $E_{t-1}(r_{i,t})$ uses the well-known representation of the covariance between two random variables:

$$\mathrm{Cov}(A, B) = E[(A - E(A))(B - E(B))] = E[AB] - E(A)E(B),$$

which suggests using the alternate definition of the covariance operator:

$$\mathrm{Cov}(A, B) = E[A(B - E(B))]. \tag{2.4}$$

We are now able to rewrite the moment restrictions of the conditional CAPM without modelling any asset-specific parameters:

$$E_{t-1}\Big[r_{i,t} - r_{i,t}(r_{m,t} - \mu_t)\frac{\mu_t}{\sigma_t^2}\Big] = 0 \quad \text{for } i = 1, \ldots, N. \tag{2.5}$$

If the CAPM is valid, then the conditional pricing errors $r_{i,t} - r_{i,t}(r_{m,t} - \mu_t)\mu_t/\sigma_t^2$ should be orthogonal to the information variables used to identify the dynamics of market moments. We can use the orthogonality condition to estimate the parameters. This is done by forming the $N \times L$ orthogonality conditions:

$$E\Big[\Big(r_{i,t} - r_{i,t}(r_{m,t} - \mu_t)\frac{\mu_t}{\sigma_t^2}\Big) \otimes Z_{t-1}\Big] = 0_{L \times 1} \quad \text{for } i = 1, \ldots, N, \tag{2.6}$$

where $0_{L \times 1}$ is an $L$-column vector of zeroes.

The simplest case that we can consider imposes a linear relationship between the moments and the instruments. A linear specification for the mean return on the market that is quite common in the literature (see Harvey (1989), He et al. (1996), Whitelaw (1994), among others) is:

$$\mu_t = Z_{t-1}^{\top}\mu, \tag{2.7}$$

where $\mu$ is an $L$-column vector of parameters (note that the instrument set includes a constant, so the first element of $\mu$ is the intercept). We can construct moment conditions by the assumption that the forecast error²

$$\varepsilon_{m,t} = r_{m,t} - \mu_t \tag{2.8}$$

is orthogonal to the instruments that are used to form it; hence

$$E[(r_{m,t} - \mu_t) \otimes Z_{t-1}] = 0_{L \times 1}. \tag{2.9}$$

² The error is a forecast error since the conditional mean is formed using the instrument set $Z_{t-1}$, which is observable in period $t-1$.

We also model the standard deviation as linear in the instruments, an approach that was also used by Whitelaw (1994),

$$\sigma_t = Z_{t-1}^{\top}\sigma, \tag{2.10}$$

where $\sigma$ is an $L$-column vector of relevant parameters. One main advantage of this specification is that the strict positivity of the variance is imposed by construction. To identify the parameters, we require that the deviation of the squared residual from its conditional expectation, the variance, $\varepsilon_{m,t}^2 - \sigma_t^2$, be orthogonal to the conditioning variables. We construct enough moment conditions to exactly identify the variance parameters:³

$$E[(\varepsilon_{m,t}^2 - \sigma_t^2) \otimes Z_{t-1}] = 0_{L \times 1}. \tag{2.11}$$

³ Note that although we follow Whitelaw (1994) by modelling the standard deviation as being linear in the instruments, we differ in that we use the variance moment restriction rather than $\sqrt{\pi/2}$ times the absolute value of the residual from the conditional forecast of the market's return. Using the absolute value of the return innovation is equivalent to identifying a moment restriction on the standard deviation if the returns are Gaussian. In the context of this essay it would be inconsistent to assume Gaussian disturbances to identify the standard deviation and simultaneously model time variation in the skewness of returns.

These two sets of moment conditions (2.9) and (2.11) exactly identify the mean and variance parameters for market returns. The $(N+2)$-element error vector

$$u_t = \begin{pmatrix} r_{m,t} - Z_{t-1}^{\top}\mu \\ (r_{m,t} - Z_{t-1}^{\top}\mu)^2 - (Z_{t-1}^{\top}\sigma)^2 \\ r_t - r_t\,(r_{m,t} - Z_{t-1}^{\top}\mu)\,\dfrac{Z_{t-1}^{\top}\mu}{(Z_{t-1}^{\top}\sigma)^2} \end{pmatrix}, \tag{2.12}$$

where $r_t$ denotes the $N$-vector of asset excess returns, can be used to calculate the $(N+2)L$ moment constraints used in the two-moment model as

$$g = \frac{1}{T}\sum_{t=1}^{T} u_t \otimes Z_{t-1}, \tag{2.13}$$

which is the sample estimate of the moment constraint that $E[u_t \otimes Z_{t-1}] = 0_{L(N+2) \times 1}$, or alternatively, that $E[h_t] = 0_{L(N+2) \times 1}$, where $h_t = u_t \otimes Z_{t-1}$. Since there are $2L$ parameters to be estimated and $(N+2)L$ moment conditions, the system is overidentified with $NL$ overidentifying restrictions.

2.2.1.2 Modelling Mean and Variance, or Mean and Price of Covariance Risk

He et al. (1996) (HKNZ hereafter) present a similar approach to testing the conditional CAPM where they explicitly model both the conditional mean of the market and the price of covariance risk ($\lambda_t = \mu_t/\sigma_t^2$) as linear functions of the instrumental variables. Our approach, in contrast, is to model the mean and the standard deviation themselves, and impose the theoretical constraint that the price of covariance risk be equal to the ratio of the conditional mean to the conditional variance. Both specifications have the same number of parameters, but HKNZ's approach has only $(N+1)L$ moment conditions, $L$ fewer than our approach. The difference is that our approach imposes further restrictions on the variance as well.

In their specification, HKNZ model the price of covariance risk, $\lambda_t = \mu_t/\sigma_t^2$, as

$$\lambda_t = Z_{t-1}^{\top}\lambda. \tag{2.14}$$

To implement their approach one needs the conditional mean return to the market and the price of covariance risk, giving the pricing restriction on risky asset $i$ as:

$$E[(r_{i,t} - r_{i,t}(r_{m,t} - \mu_t)\lambda_t) \otimes Z_{t-1}] = 0_{L \times 1}. \tag{2.15}$$

One can rearrange the price of covariance risk to express the conditional variance as a function of the conditional mean and the price of covariance risk,

$$\sigma_t^2 = \frac{\mu_t}{\lambda_t}, \tag{2.16}$$

which has the direct testable implication

$$E\Big[\Big(\varepsilon_{m,t}^2 - \frac{\mu_t}{\lambda_t}\Big) \otimes Z_{t-1}\Big] = 0_{L \times 1}. \tag{2.17}$$

2.2.2 The Three-Moment CAPM

A third response to the CAPM's failure is to consider multi-moment extensions of the basic asset pricing model. It is well known that to achieve an asset pricing model that depends only on mean and variance one must impose strict distributional or preference restrictions. If returns are normal, or if agents have quadratic utility, then the two-moment CAPM obtains. These form two theoretical objections to using the two-moment CAPM. Firstly, returns are notoriously non-normal: they are skewed and leptokurtic.
Secondly, quadratic utility implies that economic agents have increasing absolute risk aversion and will have negative marginal utility of income at some wealth levels, both of which are unpalatable from a behavioural perspective. For these reasons extensions to the basic CAPM that define preferences over higher moments are considered.

We might be satisfied to accept the two-moment CAPM, even though its assumptions are so terribly flawed, if the model worked. But it does not work. A large literature has developed demonstrating the failure of both unconditional and conditional versions of the basic CAPM.

There are a number of unconditional tests of the three-moment CAPM, including Kraus and Litzenberger (1976), Friend and Westerfield (1980) and Lim (1989). Harvey and Siddique (2000b) present some recent work on testing for time-variation in skewness, and Harvey and Siddique (2000a) test whether, in the context of a three-moment conditional CAPM, the market risk premium changes over time. The results from unconditional tests are mixed at best: Kraus and Litzenberger (1976) and Lim (1989) find evidence supporting the three-moment CAPM, while Friend and Westerfield (1980) find evidence inconsistent with the three-moment CAPM. Harvey and Siddique (1999) present an extensive analysis of the effect of conditional coskewness on asset prices. They find both that coskewness accounts for part of the explanatory power of the SMB and HML factors of Fama and French (1993), and that coskewness can explain part of the returns unexplained by these other factors.

The objection to unconditional testing of the two-moment CAPM when a conditional specification is valid holds equally in the three-moment world. Unfortunately, however, to date there has been no true test of the conditional three-moment CAPM. This essay fills this gap.
The closest research on this topic is the non-linear pricing kernel literature, of which Dittmar (2002) is the closest in spirit to our research. Dittmar (2002) presents a test of the stochastic discount factor model, where the discount factor is a cubic function of the return on the NYSE value-weighted stock index and labour growth, à la Jagannathan and Wang (1996). Dittmar (2002) performs conditional tests, allowing the coefficients in the polynomial expansion of the aggregate investor's marginal rate of substitution to be sign-corrected functions of lagged information variables. He finds that one needs to include cokurtosis with labour growth rates to produce an admissible pricing kernel.

The conditional version of the three-moment CAPM is

$$E_{t-1}[r_{i,t}] = \beta_{i,t}\,\mu_{1,t} + \gamma_{i,t}\,\mu_{2,t}, \tag{2.18}$$

where $\mu_{1,t}$ is the price of covariance risk, the risk measured by an asset's $\beta$, and $\mu_{2,t}$ is the price of $\gamma$ risk, the risk measured by an asset's $\gamma$. Just as the beta of an asset is the ratio of its return covariance with the market to the market return variance, an asset's $\gamma$ is defined as the ratio of the coskewness of that asset's return with the market's return to the market's skewness:

$$\gamma_{i,t} = \frac{\mathrm{Coskew}_{t-1}(r_{i,t}, r_{m,t})}{\mathrm{Skew}_{t-1}(r_{m,t})}, \tag{2.19}$$

where $\mathrm{Coskew}_{t-1}(\cdot,\cdot)$ and $\mathrm{Skew}_{t-1}(\cdot)$ are the conditional coskewness and skewness operators respectively:

$$\mathrm{Coskew}_{t-1}(r_{i,t}, r_{m,t}) = E_{t-1}\big[(r_{i,t} - E_{t-1}(r_{i,t}))(r_{m,t} - E_{t-1}(r_{m,t}))^2\big]$$

and

$$\mathrm{Skew}_{t-1}(r_{m,t}) = E_{t-1}\big[(r_{m,t} - E_{t-1}(r_{m,t}))^3\big],$$

and $\mu_{1,t}$ and $\mu_{2,t}$ are the prices of covariance and coskewness risk respectively.

This model is a conditional form of Kraus and Litzenberger's (1976) two-period three-moment CAPM.
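To make the definitions of $\beta$ and $\gamma$ concrete, the following is a minimal sketch that computes their unconditional sample analogues. It deliberately ignores the conditioning information set, uses simulated data, and the function name `beta_gamma` is hypothetical; it is not the estimator used in this essay.

```python
import numpy as np

def beta_gamma(r_i, r_m):
    """Unconditional sample analogues of an asset's beta and gamma."""
    e_i = r_i - r_i.mean()
    e_m = r_m - r_m.mean()
    cov = np.mean(e_i * e_m)          # covariance with the market
    var = np.mean(e_m ** 2)           # market variance
    coskew = np.mean(e_i * e_m ** 2)  # coskewness with the market
    skew = np.mean(e_m ** 3)          # unstandardized market skewness
    return cov / var, coskew / skew

rng = np.random.default_rng(0)
# Simulated, positively skewed "market" excess returns (illustrative only).
r_m = rng.gamma(shape=2.0, scale=1.0, size=5000) - 2.0
# A simulated asset that loads 0.8 on the market plus symmetric noise.
r_i = 0.8 * r_m + rng.normal(scale=0.5, size=5000)

beta_m, gamma_m = beta_gamma(r_m, r_m)
beta_i, gamma_i = beta_gamma(r_i, r_m)
print(beta_m, gamma_m, beta_i, gamma_i)
```

By construction the market's own beta and gamma are both equal to one, while for the simulated asset both measures should be close to its assumed loading of 0.8.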
As will be demonstrated below in appendix 2.A, such a conditional specification is appropriate because it is equivalent to a pricing kernel which is a quadratic function of the market return.

Investors who exhibit non-increasing absolute risk aversion have a preference for positive skewness. This implies that $\mu_{2,t}$ should have the opposite sign of the conditional market skewness, $\kappa_t^3$, since in equilibrium investors are willing to sacrifice returns when the market is positively skewed, but demand a premium when returns are negatively skewed.

The three-moment CAPM has a direct implication for the market risk premium. Since $\beta_{m,t} = \gamma_{m,t} = 1$, the three-moment CAPM implies that the excess return on the market portfolio is

$$E_{t-1}(r_{m,t}) = \mu_{1,t} + \mu_{2,t}. \tag{2.20}$$

This has some very useful implications. One puzzling result in the conditional asset pricing test literature is the tendency of conditional expected market returns to be negative over extended periods; see, e.g., He et al. (1996) and Harvey and Siddique (2000b). This result is inconsistent with investors being risk averse and caring only about mean and variance. However, when investors have a preference for positive skewness, if the market is sufficiently positively skewed the negative price of skewness risk may dominate the positive price of variance risk, resulting in a negative market risk premium. This idea is explored by Harvey and Siddique (2000b).

Fang and Lai (1997) develop a four-moment CAPM and present some empirical tests. Dittmar (2002) tests the empirical performance of a pricing kernel which is a cubic function of the return to a value-weighted stock index and growth in labour income. A cubic pricing kernel is equivalent to a four-moment CAPM. The four-moment CAPM, like its two- and three-moment cousins, is derived in a representative agent framework.
When all agents are risk averse, the representative agent's utility function will also exhibit risk aversion. A similar result was proven by Kraus and Litzenberger (1983, Theorem 1): when all individual agents exhibit non-increasing absolute risk aversion, the representative agent in a Pareto-efficient allocation will similarly exhibit non-increasing absolute risk aversion. Thus, preference for positive skewness aggregates. However, we know of no proof which demonstrates that aversion to kurtosis aggregates.

Scott and Horvath (1980) have a proof of the direction of preferences for moments higher than variance that is quite general. Their proof relies on the assumption that agents have consistent preferences, that is, that the direction of preference for a particular moment of the distribution of random wealth does not change for different wealth levels. Given that this assumption holds, Scott and Horvath (1980) apply the mean value theorem to show that one can derive the direction of preferences for all higher moments recursively, and in particular that investors with a preference for mean and an aversion to variance will have a preference for positive skewness and an aversion to kurtosis. However, this result does not directly transfer to a representative agent framework with a Pareto-efficient allocation. It is not even clear that a representative agent will have consistent preferences over skewness.

To derive a higher moment CAPM we must work in a representative agent framework. Therefore, even though we have strong reasons to believe that individual investors are averse to kurtosis, and returns clearly have fat tails, it is not clear that a four-moment CAPM is appropriate, since we do not know whether this preference will aggregate. For this reason the empirical focus of this essay is on the conditional three-moment CAPM.
We do, however, show how our approach can be extended to still higher moment asset pricing frameworks in a fairly trivial fashion.

2.2.2.1 An Empirical Implementation

To implement empirically the three-moment CAPM, we must model the prices of covariance and coskewness risk, along with the conditional variance and skewness of the market. We continue with our linear standard deviation specification, and proceed with a similar specification for the conditional skewness of market returns: that it is linear in our instrument set,⁴

$$\kappa_t^3 = Z_{t-1}^{\top}\kappa. \tag{2.21}$$

⁴ An alternative to our choice of the linear specification is the autoregressive conditional skewness model of Harvey and Siddique (1999).

We also model the price of covariance risk $\mu_{1,t}$ as linear in the instrument set:

$$\mu_{1,t} = Z_{t-1}^{\top}\mu_1. \tag{2.22}$$

Investors who exhibit non-increasing absolute risk aversion will have a preference for positive skewness and an aversion to negative skewness. The price of $\gamma$ risk, $\mu_{2,t}$, will therefore have the opposite sign of the conditional skewness of the market's return: when the market is negatively skewed, $\mu_{2,t}$ is positive, and when the market is positively skewed, $\mu_{2,t}$ is negative. For this reason we choose to model $\mu_{2,t}$ as

$$\mu_{2,t} = Z_{t-1}^{\top}\mu_2 + 1_{\{Z_{t-1}^{\top}\kappa > 0\}}\,\delta, \tag{2.23}$$

where $1_{\{Z_{t-1}^{\top}\kappa > 0\}}$ is an indicator variable taking the value one when the conditional skewness of market returns is positive, and zero otherwise. Because we include a constant in the information set, the effect of the indicator variable's coefficient $\delta$ is to allow the constant to change depending on the sign of the conditional skewness.

We can now redefine the conditional mean of the market as $\mu_t = \mu_{1,t} + \mu_{2,t}$, which, along with the redefined market return innovation $\varepsilon_{m,t} = r_{m,t} - \mu_{1,t} - \mu_{2,t}$, enables us to form the mean and variance restrictions given in (2.9) and (2.11), noting the alternative form for the conditional mean:

$$E\big[(r_{m,t} - Z_{t-1}^{\top}\mu_1 - Z_{t-1}^{\top}\mu_2 - 1_{\{Z_{t-1}^{\top}\kappa > 0\}}\,\delta) \otimes Z_{t-1}\big] = 0_{L \times 1}. \tag{2.24}$$

This system can be augmented by the similarly defined moment restrictions on the market's conditional skewness, i.e.,

$$E[(\varepsilon_{m,t}^3 - \kappa_t^3) \otimes Z_{t-1}] = 0_{L \times 1}. \tag{2.25}$$

We now need some way to simplify the conditional coskewness to avoid modelling the conditional mean of the risky asset's returns, analogous to our simplification of the conditional covariance. This is achieved by expanding the conditional coskewness:

$$\begin{aligned}
\mathrm{Coskew}_{t-1}(r_{i,t}, r_{m,t}) &= E_{t-1}\big[(r_{i,t} - E_{t-1}r_{i,t})(r_{m,t} - E_{t-1}r_{m,t})^2\big] \\
&= E_{t-1}\big[r_{i,t}(r_{m,t} - E_{t-1}r_{m,t})^2\big] - E_{t-1}[r_{i,t}]\,E_{t-1}\big[(r_{m,t} - E_{t-1}r_{m,t})^2\big] \\
&= E_{t-1}\big[r_{i,t}\{(r_{m,t} - \mu_t)^2 - \sigma_t^2\}\big],
\end{aligned}$$

which conveniently avoids the need to model any asset-specific moments in constructing a test of the three-moment CAPM. This, along with the specification of the conditional mean, variance and skewness, is sufficient to construct the moment restrictions on the cross-sectional returns implied by the three-moment CAPM, which are

$$E\Big[\Big(r_{i,t} - r_{i,t}\,\varepsilon_{m,t}\frac{\mu_{1,t}}{\sigma_t^2} - r_{i,t}(\varepsilon_{m,t}^2 - \sigma_t^2)\frac{\mu_{2,t}}{\kappa_t^3}\Big) \otimes Z_{t-1}\Big] = 0_{L \times 1} \quad \text{for } i = 1, \ldots, N. \tag{2.26}$$

We can combine this moment restriction with the identifying restrictions on the conditional mean (2.24), variance (2.11), and skewness (2.25) of market returns to redefine the error vector

$$u_t = \begin{pmatrix} r_{m,t} - Z_{t-1}^{\top}\mu_1 - Z_{t-1}^{\top}\mu_2 - 1_{\{Z_{t-1}^{\top}\kappa > 0\}}\,\delta \\ \varepsilon_{m,t}^2 - (Z_{t-1}^{\top}\sigma)^2 \\ \varepsilon_{m,t}^3 - Z_{t-1}^{\top}\kappa \\ r_t - r_t\,\varepsilon_{m,t}\dfrac{Z_{t-1}^{\top}\mu_1}{(Z_{t-1}^{\top}\sigma)^2} - r_t\,(\varepsilon_{m,t}^2 - (Z_{t-1}^{\top}\sigma)^2)\dfrac{Z_{t-1}^{\top}\mu_2 + 1_{\{Z_{t-1}^{\top}\kappa > 0\}}\,\delta}{Z_{t-1}^{\top}\kappa} \end{pmatrix}, \tag{2.27}$$

where $r_t$ denotes the $N$-vector of asset excess returns. The error vector, which now has $N+3$ elements, can be used to calculate the $(N+3) \times L$ moment conditions $h_t = u_t \otimes Z_{t-1}$ implied by the three-moment CAPM. The sample moments are estimated just as in the two-moment case, using equation (2.13).
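The construction of the sample moments for the three-moment model can be sketched in code. The following is a minimal illustration under assumptions that are not part of this essay: the function name `three_moment_sample_moments`, the stacking order of the parameter vector, and the simulated data are all hypothetical, and no weighting matrix or minimization step is included. It only shows how the error vector and the instruments combine into the $(N+3)L$ sample moments.

```python
import numpy as np

def three_moment_sample_moments(params, r, r_m, Z_lag):
    """Sample moments g = mean_t(u_t kron Z_{t-1}) for the three-moment model.

    r: (T, N) asset excess returns; r_m: (T,) market excess returns;
    Z_lag: (T, L) lagged instruments including a constant.
    params stacks mu1 (L), mu2 (L), delta (1), sigma (L), kappa (L)
    in that (assumed) order, i.e. 4L + 1 parameters.
    """
    T, L = Z_lag.shape
    mu1, mu2 = params[:L], params[L:2 * L]
    delta = params[2 * L]
    sigma = params[2 * L + 1:3 * L + 1]
    kappa = params[3 * L + 1:4 * L + 1]

    kappa3_t = Z_lag @ kappa                      # conditional skewness
    mu1_t = Z_lag @ mu1                           # price of covariance risk
    mu2_t = Z_lag @ mu2 + delta * (kappa3_t > 0)  # price of gamma risk
    sigma2_t = (Z_lag @ sigma) ** 2               # conditional variance
    eps = r_m - mu1_t - mu2_t                     # market return innovation

    # Market mean, variance and skewness errors.
    u_market = np.column_stack([eps, eps**2 - sigma2_t, eps**3 - kappa3_t])
    # Asset pricing errors, one column per asset.
    cov_term = (eps * mu1_t / sigma2_t)[:, None]
    coskew_term = ((eps**2 - sigma2_t) * mu2_t / kappa3_t)[:, None]
    u_assets = r - r * cov_term - r * coskew_term

    u = np.column_stack([u_market, u_assets])     # (T, N + 3)
    h = u[:, :, None] * Z_lag[:, None, :]         # (T, N + 3, L)
    return h.mean(axis=0).ravel()                 # ((N + 3) * L,)

# Simulated data and arbitrary parameter values, for shape checking only.
rng = np.random.default_rng(1)
T, N, L = 500, 4, 2
Z_lag = np.column_stack([np.ones(T), rng.standard_normal(T)])
r_m = 0.5 + rng.standard_normal(T)
r = 0.9 * r_m[:, None] + rng.standard_normal((T, N))
params = np.concatenate([[0.3, 0.0], [0.2, 0.0], [-0.1], [1.0, 0.05], [0.5, 0.1]])
g = three_moment_sample_moments(params, r, r_m, Z_lag)
print(g.shape)
```

In an actual estimation this vector would be passed to a GMM objective of the form $g^{\top} W g$ and minimized over the parameters; that step is omitted here.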
There are $(N+3)L$ moment conditions and only $4L+1$ parameters to be estimated, which leaves $(N-1)L - 1$ overidentifying moment restrictions that can be used in forming model specification tests.

2.2.2.2 An Alternative Empirical Specification

Just as we can specify the two-moment CAPM by modelling the conditional mean and the price of covariance risk, as discussed in section 2.2.1.2, so too we can specify an empirical version of the three-moment CAPM which requires modelling the prices of covariance and coskewness risk:

$$\lambda_{1,t} = \frac{\mu_{1,t}}{\sigma_t^2} \tag{2.28}$$

and

$$\lambda_{2,t} = \frac{\mu_{2,t}}{\kappa_t^3}. \tag{2.29}$$

We need the conditional mean of the market returns to use our simplified representation of the conditional covariance, and the conditional mean and variance of market returns to use the simplified representation of the conditional coskewness of each asset. We thus have four quantities to model separately. We employ a linear specification of each of these four terms (recall that it is the standard deviation and not the variance that is modelled as a linear function of the instruments), which requires $4L$ parameters. The two new parameter vectors are $\lambda_1$ and $\lambda_2$, such that

$$\lambda_{1,t} = Z_{t-1}^{\top}\lambda_1 \tag{2.30}$$

and

$$\lambda_{2,t} = Z_{t-1}^{\top}\lambda_2. \tag{2.31}$$

The previous specification required $4L+1$ parameters, so this specification is more parsimonious. The set of moment restrictions needed to estimate the parameters in this model are as in equations (2.9) and (2.11), along with the following:

$$E\big[(r_t - r_t\,\varepsilon_{m,t}\lambda_{1,t} - r_t\,(\varepsilon_{m,t}^2 - \sigma_t^2)\lambda_{2,t}) \otimes Z_{t-1}\big] = 0_{NL \times 1}, \tag{2.32}$$

where $r_t$ denotes the $N$-vector of asset excess returns.

A second advantage of this specification is that it is straightforward to apply the restrictions on the two co-moment risk prices. Both the price of covariance risk $\mu_{1,t}$ and the conditional variance of the market are positive, so their ratio must be positive: $\lambda_{1,t} > 0$.
Investors have a preference for positive skewness and an aversion to negative skewness, implying that the price of $\gamma$ risk will have the opposite sign of the conditional skewness of market returns, so the market's price of coskewness risk must be negative, i.e., $\lambda_{2,t} < 0$. The way this essay sign-corrects the prices of co-moment risk is:

$$\lambda_{1,t} = (Z_{t-1}^{\top}\lambda_1)^2 \tag{2.33}$$

$$\lambda_{2,t} = -(Z_{t-1}^{\top}\lambda_2)^2. \tag{2.34}$$

One difference between the market moment approach discussed earlier and the current price of co-moment approach is that the former has a moment restriction on the conditional skewness of market returns while the latter does not. The reason for this is that we need the conditional skewness of market returns to specify the price of $\gamma$ risk in the market moment specification, while we explicitly model the price of coskewness risk in the current approach. However, because the price of coskewness risk is defined as the ratio of the price of $\gamma$ risk to the market's conditional skewness, we can augment the current moment restrictions using the following:⁵

$$E\Big[\Big(\varepsilon_{m,t}^3 - \frac{\mu_t - \lambda_{1,t}\sigma_t^2}{\lambda_{2,t}}\Big) \otimes Z_{t-1}\Big] = 0_{L \times 1}. \tag{2.35}$$

This expanded set of moment restrictions has the same dimension as the market moment specification of the skewness CAPM test.

2.2.3 Multi-Moment Extension

Consider a general multi-moment asset pricing model which is a direct generalization of the two- and three-moment CAPMs. Assume that the general form of our conditional multi-moment asset pricing model posits that conditional expected returns take the following form:

$$E_{t-1}(r_{i,t}) = \sum_{k=2}^{K} \beta_{i,t}^{k}\, b_{k-1,t}, \tag{2.36}$$

where

$$\beta_{i,t}^{k} = \frac{E_{t-1}\big[(r_{i,t} - E_{t-1}r_{i,t})(r_{m,t} - E_{t-1}r_{m,t})^{k-1}\big]}{E_{t-1}\big[(r_{m,t} - E_{t-1}r_{m,t})^{k}\big]} \tag{2.37}$$

is the standardized measure of risk which is the contribution of holding each asset $i$ to the $k$-th moment of a diversified portfolio.
These $\beta$s have a number of well-known special cases:

• $\beta_{i,t}^{2}$, which is the traditional two-moment CAPM's beta,

• $\beta_{i,t}^{3}$, which is $\gamma$ in the Kraus and Litzenberger (1976) three-moment CAPM, is the ratio of coskewness with market returns to market return skewness,

• $\beta_{i,t}^{4}$, which is the ratio of cokurtosis to kurtosis, is a term in the four-moment CAPM of Fang and Lai (1997).

⁵ These moment restrictions follow from rewriting the price of covariance risk as $\mu_{1,t} = \lambda_{1,t}\sigma_t^2$ and using this in connection with the fact that in the three-moment CAPM $\mu_t = \mu_{1,t} + \mu_{2,t}$ to obtain $\mu_{2,t} = \mu_t - \lambda_{1,t}\sigma_t^2$, and using this to rewrite equation (2.29), giving the conditional skewness of market returns implied by the other parameters: $\kappa_t^3 = (\mu_t - \lambda_{1,t}\sigma_t^2)/\lambda_{2,t}$.

Such a pricing equation is consistent with some asset pricing model in which a representative agent cares about the first $K$ moments of the distribution of their portfolio. Let $b_{k-1,t}$ be the time $t$ price of the $k$-th moment risk. Further, denote by $\mu_t$ the expected return on the market portfolio:

$$E_{t-1}(r_{m,t}) = \mu_t. \tag{2.38}$$

Note that $\beta_{m,t}^{k} = 1 \ \forall k$ holds trivially, and we then have the following representation of the market return:

$$\mu_t = \sum_{k=2}^{K} b_{k-1,t}. \tag{2.39}$$

Note that the summation starts at $k = 2$, corresponding to the first priced moment risk being variance. We will apply the same approach to forming testable moment restrictions as in the two- and three-moment CAPMs. Take the $k$-th summand $\beta_{i,t}^{k} b_{k-1,t}$ and rewrite it as

$$E_{t-1}\big[(r_{i,t} - E_{t-1}r_{i,t})(r_{m,t} - E_{t-1}r_{m,t})^{k-1}\big]\,\frac{b_{k-1,t}}{\mu_{k,t}}, \tag{2.40}$$

where

$$\mu_{k,t} = E_{t-1}\big[(r_{m,t} - E_{t-1}(r_{m,t}))^{k}\big], \tag{2.41}$$

which is useful in rewriting (2.40) as

$$E_{t-1}\big[r_{i,t}\{(r_{m,t} - \mu_t)^{k-1} - \mu_{k-1,t}\}\big]\,\frac{b_{k-1,t}}{\mu_{k,t}}. \tag{2.42}$$

Taking the unconditional expectation and simplifying gives the following moment restrictions:

$$E\Big[r_{i,t}\Big(1 - \sum_{k=2}^{K}\{(r_{m,t} - \mu_t)^{k-1} - \mu_{k-1,t}\}\frac{b_{k-1,t}}{\mu_{k,t}}\Big)\Big] = 0, \tag{2.43}$$

where $\mu_{1,t} = 0$, since $\mu_{k,t}$ refers to the $k$-th central moment and the first central moment is zero, i.e.

$$E[r_{m,t} - \mu_t] = 0. \tag{2.44}$$

Theory permits us to sign the price of the first four moment risks. We can augment the analysis with the instrument set $Z_{t-1}$, an $L$-dimensional vector of information variables available at time $t-1$, since the pricing errors in (2.43) should be orthogonal to all such information:

$$E\Big[\Big(r_{i,t}\Big(1 - \sum_{k=2}^{K}\{(r_{m,t} - \mu_t)^{k-1} - \mu_{k-1,t}\}\frac{b_{k-1,t}}{\mu_{k,t}}\Big)\Big) \otimes Z_{t-1}\Big] = 0_{L \times 1}. \tag{2.45}$$

We also use the $K \times L$ moment restrictions implied by our linear conditional moment parameterization:

$$E\Big[\Big(r_{m,t} - \sum_{k=1}^{K-1} b_{k,t}\Big) \otimes Z_{t-1}\Big] = 0_{L \times 1} \tag{2.46}$$

$$E\big[((r_{m,t} - \mu_t)^{k} - \mu_{k,t}) \otimes Z_{t-1}\big] = 0_{L \times 1} \quad \forall k. \tag{2.47}$$

We therefore have a set of $(N+K) \times L$ moment conditions.

2.3 Data and Estimation Methodology

To test the empirical performance of the asset pricing models discussed above, we follow the traditional approach of using portfolios rather than individual assets, which dates back to the seminal work of Black, Jensen, and Scholes (1973) and Fama and MacBeth (1973). The principal advantage of using portfolios rather than individual assets is the reduction in measurement error associated with individual assets. We consider two portfolio formation approaches. The first data source consists of seventeen portfolios formed by sorting the stocks on the NYSE, AMEX and NASDAQ according to industries. The second sorting procedure forms twenty-five portfolios (herein FF25) which are allocated on the basis of their size and the ratio of their book equity to their market equity. This data set has been particularly troublesome to price.
This data set has been studied conditionally by, among others, He et al (1996) who found that a conditional two-moment C A P M and Fama and French (1993) threeand five-factor models failed to price these portfolios, and Ferson and Harvey (1999) who used a linear beta specification in testing the performance of again a two-moment C A P M and the F F three-factor model. The next step in our empirical study is to define what information variables we will  Essay 2. Conditional Coskewness and Asset Prices.  62  consider. We use a set of six information variables which are fairly standard in the literature, in fact they are taken directly from He et al (1996), but many other studies use similar information variables. We include a constant, the S&P500 index return (S&P), the dividend price ratio on the S&P500 index, defined as the cumulative 12month dividends divided by the current price level (Div), the term spread measured by the difference on yields on a three-month treasury bill and a one-month treasury bill rate (Term), the junk bond yield spread measured as the difference in yields on Baa rated bonds over Aaa investment grade bonds (Junk), and finally the one-month treasury bill rate (Tbill). Each variable is standardized to have zero same mean and unit sample variance to simplify interpretation of the parameter estimates. To ensure that the inference is conducted using only publicly available information, we use only lagged values of the information variables; hence the use of the t subscript to refer to asset and portfolio returns, e.g. r , and the use of t — 1 as the information variable t  time subscript  Z -\. t  To illustrate that these variables have power to explain the industry portfolios consider Table 2.1 where we report the results of running a very simple linear regression where we use the lagged economic information variables to predict individual stock returns as in £ _ i ( r ) = ZjLrfi t  M  for i = 1,..., n.  
(2.48)

The parameters on all 17 and then 25 portfolios are estimated jointly, and the heteroscedasticity-consistent standard errors are reported in parentheses. We also report the results of a Wald test along with its asymptotic p-value. The results reported in this table are consistent with the vast majority of empirical studies, which show that economic variables are quite powerful in explaining the variation in individual portfolio returns.

Table 2.1: Portfolio Return Predictability
This table presents the parameter estimates and GMM standard errors from projecting excess returns on industry and size and book-to-market sorted portfolios onto a set of six instrumental variables: $E[(r_{i,t} - Z_{t-1}^\top \mu_i) \otimes Z_{t-1}] = 0_{L \times 1}$ for $i = 1, \dots, N$ over the period July 1963 to December 1997. Standard errors are in parentheses; in the Wald column the parentheses contain the asymptotic p-value.

Portfolio    Constant          S&P               Div               Term              Junk              Tbill             Wald
Industry 1    0.7248 (0.2178)  -0.3013 (0.3142)   0.0903 (0.3830)  -0.6750 (0.3077)   0.9747 (0.3499)  -0.7976 (0.3483)  25.8316 (0.0002)
Industry 2    0.4771 (0.2989)   0.2886 (0.3789)   1.2207 (0.5215)  -1.1040 (0.4513)   0.4354 (0.5148)  -1.7703 (0.5253)  21.0752 (0.0018)
Industry 3    0.6338 (0.2458)   0.0283 (0.3470)   0.6303 (0.4385)  -0.4616 (0.3687)   0.2771 (0.4070)  -1.0617 (0.4781)  20.4857 (0.0023)
Industry 4    0.5645 (0.2992)   0.3296 (0.3712)   1.2589 (0.5220)  -0.9194 (0.3705)   1.2587 (0.4768)  -2.0053 (0.4502)  41.3497 (0.0000)
Industry 5    0.5641 (0.2573)   0.1118 (0.3511)   0.3971 (0.5096)  -1.1035 (0.3516)   1.0408 (0.4400)  -1.6670 (0.4143)  31.6829 (0.0000)
Industry 6    0.5131 (0.2463)  -0.1269 (0.3605)   0.2686 (0.4619)  -0.7539 (0.3487)   1.1003 (0.4236)  -1.3621 (0.4073)  22.9550 (0.0008)
Industry 7    0.7511 (0.2331)  -0.0458 (0.3289)   0.0802 (0.4355)  -0.7335 (0.3394)   0.4884 (0.3726)  -0.6089 (0.3622)  18.5021 (0.0051)
Industry 8    0.5539 (0.2808)   0.0034 (0.3785)   0.9781 (0.5104)  -0.9066 (0.4079)   0.9186 (0.4809)  -1.8459 (0.4645)  27.5108 (0.0001)
Industry 9    0.2647 (0.2851)   0.0336 (0.3803)   0.2554 (0.5159)  -0.5607 (0.4047)   0.8893 (0.4926)  -1.2372 (0.5236)  10.6757 (0.0989)
Industry 10   0.5501 (0.2481)  -0.0045 (0.3346)   0.8257 (0.4334)  -0.9154 (0.3456)   0.7092 (0.4213)  -1.6214 (0.3850)  30.9391 (0.0000)
Industry 11   0.4782 (0.2658)   0.1104 (0.3526)   0.9240 (0.5005)  -1.0816 (0.3583)   1.0130 (0.4289)  -2.1982 (0.4074)  41.4714 (0.0000)
Industry 12   0.4522 (0.2656)   0.1660 (0.3229)   0.6288 (0.4766)  -0.8898 (0.3394)   1.6179 (0.4038)  -2.1989 (0.4256)  57.4810 (0.0000)
Industry 13   0.5836 (0.2865)   0.2049 (0.3891)   1.2267 (0.5199)  -0.9892 (0.3713)   0.7772 (0.4596)  -1.9077 (0.4555)  36.4546 (0.0000)
Industry 14   0.3492 (0.1876)  -0.5594 (0.2629)   0.1582 (0.2926)  -0.6461 (0.2256)   0.7628 (0.2750)  -0.7014 (0.3238)  16.1490 (0.0130)
Industry 15   0.6180 (0.2755)   0.1161 (0.3718)   0.4485 (0.5012)  -0.7395 (0.3792)   1.3945 (0.4524)  -1.4252 (0.4488)  27.6342 (0.0001)
Industry 16   0.6192 (0.2526)  -0.1072 (0.3483)   0.4869 (0.4386)  -0.5320 (0.3268)   0.7794 (0.4210)  -1.1685 (0.3839)  20.6450 (0.0021)
Industry 17   0.4961 (0.2169)  -0.1107 (0.3089)   0.5429 (0.3968)  -0.7213 (0.2893)   0.8903 (0.3610)  -1.2989 (0.3424)  29.7533 (0.0000)
S1-BM1        0.2037 (0.3702)   0.6460 (0.4479)   2.4506 (0.6178)  -1.3905 (0.4388)   0.5331 (0.5549)  -2.8881 (0.5821)  40.9238 (0.0000)
S1-BM2        0.7141 (0.3242)   0.4656 (0.4000)   1.7798 (0.5491)  -1.1741 (0.4012)   0.7904 (0.4922)  -2.3763 (0.5183)  44.5149 (0.0000)
S1-BM3        0.8198 (0.2953)   0.3787 (0.3679)   1.4482 (0.5135)  -1.1260 (0.3622)   0.9740 (0.4727)  -2.2543 (0.4788)  46.9675 (0.0000)
S1-BM4        0.9948 (0.2771)   0.3548 (0.3612)   1.2032 (0.4778)  -1.0710 (0.3697)   0.8614 (0.4595)  -2.0297 (0.4653)  47.3004 (0.0000)
S1-BM5        1.1271 (0.2951)   0.4425 (0.3799)   1.4269 (0.5102)  -1.2424 (0.3971)   0.8220 (0.5074)  -2.2928 (0.5023)  49.8160 (0.0000)
S2-BM1        0.3821 (0.3484)   0.0992 (0.4309)   1.7609 (0.6079)  -1.2868 (0.4276)   0.8400 (0.5215)  -2.5149 (0.5306)  34.9017 (0.0000)
S2-BM2        0.6772 (0.2947)   0.1332 (0.3852)   1.5589 (0.5337)  -1.0412 (0.3782)   0.9467 (0.4598)  -2.2047 (0.4693)  45.4474 (0.0000)
S2-BM3        0.9374 (0.2682)   0.0378 (0.3522)   1.0865 (0.4808)  -1.0347 (0.3462)   1.0407 (0.4482)  -2.0019 (0.4302)  50.8823 (0.0000)
S2-BM4        1.0147 (0.2500)   0.0429 (0.3366)   0.9763 (0.4407)  -0.9638 (0.3360)   0.9018 (0.4227)  -1.7180 (0.4116)  47.8487 (0.0000)
S2-BM5        1.0777 (0.2839)   0.0743 (0.3623)   1.2290 (0.4921)  -1.0013 (0.3690)   0.7734 (0.5001)  -1.8455 (0.4853)  39.7271 (0.0000)
S3-BM1        0.4410 (0.3159)   0.0196 (0.4146)   1.4662 (0.5523)  -1.0420 (0.3990)   0.9730 (0.4842)  -2.2726 (0.4864)  33.6775 (0.0000)
S3-BM2        0.7267 (0.2646)   0.0014 (0.3518)   1.1293 (0.4807)  -0.9286 (0.3458)   1.0642 (0.4363)  -1.9549 (0.4179)  45.0165 (0.0000)
S3-BM3        0.7697 (0.2408)   0.0117 (0.3225)   0.9423 (0.4307)  -0.8856 (0.3227)   0.8289 (0.4104)  -1.7341 (0.4015)  44.0586 (0.0000)
S3-BM4        0.9006 (0.2302)  -0.0101 (0.3136)   0.7537 (0.3966)  -0.8543 (0.3152)   0.8228 (0.3938)  -1.5396 (0.3844)  43.6049 (0.0000)
S3-BM5        1.0251 (0.2655)  -0.0048 (0.3664)   0.9827 (0.4738)  -0.8958 (0.3721)   0.7449 (0.4795)  -1.5320 (0.4523)  35.8926 (0.0000)
S4-BM1        0.4768 (0.2780)  -0.1081 (0.3696)   1.0633 (0.4801)  -1.0029 (0.3818)   0.8569 (0.4408)  -1.8173 (0.4326)  29.0162 (0.0001)
S4-BM2        0.5049 (0.2503)  -0.0263 (0.3489)   0.9644 (0.4580)  -0.8431 (0.3398)   0.9447 (0.4145)  -1.7503 (0.3962)  36.9114 (0.0000)
S4-BM3        0.7573 (0.2327)  -0.1347 (0.3398)   0.7021 (0.4187)  -0.7972 (0.3298)   0.9602 (0.3945)  -1.5363 (0.3913)  38.5737 (0.0000)
S4-BM4        0.8656 (0.2219)  -0.2094 (0.3008)   0.6032 (0.3829)  -0.8170 (0.2942)   0.9314 (0.3775)  -1.5744 (0.3710)  39.9430 (0.0000)
S4-BM5        0.9581 (0.2569)   0.0296 (0.3531)   0.8627 (0.4297)  -0.9143 (0.3561)   0.9158 (0.4419)  -1.6194 (0.4386)  38.0299 (0.0000)
S5-BM1        0.4743 (0.2252)   0.0315 (0.3224)   0.2809 (0.4243)  -0.8766 (0.3251)   0.8125 (0.3655)  -1.2478 (0.3502)  26.7015 (0.0002)
S5-BM2        0.4814 (0.2154)  -0.1517 (0.2987)   0.3667 (0.4090)  -0.7327 (0.2889)   0.9192 (0.3551)  -1.2356 (0.3569)  29.2772 (0.0001)
S5-BM3        0.4691 (0.1928)  -0.2268 (0.2807)   0.3811 (0.3567)  -0.4935 (0.2612)   0.6590 (0.3151)  -1.1396 (0.3339)  25.2222 (0.0003)
S5-BM4        0.6492 (0.1905)  -0.1758 (0.2867)   0.2113 (0.3364)  -0.4726 (0.2436)   0.7884 (0.3174)  -1.0011 (0.3174)  29.4554 (0.0000)
S5-BM5        0.7637 (0.2219)  -0.1006 (0.3116)   0.2218 (0.3685)  -0.7564 (0.2861)   0.9230 (0.3527)  -1.0899 (0.3389)  32.2912 (0.0000)

The main focus of this study is on analyzing the effect of conditional covariance and coskewness on asset prices. Before proceeding with further analysis it would be useful to determine the extent to which the portfolio co-moments vary over time. To approach this question a very simple testing procedure is applied. Each co-moment is explicitly modelled as a linear function of the lagged information variables. We use the definition of the conditional covariance and coskewness operators to form the following moment conditions for covariance:

$$E[(e_{i,t}\, e_{m,t} - \sigma_{im,t}) \otimes Z_{t-1}] = 0_{L \times 1}$$  (2.49)

and for coskewness:

$$E[(e_{i,t}\, e_{m,t}^2 - \kappa_{im,t}) \otimes Z_{t-1}] = 0_{L \times 1}$$  (2.50)

To implement this test we need the market and portfolio return residuals $e_{m,t} = r_{m,t} - Z_{t-1}^\top \mu_m$ and $e_{i,t} = r_{i,t} - Z_{t-1}^\top \mu_i$, which assume a linear conditional moment specification. The final ingredient is the conditional co-moments, which are also modelled as linear functions of the instruments, for covariance:

$$\sigma_{im,t} = Z_{t-1}^\top \sigma_i$$  (2.51)

and for coskewness:

$$\kappa_{im,t} = Z_{t-1}^\top \kappa_i$$  (2.52)

We test for non-zero covariance and coskewness using the null hypothesis $\sigma_i = \kappa_i = 0_{L \times 1}$ for $i = 1, \dots, N$, and we test for time-varying co-moments using the null hypothesis $\sigma_i^{2-L} = \kappa_i^{2-L} = 0_{(L-1) \times 1}$ for $i = 1, \dots, N$, where the superscript $2-L$ is used to indicate that this hypothesis only depends on the coefficients of the information variables and not the intercept.

In summary, to test whether covariance and coskewness vary over time, we must model the conditional mean of the market and portfolios, and estimate a linear approximation to the conditional covariance and coskewness.⁶ The model requires the estimation of $(2N+1)L$ parameters. To identify the model we require that the error on each moment be orthogonal to the information variables, resulting in an exactly identified system of moment conditions. To keep estimation tractable we proceed by estimating this equation for each portfolio one at a time. Note that the FF25 portfolios would require 312 parameters with only 414 observations.

⁶ To the extent that the linear specification is an approximation, this model will be misspecified. To overcome this the technology employed in the remainder of the essay avoids modelling covariance and coskewness in favour of a more general process.

The essay uses the trick of expanding the $r_{i,t} - E_{t-1} r_{i,t}$ term in the unconditional expectation of the conditional co-moments to avoid modelling individual asset returns. This dramatically reduces the number of parameters that need to be estimated, since it completely avoids the need to model any asset-specific parameters. We test the accuracy of these alternative definitions in two ways. First, we use the following moment conditions, which are analogous to (2.49) and (2.50), to identify the conditional co-moments:

$$E[(r_{i,t}\, e_{m,t} - \sigma'_{im,t}) \otimes Z_{t-1}] = 0_{L \times 1}$$  (2.53)

and

$$E[(r_{i,t}(e_{m,t}^2 - (Z_{t-1}^\top \sigma_m)^2) - \kappa'_{im,t}) \otimes Z_{t-1}] = 0_{L \times 1},$$  (2.54)

along with the explicit linear specification of the conditional covariance and coskewness in equations (2.51) and (2.52) respectively. We annotate the parameters $\sigma'_i$ and $\kappa'_i$ with primes to differentiate between the specifications. This approach completely eliminates the need to estimate $\mu_i$ and we therefore drop the relevant moments from both covariance and coskewness. However, we need the variance of market returns and include moment conditions to estimate the variance parameters. There are now only $(N+1)L$ and $(N+2)L$ parameters and moments in the exactly identified system for the covariance and coskewness respectively. Second, by estimating a system including both definitions of the co-moments, along with ancillary parameters, we can test the adequacy of the more parsimonious representation using the null hypothesis $\sigma_i = \sigma'_i$ and $\kappa_i = \kappa'_i$ for $i = 1, \dots, N$.

The results for conditional covariance are presented in Table 2.2 and for conditional coskewness in Table 2.3. The broad results indicate clear evidence that both covariance and coskewness are time varying, with only one or two portfolios having coefficients on the lagged information variables that are not significant at traditional levels.
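As a rough numerical check on the parsimonious co-moment definitions in (2.53) and (2.54), the following Python sketch works through the unconditional special case in which the only instrument is a constant, so that every conditional moment reduces to a sample average. The return series are made-up illustrative numbers rather than thesis data, and all function and variable names are hypothetical.

```python
# Toy monthly returns (illustrative values only, NOT data from the thesis).
r_m = [1.2, -0.8, 0.5, 2.1, -1.5, 0.3, 0.9, -0.4]   # market
r_i = [0.9, -1.1, 0.7, 2.5, -2.0, 0.1, 1.4, -0.6]   # one portfolio


def mean(xs):
    """Sample average of a list of numbers."""
    return sum(xs) / len(xs)


# Assumption: constant-only instrument set, so fitted means are sample means.
e_m = [r - mean(r_m) for r in r_m]      # market residual
e_i = [r - mean(r_i) for r in r_i]      # portfolio residual
var_m = mean([e ** 2 for e in e_m])     # market variance

# Definitions that require the asset-specific mean (as in 2.49 and 2.50).
cov_full = mean([ei * em for ei, em in zip(e_i, e_m)])
cosk_full = mean([ei * em ** 2 for ei, em in zip(e_i, e_m)])

# Parsimonious definitions using the raw portfolio return (as in 2.53 and 2.54).
cov_simple = mean([ri * em for ri, em in zip(r_i, e_m)])
cosk_simple = mean([ri * (em ** 2 - var_m) for ri, em in zip(r_i, e_m)])

print(abs(cov_full - cov_simple) < 1e-12)    # do the two covariances coincide?
print(abs(cosk_full - cosk_simple) < 1e-12)  # do the two coskewness terms coincide?
```

The agreement is exact in sample here because the market residual averages to zero. With a full instrument set the same cancellation holds only in conditional expectation, which is what the equivalence test of the two specifications examines.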
In contrast, we are unable to reject the null hypothesis that the simplified co-moment specification is equivalent to the alternative, which requires modelling the mean return on each asset: not one portfolio was significant at even the 10 percent level for either moment in either data set. We can therefore proceed with the remainder of our analysis with confidence that both covariance and coskewness are indeed time-varying and that our simplified representation is adequate for our purposes.

Table 2.2: Predictability of Conditional Covariance
This table presents Wald statistics for the null hypothesis of zero and constant conditional covariance for two alternate specifications (resp. cols 2 and 4, and 3 and 5); and a test for the equivalence of the specifications (in col. 6) for both the industry and FF25 data sets. The test statistics in columns 1, 3 and 5 are distributed as chi-squared with 6 degrees of freedom, and the statistics in columns 2 and 4 have 5 degrees of freedom.
Null Hypothesis Industry 1 Industry 2 Industry 3 Industry 4 Industry 5 Industry 6 Industry 7 Industry 8 Industry 9 Industry 10 Industry 11 Industry 12 Industry 13 Industry 14 Industry 15 Industry 16 Industry 17 S1-BM1 S1-BM2 S1-BM3 S1-BM4 S1-BM5 S2-BM1 S2-BM2 S2-BM3 S2-BM4 S2-BM5 S3-BM1 S3-BM2 S3-BM3 S3-BM4 S3-BM5 S4-BM1 S4-BM2  cr, = 0 113.6750 110.1541 100.1347 113.6307 148.6933 136.1335 120.2319 133.0196 126.0594 143.8768 142.5266 103.1778 147.0280 104.9180 118.0181 139.0313 146.9422 107.7074 105.8549 100.9350 96.9709 87.1639 128.9724 122.7950 114.1148 115.4649 105.7122 134.2127 126.6172 126.7304 127.9146 103.7706 142.4265 138.2896  " =0 15.4475 21.8465 25.8679 12.3732 12.9680 22.0952 9.9977 23.7271 17.3724 23.0738 15.8137 4.3784 20.2814 22.3188 15.3003 22.7836 23.8875 24.7949 22.1365 18.0414 15.9157 13.7260 27.4563 21.6135 21.7006 18.9012 15.9180 27.6666 22.8220 23.2111 21.5780 16.8713 24.9225 26.7517  of  B)  =0 112.1010 106.7877 99.8975 113.5098 145.6618 137.0944 115.9354 133.6514 128.2278 141.6622 139.8298 103.9831 145.0573 104.3701 118.0807 137.9380 146.7972 105.7730 105.0131 100.2713 96.8413 87.1154 127.8701 122.4353 115.2480 115.6325 105.4095 135.3151 127.1279 128.5125 128.7577 103.4177 140.4640 139.1808 0 ^  " =0 16.3678 23.3288 27.4422 14.4329 15.8126 24.2582 10.5934 26.8332 19.6033 25.2934 17.9189 4.8468 22.5472 24.1652 17.3508 23.5583 26.3693 26.4642 24.4888 19.8990 17.9339 15.3899 29.7381 23.5052 23.8689 20.1972 17.4759 30.1604 25.6521 25.9524 23.2903 18.6068 26.3171 29.6013  of  6 )  °i  = v'i  4.5393 2.5758 1.9415 3.2011 2.4655 2.9549 2.3779 2.7894 1.4510 3.0681 2.5626 2.3749 2.9965 3.7930 2.8986 2.7463 3.9428 3.5849 3.1501 3.1889 2.7369 2.6823 3.7591 3.5698 3.5073 3.4326 3.3136 3.6151 3.6266 3.2959 3.3158 3.2337 3.9043 3.4527  Essay 2. Conditional Coskewness and Asset Prices.  Null Hypothesis S4-BM3 S4-BM4 S4-BM5 S5-BM1 S5-BM2 S5-BM3 S5-BM4 S5-BM5 .1 p- value .05 p-value .01 p- value  Table 2.2 cont'd. 
a> = 0 a f - = 0 o-i = 0 of~ = 0 25.7475 142.8737 28.9491 141.9376 24.1601 147.3045 21.6469 148.5868 129.0533 21.9987 126.6146 18.2108 18.0848 139.9010 19.4355 143.8768 157.0880 25.2237 157.0778 27.8773 20.4316 146.9290 22.3938 145.0289 152.8569 18.5939 151.8551 20.3178 135.1814 135.8254 20.6972 17.4915 9.2364 9.2364 10.6446 10.6446 12.5916 11.0705 12.5916 11.0705 15.0863 16.8119 15.0863 16.8119 6)  6 )  68  = cr'i 3.6864 3.4748 3.2772 2.4857 3.7308 2.8766 2.4922 3.5634 10.6446 12.5916 16.8119  Table 2.3: Predictability of Conditional Coskewness This table presents Wald statistics for the null hypothesis of zero and constant conditional coskewness for two alternate specifications (resp. cols 2 and 4, and 3 and 5); and a test for the equivalence of the specifications (in col. 6) for both the industry and FF25 data sets. The test statistics in columns 1, 3 and 5 are distributed as chi-squared with 6 degrees of freedom, and the statistics in columns 2 and 4 have 5 degrees of freedom. Null Hypothesis Industry 1 Industry 2 Industry 3 Industry 4 Industry 5 Industry 6 Industry 7 Industry 8 Industry 9 Industry 10 Industry 11 Industry 12 Industry 13 Industry 14 Industry 15 Industry 16 Industry 17 S1-BM1 S1-BM2 S1-BM3  0 11.4221 16.7075 11.1959 9.6256 11.2472 10.6372 6.3022 13.8846 13.2540 13.3198 10.4639 13.1802 13.7167 15.5776 6.4961 10.5049 11.8031 12.8052 11.5990 11.4788 Ki  =  t  6)  K  = 0 «; = o  9.8132 15.4475 10.9013 9.2014 10.1716 10.4857 4.7167 13.6258 13.0011 12.7113 10.2284 11.6146 13.3943 12.9536 5.8630 9.4303 10.8348 12.4478 11.2952 11.0150  12.2664 15.3192 10.5747 9.9925 10.2665 10.6374 5.3623 14.7880 13.1776 13.5346 9.4872 14.6456 13.1584 13.7123 6.8720 11.3799 12.1275 12.3800 11.6601 12.1422  411.0306 " =o 2  6)  14.3621 10.4208 9.5982 9.6924 10.5516 4.3641 14.7487 12.8005 13.2649 9.3827 14.3102 13.1061 11.8435 6.2955 10.7066 11.4979 12.2712 11.2917 11.6983  =<  2 3845 4 6562 4 7754 1 2912 4 5751 4 3061 2 9054 2 5705 2 6184 4 4177 3 7191 4 8075 3 9287 4 
3022 4 0636 3 9468 4 2151 4 0884 3 4446 3 2478  Essay 2. Conditional Coskewness and Asset Prices.  69  Table 2.3 cont'd. Null Hypothesis S1-BM4 S1-BM5 S2-BM1 S2-BM2 S2-BM3 S2-BM4 S2-BM5 S3-BM1 S3-BM2 S3-BM3 S3-BM4 S3-BM5 S4-BM1 S4-BM2 S4-BM3 S4-BM4 S4-BM5 S5-BM1 S5-BM2 S5-BM3 S5-BM4 S5-BM5 .1 p- value .05 p-value .01 p- value  2.4  Empirical  K-i = 0  12.0609 12.4003 12.4372 11.3686 12.0991 12.7791 12.1559 12.2432 12.6984 12.3032 12.2776 12.5209 11.2680 12.3831 11.4907 12.9956 11.7909 8.6456 10.4718 14.9645 11.1459 14.1540 10.6446 12.5916 16.8119  = 0 11.3369 11.4241 12.2779 10.8347 11.5821 11.9154 11.0156 11.9765 12.2001 11.2531 11.4620 11.9842 10.5854 11.8895 10.6329 11.5639 10.7642 8.0538 9.6527 14.5276 10.2037 13.1814 9.2364 11.0705 15.0863  <= o  12.6105 13.2081 12.1349 11.9725 13.5147 13.6639 13.3770 12.4375 13.7657 13.6486 13.6275 14.0354 10.7989 13.1282 12.6646 13.7875 13.6117 7.7884 10.3957 14.9099 11.7892 14.8738 10.6446 12.5916 16.8119  =o 11.7389 11.9092 12.1246 11.5884 13.2464 12.8019 12.0952 12.3508 13.5045 12.9916 12.9415 13.5444 10.5656 12.9326 12.1686 12.7208 12.7047 7.5392 10.0210 14.7728 11.1310 14.3921 9.2364 11.0705 15.0863  4~ 2  6}  H-i —  3.0514 3.5175 4.0445 2.0027 3.4557 2.9953 2.5235 3.5380 2.6660 2.6154 3.1284 2.0566 4.1717 2.3273 1.3378 4.0326 1.9881 5.1517 4.9463 7.0990 4.9697 4.0084 10.6446 12.5916 16.8119  Results  Table 2.4 presents the parameter estimates of the two-moment C A P M where the price of covariance risk and variance of market returns are modelled explicitly. The parameters are estimated using Hansen's (1982) generalized method of moments ( G M M ) with the efficient weighting matrix iterated until convergence. Except where explicitly stated to the contrary, all models are estimated using this procedure. The positive coefficient on ii in both data sets indicates that the unconditional price of covariance risk is positive as expected, although the coefficient is not significant in the industry data. 
It is interesting that the vast majority of the coefficients in the mean equation have the opposite sign to those in the standard deviation equa-  Essay 2.  Conditional Coskewness and Asset Prices.  70  Table 2.4: Parameter Estimates of the Two-Moment Model Parameter estimates and standard errors are reported for the test of the conditional two-moment C A P M with linear means and variances. Data covers two data sets: the FF25 size and book-to-market sorted stocks, and the 17 industry sorted assets. Data spans the period July 1962 to December 1997. The moment restrictions implied by the model are: E[(r  - Hm,t) ® Zj_j] = 0  m<t  - ti ,t)  E[{(r ,  m  m t  E[(r  iit  - r (r iit  -  mtt  2  - < J  ® Zj_ ] = 0 x  M m , t ) ^ ) ® Zj_ ] = 0 x  Vi,  °~m,t  and Li ,t = ZJ_ LI and o~ = Zj_ a. The parameters are estimated by the Generalized Method of Moments and are reported along with their asymptotic standard errors in parenthesis. Hansen's (1982) Jy-statistic and a number of Wald tests are also reported along with their p-values and degrees of freedom. Div Tbill Junk Term Constant S&P Parameter FF25 0.1682 -0.4902 -0.2516 0.2451 0.6064 0.1095 t ( 0.1268) ( 0.1032) ( 0.1454) ( 0.1103) ( 0.1530) ( 0.1162) -0.4685 0.1650 0.1428 -0.0833 0.8427 3.6251 a ( 0.1552) ( 0.1626) ( 0.2248) ( 0.1268) ( 0.1418) ( 0.2430) 206.7632 (0.0015) 162 d.f. Jy-statistic 5 d.f. (0.0000) 28.1046 Constant \x 5 d.f. Constant of 168.3192 (0.0000) 10 d.f. 239.7058 (0.0000) Constant A 6 d.f. (0.0000) 37.8016 Zero A Industry 0.4962 -0.6904 0.5449 -1.0628 0.2029 0.2513 i ( 0.1455) ( 0.1058) ( 0.2062) ( 0.1619) ( 0.1974) ( 0.2019) 3.8894 -0.3629 0.7810 -0.2621 1.3876 0.0035 a ( 0.1499) ( 0.1701) ( 0.2806) ( 0.1432) ( 0.2515) ( 0.1556) 102 d.f. 148.3481 (0.0019) Jy-statistic 5 d.f. 40.5338 ( 0.0000) Constant Lit 5 d.f. Constant of 108.7423 (0.0000) 10 d.f. 204.9332 (0.0000) Constant A 5 d.f. 61.0512 (0.0000) Zero A m  mit  X  1  t  t  t  1  t  t  x  Essay 2. Conditional Coskewness and Asset Prices.  
71  tion. This suggests a negative relationship between conditional mean and variance in market returns since when any variable changes, it causes the conditional mean and variance to move in opposite directions. This is in line with the general finding of a negative relationship between mean and variance in aggregate stock market returns (see, e.g., Whitelaw (1994)). The interest rate variables (Term, Default and Tbill) all appear useful in explaining both mean market returns and market volatility. The coefficients have the same sign in both data sets except for the term spread where the standard deviation coefficient changes sign, but is statistically insignificant in both data sets. The dividend price ratio is only significant in the mean return equation in the industry data set. Lagged market returns are only significant in the FF25 data set for explaining volatility. The JT test is a general goodness of fit specification test. The statistic is large when the moment restrictions are violated, and zero when all moment restrictions are exactly met. The rational behind its use as a diagnostic too is that a correctly specified model will logically price all portfolios well. The JT test is strongly significant in both data sets, leading us to reject the two-moment C A P M as a reasonable model. We also test for the null hypothesis that covariance risk is not priced. This null hypothesis is equivalent to testing whether u. = O^xi- We use a Wald test for this which is comprehensively rejected: covariance risk is priced. We have already presented evidence that conditioning information is important to consider, since covariances appear to be time varying. Another interesting question is whether the price of risk is time varying. This hypothesis is clearly rejected by the Wald statistic in both data sets. The two components of the price of risk, the conditional mean and variance, also appear to be time varying. 
This demonstrates the importance of accounting for conditioning information in tests of the C A P M . We next turn our attention to the test of the three-moment C A P M , which is reported in Table 2.5. The three-moment C A P M provides a much better fit for the data. The p-value for the J test in both data sets is around the 5 percent level, much larger T  than in the two-moment C A P M . There is strong evidence that both covariance and coskewness risk are priced by the market.  Essay 2,  72  Conditional Coskewness and Asset Prices.  Table 2.5: Parameter Estimates of Full Three-Moment Model Parameter estimates and standard errors are reported for the test of the conditional three-moment C A P M with linear prices of risk, standard deviation and skewness. Portfolios come from two data sets: the FF25 size and book-to-market sorted stocks, and the 17 industry sorted assets. Data spans the period July 1962 to December 1997. The moment restrictions implied by the model are: E[(r  - H\,t ~ A*2,t) <8> Zj_ ] = 0 E[(e - o\) <g> ZlJ = 0 E[{e - K ) <g> Zj_ ] = 0 i x l e ^ - r M « t - e r ) ^ ) ® Zj_ \ = mit  x  Lxl  2  t  Lxl  z  t  E[(r  ht  where and  e =r t  m)t  - r  3  t  x  0  2  u  m  x  - /xi,t - M2,t, Mi,t = ^t-iMi> M2,t = Zj-ito  +  1  { Z ,  T  NLxl  _  1  K > O } ^ .  °t  =  t-i >  Z  a  = Z~J_ K. X  Parameter Mi M2 a  K  5 J-statistic Zero A i Zero A Constant Constant H2,t Constant a\ Constant K\ jt  2)t  Constant  S&P  0.6987 ( 0.1797) 0.1096 ( 0.0786) 4.1132 ( 0.1226) -29.4132 ( 5.6301) -0.8712 (0.1857) 172.5394 66.5482 28.5560 37.5735 27.8783 207.3796 28.2343  -0.2058 ( 0.1723) 0.2990 ( 0.0768) -0.4654 ( 0.1119) 22.5666 ( 4.3089)  Div FF25 -0.3927 ( 0.2151) 0.5471 ( 0.1658) 0.2079 ( 0.1488) 91.3647 ( 17.3957)  (0.0467) (0.0000) (0.0002) (0.0000) (0.0001) (0.0000) (0.0000)  143 d.f. 6 d.f. 7 d.f. 5 d.f. 6 d.f. 5 d.f. 5 d.f.  
Term  Junk  Tbill  -0.5039 ( 0.2130) -0.4573 ( 0.1212) 0.4969 ( 0.1101) -66.0606 ( 12.5884)  1.2634 ( 0.2295) -0.2991 ( 0.0798) 0.0660 ( 0.1488) -21.5275 ( 4.0897)  -1.1117 ( 0.2442) 0.0818 ( 0.1032) 0.7187 ( 0.1313) -58.9323 ( 11.2583)  It is evident that just as conditioning information is useful in the two-moment C A P M , so too conditioning information is very important to the three-moment C A P M since we clearly reject the null hypotheses that the prices of both covariance and coskewness risk are constant. We also find strong evidence that both the markets' variance and skewness are time varying. The dynamics of the mean, variance and skewness of market returns were also estimated using only market data, but to save  Essay 2. Conditional Coskewness and Asset Prices.  Constant  Parameter  0.5847 ( 0.1866) 0.1512 M2 ( 0.0931) 4.0842 cr ( 0.1042) -29.3989 K ( 0.6674) -0.8019 6 118.2748 J-statistic 67.5960 Zero Ai^ 33.9650 Zero A ,t Constant u. 49.3241 Constant H2,t 33.9512 155.1923 Constant of Constant K\ 24336.1363 Mi  2  lit  Table 2.5 cont'd. Div S&P Industry -0.2002 -0.2235 ( 0.2000) ( 0.2433) 0.5666 0.3480 ( 0.1172) ( 0.2071) -0.5849 0.2064 ( 0.1017) ( 0.1733) 22.6047 91.3191 ( 0.5037) ( 1.4363) (0.1612) (0.0532) 95 d.f. 6 d.f. (0.0000) (0.0000) 7 d.f. 5 d.f. (0.0000) 6 d.f. (0.0000) 5 d.f. (0.0000) (0.0000) 5 d.f.  73  Term  Junk  Tbill  -0.6745 ( 0.2296) -0.4486 ( 0.1513) 0.3591 ( 0.1132) -66.1290 ( 1.9825)  1.2032 ( 0.2475) -0.3436 ( 0.0938) 0.0611 ( 0.1559) -21.4061 ( 0.6676)  -1.5881 ( 0.2654) 0.1466 ( 0.1495) 0.5877 ( 0.1507) -58.8891 ( 1.5366)  space are not reported. The variance and skewness parameters were much less significant. This interesting observation demonstrates that information provided by the cross-section of equity returns is useful in pinning down the time series dynamics of market returns. Theory predicts that the price of covariance risk be positive and the price of 7 risk have the opposite sign. 
Incorporating coskewness in the pricing model increases the constant term in the price of covariance risk in both data sets. This increase is sufficient that the industry data now support the conclusion that the unconditional price of covariance risk is positive and statistically significant, while it was insignificantly positive in the two-moment CAPM for industry portfolios. The constant in $\mu_{2,t}$, which is the unconditional price of $\gamma$ risk when the market is negatively skewed, is positive in both data sets, but not statistically significant. However, the parameter $\delta$, which captures a level shift in the intercept when the market is positively skewed, is statistically significant and negative, as theory suggests it should be. These parameter estimates convey an interesting story: they indicate that investors are somewhat ambivalent towards negative skewness, since they do not demand a premium for bearing coskewness risk when their diversified portfolio is negatively skewed, but have a strong preference for positive skewness, since they are willing to sacrifice returns for bearing coskewness risk when returns are positively skewed.

One is always concerned, when estimating models using GMM, about focusing too closely on the $J_T$ statistic, since this favours models that produce volatile pricing errors. One robustness check is to compare the statistical significance of the parameters. When the variance-covariance matrix of the moment conditions is large, standard errors are large both in tests for the significance of pricing errors and in tests for the significance of parameters. A spurious $J_T$ statistic should therefore be accompanied by insignificant parameter estimates. Fortunately we observe that both the $J_T$ statistic and the parameters are significant, somewhat alleviating our concerns.
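The scaling argument can be made concrete with a small sketch. It assumes the textbook forms of the statistics, $J_T = T\,\bar{g}^\top S^{-1} \bar{g}$ for the overidentification test and $(T\, d^\top S^{-1} d)^{-1}$ for the variance of a single parameter under efficient weights. The two-moment setup, the pricing errors, the Jacobian and the function names are hypothetical and chosen only for illustration; only the sample length of 414 months is taken from the essay.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def quad(x, m, y):
    """Bilinear form x' m y for 2-vectors x, y and a 2x2 matrix m."""
    return sum(x[i] * m[i][j] * y[j] for i in range(2) for j in range(2))


def j_stat(g_bar, S, T):
    """Overidentification statistic T * g' S^{-1} g."""
    return T * quad(g_bar, inv2(S), g_bar)


def param_se(d, S, T):
    """Standard error of one parameter with moment Jacobian d."""
    return (1.0 / (T * quad(d, inv2(S), d))) ** 0.5


T = 414                       # number of monthly observations
g_bar = [0.10, -0.05]         # hypothetical average pricing errors
d = [1.0, 0.5]                # hypothetical Jacobian of the moments
S = [[1.0, 0.3], [0.3, 1.0]]  # hypothetical moment covariance matrix
S_noisy = [[4.0 * s for s in row] for row in S]  # same shape, 4x the variance

j_tight, j_noisy = j_stat(g_bar, S, T), j_stat(g_bar, S_noisy, T)
se_tight, se_noisy = param_se(d, S, T), param_se(d, S_noisy, T)

print(j_tight / j_noisy)    # factor by which the J statistic shrinks
print(se_noisy / se_tight)  # factor by which the standard error grows
```

Inflating the moment covariance makes the same pricing errors look less significant and, at the same time, makes the parameter less precisely estimated, which is why a low $J_T$ produced by noise alone would tend to come with insignificant parameters.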
To further assuage these worries, we also estimate the models using two types of fixed weighting matrices: the identity matrix, which weights all moments as being equally important, and the matrix $W$, chosen to recognize the wisdom in focusing more attention on portfolios whose returns are less noisy, and which accounts for the correlation between portfolios. It is critically important that the weighting matrix $W$ be independent of any parameter estimates. The matrix is formed by taking the dependent variable in each of the moment conditions, for example $r_{m,t}$ in the moments identifying the mean, together with the corresponding variable in the moments identifying the conditional standard deviation, and stacking them in a matrix $y_t$. The conditioning information is incorporated in a simple empirical fashion. With the exception of the market's variance, the fitted values of each element of $y_t$ are estimated using a linear relationship estimated using least squares, and the conditional variance is fitted by modelling the conditional standard deviation. Let $v_t = y_t - \hat{y}_t(Z_{t-1})$ be the residual from this fitting; the weight $W$ is then defined as the inverse of the linear residual covariance matrix. This weighting matrix is motivated in spirit by the efficient weighting matrix, and would be the efficient weighting matrix if the conditional moments were the fitted $y_t$s and the moments were not autocorrelated.

Both advantages of the covariance-matrix based weights are important, but we would argue that accounting for correlations is more critical. The pricing error on each portfolio conveys information regarding the adequacy of asset pricing models beyond its value. The direction of the correlation between portfolio returns, along with the relative sign of the pricing errors, conveys useful information in testing the degree of fit of an asset pricing model.
If the returns are positively correlated and the pricing errors are of opposite signs, then we are more confident that the pricing errors are evidence of model misspecification than we would be if the pricing errors had the same sign. A moderately large pricing error can arise by chance, and if the other portfolio's returns are positively correlated, then we should expect a pricing error of the same sign. If a model produces pricing errors of opposite signs in positively correlated portfolios, then the model is more likely to be misspecified than an alternative which produces pricing errors of equal magnitude but of the same sign. Although portfolio correlations are incorporated when calculating Zhou's $H_T$ specification test, it is more reasonable to use a weighting matrix that accounts for the correlation in returns up front.

The parameter estimates of the two- and three-moment CAPM are reported in Table 2.6 for both $W$ (panel A) and the identity matrix (panel B). The economic story told by the point estimates is the same as for the efficient weights: the price of covariance risk is positive in both models but increases as one accounts for asymmetry in returns; and investors demand a small but insignificant premium for bearing coskewness when returns are negatively skewed but are willing to make quite large sacrifices when returns are positively skewed. These conclusions are robust to different portfolio formation strategies. The minimized value of the quadratic form, along with the $H_T$ test and its p-value, are reported in the first row of each panel in Tables 2.11 and 2.12 for both portfolio formation rules and, respectively, the $W$ and identity matrices. The minimized criteria improve quite substantially as one incorporates coskewness. The p-values of the $H_T$ statistic indicate that the three-moment CAPM provides a good fit to the data. Although the improvement when using the identity matrix is good, the p-value  76  Essay 2.
Conditional Coskewness and Asset Prices.

Table 2.6: Two- and Three-Moment CAPM Parameter Estimates: Fixed Weights

This table presents the parameter estimates and standard errors (in parentheses) from estimating the two- and three-moment CAPM models that rely on explicit modelling of market moments. Panel A uses the linear projection based covariance matrix, while Panel B uses an identity weighting matrix. Data span the period July 1963 to December 1997.

Panel A: Model Independent Covariance Based Weight.

Parameter    Constant        S&P        Div       Term       Junk      Tbill
FF25, Two-Moment
μ              0.6767     0.6708     0.1207    -0.7884     0.3947    -1.2858
             (0.2460)   (0.3163)   (0.2807)   (0.2415)   (0.3011)   (0.3980)
σ              4.0620    -0.2385    -0.7833     0.0941     0.3252     0.4798
             (0.1828)   (0.5144)   (0.2857)   (0.2777)   (0.4014)   (0.2739)
FF25, Three-Moment
μ1             0.5418     0.9443     0.1199    -1.0715     0.7292    -2.1434
             (0.3371)   (0.6207)   (0.5449)   (0.4042)   (0.9200)   (0.6324)
μ2             0.0022    -0.2086    -0.1057     0.1516     0.0036     0.6621
             (0.6811)   (0.6435)   (0.2976)   (0.4978)   (0.4121)   (0.6265)
σ              4.0822    -0.2633    -0.7810     0.0979     0.3755     0.4486
             (0.2062)   (0.4910)   (0.2181)   (0.3227)   (0.3361)   (0.3104)
κ            -27.9843    84.3381    22.1151   -62.1143   -19.7429   -53.5546
             (5.1513)  (13.4067)  (33.4642)  (21.5699)   (8.6061)  (12.2080)
δ             -0.2689
             (1.0782)
Industry, Two-Moment
μ              0.5603     0.4510     0.0632    -0.7307     0.5061    -1.2098
             (0.2251)   (0.2604)   (0.2507)   (0.2282)   (0.2676)   (0.3707)
σ              4.0788    -0.1845    -0.7914     0.0736     0.3612     0.4106
             (0.1863)   (0.5133)   (0.2857)   (0.2870)   (0.3988)   (0.2725)
Industry, Three-Moment
μ1             0.5897    -0.1124     0.2501    -0.6040     0.8609    -1.4474
             (0.3304)   (0.8354)   (0.5870)   (0.6098)   (0.5818)   (0.6440)
μ2             0.0510     0.4239     0.1728    -0.2556    -0.0716     0.0340
             (0.2194)   (0.8180)   (0.2485)   (0.6670)   (0.3003)   (0.7360)
σ              4.0756    -0.2533    -0.7939     0.0879     0.3651     0.4522
             (0.1907)   (0.4498)   (0.2452)   (0.3008)   (0.3316)   (0.2890)
κ            -30.1553    94.1880    18.5196   -71.0711   -25.5027   -60.4636
            (20.3378)  (13.5652)  (66.4816)  (49.1049)  (17.7766)  (42.2516)
δ             -0.4295
             (0.4755)

Table 2.6 cont'd. Panel B: Identity Weighting Matrix.

Parameter    Constant        S&P        Div       Term       Junk      Tbill
FF25, Two-Moment
μ              0.5542     0.2707     0.2770    -0.1795     0.2848    -0.3445
             (0.1743)   (0.1541)   (0.1051)   (0.1656)   (0.1906)   (0.1851)
σ              3.7552     0.2076    -0.3665     0.0102     0.3916     0.3523
             (0.1388)   (0.2820)   (0.1886)   (0.2065)   (0.2612)   (0.2230)
FF25, Three-Moment
μ1             0.5981    -0.3002    -0.3333    -0.5221     0.9890    -1.0407
             (0.2892)   (0.8931)   (0.4878)   (0.5810)   (0.5411)   (0.6558)
μ2             0.1266     0.9206     0.2271    -0.5832    -0.4412     0.0305
             (0.2545)   (0.3608)   (0.2226)   (0.3067)   (0.2887)   (0.2320)
σ              4.1146     0.0804    -0.5503     0.2112     0.0361     0.6621
             (0.2067)   (0.5518)   (0.2377)   (0.3296)   (0.4381)   (0.3060)
κ            -29.0636    92.3605    22.7603   -66.2930   -21.8154   -58.7942
             (1.1488)   (8.4392)   (2.7714)   (4.7438)   (2.0016)   (3.4522)
δ             -0.8115
             (0.7295)
Industry, Two-Moment
μ              0.1013     0.3241    -0.1171    -0.4811     0.7810    -0.6120
             (0.2691)   (0.2380)   (0.2838)   (0.3030)   (0.3594)   (0.2659)
σ              4.2481     0.6010    -0.4839     0.2695    -0.0375     0.3075
             (0.1955)   (0.3589)   (0.2156)   (0.2246)   (0.3067)   (0.2597)
Industry, Three-Moment
μ1             0.7644    -0.4541    -0.1271    -0.4086     1.1927    -1.1633
             (0.2792)   (0.9270)   (0.3699)   (0.5339)   (0.5502)   (0.4692)
μ2             0.2168     1.0370     0.3051    -0.6889    -0.5821     0.0512
             (0.1888)   (0.6677)   (0.2208)   (0.4332)   (0.3334)   (0.3699)
σ              4.0716    -0.5544     0.4129     0.4046     0.0293     0.5151
             (0.2719)   (0.2852)   (0.3606)   (0.1888)   (0.3254)   (0.3333)
κ            -30.3128    91.2514    21.0458   -67.4382   -22.4619   -59.4835
            (17.2528)  (11.8012)  (51.5537)  (38.0491)  (12.7004)  (33.8820)
δ             -1.1254
             (0.5550)

is only around 2.5 percent in both data sets; yet when using the W matrix, the p-values are much closer to their JT counterparts; in fact, the p-value for the industry data jumps from less than one quarter of one percent to over 8 percent. For the reasons elaborated on above, we place more weight on the W matrix results.
We also consider a specification in which we explicitly model the prices of covariance and coskewness risk as functions of the information variables and constrain them to have the theoretical sign,⁷ as in the restrictions in equation (2.15) for the two-moment model and equation (2.32) for the three-moment model. This empirical specification requires modelling the mean market return and the price of covariance risk in the two-moment CAPM, which requires 2L parameters. In addition to these requirements, the three-moment CAPM requires modelling the conditional variance of the market and the price of γ risk, or 4L parameters. Table 2.7 reports estimates of the parameters for the prices of covariance and coskewness risk, all of which, we can conclude with a high degree of confidence, are non-zero and time varying. This specification provides an impressive fit to the industry data but a more modest fit to the FF25 data set. One major limitation of this approach is that it appears to be misspecified, since it implies inordinate conditional skewness. The average difference between the cubed market residual and the conditional skewness implied by the model is -8.5 x 10^5, while the sample skewness is only -30, the lowest cubed residual is -1.6 x 10^4, and the maximum is 2.9 x 10^3. The fact that the average error in forecasting the skewness is an order of magnitude larger than the largest negative residual indicates how poorly this specification approximates market skewness. The corresponding value when explicitly modelling the skewness is 0.0026.

2.5  Multi-Factor Asset Pricing

An alternative response to the CAPM's failure is to consider multi-factor alternatives. Examples of this line of research include Chen, Roll, and Ross (1986), Fama and French (1993), and Jagannathan and Wang (1996).⁸
⁷ Recall that the standardized price of covariance risk is unambiguously positive and, because we standardize by the skewness, the price of γ risk is unambiguously negative.

⁸ It may be a little uncharitable to call the conditional CAPM of Jagannathan and Wang (1996) a multi-factor model, since labour growth is included to account for omitted components of the market return and the term spread is included to account for time variation in the pricing equation. However, operationally this model is a three-factor model.

[Table 2.7: estimates of the parameters for the prices of covariance and coskewness risk; values not recoverable from the extracted text.]

These models can be theoretically motivated following Merton (1973) and Breeden (1979) as intertemporal asset pricing models, where the other factors arise because of price distortions induced by hedging demand against changes in the investment opportunity set. An alternative motivation is the arbitrage pricing theory of Ross (1977). Fama and French (1993) has become by far the most popular multi-factor pricing model applied over the past decade. This model uses three factors, HML (high minus low), SMB (small minus big) and the excess market return, to price equity. SMB and HML are used in response to the authors' finding in an earlier paper (Fama and French, 1992) that sorting stocks based on their size and book-to-market ratio helps explain cross-sectional variation in equity returns.
HML is a zero-investment portfolio formed by taking a long position in assets with a high ratio of book equity to market equity and an equivalent short position in low book-to-market stocks. SMB is similarly formed by taking a long position in small stocks and a short position in big stocks.⁹ This model has received very wide application in the literature, even being applied by Ibbotson research associates in calculating costs of capital.

⁹ The returns to these portfolios and more specific details on their formation are available on the internet at http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.

Suppose there are K factors that are relevant to price individual risky assets. Denote by $f_{j,t}$ the return on the factor-mimicking portfolio of the j-th risk factor. The return on any risky asset can be represented as

$$ r_{i,t} = \sum_{j=1}^{K} \beta_{ij} f_{j,t} + \varepsilon_{i,t}, \qquad (2.55) $$

where $\varepsilon_{i,t}$ is an idiosyncratic error term uncorrelated with the K factor returns. The most common model of this type is Fama and French (1993). Ferson and Harvey (1999, Eq. 2) estimate a conditional model of this type, where the pricing equation is given by

$$ E_{t-1}[r_{i,t}] = \sum_{j=1}^{K} \beta_{ij,t} \mu_{j,t}, \qquad (2.56) $$

where

$$ \beta_{i,t} = \mathrm{Cov}_{t-1}(r_{i,t}, f_t) \left( \mathrm{Var}_{t-1}(f_t) \right)^{-1} \qquad (2.57) $$

is the K-vector of asset i's conditional betas with respect to each of the factors, $\mathrm{Var}_{t-1}(f_t)$ denotes the conditional K x K variance-covariance matrix of the K-vector of time t factor returns, and $\mu_{j,t}$ is the conditional expected return to the j-th risk factor.
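As a rough illustration of equation (2.57), the sketch below computes the unconditional sample analogue of the beta vector for K = 2 factors, that is, the inverse of the factor covariance matrix applied to the covariances between the asset and each factor. It is a minimal sketch under stated assumptions: the function names `cov` and `two_factor_betas` and the toy data are hypothetical, and the conditioning on information variables used in the text is ignored.

```python
from statistics import fmean


def cov(x, y):
    """Sample covariance (divisor T) of two equal-length series."""
    mx, my = fmean(x), fmean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def two_factor_betas(r, f1, f2):
    """Betas of returns r on two factors: Var(f)^{-1} Cov(f, r).

    Hypothetical helper: an unconditional analogue of (2.57) for K = 2.
    """
    v11, v22, v12 = cov(f1, f1), cov(f2, f2), cov(f1, f2)
    c1, c2 = cov(f1, r), cov(f2, r)
    det = v11 * v22 - v12 ** 2
    if det == 0:
        raise ValueError("factor covariance matrix is singular")
    # Explicit 2 x 2 inverse applied to the covariance vector (c1, c2).
    b1 = (v22 * c1 - v12 * c2) / det
    b2 = (v11 * c2 - v12 * c1) / det
    return b1, b2


# Toy data (made up): an asset that loads exactly 0.5 and 2.0 on the factors.
f1 = [1.0, -2.0, 0.5, 3.0, -1.5]
f2 = [0.2, 0.1, -0.4, 0.3, -0.6]
r = [0.5 * a + 2.0 * b for a, b in zip(f1, f2)]
b1, b2 = two_factor_betas(r, f1, f2)
```

Because the toy asset is an exact linear combination of the two factors, the recovered betas equal the loadings used to build it. With real data the same calculation would be applied to excess returns, and the conditional version would replace the sample moments with moments conditional on the instruments.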
The Fama and French (1993) specification of this model uses three factors: the return on some value-weighted stock market index in excess of the risk-free return, the excess return to a portfolio of small stocks over the return to a portfolio of large stocks (SMB), and the excess return to high book-to-market stocks over the return to a portfolio of low book-to-market stocks (HML), so that $j \in \{MKT, SMB, HML\}$.

He et al. (1996) present an extension of Harvey's (1989) conditional test. Their approach involves rewriting the moment restrictions of each asset in terms of K covariances and the price of each factor's covariance risk. They use the same "trick" discussed above to simplify the covariances, which completely eliminates the need to explicitly model any asset-specific parameters. The model is estimated using the implied moment conditions on the risky assets:

$$ E\left[ \left( r_{i,t} - r_{i,t} (f_t - \mu_t)' \lambda_t \right) \otimes Z_{t-1} \right] = 0_{L \times 1} \quad \text{for } i = 1, \dots, N, \qquad (2.58) $$

along with moment conditions to identify the conditional means of the K factors:

$$ E\left[ (f_t - \mu_t) \otimes Z_{t-1} \right] = 0_{LK \times 1}. \qquad (2.59) $$

This specification requires explicit modelling of the conditional means and conditional prices of risk for each of the K factors only. There are no asset-specific parameters to be estimated, and there are no explicit restrictions placed on the covariance dynamics other than stationarity. He et al. (1996) find that the multi-factor beta pricing model performs significantly better than the conditional two-moment CAPM on the size and book-to-market portfolios, although they can still reject the null hypothesis that the model is correctly specified. We reach a similar conclusion using industry portfolios.

The results of fitting the FF three-factor model are reported in Table 2.8, and in Table 2.9 for the fixed weighting matrix. Note that the definition of $\lambda_{MKT}$ in this setting is not sign corrected and can take on negative values. We observe an improvement when including the SMB and HML factors, but both p-values are still less than two percent. We must conclude that the model is still misspecified. There is, however, strong evidence again supporting a conditional analysis, since all three factors estimated using both data sets show strong evidence of time variation.

Another interesting result is the fact that adding the Fama and French (1993) factors does not adversely affect the price of covariance risk. The standardized price of beta risk $\lambda_{MKT}$ is unconditionally statistically significantly positive. Adding the extra factors actually serves to moderately increase the intercept in the price of market covariance risk. This suggests that the findings of previous studies that reject the significance of market returns in explaining the cross-section of equity returns, such as Fama and French (1993) and Ferson and Harvey (1999), are due to their restrictive assumptions on the dynamics of covariances. Once these restrictions are relaxed, covariance with the market is still significant in pricing the cross-section of equity returns. In spite of the importance of market returns, the other two factors are important to pricing returns in both the industry and the size and book-to-market portfolios.

[Table 2.8: results of fitting the FF three-factor model; values not recoverable from the extracted text.]

Table 2.9: Parameter Estimates of Fama-French's Three-Factor Model with Fixed Weighting Matrix

Parameter estimates and standard errors are reported for the test of the conditional three-moment CAPM where the risk-free rate is not the zero-beta rate. The moment restrictions are:

$$ E\left[ (f_{j,t} - \mu_{j,t}) \otimes Z_{t-1} \right] = 0_{L \times 1} $$

$$ E\left[ \left( r_{i,t} - \sum_{j \in F} r_{i,t} \varepsilon_{j,t} \lambda_{j,t} \right) \otimes Z_{t-1} \right] = 0_{NL \times 1} $$

where $\varepsilon_{j,t} = f_{j,t} - \mu_{j,t}$, $\mu_{j,t} = Z_{t-1}'\mu_j$ and $\lambda_{j,t} = Z_{t-1}'\lambda_j$ for factors $j \in F$, and F is the set of factors, in particular F = {MKT, SMB, HML}.

Parameter      Constant        S&P        Div       Term       Junk      Tbill
FF25
λ_MKT            0.0453     0.0220     0.0005    -0.0087     0.0296    -0.0288
               (0.0158)   (0.0144)   (0.0227)   (0.0149)   (0.0213)   (0.0259)
λ_SMB            0.0298     0.0051     0.0523    -0.0222     0.0097    -0.0338
               (0.0218)   (0.0221)   (0.0366)   (0.0275)   (0.0369)   (0.0375)
λ_HML            0.0676     0.0419    -0.0364     0.0325    -0.0143     0.0550
               (0.0237)   (0.0231)   (0.0416)   (0.0229)   (0.0335)   (0.0320)
J-statistic    170.6055   (0.0133)          Q     0.9662
Industry
λ_MKT            0.0266    -0.0030    -0.0233    -0.0070     0.0491    -0.0160
               (0.0167)   (0.0171)   (0.0263)   (0.0153)   (0.0237)   (0.0262)
λ_SMB            0.0313     0.0299     0.1135     0.0050    -0.0120    -0.0798
               (0.0269)   (0.0300)   (0.0431)   (0.0301)   (0.0423)   (0.0443)
λ_HML            0.0162    -0.0118    -0.0827     0.0275     0.0776     0.0519
               (0.0283)   (0.0317)   (0.0514)   (0.0238)   (0.0465)   (0.0397)
J-statistic    118.6759   (0.0076)          Q     0.5412

2.6  Three Factors or Three Moments? or Both??

What is the effect of combining a multi-factor and multi-moment asset pricing model? Do we need both the SMB and HML factors and coskewness? To address this issue we consider a pricing model which includes premia for covariance and coskewness risk:

$$ E_{t-1}[r_{i,t}] = \mathrm{Cov}_{t-1}(r_{i,t}, r_{m,t}) \lambda_{MKT,t} + \mathrm{Coskew}_{t-1}(r_{i,t}, r_{m,t}) \lambda_{SKEW,t} + \sum_{j \in F_-} \mathrm{Cov}_{t-1}(r_{i,t}, f_{j,t}) \lambda_{j,t}, \qquad (2.60) $$

where $F_-$ denotes the set of all priced factors excluding the market return. To identify the moments we need the conditional means of all the factors, including the market, and the conditional variance of the market, along with the following:

$$ E\Big[ \Big( r_{i,t} - r_{i,t} (r_{m,t} - Z_{t-1}'\mu_{MKT}) (Z_{t-1}'\lambda_{MKT})^2 + r_{i,t} \big( \varepsilon_{MKT,t}^2 - (Z_{t-1}'\sigma)^2 \big) (Z_{t-1}'\lambda_{SKEW})^2 - \sum_{j \in F_-} r_{i,t} (f_{j,t} - Z_{t-1}'\mu_j) Z_{t-1}'\lambda_j \Big) \otimes Z_{t-1} \Big] = 0_{L \times 1} \quad \text{for } i = 1, \dots, n, \qquad (2.61) $$

where $\varepsilon_{MKT,t} = r_{m,t} - Z_{t-1}'\mu_{MKT}$ is the error term on the market. Note the plus sign in front of the coskewness term. This is due to the sign correction on the market moments. We don't adjust the signs of the FF factors since they are not theoretically motivated and we therefore don't have any clear behavioural constraints on the sign of the risk price.

The specification of the model where the conditional moments are modelled and the prices of covariance risk and coskewness risk are defined explicitly is inappropriate when there are other priced factors. The reason is analogous to the fact that, with several factors, the price of market covariance risk is not the ratio of the market's mean to its variance, but rather depends also on the covariances between the factors.
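The last point can be made explicit with a short derivation. It is a sketch that assumes the linear beta representation in (2.56) and (2.57); the two-factor notation ($\mu_m$, $\mu_s$, $\sigma_m^2$, $\sigma_s^2$, $\sigma_{ms}$ for the market and a second factor $s$) is introduced here for illustration only and time subscripts are dropped in the special cases.

```latex
\begin{align*}
E_{t-1}[r_{i,t}]
  &= \mathrm{Cov}_{t-1}(r_{i,t}, f_t)\,\mathrm{Var}_{t-1}(f_t)^{-1}\mu_t
   = \mathrm{Cov}_{t-1}(r_{i,t}, f_t)\,\lambda_t,
  \qquad \lambda_t \equiv \mathrm{Var}_{t-1}(f_t)^{-1}\mu_t, \\
K = 1:\quad \lambda_m &= \frac{\mu_m}{\sigma_m^2}, \\
K = 2:\quad \lambda_m &= \frac{\sigma_s^2\,\mu_m - \sigma_{ms}\,\mu_s}
                              {\sigma_m^2\,\sigma_s^2 - \sigma_{ms}^2}.
\end{align*}
```

The K = 2 expression collapses to the single-factor ratio only when $\sigma_{ms} = 0$, which is why a price of market risk built from the market's own mean and variance is no longer appropriate once other correlated factors are priced.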
The proposed methodology avoids this issue by modelling the prices of covariance and coskewness risk as functions of the instruments. Parameter estimates are reported in Table 2.10 and demonstrate the impressive performance of the model. The benefit of adding coskewness to the three-factor model is quite significant, with the most striking results being in the size and book-to-market portfolios. A version of the FF three-factor model where the price of market risk is constrained to be nonnegative has a JT statistic with a p-value of 0.0396 on the FF25 data set. Adding conditional coskewness improves the p-value of the JT test to 0.1240. The industry data also show a dramatic improvement, but caution should be exercised in interpreting these parameter estimates because the estimates had trouble converging and settled into a cycle in which they bounced repeatedly from one region of the parameter space to another. There is evidence that all factors are priced: the weakest result is for the HML portfolio in the industry data, which is only just significant at the 5 percent level. This is interesting because the SMB portfolio appears to be the stronger of the two FF factors, while previous studies seem to imply that HML is the dominant factor. One possible interpretation of the empirical success of the FF three-factor model is that the factors are proxying for coskewness. The evidence partially supports this conjecture, but not all of the explanatory power of the SMB and HML portfolios is due to coskewness. To see the motivation for this, consider the pricing errors when the parameters of various models are estimated using common weighting matrices, as in Table 2.11 for the inverse residual covariance matrix weights and Table 2.12 for the identity matrix.¹⁰
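One informal way to read Tables 2.11 and 2.12 is as the relative reduction in the criterion value when moving from the CAPM to the three-factor model, with and without coskewness priced. The sketch below does this arithmetic for the constrained rows of Table 2.11. It is illustrative only: the function name `relative_improvement` is hypothetical, and the pairing of each value with its row and column is our reading of the table.

```python
def relative_improvement(q_capm, q_ff3):
    """Fractional reduction in the criterion value going from CAPM to FF3."""
    if q_capm <= 0:
        raise ValueError("criterion value must be positive")
    return 1.0 - q_ff3 / q_capm


# Constrained rows of Table 2.11 (inverse residual covariance weights),
# stored as (CAPM, FF3) criterion values. The pairing is an assumption.
table_2_11 = {
    ("FF25", "without skewness"): (50.8843, 4.2297),
    ("FF25", "with skewness"): (0.9892, 0.8297),
    ("Industry", "without skewness"): (3.0059, 0.6958),
    ("Industry", "with skewness"): (0.4961, 0.3910),
}

improvements = {
    key: 100.0 * relative_improvement(capm, ff3)
    for key, (capm, ff3) in table_2_11.items()
}

for (data, spec), pct in improvements.items():
    print(f"{data:8s} {spec:17s} {pct:5.1f}% reduction from adding SMB and HML")
```

Under this reading the reductions are roughly 92 and 16 percent for the FF25 portfolios and 77 and 21 percent for the industry portfolios, in line with the approximate figures quoted in the discussion of these tables.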
We can compare the pricing improvement induced by the inclusion of the FF factors after accounting for coskewness risk informally by comparing the improvement in the middle column of each table, going from CAPM to FF3, with the corresponding improvement in the right column. When the residual covariance matrix weights are used, the improvements in pricing errors are much smaller after accounting for coskewness. Note the industry data, in which ignoring skewness results in a 75 percent improvement from the inclusion of the FF3 factors, while there is only a 20 percent improvement after accounting for coskewness. (In the FF25 data this effect is even more pronounced: a measly 15 percent improvement after coskewness against a whopping 92 percent improvement without coskewness.) When we consider the identity weights the improvement is much less pronounced, but is qualitatively similar. The improvement from incorporating the SMB and HML factors is more pronounced when using the inverse of the residual covariance matrix than the identity matrix; in fact, when the covariance weights are used, adding the third moment appears to result in a greater improvement than when the other factors are added. This suggests that the three-factor model may be better at pricing more volatile assets, since using the inverse of the covariance matrix results in these assets being given relatively little weight. This is borne out by direct calculation of the pricing errors. The small-value FF portfolio, which is by far the most volatile, has an average pricing error of -0.4021 from the FF three-factor model (unconstrained, using the identity weighting matrix), while the three-moment model produces a pricing error of -0.7896. The error when both SMB and HML are included as factors and we use the constrained specification of the market price of covariance and coskewness risk is only -0.3511. When the inverse of the residual covariance matrix is used as the weighting matrix, the pricing errors are -0.5678, -0.4166 and -0.4702.

¹⁰ The table reports the sum of the terms for different sets of moment restrictions and we don't adjust to a common base. Each set of moment conditions includes terms corresponding to the portfolio and managed portfolio returns, and moments used to identify the extra moment conditions. One estimation approach, the two-stage estimation strategy of Ogaki (1993), would be to exactly fit the non-asset-specific moments, which is sufficient to identify the parameters (except in the three-moment CAPM with explicit moments), and then compare the pricing errors. Results are very similar when simply using the weighted sum of the portfolio pricing errors.

[Table 2.10: parameter estimates for the model combining the FF factors with conditional coskewness; values not recoverable from the extracted text.]

Table 2.11: Comparing Various Specifications of the Asset Pricing Models

We compare the performance of the two-moment and three-factor asset pricing models both without (first column) and with skewness priced (last column). Results are reported that initially don't constrain the price of market risk and then constrain the prices of market covariance and coskewness risk to have their theoretical signs. We report the minimum quadratic form using the sample covariance matrix as weights. For each model we report the value of the criterion function, the H-statistic (in square brackets) and the p-value of the H-statistic (in parentheses).
Model                      Without Skewness                With Skewness

FF25
Modelling Moments          1.1745 [199.1673] (0.0045)      0.9436 [172.6520] (0.0461)
Unconstrained  CAPM        1.1415 [194.1116] (0.0034)      0.9591 [176.8996] (0.0142)
               FF3         0.9662 [169.9643] (0.0145)      0.8042 [151.6040] (0.0598)
Constrained    CAPM       50.8843 [180.9782] (0.0200)      0.9892 [190.7102] (0.0020)
               FF3         4.2297 [157.0165] (0.0678)      0.8297 [160.8504] (0.0196)

Industry
Modelling Moments          0.7106 [147.7852] (0.0021)      0.4914 [114.9054] (0.0805)
Unconstrained  CAPM        0.6587 [141.9254] (0.0016)      0.5087 [119.2065] (0.0214)
               FF3         0.5412 [118.8967] (0.0073)      0.4256 [101.5544] (0.0379)
Constrained    CAPM        3.0059 [128.1441] (0.0158)      0.4961 [128.8336] (0.0046)
               FF3         0.6958 [119.5170] (0.0066)      0.3910 [106.7707] (0.0170)

Table 2.12: Comparing Various Specifications of the Asset Pricing Models: Equally Weighted Moments

We compare the performance of the two-moment and three-factor asset pricing models both without (first column) and with skewness priced (last column). Results are reported that initially don't constrain the price of market risk and then constrain the prices of market covariance and coskewness risk to have their theoretical signs. We report the minimum quadratic form using the identity matrix for weights. For each model we report the value of the criterion function, the H-statistic (in square brackets) and the p-value of the H-statistic (in parentheses).
Model                      Without Skewness                With Skewness

FF25
Modelling Moments          5.9310 [197.8664] (0.0054)      3.6926 [178.6753] (0.0230)
Unconstrained  CAPM        3.7823 [206.2016] (0.0005)      3.0434 [219.2010] (0.0000)
               FF3         1.1154 [180.3696] (0.0033)      0.9961 [164.1818] (0.0126)
Constrained    CAPM        8.2715 [199.3245] (0.0016)      4.5353 [202.0209] (0.0003)
               FF3         1.2221 [170.9144] (0.0128)      1.0024 [166.0521] (0.0097)

Industry
Modelling Moments          3.2959 [145.0307] (0.0033)      2.5254 [123.7446] (0.0254)
Unconstrained  CAPM        2.9232 [145.6653] (0.0008)      1.8271 [117.6317] (0.0269)
               FF3         1.7591 [141.3509] (0.0001)      1.3668 [133.0451] (0.0001)
Constrained    CAPM        5.7860 [138.1839] (0.0031)      3.1402 [127.1349] (0.0061)
               FF3         1.4199 [117.3259] (0.0096)      1.3665 [105.1355] (0.0220)

2.7  Further Specification Tests

The JT and HT tests are useful for testing for general model misspecification, but they have very low power to detect specific forms of misspecification. In this section the asset pricing models are subjected to two specification tests: a test for time-varying intercepts and a test of parameter stability.

2.7.1  Time Varying Alphas

A well-specified asset pricing model will have a zero intercept. Because we specify a set of moment conditions that exclude an intercept, the null hypothesis that the average pricing errors are zero (another way of stating a hypothesized zero intercept) is incorporated in the JT and HT tests. However, Ferson and Harvey (1999) present evidence that both the basic CAPM and the Fama and French (1993) three-factor asset pricing model have a non-zero, time-varying alpha.¹¹ This finding that the conditioning information is important for the intercept is robust to allowing the betas to be time varying as a linear function of a set of information variables.
We test for a time-varying intercept by augmenting each of the pricing equations by the term $-Z_{t-1}'\alpha_i$. For example, the two-moment CAPM in equation (2.12) becomes

$$ E\left[ \left( r_{i,t} - Z_{t-1}'\alpha_i - r_{i,t} (r_{m,t} - Z_{t-1}'\mu) Z_{t-1}'\lambda \right) \otimes Z_{t-1} \right] = 0_{L \times 1} $$

for each asset i. To keep the number of parameters to a minimum we test each asset individually, holding all other assets' alphas at the hypothesized value of zero. An alternative, but very closely related, interpretation of this test is that it checks whether the functional form of the conditional asset pricing model is sufficiently flexible to capture all the predictive ability of the information variables. If the model is misspecified then there will be residual predictive ability left in the information variables even after accounting for the time variation in prices of risk. The very simple linear form of the time-varying intercept acts as a first-order approximation to detect this residual explanatory power.

¹¹ To conform with historical tradition we preserve the custom in finance of referring to the intercept as the alpha that accompanies the factor betas.

The results for this test are presented in Table 2.13. These tables confirm the findings of Ferson and Harvey (1999) that in both the basic two-moment CAPM and the three-factor model there is strong evidence of a time-varying intercept. We present individual Wald tests and the Hochberg (1988) lower bound on the p-value for the hypothesis that at least one of the alphas is non-zero. This is constructed as

$$ p_{Hochberg} = \min_{j \in \{1, \dots, N\}} \frac{N}{j} \, p_{(j)}, $$

where j is the index of the portfolio p-values sorted in ascending order, so that $p_{(j)}$ is the j-th smallest of the N individual p-values. The Hochberg lower bounds are less than 0.05 for both the CAPM and three-factor models in both data sets. These results extend the results in Ferson and Harvey since this methodology relaxes the restrictions on conditional covariances implied by the linear beta assumption.
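The bound described above is simple to compute. The following is a minimal sketch, not the thesis's own code: the function name `hochberg_bound` is hypothetical, and the input is the set of two-moment p-values for the 17 industry portfolios as they read in Table 2.13.

```python
def hochberg_bound(p_values):
    """Return the minimum over j of (N / j) * p_(j), with p_(1) <= ... <= p_(N)."""
    if not p_values:
        raise ValueError("need at least one p-value")
    n = len(p_values)
    ordered = sorted(p_values)
    return min(n / j * p for j, p in enumerate(ordered, start=1))


# Two-moment p-values for Industry 1 to Industry 17, as read from Table 2.13.
two_moment_industry = [
    0.1014, 0.0412, 0.0019, 0.2403, 0.5214, 0.1742, 0.0407, 0.3817, 0.0059,
    0.4826, 0.1117, 0.5742, 0.2039, 0.0061, 0.3055, 0.0211, 0.3787,
]

bound = hochberg_bound(two_moment_industry)
print(f"Hochberg bound: {bound:.4f}")
```

With these inputs the minimum is attained at the smallest p-value (17 times 0.0019), giving a bound of about 0.032. This is close to the 0.0320 reported in Table 2.13, with the small gap plausibly due to the rounding of the tabulated p-values.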
Our setup allows the covariances to follow some arbitrary process. However, we must model the price of covariance risk as a linear function, while Ferson and Harvey's approach makes no such restrictive assumption. Which approach is better is an empirical question, though they are certainly complementary. An interesting result, however, is that including the effect of conditional coskewness on asset prices in the three-moment and three-factor-three-moment models increases the Hochberg bound dramatically; in fact, only the three-moment model on the industry portfolios is lower than 0.25, and this is still greater than 0.075. This evidence indicates the need to include coskewness in both single-factor and multi-factor asset pricing models.

2.7.2  Structural Breaks

A second type of misspecification that is not readily detected using the goodness-of-fit statistics is parameter instability. Andrews (1993) and Andrews and Ploberger (1994) present Lagrange multiplier (LM) based tests, which are in the spirit of the famous Chow test, for a structural break in a data set.

Table 2.13: Lagrange Multiplier Tests for Time-Varying Intercepts: Industry Portfolios

This table reports the Lagrange multiplier statistics for testing the null hypothesis that there is no intercept against the alternative that the intercept is a linear function of the information variables. The LM statistics are distributed as chi-squared random variables with 6 degrees of freedom and their p-values are reported below the statistic in parentheses. Also reported is the infimum Hochberg bound of the individual p-values. Results are reported for the two-moment, three-moment, three-factor and three-factor-three-moment asset pricing models using the Industry portfolios in the period 1963-1997.
Portfolio            Two-Moment   Three-Moment   Three-Factor   Three-Factor-Three-Moment
Industry 1             10.6036        10.0800        11.0048         9.7014
                       (0.1014)       (0.0882)       (0.1213)       (0.1378)
Industry 2             13.1150         6.9136         7.5776         5.3945
                       (0.0412)       (0.3289)       (0.2707)       (0.4943)
Industry 3             20.9412        16.0088        19.5119        13.6062
                       (0.0019)       (0.0137)       (0.0034)       (0.0344)
Industry 4              7.9695        10.3293         6.0188         2.3467
                       (0.2403)       (0.4211)       (0.1115)       (0.8852)
Industry 5              5.1760         5.1306         2.6540         2.7170
                       (0.5214)       (0.5272)       (0.8508)       (0.8434)
Industry 6              8.9895         9.6399         6.5468         7.7110
                       (0.1742)       (0.1407)       (0.3648)       (0.2601)
Industry 7             13.1521         9.0886         9.5094        10.0333
                       (0.0407)       (0.1687)       (0.1469)       (0.1233)
Industry 8              6.3826        12.0407         4.8706         4.8417
                       (0.3817)       (0.0611)       (0.5605)       (0.5643)
Industry 9             18.1409        13.2747        21.7929        10.1937
                       (0.0059)       (0.0389)       (0.0013)       (0.1167)
Industry 10             5.4902         4.1805         2.7807         3.8075
                       (0.4826)       (0.6523)       (0.8358)       (0.7027)
Industry 11            10.3235         7.2256        11.4084         6.8049
                       (0.1117)       (0.3005)       (0.0765)       (0.3393)
Industry 12             4.7659         8.1100         1.0273         5.4209
                       (0.5742)       (0.2302)       (0.9846)       (0.4911)
Industry 13             8.4972        10.0979         9.9110         6.8785
                       (0.2039)       (0.1206)       (0.1285)       (0.3322)
Industry 14            18.0361        18.8412        23.4807        12.1303
                       (0.0061)       (0.0044)       (0.0007)       (0.0591)
Industry 15             7.1690        10.6791         9.5901         4.9944
                       (0.3055)       (0.0988)       (0.1430)       (0.5445)
Industry 16            14.8915        16.5567        17.2720         5.1643
                       (0.0211)       (0.0111)       (0.0083)       (0.5229)
Table 2.13 (continued)

Portfolio            Two-Moment   Three-Moment   Three-Factor   Three-Factor-Three-Moment
Industry 17             6.4113         6.3007         3.9249         2.3481
                       (0.3787)       (0.3904)       (0.6868)       (0.8851)
Hochberg p-value        0.0320         0.0755         0.0111         0.4685
S1-BM1                 22.1406        16.4952        22.3551        13.1402
                       (0.0011)       (0.0113)       (0.0010)       (0.0409)
S1-BM2                  6.7638         2.9703         6.1124         5.2063
                       (0.3432)       (0.8126)       (0.4107)       (0.5176)
S1-BM3                  6.1130         2.9887         3.4420         5.6158
                       (0.4107)       (0.8103)       (0.7517)       (0.4676)
S1-BM4                  6.3021         3.2207         3.4195         1.3756
                       (0.3902)       (0.7807)       (0.7546)       (0.9673)
S1-BM5                 14.6252         9.7587        11.6413        12.1495
                       (0.0234)       (0.1352)       (0.0705)       (0.0587)
S2-BM1                  7.9239         4.5331         4.5531         5.4648
                       (0.2437)       (0.6049)       (0.6023)       (0.4857)
S2-BM2                  8.3558         4.2245         4.4663         7.9057
                       (0.2132)       (0.6463)       (0.6138)       (0.2451)
S2-BM3                 13.5548         8.5206        11.0264         5.5380
                       (0.0350)       (0.2024)       (0.0876)       (0.4769)
S2-BM4                  8.5892         7.4460         6.7147         8.2211
                       (0.1980)       (0.2816)       (0.3480)       (0.2223)
S2-BM5                  6.0321         4.1734         6.5455         5.9447
                       (0.4196)       (0.6532)       (0.3649)       (0.4294)
S3-BM1                  6.5145         4.9357         2.7984         2.2234
                       (0.3681)       (0.5521)       (0.8337)       (0.8980)
S3-BM2                 11.2162         7.4614         5.9052        10.6691
                       (0.0819)       (0.2803)       (0.4339)       (0.0992)
S3-BM3                  8.1630         4.1404         8.7748         3.3674
                       (0.2264)       (0.6577)       (0.1866)       (0.7615)
S3-BM4                  9.1814         9.7761         8.5816         6.9000
                       (0.1636)       (0.1344)       (0.1985)       (0.3302)
S3-BM5                  6.6663         4.3108         4.2334         3.6476
                       (0.3528)       (0.6347)       (0.6451)       (0.7242)
S4-BM1                 11.6177         9.4498         5.9722        10.2882
                       (0.0711)       (0.1498)       (0.4263)       (0.1130)
S4-BM2                  8.8617         6.7627         3.6363         2.1194
                       (0.1815)       (0.3433)       (0.7258)       (0.9084)
S4-BM3                  7.5861         4.3838         4.6804         5.5486
                       (0.2700)       (0.6249)       (0.5854)       (0.4756)
S4-BM4                 10.6667        12.9748         7.1115         8.1142
                       (0.0992)       (0.0434)       (0.3107)       (0.2299)
Table 2.13 (continued)

Portfolio            Two-Moment   Three-Moment   Three-Factor   Three-Factor-Three-Moment
S4-BM5                  9.0132         7.2476         8.8775         6.8056
                       (0.1728)       (0.2986)       (0.1806)       (0.3392)
S5-BM1                  8.7597         6.8500         7.7423         8.6823
                       (0.1875)       (0.3349)       (0.2576)       (0.1922)
S5-BM2                  5.1046         5.0303         3.4422         6.8210
                       (0.5305)       (0.5399)       (0.7516)       (0.3377)
S5-BM3                  7.5406         8.1133         4.2437         4.3839
                       (0.2737)       (0.2299)       (0.6437)       (0.6249)
S5-BM4                  5.7872         6.4360         5.4386         2.2115
                       (0.4474)       (0.3762)       (0.4889)       (0.8993)
S5-BM5                  8.2949         5.7985         6.0917         4.0124
                       (0.2173)       (0.4461)       (0.4130)       (0.6750)
Hochberg p-value        0.0285         0.2832         0.0261         0.7064

The LM statistic does not have a standard distribution. Under the null hypothesis of no structural break (that is, the parameters before and after the break are equal), the unknown change point is an unidentified parameter, and this causes the statistic to have a non-standard distribution. Andrews (1993) presents a test based on the supremum of the LM statistics over all possible break points (known as the sup LM test) and derives its limiting distribution, which is a function of integrals of Brownian motions. Because this distribution has no simple closed-form expression, Andrews tabulates the critical values for different numbers of parameters. The sup LM statistic is based on only one break point and therefore ignores the information contained in the values of the LM statistic at all other break points. Andrews and Ploberger (1994) extend the sup LM test by deriving the asymptotic distribution of LM statistics calculated as averages over all break points. Two special cases are tabulated in Andrews and Ploberger: the avg LM and the exp LM statistics, which are respectively a simple average and an exponentially weighted average of the LM statistics across break points. In Table 2.14 we present the values of these three LM tests for a range of asset pricing models.
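As a rough sketch of how the three summary statistics relate to one another, the Python snippet below takes a list of LM statistics, one per candidate break point, and returns the sup, exp and avg versions. The function name, the equal weighting of candidate break points and the input values are assumptions made for illustration. The exp LM formula used is the commonly quoted log-average of exp(LM/2). Critical values still have to be taken from the tables in Andrews (1993) and Andrews and Ploberger (1994).

```python
import math
from typing import Sequence, Tuple


def break_test_summaries(lm_stats: Sequence[float]) -> Tuple[float, float, float]:
    """Return (sup LM, exp LM, avg LM) from LM statistics over candidate break points.

    Assumes each candidate break point receives equal weight.
    """
    if not lm_stats:
        raise ValueError("need at least one LM statistic")
    sup_lm = max(lm_stats)
    avg_lm = sum(lm_stats) / len(lm_stats)
    # exp LM = log of the average of exp(LM / 2); subtract the maximum
    # before exponentiating to avoid overflow for large statistics.
    halves = [0.5 * x for x in lm_stats]
    shift = max(halves)
    exp_lm = shift + math.log(sum(math.exp(h - shift) for h in halves) / len(halves))
    return sup_lm, exp_lm, avg_lm


# Hypothetical LM statistics at three candidate break points.
sup_lm, exp_lm, avg_lm = break_test_summaries([2.0, 4.0, 6.0])
print(sup_lm, avg_lm)
```

By construction exp LM can never exceed half of sup LM, and avg LM can never exceed sup LM, which is a quick sanity check on any reported triple.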
The results suggest weak evidence that the two-moment CAPM contains a structural break, and no evidence that models which include coskewness and the SMB and HML factors have a structural break.

This weak evidence is at odds with Ghysels (1998), who finds that even models with implicit betas (or models such as ours that do not require explicit modelling of betas) exhibit structural breaks. One reason for these inconsistent findings is that our approach does not require any asset-specific parameters to be estimated, while Ghysels' (1998) approach does.

Table 2.14: Testing for Structural Breaks
Andrews (1993) and Andrews and Ploberger (1994) sup LM, exp LM and avg LM tests for structural breaks are reported for each model estimated. The naming convention refers to two- and three-moment CAPM when modelling market moments, and con. (for constrained) three-moment CAPM when using the sign-corrected price of covariance; three-moment-three-factor includes the FF factors and coskewness with explicitly modelled prices of risk. * indicates significance at the 0.10 level, ** significance at the 0.05 level, and *** significance at the 0.01 level.

Model                            sup LM       exp LM       avg LM
FF25
Two-Moment CAPM                 28.9948*     12.8685      10.6899*
Three-Moment CAPM               39.9903      21.5566      16.7389
Three-Factor                    48.3197      27.0480      20.7299
Con. Three-Moment CAPM          40.7286      23.2528      17.4653
Three-Moment-Three-Factor       66.4202      41.1771      30.4702
Industry
Two-Moment                      33.8269**    14.2092      12.6778**
Three-Moment                    32.8856      19.0434      13.8143
Three-Factor                    46.5592      24.5720      19.6037
Con. Three-Moment CAPM          40.9074      23.9948      17.2881
Three-Moment-Three-Factor       68.6896      40.9831      30.9462

2.8  Comparison with Dittmar (2002)

Of the existing literature on nonlinear asset pricing, the closest in spirit to the current research is Dittmar (2002). However, our results, which broadly support the three-moment CAPM as an adequate characterization of the cross-section of equity returns, are at odds with Dittmar's (2002) results rejecting the three-moment CAPM. Dittmar (2002) empirically tests a pricing kernel that is a polynomial in market returns. This polynomial is motivated as an approximation to the true pricing kernel, which in a representative agent economy is that agent's intertemporal marginal rate of substitution. This polynomial approximation is closely linked with nonlinear asset pricing models: a first-order polynomial is equivalent to the mean-variance CAPM, the second-order polynomial is equivalent to a three-moment asset pricing model, while adding the cubed market return incorporates preference for the fourth moment of asset returns. The highest polynomial considered was cubic since we can sign preferences of "reasonable" agents only out to the fourth moment. To proxy for the return to wealth, he includes as factors the return on the CRSP value-weighted stock market index and, following Jagannathan and Wang (1996), the lagged moving-average-smoothed monthly growth rate of labour income. In short, the linear pricing kernel is related to the basic CAPM, the quadratic pricing kernel to the three-moment CAPM, and the cubic pricing kernel to a four-moment CAPM. Under certain assumptions about the aggregate investor's preferences we can impose constraints on the sign of the coefficients on the returns to wealth raised to different powers in the stochastic discount factor (or SDF) (see footnote 12).

Dittmar (2002) finds a number of interesting results.
When only the return on the stock index is included in the return to wealth, all asset pricing models up to and including the fourth moment are rejected by the data. However, adding labour income growth dramatically improves the empirical performance, such that the p-value for the Hansen-Jagannathan distance statistic increases to approximately 23 percent. In the current essay, by contrast, we find that the three-moment CAPM adequately prices the cross-section of equity returns. Why do these two approaches yield such contradictory results?

Footnote 12: Note that Dittmar (2002) uses the aggregate investor's aversion to kurtosis to sign the coefficient on cubed market returns and the labour growth rate. As previously argued, although we have sound economic reason to think that investors are averse to kurtosis, to our knowledge there are no results proving that aversion to the fourth moment aggregates.

Virtually all asset pricing theories, whether statements of general equilibrium or the law of one price, can be represented as a stochastic discount factor (see Cochrane (2001)). A stochastic discount factor is a random variable $\xi_t$ such that all asset prices satisfy

$$E_{t-1}[(1 + R_{i,t})\xi_t] = 1 \tag{2.62}$$

and that

$$E_{t-1}[\xi_t] = (1 + R_{f,t})^{-1} \tag{2.63}$$

for the time $t$ conditionally risk-free asset $R_{f,t}$. The GMM technique is an extremely powerful tool for estimating stochastic discount factors. See Cochrane (1996, 2001) and Jagannathan and Wang (1996, 2001) for discussions and examples of this methodology.

Just as the two-moment CAPM has an SDF representation, so too does the three-moment CAPM, as demonstrated in Harvey and Siddique (1999) and the appendix.
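To make the two-moment case concrete, the following is a short worked derivation. It is a sketch under stated assumptions, not the appendix's own algebra: we assume a kernel that is linear in the net market return, $\xi_t = a_t + b_t R_{m,t}$, and we require it to price the risk-free asset and the market portfolio under the net-return convention of (2.62) and (2.63). The symbols $a_t$ and $b_t$ are introduced here for illustration, and $\mu_t$ and $\sigma_t^2$ denote the conditional mean and variance of $R_{m,t}$.

```latex
% Pricing the risk-free asset, as in (2.63):
a_t + b_t \mu_t = (1 + R_{f,t})^{-1}

% Pricing the market portfolio, as in (2.62) with i = m:
a_t (1 + \mu_t) + b_t (\mu_t + \mu_t^2 + \sigma_t^2) = 1

% Subtract (1 + \mu_t) times the first equation from the second:
b_t \sigma_t^2 = 1 - \frac{1 + \mu_t}{1 + R_{f,t}}
\quad\Longrightarrow\quad
b_t = -\frac{\mu_t - R_{f,t}}{(1 + R_{f,t})\,\sigma_t^2},
\qquad
a_t = (1 + R_{f,t})^{-1} - b_t \mu_t

% Substituting back gives the kernel in deviation-from-mean form:
\xi_t = (1 + R_{f,t})^{-1}
        \left[ 1 - \frac{\mu_t - R_{f,t}}{\sigma_t^2}\,(R_{m,t} - \mu_t) \right]
```

Under these assumptions the slope is the market's expected excess return per unit of variance, scaled by the gross risk-free return, and the kernel turns negative whenever the market return is far enough above its conditional mean.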
Consider the following random variable, which we will demonstrate is a stochastic discount factor equivalent to the empirical version of the three-moment CAPM discussed above:

$$\xi_t = \zeta_{0,t} + \zeta_{1,t}(r_{m,t} - \mu_t) + \zeta_{2,t}(\varepsilon_{m,t}^2 - \sigma_t^2), \tag{2.64}$$

where $\zeta_{0,t} = (1 + R_{f,t})^{-1}$, and the coefficients $\zeta_{1,t}$ and $\zeta_{2,t}$ are built from the prices of covariance and coskewness risk, each divided by the gross risk-free return $(1 + R_{f,t})$. This demonstrates the very close link between the co-moment methodology used in this essay and the SDF methodology.

This variable is derived by taking the pricing constraint implied by the three-moment CAPM (2.26) and dividing through by the gross risk-free return $(1 + R_{f,t})$, which gives

$$E_{t-1}[r_{i,t}\xi_t] = 0. \tag{2.65}$$

Dividing through by the gross conditional risk-free return does not affect the conditional expectation of the pricing error, and so long as the mean and variance are unbiased, the expectation of the pricing kernel in equation (2.64) is

$$E_{t-1}[\xi_t] = \zeta_{0,t} + \zeta_{1,t}E_{t-1}(r_{m,t} - \mu_t) + \zeta_{2,t}E_{t-1}(\varepsilon_{m,t}^2 - \sigma_t^2) = (1 + R_{f,t})^{-1}, \tag{2.66}$$

which is a necessary condition of the pricing kernel. Taken together, equations (2.65) and (2.66) verify that $\xi_t$, as defined in equation (2.64), is indeed a stochastic discount factor. Finally, taking the unconditional expectation of both sides and applying the law of iterated expectations gives the unconditional excess return stochastic discount factor representation:

$$E[r_{i,t}\xi_t] = 0.$$

Consider now the calculation of $E_{t-1}[(1 + R_{f,t})\xi_t]$. Since the gross risk-free return appears in the denominator of every term of $\xi_t$, we have

$$E_{t-1}[(1 + R_{f,t})\xi_t] = E_{t-1}\left[1 + (1 + R_{f,t})\zeta_{1,t}(r_{m,t} - \mu_t) + (1 + R_{f,t})\zeta_{2,t}(\varepsilon_{m,t}^2 - \sigma_t^2)\right] = 1 \tag{2.67}$$

since $E_{t-1}(r_{m,t} - \mu_t) = E_{t-1}(\varepsilon_{m,t}^2 - \sigma_t^2) = 0$.
Adding each side of (2.65) to each side of (2.67), and noting that $R_{i,t} = R_{f,t} + r_{i,t}$, we obtain:

$$E_{t-1}[(1 + R_{i,t})\xi_t] = 1, \tag{2.68}$$

which enables direct comparison with Dittmar (2002), who considers the gross-return specification of a pricing kernel which (among other things) is a quadratic function of gross market returns. One can verify by simple algebra that $\xi_t$ is similarly a quadratic function of market returns, but that the coefficients have very specific values related to the conditional moments of market returns. In this representation of the three-moment CAPM's pricing kernel we must model four terms: the prices of covariance and coskewness risk, and the mean and variance of excess market returns. In Dittmar (2002) there are only three terms to be modelled: the coefficients on each of the terms in the quadratic equation. Even though the conditional beta method discussed previously pins down the coefficients of the pricing kernel, the added flexibility granted by the extra parameter dramatically improves the model's performance, more than making up for the decreased degrees of freedom in the $J_T$ statistic. Furthermore, even though we no longer impose the moment conditions on the mean, variance and skewness, as will be shown below, by including the risk-free asset as a security to be priced we effectively impose the mean and variance pricing restrictions.

Dittmar (2002) takes a different route to testing an SDF which incorporates preference for skewness (and also kurtosis), by defining a $p$-th order polynomial in the market return:

$$\xi_t = \eta_{0,t} + \sum_{i=1}^{p} \eta_{i,t}\, r_{m,t}^{i} \tag{2.69}$$

where

$$\eta_{i,t} = s_i\,(Z_{t-1}'\eta_i)$$

are the time-varying coefficients, which incorporate conditional pricing, and $s_i$ is a sign variable taking the value one for even $i$ and negative one for odd $i$, accounting for the preference for mean and skewness and the aversion to variance and kurtosis.
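The polynomial kernel can be sketched in Python as follows. This is a minimal illustration under assumptions, not Dittmar's or the thesis's estimation code: the function names, the treatment of the constant term as an even power (sign +1), and the optional `transform` argument are all introduced here. With the default identity transform the coefficient is the signed linear index in the information variables. Passing a positive transform such as squaring, which is an assumption rather than something stated in this section, would force the alternating signs to hold for every value of the information variables.

```python
from typing import Callable, List, Sequence


def kernel_coefficients(
    z_lagged: Sequence[float],
    eta: Sequence[Sequence[float]],
    transform: Callable[[float], float] = lambda x: x,
) -> List[float]:
    """Time-varying coefficients s_i * transform(Z_{t-1}' eta_i) for i = 0..p."""
    coefficients = []
    for i, eta_i in enumerate(eta):
        if len(eta_i) != len(z_lagged):
            raise ValueError("each eta_i must have the same length as z_lagged")
        index = sum(z * e for z, e in zip(z_lagged, eta_i))
        sign = 1.0 if i % 2 == 0 else -1.0  # +1 for even powers, -1 for odd powers
        coefficients.append(sign * transform(index))
    return coefficients


def polynomial_kernel(
    z_lagged: Sequence[float],
    eta: Sequence[Sequence[float]],
    market_return: float,
    transform: Callable[[float], float] = lambda x: x,
) -> float:
    """Evaluate the p-th order polynomial kernel at one market return."""
    coefficients = kernel_coefficients(z_lagged, eta, transform)
    return sum(c * market_return ** i for i, c in enumerate(coefficients))


# Hypothetical inputs: a constant plus one information variable (L = 2), quadratic kernel.
z = [1.0, 0.5]
eta = [[1.0, 0.0], [0.5, 0.2], [0.3, -0.1]]
print(kernel_coefficients(z, eta))
print(polynomial_kernel(z, eta, market_return=0.1))
```

A quadratic kernel of this kind uses one L-vector per polynomial term, which is the 3L parameter count that the comparison with the moment-based specification relies on.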
The first reason that this study supports the three-moment CAPM while Dittmar (2002) does not is that the SDF representation of the three-moment CAPM, which involves explicit modelling of market moments, provides a better fit to the data than the polynomial approximation. We demonstrate this point by fitting the two- and three-moment CAPM SDF specifications that explicitly model the moments of market returns, and the linear, quadratic and cubic polynomial approximations. The parameters are estimated using GMM with two different model-independent weighting matrices to ensure direct comparisons can be made. The first is an identity matrix and the second is the inverse of the second moment matrix, as advocated by Hansen and Jagannathan (1991).

Table 2.15 reports the value of the objective function, along with the p-value of Zhou's (1994) $H_T$ specification test. We follow Dittmar and consider the industry portfolios, augmented with the conditionally risk-free asset (see footnote 13). The estimation uses the same information variables as in previous sections of this essay, which include the default spread, which is not one of Dittmar's variables. As can be seen from equation (2.67), the effect of including the gross risk-free return is to tie down the conditional mean and variance to the market returns, so we will continue to interpret these parameters in this fashion.

Footnote 13: Although Dittmar used 20 portfolios, our results use only 17 industry portfolios.

Table 2.15: Stochastic Discount Factor Estimation of Industry Data
The value of the minimized criterion function for the multi-moment asset pricing models in SDF form is presented for two arbitrary weighting matrices: the identity matrix and the inverse of the second-moment matrix. The table also reports, in parentheses, the p-values of the $H_T$ test of Zhou (1994). We report the SDF version of the two- and three-moment CAPM models and the first- through third-order Taylor series expansion of Dittmar (2002).

Model            Identity Matrix      Second Moment Matrix
Two-Moment       0.3501 (0.0026)      0.5369 (0.0053)
Three-Moment     0.2026 (0.0091)      0.4855 (0.0076)
First-Order      0.3095 (0.0077)      0.5662 (0.0091)
Second-Order     0.2651 (0.0060)      0.5587 (0.0083)
Third-Order      0.2314 (0.0077)      0.5390 (0.0114)

The two most important points to take from this table are that the three-moment CAPM approach provides a much better pricing fit than even the cubic pricing kernel, which is consistent with a four-moment CAPM; and that, despite this superior pricing performance, the p-values of the specification tests still indicate that the model is rejected by the data.

The two most important differences between the modelling approaches are the number of parameters and the shape of the mapping from information to coefficients. The moment modelling approach has 4L + 1 parameters, while the quadratic polynomial specification has only 3L parameters. The second difference is the relationship between the parameters and the coefficients in the polynomial. By expanding the moment modelling SDF and collecting terms in $r_{m,t}$ and $r_{m,t}^2$, we observe that all the coefficients in the polynomial depend on all parameters, while the polynomial approach uses only L parameters on each term. These two effects, the extra L + 1 parameters and the explicit dependence of each coefficient on all parameters, drive the improved fit.
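To illustrate how a criterion value of the kind reported in Table 2.15 might be computed for a given kernel series, the sketch below evaluates the quadratic form in the average pricing errors under the two weighting matrices named above. It is a simplified illustration under assumptions: it uses unconditional moments only (no information variables), takes the kernel series as given rather than estimating it, and all function names and numbers are hypothetical. It is written in plain Python, with a small Gaussian-elimination solver in place of a matrix library, so that it is self-contained.

```python
from typing import List, Sequence


def pricing_errors(gross_returns: Sequence[Sequence[float]],
                   kernel: Sequence[float]) -> List[float]:
    """Average pricing errors g_i = mean_t(R_{i,t} * m_t) - 1, one per asset."""
    t, n = len(kernel), len(gross_returns[0])
    return [sum(gross_returns[s][i] * kernel[s] for s in range(t)) / t - 1.0
            for i in range(n)]


def second_moment_matrix(gross_returns: Sequence[Sequence[float]]) -> List[List[float]]:
    """Sample second-moment matrix of gross returns, mean_t(R_t R_t')."""
    t, n = len(gross_returns), len(gross_returns[0])
    return [[sum(row[i] * row[j] for row in gross_returns) / t for j in range(n)]
            for i in range(n)]


def solve(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("weighting matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x


def criterion(gross_returns: Sequence[Sequence[float]],
              kernel: Sequence[float],
              weighting: str = "identity") -> float:
    """Quadratic form g' W g, with W the identity or the inverse second-moment matrix."""
    g = pricing_errors(gross_returns, kernel)
    if weighting == "identity":
        return sum(e * e for e in g)
    if weighting == "second_moment":
        x = solve(second_moment_matrix(gross_returns), g)  # x = W g without inverting
        return sum(gi * xi for gi, xi in zip(g, x))
    raise ValueError("weighting must be 'identity' or 'second_moment'")


# Hypothetical data: three periods, two assets, and a candidate kernel series.
returns = [[1.02, 1.10], [1.01, 0.95], [1.03, 1.05]]
kernel = [0.97, 1.00, 0.96]
print(criterion(returns, kernel, "identity"))
print(criterion(returns, kernel, "second_moment"))
```

Because neither weighting matrix depends on the model being evaluated, criterion values computed this way can be compared across kernels, which is the reason given above for using model-independent weighting matrices.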
2.9  Conclusions

This essay develops a methodology for testing nonlinear asset pricing models in the spirit of the conditional tests of the CAPM and multi-factor pricing models of Harvey (1989) and He et al. (1996). To keep the number of parameters to a bare minimum we use a transformation of the conditional co-moments that avoids the need to model any asset-specific parameters. The model is very parsimonious: we are able to test the restrictions of the theory while estimating the bare minimum number of parameters, all of which are common to all assets in the economy, and we avoid making any restrictive assumptions on the dynamics of co-moments, the joint distribution of returns and factors, or the exact form of heteroskedasticity. These latter two benefits derive from our use of GMM.

The resulting moment conditions are very close in form and spirit to the SDF methodology. The reason for this is that we write all terms in the pricing equation as a product of the relevant co-moment and the price of that co-moment's risk. The co-moment is then rewritten as the product of the return on the asset and a term which includes the return on the factor and various conditional moments of the factor. When we combine this setup with generalized method of moments estimation we end up with a representation that looks very much like an SDF. In fact we can turn the co-moment restrictions into an SDF moment restriction by simply dividing through by the gross conditional risk-free return. The key difference between the two approaches is that the co-moment approach directly links the price of risk to the moments of market returns as implied by the asset pricing theory, while the SDF method does not.

The essay documents a number of interesting and important empirical observations. First, we show that standard information variables predict portfolio returns and explain time variation in covariance and coskewness. To verify our simplification of the conditional covariance and coskewness, we compare the rewritten form of the co-moments (i.e. the specifications that avoid modelling the conditional mean return on the individual assets) with their traditional specification, and we cannot reject the hypothesis that they are equivalent. This lends credence to the validity of our pragmatic modelling choice.

We also find that the three-moment CAPM is broadly consistent with the data. Although the data soundly reject the two-moment CAPM, the three-moment CAPM has goodness-of-fit statistics that are not rejected at conventional levels (about the five percent level), and the prices of risk, in addition to the co-moments, are time-varying. The parameter estimates indicate that investors care about covariance risk and also coskewness risk when the market is positively skewed, but are only marginally averse to negative skewness. We also find that including a cross-section of equity returns contributes to the precision with which we can estimate the time-series dynamics of the market's variance and skewness.

We also fit the three-factor asset pricing model due to Fama and French (1993), which has become an industry standard, and find that, depending on the choice of weighting matrix, the improvement due to adding the two extra factors SMB and HML is similar to the improvement that results from including skewness, and that the increased explanatory power due to the FF factors is attenuated after accounting for conditional coskewness. Interestingly, the three-factor model does not appear to capture all the predictive ability of the information variables, while the three-moment model does. The hypothesis of stable parameters is rejected for the two-moment CAPM, while both the three-factor and three-moment models have stable parameters.
The FF factors do appear to retain some usefulness in explaining returns beyond coskewness, and coskewness likewise retains its explanatory power beyond the FF factors.

These results are at odds with Dittmar's (2002) rejection of the three-moment CAPM using an SDF framework. The essay demonstrates that there are two basic reasons for this difference. First, the three-moment specification employed has more parameters than the comparable SDF setup, and the functional form mapping the information variables to polynomial coefficients is significantly more flexible. Second, there appear to be gains from including the moment restrictions linking the prices of risk to market moments, since the co-moment specification is not rejected by the industry data, but the SDF method is.

The fundamental conclusion reached by this research is twofold:
• That coskewness is important for pricing the cross-section of equity returns.
• That both risk, including covariance and coskewness risk, and the prices of risk are strongly time-varying.

The great flexibility that our chosen specification and econometric methodology have afforded us in drawing these robust conclusions comes at a price. Because we do not have an explicit representation for the dynamics of conditional covariance and coskewness, we are unable to make predictions about returns in the normal sense. Future research should attempt to specify the dynamics of covariances and coskewness so that this model can be used for estimating costs of capital.

2.A  Appendix: Conditional Asset Pricing

The three-moment CAPM, as developed by Kraus and Litzenberger (1976), is a static, one-period representative agent model, just as is the basic two-moment CAPM. However, it is relatively straightforward to reconcile a conditional two-moment CAPM by specifying the stochastic discount factor in the economy.
It is well known that in an economy that does not permit arbitrage, there will exist a random variable $m_t$ such that

$$E((1 + R_{i,t})\, m_t) = 1 \tag{2.70}$$

where $R_{i,t}$ is the return (and not the excess return) on any asset $i$. The pricing kernel $m_t$ is the same across all assets in the economy. The conditional two-moment CAPM is equivalent to a pricing kernel that is linear in the market return

$$m_t = \delta_{0,t} + \delta_{1,t} R_{m,t} \tag{2.71}$$

where the coefficients $\delta_{0,t}$ and $\delta_{1,t}$ are functions of the risk-free return, $\mu_t$ and $\sigma_t^2$, and where $\mu_t = E_{t-1}(R_{m,t})$ is the expected market return and $\sigma_t^2 = E_{t-1}(R_{m,t} - E_{t-1}(R_{m,t}))^2$ is the conditional variance of the market return. This can be verified by simple algebra. The pricing kernel representation of the conditional two-moment CAPM is obtained by simple algebraic manipulation of the conditional expectation. Jagannathan and Wang (1996) derive a similar representation for an unconditional multi-beta pricing model (though time variation in the market beta is incorporated by a factor depending on the term premium).

We can similarly obtain a pricing kernel that is consistent with the static three-moment CAPM. This pricing kernel is a quadratic function of the return on the market portfolio, whose coefficients depend on the moments of the market return and on the relative prices of variance and skewness risks:

$$m_t = \delta_{0,t} + \delta_{1,t} R_{m,t} + \delta_{2,t} R_{m,t}^2 \tag{2.72}$$

Simple calculations verify the constraint on the pricing kernel that $E_{t-1}(m_t) = R_{f,t}^{-1}$.

The main point to take from this discussion is that although both the two- and three-moment CAPMs were originally derived as static models, they can also be derived in a dynamic context, since they are consistent with some pricing kernel obtained by imposing suitable restrictions on utility functions. Note that although absence of
arbitrage is the standard motivation for the stochastic discount factor representation of dynamic asset pricing models, it is not a motivation in this context because the pricing kernels implied by both the two- and three-moment CAPMs are negative with positive probability.

References

Andrews, Donald W. K., 1993, Tests for Parameter Instability and Structural Change with Unknown Change Point, Econometrica 61, 821-856.
Andrews, Donald W. K., and Werner Ploberger, 1994, Optimal Tests when a Nuisance Parameter is Present Only Under the Alternative, Econometrica 62, 1383-1414.
Black, Fisher, Michael Jensen, and Myron Scholes, 1973, The Capital Asset Pricing Model: Some Empirical Tests, in Michael Jensen, ed.: Studies in the Theory of Capital Markets (Praeger, New York, NY).
Bodurtha, James, and Nelson Mark, 1991, Testing the CAPM with Time Varying Risks and Returns, Journal of Finance 46, 1485-1505.
Bollerslev, Tim, Robert Engle, and Jeff Wooldridge, 1988, A Capital Asset Pricing Model with Time Varying Covariances, Journal of Political Economy 96, 116-131.
Bos, T., and P. Newbold, 1984, An Empirical Investigation of the Possibility of Systematic Stochastic Risk in the Market Model, Journal of Business 57, 35-41.
Breeden, D., 1979, An Intertemporal Asset Pricing Model with Stochastic Consumption and Investment Opportunities, Journal of Financial Economics 7, 265-296.
Campbell, John Y., 2000, Asset Pricing at the Millennium, Journal of Finance 55, 1515-1567.
Campbell, John Y., Andrew W. Lo, and A. Craig MacKinlay, 1997, The Econometrics of Financial Markets (Princeton University Press, Princeton, USA).
Chen, Nai-fu, Richard Roll, and Stephen A. Ross, 1986, Economic Forces and the Stock Market, Journal of Business 59, 383-404.
Cochrane, John H., 1996, A Cross-Sectional Test of an Investment-Based Asset Pricing Model, Journal of Political Economy 104, 572-621.
Cochrane, John H., 2001, Asset Pricing (Princeton University Press, Princeton, USA).
Davidian, M., and R. J. Carroll, 1987, Variance Function Estimation, Journal of the American Statistical Association 82, 1079-1091.
Dittmar, Robert F., 2002, Nonlinear Pricing Kernels, Kurtosis Preference, and Evidence from the Cross-Section of Equity Returns, Journal of Finance 57, 369-403.
Fabozzi, F. J., and J. C. Francis, 1978, Beta as a Random Coefficient, Journal of Financial and Quantitative Analysis 13, 101-115.
Fama, Eugene, and Kenneth French, 1992, The Cross-Section of Expected Returns, Journal of Finance 47, 427-465.
Fama, Eugene, and Kenneth French, 1993, Common Risk Factors in the Returns on Stocks and Bonds, Journal of Financial Economics 33, 3-56.
Fama, Eugene, and J. MacBeth, 1973, Risk, Return and Equilibrium: Empirical Tests, Journal of Political Economy 81, 607-636.
Fang, Hsing, and Tsong-Yue Lai, 1997, Co-Kurtosis and Capital Asset Pricing, The Financial Review 32, 293-307.
Ferson, Wayne E., and Campbell R. Harvey, 1991, The Variation of Economic Risk Premiums, Journal of Political Economy 99, 385-415.
Ferson, Wayne E., and Campbell R. Harvey, 1993, The Risk and Predictability of International Equity Returns, Review of Financial Studies 6, 527-566.
Ferson, Wayne E., and Campbell R. Harvey, 1999, Conditioning Variables and the Cross Section of Stock Returns, Journal of Finance 54, 1325-1360.
Friend, I., and R. Westerfield, 1980, Co-skewness and Capital Asset Pricing, Journal of Finance 35, 897-913.
Ghysels, Eric, 1998, On Stable Factor Structures in the Pricing of Risk: Do Time-Varying Betas Help or Hurt?, Journal of Finance 53, 549-573.
Hansen, Lars Peter, 1982, Large Sample Properties of Generalized Method of Moments Estimators, Econometrica 50, 1029-1054.
Hansen, Lars Peter, and Ravi Jagannathan, 1991, Implications of Security Market Data for Models of Dynamic Economies, Journal of Political Economy 99, 225-262.
Harvey, Campbell, 1989, Time Varying Conditional Covariances in Tests of Asset Pricing Models, Journal of Financial Economics 24, 289-317.
Harvey, Campbell, and Akhtar Siddique, 1999, Autoregressive Conditional Skewness, Journal of Financial and Quantitative Analysis 34, 465-488.
Harvey, Campbell, and Akhtar Siddique, 2000a, Conditional Skewness in Asset Pricing Tests, Journal of Finance 55, 1263-1295.
Harvey, Campbell, and Akhtar Siddique, 2000b, Time-Varying Conditional Skewness and the Market Risk Premium, in Research in Banking and Finance, Vol. 1.
He, Jia, Raymond Kan, Lilian Ng, and Chu Zhang, 1996, Tests of the Relations Among Marketwide Factors, Firm-Specific Variables, and Stock Returns Using a Conditional Asset Pricing Model, Journal of Finance 51, 1891-1908.
Hochberg, Yosef, 1988, A Sharper Bonferroni Procedure for Multiple Tests of Significance, Biometrika 75, 800-802.
Jagannathan, Ravi, and Zhenyu Wang, 1996, The Conditional CAPM and the Cross-Section of Expected Returns, Journal of Finance 51, 3-53.
Jagannathan, Ravi, and Zhenyu Wang, 2001, Empirical Evaluation of Asset Pricing Models: A Comparison of the SDF and Beta Methods, Working paper, Kellogg Graduate School of Management, Northwestern University.
Kraus, Alan, and Robert Litzenberger, 1976, Skewness Preference and the Valuation of Risk Assets, Journal of Finance 31, 1085-1100.
Kraus, Alan, and Robert Litzenberger, 1983, On the Distributional Conditions for a Consumption-oriented Three Moment CAPM, Journal of Finance 38, 1381-1391.
Lim, Kian-Guan, 1989, A New Test of the Three-Moment CAPM, Journal of Financial and Quantitative Analysis 24, 205-216.
Lintner, John, 1965, The Valuation of Risky Assets and the Selection of Risky Investments in Stock Portfolios and Capital Budgets, Review of Economics and Statistics 47, 13-37.
Merton, Robert, 1973, An Intertemporal Capital Asset Pricing Model, Econometrica 41, 867-887.
Ng, Lilian, 1991, Tests of the CAPM with Time-Varying Covariances: A Multivariate GARCH Approach, Journal of Finance 46, 1507-1521.
Ogaki, M., 1993, Generalized Method of Moments: Econometric Applications, in G. S. Maddala, C. R. Rao, and H. D. Vinod, eds.: Handbook of Statistics, Vol. 11: Econometrics (North-Holland, Amsterdam).
Ross, Stephen, 1977, The Arbitrage Theory of Capital Asset Pricing, Journal of Economic Theory 13, 341-360.
Schwert, G. William, and Paul Seguin, 1990, Heteroscedasticity in Stock Returns, Journal of Finance 45, 1129-1155.
Scott, Robert C., and Philip A. Horvath, 1980, On the Direction of Preference for Moments of Higher Order than the Variance, Journal of Finance 35, 915-919.
Sharpe, William F., 1964, Capital Asset Prices: A Theory of Market Equilibrium Under Conditions of Risk, Journal of Finance 19, 425-442.
Thompson, Spencer C., 1986, An Econometric Evaluation of Parameter Estimation within the Capital Asset Pricing Model, Ph.D. thesis, University of Queensland.
Whitelaw, Robert F., 1994, Time Variations and Covariations in the Expectation and Volatility of Stock Market Returns, Journal of Finance 49, 515-541.
Zhou, Guofu, 1994, Analytical GMM Tests: Asset Pricing with Time-Varying Risk Premiums, Review of Financial Studies 7, 687-709.
