AN EMPIRICAL ANALYSIS OF THE CORPORATE CALL DECISION

By Murray Carlson
B.Sc. (Applied Science), Queen's University
MBA, University of British Columbia

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE STUDIES
DEPARTMENT OF FINANCE
FACULTY OF COMMERCE AND BUSINESS ADMINISTRATION

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
March 1998
© Murray Carlson, 1998

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Finance
Faculty of Commerce and Business Administration
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1Z1

Date:

Abstract

In this thesis we provide insights into the behavior of financial managers of utility companies by studying their decisions to redeem callable preferred shares. In particular, we investigate whether or not an option pricing based model of the call decision, with managers who maximize shareholder value, does a better job of explaining callable preferred share prices and call decisions than do other models of the decision. In order to perform these tests, we extend an empirical technique introduced by Rust (1987) to include the use of information from preferred share prices in addition to the call decisions.
The model we develop to value the option embedded in a callable preferred share differs from standard models in two ways. First, as suggested in Kraus (1983), we explicitly account for transaction costs associated with a redemption. Second, we account for state variables that are observed by the decision makers but not by the preferred shareholders. We interpret these unobservable state variables as the benefits and costs associated with a change in capital structure that can accompany a call decision. When we add this variable, our empirical model changes from one which predicts exactly when a share should be called to one which predicts the probability of a call as the function of the observable state. These two modifications of the standard model result in predictions of calls, and therefore of callable preferred share prices, that are consistent with several previously unexplained features of the data; we show that the predictive power of the model is improved in a statistical sense by adding these features to the model. The pricing and call probability functions from our model do a good job of describing call decisions and preferred share prices for several utilities. Using data from shares of the Pacific Gas and Electric Co. (PGE) we obtain reasonable estimates for the transaction costs associated with a call. Using a formal empirical test, we are able to conclude that the managers of the Pacific Gas and Electric Company clearly take into account the value of the option to delay the call when making their call decisions. Overall, the model seems to be robust to tests of its specification and does a better job of describing the data than do simpler models of the decision making process. Limitations in the data do not allow us to perform the same tests in a larger cross-section of utility companies. However, we are able to estimate transaction cost parameters for many firms and these do not seem to vary significantly from those of PGE. 
This evidence does not cause us to reject our hypothesis that managerial behavior is consistent with a model in which managers maximize shareholder value.

Contents

Abstract
List of Figures
List of Tables
Acknowledgements
1 Introduction
2 Literature Review
  2.1 Models of the Call Decision: Theory
  2.2 Models of the Call Decision: Tests
3 The Theoretical Model
  3.1 Assumptions and Methodology
  3.2 The Cross Sectional Relationships between Prices
  3.3 Alternative Models of the Call Decision
4 Empirical Implementation
  4.1 The Numerical Procedure
  4.2 The Likelihood Function
5 Empirical Results
  5.1 The Data
    5.1.1 Liquidity
    5.1.2 Regulatory Environment
    5.1.3 Taxation
  5.2 An Empirical Analysis of the Call Decisions of the Pacific Gas and Electric Co.
    5.2.1 Benchmark Parameter Estimates
    5.2.2 Specification Tests
  5.3 Transaction Cost Estimates for Other Utilities
6 Conclusions and Directions for Future Research
Appendices
A Technical Appendix
  A.1 Existence of Preferred Share Prices
  A.2 Existence of Callable Preferred Share Prices
  A.3 Solving for the Optimal Call Policy
  A.4 Characteristics of the Optimal Call Policy and of Callable Preferred Share Prices
  A.5 The Q Operator for Specific Models of Interest Rate Dynamics
  A.6 Making the State Space Discrete

List of Figures

3.1 Non-callable preferred share price vs. interest rate
3.2 Callable preferred share price vs. state variable
3.3 Value of callable preferred share liability
3.4 Optimal exercise region
3.5 Price vs. interest rate - timing considerations
3.6 Price vs. interest rate - effect of unobservable state
3.7 Price vs. dividend rate
3.8 Price vs. interest rate - static vs. dynamic model
5.1 Share price behavior with and without sinking fund
5.2 Callable share prices vs. non-callable share price - Pacific Gas and Electric Co.
5.3 The time series of pricing errors - PGE
5.4 The time series of the first difference of pricing errors - PGE
5.5 Pricing errors vs. call probability - PGE
5.6 Pricing errors vs. strike price - PGE
5.7 Non-callable preferred share prices implied by parameter estimates

List of Tables

3.1 Notation
3.2 Sample Dataset
5.1 Parameter estimates - Pacific Gas and Electric Co.
5.2 Parameter estimates from other studies
5.3 Test of myopic behavior by PGE managers
5.4 Specification test - alternative models of the term structure
5.5 Specification test - alternative models of the term structure
5.6 Specification test - rational expectations
5.7 Specification test - presence of bid/ask bounce
5.8 Estimates of bid/ask spread
5.9 Specification test - Gauss-Newton regressions
5.10 Parameter estimates - other utilities
5.11 Issuers ranked by distance of parameter estimates from PGE estimates

Acknowledgments

In the summer of my first year Max asked me to "look into this bus paper by Rust". Although I had taken Marty's course on Markov Decision Processes I couldn't quite figure out what was going on in the paper. After a lot of thinking and a few discussions with Marty I figured out enough about the technique to use it in the summer paper of my second year, "Why Fire the Coach?" I didn't do anything more with Rust's paper until late in my third year when Ron suggested I use the technique to analyze the refinancing decision. Three and a half years later, this is the end product.

I'd like to thank the members of my Supervisory Committee for their enthusiasm for the topic, careful guidance, difficult-to-answer questions and well-timed encouragement: Ron Giammarino, Burton Hollifield, Rob Heinkel and Marty Puterman. Thanks also to my son, Scott, and daughter, Emily, who had no idea what was going on and helped me to keep the thesis process in perspective.
Finally, I'd like to thank my wife, Lisa, who never gave up her belief that I'd get this thing finished.

I gratefully acknowledge the financial support of the Social Sciences and Humanities Research Council of Canada.

To Dad and Mom

Chapter 1 Introduction

Although the empirical analysis of compensation contracts may provide indirect evidence about managers' objective functions, we currently have very little direct evidence about how managers make decisions. As a result, a variety of assumptions about objectives underlie models of corporate behavior. A common one is that managers act to maximize firm value. In a world with optimal managerial compensation contracts, this is perhaps the easiest objective to justify (Dybvig and Zender 1991); however, there are good reasons to believe optimal contracting may not always be possible. If managers can renegotiate contracts at no cost, for example, then it may be more reasonable to assume that the managers of these companies maximize shareholder value (Persons 1994). [1] Thus, even if we can perfectly observe the incentive contracts offered to management, there is some question as to whether or not we can predict their subsequent behavior. A better empirical strategy may be to observe how management acts and then deduce their objectives.

[1] For companies where there are insignificant barriers to changes in corporate control, for example, it may be impossible to ensure that existing compensation contracts are not rewritten by new boards of directors. Anti-takeover defenses may increase the cost of this type of renegotiation but will not completely eliminate the effect. This gives scope for managers to behave partly as total firm value maximizers and partly as shareholder value maximizers.

Corporate securities are often issued with some sort of embedded option; callable debt, redeemable preferred shares and convertible debt are common examples. We feel that the decisions made by managers to call (or to not call) these securities constitute ideal opportunities for studying managerial behavior. Compared to many of the decisions made by a firm's managers, the exercise decision is relatively straightforward to model and is easily observed. Furthermore, differences in objectives can lead to differences in call behavior. A well known result for callable debt, for example, is that managers acting to maximize shareholder value should call debt as soon as it is trading at the call price (Brennan and Schwartz 1977).

The picture gets more complicated if we consider the effect of transaction costs on the call decision. Kraus (1983) shows that if the call decision has associated deadweight costs it is not unreasonable to observe securities trading at premiums to their call prices. [2] To the extent that we can identify sources of deadweight costs and model bond prices, we can determine what premiums we may reasonably expect to observe. If we then see debt trading at prices very much in excess of these levels, we can claim to have evidence that managers do not, in fact, maximize shareholder value. On the other hand, since managers do, in fact, call outstanding debt, we have clear evidence that their objective is not to maximize firm value: any transaction costs that are incurred when an outstanding security is called may be avoided by simply not calling the security.

In this paper we analyze the decisions made by managers of utility companies to call issues of outstanding preferred shares. We focus on preferred shares rather than debt because market prices for preferreds are reliable and easily obtained. In addition, to the best of our knowledge no one has studied the decision to redeem preferred shares.
From a common shareholder's point of view, utility company managers appear to exercise their call options on preferred shares sub-optimally; as we will see, the preferred share issues in our sample often trade at prices above their call price. However, as we mentioned above, we need a model of preferred share prices and transaction costs in order to determine whether or not the premiums we observe are inconsistent with the hypothesis that managers act in shareholders' best interests. Our strategy is to develop a dynamic model of the call decision and then test it using actual call decisions and preferred share prices.

[2] For example, a firm often redeems outstanding securities using funds raised by a simultaneous reissue of new securities, in which case underwriting fees and other variable expenses resulting from the new security issue are the relevant deadweight costs.

In order to estimate a dynamic model it is necessary to make strong assumptions relating to the specification of a manager's objectives, the underlying interest rate process and the form of the transaction costs. Firstly, we will assume managers are acting to maximize shareholder value. Secondly, we will restrict our attention to one-factor Markovian term structure models, for example the model of Cox, Ingersoll, and Ross (1985) (CIR). [3] Finally, we will assume that transaction costs are proportional to the value of shares that are redeemed. Armed with these assumptions, we can construct a model of the exercise decisions of managers and price the redeemable preferred shares.

Our analysis would be a straightforward application of bond option pricing theory were it not for the fact that we will explicitly account for the effect of unobserved state variables. More specifically, we will assume that issuers of callable preferred shares base their decisions to exercise on information relevant to the call that preferred shareholders cannot observe.
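One way to see how an unobserved state variable turns a sharp exercise rule into a call probability is the minimal sketch below. It is an illustration under stated assumptions, not the specification estimated in the thesis: the unobserved component is taken to be an additive shock with a logistic distribution, and the names `liability_value`, `cost_rate` and `shock_scale` are hypothetical. The proportional transaction cost enters as a mark-up on the call price, as assumed in the text.

```python
import math


def call_probability(liability_value, call_price, cost_rate, shock_scale):
    """Probability of a call when an unobserved logistic shock is added
    to the observable net benefit of calling (illustrative assumption).

    liability_value: value of the share to its holders if left outstanding
    call_price:      contractual redemption price
    cost_rate:       transaction cost as a fraction of the redeemed value
    shock_scale:     dispersion of the unobserved shock (must be > 0)
    """
    # Observable net benefit: liability retired minus the cost-inflated strike.
    net_benefit = liability_value - call_price * (1.0 + cost_rate)
    z = net_benefit / shock_scale
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# With a wide shock, calls occur with positive probability below the
# cost-inflated strike; with a tiny shock the rule is nearly a step function.
print(call_probability(101.0, 100.0, 0.05, 2.0))
print(call_probability(104.9, 100.0, 0.05, 1e-6))
```

As the shock dispersion shrinks, the probability collapses to the deterministic rule "call once the liability value reaches the cost-inflated call price", which is the sense in which a model without unobservables predicts exactly when a share is called.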
Whereas standard option pricing models would predict that all issues with identical features will be called at the exact same time, our model allows for rational managers to deviate from the apparently optimal policy. Our model will, therefore, be consistent with two features of the data. First, managers often call after an increase in interest rates. This behavior would be inconsistent with that predicted by a traditional bond option pricing model, where the optimal policy is to exercise when interest rates fall to some critical value. Second, we often observe a price drop when preferred shares are called. This phenomenon is also inconsistent with standard option pricing models. If all relevant information is observed by all parties and prices are continuous functions of this information, redeemable preferred share prices will approach the call price as interest rates approach the critical rate since rational investors can deduce the optimal exercise policy (Kraus 1983).

[3] Our empirical technique can easily incorporate more complicated models of the term structure, including two factor models.

Our empirical methodology is an extension of the techniques described in Rust (1987). He was able to learn something about a manager's behavior by observing how decisions were related to an observable state variable. A few details about his technique may help to highlight the similarities and differences between his work and the work undertaken here. Rust modeled the decisions made by the maintenance superintendent of a bus company. One of the manager's duties was to decide when to overhaul or replace bus engines. This problem can be formulated as a "regenerative optimal stopping" problem in which the observable state variable is the number of miles on an engine since the last overhaul.
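A toy version of such a regenerative optimal stopping problem can be sketched with value iteration. This is a deliberately simplified illustration, not Rust's model: mileage advances by one bucket per period with certainty, there are no unobserved state variables, and every number (operating cost, replacement cost, discount factor) is hypothetical.

```python
def solve_replacement(n_states=50, op_cost=0.1, replace_cost=5.0,
                      beta=0.95, tol=1e-10):
    """Value iteration for a toy engine-replacement problem.

    State x is the mileage bucket since the last overhaul. Each period the
    manager either keeps the engine (pays op_cost * x, moves to x + 1) or
    replaces it (pays replace_cost; the engine regenerates to mileage 0,
    runs one period at zero operating cost and arrives at x = 1).
    """
    values = [0.0] * n_states

    def action_values(x, v):
        nxt = min(x + 1, n_states - 1)          # mileage is capped at the top bucket
        keep = -op_cost * x + beta * v[nxt]
        replace = -replace_cost + beta * v[1]   # regeneration: state resets
        return keep, replace

    while True:
        updated = [max(action_values(x, values)) for x in range(n_states)]
        gap = max(abs(a - b) for a, b in zip(updated, values))
        values = updated
        if gap < tol:
            break

    # Replace wherever replacing is strictly better than keeping.
    policy = []
    for x in range(n_states):
        keep, replace = action_values(x, values)
        policy.append(replace > keep)
    return values, policy


values, policy = solve_replacement()
threshold = policy.index(True)  # first mileage bucket at which replacement is optimal
print("replace once mileage bucket reaches", threshold)
```

The solution is a threshold rule in the observable state. The call decision has the same structure: the observable state is the interest rate rather than mileage, and "replace" corresponds to calling the outstanding issue.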
Rust's dynamic programming model also accounted for unobservable state variables, which can be interpreted as decision relevant cash flows that the manager observes but outsiders do not. His treatment of the unobservable variable allowed him to develop a "nested fixed point" algorithm that was used to estimate parameters that are primitive to the dynamic programming model, for example the relationship between mileage and variable operating costs, and to test some specific hypotheses about the manager's behavior.

Our application has two important sources of information about managerial behavior: call decisions and callable preferred share prices. In a decision problem of a financial nature, this is the type of data that is likely to be available. It is, therefore, important to extend Rust's "dynamic logit" model to handle prices. In this thesis, we will make use of both decision and price data by simultaneously fitting a dynamic version of a logit model and a modified option pricing model. [4] Modifying the empirical methodology in this way allows us to obtain more precise parameter estimates than either a standard option pricing model or Rust's technique would produce given the same amount of data. In fact, we are able to obtain meaningful parameter estimates from as few as five preferred share issues. Utilities generally have several issues of preferred shares outstanding at any one time; therefore, our technique allows us to gain insights into managerial behavior at the firm level. With firm specific parameter estimates in hand, it is possible to examine differences in the cross-section of firms in our sample.

The thesis is structured as follows. In Chapter 2 we provide a brief survey of the theoretical and empirical literature on the call decision. In Chapter 3 we introduce the theoretical model we will test. Chapter 4 describes the numerical techniques and empirical methodology we employ.
Finally, Chapter 5 describes the data and presents our findings on the behavior of managers of several utility companies.

[4] Our empirical technique may be of interest to those pricing other types of options. It applies in any setting where there appears to be "noise" in agents' exercise decisions.

Chapter 2 Literature Review

In this chapter we review the literature relevant to our study. Where it is appropriate, we will discuss the differences between the existing literature and the work we undertake in this thesis. The chapter has two sections. In the first we survey theoretical models of the corporate call decision and in the second we discuss empirical studies of the exercise decision.

2.1 Models of the Call Decision: Theory

Brennan and Schwartz (1977) were among the first to describe the shareholder value maximizing call policy for callable (non-convertible) bonds or preferred shares in a market with no imperfections. This policy follows in a straightforward manner from one of the boundary conditions for the valuation equation for the bonds. The boundary condition is very intuitive; if the price of the callable liability is above (below) the strike price, calling the outstanding issue would transfer the difference between the market price and call price to (away from) common shareholders; thus, the value maximizing call strategy must have managers calling when the market price equals the strike price.

The story changes when transaction costs are introduced. As mentioned in the introduction, Kraus (1983) explains why managers will rationally delay a call if there are costs associated with the call. In particular, he points out that when deciding to call an outstanding bond or preferred share, managers should consider the costs of issuing the replacing security. The impact on the call decision is straightforward.
Managers will act as if the callable security has an exercise price that is inflated by the amount of the flotation costs. If interest sensitive securities with high strike prices are called at a lower interest rate than those with low strike prices, the call will be "delayed" relative to the case where there are no transaction costs involved. With this delay, the market price of the security can rise above the strike price, since investors in preferred shares receive only the strike price in the event of the call. [1]

Dunn and Spatt (1986) expand on the point made by Kraus by showing that, when a similar security replaces the called security (for example, if callable debt with a lower coupon replaces called debt), the amount by which the market price of the callable security can exceed the call price is bounded by the one-time transaction cost. [2] Thus, the costs of floating a new security provide a metric by which to determine whether or not callable security prices are set by rational traders. For example, if we observe securities in our sample trading at premiums much in excess of 5% of the call price we could conclude that, relative to our model of managerial behavior, managers of utilities are calling suboptimally. [3]

[1] Ling (1991) calculates the prices of a callable bond under these assumptions using numerical techniques. Mauer (1993) derives a closed form solution for callable bond prices under the assumption that interest rates follow a random walk with instantaneous variance proportional to the interest rate cubed.

[2] Barone-Adesi and Delgado (1995) use numerical techniques to determine prices under the Dunn and Spatt strategy.

[3] The Dunn and Spatt bound no longer holds if callable securities are replaced by securities that are callable at a premium. If the present value of future refinancing costs under this refinancing strategy were lower than those of the Dunn and Spatt strategy then their bound would continue to hold. Unfortunately, this is not the case. The effect of the premium is ambiguous; the premium will delay calls and thus decrease the present value of future transaction costs but, since the premia are paid by common shareholders, the direct effect of the premium is to increase the present value of the cost of future calls.

Although the above two papers provide an explanation for why we might find rational managers not calling securities when the market price exceeds the call price, they do not explain why these managers would issue callable securities in the first place. In fact, in the absence of other imperfections, the presence of transaction costs provides one more reason not to issue callable securities. Firm value maximizing managers would never call since this (costly) action simply transfers value from one set of claimants to another. On the other hand, shareholder value maximizing managers will call if the transfer of value to them from the other claimants (preferred shareholders or bondholders) outweighs the flotation costs. Thus, unless we consider the impact of other imperfections, our prediction is that firms whose managers maximize shareholder value will not issue callable liabilities (and thereby avoid transaction costs) while firms whose managers maximize firm value would be indifferent between non-callable securities and callable securities (which would never be exercised). [4] In this paper, our solution is to develop a reduced form model in which there may be other imperfections driving the call decision. Before we explain this further, we introduce some explanations for the existence of the call provision.

Authors have considered the roles of many types of imperfections in explaining the existence of call provisions. Brick and Wallingford (1985) show that differences between the tax treatment of the borrower and lender regarding the payment of the call premium can lead to a tax-based rationale for issuing callable debt.
Barnea, Haugen, and Senbet (1980) "demonstrate how debt maturity structure and call provisions can be rationalized as a means of resolving the agency problems of debt associated with informational asymmetry, risk incentives, and Myers-type foregone growth opportunities" (p. 1224). Flannery (1986) and Robbins and Schatzberg (1986) show how callable debt can be used as a signal of a firm's prospects. Finally, Kraus (1983) shows that the call provision may be necessary for preserving operating flexibility. If bonds have covenants that in some way restrict future investment opportunities, the call price places an upper bound on the amount of rents bondholders can extract from shareholders. [5]

[4] Consider the optimal behavior of the owner of a firm who anticipates the call policy of her managers. If she knows they are going to maximize shareholder value, issuing callable debt will be suboptimal since these managers will incur transaction costs and recapitalize unnecessarily. If she knows they are going to maximize firm value she will be indifferent between issuing non-callable debt and callable debt that managers will not call.

[5] Kraus is quick to point out that, although the call provision is one solution to this holdout or free-rider problem, one can come up with less expensive contracts that solve the same problem. Thus, operating flexibility does not fully explain the extensive use of the call feature.

In this paper we assume the presence of taxes and bankruptcy or agency costs. Managers make a sequence of decisions that affect the firm's capital structure and, consequently, its value. In addition, we assume that managers must call preferred shares or debt whenever they refinance because of a holdup problem (Kraus 1983). These decisions are similar to those made by managers in the Fischer, Heinkel, and Zechner (1989a) model of dynamic capital restructuring, which we now describe.
Note that we have chosen to model the call decision in a discrete time setting while they work in continuous time. In order to compare our model to theirs we must make some assumptions about how to interpret their setup in discrete time. Our description of their managers' decisions should, therefore, not be taken as a literal account of what they describe; we only intend to provide a setting in which we can compare the two models.

Fischer, Heinkel and Zechner assume that a firm's managers run a productive asset in which they can invest each period. The return on this asset is independent of the scale. Each period the asset generates a cash flow which is automatically reinvested into the project; thus, the investment policy is exogenous. Taxes, both personal and corporate, combined with the existence of bankruptcy costs lead to a setting in which capital structure is relevant. Managers are assumed to issue a callable consol bond under the assumption that, because of some market imperfections, they would not be able to refinance without a call provision. Thus, a decision to refinance is automatically a decision to call the existing debt.

Each period the managers make two decisions. The first is whether or not to pay the bondholders their coupon. If they do not, the firm is bankrupt; a deadweight bankruptcy penalty that is proportional to the face value of the bond is levied, bondholders get control of the firm and shareholders get nothing. If they pay the coupon, they can then decide whether or not to refinance. If they refinance, they call the existing bond and issue a new callable consol bond. The firm incurs flotation costs proportional to the face value of this new issue.

The presence of flotation costs leads to an interesting capital restructuring policy. The authors show that at any given point in time the value maximizing managers of the firm may have what appears to be a sub-optimal capital structure.
This is because the optimal policy is to wait until the value-to-debt ratio reaches some maximum level, which they denote $\bar{y}$, before they call and recapitalize. In addition, the managers will wait until the same ratio falls to some lower level, $\underline{y}$, at which point they optimally declare bankruptcy (if $\underline{y} < 1$) or recapitalize (if $\underline{y} = 1$).

Unlike Fischer, Heinkel and Zechner, we assume that the riskless interest rate is stochastic. In this setting there are two reasons that managers might call; one is to refinance because of changes in the present value of tax shields relative to bankruptcy costs and the other is to refinance because of drops in the riskless interest rate. This generalization does not come at zero cost. In order to keep our model computationally tractable we have simplified the stochastic process for the benefits resulting from leverage changes. Whereas they assume that the productive asset has log-normal returns, we introduce a "leverage state variable" which has much simpler dynamics. We mean for this state variable to capture in a reduced form way some of the effects that market imperfections may have on the call decision.

We will briefly summarize the decisions made by our managers to make this point more clear. Like Fischer, Heinkel and Zechner, we assume there are potential gains to recapitalizing at the beginning of each period and that these gains can only be captured by calling a security. Thus, at the beginning of a period managers observe the level of this state variable as well as the current short-term interest rate. They will call a security if the net benefit of the call is positive; this benefit depends on interest rate considerations as well as leverage considerations.
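The two-sided recapitalization policy attributed above to Fischer, Heinkel and Zechner can be written as a simple decision rule. The sketch below is only illustrative: the function name, the action labels and the numerical bounds are hypothetical, and the treatment of the lower boundary (bankruptcy when it lies below 1, recapitalization when it equals 1) follows the reading of their result given in the text.

```python
def band_policy_action(ratio, lower, upper):
    """Action implied by a two-sided band policy on the value-to-debt ratio.

    ratio: current value-to-debt ratio
    lower: lower boundary of the inaction band
    upper: upper boundary of the inaction band
    """
    if not lower < upper:
        raise ValueError("lower bound must be strictly below upper bound")
    if ratio >= upper:
        # Leverage has become too low: call the bond and issue more debt.
        return "call and recapitalize"
    if ratio <= lower:
        # At the lower boundary the firm defaults when that boundary is
        # below 1 and recapitalizes otherwise (assumed reading of the text).
        return "declare bankruptcy" if lower < 1.0 else "recapitalize"
    # Inside the band, flotation costs make it optimal to do nothing.
    return "wait"


# Hypothetical bounds, for illustration only.
for y in (0.7, 1.5, 2.6):
    print(y, band_policy_action(y, lower=0.8, upper=2.5))
```

Inside the band the firm tolerates a capital structure that looks sub-optimal in a static sense, because adjusting it would trigger flotation costs.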
Unlike previous models of the call decision, we are explicitly accounting for the potential of arbitrary changes in the capital structure; that is, we make no assumptions about what security replaces the called security but assume that managers recapitalize optimally. However, as we have already stated, the benefit of recapitalizing is exogenous in our model.

To conclude this section we provide a brief summary of how our approach differs from the existing theoretical literature on the call decision. Several papers have examined the call decision when interest rates are stochastic but the refinancing decision is fixed. Fischer, Heinkel, and Zechner (1989a) have a model of the optimal call decision in which they explicitly account for the effect of the refinancing decision that accompanies a call; however, they do not model the effect of stochastic interest rates. We allow stochastic interest rates in our study of the call decisions made by a firm's managers and try to decide whether or not managers are acting optimally relative to our model. In the dataset we utilize, the effect of stochastic interest rates is likely to be of first-order importance; however, we do not want to ignore the fact that the leverage decision that accompanies a call can also influence a manager's decisions. Therefore, in our model we introduce a variable that is meant to proxy for the effects of time variation in leverage, in addition to modeling the relationship between the interest rate and the call decision.

2.2 Models of the Call Decision: Tests

There is a surprisingly small empirical literature on calls of non-convertible corporate securities. Our survey covers evidence on calls of convertible debt and non-convertible debt, both corporate and government, as well as one paper on mortgage prepayment.

Ingersoll (1977) examined the call decision of firms with outstanding convertible debt.
He showed that firms should optimally force conversion as soon as the conversion value equals the call price. He finds, however, that firms do not call until, on average, the conversion value exceeds the call price by 43.9%. He interpreted this as evidence that managers wait "too long" before calling these instruments. He considers the possibility that transaction costs might explain the large premiums but concludes that unreasonably large flotation costs would be necessary to explain premia of this magnitude.

Asquith (1995) also studies callable, convertible debt. Unlike Ingersoll, he looks at the history of prices and documents the time between the date at which the conversion value first reaches the call price and the date at which the issue is eventually called. He finds that, on average, convertible bonds are called 170.5 days after the conversion value first exceeds the call price. [6] In his sample the average premium of conversion value to call price on the call date is 50.2%. Although this figure is consistent with that of Ingersoll, he finds that when he excludes bonds whose call protection has just expired and those for which there is a cash flow advantage to delaying the call this figure falls to 29%. He points out that this premium isn't excessive in light of the fact that investment bankers recommend that managers wait until the conversion value is at least 20% over the call price before forcing conversion.

[6] He also examines the call decision in subsamples where the firm has differing cash flow advantages arising from the call and finds that firms that can potentially benefit from the cash flow advantage do indeed delay the call. (The firm gains a cash flow advantage by calling when the common stock dividend is less than the after tax interest cost of the convertible bond.) For these firms the average delay is 388.1 days.

Dunn and Eades (1989) examine convertible preferred share issues. They look at the conversion decisions of the individuals who own the preferred shares and find that they do not seem to exercise their conversion option optimally. [7] They argue that shareholder value maximizing managers who rationally anticipate slow conversion should optimally delay the call, thus explaining why we might see convertible instruments trading in excess of their call price.

Firms typically have only one or two issues of convertible securities outstanding at one time. It is, therefore, impossible to obtain statistically meaningful information about call delays for these securities on a firm by firm basis. As a result, these studies are restricted to describing the behavior of the median firm. As we mentioned above, our sample allows us to overcome this restriction since any one firm may have enough callable preferred shares outstanding to allow us to describe the behavior of managers of a given firm. We can, thus, examine cross-sectional differences in call behavior, an issue that studies based on convertible securities are unable to address. In addition, given our structural framework we can quantify the costs of delaying a call and determine what premia we might reasonably expect to see in the data. This allows us to test whether or not ad-hoc specifications of managerial behavior (Asquith 1995) have better descriptive power than the models described in section 2.1.

Vu (1986) examines calls of non-convertible bonds during the period October 1962 through April 1978. He finds that the market value of called bonds is usually below the call price when the intent to exercise is announced. Of 102 firms in his sample only 41 waited until or after the bond's price exceeded the call price. In this subsample of 41 bonds, few bonds ever traded at prices in excess of 102% of the call price.

King (1997) examines a sample of 1642 non-convertible bond calls made during the time period spanning January 1973 to March 1994.
This sample is considerably larger than Vu's and her findings are quite different from his. She finds that "86% of calls are made well after the bond price reaches the call price." In addition, she documents that the bond price exceeds the call price on the call announcement date for 77% of the calls in her sample.

The firms in our sample behave quite differently from the typical firm discussed by Vu and more like those in King's sample. Utility companies generally wait until well after the "optimal" call date before redeeming issues of callable preferred stock. That is, preferred shares often trade at large premiums to their call prices for many months prior to redemption. In addition, no firms in our sample redeemed preferreds when the market value was below the call price.

[7] The authors point out that it may be optimal for preferred shareholders to delay conversion when transaction costs are present.

Longstaff (1992) examines the decisions of the U.S. Treasury with respect to its issues of callable bonds and finds that it made timely bond calls 66.7% of the time. He notes that callable bond prices often exceed their call price of 100 and finds that on the call date the average market price was 100.61. He concludes that "the Treasury has a tendency to wait too long before calling eligible bonds."

Bliss and Ronn (1998) also look at the decisions made by the U.S. Treasury to call (or to not call) outstanding issues of callable bonds. Unlike Longstaff, they use a term structure model to determine the value of the embedded option to call these bonds. In addition, they account for the fact that the embedded options can be exercised only at prespecified dates.[8] The fact that they are modeling callable bond prices allows them to distinguish between pricing errors and irrational prices in the data. They find that most past call decisions have been made optimally, although a small number of them seem irrational.
Our approach is similar to that taken in Bliss and Ronn (1998). We are also careful to model prices and call decisions as a function of the prices of other relevant fixed-income securities. Our model differs from theirs in that we explicitly account for the fact that our decision makers are operating in a corporate environment and, as a result, must pay attention to variables other than the riskless term structure. In particular, we must account for the impact of capital structure changes on shareholder value associated with a call, an issue that Treasury officials do not have to consider.

Stanton (1995) examines the prepayment behavior of mortgage holders in mortgage-backed security pools. He assumes that a one factor CIR model of the term structure holds and that mortgage holders incur deadweight costs when "calling" their mortgage. Estimates from his model indicate that transaction costs in the neighborhood of 40% of the face value of the mortgage are necessary to explain observed prepayment decisions.

[8] This type of call option is referred to as a Bermuda option.

Although our model is similar to the one Stanton develops, there are two important differences. First, the agents in Stanton's model are assumed to behave suboptimally. His mortgage holders act as if they are in one of two regimes, one with a low prepayment rate and the other with a high rate, depending on whether the current interest rate is above or below some critical level; however, optimal behavior in his framework would have mortgage holders prepaying instantaneously when the interest rate falls to the critical rate. The managers in our model are assumed to make optimal exercise decisions but base this decision on relevant, unobservable information. Thus, observed states do not give us enough information to predict exactly when they will call.
The instantaneous exercise probabilities that our model generates do not, therefore, conflict with our maintained assumption that managers act to minimize the value of the outstanding preferred share issue.[9] Second, in this paper we estimate both the transaction cost and CIR parameters simultaneously whereas Stanton relies on out-of-sample estimates of the CIR parameters in forming his estimates for transaction costs.[10] As we will see, transaction cost estimates are very sensitive to the choice of the parameters in the term structure model.

To conclude, economic agents, whether corporate financial managers, U.S. Treasury officials or homeowners, do not appear to exercise options to call outstanding liabilities as predicted by the model of Brennan and Schwartz (1977). Prices of callable securities typically exceed the call price on the call announcement date and for some period of time prior to the announcement. In this thesis, we will attempt to determine whether or not a model which incorporates transaction costs and refinancing motives can explain managerial behavior. We will see that, in a qualitative sense, the prepayment policies generated by our model are close to those analyzed by Stanton.

[10] He uses the estimates of Pearson and Sun (1989).

Chapter 3
The Theoretical Model

In this chapter we describe a model of how managers of a firm might decide when to redeem outstanding issues of callable, non-convertible preferred shares. The model differs from the traditional option pricing approaches because of the assumptions we make regarding the payoff to a firm upon exercise of the callable preferred share's embedded call option. Here the payoff incorporates all costs, both direct and indirect, associated with the refinancing decision that automatically accompanies a call. Although we do not explicitly model the refinancing decision, we assume that when managers call preferred shares they replace them with the "lowest cost" instrument available.
As we will see, the cost of an instrument is broadly defined as the direct and indirect economic impact of replacing a callable preferred share with another financial instrument.

Our treatment of the indirect costs of the refinancing decision associated with a call leads to a model in which we must solve for decisions and prices using non-standard option pricing techniques. Our methodology is based on the techniques of Markov Decision Processes as described, for example, in Puterman (1994). A significant part of this chapter, and especially the appendix accompanying the chapter, is devoted to showing that these techniques apply in our setting.

The layout of the chapter is as follows. In the first section we describe the assumptions underlying the call decision in detail and develop a model that is consistent with these assumptions. In the second section we describe the model's implications in a cross-section of callable preferred share prices, comparing its predictions to those of existing models of the call decision. In the final section, we discuss alternative models of the call decision and show that our model can distinguish between different models of managerial behavior.

3.1 Assumptions and Methodology

In order to understand the model and the motivation behind our assumptions, it is helpful to keep in mind the general characteristics of the data we will utilize. We have chosen to focus exclusively on the cross-sectional restrictions of the model.[1] We will typically work with all preferred shares with a given set of characteristics that are issued by one utility company.[2] For each of these shares, we will observe monthly prices over the course of several years. In addition, we will observe whether or not an issue is called; for shares that are called we observe the call announcement date and the price of all uncalled shares at that date. Table 3.2 summarizes some typical data.
Cross-sectional differences among the call prices and/or dividend rates for these shares will typically lead to cross-sectional differences in the prices and call decisions. Our model will give rise to testable restrictions in this dataset through two types of non-linear functions. The first will describe the relationship between some proxy for the short-term interest rate and a callable preferred share's price; the second will describe the relationship between the same interest rate and the conditional probability of the issue being called.

We begin our description of the model by dealing with the pricing of a firm's non-callable preferred shares. The techniques we employ to price these shares are fairly standard, so we can focus on the basics of our valuation technique before we introduce the call decision and the non-standard assumptions underlying that part of the model.

[1] We are ignoring several other pieces of data we could potentially observe. For example, we will not examine any effects associated with the announcement of a call, such as the reaction of the common stock price. Additionally, we will ignore the time series characteristics of the preferred share prices. Announcement effects have been well documented and any time series predictions would come at the expense of adding more assumptions to our model.

[2] We will describe these characteristics in detail in the next chapter.

Following Duffie and Singleton (1997) we place restrictions on the relationship between the current state and the state price measure. We assume that this relationship stays constant over the time period spanned by our sample and that one factor is sufficient to describe prices of preferred shares of all maturities.[3] We will refer to the state of the system as the risky short-term interest rate.

Consol (or straight) preferred shares pay dividends at regularly spaced intervals in perpetuity.
Denote the current date as $t$ and the next dividend date, which occurs at time $t + \Delta$, as date $t+1$. Let $r_t$ denote the short-term interest rate for the interval $\Delta$.[4] If we know the state price measure at time $t+1$ as a function of $r_t$ we can derive the relationship between the price of a preferred share and the current short-term interest rate. Assume that state prices are given by the following function:

$$\hat{q}(r_{t+1} \mid r_t) = e^{-r_t\Delta}\, q(r_{t+1} \mid r_t) = \begin{cases} e^{-r_t\Delta}\, \dfrac{1}{c r_t^{\gamma}\sqrt{2\pi}} \exp\left\{-0.5\left[\dfrac{r_{t+1} - a - b r_t}{c r_t^{\gamma}}\right]^2\right\} & \text{if } r_{t+1} > 0 \\[2ex] e^{-r_t\Delta} \displaystyle\int_{-\infty}^{0} \dfrac{1}{c r_t^{\gamma}\sqrt{2\pi}} \exp\left\{-0.5\left[\dfrac{x - a - b r_t}{c r_t^{\gamma}}\right]^2\right\} dx & \text{if } r_{t+1} = 0 \end{cases} \tag{3.1}$$

where $a$, $b$, $c$, and $\gamma$ are parameters. Note that we have been careful to distinguish between the state price, denoted $\hat{q}(r_{t+1} \mid r_t)$, and the risk-neutral probability, denoted $q(r_{t+1} \mid r_t)$, since we will refer to both of them later. Our assumption regarding the risk-neutral measure is roughly equivalent to assuming that, under an appropriate change of measure, $r$ has the following dynamics:

$$r_{t+1} = a + b r_t + c r_t^{\gamma} u_{t+1} \tag{3.2}$$

where $u_{t+1}$ is distributed normally with mean zero and variance one. The discrete-time versions of the Vasicek (1977) and Cox, Ingersoll, and Ross (1985) term structure models are special cases of the term structures that these dynamics can generate.[5]

[3] We can easily extend the model to deal with multi-factor models of the risky term structure.

[4] Throughout the paper we will work with $\Delta = 3$ months, as preferred shares pay dividends quarterly.

[5] Backus and Zin (1994) present discrete time versions of common one factor interest rate models. Our approach differs from theirs in that we have been careful to restrict the interest rate process to be non-negative (although possibly zero) under the risk-neutral measure.

In the appendix we show that preferred share prices exist given weak restrictions on the parameters $a$, $b$, $c$ and $\gamma$. In particular, it turns out that the price of a perpetual stream of dividends is given by a bounded function of the short interest rate.
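The risk-neutral transition probabilities implied by the dynamics in (3.2), with negative draws collapsed onto $r_{t+1} = 0$, can be approximated on a grid of interest rates. The following is a minimal sketch, not the numerical procedure used in the thesis: the grid, the nearest-grid-point discretization, and the function names are illustrative assumptions, and the parameter values are the Pearson and Sun (1989) estimates quoted later in this chapter.

```python
import math
import numpy as np

def norm_cdf(x):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def transition_matrix(grid, a, b, c, gamma):
    """Row i approximates q(r' | r = grid[i]) for r' = a + b r + c r^gamma u.

    Assumption (illustrative): probability mass is assigned to the nearest
    grid point, so all mass below the first midpoint, including every
    negative draw, lands on grid[0] = 0. This mimics the atom at zero.
    """
    n = len(grid)
    edges = 0.5 * (grid[:-1] + grid[1:])       # midpoints between grid points
    Q = np.zeros((n, n))
    for i, r in enumerate(grid):
        mean = a + b * r
        sd = c * r ** gamma
        if sd <= 0.0:                           # degenerate row, e.g. r = 0
            Q[i, np.argmin(np.abs(grid - max(mean, 0.0)))] = 1.0
            continue
        cdf = np.array([norm_cdf((e - mean) / sd) for e in edges])
        Q[i, 0] = cdf[0]                        # mass at or near zero
        Q[i, 1:-1] = np.diff(cdf)               # interior cells
        Q[i, -1] = 1.0 - cdf[-1]                # mass above the top of the grid
    return Q

# Hypothetical grid of annualized rates from 0% to 40% in 0.5% steps.
grid = np.linspace(0.0, 0.40, 81)
Q = transition_matrix(grid, a=0.0058, b=0.957, c=0.057, gamma=0.5)
```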
Note that this result is non-trivial, since $r$ can take on the value zero with positive probability.

Let $P(r; \delta)$ denote the relationship between the short-term interest rate and preferred share prices where the preferred share pays $\delta$ each period. There are two ways to characterize this function. One is to add up the prices of all "pure discount bonds" that pay $\delta$ at a future date.[6] The second way is to solve a recursive valuation equation. We will use the second method. Consol preferred share prices must solve the following recursive valuation equation:[7]

$$P(r_t) = e^{-r_t\Delta} \int_{\mathbb{R}_+} \left[\delta + P(r_{t+1})\right] q(r_{t+1} \mid r_t)\, dr_{t+1} \tag{3.3}$$

where $\delta$ is the dividend payable quarterly to preferred shareholders. We will interpret the pricing operator as a linear functional on $\mathbb{R}_+$ and write the above equation as $P = Q(\delta + P)$. The preferred share price can then be represented as the fixed point of a functional equation. This interpretation of equation (3.3) underlies the numerical procedure we will use to calculate non-callable preferred share prices and is the key to understanding our approach to valuing callable preferred shares. In the discussion that follows, callable preferred share prices will be interpreted as the fixed point to a somewhat more complicated functional equation that also involves the linear operator $Q$.

The assumption that one factor is sufficient to describe the evolution of the term structure makes empirical implementation of our model flexible. If we observe the price of any interest rate sensitive instrument we can deduce the level of $r_t$. One obvious candidate for the state variable is the short-term interest rate as measured by the three month T-bill rate. Another is the price of a non-callable preferred share with no maturity date (a consol preferred share).
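To illustrate the fixed-point interpretation of equation (3.3), the sketch below iterates $P \leftarrow Q(\delta + P)$ on a discretized rate grid and then inverts the resulting price function to recover the rate from a consol price. It is a hedged illustration rather than the thesis's numerical procedure: the grid, the discretization in `transition_matrix`, the helper names, and the use of linear interpolation for the inversion are all assumptions; the parameters are the Pearson and Sun (1989) values quoted in this chapter, with a $3 quarterly dividend on a $100 par share.

```python
import math
import numpy as np

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def transition_matrix(grid, a, b, c, gamma):
    """Nearest-grid-point discretization of r' = a + b r + c r^gamma u,
    with negative draws collapsed onto grid[0] = 0 (illustrative assumption)."""
    n = len(grid)
    edges = 0.5 * (grid[:-1] + grid[1:])
    Q = np.zeros((n, n))
    for i, r in enumerate(grid):
        mean, sd = a + b * r, c * r ** gamma
        if sd <= 0.0:
            Q[i, np.argmin(np.abs(grid - max(mean, 0.0)))] = 1.0
            continue
        cdf = np.array([norm_cdf((e - mean) / sd) for e in edges])
        Q[i, 0] = cdf[0]
        Q[i, 1:-1] = np.diff(cdf)
        Q[i, -1] = 1.0 - cdf[-1]
    return Q

def consol_price(grid, Q, delta, dt=0.25, tol=1e-10, max_iter=100_000):
    """Iterate P <- e^{-r dt} * Q (delta + P) until successive iterates agree."""
    disc = np.exp(-grid * dt)                  # one-period discount factors
    P = np.zeros(len(grid))
    for _ in range(max_iter):
        P_new = disc * (Q @ (delta + P))
        if np.max(np.abs(P_new - P)) < tol:
            return P_new
        P = P_new
    raise RuntimeError("fixed-point iteration did not converge")

grid = np.linspace(0.0, 0.40, 81)              # hypothetical rate grid
Q = transition_matrix(grid, a=0.0058, b=0.957, c=0.057, gamma=0.5)
P = consol_price(grid, Q, delta=3.0)

def implied_rate(price):
    """Invert the price function by interpolation.

    Relies on the computed prices being strictly decreasing in the rate.
    """
    return float(np.interp(price, P[::-1], grid[::-1]))
```

Because the discretized equation is linear in $P$, the same prices could also be obtained by solving the linear system $(I - DQ)P = DQ\,\delta$ directly, where $D$ is the diagonal matrix of discount factors; the iteration is shown here because it mirrors the fixed-point argument in the text.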
Given the price of such a share, we can invert equation (3.3) and determine $r_t$; thus, any observation about the relationship of a callable preferred share price to the short-term rate can be restated in terms of the consol preferred share price. When we estimate our model in the next chapter, we use both of these instruments (the short-term rate and the consol rate) as proxies for the interest rate.

[6] Since we are ignoring tax effects at this point we will not distinguish between dividend and interest income. Thus, a pure discount bond pays income at one (and only one) date in the future and the price of this security will be referred to as a pure discount bond price.

[7] This is the analogue of equation (37) in Duffie and Singleton (1997).

Having described the valuation of non-callable preferred shares, we now turn to our model of the call decision. Our assumption about the information structure immediately prior to the call is what distinguishes our model from other option-pricing based models of the call decision. Managers almost certainly base their call decision on several pieces of information, only one of which is the current level of interest rates as reflected by the market value of their preferred shares. We have both theoretical and empirical motivations for this assumption. From an empirical standpoint, without other motivations for a call our model would predict that managers will exercise their option exactly when interest rates hit some critical level. As we mentioned in the introduction, a few casual observations show this prediction is violated; for example, managers sometimes call after an increase in interest rates. From a theoretical standpoint, one might reasonably assume that at each date managers have several options with respect to the instrument that will replace the called preferred shares.
For example, after calling preferred shares managers may issue debt (callable or non-callable, fixed or variable rate), common stock, or other preferred shares. We will not explicitly model the replacement decision.[8] Instead, we assume the called shares are replaced by the instrument management believes is best given the condition of the firm at the call date. As a result, we cannot say exactly what will be the net benefit of a call and it follows that, if we only observe the short-term interest rate, the decision will appear to be noisy.

In the discussion that follows we provide a theoretical justification for adding noise to the call decision. Although this is not critical to the development of our model, a view of the possible structure that underlies the unobserved state variables will give us some guidance when we try to interpret our empirical results in the next chapter.

[8] Data on what instrument replaces the called preferred share may be valuable for determining what motivates the call decision. We have not collected this data but note that it could be valuable in testing the specification of our model.

Although managers can refinance any given issue of preferred shares at any instant, we assume they make their decision to call at discrete intervals. It is, therefore, necessary to impose the following structure on observable events at any decision date, $t$:

• dividends are paid,
• the prices of several preferred shares are observed, and
• managers decide whether or not to call an issue.

When managers call a preferred share issue, current shareholders are paid the call price of their shares and a new security may be issued at some cost. At the conclusion of this sequence of events we wait another three months, over which time the prices of the preferred shares change, and the cycle repeats.
The fact that managers may choose to alter the firm's capital structure when they call means both direct and indirect costs will be associated with the call decision; a refinancing decision may change the expected present value of bankruptcy costs, tax shields, agency costs and transaction costs. We will assume that, in addition to the short-term interest rate, managers observe one other state variable that summarizes all the economic consequences of a refinancing before making the call decision. To make this point clear, we will now give a detailed description of the impact of a call decision on a typical firm.

Assume, for now, that managers of a firm call preferred shares so as to maximize shareholder value; obviously, we need a model of firm value and of the firm's common stock. Separate the cash flows of the firm into two parts; $U_t$ will represent the present value of the firm's cash flows assuming the capital structure remains unchanged and $\sum_{j=1}^{N} V_{jt}$ will represent the present value of current and future cash flows associated with the firm's $N$ preferred share and/or debt liabilities, including the value to the firm of the refinancing option associated with their call provisions. The exercise price of each of these options is denoted by $K_j$. It is important to realize that $V_{jt}$ represents not only the market value of the cash flows received by the holders of liability $j$, in the form of coupons or dividends and principal repayments, but also the potential net benefits to the firm of refinancing in the future. In order to distinguish between these two values, we let $P_{jt}$ be the market value of security $j$ and let $p_{jt} = P_{jt} - V_{jt}$ be the value common shareholders place on the option to refinance that liability. This option is valuable since, as in Fischer, Heinkel, and Zechner (1989a), we assume that managers must call an outstanding share if they are to refinance.

We will now be more explicit about the components of $U_t$.
In the spirit of Modigliani and Miller (1958), denote the value of the firm's real assets and future investment opportunities as $A_t$ and consider the impact that debt has on firm value. Firstly, if the possibility of bankruptcy alters managers' investment decisions, $A_t$ may depend on the firm's leverage ratio. Secondly, assuming that interest payments on debt are tax deductible we must keep track of the tax shields associated with debt, $s_t$. Lastly, assume that debt has direct and indirect bankruptcy costs, $b_t$. Using this partition of firm value, write the value of common shares, $S_t$, as follows:

$$S_t = U_t - \sum_{j=1}^{N} V_{jt} = A_t + s_t - b_t - \sum_{j=1}^{N} V_{jt}.$$

If managers refinance, each of the components of shareholder value may change. For example, consider the effect of calling security $j$. If the security is not called, shareholder value is given by:

$$S_t = A_t + s_t - b_t - \sum_{i \neq j} V_{it} - P_{jt} + p_{jt},$$

where $P_{jt}$ and $p_{jt}$ represent the values of security $j$ and the option to retire that security respectively. If the security is called, shareholder value is given by:

$$S'_t = A'_t + s'_t - b'_t - \sum_{i \neq j} V'_{it} - (1 + \tau)K_j + p_{nt}.$$

We have split the impact of the call on shareholder value into two pieces. First are the direct effects of the call. A liability with a market value of $P_{jt}$ is replaced by one with a market value of $K_j$. If the new instrument has a call option, shareholder value increases by the value of the new refinancing option $p_{nt}$; however, the old refinancing option is eliminated (i.e., $p_{jt}$ is lost). Shareholders also incur the cost of issuing the new instrument, $\tau K_j$, which we assume is proportional to the market value of the new instrument. Second are the indirect effects of the refinancing decision. The call will typically have an impact on any or all of $s_t$, $b_t$ and $V_{it}$.

Ideally, we would model the impact of the call on each component of shareholder value. This would be possible if there were a small number of observable state variables to which all these components were related.
One obvious candidate, besides the risky short rate, is the current cash flow of the firm. However, even if cash flow were observable and it were computationally feasible to develop a model of shareholder value as a function of interest rates and cash flow, we would still be missing key components of shareholder value. For example, to predict the impact of increasing leverage on $A_t$, which includes the value of future investment opportunities, we would need to know something about the characteristics of the alternative projects in which the firm can currently invest.[9] Other examples are easy to come up with in this setting. In order to determine the impact of an increase in leverage on the value of a firm's tax shields, we would need to know something about the probability with which the firm will generate future taxable profits that debt tax shields can potentially offset. The point is that at some level our model will not capture the reaction of shareholder value to a call.

[9] The shareholder value of firms that have a large number of riskless projects available will react very differently to increased leverage than will the shareholder value of firms which can potentially invest in very risky projects (Parrino and Weisbach 1997).

Our approach, which is based on the approach taken by Rust (1987), is to explicitly model the relationship between one observable state variable, the risky interest rate, and the call decision, while at the same time acknowledging the fact that we cannot capture the entire impact of the decision. Let $\epsilon_{jt}$ denote the net indirect effect of calling security $j$ at time $t$. More precisely:

$$\epsilon_{jt} = A'_t + s'_t - b'_t - \sum_{i \neq j} V'_{it} + p_{nt} - \Big(A_t + s_t - b_t - \sum_{i \neq j} V_{it}\Big).$$

To shareholders, the "NPV" of the call decision is given by:

$$S'_t - S_t = \epsilon_{jt} + P_{jt} - (1 + \tau)K_j - p_{jt} = \epsilon_{jt} + V_{jt} - (1 + \tau)K_j. \tag{3.4}$$

In words, managers who maximize shareholder value will call if the net impact of the call on shareholder value, $S'_t - S_t$, which includes the direct effects as well as the indirect effects as summarized by $\epsilon_{jt}$, is positive. However, in order to quantify the difference we must first know the values of $\epsilon_{jt}$ and $V_{jt}$.

Our key assumption is that the indirect effects of refinancing are short lived and unpredictable. That is, we assume that the dynamics of $\epsilon_{jt}$ are sufficiently complicated that neither managers nor investors can forecast its evolution. In addition, we assume that $\epsilon_{jt}$ tells us nothing about the current state price measure, given by equation (3.1). Although we could let $\epsilon_{jt}$ be correlated with the current risky rate, we will assume that it is independent of $r_t$. Thus, just after the call decision has been made, $r_t$ is sufficient to describe the current state and just before the call decision, investors will behave as though they are risk neutral with respect to the call decision (i.e., the call decision will be uncorrelated with the state price).

If everything is valued fairly it is straightforward to show that shareholder value is maximized when the value of the liabilities is minimized. To be explicit, if $U_t = e^{-r_t\Delta}E^Q(C_{t+1} + U_{t+1})$, where $C_t$ represents the net cash flow given the current capital structure, and $\hat{V}_t = e^{-r_t\Delta}E^Q(\delta + V_{t+1})$ represents the value of an uncalled liability, then we write the optimization problem that managers solve as:

$$\begin{aligned} S_t &= \max\{U_t + \epsilon_t - (1 + \tau)K,\; e^{-r_t\Delta}E^Q(C_{t+1} - \delta + U_{t+1} - V_{t+1} \mid r_t)\} \\ &= \max\{U_t + \epsilon_t - (1 + \tau)K,\; U_t + e^{-r_t\Delta}E^Q(-\delta - V_{t+1} \mid r_t)\}. \end{aligned}$$

Given that $S_t = U_t - V_t$, the following Bellman equation characterizes the value of a given liability:

$$V(r_t, \epsilon_t) = \min\{(1 + \tau)K - \epsilon_t,\; e^{-r_t\Delta}E^Q[V(r_{t+1}, \epsilon_{t+1}) + \delta \mid r_t]\}. \tag{3.5}$$

We reemphasize that $V$ represents the value to a firm of a callable liability including all benefits, direct and indirect, associated with the embedded option to refinance.
The Bellman equation thus has managers calling when the indirect effects of the call, net of the cost of the call, exceed the value of waiting.

Equation (3.5) completely characterizes the call decision as a function of the state variables. As we mentioned previously, we can interpret $V$ as the fixed point to a functional equation. In fact, there are several relevant fixed point equations associated with equation (3.5). The first characterizes the value of an uncalled preferred share liability as represented by $e^{-r_t\Delta}E^Q[\delta + V(r_{t+1}, \epsilon_{t+1}) \mid r_t]$. This expectation depends only on $r_t$ and is, therefore, considerably easier to calculate than $V$; we will make extensive use of the function that is related to this expectation when we calculate preferred share prices. The second characterizes the market price of a preferred share given the optimal call policy.

3.2 The Cross Sectional Relationships between Prices

In order to facilitate understanding of the functional relationships introduced in the previous section, and to highlight the empirical implications of our model, we will now compare our model to the more traditional models of the call decision that our framework nests. We will start with a model in which $\tau$ and $\epsilon$ in equation (3.5) are zero and then, by adding each parameter to the model one at a time, proceed to build our model up from the models of the call decision that have preceded this one.

We begin by examining the relationship between the short-term interest rate and non-callable preferred shares, as given by the fixed point to equation (3.3).[10] Figure 3.1 plots the price of a non-callable preferred share against the short-term risky interest rate.[11] Although it is hard to tell from the graph, this relationship is not linear. Notice, however, that due to our choice of parameters for the pricing kernel given in equation (3.1), this particular function is bounded and invertible.
Although this will not be the case in general, we will impose parameter restrictions that ensure this price function is invertible throughout the paper. Therefore, as mentioned above, our predictions about the relationship between the short-term rate and both the call decisions made by a firm's managers and the market value of callable preferred shares can be restated in terms of their relationship to non-callable preferred shares. This is important, since we can observe prices of the latter, whereas the short-term risky rate for securities which pay dividends is unobservable.

[10] We will discuss our numerical procedure in the next chapter.

[11] Throughout this section we use the CIR parameter estimates from Pearson and Sun (1989) for the period 1979-1986: $r_{t+1} = 0.0058 + 0.957 r_t + 0.057 r_t^{0.5} u_{t+1}$. We assume this share pays dividends of 12% per year and has a par value of $100.

Timing Relative to Call Decision    Value to Common Shareholders    Market Price
Immediately Before                  $V$                             $P$
Immediately After                   $\hat{V}$                       $\hat{P}$

Table 3.1: Notation associated with the value of the callable preferred share liability and the market price of callable preferred shares.

We now turn our attention to the pricing of callable preferred shares. There are four different functions associated with each preferred share liability. Table 3.1 summarizes the notation for these functions. We will be considering the effect of a call on both common and preferred shareholders; since the cash flows accruing to these parties will typically be different, we distinguish between the value of the preferred share liability to common shareholders and the value of the liability to preferred shareholders, denoted $V$ and $P$ respectively. It is also important to keep track of the value of these liabilities relative to the instant at which the call decision is made; the indirect impact of the call, $\epsilon$, is relevant before the call decision is made but not after.
We use a caret to identify functions that describe the relationship between the state variable and the value of the liability to a given party immediately after the call decision has been made and the security has not been called.

If the call decision has no impact on other aspects of firm value (i.e., if $\epsilon_{jt} = 0$) and if there are no transaction costs, our model collapses to the traditional option pricing model of callable preferred shares (Brennan and Schwartz 1977). In this case, the notation in Table 3.1 is redundant. The cash flows paid out by common shareholders exactly equal the cash flows received by preferred shareholders, so $V = P$ and $\hat{V} = \hat{P}$. Furthermore, the call decision can be perfectly anticipated by preferred shareholders, so we would never see preferred shares changing in price as a result of a call announcement. All of $V$, $\hat{V}$, $P$ and $\hat{P}$ are thus given by the fixed point to equation (3.5) with $\epsilon = 0$ and $\tau = 0$:

$$V(r_t) = \min\{K,\; e^{-r_t\Delta}E^Q[\delta + V(r_{t+1}) \mid r_t]\}.$$

The first panel in Figure 3.2 shows the relationship between the price of these shares and the risky short rate. The second panel shows the relationship between the callable share price and the price of non-callable preferred shares. Note that this model predicts that managers will call outstanding preferreds when the short rate falls below some critical level which we denote $r^*(K)$. Equivalently, managers call the shares as soon as the market price reaches the exercise price.[12]

If $\epsilon_t = 0$ and the call decision has accompanying transaction costs, we have the model of callable preferred shares described by Kraus (1983) and Dunn and Spatt (1986). Managers call the shares as if they had a strike price equal to $(1 + \tau)K$ and, if we could observe the value of this claim, we would see that they call as soon as the price of their claim rises to $(1 + \tau)K$.
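The two nested models just described (no noise, with and without transaction costs) can be sketched numerically as value iteration on $V = \min\{(1+\tau)K,\; e^{-r\Delta}E^Q[\delta + V]\}$. The code below is an illustration under stated assumptions, not the thesis's implementation: the rate grid, the discretization in `transition_matrix`, the helper names, and the transaction cost of 5% are hypothetical, while the interest rate parameters, the $3 quarterly dividend and the $100 call price follow the values quoted in this chapter.

```python
import math
import numpy as np

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def transition_matrix(grid, a, b, c, gamma):
    """Nearest-grid-point discretization of r' = a + b r + c r^gamma u,
    with negative draws collapsed onto grid[0] = 0 (illustrative assumption)."""
    n = len(grid)
    edges = 0.5 * (grid[:-1] + grid[1:])
    Q = np.zeros((n, n))
    for i, r in enumerate(grid):
        mean, sd = a + b * r, c * r ** gamma
        if sd <= 0.0:
            Q[i, np.argmin(np.abs(grid - max(mean, 0.0)))] = 1.0
            continue
        cdf = np.array([norm_cdf((e - mean) / sd) for e in edges])
        Q[i, 0] = cdf[0]
        Q[i, 1:-1] = np.diff(cdf)
        Q[i, -1] = 1.0 - cdf[-1]
    return Q

def callable_value(grid, Q, delta, K, tau, dt=0.25, tol=1e-10, max_iter=100_000):
    """Value iteration for V = min{(1 + tau) K, e^{-r dt} E[delta + V]}.

    Returns the liability value V and the continuation value of not calling.
    """
    disc = np.exp(-grid * dt)
    strike = (1.0 + tau) * K                   # effective exercise price
    V = np.zeros(len(grid))
    for _ in range(max_iter):
        cont = disc * (Q @ (delta + V))        # value of waiting one more period
        V_new = np.minimum(strike, cont)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, cont
        V = V_new
    raise RuntimeError("value iteration did not converge")

def critical_rate(grid, cont, strike):
    """Largest grid rate at which calling is (weakly) optimal, or None."""
    called = cont >= strike
    return float(grid[called].max()) if called.any() else None

grid = np.linspace(0.0, 0.40, 81)              # hypothetical rate grid
Q = transition_matrix(grid, a=0.0058, b=0.957, c=0.057, gamma=0.5)

V0, cont0 = callable_value(grid, Q, delta=3.0, K=100.0, tau=0.0)
V5, cont5 = callable_value(grid, Q, delta=3.0, K=100.0, tau=0.05)
r_star_0 = critical_rate(grid, cont0, 100.0)   # plays the role of r*(K)
r_star_5 = critical_rate(grid, cont5, 105.0)   # plays the role of r*((1 + tau) K)
```

In this sketch a positive transaction cost raises the effective strike, so the critical rate with costs can be no higher than the critical rate without them; managers wait for rates to fall further before calling.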
The value of the preferred share liability immediately before the call decision is given by the fixed point to the following equation:

$V(r_t) = \min\{(1+\tau)K,\; e^{-r_t\Delta} E^Q[\delta + V(r_{t+1}) \mid r_t]\}$.

The call policy that solves this equation will have the same form as in the model with no transaction costs; managers will call the issue when the short-term interest rate falls to some critical level $r^*((1+\tau)K)$. We observe the market price of callable preferred shares as determined by rational investors who we assume can observe the level of transaction costs and, therefore, anticipate the call policy. In order to derive the market value of the callable preferreds we simply calculate the expectation of the future payments accruing to preferred shareholders under the risk-neutral measure. Formally, callable preferred share prices will be given by: $P(r_t) = e^{-r_t\Delta} E^Q[\, I_{r_{t+1}} \ldots + P(r_{t+1}) \,]$ where $I_{r_{t+1}}$ [...]

[12] Note that the callable preferred share price function is not differentiable at the critical interest rate. This is a result of the fact that we are working in a discrete time setting. The continuous time version of the model has a "smooth pasting condition" that ensures the callable preferred share price function is differentiable at the critical interest rate.

[...] prices will be given by the fixed point to the following equation:

$P(r_t) = p(r_t)K + [1 - p(r_t)]\hat P(r_t)$    (3.6)
$\phantom{P(r_t)} = p(r_t)K + [1 - p(r_t)]\, e^{-r_t\Delta} E^Q[\delta + P(r_{t+1}) \mid r_t]$

We will be more explicit about how investors calculate the call probabilities in the next section. Intuitively, investors in preferred shares may be surprised by a call, since they can only guess at the expected value of $\epsilon$ conditional on the current interest rate. Note that if $P$ is above (below) the strike price, a call will cause the price to fall (rise); thus, unlike the other models described above, this model predicts that a price change may accompany the call.
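Once the call probabilities $p(r_t)$ are known, equation (3.6) is linear in $P$ and can be solved directly on a discretized grid. The sketch below illustrates this under stated assumptions: the call probability function used here is a hypothetical logistic curve in the interest rate chosen only for illustration (it is not the model-implied probability), and the grid, the quarterly step and the function names are likewise assumptions.

```python
import math
import numpy as np

def build_pricing_matrix(edges, a=0.0058, b=0.957, c=0.057, gamma=0.5, dt=0.25):
    """Discretized state prices for r' ~ Normal(a + b r, (c r^gamma)^2)."""
    mids = 0.5 * (edges[:-1] + edges[1:])
    mean = a + b * mids
    sd = c * mids ** gamma
    z = (edges[None, :] - mean[:, None]) / sd[:, None]
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    probs = np.diff(cdf, axis=1)
    probs /= probs.sum(axis=1, keepdims=True)  # assumption: renormalize mass cut off by the grid
    return np.exp(-mids * dt)[:, None] * probs, mids

def market_price(Q, p, delta, K):
    """Solve P = p K + (1 - p) Q (delta + P), i.e. equation (3.6), as a linear system."""
    keep = np.diag(1.0 - p) @ Q                # state prices weighted by the no-call probability
    rhs = p * K + delta * keep.sum(axis=1)
    P = np.linalg.solve(np.eye(len(p)) - keep, rhs)
    P_hat = Q @ (delta + P)                    # price just after a decision not to call
    return P, P_hat

edges = np.linspace(0.001, 0.40, 201)          # hypothetical rate grid
Q, rates = build_pricing_matrix(edges)
p = 1.0 / (1.0 + np.exp((rates - 0.08) / 0.01))  # hypothetical call probabilities
P, P_hat = market_price(Q, p, delta=3.0, K=100.0)
change_if_called = 100.0 - P                   # price change when a call is announced
change_if_not_called = P_hat - P               # price change when the issue is left outstanding
```

Because $P$ is a probability-weighted average of $K$ and $\hat P$, the two price changes always have opposite signs in this sketch, which is the sense in which a call (or the absence of one) can surprise investors.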
Figure 3.5 shows the relationship between the short-term interest rate and the market price of preferred shares, both before and after the call/don't call decision has been made.

An interesting feature of the model that includes the indirect effects of the call is that, even when transaction costs are zero, the potential of refinancing at better terms causes managers to delay the call relative to the case with no indirect refinancing costs. This behavior is a direct consequence of the fact that adding more "noise" to the option payoff results in higher option prices. Therefore, interest rates have to fall to lower levels before the direct net benefits of refinancing will be large enough to overcome the value of the refinancing option. This gives rise to situations in which callable preferred share prices can exceed the call price even in the absence of transaction costs. We can, therefore, consider this explanation of the observed behavior of callable fixed income security prices as an alternative to the explanation based on transaction costs (Kraus 1983). In Figure 3.6 we show the relationship between the short-term interest rate and callable preferred share prices when transaction costs are zero given different levels for the variance of $\epsilon$. As we would expect, the higher the variance, the higher the preferred share price. Note that there is a region over which the preferred share prices exceed the strike price of $100.

We conclude the section with a brief description of the relationship between the prices of preferred shares with different dividend rates conditional on some level of the short-term interest rate.[14] Figure 3.7 shows that the prices of preferred shares are not strictly increasing in the dividend rate.[15] As described above, the existence of transaction costs and the indirect effects of the refinancing decision lead to callable preferred share prices that may exceed the call price.
In our model, the probability of a call increases as the dividend rate increases since the likelihood that $\epsilon$ will be in the call region increases. Thus, as the dividend rate increases, the preferred share prices approach the call price.

The fact that our model makes predictions both about the relationship between preferred shares with different dividend rates as well as about the relationship between different levels of the short-term interest rate and a specific preferred share's price means that we can use a panel of preferred shares to construct tests of alternative theories of managerial behavior. We will discuss this point in more detail in the next section and will describe our empirical methodology in detail in the next chapter.

[14] This is the relationship that is focused on by Dunn and Spatt (1986).
[15] Figure 3.7 describes the relationship among pre-call decision prices ($P$). A similar relationship exists among dividend rates and the post-call decision prices $\hat P$.

3.3 Alternative Models of the Call Decision

It is fair to ask why we have developed such a complex model of the call decision. The answer is that we would like to learn something about how managers behave. In this section we will discuss two alternative models of managerial behavior that are consistent with the following stylized facts from the data:

• Callable preferred share prices can decrease as the price of non-callable preferred shares increases;
• Callable preferred shares often trade at prices in excess of their call price and the shares can trade at premiums to their call price for several months before they are called;
• Shares with higher exercise prices are called "later" than shares with lower exercise prices; that is, if at some point in time two callable preferred shares with different exercise prices exist, the share with the higher strike price will exist for longer, on average, than the share with the lower strike price.
A study that goes no further than documenting these facts can only tell us whether these models have a chance at explaining observed behavior. In order to distinguish between them, we must look to the functional form of the predictions. If we can reject a subset of the many plausible models of how managers behave, we have learned something about their behavior. This insight into their actions may also help us to better understand other decisions these economic agents may make.

The purpose of this section is not to present a formal discussion of the type II error of the model (i.e., the power of the model), but to show how datasets such as those used in this paper can allow us to distinguish among competing models of managerial behavior. As such, our focus will be on the intuition about why our methodology can distinguish between such models.

As we have mentioned previously, the data consist of prices of several callable and non-callable preferred shares, observed at several dates, as well as the observations of the dates when the option to call the shares was exercised. Our dataset is thus a panel of callable preferred share prices and call decisions. As it turns out, we require such a panel in order to distinguish between the model we described above and the first of the alternative models of the call decision, which we introduce now.

The relationship between callable preferred share prices and the state variable exhibited in Figure 3.5 does not explicitly depend on the dynamic nature of the problem. In fact, similar preferred share price behavior can be achieved using a much simpler static model of manager behavior. Assume managers exercise their call option as soon as the present value of the perpetual dividend exceeds the call price plus transaction costs. We write this objective function as follows:

$P_c(r_t) = \min\{P(r_t),\; (1+\tau_s)K\}$    (3.7)
where $P(r_t)$ represents the current price of a non-callable preferred share with dividend rate $\delta$ and $P_c(r_t)$ represents the price of a callable preferred share when the managers' call policy is myopic.[16] Note that this is not a Bellman equation. In words, the equation states that managers should call an outstanding issue when the present value of the dividends at the current yield, $P(r_t, \delta)$, exceeds the strike price plus transaction costs. There is no recognition of the fact that the decision can be delayed in this statement of the problem.

This objective differs from that described in equation (3.5) where managers are assumed to incorporate the value of the option to call into their decision. This is easier to see if we write the objectives in terms of the value of the embedded call option, $C(r_t)$, using the fact that the value of the call option is given by the value of the straight preferred share less the value of the callable share, $C(r_t) = P(r_t) - P_c(r_t)$:

$C(r_t) = \max\{P(r_t) - (1+\tau_d)K,\; C_u(r_t)\}$    (3.8)

where $C_u$ is the function describing the value of the unexercised option to call the preferreds. Managers with objectives given by equation (3.7) call when $P(r_t) - (1+\tau_s)K > 0$, whereas managers with objectives described by equation (3.8) call when $P(r_t) - (1+\tau_d)K - C_u(r_t) > 0$ (i.e., in the static model, managers ignore the value of their option to call the issue in the future). Let $r^*$ be the interest rate at which managers obeying (3.8) optimally call:[17] we can mimic this call decision with a static model where $\tau_s = \tau_d + C_u(r^*)/K$, since this choice makes the two call conditions coincide at $r^*$. Furthermore, preferred share prices from both models will be identical since in both models preferred shareholders receive the same cash flows. We can summarize this argument as follows:

Proposition 1 Suppose a single factor model describes the dynamics of riskless interest rates.
Given any set of parameter values for the dynamic model described by equation (3.8), there exists a transaction cost parameter, $\tau_s$, such that the static model of call behavior yields identical call decisions. In addition, the preferred share prices implied by both models will be identical.

[16] We have suppressed the dependence on $\delta$ from the notation.
[17] I.e., $r^*$ solves $P(r^*) - (1+\tau_d)K - C_u(r^*) = 0$.

One implication of Proposition 1 is that, in a sample consisting of one non-callable preferred share and a callable preferred share, our model of the call decision will be indistinguishable from a static model of the call decision. This conclusion is not true if we have access to a panel of preferred share prices.[18] To see why this is the case, assume we have arrived at accurate estimates of the parameters in the state price measure using the time series information in a sample of prices from one callable and one non-callable preferred share. According to Proposition 1, there will be two estimates of the transaction cost parameter, one from the static model ($\tau_s$) and one from the dynamic model ($\tau_d$), that are consistent with the observed prices and the call decision. However, these two models will make very different predictions for shares with different dividends or strike prices. One observation, based on examination of the relevant inequalities above, is that the relationship between the critical non-callable preferred share price (corresponding to the critical interest rate, $r^*$) and the strike price per dollar of quarterly dividend is linear for the static model and distinctly non-linear for the dynamic model. Another observation is described in Figure 3.8 where we see the relationships among prices of callable preferred shares paying different quarterly dividends when the short-term interest rate is, for example, 8.5%.
Given any set of parameters describing the state price measure one can see that the two models make very different predictions in any given cross-section.

In practice, we will not compare competing models of the call decision using specification tests based upon observations described in the previous paragraph. (The next chapter gives a detailed description of our empirical methodology.) However, the discussion above highlights why our model has power against some alternatives. We will require that one set of parameters describe the relationship between both the prices as well as the call decisions associated with several callable preferred shares and some proxy for the observable state variable. It is the fact that predictions about these relationships can differ significantly in a panel of data that allows us to reject one model in favor of another.

[18] It is also not technically true if we are comparing models in which there is an unobservable state variable. In the next chapter we will show that, because of the non-linearity introduced by the additional state variable, the static and dynamic models do make slightly different predictions about the relationship between the state variable and the call probabilities. We suspect, however, that the difference between the static and dynamic call probability functions will not be identifiable in reasonably sized samples.

We will discuss one more ad-hoc description of managerial behavior that is consistent with the qualitative nature of the data. This model is based on an explanation of managers' behavior with respect to the decision to call convertible debt as described in Asquith (1995). He suggests that, based on recommendations from investment bankers, managers wait until convertible debt prices exceed the call price by some fixed percentage amount before they call the debt.
In the spirit of this model, we consider policies that have managers calling preferred shares three months after the market value of the uncalled share exceeds the strike price by some fixed percentage amount, for example two percent.

To determine the interest rate at which a call will occur, assume investors rationally anticipate the call. Managers will not anticipate a call if the discounted value of a dividend payment plus the strike price is less than 102% of the strike price, since a call with three months lead time will have shares trading at less than 102% of the exercise price. To be more precise, let $B(r_t)$ represent the price of a one dollar dividend three months from now. In equilibrium, managers will call when $B(r_t)(K + \delta) > 1.02K$ and uncalled preferred shares will trade at prices up to two percent in excess of the strike price. This rule of thumb implies that the critical interest rate, $r^*$, solves $B(r^*) = \frac{1.02K}{K+\delta}$. Again, we can distinguish between our model and this one by using information from a panel of callable preferred shares, since this model makes different predictions about the critical interest rate as a function of the call price than does our dynamic model. This difference in critical interest rates will manifest itself in different predictions about the cross-sectional relationship between prices of preferred shares with different coupons and/or strike prices.

We finish the chapter with a brief discussion of how changing the manager's objective function will impact our interpretation of the empirical results. Fischer, Heinkel, and Zechner (1989b) show that shareholder value maximizing managers recapitalize too often relative to firm value maximizing managers. With our current setup we cannot, unfortunately, replicate this result; one must be more explicit about the structure of the unobserved state than we have been. However, we can relax a few assumptions underlying the model and at least speculate
about the sensitivity of our results to changes in the objective function. In particular, we will examine how managers that maximize a weighted average of shareholder value and firm value may behave when the call has associated transaction costs. To this end, we will assume that there are no indirect cash flows associated with the call decision.

Suppose that the value of all future cash flows produced by the firm, $A$, will be split among three claimants: the common stockholders, the preferred shareholders and the outside party who receives the transaction cost incurred when preferred shares are called (e.g., investment bankers). Let $C$ represent the current market value of the option to call the preferred shares. When the call is exercised, common shareholders receive preferred shares valued at $V$ for the price $K$. In addition, upon redemption outside parties receive the transaction costs. Let the current market value of this exercise-contingent future payment, $\tau K$, be denoted by $T$. We can write an expression for the value of the firm to "inside" claimants (i.e., common and preferred shareholders) and for the value of equity.[19] The firm's managers are assumed to maximize a weighted average of these two values so we write their objective function as follows:

$U = \alpha(A - T) + (1-\alpha)(A - T - V + C)$    (3.9)

where $\alpha \in [0,1]$. The first term in parentheses, $A - T$, is the value of the firm to common and preferred shareholders. The term $(A - T - V + C)$ is the value of the firm to common shareholders. Thus, $\alpha$ is the weight managers place on firm value.[20] The exercise decision has no effect on either $A$ or $V$, so $U$ is maximized if and only if the following function is maximized:

$\tilde U = C - \frac{T}{1-\alpha}$    (3.10)

[19] We will suppress the time subscripts in order to keep the notation simple.
[20] It is possible to write equation (3.9) as the weighted average of common shareholder value, $V_c$, and preferred shareholder value, $V_p$, as follows: $U = \frac{1}{1+\alpha}V_c + \frac{\alpha}{1+\alpha}V_p$. However, in this formulation we would need to artificially restrict the weight managers place on common shareholder value to lie in the interval $[.5, 1]$ in order to be consistent with the objective (3.9).

We see that the more managers care about firm value maximization relative to common shareholder value maximization (i.e., the larger is $\alpha$), the greater their disutility from transferring wealth out of the firm. Thus, this specification of the objective function allows us to have managers who face identical transaction costs place different weights on the synthetic security whose value is given by $T$ and, therefore, make different exercise decisions. Under this interpretation of the model, the more weight managers place on maximizing firm value, the larger will be our estimate of implied transaction costs as measured by $\frac{\tau}{1-\alpha}$. Conversely, if we find that the transaction costs implicit in preferred share prices turn out to be reasonable (on the order of 2-5%) we have evidence that managers behave as if they maximize shareholder value (i.e., $\alpha = 0$).

Table 3.2: Sample Dataset. The price and call decision data of three Pacific Gas and Electric Company preferred share issues with different dividends and call prices are presented here. This subset of our data was chosen to illustrate the characteristics of a typical set of shares available for our analysis. At each date we observe the three month t-bill rate, the price of a well out-of-the-money preferred share and the prices of two uncalled issues. In addition, we indicate dates on which these issues were called by placing a 1 in the decision column.

                        Dividend Rate / Strike Price
            3-mo.     4.36%    10.28%              10.46%
            t-bill    25.75    28.5                27.75
Date        Rate      Price    Price    Decision   Price    Decision
30/04/91    5.616     12.81    28.38    0          28.13    0
31/05/91    5.636     13.00    28.50    0          28.25    0
28/06/91    5.646     12.88    28.25    0          28.13    0
31/07/91    5.637     13.25    28.63    0                   1
30/08/91    5.430     14.00    29.00    0
30/09/91    5.203     14.25    29.50    0
31/10/91    4.907     13.63    28.75    0
29/11/91    4.424     13.88    28.88    0
31/12/91    3.923     14.38    28.75    0
31/01/92    3.902     15.00    29.13    0
28/02/92    3.994     14.88    29.00    0
31/03/92    4.118     14.38    29.00    0
30/04/92    3.759     14.31    28.75    0
29/05/92    3.759     14.38    28.63    0
30/06/92    3.626     14.75    29.13    0
31/07/92    3.227     15.13             1

Figure 3.1: The relationship between the short-term interest rate and the value of a 12% non-callable preferred stock. This line, which looks deceptively straight, represents the solution to the fixed point equation (3.3).

[Figure 3.2 plot. Top panel horizontal axis: Short-term Interest Rate. Bottom panel horizontal axis: 4% Preferred Share Price.]

Figure 3.2: Callable preferred share price vs. state variable. The top panel shows the relationship between the short-term interest rate and the value of a 12% callable preferred share as given by the fixed point to equation (3.5) with $\tau$ and $\epsilon$ set equal to zero. The bottom panel shows the same callable preferred share price, this time as a function of the price of a 4% non-callable preferred share. In order to construct the bottom panel, the interest rate was first mapped into the price of a non-callable 4% preferred share, as in the previous figure, and then at each interest rate a matching callable preferred share price was calculated. These pairs are plotted on the bottom graph.

[Figure 3.3 plot. Horizontal axis: Short-term Interest Rate.]

Figure 3.3: The relationship between interest rates and the value of callable preferred stock to common and preferred shareholders.
The top solid line represents the value of the callable preferred share liability to common shareholders and the bottom dashed line shows the preferred share price as a function of the short-term interest rate. The difference between the two curves is the market value of a synthetic security that pays the transaction costs when the issue is called. The "hump" arises because managers exercise the call option on the preferreds as if it had a strike price of $110 but the preferred shareholders receive only $100 when their shares are called.

[Figure 3.4 plot. Horizontal axis: Short-term Interest Rate.]

Figure 3.4: The region that defines the optimal exercise policy. When $\epsilon_t$ and $r_t$ are within the shaded region, managers will call the preferred share. This figure is a graphical representation of the optimal policy for a particular parameterization of the problem described by the Bellman equation (3.5).

Figure 3.5: The relationship between the short-term interest rate and the market price of a callable preferred share. The topmost curve represents the price of an uncalled preferred share, $\hat P$, whereas the bottom dashed curve represents the price of the same share just prior to the instant at which the call decision is made, $P$. (See Table 3.1 for a description of the notation.) In order to create this figure, the optimal call policy was determined by solving the Bellman equation (3.5), the call probabilities were calculated as a function of the observable state variable and the fixed points associated with equation (3.6) were calculated.

Figure 3.6: The relationship between the short-term interest rate and the market price of a callable preferred share for different variances of the indirect effects of the call. This graph shows the effect of changing the variance of the unobserved state variable $\epsilon$ in equation (3.5).
Note that the more noise in this state variable, the more the callable preferred share price can exceed the strike price. Each function represented here is a solution to the fixed point equation (3.6).

[Figure 3.7 plot. Horizontal axis: Dividend Rate.]

Figure 3.7: The relationship between the dividend rate and the market price of callable preferred shares given different levels of the short-term interest rate. This graph describes the implications of the model in a cross-section of preferred shares with different dividend rates but a call price of 100. In order to create this figure we chose a set of dividend rates and set the call price of the shares to 100. Next, we calculated the market price functions for each of these shares using (3.6). We then took the prices of each callable preferred share when the short-term interest rate is 9.5% (12.5%) and plotted those prices against the dividend rate of the callable preferred shares.

Figure 3.8: The difference in preferred share prices predicted by the dynamic and static models of managerial behavior. This graph shows that different models of managerial behavior may give rise to different predictions in a cross-section of callable preferred shares. The curve labeled "Dynamic" is generated as described in the previous figure. The curve labeled "Static" describes the relationship between prices of shares with different dividends when the short-term interest rate is 8.5%, as given by the solutions to equation (3.7).

Chapter 4

Empirical Implementation

In this chapter we describe the techniques we will use to fit the model described in the previous chapter to actual data. We begin with a section in which we describe the numerical techniques we employ to solve for the optimal call policy and callable preferred share prices given a set of parameters. We then place further restrictions on the errors from the model and specify the likelihood function to be used in the analysis.
4.1 The Numerical Procedure

We will deal with the pricing of straight preferred shares before describing our technique for determining callable preferred share call policies and pricing. This will allow us to focus attention on the differences between the methodology used in this paper and the methods currently employed in fixed income derivative pricing. We begin by repeating the pricing equation for non-callable preferred shares described in the previous chapter:

$P(r_t) = e^{-r_t\Delta}\int_0^\infty [\delta + P(r_{t+1})]\, q(r_{t+1} \mid r_t)\, dr_{t+1}$

Recall that $q$ is the risk-neutral measure associated with the state price measure given in equation (3.1). There are two equivalent ways in which to treat the pricing equation. The first, as described in Campbell, Lo, and MacKinlay (1997), is to view straight preferreds as a portfolio of pure discount bonds. In our particular case, the portfolio would consist of every pure discount bond which matures at some multiple of three months from the present. Campbell, Lo, and MacKinlay (1997) show that for the cases in which $\gamma = 0$ or $\gamma = 0.5$ there are closed form solutions for the price of every bond in the portfolio.[1] Unfortunately, the sum of the prices of the bonds in this portfolio does not converge to a closed form limit; therefore, we must resort to numerical procedures in order to calculate perpetuity prices. One alternative is to approximate the price of a perpetuity with the price of an instrument that pays a long but finite series of dividends, using the fact that prices of such instruments will converge to the perpetuity price as their maturity approaches infinity. This approach has two disadvantages. First, it is computationally inefficient.[2] Second, it cannot be easily generalized. For example, if $\gamma$ is neither 0 nor 0.5 there are no known solutions to the system of difference equations for the discount bond prices.

The method we will employ does not depend on the existence of closed form solutions for discount bond prices.
We can, therefore, determine preferred share prices for a larger set of specifications of the state prices (for example, when $\gamma \neq 0, 0.5$). As we mentioned in Section 3.1, equation (3.3) describes the price of preferred shares as the fixed point to a functional equation and can be written in an alternate form. Let $Q$ be the linear pricing operator defined by the state price measure. Let $f(r_t)$ be a payoff that is a function of only the short-term rate.[3] The price of this payoff can be stated as:

$\Pi = Qf$

where $\Pi$ is also a function of the short-term rate. The notation means that:

$\Pi(r_t) = \int_0^\infty f(r_{t+1})\, q(r_{t+1} \mid r_t)\, dr_{t+1}$

[1] In the discrete-time setting these prices solve a system of difference equations, as opposed to the differential equations more typically encountered in determining bond prices.
[2] Depending on the parameters in the density function, the prices converge slowly.
[3] That is, $f$ is a payoff that is measurable with respect to the sigma algebra generated by the short-term interest rate process.

Straight preferred share prices are calculated as the fixed point to the equation:

$P = Q(\delta e + P)$    (4.1)

where $e \equiv 1$ is the constant function. The equation states that if an investor holds the preferred share for one period, she will receive a dividend of $\delta$ and be in a position to sell the preferred share for $P(r_{t+1})$ one period hence.

Using the fact that $Q$ is a contraction, there are two ways to solve for preferred share prices. In the first case, we can start with an arbitrary bounded function $P_0(r)$ and then iterate. That is, if we define $P_{n+1} = Q(\delta + P_n)$ the sequence $\{P_n\}_{n=0}^{\infty}$ converges to the price of a preferred share. In the special case where $P_0 = 0$ we will just generate a sequence of prices for preferred shares paying a finite and increasing number of dividends (i.e., $P_n$ will be the price of a preferred share that pays $n$ dividends of $\delta$ each). This method of calculating the preferred share price is not particularly efficient.
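The iteration $P_{n+1} = Q(\delta + P_n)$ described above can be sketched on a discretized state space as follows. This is an illustration under stated assumptions, not the thesis code: the grid, the quarterly step `dt = 0.25`, the renormalization of probability mass that falls outside the grid, and the function names are all hypothetical, while the transition parameters are the Pearson and Sun (1989) estimates quoted in Chapter 3.

```python
import math
import numpy as np

def build_pricing_matrix(edges, a=0.0058, b=0.957, c=0.057, gamma=0.5, dt=0.25):
    """Discretized state prices for r' ~ Normal(a + b r, (c r^gamma)^2)."""
    mids = 0.5 * (edges[:-1] + edges[1:])
    mean = a + b * mids
    sd = c * mids ** gamma
    z = (edges[None, :] - mean[:, None]) / sd[:, None]
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    probs = np.diff(cdf, axis=1)
    probs /= probs.sum(axis=1, keepdims=True)  # assumption: renormalize mass cut off by the grid
    return np.exp(-mids * dt)[:, None] * probs, mids

def perpetuity_by_iteration(Q, delta, tol=1e-10, max_iter=200_000):
    """Start from P_0 = 0 and apply P_{n+1} = Q (delta + P_n) until convergence.
    After n steps, P is the price of a share paying n dividends of delta."""
    P = np.zeros(Q.shape[0])
    for n in range(1, max_iter + 1):
        P_next = Q @ (delta + P)
        if np.max(np.abs(P_next - P)) < tol:
            return P_next, n
        P = P_next
    raise RuntimeError("iteration did not converge")

edges = np.linspace(0.001, 0.40, 201)          # hypothetical rate grid
Q, rates = build_pricing_matrix(edges)
P_iter, n_iter = perpetuity_by_iteration(Q, delta=3.0)
```

The returned `n_iter` gives a rough sense of the cost of this approach: each step adds one more dividend, so many steps are needed before the finite-maturity price is close to the perpetuity price.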
The second method for calculating the preferred share prices involves "inverting" the fixed point equation. Note that we can rewrite equation (4.1) as $P = \delta B + QP$ where $B$ is the price of $1 three months from now. We can rearrange this equation to yield:

$P = (I - Q)^{-1} B\delta$.

This inverse exists since $Q$ is a contraction. Furthermore, we can calculate the solution to the equation very efficiently if we make the state space discrete.[4] Once we have done this the pricing function is translated into a vector and the problem becomes one of finding the matrix inverse.[5]

A few more details may make this procedure more clear. Given a discrete state space, $\bigcup_{i=1}^{N}[r_i, r_{i+1})$, the elements of $Q$ become the appropriate state prices for the discretized system. For example, the element in row $i$ and column $j$ of the matrix is the price of one dollar paid out if the state variable, in our case the interest rate, is between $r_j$ and $r_{j+1}$ when the current state is $i$: $Q_{ij} = \int_{r_j}^{r_{j+1}} q(r \mid r_i)\, dr$. In our application, these elements will be Normal probability masses, scaled down by the current discount function, $e^{-r_i\Delta}$. Given this pricing operator and a payoff that is a function of only the time $t+1$ interest rate, $f(r_{t+1})$, we can determine the price using straightforward matrix multiplication: $\Pi_i = \sum_{j=1}^{N} Q_{ij} f(r_j)$. Furthermore, as we stated in the previous paragraph, equation (4.1) can be solved with a straightforward matrix inversion.

[4] This is equivalent to performing the integration using finite Riemann sums. Alternatively, we are approximating the preferred share price with a collection of step functions. We describe our method for making the state space discrete in the appendix.
[5] Software for calculating such inverses is widely available. We use Matlab in our numerical work.

Refer again to Figure 3.1 which summarizes the output of an application of our numerical technique given a specific set of parameters.
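The matrix form of the calculation described above can be sketched in a few lines. The thesis reports using Matlab; the version below uses Python with NumPy instead, and the grid, the quarterly step, the renormalization of truncated probability mass and the function names are assumptions introduced only for this illustration.

```python
import math
import numpy as np

def build_pricing_matrix(edges, a=0.0058, b=0.957, c=0.057, gamma=0.5, dt=0.25):
    """Q[i, j]: Normal probability mass of [edges[j], edges[j+1]) given r_i,
    scaled by the discount factor exp(-r_i * dt)."""
    mids = 0.5 * (edges[:-1] + edges[1:])
    mean = a + b * mids
    sd = c * mids ** gamma
    z = (edges[None, :] - mean[:, None]) / sd[:, None]
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    probs = np.diff(cdf, axis=1)
    probs /= probs.sum(axis=1, keepdims=True)  # assumption: renormalize mass cut off by the grid
    return np.exp(-mids * dt)[:, None] * probs, mids

def perpetuity_by_inversion(Q, delta):
    """Solve (I - Q) P = delta * B, where B = Q 1 is the price of $1 next period."""
    B = Q.sum(axis=1)
    P = np.linalg.solve(np.eye(Q.shape[0]) - Q, delta * B)
    return P, B

edges = np.linspace(0.001, 0.40, 201)          # hypothetical rate grid
Q, rates = build_pricing_matrix(edges)
P, B = perpetuity_by_inversion(Q, delta=3.0)   # 12% share, par 100, quarterly dividend
```

Solving the linear system with `np.linalg.solve` avoids forming $(I - Q)^{-1}$ explicitly, which is usually the more stable choice numerically; the result satisfies the discretized version of equation (4.1) up to rounding error.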
We will point out a couple of properties of the price. First, the price is bounded both above and below. This is a result of the fact that equation (3.3) is a contraction. Second, the price is decreasing and convex in the interest rate. This follows from our choice of parameters for the Arrow-Debreu prices.[6]

We have chosen to place considerable structure on the state price measure. One benefit of this approach is that we can employ efficient methods to calculate the derivatives of functions we are studying.[7] We will demonstrate this technique first using non-callable preferred shares. Our empirical technique, maximum likelihood, will require calculation of both the derivative of the preferred share pricing function with respect to the state variable as well as the partial derivatives of prices with respect to the parameters in the state price measure. The derivative with respect to the state variable is required in order to calculate the partial derivatives so we will describe how we calculate it first.

Equation (3.3) actually describes a system of linear equations. A representative equation in the system is as follows:

$P(r) = e^{-r\Delta}\int_0^\infty [\delta + P(\tilde r)]\, q(\tilde r \mid r)\, d\tilde r$

As described in equation (3.1), $q(\tilde r \mid r)$ is the density function for a normal random variable.

[6] The appendix describes some parameter restrictions that give rise to these characteristics for the share price.
[7] This is very important when it comes time to estimate the parameters in the model. Efficient algorithms for calculating derivatives make maximization of the likelihood function feasible for cases where we are utilizing data from several callable preferred shares.

Using this fact, we can write: $P(r) = e^{-r\Delta}\int P(\sigma(r)z + \mu(r))\, e^{-z^2/2} \ldots$ [...]

Given $\hat V$ we can calculate $V(r, \epsilon)$ using simple static optimization: $V(r_t, \epsilon_t) = \min\{(1+\tau)K + \epsilon_t,\; \hat V(r_t)\}$. One can use a generalization of Newton's method to solve equation (4.1) for $\hat V$ (Rust 1987).
The appendix shows how to solve the equation using policy iteration and shows that policy iteration is equivalent to Newton's method, a result proven in general in Puterman (1994). As in any model of choice using the logistic distribution, the call probabilities here have the form:

$$p(r_t) = \frac{\exp(\bar{V}(r_t) - (1 + \tau)K)}{1 + \exp(\bar{V}(r_t) - (1 + \tau)K)} \qquad (4.5)$$

These probabilities are one of the two primary observables we use to test the model in the next chapter. In addition, they are used to calculate preferred share prices, as indicated in equation (3.6).

It is possible to calculate the slope and partial derivatives of the call probabilities and callable preferred share prices. The functional equations for these derivatives are similar to those for non-callable preferred shares as described above. See the appendix for the specific form of the equations.

4.2 The Likelihood Function

In this section we describe our method for estimating the model parameters. We introduce another set of assumptions regarding the nature of the "pricing" errors from our model that allows us to write out the likelihood function for a typical sample. We then show that the parameters that maximize the log-likelihood function can be interpreted as the parameters that solve a weighted non-linear least squares problem. Thus, our estimation procedure can be thought of as either maximum likelihood or non-linear generalized least squares (GLS). This alternative interpretation of our estimators gives us some guidance on how to analyze the errors from the model and how to develop robust test statistics.

We will briefly summarize the relevant details of the model developed in Chapter 3. Recall that our model makes predictions about the call probabilities and prices for issues of preferred shares, both as functions of the state variable.
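As a small illustration of the logistic form in equation (4.5), the sketch below evaluates the call probability for a given continuation value. The function name, argument names, and the numerically stable evaluation are choices made for this example only; the continuation value is taken as an input rather than solved for.

```python
import math


def call_probability(v_bar, strike, tau):
    """Logistic call probability in the form of equation (4.5).

    v_bar  : value of leaving the share outstanding (taken as given here)
    strike : call price K
    tau    : proportional transaction cost
    """
    x = v_bar - (1.0 + tau) * strike
    # Evaluate exp(x) / (1 + exp(x)) without overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)
```

When the continuation value equals the all-in cost of calling, $(1 + \tau)K$, the function returns one half; the probability rises toward one as the continuation value rises above that cost.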
Equation (4.5) gives the call probability, which we will denote $p(r_t; \theta, K)$ to show its dependence on the model parameters, $\theta = (a, b, c, \tau)$, and the strike price of the preferred share, $K$. The parameters $a$, $b$ and $c$ are related to the pricing kernel (equation (3.1)), and $\tau$ is the level of the proportional transaction cost.[12] With the call probabilities in hand, the preferred share prices are given by equation (3.6); denote this function $P(r_t; \theta, K)$. Therefore, our likelihood function can make use of two types of data: call decisions and prices. We will describe the two components of the log-likelihood that are associated with these sub-datasets separately and then describe assumptions under which the two parts can be added together.

[12] We could also theoretically estimate the parameter $\gamma$ but our attempts to do so have been unsuccessful. It appears that varying $\gamma$ has little effect on the likelihood. See the discussion in the next chapter.

As with any empirical model of discrete choice, if we observe a series of decisions and if we have a model of the choice probabilities we can form a likelihood function. For any given preferred share issue we will observe a series of dates at which the issue is not redeemed ($d_{it} = 0$) and, potentially, one date at which the issue is redeemed ($d_{it} = 1$). Given our model, the likelihood of this series of call decisions is given by:

$$\mathcal{L}_{id}(\theta) = \prod_{t=1}^{T} (1 - p(r_t; \theta, K_i))^{(1 - d_{it})}\, p(r_t; \theta, K_i)^{d_{it}}$$

The log of this likelihood function has the form:

$$L_{id}(\theta) = \sum_{t=1}^{T} (1 - d_{it}) \log(1 - p(r_t; \theta, K_i)) + d_{it} \log p(r_t; \theta, K_i) \qquad (4.6)$$

Rust (1987) shows how to estimate this "dynamic logistic" model, and our procedure would be identical to his were it not for the fact that we can use the information in preferred share prices as well as the decision to estimate the parameters. Before we can do this, however, we have to make some assumptions regarding the pricing errors in the model.
Let $P_{it}$ denote the price of a callable preferred share at date $t$. We denote the pricing error at a given point in time as $\nu_{it} = P_{it} - P(r_t; \theta, K_i)$ and assume that, across time, the $\nu_{it}$ are normal and independently distributed with zero mean and variance $\sigma^2$. If we were to use only the information in prices to estimate the model we would maximize the log-likelihood of a sample of pricing errors. More specifically, if we maximize over the variance of the pricing error, the concentrated log-likelihood function (Davidson and MacKinnon 1993, p. 281) can be written as follows:

$$L_{ip}(\theta) = -\left(\frac{T}{2}\right) \log\left(\sum_{t} \left(P_{it} - P(r_t, K_i)\right)^2\right) \qquad (4.7)$$

This estimation problem is a straightforward application of non-linear least squares. If we can determine the parameters that minimize the sum of squared pricing errors, using the numerical approximation to $P(r_t, K_i)$ and its derivatives outlined in Chapter 3, then we will have maximized this likelihood function.

Our approach here is to use the information in call decisions and prices to form parameter estimates. In order to do so, we must combine both $L_d$ and $L_p$ into a likelihood function for the sample. We will assume the pricing errors are uncorrelated with the call probability.[13] The likelihood for the sample can then be written as:

$$\mathcal{L}_{i}(\theta) = \prod_{t=1}^{T} (1 - p(r_t; \theta, K_i))^{(1 - d_{it})}\, p(r_t; \theta, K_i)^{d_{it}}\, F_i(\nu_{it})$$

This expression simply reflects the fact that, at each point in time, we observe the call decision and, conditional on no call having been made, a pricing error. The probability of this pair of events is $(1 - p(r_t; \theta, K_i)) F_i(\nu_{it})$ and since these events are assumed to be independent across time, the likelihood is the product of these probabilities. When we concentrate the log-likelihood, thereby removing the parameter $\sigma$, we arrive at the likelihood function we will take to the data:

$$L_i(\theta) = L_{id}(\theta) + L_{ip}(\theta) \qquad (4.8)$$

Equation (4.8) ignores the fact that, for a given firm, we have several issues of preferred shares.
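The two pieces of the log-likelihood can be sketched as follows. This is a minimal illustration of equations (4.6), (4.7) and (4.8) for a single issue, assuming the model call probabilities and model prices have already been computed; all function and argument names are hypothetical.

```python
import math


def decision_loglik(probs, calls):
    """Equation (4.6): log(p) on a call date, log(1 - p) otherwise."""
    return sum(
        math.log(p) if d == 1 else math.log(1.0 - p)
        for p, d in zip(probs, calls)
    )


def price_loglik(observed, modeled):
    """Equation (4.7): -(T/2) * log(sum of squared pricing errors)."""
    sse = sum((po - pm) ** 2 for po, pm in zip(observed, modeled))
    return -0.5 * len(observed) * math.log(sse)


def issue_loglik(probs, calls, observed, modeled):
    """Equation (4.8): L_i = L_id + L_ip for one preferred share issue."""
    return decision_loglik(probs, calls) + price_loglik(observed, modeled)
```

In an estimation routine, `probs` and `modeled` would be recomputed from the model at each trial value of the parameters, and the sum of `issue_loglik` over issues would be passed to a numerical optimizer.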
Initially, we will ignore the fact that there may be some structure among the pricing errors and call probabilities for these issues and assume instead that they are independent across issues. Thus, for a given firm, we write the log-likelihood of the sequence of call decisions and pricing errors as $L(\theta) = \sum_i L_i(\theta)$. We will see that it is easy to test this assumption once we move into a GLS framework.

It is straightforward to show that the first-order conditions for equation (4.8) are as follows:

$$L_\theta(\theta) = \frac{1}{\sigma^2} \sum_{t=1}^{T} \left[(P_t - P(r_t, K))\, P_\theta(r_t, K)\right] + \sum_{t=1}^{T} \frac{(p(r_t; \theta, K) - d_t)\,(-p_\theta(r_t; \theta, K))}{(1 - p(r_t; \theta, K))\, p(r_t; \theta, K)} \qquad (4.9)$$

where a $\theta$ subscript represents the partial derivative of the given function with respect to any one of the parameters in the vector $\theta$. Equation (4.9) has a nice interpretation once we recognize $(1 - p(r_t; \theta, K))\, p(r_t; \theta, K)$ as the variance of the "decision error" $(p(r_t; \theta, K) - d_t)$. Equation (4.9) is simply a weighted sum of orthogonality conditions, where the weights are inversely proportional to the variance of the observed pricing or decision error. If we knew $\sigma^2$ and $(1 - p(r_t; \theta, K))\, p(r_t; \theta, K)$ (or had consistent estimates), our parameter estimates would minimize the weighted sum of squared errors function:

$$SSE(\theta) = \sum_{t=1}^{T} \left\{ \frac{1}{\sigma^2} [P_t - P(r_t, K)]^2 + \frac{(p(r_t; \theta, K) - d_t)^2}{[1 - p(r_t; \theta, K)]\, p(r_t; \theta, K)} \right\} \qquad (4.10)$$

[13] We will test to see whether or not this assumption is satisfied for a specific case in the next chapter.

With this interpretation of our estimation procedure we have a nice framework in which to summarize our assumptions regarding the error structure. We have in effect assumed:

• the pricing errors are independent across time;
• the pricing errors are independent across preferred share issues;
• the pricing errors are homoskedastic;
• the call probabilities are independent across time;
• the call probabilities are independent across preferred share issues;
• the "decision error" $p(r_t) - d_t$ is heteroskedastic;
• the pricing errors are independent of the call probabilities.

The error covariance matrix implicit in equation (4.10) is thus diagonal with the appropriate variance term on the diagonal. We will examine the covariance matrix closely to determine whether or not these assumptions are violated. In addition, using the GLS framework we can apply techniques suggested by White (1984, Ch. 6) to make the standard errors of our estimators robust to the specification of the covariance matrix for the model errors.

Notice that throughout our description of the estimation procedure we have assumed that we can observe the state variable, $r_t$, without error and condition on this information using the relationships described in equations (4.5) and (3.6) above. In fact, we will observe a noisy proxy for the state variable. We do not formally account for the resulting "error-in-variables" problem. Instead, we have taken care to use dependent variables that are derived from prices of portfolios of interest rate sensitive assets in an attempt to minimize measurement error.

As mentioned in Chapter 3, one interesting alternative hypothesis we will test is whether or not managers behave myopically when calling preferred shares. It is straightforward to show that, since such managers would ignore any option value, exercise probabilities will be given by:

$$p^{s}(r_t) = \Pr(d_t = 1 \mid r_t) = \frac{\exp[P(r_t) - (1 + \tau_s)K]}{1 + \exp[P(r_t) - (1 + \tau_s)K]} \qquad (4.11)$$

where $P(r_t)$ is the price of a non-callable preferred share. Preferred share prices will be given by equation (3.6) once we substitute $p^{s}$ for $p^{*}$.
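To see how the myopic rule in equation (4.11) differs from the dynamic rule in equation (4.5), the following sketch evaluates both at the same hypothetical inputs. It assumes, for illustration only, that the value of leaving the share outstanding equals the non-callable price less the value of the option to call; the numbers are invented and are not estimates from the thesis.

```python
import math


def logistic(x):
    """exp(x) / (1 + exp(x)), evaluated without overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def myopic_call_probability(p_noncallable, strike, tau_s):
    """Equation (4.11): the manager ignores the option to delay."""
    return logistic(p_noncallable - (1.0 + tau_s) * strike)


def dynamic_call_probability(v_bar, strike, tau):
    """Equation (4.5): the manager compares (1 + tau) * K with v_bar."""
    return logistic(v_bar - (1.0 + tau) * strike)


# Hypothetical numbers: the option to call is worth 3, so v_bar = P - 3.
p_noncallable, option_value, strike, tau = 108.0, 3.0, 100.0, 0.04
p_static = myopic_call_probability(p_noncallable, strike, tau)
p_dynamic = dynamic_call_probability(p_noncallable - option_value, strike, tau)
```

With the same transaction cost, a positive option value lowers the dynamic call probability below the myopic one. Fitting the myopic rule to data generated by the dynamic rule would therefore tend to require a larger transaction cost parameter to explain the same delays.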
We can perform a test of the alternative hypothesis if we nest both models for the call choice as follows:

$$p(r_t) = \frac{\exp[P(r_t) - (1 + \tau)K]}{\exp[P(r_t) - (1 + \tau)K] + \exp[\lambda C_u(r_t; K)]} \qquad (4.12)$$

where $C_u(r_t; K) = P(r_t) - V(r_t; K)$ represents the value of the unexercised option to call the preferred share (see equation (3.8) in the previous chapter). Setting $\lambda = 0$ recovers the myopic rule in equation (4.11), while $\lambda = 1$ gives the dynamic rule. To test for myopic behavior we check whether the parameter $\lambda$ in (4.12) is zero. If we can reject this hypothesis we have evidence that the non-linear term, $C_u$, helps us to explain the data better.

Chapter 5
Empirical Results

In this chapter our goal is to learn something about the capital restructuring decision we have modeled in previous chapters. The model makes strong predictions about the relationship between interest rates, the call decision and preferred share prices. We will use the observed relationships between a proxy for the interest rates and other variables in a sample of utility company preferred shares to estimate the parameters in our model. We then judge the performance of the model by examining the parameter estimates, analyzing the pricing errors and testing the model against some of the alternative theories of managerial behavior.

We begin the chapter by describing the data we utilize in some detail. We then take a look at one company's preferred share issues, those of the Pacific Gas and Electric Co., and perform a series of tests on the model's performance in that dataset. We focus on this company because it has the broadest cross section of traded preferred shares among utility companies. We are, therefore, able to control for many unmodeled effects, including default risk, by estimating the model using only these shares. The chapter ends with a study of the parameter estimates from a cross-section of preferred shares of several other utilities.
5.1 The Data

In order to carry out the estimation described in the previous chapter, we have collected data on a sample of redeemable preferred shares including the characteristics of the issue (dividend rate, dividend dates, call price), the price of these issues, and the date at which they were called, if they were indeed called. The shares were issued exclusively by gas and electric utility companies and were outstanding during the period extending from January 1980 to the present. In order to be included in the sample, preferred share issues were required to have prices listed in Standard & Poor's Stock Price Record. Additionally, we required that Moody's Public Utility Manual have complete information on the issue's characteristics. As a result of these requirements, our sample excludes private issues. Finally, we require that all shares have call prices which will not fall in the future. This allows us to estimate a model of preferred share prices which is not time dependent.[1]

Our model describes the relationships between a proxy for interest rates and the prices of callable, perpetual preferred shares with different dividend rates and/or strike prices. It does not account for cross sectional differences in prices arising from other share characteristics. Although plain vanilla preferred shares are the most common of utility issues, there were two types of shares we did not include. The first type of excluded shares are referred to as "old money" preferred shares. These shares carry tax disadvantages relative to "new money" preferred shares if the shareholder is a corporation.[2] Thus, our sample consists of only new money preferred shares, which comprise the majority of the outstanding issues.

The second type of shares we excluded from our sample are those issues with sinking fund provisions. A typical sinking fund requires that the issuer buy, either at the market price or at par, a specified number of shares each year.
Dunn and Spatt (1984) describe the prices of these shares when there is a purchaser who can corner the market in the shares. Such a purchaser would be able to force the issuing firm to pay him par for the shares on the sinking fund date, even if interest rates were such that the price of the shares was less than par. This effect will, in theory, bias the price of sinking fund shares towards par.

[1] A declining call price schedule necessitates an empirical model in which an additional state variable is relevant, the time remaining until the call price declines. Although it is possible to perform the estimation using such a model, the computing time required to do so increases dramatically.
[2] Specifically, corporate holders of old money preferreds cannot utilize the full dividend received deduction of 70%. See Grundy (1993) for a detailed discussion of the tax treatment of public utility preferred shares.

Our data happen to contain two series of preferred shares issued by Commonwealth Edison that differ only in that one series has a sinking fund provision, so we can document this bias towards par in the data. In the top panel of Figure 5.1 we plot the time series of prices of the two shares. One can immediately see that the price of the sinking fund share almost always exceeds the price of the non-sinking fund share. The bottom panel in Figure 5.1 shows the relationship between the price of a proxy for the consol preferred share and the prices of the two callable shares. Again, we see that the behavior of the sinking fund share is very different from the plain vanilla share.
In light of the fact that our model does not account for the pricing effects of the sinking fund, we have excluded such shares from our sample.[3]

Our proxy for the state variable is some measure of the relevant short-term interest rate, either measured directly or implied by our term structure model and the price of a long dated security.[4] We use two different variables: the three month t-bill rate from the CRSP Risk Free Rates File or, if available, the price of a well out-of-the-money preferred share. We will refer to this price as the consol preferred share price.

There are several unmodeled factors that may influence our results. We will broadly categorize these factors as liquidity related, regulation related and tax related. A brief discussion of each follows.

[3] The thesis proposal documented results from regressions where the sinking fund shares were inadvertently included. The effect was to bias the estimate of implied transaction costs upwards.
[4] As mentioned in Chapter 3 we can invert equation (3.3) to determine a short rate consistent with the price of such a security.

5.1.1 Liquidity

Preferred shares are a small but important source of capital for utility companies. In the period 1983 to 1992, annual issues of preferred shares amounted to between 4% and 15% of investor-owned electric utility company security sales. In 1992 this represented over $3 billion in issued capital, outstripping electric utility common stock financing by $1.4 billion.[5] Although such statistics will not allow us to conclude that the market for such shares is liquid, Grundy (1993) documents the fact that turnover for preferred stocks is close to the average turnover for all shares traded on the NYSE.[6] Therefore, we feel that the preferred share prices we observe contain useful information which we will incorporate into our study of managers' exercise decisions.

[5] See Moody's Public Utility Manual.
5.1.2 Regulatory Environment

Although the gas and electric utility industry was heavily regulated during the time period covered by our sample, companies are currently facing the prospect of deregulation and increased competition. In the context of the problem we are studying here, the relevant aspects of regulation relate to restrictions on new financing and to the setting of rates.

The Securities and Exchange Commission administers the Public Utility Holding Company Act of 1935. The main focus of the Act is on restricting ownership of electric and gas utility systems; the SEC's duties include, among other things, approving acquisitions and limiting the number of utilities a holding company may control to one. The Act does not appear to restrict the financial policy of a healthy utility in relation to the call and reissue of redeemable preferred shares.[7]

The mechanism for setting gas and electric rates is relevant to the refunding decision since it is possible to influence utility firm managers' call decisions by changing the regulatory environment. Utility rates have historically been set periodically by state public utility boards (e.g., the California Public Utilities Commission). The rates are typically set to cover capital costs and operating costs. Two factors enter the determination of eligible capital costs: the rate base and the cost of capital. The rate base is largely a function of operating capital in place (e.g., generating stations) and is, thus, independent of the company's financial structure. The cost of capital, however, is typically calculated as a weighted average of the existing cost of fixed rate financing (i.e., debt and preferred equity) and the required rate of return on common equity.

[6] Turnover is defined as the total annual volume of trade in an issue divided by the number of shares in the issue that are outstanding.
[7] It does provide for a method of reorganization beyond the bankruptcy and state laws.
Therefore, the rate setting mechanism allows high financing costs, resulting from both high coupon debt and high dividend preferreds, to be passed on to consumers. In addition, since operating costs include income taxes, utility managers who only wish to maximize shareholder value may pass up opportunities to exchange debt for equity, as the higher taxes associated with equity financing may also be passed on to consumers.

Regulators can motivate managers of utilities to reduce financing costs by increasing the time between rate hearings. Increasing the "regulatory lag" has the effect of lengthening the period during which rates are fixed and, therefore, allows savings resulting from reduced financing costs to accrue to common shareholders. The regulatory lag varies from one to three years in our sample. In addition, many utilities are committing to fixed rates for several years. It is reasonable to expect utility firm managers to call preferred shares optimally in this environment.

5.1.3 Taxation

Preferred shares are treated in exactly the same manner as are common shares under the income tax act. As such, dividends received by individuals are treated as ordinary income. Inter-corporate dividends from new money preferred shares, however, qualify for the Dividend Received Deduction (DRD). In calculating their taxable income, corporations that receive dividends must claim the entire amount of the dividend but are then allowed to deduct a portion of the dividend received.

For our sample, the relevant tax rates depend on who the recipient of the dividend income is as well as the time at which it was received. Prior to the Tax Reform Act of 1986 (TRA86) the top marginal tax rate for individuals was 50%. TRA86 reduced this rate to 28% and it currently stands at 39.6%. The DRD is currently 70%; thus, 30% of the dividend is taxable at the 35% corporate tax rate.
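The effective tax rate on inter-corporate dividends implied by a given DRD is simple arithmetic, sketched below. The function name is hypothetical; the only inputs taken from the text are the 70% DRD and the 35% corporate rate.

```python
def effective_dividend_tax_rate(drd, corporate_rate):
    """Tax paid per dollar of dividends received by a corporate holder.

    Only the fraction (1 - drd) of the dividend is taxable, and that
    fraction is taxed at the corporate rate.
    """
    return (1.0 - drd) * corporate_rate


# Figures stated in the text: DRD of 70%, corporate rate of 35%.
current = effective_dividend_tax_rate(drd=0.70, corporate_rate=0.35)
```

With the figures above the effective rate is 10.5%. Applying the same function to an 85% DRD with a 46% corporate rate, and to a 70% DRD with a 34% corporate rate, gives 6.9% and 10.2%; the 46% and 34% statutory rates are assumptions supplied for this illustration and are not stated in the text.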
The DRD was 85% prior to the Tax Reform Act of 1986, fell to 80% in 1987 and reached its current level in 1988. There was considerable uncertainty regarding the future of the DRD in mid-1988 when the House Ways and Means Committee agreed to further reduce the DRD to 50%.[8] For firms qualifying for the DRD, the net effect of the tax changes over the period was to increase the income tax rate on dividends from 6.9% to 10.2%. In order to minimize the impact of this change in the tax code on our results, we will fit our model to exercise decisions and preferred share prices observed during the period 1987-1995, after TRA86.

[8] Grundy (1993) discusses in detail the history of taxation relevant to utility company preferred shares.

5.2 An Empirical Analysis of the Call Decisions of the Pacific Gas and Electric Co.

The purpose of this section is to undertake a detailed study of the decisions made by financial managers of the Pacific Gas and Electric Co. (PGE) to call preferred share issues during the period 1987-1995.[9] We purposefully limit our attention to one company in order to control the dimensions in which the model can fail; for example, it is unlikely that default risk will vary among the preferred shares in this sample. Our goal here is to obtain an understanding of the relationship between the model and the data by studying the model's successes as well as its failures. We will start by examining the model's estimates of transaction costs and then move on to see whether or not our specification of the term structure model is robust. We finish the section by discussing the time series properties of the pricing errors generated by the model.

5.2.1 Benchmark Parameter Estimates

We start with two sets of parameter estimates that will serve as benchmarks for the remainder of this section.
As mentioned in Chapter 3, we parameterize the dynamics of the short term riskless rate as in Chan, Karolyi, Longstaff, and Sanders (1992) (see equation (3.2)). In this section we let $\gamma = 0.5$ and, thus, work with a term structure model similar to that of CIR. The dependent data consist of the prices and call decisions related to several issues of Pacific Gas and Electric Co. (PGE) preferred shares. Two different proxies for the state variable are utilized: the 3-month treasury bill rate and the average of the prices of four well out-of-the-money PGE preferred share issues.[10]

[9] We focus on this company because it had a large number of preferred share issues outstanding during this period, several of which were redeemed.

Estimates of the model parameters are reported in Table 5.1. This table has two sets of results, one for each of the state variables utilized. The point estimates for the term structure parameters, $(a, b, c)$, look reasonable when compared to those from other studies as summarized in Table 5.2.[11] Our estimates are of roughly the same magnitude, although two parameters do appear to be consistently different. First, our estimate of $c$, the parameter which determines the variance of the short rate, is smaller than those documented in previous studies. It is well known that the variance of the time series of yields is inversely related to the term to maturity; thus, this finding may be a result of the fact that our dataset consists mainly of long maturity instruments. Second, our estimate of $a$ appears to be smaller than what is reported in other studies. This parameter is directly proportional to the long run mean of the short rate process. Whereas the other studies utilize data from the early 1980s, a period which had historically high interest rates, our data begin in 1987. Thus, the time horizon over which the model was estimated may explain why our estimate of $a$ is smaller.
Flotation costs, as summarized by the parameter $\tau$, are estimated to be 5.6% when the state variable is the 3-month t-bill rate and 3.9% when the state variable is a consol preferred share price. Both are significantly different from zero. Although flotation costs for common stock may be as high as five percent of the issue proceeds, the cost of issuing new debt or preferred shares is considerably lower (Eckbo and Masulis 1995). We can obtain a rough estimate of the flotation costs specific to PGE by examining Moody's Public Utility Manual; for each of PGE's preferred share issues, the Manual indicates the average selling price of the issue as well as the net proceeds to the company. An informal analysis of the direct flotation costs for all listed preferred shares indicates that we should expect these costs to be about three percent of issue proceeds. In the absence of asymmetric information about the new issue, we suspect that variable indirect costs would be much smaller than this and perhaps even zero. Our informal estimate of the true cost of calling shares is, therefore, about three percent of the call price. This is close to the point estimates of our models and is certainly well within any confidence intervals for the estimates.

[10] The first series was obtained from the CRSP bond databases. The second series is the average of the inverse of the yields of the 4.36%, 4.5%, 4.8% and 5% preferred share issues of PGE. None of these shares traded at prices in excess of 70% of their call price; thus, this series is a reasonable approximation to the price of a stream of quarterly dividends of one dollar.
[11] All of these estimates are directly comparable to those in Table 5.1 except for the values of $a$ and $b$ reported by Chan, Karolyi, Longstaff, and Sanders (1992), which apply to the true dynamics of $r$ rather than to its risk-neutral dynamics.

We will now present some informal evidence on how well the model describes the data.
Figure 5.2 shows the relationship between the PGE preferred share price data and the modeled prices where the state variable is the consol preferred share price. It is worth emphasizing that the predicted prices for all nine preferred share issues are generated using the same four parameter estimates given in the last column of Table 5.1. It appears that our model of preferred share prices does a very good job of describing the data when the consol preferred share price is the state variable. Although this "eyeball metric" of the model's performance is not conclusive evidence that it is well specified, the formal tests in the next subsection provide evidence that this is indeed the case. We have run a number of specification tests on the model's errors and have not found any instruments capable of explaining patterns in the errors. We note, however, that if the state variable is the three month t-bill rate, the model does fail a number of specification tests.

A discussion of how Figure 5.2 was developed may serve to highlight the purpose of the technical machinery developed in previous chapters. Consider first the presentation of the raw data series as represented by the '+' signs on the graphs. Our analysis suggested that there should be a specific functional relationship between the prices of non-callable and callable preferred shares. This prediction is certainly supported in a qualitative sense by the data; we see that there is a very definite non-linear relationship between these prices.

Although our main objective in developing a structural model of the call decision is to learn something about managerial behavior, a side effect is that a small number of parameters can be used to describe a lot of data. This is the story behind the solid lines in the graphs. As we have mentioned, one set of parameter estimates for $(a, b, c, \tau)$ allowed us to draw all nine of the solid lines that represent the theoretical prices of uncalled callable preferred shares
given various levels of the state variable. In our setting there is no closed form solution for this relationship, so we have to utilize numerical techniques from the Markov Decision Process literature to calculate these prices. Application of these techniques is complicated by the fact that we are working in a setting with stochastic interest rates.

Figure 5.2 really tells only half the story of our model, since we also make predictions about the relationship between the state variable and the call/don't call decision. Preferred shares are not always called when one would expect them to be. For example, consider the case of the 9.48% preferred shares. We see that there were cases when the state variable was very high, around 60, and the share was not called. As we pointed out in earlier chapters, a traditional option pricing model would have trouble with this observation. Such a model would predict that on the date of this observation the price should have been equal to the strike price and that managers should have called on, or possibly prior to, that date. The conclusion from these models would be that managers appear to act irrationally. Our model has no problem with this data point, since this observation may have coincided with a low realization of the unobservable state variable we introduced in Chapter 3. In fact, the overall message from the PGE data is that very reasonable levels of the model parameters do a good job of describing observed call decisions and callable preferred share prices.

We now turn our attention to what the model can tell us about managerial behavior. As mentioned in Chapter 3, we can perform a test of whether or not firm managers behave as though they consider the dynamic aspects of their exercise decision.
The static model of the call decision that we are considering as an alternative suggests that managers should call preferred shares when the discounted value of all future dividends, using current yields, equals the strike price plus the cost of the call. It is not unreasonable to think that many managers may believe this is the proper way to behave; it suggests that preferred shares should be called if the decision represents a positive NPV undertaking. In fact, a key point in Kraus (1983) warned against this reasoning: "Even if the net benefit [of calling] is positive ... it does not necessarily follow that the optimal decision is to call the bonds immediately ... the optimal time to [call] is when the value of doing so is greatest, not merely when it has become positive" (p. 54). We can think of our test as determining whether or not managers have taken this advice.

To answer this question we estimate a model which nests the static and dynamic programming problems solved by managers (see equation (4.12)). Table 5.3 provides the results of this test. It summarizes parameter estimates for the unconstrained model, the static model and the dynamic model, assuming the CIR model of interest rate dynamics applies and that the state variable is the five year pure discount bond price. We provide two test statistics for each of the null hypotheses: the t-statistic and LR, the likelihood ratio statistic. For our test of myopic behavior, LR has the $\chi^2(1)$ distribution in large samples. We are able to easily reject the hypothesis that managers act myopically since both the t-statistic and LR are significant at the 1% level. Notice also that the level of the transaction costs implied by the static model is implausibly high, on the order of 16% of the call price. This provides further evidence that the static model may be misspecified.
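The likelihood ratio comparison can be sketched in a few lines. The function name and the log-likelihood values below are hypothetical and are not taken from Table 5.3; the sketch relies only on the standard fact that, with one degree of freedom, the upper tail probability of a chi-square variable at $x$ equals $\operatorname{erfc}(\sqrt{x/2})$.

```python
import math


def likelihood_ratio_test(loglik_unrestricted, loglik_restricted):
    """Return (LR, p-value) for a single restriction.

    LR = 2 * (L_unrestricted - L_restricted) is compared with the
    chi-square(1) distribution, whose upper tail is erfc(sqrt(x / 2)).
    """
    lr = 2.0 * (loglik_unrestricted - loglik_restricted)
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return lr, p_value


# Hypothetical maximized log-likelihoods, not values from Table 5.3.
lr, p_value = likelihood_ratio_test(-512.3, -520.9)
```

An LR statistic above roughly 6.63 corresponds to a p-value below 1% under this distribution, which is the sense in which a restriction such as $\lambda = 0$ is rejected at the 1% level.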
Our test result indicates that our dynamic model of managerial behavior does indeed have more predictive power than at least one simple rule of thumb and that managers appear to consider the value of the option to delay the call when making their refunding decisions. This finding justifies our procedure of fitting a complicated model of managerial behavior to the data, as we have learned something that would not be apparent from qualitative characterizations of the time series. For example, we could not have arrived at the conclusions implied by Table 5.3 by simply noting that preferred shares trade at premiums of around two percent prior to the call announcement.

Whenever we fit a structural model to the data, our interpretation of results such as the one just presented relies on the model being well specified. We now turn to a formal investigation into whether or not this is the case.

5.2.2 Specification Tests

We will now try to find features of the data which our model does not describe. To this end, we perform a series of specification tests of the model. We begin by examining the performance of the model under alternative specifications for the state price measure. Equation (3.2) describes our parameterization of the interest rate dynamics under the risk-neutral measure. The estimates we have discussed so far are based on a restricted version of these dynamics in which γ = 0.5. It is well known that this model fails to completely describe the short rate process (Aït-Sahalia 1996). It is possible for us to calculate optimal exercise decisions and callable preferred share prices given any non-negative value of γ, so one way to test for misspecification is to allow the value of γ to vary. Ideally, we would maximize a likelihood function which includes this additional parameter. Unfortunately, it seems that our dataset does not have the power to identify γ using this procedure.
We have estimated restricted versions of the model where γ takes the values 0, 0.5 and 1.5. We provide the absolute value of the maximized log-likelihood given each of these restrictions in Tables 5.4 and 5.5. Table 5.4 provides the estimates where the state variable is the 3-month t-bill rate and Table 5.5 provides the same where the state variable is the consol preferred share price. We see that when the state variable is the 3-month t-bill rate, the fit of the model improves slightly as we decrease γ; this is opposite to the findings of Chan, Karolyi, Longstaff, and Sanders (1992), who determine that high values of γ are necessary to adequately describe interest rate dynamics in the sample period considered here. When the consol preferred share price proxies for the state we again see that larger values of γ generally result in a poorer fit. We note that the differences in the values of the likelihood are not significant, regardless of what state variable we use, and summarize the results of this specification test by noting that varying the model in this dimension does not have much impact on its explanatory power. Furthermore, the parameter we are primarily interested in is quite insensitive to the level of γ; although the point estimates of τ vary, the 95% confidence intervals for all specifications contain four percent. We will, thus, look for other sources of misspecification.

In our next specification test, we examine whether or not the set of parameters that predict prices are the same as those that predict the exercise decisions. In a sense, this is a direct test of whether or not the rational expectations assumption we have made is violated. In order to derive preferred share prices, we assumed that investors could predict managers' exercise decisions by solving the dynamic program that managers face (see equations (3.5) and (3.6)).
Thus, under the null hypothesis one set of model parameters, (a, b, c, τ), should predict both prices and exercise decisions. To test this hypothesis, we would ideally estimate our model in two separate datasets, one comprised of prices and one comprised of decisions, and obtain two sets of parameter estimates and likelihood function values, one for the model which best predicts prices, and the other for the model that best predicts the exercise decision. Although it is possible to obtain parameter estimates that predict preferred share prices, we do not have enough called preferred shares to do the same for the exercise decision. We, therefore, rely on test statistics which do not require that we fit the unrestricted model.

We have performed a Lagrange multiplier test on the hypothesis that the four model parameters, (a, b, c, τ), adequately describe both preferred share prices and managers' exercise decisions. Since the unrestricted model has eight parameters {(a_i, b_i, c_i, τ_i) | i = 1, 2}, our LM statistics will have the χ²(4) distribution.¹² We performed the test on two models, with the results presented in Table 5.6. We are unable to reject the hypothesis that a single set of parameters explains both preferred share prices and exercise decisions. This leads us to believe that investors do, in fact, anticipate the call policies of the managers of PGE correctly.

In our last set of specification tests we check for sources of predictability in the pricing errors from the model. We have two sets of pricing errors, one for each of the state variable proxies we have tried, and we will use several instruments. We begin by looking for time-series patterns in our errors. Figure 5.3 shows the relationship between time and the pricing errors from the model. The most striking feature of the errors in these figures is the high degree of correlation between pricing errors from different preferred share issues.
This correlation is most obvious when the 3-month t-bill proxies for the short-term interest rate but is still present when the consol preferred share price is the state variable. This may be an indication that we have omitted an explanatory variable or that the functional form of the model is incorrect. As we will see, several candidates for omitted variables have been checked and none seem to be strongly correlated with the pricing error from the model, at least when the state variable is the consol preferred share price. In addition, there is no obvious functional relation between the explanatory variables in the model and these errors.

¹² The 95% critical value for this statistic is 9.49.

Our pricing errors are based on deviations from predicted prices, so if our model is doing a good job we should expect to see errors on the order of half the bid/ask spread. This observation is worth some investigation in terms of a specification test for the model's errors, as the bid/ask spread gives us a standard against which to gauge the magnitude of the residuals. As pointed out in Roll (1984), "bid/ask bounce" will show up as negative autocorrelation in the first difference of the pricing errors. Figure 5.4 shows the relationship between time and the first difference in pricing errors from the model when the consol preferred share is the proxy for the state variable. The degree of negative autocorrelation in these first differences is quite striking. We investigate this further with some statistical evidence. Table 5.7 shows the results from a regression of the first differences in pricing errors from three of the PGE preferred shares on their lags. The slope coefficient is negative and highly significant, indicating the presence of negative autocorrelation in the first differences. This is consistent with our errors being driven, in part, by bid/ask bounce.
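The bid/ask bounce argument can be illustrated with a small simulation. The sketch below is not the procedure used to produce Tables 5.7 and 5.8; it assumes, purely for illustration, that each pricing error equals plus or minus half of a hypothetical spread of $0.75 with equal probability and independently over time. Under that assumption the first differences have a first-order autocorrelation of -0.5, their autocovariance is -s²/4 and their variance is s²/2, so the spread s can be recovered either from Roll's autocovariance formula or from the variance. All function names are our own.

```python
import math
import random


def first_differences(xs):
    """Return x[t] - x[t-1] for t = 1, ..., len(xs) - 1."""
    return [b - a for a, b in zip(xs, xs[1:])]


def lag_moments(d):
    """Sample covariance of (d[t], d[t-1]) and sample variance of d[t-1]."""
    y, x = d[1:], d[:-1]
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    var = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    return cov, var


def ols_slope_on_lag(d):
    """Slope from an OLS regression of d[t] on d[t-1] and a constant."""
    cov, var = lag_moments(d)
    return cov / var


def roll_spread(d):
    """Roll (1984) effective spread: 2 * sqrt(-autocovariance); NaN if the autocovariance is non-negative."""
    cov, _ = lag_moments(d)
    return 2.0 * math.sqrt(-cov) if cov < 0 else float("nan")


def spread_from_variance(d):
    """Spread implied by the variance of first differences under pure bounce: sqrt(2 * variance)."""
    _, var = lag_moments(d)
    return math.sqrt(2.0 * var)


# Hypothetical inputs: a 0.75 dollar spread and 20,000 simulated observations.
TRUE_SPREAD = 0.75
rng = random.Random(0)
errors = [0.5 * TRUE_SPREAD * rng.choice((-1, 1)) for _ in range(20000)]

d = first_differences(errors)
slope = ols_slope_on_lag(d)
spread_autocov = roll_spread(d)
spread_var = spread_from_variance(d)
print(f"slope = {slope:.3f}, spread (autocovariance) = {spread_autocov:.3f}, "
      f"spread (variance) = {spread_var:.3f}")
```

In real data the pricing errors also contain model error, so the two spread estimates need not agree and the regression slope need not equal -0.5; the simulation only shows what pure bounce would imply.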
As suggested in Roll (1984), we have used information from the variance of the first differences to calculate the effective spread. These estimates are tabulated in Table 5.8 for each of the PGE shares in the sample. The analysis indicates that typical spreads are on the order of $0.50 to $0.75 for these shares, which have a par value of $25. Observe that these calculated spreads are close to spreads quoted in January 1991.¹³ In fact, for most shares the calculated spreads are smaller than the quoted spreads, perhaps indicating that many of the trades for these shares occur between the quotes. To summarize, our examination of the time series of pricing errors does not indicate model misspecification in the case where the consol preferred share is the proxy for the state variable.

¹³ Standard & Poor's Stock Price Record reports bid/ask spreads on days during which no trade took place.

We now document that the pricing errors do not indicate an obvious way to improve our model of the call decision or of preferred share prices given the explanatory variables we have chosen. As evidence of this, we note that the pricing error has no obvious functional relationship with the following:¹⁴
• the state variable;
• the actual or predicted preferred share price;
• the call of another preferred share;
• the pricing error one quarter prior to the call;
• the call probability (see Figure 5.5);
• the strike price (see Figure 5.6).

The last two points are worth a brief discussion. In our model, the call probability is a proxy for how "in-the-money" a given callable preferred share is. Figure 5.5 shows that the pricing error is unrelated to this measure of the degree to which the preferred share price exceeds the strike price. In addition, Figure 5.6 indicates that there is no relationship, at least on average, between the strike price and the pricing error.
Although we do not provide the evidence here, we have examined the pricing errors as a function of the strike price at several points in time and can report that there is no systematic relation. These observations lead us to conclude that we have no evidence of a volatility smile effect in callable preferred shares.

¹⁴ This statement is based on evidence from Gauss-Newton regressions (Davidson and MacKinnon 1993).

Next, we document the relationship between other possibly omitted variables and the pricing error. Our prior is that, if anything, the model is missing variables that are correlated with either prices of different maturity fixed income securities or with the probability of default. We have examined the relationship between the following variables and pricing errors:
• a yield spread, defined as the 5-year zero-coupon bond yield¹⁵ less the 3-month t-bill yield;
• a default premium, defined as the average of the yields on two PGE bonds less the yield on a similar maturity treasury bond;
• the prior month's return on PGE common stock;
• the residual from a regression of PGE common stock price on the consol preferred share price.

¹⁵ From the CRSP Fama-Bliss Discount Bonds file.

The pricing errors from the model in which the state variable is the consol preferred share price are unrelated to all these instruments. In other words, there is no evidence that a two-factor model of the term structure, for example, would help us to better explain the prices of PGE preferred shares. Nor would carefully accounting for default risk, since variation in the default spread and the price of PGE common stock is not helpful in explaining pricing errors. We are, therefore, comfortable with the specification of our pricing model in this case.

The picture changes when the state variable is the 3-month t-bill. Both the yield spread and the default premium are useful in explaining the pricing errors.
To see that this is the case we have performed Gauss-Newton regressions as described in Davidson and MacKinnon (1993).¹⁶ The results of this test are in Table 5.9. We see that when the state variable is the 3-month t-bill, we have strong evidence of a missing interest rate factor, since including a longer maturity yield explains a significant portion of the variance in the pricing errors. We have less evidence to indicate that we could improve our pricing predictions by accounting for default risk in the model. The t-statistic for the default premium is considerably smaller for the Gauss-Newton regression which includes only this factor.

To summarize, our model of the call decision and of preferred share prices does a good job of describing the data if we use the price of a consol preferred share as a proxy for the underlying state variable. There is no apparent misspecification of the model, either with respect to functional form or with respect to potentially omitted explanatory variables. When we use a short-term riskless instrument to proxy for the state variable, however, it appears that enhancing the model by adding another interest rate factor and by modeling the possibility of default would improve the fit. Finally, our estimate of the implied transaction cost parameter seems to be insensitive to the specification of the model.

¹⁶ To implement the test we simply regress the model's residuals on the vector composed of the Jacobian of the first term in the log-likelihood augmented with the instrumental variables. If the regression is statistically significant then we have evidence that there is relevant information in the conditioning set that is not accounted for by the model.

5.3 Transaction Cost Estimates for Other Utilities

We now turn our attention to the preferred share issues of utility companies other than PGE. We will restrict our attention to the model where γ = 0.5 (i.e.,
the CIR model of the term structure) and use the 3-month riskless interest rate as a proxy for the state variable. Although this model was shown to have some shortcomings when predicting preferred share prices in the PGE sample, no other instrument is suited to examining a broad cross-section of preferred shares. Unsuccessful attempts were made to use longer maturity interest rates, specifically the 5-year pure discount rate and the price of a portfolio of consol preferred shares. For several companies, both these state variables led to parameter estimates which were inconsistent with our model of the state price measure.

Table 5.10 gives sets of parameter estimates for each of the utility companies we studied. The point estimates for proportional transaction costs exhibit considerable cross-sectional variation, ranging from a low of -16.4% for Metropolitan Edison to a high of 18.6% for the Toledo Edison Co. This variation is reduced significantly if one considers the fact that most of these estimates are quite imprecise. Few estimates have 95% confidence intervals that do not contain 5% as a transaction cost estimate. An informal analysis again indicates that direct flotation costs for these companies are on the order of one to four percent of new issue proceeds. Thus, our examination of this cross-section of implicit transaction cost estimates indicates that our sample contains little information on which to base any conclusions about differences in behavior among utility company managers.

We can explain, at least intuitively, why the standard errors of some of these estimates are so high if we consider the characteristics of the PGE data that led to our relatively precise estimates for the transaction cost parameter (see Table 5.1). Although the value of τ affects the entire pricing function, its impact on the function is largest when the share is at-the-money.
If a dataset does not have any shares that were ever close to being called, for example because the dividend rates are very low, then that sample will yield imprecise estimates of τ. This was not the case for the PGE sample. Dividend rates on PGE shares ranged from 4% to over 10% during the sampling period and several of the high dividend shares were called. Compare this to the Commonwealth Edison Company sample, which has shares whose dividends ranged from 7.24% to 8.4%, none of which were called. The resulting point estimates of τ for Commonwealth Edison are very imprecise. In short, our methodology requires datasets with fairly broad cross-sections of preferred share characteristics in order to yield informative results about the level of implied transaction costs.

We now turn our attention to the point estimates of the term-structure parameters, a, b, and c. It is more difficult to quantify cross-sectional differences among these parameters, since the pricing model can assign similar prices to the same financial instrument, even if the individual parameters look significantly different. For example, the point estimates of a, b, and c for PGE look closer to the Cincinnati Gas and Electric (CGE) estimates than to the Virginia Electric Power (VEP) estimates; however, if we compare the relationship between interest rates and non-callable preferred share prices consistent with each of the three point estimates, we can see that the VEP estimates yield preferred share prices that are more like the PGE prices than are the CGE prices (see Figure 5.7). In fact, the relationship among these predicted prices gives us an idea as to the source of differences among these parameter estimates. The point estimates for a, b, and c from the CGE regressions are consistent with lower straight preferred share prices than are the estimates from PGE or VEP regressions.
This is consistent with CGE shareholders facing higher default risk than PGE or VEP shareholders.

We can use this intuition to group together companies that appear to have similar characteristics. As pointed out in earlier chapters, a, b, and c parameterize the state price measure. We will consider two state price densities to be near one another if they assign similar prices to similar instruments. Specifically, if two sets of point estimates for these parameters yield straight preferred share pricing functions that are, in some sense, near one another, the parameter estimates will be deemed to be similar. In our dataset, interest rates vary from just over 2% to just under 10%; we arbitrarily define two straight preferred share pricing functions to be near one another if the mean squared difference between the functions over the support [0.02, 0.10] is small. Using this metric we have ranked the point estimates for a, b, and c in terms of their distance from the point estimates for PGE. The results are in Table 5.11. According to this ranking, Virginia Electric Power preferred shares give rise to estimates that appear to be closest to those of Pacific Gas and Electric whereas the Toledo Edison parameters are least like PGE's.

Table 5.11 contains some other relevant information. First, we calculate the value of the likelihood function for the given company evaluated at the PGE parameter estimates. Twice the difference between this value and the value of the likelihood in Table 5.10 is then tabulated under the heading "Pseudo Likelihood Ratio".
Strictly speaking, this value is not a likelihood ratio statistic, since we have not evaluated the likelihood function at the restricted optimum parameter estimates; however, the value presented does give an upper bound for the value of the true likelihood ratio statistic, since the PGE point estimates of a, b, c and τ are candidates for the restricted parameter estimates.¹⁷ We note that this statistic is, in general, smaller for shares deemed closer to PGE according to the metric described in the previous paragraph, giving some support to our choice of ranking method. Second, we have tabulated both the 1990 and 1995 Moody's preferred share rating for each utility. Notice that in 1995, the top six companies all have "a" ratings whereas the bottom six companies all have ratings in the "b" categories (baa, ba or b). In addition, two of the bottom six companies had a significant ratings decrease over the 1990-1995 period (from "a" to "baa"). We conclude that shares that are most like PGE's were rated highly, with no significant changes in ratings occurring over the time period covered by our data. Finally, data from well out-of-the-money preferred shares, if available, were used to calculate a time series of yield spreads relative to those of PGE. The time series average of these yield spreads, stated in basis points, is reported in Table 5.11.¹⁸ The average spreads support the story from the previous paragraph; shares with the lowest (highest) yield spreads appear to be most (least) like PGE shares.

These findings contradict the hypothesis that these preferred shares are riskless, since without default risk these parameters should all be the same. We can attribute at least some of the variation in the term structure parameter estimates to an omitted variable that proxies for default risk, at least when our proxy for the state variable is the three month treasury rate. This is consistent with our findings from the specification tests in the previous section relating to PGE pricing errors. Ideally, we would use company specific proxies for risky interest rates to fit the model, as we did in the case of PGE. Unfortunately, limitations in the data, such as lack of cross-sectional variation in an individual company's share characteristics, restrict us from using this approach.

¹⁷ An advantage of reporting these statistics is that they are, in a sense, additive. For example, if we are interested in whether or not the parameter estimates for three of the companies are the same, one set of which belongs to PGE, we can add up the two pseudo likelihood ratio statistics and get an upper bound for the actual likelihood ratio statistic for the test (which would be distributed χ²(8)).

¹⁸ Time series plots indicated that the spreads may be non-stationary, consistent with the changes in Moody's ratings described previously.

Table 5.1: Parameter estimates from a sample of Pacific Gas and Electric preferred shares during the period 1987-1995. This table shows the parameter estimates generated by a sample of preferred shares from Pacific Gas and Electric during the period 1987-1995. Two proxies for the state variable were used: the 3-month t-bill yield and the average price of a set of well out-of-the-money preferred shares issued by PGE. (Standard errors are in parentheses.)

Parameter    3-month t-bill    Consol preferred
a            0.0027032         0.001384
             (0.000354)        (0.000404)
b            0.9783            0.9837
             (0.004244)        (0.0033)
c            0.0349            0.01451
             (0.00590)         (0.00825)
τ            0.0564            0.03909
             (0.01637)         (0.01812)
n            183               183
-ℒ           663.6             449.2

Table 5.2: Parameter estimates for CIR interest rate dynamics from other studies. This table summarizes estimates for term structure parameters from other studies of the CIR interest rate dynamics. The CKLS estimates are from Chan et al. (1992), who estimated several processes for the short rate using one month t-bill data. The PS estimates are from Pearson and Sun (1989), who used several bond portfolios and fit the CIR term structure to the data. The GO estimates are from Green and Oedegaard (1997), who use all treasury instruments to fit an after tax version of the CIR term structure. All parameter estimates have been converted to values which can be compared directly to our estimates in Table 5.1. The PS and GO estimates apply to the risk-neutral dynamics, whereas the CKLS estimates apply to the observed process for the short rate.

Parameter    CKLS       PS         GO
a            0.0047     0.0058     0.0041
b            0.942      0.957      0.983
c            0.0427     0.057      0.09
Period       1964-89    1979-86    1978-92

Table 5.3: Test of myopic behavior by Pacific Gas and Electric during the period 1987-1995. This table compares the dynamic model to the static model of managerial call behaviour (see equation (4.12)). The LR statistic is highly significant for the static model, indicating we have evidence supporting the hypothesis that managers act dynamically when making their exercise decision. (Standard errors are in parentheses.)

Parameter    Unconstrained    Static          Dynamic
a            0.002711         0.00283101      0.0027032
             (0.0003612)      (0.00026002)    (0.000354)
b            0.9777           0.9767          0.9783
             (0.004418)       (0.003241)      (0.004244)
c            0.031835         0.032939        0.0349
             (0.008071)       (0.006629)      (0.00590)
τ            0.010159         0.1617          0.0564
             (0.023332)       (0.009798)      (0.01637)
λ            1.093            0               1
             (0.0495)
-ℒ           660.3            679.3           663.6
LR                            38              6.6

Table 5.4: Test of specification of the term structure model. This table shows the sensitivity of model parameters to changes in γ. These estimates are from the dataset where the state variable is the 3-month t-bill yield. (Standard errors are in parentheses.)

Parameter    γ = 0         γ = 0.5       γ = 1.5
a            0.002669      0.0027032     0.001887
             (0.000350)    (0.000354)    (0.000405)
b            0.9771        0.9783        0.996498
             (0.00427)     (0.004244)    (0.008759)
c            0.00791       0.0349        0.5489
             (0.00146)     (0.00590)     (0.09447)
τ            0.06524       0.0564        0.031592
             (0.01635)     (0.01637)     (0.016229)
-ℒ           663.6         663.6         664.8
Table 5.5: Test of specification of the term structure model. This table shows the sensitivity of model parameters to changes in γ. These estimates are from the dataset where the state variable is the consol preferred share price. (Standard errors are in parentheses.)

Parameter    γ = 0        γ = 0.5       γ = 1.5
a            0.00196      0.001384      0.000707
             (0.00087)    (0.000404)    (0.000205)
b            0.9762       0.9837        0.9905
             (0.00785)    (0.0033)      (0.005063)
c            0.00458      0.01451       0.1084
             (0.00258)    (0.00825)     (0.07280)
τ            0.03944      0.03909       0.02608
             (0.01763)    (0.01812)     (0.02468)
-ℒ           448.6        449.2         449.0

Table 5.6: Test of rational expectations for preferred shareholders. In this table we test whether or not one set of parameter estimates explains preferred share prices and the exercise decisions for PGE preferred shares over the period 1987-1995. The statistics are distributed χ²(4) with a 95% critical value of 9.49.

State Variable    γ = 0    γ = 0.5
3-yr. t-bill      TH       8.4

Table 5.7: Test for autocorrelation among the first difference of the pricing errors. In this table we test for the presence of bid/ask bounce in the pricing errors (Roll 1984). For the three preferred shares with the longest time series of errors, we regress the first differences of the pricing errors on their lags and a constant. (Standard errors are in parentheses.)

Preferred Share Dividend    Intercept    Slope
7.84%                       0.097        -0.560
                            (0.18)       (0.15)
8.00%                       0.113        -0.533
                            (0.15)       (0.15)
8.20%                       0.0359       -0.646
                            (0.17)       (0.13)

Table 5.8: Estimates of the bid/ask spread. In this table we report the bid/ask spread implied by the pricing errors (Roll 1984). In addition we report quoted spreads from January 1991 and average monthly trading volume for each of the PGE shares we investigated. The spread is stated as dollars per share where the shares have a par value of $25. Trading volume is in hundreds of shares.

Preferred Share    Estimated Bid/Ask    Quoted Bid/Ask        Avg. Trdg. Vol.
Dividend           Spread (Roll)        Spread (Jan. 1991)    (Monthly)
10.28%             0.48                 0.63                  130.10
10.46%             0.42                 0.63                  85.45
7.84%              0.88                 1.00                  79.61
8.00%              0.72                 1.00                  80.86
10.18%             0.52                 0.75                  101.50
9.30%              0.46                 0.63                  144.50
9.48%              0.58                 0.50                  113.10
8.16%              0.66                 0.88                  N/A
8.20%              0.87                 0.75                  91.09

Table 5.9: Specification tests using Gauss-Newton regressions. In this table we utilize the Gauss-Newton regression technique to determine whether or not we have omitted any state variables from the model. The F statistic is significant at the 5% level.

State Variable                       Parameter    t-statistic
3-mo. t-bill      Yield spread       -10.5        -13.5
                  Default premium    2.2          8.5
                  F                  144.4

Table 5.10: Parameter estimates from the model using preferred share issues from other utility companies. This table summarizes parameter estimates from the model using preferred share prices and call decisions of 22 different utility companies. (Standard errors are in parentheses.)

Company Name                       a           b           c           τ            -ℒ
Alabama Power                      0.0067      0.9353      0.0807      -1.0047      195
                                   (0.0025)    (0.0258)    (0.0216)    (5.8840)
Cincinnati Gas and Electric        0.0030      0.9788      0.0387      13.6089      203
                                   (0.0004)    (0.0044)    (0.0075)    (2.9279)
Commonwealth Edison                0.0036      0.9789      0.0681      17.4677      640
                                   (0.0005)    (0.0111)    (0.0334)    (23.2916)
Detroit Edison                     0.0028      0.9806      0.0359      11.7721      483
                                   (0.0005)    (0.0061)    (0.0090)    (5.1595)
Duke Power                         0.0056      0.9442      0.0369      3.2435       279
                                   (0.0009)    (0.0104)    (0.0097)    (1.8488)
Georgia Power                      0.0033      0.9740      0.0350      6.1938       262
                                   (0.0007)    (0.0084)    (0.0068)    (3.8779)
Illinois Power                     0.0021      0.9938      0.0440      4.0309       429
                                   (0.0004)    (0.0057)    (0.0073)    (4.7385)
Indiana Michigan Power             0.0033      0.9750      0.0437      6.4202       236
                                   (0.0005)    (0.0061)    (0.0091)    (4.3128)
Jersey Central Power and Light     0.0041      0.9682      0.0548      -6.4300      376
                                   (0.0010)    (0.0101)    (0.0179)    (5.8141)
Kansas City Power and Light        0.0021      0.9872      0.0378      -7.9386      254
                                   (0.0004)    (0.0046)    (0.0176)    (14.9191)
Metropolitan Edison                0.0052      0.9660      0.0800      -16.3785     497
                                   (0.0009)    (0.0096)    (0.0127)    (3.1829)
Northern States Power              0.0028      0.9752      0.0351      4.4900       320
                                   (0.0005)    (0.0060)    (0.0110)    (9.2291)
Ohio Power                         0.0052      0.9588      0.0672      -13.2784     431
                                   (0.0010)    (0.0097)    (0.0112)    (2.6811)
Ohio Edison                        0.0032      0.9812      0.0505      0.7958       528
                                   (0.0003)    (0.0033)    (0.0076)    (3.7303)
PECO Energy                        0.0027      0.9873      0.0429      -6.2645      481
                                   (0.0006)    (0.0062)    (0.0074)    (1.9138)
Pennsylvania Power and Light       0.0041      0.9653      0.0466      4.0547       353
                                   (0.0008)    (0.0092)    (0.0083)    (1.9641)
Public Service Gas and Electric    0.0032      0.9750      0.0463      2.9152       722
                                   (0.0003)    (0.0038)    (0.0071)    (2.3153)
San Diego Gas and Electric         0.0037      0.9669      0.0390      2.2532       285
                                   (0.0007)    (0.0080)    (0.0112)    (3.2672)
Southern California Edison         0.0039      0.9636      0.0466      3.2826       222
                                   (0.0008)    (0.0098)    (0.0089)    (5.4476)
Toledo Edison                      0.0047      0.9774      0.0803      18.5713      627
                                   (0.0005)    (0.0207)    (0.0582)    (46.1395)
Union Electric                     0.0026      0.9802      0.0397      10.6259      336
                                   (0.0005)    (0.0056)    (0.0081)    (5.4023)
Virginia Electric Power            0.0030      0.9763      0.0438      -0.5968      305
                                   (0.0006)    (0.0088)    (0.0168)    (5.4618)
Sample Average                     0.0036      0.9731      0.0485      2.9322
Sample Standard Deviation          (0.0012)    (0.0137)    (0.0160)    (8.8901)
Pacific Gas and Electric           0.0027      0.9783      0.0349      0.0564       664
                                   (0.00035)   (0.00424)   (0.00590)   (0.0164)

Table 5.11: Preferred share issuers ranked by a summary measure of the difference between PGE preferred share prices and preferred share prices implied by parameter estimates in the previous table. The first issuer in this table has point estimates for state price density parameters that imply consol preferred share prices most like those of Pacific Gas and Electric. The pseudo LR statistic is an upper bound for the LR statistic from a test of whether the same set of parameter estimates for the model can explain observed behavior for PGE shares and those of the listed issuer. The true LR statistic has a 95% critical value of 9.49. The yield spread, if available, is the average difference in yields for consol preferred shares of the given issuer over the yield of similar shares for PGE.
This spread is reported in basis points. Moody's Preferred Share Ratings are, in order of increasing quality: b, ba, baa, a, aa, aaa. The number following the letter rating indicates relative ranking within a category, with a "1" indicating the share is at the high end of the rating category.

Name of Issuer                     Pseudo LR    Moody's Rating          Yield Spread
                                   Statistic    1990        1995        over PGE
Pacific Gas and Electric                        a1          a3
Virginia Electric Power            5            a2          a3          -12
Union Electric                     5            a2          a1          1
Public Service Gas and Electric    34           a3          a3          19
Kansas City Power and Light        16           a3          a2          14
San Diego Gas and Electric         5            a1          a2          -19
Southern California Edison         15           aa2         a3          -20
Northern States Power              10           aa2         a2          -20
Indiana Michigan Power             26           baa2        baa2
Detroit Edison                     63           baa3/ba1    baa3/ba1
Pennsylvania Power and Light       35           a3/baa2     a3          -17
Cincinnati Gas and Electric        52           baa2        baa2        41
Georgia Power                      28           baa2        a2
Commonwealth Edison                94           baa3        baa3
Alabama Power                      28           a2          a2
Duke Power                         32           aa2         aa2
Jersey Central Power and Light     50           a3          baa2        49
Illinois Power                     178          baa2        baa3        75
Ohio Edison                        194          baa2        ba1         70
Ohio Power                         52           a3          baa1
PECO Energy                        120          ba1         baa2        67
Metropolitan Edison                122          a3          baa2        46
Toledo Edison                      352          ba2         b2

Figure 5.1: Behavior of two 8.4% preferred shares, one of which has a sinking fund provision. The dashed line in the top panel illustrates the time series behavior of the share with a sinking fund. The x's in the bottom panel illustrate the relationship between the consol preferred share price and the price of the share with the sinking fund. (Legend: sinking fund, non-sinking fund. Horizontal axis of the bottom panel: consol preferred price.)

Figure 5.2: Relationship between PGE preferred share prices and the consol preferred share price. Actual data is represented with the + symbol and the solid lines represent the theoretical relationship given the parameter estimates from the second column in Table 5.1. (Panels: 10.28%, 10.46%, 7.84%. Horizontal axis: scaled consol preferred price, in dollars per dollar of quarterly dividend.)
[Figure 5.3. Horizontal axis: Time (quarters).]

Figure 5.3: The time series of pricing errors. This figure shows the time series of pricing errors for PGE preferred shares. The underlying model uses the CIR term structure. In the top panel the proxy for the state variable is the 3-month t-bill rate and in the bottom panel the proxy for the state variable is the consol preferred share price. The first of the quarterly observations is on February 29, 1987.

[Figure 5.4. Horizontal axis: Time (quarters).]

Figure 5.4: The time series of the first difference of pricing errors. This figure shows the time series of the first differences in the pricing errors for PGE preferred shares. The underlying model uses the CIR term structure. The proxy for the state variable is the consol preferred share price. The first of the quarterly observations is on February 29, 1987.

[Figure 5.5. Horizontal axes: Call Probability (first panel), Log of Call Probability (second panel).]

Figure 5.5: Pricing Errors vs. Call Probability. This figure shows the relationship between the call probability (or log of call probability) and the pricing error for PGE preferred shares. The underlying model uses the CIR term structure and the proxy for the state variable is the consol preferred share price.

[Figure 5.6. Horizontal axis: Scaled Strike Price.]

Figure 5.6: Pricing Errors vs. Strike Price. This figure shows the relationship between the strike price and the pricing error for PGE preferred shares. The underlying model uses the CIR term structure and the proxy for the state variable is the consol preferred share price.

Figure 5.7: Prices implied by state price densities for three different companies.
This figure shows the relationship between the short-term interest rate and non-callable preferred share prices given three different parameterizations of the state price density.

Chapter 6
Conclusions and Directions for Future Research

The goal of this thesis was to learn something about how financial managers make decisions by observing callable preferred share prices and the corresponding call policies. To achieve this goal, we developed a model of their behavior and an empirical test of the model's predictive power. Our null hypothesis was that managers maximize shareholder value when making their call decision.

An important component of shareholder value in this model is the value of the call option embedded in the preferred shares. Therefore, in order to explain how managers make the call decision we modified a fixed-income option pricing model. The model accounts for the effects of flotation costs which typically accompany a call as well as for unspecified costs and benefits associated with a call. These costs and benefits were modeled as an unobservable decision-relevant cash flow, the effect of which was summarized by an unpredictable random variable. The resulting model made predictions about the relationships between a state variable (a proxy for the short-term interest rate) and both the call probability and callable preferred share prices.

The call probability and callable preferred share pricing functions were fitted to data from the preferred share issues of several companies. This estimation procedure was very numerically intensive since there are no closed-form solutions for either function. The procedure we utilized was based on the "Nested Fixed Point Algorithm" described by Rust (1987). We modified the procedure to utilize information on preferred share prices as well as on the call decisions. The technique is, effectively, non-linear least squares.
Given a set of parameter estimates, we numerically calculate the pricing and call probability functions, calculate the (weighted) sum of squared errors given the pricing and decision data and, after updating the parameter estimates in a sensible way, repeat the procedure for the new set of parameter estimates. We repeat this process until the estimates converge.

This empirical procedure was applied to several datasets. The company whose data were most suitable for the estimation was the Pacific Gas and Electric Company (PGE), although the procedure was applied to data from several utilities.

We learned a good deal about the call policies of managers of PGE. First, the model produced flotation cost estimates and term-structure parameter estimates that seem very reasonable. Second, the errors from the model passed several specification tests, indicating that the pricing and call probability functions do a good job of describing the data. In fact, the errors from the model which best described the data contained no clues about how to improve the model. Finally, we tested our model against a simpler model of the call decision. In the simple model managers ignored the value of the option to delay the call decision. We were able to reject the simple model in favour of the option pricing model we developed. To summarize the findings for PGE: our model did a good job of describing the data with reasonable parameter estimates, there was no obvious way to improve the model based on an inspection of the errors, and a simpler model of the call decision was shown to perform significantly worse.

Our findings for the decisions of managers of other companies were hindered by the quality of the data. We were, however, still able to obtain parameter estimates for several companies. We did not find any significant evidence that transaction costs are unreasonably high.
We did find, however, that a factor that proxies for default risk of the various companies would be helpful in better explaining the data.

There are at least two avenues for future research. It may be interesting to estimate our model in a sample of callable debt. This would allow us to examine the call policies of managers who do not work in a regulated environment. However, we must overcome two obstacles before we proceed along these lines. First, good corporate bond prices are notoriously hard to find. Second, since bonds have a maturity date, we would have to keep track of an additional state variable in our non-linear functions, namely the time to maturity. This makes the estimation much more difficult from a computational point of view. However, if we were to consider only a few cross-sections of bond prices it is possible that these problems would not be insurmountable.

The methodology developed here seems like a promising way to learn more about managers' capital restructuring decisions. However, some work is required on the theoretical front if we are to deduce anything from cross-sectional data on bond and stock prices or refinancing decisions. As a first step, we could extend the model of Fischer, Heinkel, and Zechner (1989a) to an environment in which the riskless interest rate is stochastic. A direct empirical test of such a model would utilize data on firm value or cashflows, as well as the interest rate, to estimate model parameters. This may provide a means by which to control for default risk and thereby allow us to analyze the behavior of a larger set of firms in a consistent framework.

Bibliography

Ait-Sahalia, Yacine, 1996, Testing continuous-time models of the spot interest rate, Review of Financial Studies 9, 385-426.
Asquith, Paul, 1995, Convertible bonds are not called late, Journal of Finance 50, 1275-1289.
Backus, David K., and Stanley E.
Zin, 1994, Reverse engineering the yield curve, Working Paper, New York University and Carnegie Mellon University.
Barnea, A., R. Haugen, and L. Senbet, 1980, A rationale for debt maturity structure and call provisions in the agency theoretic framework, Journal of Finance pp. 1223-1234.
Barone-Adesi, Giovanni, and Francisco A. Delgado, 1995, Call policies with flotation costs: A dog chasing its tail, Rodney L. White Center for Financial Research Working Paper.
Bliss, Robert R., and Ehud I. Ronn, 1998, Callable U.S. Treasury bonds: Optimal calls, anomalies, and implied volatilities, Journal of Business forthcoming.
Brennan, Michael J., and Eduardo S. Schwartz, 1977, Savings bonds, retractable bonds and callable bonds, Journal of Financial Economics 5, 67-88.
Brick, I., and B. Wallingford, 1985, The relative tax benefits of alternative call features in corporate debt, Journal of Financial and Quantitative Analysis pp. 85-105.
Campbell, John Y., Andrew W. Lo, and A. Craig MacKinlay, 1997, The Econometrics of Financial Markets (Princeton University Press: Princeton, NJ).
Chan, K. C., G. Andrew Karolyi, Francis A. Longstaff, and Anthony B. Sanders, 1992, An empirical comparison of alternative models of the short-term interest rate, Journal of Finance 47, 1209-1227.
Conway, John B., 1990, A Course in Functional Analysis (Springer Verlag: New York, NY).
Cox, J. C., J. E. Ingersoll, and S. A. Ross, 1985, A theory of the term structure of interest rates, Econometrica 53, 385-407.
Davidson, Russell, and James G. MacKinnon, 1993, Estimation and Inference in Econometrics (Oxford University Press).
Duffie, Darrell, and Kenneth J. Singleton, 1997, Econometric modeling of term structures of defaultable bonds, Working Paper, Stanford University.
Dunn, Kenneth B., and Kenneth M. Eades, 1989, Voluntary conversion of convertible securities and the optimal call strategy, Journal of Financial Economics 23, 273-301.
Dunn, Kenneth B., and Chester S.
Spatt, 1984, A strategic analysis of sinking fund bonds, Journal of Financial Economics 13, 399-423.
Dunn, Kenneth B., and Chester S. Spatt, 1986, The effect of refinancing costs and market imperfections on the optimal call strategy and the pricing of debt contracts, Working Paper, Carnegie-Mellon University.
Dybvig, Phillip H., and Jaime F. Zender, 1991, Capital structure and dividend irrelevance with asymmetric information, Review of Financial Studies 4, 201-219.
Eckbo, B. E., and R. W. Masulis, 1995, Seasoned equity offerings: A survey, in R. A. Jarrow, V. Maksimovic, and W. T. Ziemba, eds.: Handbooks in Operations Research and Management Science: Finance, pp. 1017-1072 (North-Holland: Amsterdam).
Fischer, Edwin O., Robert Heinkel, and Josef Zechner, 1989a, Dynamic capital structure choice: Theory and tests, Journal of Finance pp. 19-40.
Fischer, Edwin O., Robert Heinkel, and Josef Zechner, 1989b, Dynamic recapitalization policies and the role of call premia and issue discounts, Journal of Financial and Quantitative Analysis pp. 427-446.
Flannery, M. J., 1986, Asymmetric information and risky debt maturity choice, Journal of Finance pp. 18-38.
Grundy, Bruce D., 1993, Preferreds and taxes: The relative price of dividends and coupons, Working Paper, The Wharton School.
Ingersoll, J. E., 1977, An examination of corporate call policies on convertible securities, Journal of Finance 32, 463-478.
Johnson, Norman L., Samuel Kotz, and N. Balakrishnan, 1994, Continuous Univariate Distributions, vol. 2 (Wiley and Sons: New York, NY) 2nd edn.
King, Tao-Hsien (Dolly), 1997, The determinants of corporate call policy for non-convertible bonds, Ph.D. thesis, The University of Wisconsin - Madison.
Kraus, Alan, 1983, An analysis of call provisions and the corporate refunding decision, Midland Corporate Finance Journal pp. 46-60.
Ling, David C., 1991, Optimal refunding strategies, transaction costs and the market value of corporate debt, The Financial Review pp. 479-500.
Longstaff, F., 1992, Are negative option prices possible? The callable U.S.
treasury-bond puzzle, Journal of Business 65, 571-592.
Mauer, David C., 1993, Optimal bond call policies under transactions costs, Journal of Financial Research 16, 23-37.
Modigliani, F., and M. H. Miller, 1958, The cost of capital, corporate finance and the theory of investment, American Economic Review 48, 261-297.
Parrino, Robert, and Michael S. Weisbach, 1997, On the magnitude of stockholder-bondholder conflicts, Working Paper, University of Texas at Austin.
Pearson, N. D., and T. Sun, 1989, A test of the Cox, Ingersoll, Ross model of the term structure of interest rates using the method of maximum likelihood, Working Paper, Massachusetts Institute of Technology.
Persons, John C., 1994, Renegotiation and the impossibility of optimal investment, Review of Financial Studies 7, 419-449.
Puterman, Martin L., 1994, Markov Decision Processes: Discrete Stochastic Dynamic Programming (John Wiley and Sons, Inc.).
Robbins, Edward Henry, and John D. Schatzberg, 1986, Callable bonds: A risk reducing signalling mechanism, Journal of Finance pp. 935-949.
Roll, Richard, 1984, A simple implicit measure of the effective bid-ask spread in an efficient market, Journal of Finance 35, 1127-1140.
Rust, John, 1987, Optimal replacement of GMC bus engines: An empirical model of Harold Zurcher, Econometrica 55, 999-1033.
Rust, John, 1988, Maximum likelihood estimation of discrete control processes, SIAM Journal on Control and Optimization 26, 1006-1024.
Stanton, Richard, 1995, Rational prepayment and the valuation of mortgage-backed securities, Review of Financial Studies 8, 677-708.
Vasicek, O., 1977, An equilibrium characterization of the term structure, Journal of Financial Economics.
Vu, Joseph D., 1986, An empirical investigation of calls of non-convertible bonds, Journal of Financial Economics 16, 235-264.
White, Halbert, 1984, Asymptotic Theory for Econometricians (Academic Press, Inc.: New York, NY).

Appendix A
Technical Appendix

A.1 Existence of Preferred Share Prices

Unlike Backus and Zin (1994), we assume that interest rates are restricted to be non-negative real numbers, so that $r \in \mathbb{R}_+$. The state price measure is formally defined on this state space as follows:

\[
q(r_{t+1} \mid r_t) =
\begin{cases}
\dfrac{e^{-r_t\Delta}}{\sqrt{2\pi}\, c\, r_t^{\gamma}}\; e^{-0.5\left[\frac{r_{t+1} - a - b r_t}{c\, r_t^{\gamma}}\right]^2} & \text{if } r_{t+1} > 0, \\[2ex]
\dfrac{e^{-r_t\Delta}}{\sqrt{2\pi}\, c\, r_t^{\gamma}} \displaystyle\int_{-\infty}^{0} e^{-0.5\left[\frac{z - a - b r_t}{c\, r_t^{\gamma}}\right]^2} dz & \text{if } r_{t+1} = 0,
\end{cases}
\]

where $a$, $b$, $c$, and $\gamma$ are parameters and $\Delta$ is the time increment. We will restrict $b$ to be less than one and $\Delta$ will be three months throughout the paper.

Let $f : \mathbb{R}_+ \to \mathbb{R}_+$ define a bounded payoff. The price of this payoff one period earlier is given by its integral with respect to the state price measure. Formally:

\[
\frac{e^{-r_t\Delta}}{\sqrt{2\pi}\, c\, r_t^{\gamma}}
\left[ f(0) \int_{-\infty}^{0} e^{-0.5\left[\frac{z - a - b r_t}{c\, r_t^{\gamma}}\right]^2} dz
+ \int_{0}^{\infty} f(r_{t+1})\, e^{-0.5\left[\frac{r_{t+1} - a - b r_t}{c\, r_t^{\gamma}}\right]^2} dr_{t+1} \right]
\]

If we let $Q$ define the linear pricing operator and $Pf$ represent the function that relates the current interest rate to the price, we can write the mapping between payoffs and prices more compactly as $Pf = Qf$. We will sometimes consider the "discounting" aspect of $Q$ separately and write the operator as $Q = \Lambda\Pi$ where $(\Lambda f)(r_t) = e^{-r_t\Delta} f(r_t)$ and

\[
(\Pi f)(r_t) = \frac{1}{\sqrt{2\pi}\, c\, r_t^{\gamma}}
\left[ f(0) \int_{-\infty}^{0} e^{-0.5\left[\frac{z - a - b r_t}{c\, r_t^{\gamma}}\right]^2} dz
+ \int_{0}^{\infty} f(r_{t+1})\, e^{-0.5\left[\frac{r_{t+1} - a - b r_t}{c\, r_t^{\gamma}}\right]^2} dr_{t+1} \right].
\]

Let $e = 1$ be the payout of a riskless pure discount bond. Note that since $\Pi e = e$, $\Pi$ defines a family of "risk-neutral" probability measures.¹

The restrictions we have placed on $Q$ are sufficient to guarantee the existence of preferred share prices. We restrict our attention to the set of bounded, Borel measurable real valued functions on $\mathbb{R}_+$ and denote this set $U$.

¹ Writing out the price of a pure discount bond will help to clarify the meaning of our notation. Its price is given by $B_1 = Qe = \Lambda e$. We denote the price of the bond in a given state as $B_1(r_t) = (\Lambda e)(r_t) = e^{-r_t\Delta}$.
Define a norm on $U$ as $\|u\| = \operatorname{ess\,sup}_{x \in \mathbb{R}_+} |u(x)|$.² It is straightforward to show that the space $U$ endowed with the norm $\|\cdot\|$ is a complete normed linear space, or a Banach space (see, for example, Conway (1990)). As is standard, define the norm of a linear operator, $L$, to be $\|L\| = \sup_{\|v\| \le 1} \|Lv\|$.

[...] $> 0$ there is an $N$ such that for $n > N$, $\lambda^n < \epsilon$. Thus, for $n > N$, $Q^{2n}e < \epsilon e$. ∎

This result is important because it allows us to calculate preferred share prices. Let $I$ denote the identity operator, i.e. $If = f$. We start with the following theorem:

Theorem 1 Let $L$ be a bounded linear transformation on a Banach space $U$, and suppose that $\|L\| < 1$. Then $(I - L)^{-1}$ exists and satisfies

\[ (I - L)^{-1} = \lim_{N \to \infty} \sum_{n=0}^{N} L^n . \]

Proof: See Puterman (1994, p. 608). ∎

One can verify that the inverse $(I - L)^{-1}$ as defined satisfies $(I - L)(I - L)^{-1} = (I - L)^{-1}(I - L) = I$ as we would like. Applying this result to $Q^2$ we see that $(I - Q^2)^{-1}$ must exist. This leads to the following corollary.

Corollary 2 Suppose that $Q$ is a two-stage contraction on a Banach space. Then $(I - Q)^{-1}$ exists and satisfies

\[ (I - Q)^{-1} = \lim_{N \to \infty} \sum_{n=0}^{N} Q^n . \]

Proof: First, note that $(I - Q^2)^{-1} = \lim_{N \to \infty} \sum_{n=0}^{N} Q^{2n}$. This allows us to write

\[
\lim_{N \to \infty} \sum_{n=0}^{N} Q^{n}
= \lim_{N \to \infty} \sum_{n=0}^{N} Q^{2n} + Q\left( \lim_{N \to \infty} \sum_{n=0}^{N} Q^{2n} \right)
= (I - Q^2)^{-1} + Q(I - Q^2)^{-1}
= (I + Q)(I - Q^2)^{-1}
= (I - Q)^{-1} .
\]

² Let $\mu$ be Lebesgue measure. Then $\operatorname{ess\,sup} u = \inf\{a \in \mathbb{R}_+ \mid \mu\{x \mid u(x) > a\} = 0\}$. We will not distinguish between sup and ess-sup in the remainder of the appendix.

Preferred shares are assumed to pay a dividend of $\delta$ each quarter in perpetuity. Using the state price measure, we calculate the price of this stream of cashflows as $P = \lim_{N \to \infty} \sum_{n=1}^{N} Q^n(\delta e)$; intuitively, the preferred share price is the sum of the prices of discount bonds which mature every three months where the face value is $\delta$. Applying the corollary, we see that this sum converges.
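The convergence of the partial sums $\sum_{n=1}^{N} Q^n(\delta e)$ can be illustrated numerically. The following is a minimal sketch, not the procedure used in the thesis: it assumes a simple grid discretization of $Q$, and the parameter values, the grid, and the function names (`build_q`, `apply_q`, `consol_price`) are all hypothetical choices made for illustration. Each pass computes $P \leftarrow Q(\delta e + P)$ starting from $P = 0$, which reproduces the partial sums one term at a time.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def build_q(grid, a, b, c, gamma, dt):
    """Grid approximation of the pricing operator Q, stored as a list of rows.

    Row i holds state prices for moving from grid[i] to each grid cell.
    All mass below the first cell edge, including the mass the normal law
    puts on negative rates, is assigned to the state r = 0.
    """
    n = len(grid)
    edges = [0.5 * (grid[j] + grid[j + 1]) for j in range(n - 1)]
    q = []
    for r in grid:
        mean = a + b * r
        sd = max(c * r ** gamma, 1e-12)  # guard against a zero spread at r = 0
        cdf = [normal_cdf((edge - mean) / sd) for edge in edges]
        probs = [cdf[0]]
        probs += [cdf[j] - cdf[j - 1] for j in range(1, n - 1)]
        probs += [1.0 - cdf[-1]]
        discount = math.exp(-r * dt)
        q.append([discount * p for p in probs])
    return q


def apply_q(q, f):
    """Price the payoff vector f one period earlier."""
    return [sum(qij * fj for qij, fj in zip(row, f)) for row in q]


def consol_price(q, delta, tol=1e-10, max_iter=20000):
    """Partial sums of sum_{n>=1} Q^n(delta*e), built as P <- Q(delta*e + P)."""
    price = [0.0] * len(q)
    for _ in range(max_iter):
        new_price = apply_q(q, [delta + p for p in price])
        if max(abs(x - y) for x, y in zip(new_price, price)) < tol:
            return new_price
        price = new_price
    raise RuntimeError("partial sums did not converge")


# Hypothetical parameters: annualized rates on [0, 0.30], quarterly time step.
grid = [0.0075 * i for i in range(41)]
Q = build_q(grid, a=0.004, b=0.95, c=0.01, gamma=0.0, dt=0.25)
P = consol_price(Q, delta=1.0)
```

With $\gamma = 0$ the resulting price vector is non-increasing in the current rate, which is the monotonicity property the later sections of this appendix rely on.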
Additionally, it must satisfy the following functional equation:

\[ P = Q(\delta e + P) \]

whose solution is written as $P = (I - Q)^{-1}(Q\delta e)$.

A.2 Existence of Callable Preferred Share Prices

The assumptions underlying the problem of when to optimally call an outstanding issue of preferred shares do not satisfy the typical set of sufficient conditions that would allow us to characterize the optimal exercise decision as the solution to the Bellman equation introduced in chapter 3. The one period discount factor is not constant and the payout received when a share is called is unbounded. Our methodology is based on the approach described in Puterman (1994) [esp. Section 6.10]. We modify his sufficient conditions and show that typical dynamic programming techniques do indeed apply to our problem.

The value of a callable preferred share will be a function of two variables, the short term interest rate and the indirect effect of the call as summarized by our unobserved state variable $\epsilon$. We must, therefore, extend the state space described in the previous section to $S = \mathbb{R}_+ \times \mathbb{R}$. Let $w(\epsilon) = 1 + \epsilon^2$. Using $w$, define a norm on the set of real valued functions defined on the state space as follows:

\[ \|u\| = \sup_{(r,\epsilon) \in S} \frac{|u(r,\epsilon)|}{w(\epsilon)} . \]

Define $U$ as the space of Borel measurable real valued functions $v$ on $S$ satisfying $\|v\| < \infty$. Using standard results, one can show that $U$ is a Banach space.

Consistent with our discussion in chapter 3 we will price payouts, $f$, defined on $S$ using the following state price measure:

\[ Pf(r,\epsilon) = (Qf)(r,\epsilon) = \int_S f(x,y)\, p(y)\, q(x \mid r)\, dx\, dy \]

where $p$ is a probability density function with zero mean and bounded variance and $q$ is the state price measure described earlier. Note that $Q$ is a positive linear functional on $S$. It is easy to show that its norm is given by $Qw$:

Lemma 2 $\|Q\| = Qw$

Proof: Applying the definition of the norm of a linear operator, suppose $u$ solves sup{u | ||u|| [...] $Qu$, where the last inequality follows from the fact that [...] $< 1$ and $Q$ is positive. ∎
• 98 We now characterize the value of a call option on a preferred share given some deterministic Borel measurable call policy d : S -> {0,1}. Define the following operator on U: We interpret P{r) as the price of a non-callable preferred share and c as the value of the unobserved state. Given any policy n = (di,d2,...) we define Lk = Ldlt Our job is to show that Ld defines a contraction on U. We begin with the following property of Q Lemma 3 There is an integer, k, such that \\Qk\\ < 1; ie. Q is a k-stage contraction. Proof: Use the previous lemma and apply the definition of Q to see that = Qnw = Q"(e + e2) = Bn(l + Qu < Qv. • Lemma 5 Given any scalar, c: Ld(u + cw) < Ldu + cQw Lj(u + cw) < Ldu + cQkw Proof: Given any (r, e) 6 S, y / , \ / \ / P{r)-K + € if d(r,e) = l, Ld{u + cw){r,e) = < . v ' K ' 1 Q(u + cw)[r, e) ifa(r,e) = 0 f P{r)-K + e + cQw(r,e) ifd(r,e) = l , - \ Qu(x) + cQw(r,e) ifd(r,e) = 0, = Ldu(r,e) + cQe{r,() The proof of the second inequality follows by induction. Lkn{u + ce)(x) = Ldh{L*(u + ce)){x) f P{r)-K + e if cfc(r,«) = l , \Q{Lk-x{u + cw)){r,e) ifd f c(r,e) = 0 f F(r) -K + 6 + cQkw{r, e) if dk(r, e) = 1 " \ Q{Lk-lu){r,t) + cQkw{r,z) if dfc(r,e) = 0, = Lku(r, e) + cQkw(r, e) 99 To apply the dynamic programming algorithm we need to work with the operator denned by Lv = supdLjv. Note that all v G U are Borel measurable, as is P(r), so that the set A = {r|P(r) — K + e — v(r,e) > 0} is a Borel set. Thus, the decision rule d*(r,e) = x>l(r!c)> where XA is the indicator function for A, solves the optimization problem. Lemma 6 For all u e U and c G R + , L(u + cw) < Lu + cQw and Lk(u + cw) < Lku + cQkw. Proof: L(u + cw) = Ld*(u + cw) < Ld>u + cQw < Lu + cQw. We prove the second inequality inductively. Lk(u + cw) = L(Lk-x{u + cw)) < L(Lk-lu-rcQkw) = Ld>(Lk-1u + cQk-1w) < Ld'{Lk-1u) + cQkw < L(Lk-1u) + cQkw = Lku + cQkw. We are now in a position to show that Ld and L are k-stage contractions. Proposition 2 Ld and L are k-stage contractions on U. ie. 
$\|L^k v - L^k u\| \le \lambda \|v - u\|$, $\lambda < 1$.

Proof: We will prove the result for $L$; the proof for $L_d$ is identical. Let $c = \|v - u\|$. Then $u - cw \le v \le u + cw$. We will work first with the rightmost inequality.

\[ L^k v \le L^k(u + cw) \le L^k u + cQ^k w \le L^k u + c\lambda e \]

where $\lambda = \sup_x Q^k w(r,\epsilon) < 1$. Similarly, $v + cw \ge u \Rightarrow L^k v + c\lambda w \ge L^k u$. Together, these inequalities imply

\[ L^k u - c\lambda e \le L^k v \le L^k u + c\lambda e \]

so that $\|L^k v - L^k u\| \le c\lambda = \lambda\|v - u\|$. ∎

We can now apply the Banach Fixed-point Theorem to show that the standard dynamic programming techniques apply. We state the theorem here for the sake of completeness:

Theorem 2 (Banach Fixed-point Theorem) Suppose $U$ is a Banach space, $L : U \to U$ is a $k$-stage contraction for some $k > 1$ and there exists a $B$, $0 < B < \infty$, for which $\|Lu - Lv\| \le B\|u - v\|$ for all $u$ and $v$ in $U$. Then
1. there exists a unique $v^*$ in $U$ for which $Lv^* = v^*$; and
2. for arbitrary $v^0$ in $U$, the sequence $\{v^n\}$ defined by $v^{n+1} = Lv^n$ converges to $v^*$.

Proof: See Puterman (1994, p. 235). ∎

A.3 Solving for the Optimal Call Policy

This section describes the methodology we employ to solve the Bellman equation. Our technique is almost identical to that described in Rust (1987), with the only differences arising due to technical differences between his decision problem and the one in this paper.³

Our parameterization of the model will be complete if we specify a distribution for the unobservable state, $\epsilon$, in equation 3.4. Several choices are natural and all are equally hard to justify. In order to ease the computational burden, we assume that the $\epsilon$ are independent logistic random variables.⁴ The following characteristic of the logistic distribution is the key to understanding our technique.

Lemma 7 Assume that $\epsilon$ has the logistic distribution. Then $E(\max\{a + \epsilon, b\}) = \log(e^a + e^b)$.

Proof: It is well known (Johnson, Kotz, and Balakrishnan 1994, Ch. 22) that if $\epsilon$ is logistic then there are a pair of iid extreme value random variables, $\nu_1$ and $\nu_2$, such that $\epsilon = \nu_1 - \nu_2$.
Assume, without loss of generality, that the mean of these random variables is zero. Then we have

\[
E(\max\{a + \epsilon, b\}) = E(\max\{a + \nu_1 - \nu_2, b\})
= E(-\nu_2 + \max\{a + \nu_1, b + \nu_2\})
= E(\max\{a + \nu_1, b + \nu_2\})
= \log(e^a + e^b)
\]

where the last equality follows from a well known property of extreme value random variables (see, for example, Johnson, Kotz, and Balakrishnan (1994)). ∎

The choice probabilities associated with the maximization problem referred to in the lemma are given by the following partial derivatives of the expectation function:⁵

\[ p_a = \Pr(a + \epsilon > b) = \frac{\partial}{\partial a} \log(e^a + e^b) = \frac{1}{1 + e^{b-a}} \]

Using the choice probabilities and the functional form for the expectation given above, we note one final property of the logistic distribution:

\[ p_a E(a + \epsilon \mid a + \epsilon > b) = \log(e^a + e^b) - (1 - p_a)b \]

These properties allow us to determine the solution to the Bellman equation in two steps. We repeat the Bellman equation using the compact notation developed in the previous section:

\[ V(r,\epsilon) = \max\left[ P(r) - K + \epsilon,\ (QV)(r) \right] \]

We have used the fact that the state price measure for time $t+1$ does not depend on $\epsilon$ to write the value of the uncalled option when interest rates are $r$ as $(QV)(r)$. Let $\bar{V}$ denote the value of the uncalled option, i.e. $\bar{V} = QV$. We have the following functional equation which identifies the value of this option:

\[ \bar{V}(r) = \int_0^\infty \log\left( e^{P(\tilde{r}) - K} + e^{\bar{V}(\tilde{r})} \right) q(\tilde{r} \mid r)\, d\tilde{r} \]

We solve this equation using either value iteration or policy iteration.

³ Rust's application had constant interest rates, whereas we are explicitly accounting for the effects of stochastic interest rates.
⁴ We have worked through the model assuming the distribution of $\epsilon$ is Normal and Log-Normal. Solving these models numerically is very inefficient, however, since in both we must integrate over the normal cumulative distribution function several times.
⁵ See Rust (1988) and the references therein for a discussion of this property.
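As an illustration of the functional equation for the uncalled option value, the following sketch solves it by successive approximation on a toy problem. Everything numerical here is an assumption made for the example, not taken from the thesis: the three rate states, the transition probabilities, the dividend `DELTA`, and the effective call price `K` are hypothetical, and the integral against the state price measure is replaced by a finite sum over the three states.

```python
import math

# Hypothetical three-state example: low, middle, and high annualized short rates.
rates = [0.02, 0.06, 0.10]
transition = [
    [0.80, 0.18, 0.02],
    [0.10, 0.80, 0.10],
    [0.02, 0.18, 0.80],
]
DT = 0.25     # one quarter
DELTA = 1.0   # quarterly dividend (assumed)
K = 66.0      # call price plus transaction costs (assumed)

# State prices: one-period discount factor times transition probability.
Q = [[math.exp(-r * DT) * p for p in row] for r, row in zip(rates, transition)]


def apply_q(f):
    """Price the payoff vector f one period earlier."""
    return [sum(qij * fj for qij, fj in zip(row, f)) for row in Q]


def fixed_point(step, start, tol=1e-10, max_iter=100000):
    """Iterate v <- step(v) until successive iterates agree to within tol."""
    v = start
    for _ in range(max_iter):
        new_v = step(v)
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    raise RuntimeError("iteration did not converge")


def log_sum_exp(x, y):
    """Numerically stable log(exp(x) + exp(y))."""
    m = max(x, y)
    return m + math.log(math.exp(x - m) + math.exp(y - m))


# Non-callable price: P = Q(delta*e + P).
P = fixed_point(lambda p: apply_q([DELTA + x for x in p]), [0.0] * 3)

# Uncalled option value: V(r) = sum over r' of log(e^{P(r')-K} + e^{V(r')}) q(r'|r).
V = fixed_point(
    lambda v: apply_q([log_sum_exp(p - K, x) for p, x in zip(P, v)]),
    [0.0] * 3,
)

# Logistic choice probability of a call in each state.
call_prob = [1.0 / (1.0 + math.exp(v - p + K)) for p, v in zip(P, V)]
```

In this toy setting the call probability is higher in the low-rate state than in the high-rate state, in line with the intuition that calls become more attractive as rates fall.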
The value iteration procedure is straightforward and described in detail in Puterman (1994). Given a candidate for the price of the uncalled call option, $\bar{V}_t$, simply iterate using the above equation to arrive at a more accurate estimate of the price:

\[ \bar{V}_{t+1}(r) = (\Lambda \bar{V}_t)(r) = \int_0^\infty \log\left( e^{P(\tilde{r}) - K} + e^{\bar{V}_t(\tilde{r})} \right) q(\tilde{r} \mid r)\, d\tilde{r} \]

Application of the policy iteration algorithm is less obvious in this setting; we will now show that it is equivalent to the Newton-Kantorovich method described in Rust (1987). The policy iteration algorithm has two fundamental steps: policy evaluation, and policy improvement. We will start with the policy improvement step assuming we have some initial function, $\bar{V}_0$.⁶ Using this function, we could determine an "improved" call policy by solving the optimization problem state-by-state:

\[ (L\bar{V}_t)(r,\epsilon) = \max\{ P(r) - K + \epsilon,\ \bar{V}_t(r) \} \]

We only need the call probabilities from this optimization problem in order to move on to the policy evaluation step. These are easily calculated:

\[ p_t = \Pr\left( P(r) - K + \epsilon > \bar{V}_t(r) \right) = \frac{1}{1 + e^{\bar{V}_t(r) - P(r) + K}} \]

To complete the policy evaluation step we need to determine the fixed point to the functional equation that describes $\bar{V}_{t+1}$:

\[ \bar{V}_{t+1} = Q\left( p_t E(P - K + \epsilon \mid P - K + \epsilon > \bar{V}_t) + (1 - p_t)\bar{V}_{t+1} \right) \]

As described above, we can calculate the conditional expectation in the fixed point equation using the following relationship:

\[ p_t E(P - K + \epsilon \mid P - K + \epsilon > \bar{V}_t) = \log\left( e^{P - K} + e^{\bar{V}_t} \right) - (1 - p_t)\bar{V}_t \]

Substituting this into the previous equation gives rise to the following:

\[ \bar{V}_{t+1} = Q\left[ \log\left( e^{P - K} + e^{\bar{V}_t} \right) - (1 - p_t)\bar{V}_t + (1 - p_t)\bar{V}_{t+1} \right] \tag{A.1} \]

We rearrange this equation to arrive at the expression for updating the preferred share price:

\[ \bar{V}_{t+1} = \bar{V}_t + \left[ I - Q(1 - p_t) \right]^{-1} (\Lambda \bar{V}_t - \bar{V}_t) \tag{A.2} \]

where $Q(1 - p_t)$ denotes the operator that maps $u$ to $Q[(1 - p_t)u]$. In a similar setting Rust (1987) develops the updating procedure described by equation A.2. His justification is based on finding a zero to the functional equation A.1 using the Newton-Kantorovich method.

⁶ As in Rust (1987), we begin the policy iteration algorithm with an initial function obtained from several applications of the value iteration algorithm.

A.4 Characteristics of the Optimal Call Policy and of Callable Preferred Share Prices

In this section we document some characteristics of the optimal call policy and callable preferred share prices. Our goal is to show why callable preferred shares trade at prices in excess of the strike price when transaction costs are present.

We will begin by once again characterizing the value of the call option on preferred shares. At this point, it is useful to sacrifice some formality and simplify the notation somewhat. To this end, we ignore the stopped state discussed above and write the Bellman equation for the value of the call option on preferred shares as $C = L(C) = \max\{D - K, QC\}$. From the previous section, we have that the non-callable preferred share price is given by the fixed point to the functional equation $D = Q(e + D)$. Using this relationship, we can write the Bellman equation for the value of the callable preferred share as follows:

\[
D^c = D - C = D - \max\{D - K, QC\} = \min\{K, D - QC\} = \min\{K, Q(e + D - C)\} = \min\{K, Q(e + D^c)\}
\]

In the discussion that follows we will work with both the value of the embedded call option, $C$, and the value of callable preferred shares, $D^c$, depending on which is more convenient. We will need to refer to the value functions for a given policy and strike price. To this end, we introduce the notation $C_{Kd}$ and $C_K = \max_d C_{Kd}$. $C_{Kd}$ simply denotes the value of a call option with strike price $K$ under call decision rule $d$ and therefore solves $C_{Kd} = L_d(C_{Kd})$. $C_K$ is the value of the call option under the optimal call policy and solves $C_K = L(C_K)$.

Our first task is to show that an optimal policy is for managers to call if the short-term riskless rate is less than or equal to some critical rate. We will assume throughout that $0 < K < \sup_x\{D(x)\}$ so that our problem has a non-trivial solution. We begin by showing that the value of a callable preferred share is a non-increasing function of the interest rate.
(LVt - Vt) (A.2) = D-C = D - max{D - K, Qv} = m'm{K,D-Qv} = min{K,Q(e + D -v)} = mm{K,Q(e + Dc)} 103 Lemma 8 Suppose Q has the property that x' > x Qv(x') > Qv(x) given any non-increasing function v. Then the preferred share price D° is non-increasing in x. Proof: In the previous section we proved that starting with any arbitrary function u° G U the value iteration procedure converges to the fixed point of the Bellman equation. Let u° = 0 and recursively define un = min{K,Q(e + t i " - 1 ) } . We can show inductively that un is non-increasing. Since e and un~l are non-increasing, Q(e + u°) is non-increasing. Furthermore, the minimum of two non-increasing functions is non-increasing; thus, for all n, u" is non-increasing. Finally, since Dc = l i n i n - K x , w n, Dc is non-increasing. • Using the fact that the callable preferred share prices are non-increasing, we can now show that the optimal call policy has the form: We refer to r* as the critical interest rate for callable preferred shares with call price K. Corollary 3 The optimal policy is of the form given by equation (A.3). Proof: Since Dc is non-increasing, Q(e + Dc) is non-increasing. At r*, K = Q(e + Dc)(r*) (ie. the value of the uncalled share equals the value of the called share). Therefore, for all x < r*, Q(e + Dc)(x) > K indicating it is also optimal to call at these interest rates. • Our next task is to show that preferreds with higher call prices have optimal call policies with lower critical interest rates. We begin by placing some bounds on the values of call options with different strike prices. Lemma 9 Let K = K + T, T > 0. Then Cic-Te CKd'-Te = CK — Te 104 and we have established the leftmost inequality. Let d solve CK = max{7C, QCK}. Then CK > C K I > cK and we have established the rightmost inequality. • This Lemma allows us to place the following relative bounds on the callable preferred share prices: DCK < < DK + Te. 
These inequalities are the key to proving that when transaction costs are present managers call at a lower interest rate relative to the optimal call policy when no transaction costs are present. Proposition 3 Suppose the critical interest rates for callable preferred shares with call prices K and K + T are r* and r respectively (see equation (A.3)). Then f < r*. Proof: We first establish that r < r*. Assume that r > r*. By definition D°K(r*) = K and DcK(r) = (QDcK)(r) < K. But since D°k(r) = K + T we have Dc^(r) - DcK(r) > T which contradicts the inequality established in Lemma 9. Now suppose that f = r*. Again by definition DcR(r) = K + T = Q(e + Dk)(r) DK(r) = K = Q(e + DcK)(f). Taking the difference we have T = QDk{f)-QDK{r) = Q{DR-QDK){r) < Q{Te){f) < T again a contradiction. • We finish this section by showing that when managers follow the policy that is optimal for calling preferred shares when transaction costs are present, the market price of the preferred shares will exceed the call price. This result follows from the fact that the optimal call policy is unique. Lemma 10 Assume the consol bond price is strictly decreasing. Given a preferred share with call price K, the optimal call policy is unique. Proof: Assume that there are two distinct policies with critical interest rates, r < r*, that minimize the same callable preferred share price. Let u and v be the associated callable pre-ferred share prices. Since both solve the Bellman's equation it must be the case that u = v. Thus, we have that D(f) — K = u(r) — v(r*) — D(r*) — K which contradicts our assumption that the consol bond price is strictly decreasing. • 105 Let d, with associated critical interest rate r, denote the optimal call policy when transaction costs are present and suppose that the critical interest rate when no transaction costs are present is r*. 
An implication of the above Lemma is that $D^{\hat d}_K \ge D^c_K$ and $D^{\hat d}_K \ne D^c_K$.$^7$

$^7$ At this point it is important to be clearer about how we define equality on the space $U$. If $u, v \in U$ then $u \ne v$ means that there is a set, $A$, which has non-zero measure and on which for all $x \in A$, $u(x) \ne v(x)$ (i.e. $u = v \Leftrightarrow u(x) = v(x)$ a.e. $x$).

Proposition 4 Define $\hat d$, with associated critical interest rate $\hat r$, and $r^*$ as in Proposition 3. Then for $r > \hat r$, $D^{\hat d}_K(r) > D^c_K(r)$.

Proof: $Q$ is strictly positive, $D^{\hat d}_K \ge D^c_K$ and $D^{\hat d}_K \ne D^c_K$. Therefore, $Q(D^{\hat d}_K - D^c_K) > 0$. Thus, if $r > \hat r$, $D^{\hat d}_K(r) = (QD^{\hat d}_K)(r) > (QD^c_K)(r) \ge D^c_K(r)$. □

We finish the section with our main result:

Corollary 4 If $r \in (\hat r, r^*]$ then $D^{\hat d}_K(r) > K$.

Proof: Follows from Proposition 4 and the definition of $D^c_K$, since $D^c_K(r) = K$ when $r \in (\hat r, r^*]$. □

A.5 The Q Operator for Specific Models of Interest Rate Dynamics

The purpose of this section is to verify that the operator $Q$ described in the previous two sections has the properties we assumed as sufficient to establish the results. We show that the spectral radius of $Q$ is less than one, that $Q$ maps non-increasing functions into non-increasing functions, and that the consol bond price is strictly decreasing.

The state variable dynamics we study in this paper are given by the following dynamic equation:

$r_{t+1} = \alpha + b r_t + \sigma r_t^{\gamma} u_{t+1}$ (A.4)

where $b < 1$ and $u_{t+1} \sim$ iid $N(0,1)$. We truncate the state space below at zero by defining $r_{t+1} = 0$ if the right hand side of the above equation is less than zero. Define $Q$ as

$Qv(x) = e^{-x}\left( v(0)\int_{-\infty}^{0} f(r|x)\,dr + \int_{0}^{\infty} v(r) f(r|x)\,dr \right)$

where $f(r|x)$ is the conditional density function for a normal random variable with mean $\alpha + bx$ and standard deviation $\sigma x^{\gamma}$. $Q$ is clearly strictly positive.

We will first establish that discount bond prices in the truncated model are bounded by prices from the model where the interest rate is unrestricted. Let the $t$-period discount bond prices from the restricted and unrestricted model be denoted $P_t$ and $\tilde P_t$ respectively.

Lemma 11 Given any $t > 0$, $P_t \le \tilde P_t$.
Proof: Write $P_t$ for the bond price in the truncated model and $\tilde P_t$ for the price in the unrestricted model. The inequality is obvious for $t = 1$ since for $x \ge 0$, $P_1(x) = e^{-x} = \tilde P_1(x)$, and for $x < 0$, $\tilde P_1(x) = e^{-x} > 1 = P_1(0)$. Suppose these inequalities have been established for $t$-period bonds. Then

$P_{t+1}(x) = e^{-x}\left( P_t(0)\int_{-\infty}^{0} p(r|x)\,dr + \int_{0}^{\infty} P_t(r)p(r|x)\,dr \right) \le e^{-x}\left( \int_{-\infty}^{0} \tilde P_t(r)p(r|x)\,dr + \int_{0}^{\infty} \tilde P_t(r)p(r|x)\,dr \right) = \tilde P_{t+1}(x).$ □

Campbell, Lo, and MacKinlay (1997) show how to solve for bond yields in the unrestricted model. It is straightforward to apply their techniques to the interest rate dynamics we consider here. The result is a set of recursive formulae that give discount bond yields. We will not replicate these calculations here but note that it is necessary to restrict the parameters of the term structure dynamics in order to ensure that the limiting yield is strictly positive. Thus, for the state variable dynamics we consider, there are parameter restrictions which ensure that the spectral radius of the operator $Q$ is strictly less than one.

Next, we show that $Q$ maps non-increasing functions into non-increasing functions. It is sufficient to show that this property holds for step functions of the form $\{\chi_A, A = [0, a]\}$ since on the space $U$, non-increasing functions are the limit of such characteristic functions. We first show that when $\gamma = 0$, $Q$ maps non-increasing characteristic functions into decreasing functions.

Proposition 5 Suppose $\gamma = 0$ in equation (A.4). Given any set $A = [0, a]$ with corresponding characteristic function $\chi_A$, for all $x$ and $x'$ with $x' > x$, $Q\chi_A(x) > Q\chi_A(x')$.

Proof: Let $F(\cdot)$ denote the CDF of the standard normal distribution. If $x' > x$ then

$\frac{a - \alpha - bx'}{\sigma} < \frac{a - \alpha - bx}{\sigma}.$

Therefore,

$Q\chi_A(x) = e^{-x}\int_{-\infty}^{a} p(r|x)\,dr = e^{-x}F\left(\frac{a - \alpha - bx}{\sigma}\right) > e^{-x'}F\left(\frac{a - \alpha - bx'}{\sigma}\right) = Q\chi_A(x').$ □

This simple proof does not work when the interest rate dynamics have $\gamma > 0$. Given these dynamics, characteristic functions with $a < \alpha$ map into non-monotonic functions of the state variable.$^8$ We can overcome this problem if we redefine the state space as $S = [\alpha, \infty)$.
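The contrast between the two cases can be seen numerically. The sketch below evaluates $Q\chi_A(x) = e^{-x}F\left((a - \alpha - bx)/(\sigma x^{\gamma})\right)$ on a grid of strictly positive rates, once with $\gamma = 0$ and once with $\gamma = 0.5$. The parameter values and the name `q_indicator` are assumptions chosen for illustration, with $a < \alpha$ so that the non-monotonic case appears; they are not estimates from the thesis.

```python
import math
import numpy as np

def q_indicator(x, a, alpha, b, sigma, gamma):
    """Q applied to the indicator of A = [0, a]:
    exp(-x) * F((a - alpha - b*x) / (sigma * x**gamma)), for x > 0."""
    x = np.asarray(x, dtype=float)
    z = (a - alpha - b * x) / (sigma * x**gamma)
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    return np.exp(-x) * cdf

# Illustrative values only (assumptions); note a < alpha.
x = np.linspace(1e-4, 0.2, 200)
params = dict(a=0.0005, alpha=0.001, b=0.95, sigma=0.05)

flat_vol = q_indicator(x, gamma=0.0, **params)    # gamma = 0 case of Proposition 5
level_vol = q_indicator(x, gamma=0.5, **params)   # gamma > 0 case

print("gamma = 0   strictly decreasing:", bool(np.all(np.diff(flat_vol) < 0)))
print("gamma = 0.5 has increasing steps:", bool(np.any(np.diff(level_vol) > 0)))
print("gamma = 0.5 has decreasing steps:", bool(np.any(np.diff(level_vol) < 0)))
```

With $\gamma > 0$ the conditional standard deviation shrinks as $x$ approaches zero, so for $a < \alpha$ almost no probability mass falls in $A$ at very low rates; the function first rises and then falls as $x$ increases.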
In the models we consider, $\alpha \approx 0.1\%$ so that the truncation does not eliminate much of the state space.

It is easy to see that the consol bond price in these models will be strictly decreasing. Proposition 5 shows that $Q$ maps non-increasing functions into strictly decreasing functions. Furthermore, since $Q$ is a contraction mapping, $D = \lim_{n \to \infty} Q^n e$ is non-increasing. $D$ must be strictly decreasing since it must satisfy $D = Q(e + D)$ (i.e. it is the sum of two strictly decreasing functions).

A.6 Making the State Space Discrete

In this section, we describe our methodology for making the state space for the interest rate discrete. We effectively use the method employed by Stanton (1995). We need to partition the true state space, $\mathbb{R}_+$, into $n$ intervals. Choose some closed sub-interval, $A = [a_0, a_1]$, of $(0,1)$. Let $\{y_i = a_0 + (a_1 - a_0)i/n,\ i = 0, \ldots, n\}$ be the endpoints of $n$ equal sized intervals that partition $A$. Now apply the transformation $r_i = 1/y_i$ and we have a partition of the interval $[1/a_1, 1/a_0]$. The set of intervals $B = \{[r_i, r_{i+1}] \mid i = 0, \ldots, n\}$ now represents the states of the discretized system. The midpoint of each interval in $B$, $\bar r_i$, is the "representative" value of $r$ for that state; therefore, to form the discrete transition probabilities for each state we calculate

$\Pr(\bar r_j \mid \bar r_i) = \Pr(r \in [r_j, r_{j+1}] \mid \bar r_i).$

$^8$ To see this, write $Q\chi_A(x) = e^{-x}F\left(\frac{a - \alpha - bx}{\sigma x^{\gamma}}\right)$ and differentiate with respect to $x$. When a *