Three Essays in Macroeconomics and International Economics

by

Yao Tang

B.A., Beijing Second Foreign Language Institute, 1998
M.A., Simon Fraser University, 2003

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

The Faculty of Graduate Studies (Economics)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

October 2009

© Yao Tang 2009

Abstract

This dissertation examines two issues in international economics and macroeconomics. The first is to understand the response of productivity to major real exchange rate appreciations and the second concerns how to compare the fits of different calibrated macroeconomic models.

In the first chapter, I construct a model to clarify how the increased competition due to an exchange rate appreciation provides an incentive for firms to improve productivity. However, if a firm is in an industry shielded by a high trade cost, then the incentive is weaker. In industries with fewer firms, profits are more responsive to productivity improvements; therefore, firms are more likely to invest heavily in productivity improvement. Empirical analysis of Canadian manufacturing data from 1997 to 2006 finds evidence consistent with the model predictions.

The second chapter presents testing procedures for comparison of misspecified calibrated models. The proposed tests are of the Vuong-type (Vuong, 1989; Rivers and Vuong, 2002). In the framework here, an econometrician selects values for the parameters in order to match some characteristics of the data with those implied by the theoretical model. We assume that all competing models are misspecified, and suggest a test for the null hypothesis that all considered models provide equal fit to the data characteristics, against the alternative that one of the models is a better approximation.
The Carlstrom and Fuerst (1997) model and the Bernanke, Gertler and Gilchrist (1999) model are two leading models that study financial frictions in macroeconomic models. In particular, these models show that due to financial frictions, net worth plays an important role in obtaining external finance, and that at an aggregate level, net worth can propagate technology shocks and monetary shocks. However, neither paper examines whether the models can reproduce cyclical properties of net worth. The third chapter addresses this issue by applying the comparison method developed in the second chapter. Results indicate both models do reasonably well. In addition, price rigidity seems to play an important role in the latter model. However, both models can only partially capture the positive correlation between risk premium and net worth.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
Dedication
Statement of Co-Authorship
1 Introduction
    Bibliography
2 Does Productivity Respond to Exchange Rate Appreciations? A Theoretical and Empirical Investigation
    2.1 Introduction
    2.2 Basic Model Setup
    2.3 Exchange Appreciation and Investment Decision
    2.4 Manufacturing Productivities in Canada
        2.4.1 Specification and Data
        2.4.2 Main Results
        2.4.3 Robustness Checks
    2.5 Conclusion
    Bibliography
3 Comparison of Misspecified Calibrated Models: The Minimum Distance Approach
    3.1 Introduction
    3.2 Definitions
    3.3 Properties of the CMD Estimators of Structural Parameters
    3.4 Model Comparison
        3.4.1 Nested Models
        3.4.2 Strictly Non-nested Models
        3.4.3 Overlapping Models
    3.5 Model Comparison with Estimation and Evaluation ...
    3.6 Averaged and Sup Tests for Model Comparison, ...
        3.6.1 Averaged and Sup Tests
        3.6.2 Confidence Sets for Weight Matrices
    3.7 Application
        3.7.1 CIA Model
        3.7.2 PAC Model
        3.7.3 Model Estimation and Comparison Results
    3.8 Proofs of Theorems
    Bibliography
4 An Exploration of the Role of Net Worth in Business Cycles
    4.1 Introduction
    4.2 Cyclical Properties of Net Worth
    4.3 Overviews of Two Competing Models
        4.3.1 The Carlstrom and Fuerst (1997) Model
        4.3.2 The Bernanke, Gertler and Gilchrist (1999) Model
    4.4 Comparison of the Models
        4.4.1 Informal Comparison of the Models
        4.4.2 Overview of the Formal Comparison Methodology
        4.4.3 Formal Comparison of the Models
    4.5 Conclusion
    Bibliography
5 Conclusion
    Bibliography

List of Tables

2.1 Means of Key Variables between 1997 and 2006
2.2 Benchmark Fixed Effect Estimations
2.3 Alternative Dependent Variables
2.4 Alternative Specification of Lags
2.5 Effects of Entry and Exit of Establishments
2.6 Other Robustness Checks
3.1 CIA and PAC Parameters’ Estimates and Their Standard Errors
4.1 Net Worth, GDP, Consumption, and Investment
4.2 Net Worth and Investment in Greater Details
4.3 Moments of Calibrated Models
4.4 Parameters Estimates of CF and BGG
4.5 Moments of Estimated Models

List of Figures

2.1 An Illustration of the Industrial Organization
2.2 The Benefit and Cost of Adopting the Disruptive Technology
2.3 Illustration of The Relation between $n_i$ and Choice of $\sigma$
2.4 Level of Technology Adoption and Number of Firms in the Industry
2.5 Movements of Canadian Dollar Exchange Rate Since 1990
3.1 Model Prediction Errors of the Inflation Impulse Responses with 95% Confidence Bands
3.2 Model Prediction Errors of the Output Impulse Responses with 95% Confidence Bands
4.1 Net Worth and GDP, Consumption, Hours Worked and Investment
4.2 Net Worth and Investment in Greater Details
4.3 Net Worth and Interest Rates

Acknowledgements

First thanks go to Paul Beaudry for his supervision and support. I am also grateful for help and advice from Michael Devereux, Amartya Lahiri, Vadim Marmer, Viktoria Hnatkovska, Anji Redish, Henry Siu, and colleagues and staff of the UBC economics department.

Dedication

To my parents and my wife.

Statement of Co-Authorship

Chapter 3 of my dissertation is joint work with Viktoria Hnatkovska and Vadim Marmer. I have been actively involved in the identification and design of the research program. In terms of execution, I verified the applicability of the proposed econometric method to macroeconomic models and provided an application. I am also responsible for the collection of data in the application section and for the manuscript draft of that section. Chapters 2 and 4 are my own independent research.

Chapter 1

Introduction

This dissertation examines two issues in macroeconomics and international economics. The first issue is whether productivity responds to major real exchange rate appreciations and the second is how to compare the fits of different calibrations in macroeconomics. The second chapter addresses the former issue. The third chapter develops a method to answer the latter and the fourth chapter applies the method to compare two macroeconomic models with financial frictions.
The second chapter studies how exchange rate appreciations affect productivity growth. While there is a large literature (see Obstfeld and Rogoff (1996)) on the causal effect of productivity on the exchange rate, the reverse effect has not been studied, except by Fung (2008). In this chapter, I address this question both theoretically and empirically. In the theoretical model, I adapt the assumption of disruptive technological change of Holmes, Levine and Schmitz (2008) to clarify the effect of increased competition due to an exchange rate appreciation. One of the costs of adopting a cost-reducing technology is the profit loss due to a temporarily high marginal cost of production during the transition. When the exchange rate appreciates, there is less profit to be made and the profit loss due to adopting the new technology is also smaller. However, if firms are in an industry shielded by high trade costs, then their profitability will be less influenced by an appreciation and the incentive to improve productivity provided by an appreciation will be smaller. The incentive also depends on the concentration of the industry. In industries with fewer firms, profits are more responsive to productivity improvements, so firms are more likely to invest more in productivity improvement. Testing the predictions with Canadian manufacturing data from 1997 to 2006, I find that within the group of highly traded Canadian industries, the more concentrated ones experienced larger growth in labour productivity during the period of Canadian dollar appreciation. The empirical analysis controls for energy use growth, material use growth, R&D expenditure growth, productivity growth in corresponding US industries, industry fixed effects and year-specific effects, and the results are robust to various specifications.

The third chapter presents testing procedures for comparison of misspecified calibrated models.
The proposed tests are of the Vuong-type (Vuong, 1989; Rivers and Vuong, 2002). In the framework here, an econometrician selects values for the parameters in order to match some characteristics of the data with those implied by the theoretical model. All competing models are assumed to be misspecified, and the chapter suggests a test for the null hypothesis that all considered models provide equivalent fit to the data characteristics, against the alternative that one of the models is a better approximation. This chapter considers both nested and non-nested cases. The discussion includes the situation in which parameters are estimated to match one set of moments and the model is evaluated by its ability to match another. This chapter also relaxes the dependence of the ranking of the models on the choice of weight matrix by suggesting averaged and sup procedures. The proposed method is applied to a comparison of cash-in-advance and portfolio adjustment cost models. The constructed test statistic indicates that, compared to the cash-in-advance model, impulse responses generated by the portfolio adjustment cost model do not provide a better approximation for the output and price dynamics in the data.

The fourth chapter applies the method developed in the third chapter to examine which of the Carlstrom and Fuerst (1997) model and the Bernanke, Gertler and Gilchrist (1999) model better captures the cyclical properties of net worth. The models in these papers are two leading models that study financial frictions in macroeconomic models. In particular, these models show that due to financial frictions, net worth plays an important role in obtaining external finance, and that at an aggregate level, net worth can propagate technology shocks and monetary shocks. Intuitively, a higher level of net worth helps a firm to obtain external finance, as it allows the firm to post more collateral and to have interests more in line with those of creditors.
However, neither paper examines whether the models can reproduce cyclical properties of net worth. This chapter documents the cyclical properties of net worth and shows that it is pro-cyclical, and that its co-movements with GDP, investment and interest rates seem to have undergone changes around the 1980s. Then, I apply the econometric test developed in chapter 3 to compare the quantitative performance of the Carlstrom and Fuerst (1997) model and the Bernanke et al. (1999) model in replicating the cyclical properties of net worth. The results indicate that both do reasonably well. In addition, price rigidity seems to play an important role in the Bernanke et al. (1999) model, as it improves the quantitative performance of the model significantly. However, the models can only partially capture the positive correlation between risk premium and net worth.

Bibliography

Bernanke, Ben S., Mark Gertler, and Simon Gilchrist, “The financial accelerator in a quantitative business cycle framework,” in J. B. Taylor and M. Woodford, eds., Handbook of Macroeconomics, Vol. 1 of Handbook of Macroeconomics, Elsevier, 1999, chapter 21, pp. 1341–1393.

Carlstrom, Charles T. and Timothy S. Fuerst, “Agency Costs, Net Worth, and Business Fluctuations: A Computable General Equilibrium Analysis,” American Economic Review, December 1997, 87 (5), 893–910.

Fung, Loretta, “Large real exchange rate movements, firm dynamics, and productivity growth,” Canadian Journal of Economics, May 2008, 41 (2), 391–424.

Holmes, Thomas J., David K. Levine, and James A. Schmitz, “Monopoly and the Incentive to Innovate When Adoption Involves Switchover Disruptions,” NBER Working Paper, 2008, No. W13864.

Obstfeld, Maurice and Kenneth S. Rogoff, Foundations of International Macroeconomics, Vol. 1 of MIT Press Books, The MIT Press, 1996.

Rivers, D. and Q. Vuong, “Model Selection Tests For Nonlinear Dynamic Models,” Econometrics Journal, 2002, 5 (1), 1–39.
Vuong, Quang H., “Likelihood Ratio Tests For Model Selection and Non-Nested Hypotheses,” Econometrica, 1989, 57 (2), 307–333.

Chapter 2

Does Productivity Respond to Exchange Rate Appreciations? A Theoretical and Empirical Investigation¹

¹ A version of this chapter will be submitted for publication. Tang, Yao, “Does Productivity Respond to Exchange Rate Appreciations? A Theoretical and Empirical Investigation”.

2.1 Introduction

Substantial exchange rate movements over the last decade have raised an important question: What are the impacts of a major real exchange rate appreciation on firm performance? Conventional wisdom suggests that such appreciation worsens terms of trade and weakens the competitiveness of home firms. Meanwhile, the possibility remains that to maintain competitiveness, firms will be forced to raise productivity by reducing their costs.

This chapter addresses theoretically and empirically the question of whether manufacturing productivity responds to real appreciations. First, I construct a model in which currency appreciations can provide incentives for firms to improve productivity if they are highly exposed to trade. The model also predicts that among highly traded industries the highly concentrated ones will invest more in productivity improvements since the marginal benefits of productivity gain will be greater for firms with a larger market share. Second, I test the predictions empirically by using Canadian manufacturing data from 1997 to 2006. The results confirm that manufacturing productivity growth responded positively to the appreciation of the Canadian dollar between 2002 and 2006. Within industries exposed to a substantial amount of trade, the highly concentrated ones experienced a larger gain in labour productivity during the appreciation period.

In addressing the research question, this chapter makes two contributions.
Theoretically, it studies whether firms will improve productivity by adopting new technologies to counter the effect of appreciations, and what type of firms will invest more in new technologies. Empirically, the estimates in this chapter suggest the productivity responses of Canadian manufacturing industries to appreciations were positive and significant during the Canadian dollar appreciation between 2002 and 2006.

In a neoclassical framework, profit maximization by firms automatically implies cost minimization. However, some economists have long argued that product market competition forces firms to lower costs and thus improve productivity. Nickell (1996) contains a review of earlier contributions along this line of thinking. Some of the theoretical models are based on contract theory, for example Hart (1983) and Raith (2003). Vives (forthcoming) examines a wide variety of industrial organization models, and concludes that, in general, increased competition encourages product and process innovations. Holmes, Levine and Schmitz (2008) provide a simple setup to explain the positive relation between competition and adoption of new technology, based on the empirical observation that technology changes are often disruptive in the sense that the transition to higher productivity often features initially higher marginal costs.

This chapter adapts the Holmes et al. (2008) assumption of disruptive technological change to clarify the effect of increased competition due to real exchange rate appreciations on productivity. In the model, one of the costs of adopting a cost-reducing technology is profit loss due to a temporarily high marginal cost of production during the transition. When the exchange rate appreciates, there is less profit to be made, and so profit loss due to adopting new technology is also smaller.
However, if firms in an industry are shielded by high trade costs, then their profitability is less influenced by appreciations, and the incentive to improve productivity provided by appreciations is smaller.

Unlike Holmes et al. (2008) and other previous papers, which focus on when firms are likely to adopt new technologies to improve productivity, this chapter also studies what types of firms are likely to invest more in productivity improvement. The model predicts that within the set of industries subject to low trade costs, productivity improvement is positively correlated with the concentration level of the industry. In industries with fewer firms, the marginal benefits of productivity improvements are greater, so firms in these industries are likely to invest more in productivity improvements.

A number of studies provide evidence of a positive correlation between competition and productivity improvement, with competitive pressure measured as the number of competitors, concentration ratio, trade barriers, or the effect of competition policy. MacDonald (1994) finds that import competition improved productivity in highly concentrated US industries. Nickell (1996) suggests that an increase in the number of competitors was associated with total factor productivity (TFP) gain in a sample of 700 firms in the UK. Symeonidis (2008) exploits the variation arising from the introduction of anti-cartel laws in UK industries, and finds that collusion reduced industry-level productivity growth. Galdon-Sanchez and Schmitz (2002) and Syverson (2004) are two papers that focus on individual industries. The former paper investigates Canadian and American iron ore producers, who doubled labour productivity and increased material efficiency by 50% in response to intense price competition from Brazilian firms.
The latter paper examines ready-mixed concrete plants in the US, and finds that an increase in local competition led to higher average productivity and lower productivity dispersion.

As far as I am aware, Fung (2008) is the only previous paper that looks into the effect of major appreciations on productivity. The productivity gain in Fung (2008) comes from the exit of less efficient firms and the larger production scale of surviving firms after a major appreciation. Relative to her paper, this chapter focuses on the channel of firms’ efforts to improve the efficiency of production. Controlling for exits and production scale, this chapter finds empirical evidence that the competitive pressure of appreciations encourages firms to improve productivity.

To test the predictions of the theoretical model, I use data on 237 Canadian manufacturing industries between 1997 and 2006 to study how industry-level labour productivity growth interacts with exchange rate movements, concentration, and trade costs. Although the sample period is restricted by data availability, the Canadian dollar experienced such substantial movements in the period as to allow us to investigate the productivity response to a major appreciation. I find that growth rates of labour productivity, measured as value added per production worker, were on average higher during the Canadian dollar appreciation between 2002 and 2006, suggesting that the manufacturing industries responded to the competitive pressure of the appreciation. Within the industries with a high trade-to-revenue ratio, the highly concentrated ones experienced greater growth in labour productivity. The empirical analysis controls for energy use growth, material use growth, R&D expenditure growth, productivity growth in corresponding US industries, industry fixed effects, and GDP growth rates in Canada and the US. These findings also add to the stock of evidence demonstrating the relation between competition and productivity.
In economic policy circles, some (e.g. Porter (1990)) suggest that a “hard currency”, meaning a currency less prone to depreciation, can contribute to higher productivity growth. Harris (2001) argues that the Canadian dollar depreciation in the 1990s was partially responsible for the Canadian productivity decline. By providing both a model linking the exchange rate and productivity, and an empirical analysis, this chapter illustrates a channel through which exchange rate policy and competition policy can affect productivity. The empirics supply an assessment of the existence and magnitude of the effect of currency appreciations on labour productivity.

The next section lays out the modeling environment. Section 3 introduces the technological opportunity for home firms to improve productivity, and examines how home firms’ choices interact with an appreciation. Section 4 tests the model predictions on Canadian manufacturing data and section 5 concludes.

2.2 Basic Model Setup

There are two countries, the home (h) and the foreign (f), and each has a representative household. The two households have the same given wealth $W$ and consume a continuum of goods indexed by $i$ with $i \in [0, 1]$.² Labour supplies in both countries are perfectly inelastic.

² The model is a partial equilibrium one. In Tang (2008), I endogenize the income of the households. Since the purpose of the chapter is to explore the interaction between market concentration and productivity during exchange rate appreciations, assuming that firms take aggregate expenditure $W$ as given is a useful simplification.

The home household’s problem is to maximize

$$\sum_{t=1}^{2} \beta^{t-1} \int_0^1 \log(C_{it})\, di$$

subject to the life-time budget constraint

$$\sum_{t=1}^{2} \beta^{t-1} \int_0^1 P_{it} C_{it}\, di \le W \tag{2.1}$$

$C_{it}$ denotes the quantity of good $i$ and $P_{it}$ is its price.
Similarly, the foreign household maximizes

$$\sum_{t=1}^{2} \beta^{t-1} \int_0^1 \log(C^*_{it})\, di$$

subject to the life-time budget constraint

$$\sum_{t=1}^{2} \beta^{t-1} \int_0^1 P^*_{it} C^*_{it}\, di \le W^* \tag{2.2}$$

Following the convention in international economics, the superscript $*$ denotes variables in the foreign country. The household preferences determine the demand functions for good $i$ in both countries

$$C_{it} = \frac{W/(1+\beta)}{P_{it}} \tag{2.3}$$

$$C^*_{it} = \frac{W/(1+\beta)}{P^*_{it}} \tag{2.4}$$

where $W/(1+\beta)$ is normalized to be 1.

For each good $i$, there are $n_i$ home firms and $n_i$ foreign firms who can produce it. I will refer to these firms as firms in industry $i$. In both periods, all home firms are endowed with a constant marginal cost of $c_{iht} = c_h$ units of labour and the foreign firms are endowed with a constant marginal cost of $c_{ift} = c_f$. Thus in the model, home and foreign labour productivities in any industry are $1/c_h$ and $1/c_f$. Labour is the only input and is not mobile across countries. Every good is tradable, subject to an iceberg trade cost $\tau_i$ for good $i$, meaning that for every $\tau_i$ units of good $i$ shipped to the other country only one unit will arrive. $\tau_i$ and $n_i$ are drawn from the joint CDF $F(\tau, n)$ with support $[1, \infty) \times [1, 2, \cdots, n]$.³

The market structure within each industry is similar to that found in Brander and Krugman (1983). The home firms and foreign firms of industry $i$ produce using labour in their respective countries. However, they are free to sell their production in both countries. For a given period, the home and foreign firms of industry $i$ play a Cournot game in the home market to determine the quantities of good $i$ produced by each firm for the home market. Simultaneously, the same firms also compete in a Cournot game in the foreign market. As mentioned before, in all periods both the home and foreign firms face an iceberg trade cost $\tau_i$ when they sell in the non-native market. Figure 2.1 illustrates the market structure.

³ In this model, the number of firms in an industry is exogenously given.
This treatment can be viewed as a simplification of the case where firms can enter and exit an industry freely and the number of firms in equilibrium is determined by the exogenous fixed cost of entry.

[Figure 2.1 (diagram): the home household consumes $C_{it} = x^1_{iht} + x^2_{iht} + x^1_{ift} + x^2_{ift}$ and the foreign household consumes $C^*_{it} = x^{1*}_{iht} + x^{2*}_{iht} + x^{1*}_{ift} + x^{2*}_{ift}$; the goods are supplied by the home and foreign firms of industry $i$, located along the home and foreign industry lines indexed from 0 to 1.]

There are two countries and both produce the same continuum of consumption goods indexed by $i$ with $i \in [0, 1]$ at time periods $t = 1, 2$. Every good is tradable subject to an iceberg trade cost $\tau_i$ for good $i$. In industry $i$ there are $n_i$ home firms and $n_i$ foreign firms who produce good $i$. The home firms and foreign firms of industry $i$ produce with labour in their respective country, and sell their production in both countries. In each period, the home and foreign firms of industry $i$ play a Cournot game in the home market to determine the quantities of good $i$ output. Similarly the same firms also compete in a Cournot game in the foreign market.

Figure 2.1: An Illustration of the Industrial Organization

The problem⁴ of home firm $j$ of industry $i$ is

$$\max_{x^j_{ih1},\, x^{j*}_{ih1},\, x^j_{ih2},\, x^{j*}_{ih2}} \Pi^j_{ih} = \pi^j_{ih1} + e_1 \pi^{j*}_{ih1} + \beta\left(\pi^j_{ih2} + e_2 \pi^{j*}_{ih2}\right) \tag{2.5}$$

where $x^j_{ih1}$ and $x^{j*}_{ih1}$ are the quantities it produces for home and foreign markets in period 1, and $x^j_{ih2}$ and $x^{j*}_{ih2}$ are the quantities for home and foreign markets in period 2. $\pi^j_{ih1}$ and $\pi^j_{ih2}$ are profits from the home market in periods 1 and 2. $\pi^{j*}_{ih1}$ and $\pi^{j*}_{ih2}$ are profits from the foreign market, measured in the foreign currency. $e_1$ and $e_2$ are the exchange rates in the two periods. They are defined as the price of one unit of foreign currency in terms of home currency, so a decrease in $e_t$ is an appreciation of the home currency.
Nominal money supplies are constant for both periods in both countries, so if we define a country’s real balance as the nominal money supply divided by the wage rate and normalize wages to 1, $e_t$ is also the real exchange rate. The exchange rates are determined exogenously and known to all firms at the beginning of period 1.

⁴ I assume firms discount the future at the rate of time preference of the household, who is also the owner of the firms. In reality, firms may differ in their discount factors. For firms who place little value on the future, there is very little incentive to adopt a technology that will bring a future benefit, holding other factors constant. The objective function also features no expectation operator, as I assume firms have perfect foresight of the future. While expectations play an important role in decisions, I choose to suppress them here so as to focus the discussion on how the exchange rate lowers the opportunity cost of adopting new technology. In the empirical section, it is argued that firms in Canada have a good idea about the path of the exchange rate, since appreciations tend to be persistent and commodity prices are a good forecaster of the exchange rate of the Canadian dollar.

At the beginning of period 1 all firms observe each other’s marginal costs for all periods. Then all firms in industry $i$ play a game to determine quantities of output in the four markets (home and foreign markets in periods 1 and 2). The strategy of home firm $j$ in industry $i$ is the set of quantities $\{x^j_{ih1}, x^{j*}_{ih1}, x^j_{ih2}, x^{j*}_{ih2}\}$, and the strategy of foreign firm $j$ in industry $i$ is the set of quantities $\{x^j_{if1}, x^{j*}_{if1}, x^j_{if2}, x^{j*}_{if2}\}$. There are four subgames, one for each market in each period. I focus on the subgame perfect equilibrium, in which firms in industry $i$ of each country play symmetric strategies. Since firms have to determine simultaneously the quantities in both markets in a period, the two subgames in period 2 are independent.
In period 2, firms have to play a Nash equilibrium in the subgames. By the standard backward induction principle, they will also have to play a Nash equilibrium in the subgames in period 1. Thus all four subgames are independent, so the subgame perfect equilibrium involves firms playing the symmetric Nash equilibrium in each subgame. The output quantities in each subgame are determined as the symmetric Nash equilibrium quantities in that subgame. We can calculate the maximized total profit as the sum of maximized profits from each subgame.

Normalizing the home wage to 1, the profit of home firm $j$ of industry $i$ in the home market at time $t$ is

$$\pi^{j}_{iht} = (P_{it} - c_h)\,x^{j}_{iht} = \left(\frac{1}{\sum_{k=1}^{n_i} x^{k}_{iht} + \sum_{k=1}^{n_i} x^{k}_{ift}} - c_h\right) x^{j}_{iht} \tag{2.6}$$

where $x^{k}_{iht}$ and $x^{k}_{ift}$ are the quantities of good $i$ produced by home firm $k$ and foreign firm $k$ for the home market. The last equality follows from (2.3) and the market clearing condition $C_{it} = \sum_{k=1}^{n_i} x^{k}_{iht} + \sum_{k=1}^{n_i} x^{k}_{ift}$. When home firm $j$ chooses $x^{j}_{iht}$ to maximize (2.6), the first order condition is

$$\frac{\sum_{k\neq j} x^{k}_{iht} + \sum_{k=1}^{n_i} x^{k}_{ift}}{\left(\sum_{k=1}^{n_i} x^{k}_{iht} + \sum_{k=1}^{n_i} x^{k}_{ift}\right)^{2}} - c_h \le 0 \tag{2.7}$$

Similarly the profit of foreign firm $j$ of industry $i$ in the home market at time $t$ is

$$\pi^{j}_{ift} = (P_{it} - e_t\tau_i c_f)\,x^{j}_{ift} = \left(\frac{1}{\sum_{k=1}^{n_i} x^{k}_{iht} + \sum_{k=1}^{n_i} x^{k}_{ift}} - e_t\tau_i c_f\right) x^{j}_{ift} \tag{2.8}$$

When foreign firm $j$ chooses $x^{j}_{ift}$ to maximize (2.8), the first order condition is

$$\frac{\sum_{k=1}^{n_i} x^{k}_{iht} + \sum_{k\neq j} x^{k}_{ift}}{\left(\sum_{k=1}^{n_i} x^{k}_{iht} + \sum_{k=1}^{n_i} x^{k}_{ift}\right)^{2}} - e_t\tau_i c_f \le 0. \tag{2.9}$$

(2.7) and (2.9) implicitly define the best response functions of home firm $j$ and foreign firm $j$ to the quantities produced by other firms. Combining (2.7) and (2.9) and imposing symmetry among all home firms and symmetry among all foreign firms, we have the equilibrium relation between outputs of home and foreign firms

$$x^{j}_{ift} = \frac{n_i c_h - (n_i-1)e_t\tau_i c_f}{n_i e_t\tau_i c_f - (n_i-1)c_h}\,x^{j}_{iht} = \alpha_1(t,i)\,x^{j}_{iht} \tag{2.10}$$

where $\alpha_1(t,i) = \dfrac{n_i c_h - (n_i-1)e_t\tau_i c_f}{n_i e_t\tau_i c_f - (n_i-1)c_h}$.

A careful examination of (2.7) suggests that if $c_h$ is large, then home firms will produce zero quantities, and foreign firms will produce large quantities. This is because foreign firms know that, given the quantities they produce, home firms' marginal revenue in the home market (the first term in (2.7)) is always less than the marginal cost for all $x^{j}_{iht} \ge 0$, so home firms will optimally choose zero. In this case, the denominator of $\alpha_1$ will be negative and (2.10) will no longer describe the relation between home and foreign quantities of output. Similarly when $e_t\tau_i c_f$ is large, foreign firms will produce zero quantities, and the numerator of $\alpha_1(t,i)$ will be negative. It can be shown that the necessary condition for both home and foreign firms to produce positive quantities in the home market is that both the numerator and the denominator of $\alpha_1(t,i)$ be positive. These conditions can be expressed as

$$\tau_i > \frac{n_i-1}{n_i}\,\frac{1}{e_t}\,\frac{c_h}{c_f}, \qquad \tau_i < \frac{n_i}{n_i-1}\,\frac{1}{e_t}\,\frac{c_h}{c_f} \tag{2.11}$$

If (2.11) is satisfied, we can substitute (2.10) into (2.7) and (2.9) and solve for $x^{j}_{iht}$ and $x^{j}_{ift}$:

$$x^{j}_{iht} = \frac{n_i - 1 + n_i\alpha_1(t,i)}{(n_i + n_i\alpha_1(t,i))^{2}\,c_h} \tag{2.12}$$

$$x^{j}_{ift} = \frac{n_i - 1 + n_i/\alpha_1(t,i)}{(n_i + n_i/\alpha_1(t,i))^{2}\,e_t\tau_i c_f} \tag{2.13}$$

In particular, if $n_i = 1$ the solution is

$$x^{j}_{iht} = \frac{1}{e_t\tau_i c_f\left(1 + \frac{c_h}{e_t\tau_i c_f}\right)^{2}} \tag{2.14}$$

$$x^{j}_{ift} = \frac{1}{c_h\left(1 + \frac{e_t\tau_i c_f}{c_h}\right)^{2}} \tag{2.15}$$

If we substitute (2.12) and (2.13) into (2.6) and (2.8), we have

$$\pi^{j}_{iht} = \frac{1}{(n_i + n_i\alpha_1(t,i))^{2}} \tag{2.16}$$

$$\pi^{j}_{ift} = \frac{1}{(n_i + n_i/\alpha_1(t,i))^{2}} \tag{2.17}$$

Thus for industry $i$ we have a unique symmetric equilibrium in the home market under (2.11).

Similarly the home firm's and foreign firm's profit functions in the foreign
market, denoted in foreign currency, are

$$\pi^{j*}_{iht} = \left(P^{*}_{it} - \frac{\tau_i c_h}{e_t}\right)x^{j*}_{iht} = \left(\frac{1}{\sum_{k=1}^{n_i} x^{k*}_{iht} + \sum_{k=1}^{n_i} x^{k*}_{ift}} - \frac{\tau_i c_h}{e_t}\right)x^{j*}_{iht}$$

$$\pi^{j*}_{ift} = (P^{*}_{it} - c_f)\,x^{j*}_{ift} = \left(\frac{1}{\sum_{k=1}^{n_i} x^{k*}_{iht} + \sum_{k=1}^{n_i} x^{k*}_{ift}} - c_f\right)x^{j*}_{ift}$$

In a symmetric equilibrium in which firms of both countries produce positive quantities, the equilibrium outputs and profits are given by

$$x^{j*}_{iht} = \frac{n_i - 1 + n_i\alpha_2(t,i)}{(n_i + n_i\alpha_2(t,i))^{2}\,c_h} \tag{2.18}$$

$$x^{j*}_{ift} = \frac{n_i - 1 + n_i/\alpha_2(t,i)}{(n_i + n_i/\alpha_2(t,i))^{2}\,e_t\tau_i c_f} \tag{2.19}$$

$$\pi^{j*}_{iht} = \frac{1}{(n_i + n_i\alpha_2(t,i))^{2}} \tag{2.20}$$

$$\pi^{j*}_{ift} = \frac{1}{(n_i + n_i/\alpha_2(t,i))^{2}} \tag{2.21}$$

where $\alpha_2(t,i) = \dfrac{n_i\tau_i c_h - (n_i-1)e_t c_f}{n_i e_t c_f - (n_i-1)\tau_i c_h}$. The necessary condition for both home and foreign firms to produce positive quantities in the foreign market is

$$\tau_i > \frac{n_i-1}{n_i}\,e_t\,\frac{c_f}{c_h}, \qquad \tau_i < \frac{n_i}{n_i-1}\,e_t\,\frac{c_f}{c_h} \tag{2.22}$$

Given $c_h$, $c_f$ and $e_t$, (2.11) and (2.22) imply that in industries in the set

$$\Theta(e_t) = \Big\{(n_i,\tau_i) \in \{1,2,\dots,\bar{n}\}\times[1,\infty) : \ \tau_i > \frac{n_i-1}{n_i}\frac{1}{e_t}\frac{c_h}{c_f},\ \ \tau_i < \frac{n_i}{n_i-1}\frac{1}{e_t}\frac{c_h}{c_f},\ \ \tau_i > \frac{n_i-1}{n_i}e_t\frac{c_f}{c_h},\ \ \tau_i < \frac{n_i}{n_i-1}e_t\frac{c_f}{c_h}\Big\} \tag{2.23}$$

both home and foreign firms will produce positive quantities in both markets at time $t$.⁵ For these industries, total profits for home and foreign firms are given by

$$\Pi^{j}_{ih} = \frac{1}{(n_i + n_i\alpha_1(t=1,i))^{2}} + \frac{e_1}{(n_i + n_i\alpha_2(t=1,i))^{2}} + \frac{1}{(n_i + n_i\alpha_1(t=2,i))^{2}} + \frac{e_2}{(n_i + n_i\alpha_2(t=2,i))^{2}}$$

$$\Pi^{j}_{if} = \frac{1}{e_1\,(n_i + n_i/\alpha_1(t=1,i))^{2}} + \frac{1}{(n_i + n_i/\alpha_2(t=1,i))^{2}} + \frac{1}{e_2\,(n_i + n_i/\alpha_1(t=2,i))^{2}} + \frac{1}{(n_i + n_i/\alpha_2(t=2,i))^{2}}. \tag{2.24}$$

5 I use the notation $\Theta(e_t)$ to emphasize that the set depends on $e_t$.

Proposition 1.
(a) For industries in the set $\Theta(e_t)$, the period $t$ profit of home firm $j$ in industry $i$ is a decreasing function of $c_h$ and an increasing function of the exchange rate $e_t$.
(b) For industries with the same $\tau_i$ in the set $\Theta(e_t)$, the period $t$ profit of home firm $j$ is decreasing in $n_i$.
(c) For industries with the same $\tau_i$ and in which only home firms are producing positive quantities, the period $t$ profit of home firm $j$ is decreasing in $n_i$.

Proof: (a) From (2.16) and (2.20), we can see that the period $t$ profit of home firm $j$ in industry $i$ is decreasing in $\alpha_1$ and $\alpha_2$. Since both $\alpha_1$ and $\alpha_2$ are increasing in $c_h$ and decreasing in $e_t$, the conclusion follows.

(b) For industries in $\Theta(e_t)$, the period $t$ profit of home firm $j$ in the home market is given by (2.16),

$$\pi^{j}_{iht} = \frac{1}{(n_i + n_i\alpha_1(t,i))^{2}}.$$

If $c_h > e_t\tau_i c_f$, then

$$\alpha_1(t,i) = \frac{n_i c_h - (n_i-1)e_t\tau_i c_f}{n_i e_t\tau_i c_f - (n_i-1)c_h} = \frac{c_h + (n_i-1)(c_h - e_t\tau_i c_f)}{e_t\tau_i c_f - (n_i-1)(c_h - e_t\tau_i c_f)}$$

is increasing in $n_i$. Thus $\pi^{j}_{iht}$ is decreasing in $n_i$. If $c_h = e_t\tau_i c_f$, then $\alpha_1(t,i) = 1$, so $\pi^{j}_{iht} = \frac{1}{(n_i + n_i\alpha_1(t,i))^{2}} = \frac{1}{4n_i^{2}}$ is decreasing in $n_i$. Lastly, when $c_h < e_t\tau_i c_f$, we can prove that $\pi^{j}_{iht}$ is decreasing in $n_i$ by showing that the derivative of the denominator of (2.16) with respect to $n_i$ is positive:

$$\frac{\partial}{\partial n_i}\,(n_i + n_i\alpha_1(t,i))^{2} = \frac{\partial}{\partial n_i}\left(n_i + \frac{n_i\frac{c_h}{e_t\tau_i c_f - c_h} - n_i^{2} + n_i}{\frac{e_t\tau_i c_f}{e_t\tau_i c_f - c_h} + n_i - 1}\right)^{2} = 2\,(n_i + n_i\alpha_1(t,i))\,\frac{\frac{c_h(e_t\tau_i c_f + c_h)}{(e_t\tau_i c_f - c_h)^{2}}}{\left(\frac{e_t\tau_i c_f}{e_t\tau_i c_f - c_h} + n_i - 1\right)^{2}} > 0$$

Therefore, we have shown that $\pi^{j}_{iht}$ is always decreasing in $n_i$. Similarly, we can show that the period $t$ profit of home firm $j$ in the foreign market is decreasing in $n_i$.

(c) For industries in which only home firms are producing positive quantities for both markets, it is easy to verify that the period $t$ profit of firm $j$ is

$$\frac{n_i-1}{n_i^{2}} + e_t\,\frac{n_i-1}{n_i^{2}}, \tag{2.25}$$

which is decreasing in $n_i$. ■

The proposition confirms the intuition that an appreciation of the home currency erodes the profit of home firms, and validates the usual Cournot competition result that profit dissipates with the number of firms.

2.3 Exchange Appreciation and Investment Decision

In this section I introduce the possibility of cost-saving technology. The term technology is defined as in Jones (2001), as ways to transform factors into
output. In general, they can be product innovations, but in this chapter I refer to a cost-saving process innovation. For example, the innovation could be an improvement in labour practice as emphasized in Baily, Gersbach, Scherer and Lichtenberg (1995), and Schmitz (2005).

To simplify the problem, I assume all home and foreign firms in each industry are endowed with the same cost, $c_h = c = c_f$, for both periods. All home firms have access to a technology that reduces the second-period marginal cost from $c_h$ to $\frac{1}{\sigma}c_h$, where $\sigma$ is the improvement in labour productivity. However, the technology is also disruptive in the sense that, if a firm chooses $\sigma > 1$, it raises the first-period marginal cost from $c_h$ to $\gamma c_h$, where $\gamma$ is a constant greater than 1.⁶ Since adoption at time $t$ raises the cost in that period, no firm would adopt the innovation at $t = 2$. Proposition 1 implies that the technology will bring higher profit in the second period but entail a loss of profit in the first. Firms can choose $\sigma$ in the range $[1, \bar{\sigma})$ but will have to pay a fixed cost $I(\sigma)$. I assume $I(\sigma)$ is strictly convex in $\sigma$ for all $1 < \sigma < \bar{\sigma}$, $I(\sigma = 1) = 0$, $\lim_{\sigma\to 1} I(\sigma) > 0$, and $\lim_{\sigma\to\bar{\sigma}} I(\sigma) = \infty$.⁷

I assume that no foreign firms have the option to upgrade their technology. The assumption is made to simplify the interaction between home and foreign firms regarding the choice of $\sigma$, which would vary across industries.

6 It is possible that firms could improve productivity by adopting other new technologies that are not disruptive and are always profitable to implement. I choose not to model such technology opportunities as they would not interact with exchange rate movements. In the empirical section of the chapter, I will try to account for this possibility.
7 In general $\gamma$ can be increasing in $\sigma$. However, since the assumptions regarding $I(\sigma)$ ensure that the first-period cost of adoption (which equals $I(\sigma)$ plus the profit loss due to the high marginal cost $\gamma c_h$) is increasing in $\sigma$, I do not pursue this complication.

In Tang (2008), I show that if firms can only choose between the status quo (sq), i.e. $\sigma = 1$, and some fixed $\sigma > 1$, then the unique equilibrium is for the home firms to adopt and foreign firms to keep the status quo when there is a large appreciation.⁸

My assumption regarding new technology follows that of Holmes et al. (2008), which suggests that technology change is disruptive in the sense that there is a costly transition to a lower cost of production. Holmes et al. (2008) motivate this assumption by citing a large number of empirical observations. For illustrative purposes consider the following scenario. The implementation of a new technology requires a fixed investment in the training of employees, and during the transition workers are less productive because they are learning to master the new technology.

As mentioned in the introduction, Vives (forthcoming) studies a wide variety of industrial organization models and concludes that in general more competition induces a bigger effort to improve productivity. Holmes et al. (2008) obtain similar predictions with the empirically motivated assumption of disruptive technology changes. I follow their assumption to maintain model tractability.

It is clear from the nature of the technology that the tradeoff between current costs and future gain is crucial for adoption choices. A two-period world is the minimum structure that allows us to study the tradeoff between the present and the future. Adding more periods simply requires one to replace second-period profits in firms' objective functions with value functions. Both a second-period profit function and a value function should be increasing in productivity, so there will be a future gain. Since the focus of this chapter is on how the first-period loss interacts with exchange rate movements, a two-period model is sufficient.

Since the two countries are symmetric, it is reasonable to conjecture that at the steady state the exchange rate $e_t = 1$⁹ will hold in both periods. The timing of the game in industry $i$ is the following:

∙ Stage 0: an exogenous shock to the exchange rate is realized, and firms have perfect foresight that $e_1 < 1$ and $e_2 = 1$;¹⁰
∙ Stage 1: home firm $j$ determines its choice of $\sigma^{j}$ and pays $I(\sigma^{j})$, for $j = 1, 2, \dots, n_i$;
∙ Stage 2: the choices of home firms in stage 1 are observed by all (so every firm knows the marginal cost of each firm in both periods), and firms play the Cournot game as described in section 2 to determine outputs in each of the four markets (home and foreign markets in periods 1 and 2).

8 In the setting in which both home and foreign firms can choose $\sigma$ from $[1, \infty)$, it is very difficult to predict the equilibrium outcome of a technology adoption game. It is possible to show that home firms' incentive to adopt increases with an appreciation given the choice of foreign firms, and that foreign firms' incentive decreases with an appreciation given the choice of home firms. Since it appears that foreign firms' incentive to improve productivity is weaker with an appreciation, I assume the extreme case that foreign firms simply cannot upgrade, and focus on how the choices of home firms vary with industry characteristics.

9 In Tang (2008), I close the model and derive the equilibrium exchange rate as a function of firm productivities and the shock to currency demand. In a steady state in which the productivities are equal across countries and currency demand shocks equal zero, the equilibrium exchange rate is 1.
10 Or I can assume $e_2$ equals any other constant value commonly expected or known. This change would only scale the second-period profit gain of adopting $\sigma > 1$.

The game is solved by standard backward induction. In stage 2, given $\{\sigma^{1}, \sigma^{2}, \dots, \sigma^{n_i}\}$, firms play the Cournot game described in section 2 and the payoffs are as derived in section 2. In stage 1, given how the equilibrium profit depends on $\{\sigma^{1}, \sigma^{2}, \dots, \sigma^{n_i}\}$, home firm $j$ chooses $\sigma^{j}$, for $j = 1, 2, \dots, n_i$. Again, I will focus on a symmetric equilibrium between home firms in stage 1. I focus on the choices of $\sigma$ for industries in which, in stage 2, firms of both countries produce positive quantities in all markets, except that home firms may be forced out of the foreign market during the period 1 appreciation.

If all home firms in industry $i$ choose the same $\sigma > 1$, and if firms of both countries are producing positive quantities, then the total profit of home firm $j$ before paying $I(\sigma)$ is

$$\begin{aligned}\Pi^{j}_{ih}(\sigma) ={}& \frac{1}{\left(n_i + n_i\frac{n_i\gamma - (n_i-1)e_1\tau_i}{n_i e_1\tau_i - (n_i-1)\gamma}\right)^{2}} + \frac{e_1}{\left(n_i + n_i\frac{n_i\tau_i\gamma - (n_i-1)e_1}{n_i e_1 - (n_i-1)\tau_i\gamma}\right)^{2}}\cdot \mathbf{1}\!\left(e_1 > \tfrac{n_i-1}{n_i}\tau_i\gamma\right)\\ &+ \frac{\beta}{\left(n_i + n_i\frac{n_i/\sigma - (n_i-1)e_2\tau_i}{n_i e_2\tau_i - (n_i-1)/\sigma}\right)^{2}} + \frac{\beta e_2}{\left(n_i + n_i\frac{n_i\tau_i/\sigma - (n_i-1)e_2}{n_i e_2 - (n_i-1)\tau_i/\sigma}\right)^{2}}\end{aligned} \tag{2.26}$$

where $\mathbf{1}\!\left(e_1 > \tfrac{n_i-1}{n_i}\tau_i\gamma\right)$ is an indicator function. When $e_1 > \frac{n_i-1}{n_i}\tau_i\gamma$ fails, the home firms are driven out of the foreign market and make zero profit there. If all home firms choose the status quo (sq), i.e. $\sigma = 1$, the total profit is

$$\begin{aligned}\Pi^{j}_{ih}(sq) ={}& \frac{1}{\left(n_i + n_i\frac{n_i - (n_i-1)e_1\tau_i}{n_i e_1\tau_i - (n_i-1)}\right)^{2}} + \frac{e_1}{\left(n_i + n_i\frac{n_i\tau_i - (n_i-1)e_1}{n_i e_1 - (n_i-1)\tau_i}\right)^{2}}\cdot \mathbf{1}\!\left(e_1 > \tfrac{n_i-1}{n_i}\tau_i\right)\\ &+ \frac{\beta}{\left(n_i + n_i\frac{n_i - (n_i-1)e_2\tau_i}{n_i e_2\tau_i - (n_i-1)}\right)^{2}} + \frac{\beta e_2}{\left(n_i + n_i\frac{n_i\tau_i - (n_i-1)e_2}{n_i e_2 - (n_i-1)\tau_i}\right)^{2}}\end{aligned}$$

I refer to the difference $\Pi^{j}_{ih}(\sigma) - \Pi^{j}_{ih}(sq)$ as the benefit of adopting the disruptive technology. Choosing some $\sigma > 1$ dominates $\sigma = 1$ if the associated benefit is greater than the cost $I(\sigma)$.
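The benefit has two components, the profit loss in the first period

$$\begin{aligned}|L_1| ={}& \left[\frac{1}{\left(n_i + n_i\frac{n_i - (n_i-1)e_1\tau_i}{n_i e_1\tau_i - (n_i-1)}\right)^{2}} + \frac{e_1}{\left(n_i + n_i\frac{n_i\tau_i - (n_i-1)e_1}{n_i e_1 - (n_i-1)\tau_i}\right)^{2}}\cdot \mathbf{1}\!\left(e_1 > \tfrac{n_i-1}{n_i}\tau_i\right)\right]\\ &- \left[\frac{1}{\left(n_i + n_i\frac{n_i\gamma - (n_i-1)e_1\tau_i}{n_i e_1\tau_i - (n_i-1)\gamma}\right)^{2}} + \frac{e_1}{\left(n_i + n_i\frac{n_i\tau_i\gamma - (n_i-1)e_1}{n_i e_1 - (n_i-1)\tau_i\gamma}\right)^{2}}\cdot \mathbf{1}\!\left(e_1 > \tfrac{n_i-1}{n_i}\tau_i\gamma\right)\right]\end{aligned} \tag{2.27}$$

and the profit gain in the second

$$\begin{aligned}G_2 ={}& \left[\frac{\beta}{\left(n_i + n_i\frac{n_i/\sigma - (n_i-1)e_2\tau_i}{n_i e_2\tau_i - (n_i-1)/\sigma}\right)^{2}} + \frac{\beta e_2}{\left(n_i + n_i\frac{n_i\tau_i/\sigma - (n_i-1)e_2}{n_i e_2 - (n_i-1)\tau_i/\sigma}\right)^{2}}\right]\\ &- \left[\frac{\beta}{\left(n_i + n_i\frac{n_i - (n_i-1)e_2\tau_i}{n_i e_2\tau_i - (n_i-1)}\right)^{2}} + \frac{\beta e_2}{\left(n_i + n_i\frac{n_i\tau_i - (n_i-1)e_2}{n_i e_2 - (n_i-1)\tau_i}\right)^{2}}\right]\end{aligned} \tag{2.28}$$

Similar to (2.23), given $e_1 < 1$ and $e_2 = 1$, we can formally define the set of industries with $\{n_i, \tau_i, \sigma_i\}$ such that firms of both countries produce positive quantities in all markets, except that home firms may produce zero for the foreign market during the period 1 appreciation, as

$$\Theta_\sigma(e_1) = \Big\{(n_i,\tau_i,\sigma_i) \in \{1,2,\dots,\bar{n}\}\times[1,\infty)\times[1,\bar{\sigma}) : \ \tau_i < \frac{n_i}{(n_i-1)\sigma_i},\ \ \tau_i > \frac{(n_i-1)\gamma}{n_i e_1},\ \ \tau_i > \frac{(n_i-1)\sigma_i}{n_i}\Big\} \tag{2.29}$$

To make it possible for the adoption decision problem to interact with the exchange rate, I assume

∙ (i) For industries in $\Theta_\sigma(e_1 = 1)$, $\Pi^{j}_{ih}(\sigma) - \Pi^{j}_{ih}(sq) < I(\sigma)$ for all $\sigma \in (1, \bar{\sigma})$;
∙ (ii) If $\tau_i = 1$, for all $n_i \in \{1, 2, \dots, \bar{n}\}$ we can find an interval $\Sigma_{n_i} \subset (1, \bar{\sigma})$ such that the second-period profit gain of firms in industry $i$ is strictly greater than the cost $I(\sigma)$ for all $\sigma \in \Sigma_{n_i}$.

Assumption (i) implies that it is not profitable to choose any $\sigma > 1$ with $e_1 = 1$, and assumption (ii) says that if the first-period profit loss is zero, it will be profitable for home firms of industry $i$ to adopt $\sigma \in \Sigma_{n_i}$.

The following two propositions show how the benefits of adopting disruptive new technologies are affected by $e_1$ and $\tau_i$. Firstly, given $\tau_i$ and $n_i$, an exchange rate appreciation lowers the first-period profit loss, so choosing some $\sigma > 1$ can be profitable.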
Secondly, given $e_1$ and $n_i$, a large trade cost $\tau_i$ insulates home firms from trade and the influence of exchange rate movements. Home firms will have no incentive to choose $\sigma > 1$, even if they experience an appreciation.

Proposition 2. Consider industries in $\Theta_\sigma(e_1)$. Given $n_i$, and $\tau_i$ close enough to 1, for all $\sigma \in \Sigma_{n_i}$ there exists an exchange rate threshold such that it is profitable for home firms to adopt $\sigma$ for all $e_1$ below the threshold.

Proof: The absolute value of the first-period profit loss due to adoption (2.27) is bounded by the first-period profit in the status quo,

$$\frac{1}{\left(n_i + n_i\frac{n_i - (n_i-1)e_1\tau_i}{n_i e_1\tau_i - (n_i-1)}\right)^{2}} + \frac{e_1}{\left(n_i + n_i\frac{n_i\tau_i - (n_i-1)e_1}{n_i e_1 - (n_i-1)\tau_i}\right)^{2}}\cdot \mathbf{1}\!\left(e_1 > \tfrac{n_i-1}{n_i}\tau_i\right).$$

As $e_1$ tends to $\frac{n_i-1}{\tau_i n_i}$ from above, the first-period profit will tend to zero and so will the first-period profit loss due to adoption. By assumption (ii), for industries with $\tau_i = 1$, the second-period profit gain from adopting $\sigma > 1$ is greater than the cost for all $\sigma \in \Sigma_{n_i}$. Since the profit functions are continuous in $\tau_i$, by assumption (ii) for $\tau_i$ close enough to 1, the second-period profit gain of firms in industry $i$ will be strictly greater than $I(\sigma)$ for all $\sigma \in \Sigma_{n_i}$. Therefore for each $\sigma \in \Sigma_{n_i}$ we can find an $\bar{e}_1$ such that for all $e_1 < \bar{e}_1$, the first-period loss satisfies $|L_1| < G_2 - I(\sigma)$. Thus for all $e_1 < \bar{e}_1$, adopting $\sigma \in \Sigma_{n_i}$ is profitable since $\Pi^{j}_{ih}(\sigma) - \Pi^{j}_{ih}(sq) = G_2 - |L_1| > I(\sigma)$. ■

Proposition 3. Given an $e_1 < 1$, there exists a threshold $\hat{\tau}$ such that adopting the technology of any level $\sigma$ will not be profitable for any firm in any industry with $\tau_i \ge \hat{\tau}$.

Proof: Consider an industry with $n$ firms. There are two possibilities. Firstly, given $e_1$, adopting any $\sigma \in (1, \bar{\sigma})$ will not be profitable for any $\tau \in [1, \infty)$. In this case, set the threshold to be $\hat{\tau}_n = 1$. Secondly, given $e_1$, adopting some $\sigma \in (1, \bar{\sigma})$ will be profitable for some $\tau \in [1, \infty)$. If a new technology of level $\sigma$ is not profitable for any $\tau_i$, set the threshold for the level $\sigma$ to be $\hat{\tau}_n(\sigma) = 1$.
Otherwise, the new technology of level $\sigma$ will be profitable for some level of $\tau$. Note that as $\tau \to \frac{n_i\sigma}{n_i-1}$, home firms operate almost only in the home market. The limit of firm $j$'s gain (which equals benefit minus cost) from adopting the new technology of level $\sigma$ is

$$\lim_{\tau\to\frac{n_i\sigma}{n_i-1}} \Pi^{j}_{ih}(\sigma) - \Pi^{j}_{ih}(sq) - I(\sigma) = -I(\sigma).$$

Let $\hat{\tau}_n(\sigma) = \frac{n_i\sigma}{n_i-1}$. Therefore, all firms in all $n$-firm industries with $\tau_i \ge \hat{\tau}_n(\sigma)$ will not adopt the technology of level $\sigma$. The threshold for the $n$-firm industries is $\hat{\tau}_n = \sup\{\hat{\tau}_n(\sigma) : \sigma \in (1, \bar{\sigma})\}$. To find the trade cost threshold for all possible $n$, we take $\hat{\tau} = \max\{\hat{\tau}_n : n = 1, 2, \dots, \bar{n}\}$ and the conclusion follows. ■

The consequence of Proposition 3 is that, given an appreciation of a certain magnitude, $\hat{\tau}$ will partition industries into two sets. The first set of industries, with low $\tau_i$, may choose a new technology of level $\sigma > 1$, and the second set will not.¹¹ The remaining part of the section examines how home firms choose $\sigma$. We will see that if the first set contains industries with the same trade cost but different $n_i$, then those with low $n_i$ are likely to choose a larger $\sigma$.

In stage 2 of the game, the first-period profit is not dependent on the choice of $\sigma$, and the equilibrium quantities and profits are similar to section 2. The second-period profits for home firm $j$ and foreign firm $j$ in the home market, which depend on $\sigma$, are

$$\pi^{j}_{ih2} = \left(\frac{1}{\sum_{k=1}^{n_i} x^{k}_{ih2} + \sum_{k=1}^{n_i} x^{k}_{if2}} - \frac{c_h}{\sigma^{j}}\right)x^{j}_{ih2}$$

$$\pi^{j}_{if2} = \left(\frac{1}{\sum_{k=1}^{n_i} x^{k}_{ih2} + \sum_{k=1}^{n_i} x^{k}_{if2}} - \frac{\tau_i c_f}{e_2}\right)x^{j}_{if2}$$

and the first order conditions are

$$\begin{aligned}\frac{\sum_{k\neq j} x^{k}_{ih2} + \sum_{k=1}^{n_i} x^{k}_{if2}}{\left(\sum_{k=1}^{n_i} x^{k}_{ih2} + \sum_{k=1}^{n_i} x^{k}_{if2}\right)^{2}} - \frac{c_h}{\sigma^{j}} &\le 0\\ \frac{\sum_{k=1}^{n_i} x^{k}_{ih2} + \sum_{k\neq j} x^{k}_{if2}}{\left(\sum_{k=1}^{n_i} x^{k}_{ih2} + \sum_{k=1}^{n_i} x^{k}_{if2}\right)^{2}} - \frac{\tau_i c_f}{e_2} &\le 0\end{aligned} \tag{2.30}$$

The first order conditions implicitly define the optimal output $x^{j}_{ih2}$ as a function of $\vec{\sigma} = [\sigma^{1}, \sigma^{2}, \dots, \sigma^{n_i}]$. Denote it as $x^{j}_{ih2}(\vec{\sigma})$.
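Similarly we define the optimal output function in the foreign market as $x^{j*}_{ih2}(\vec{\sigma})$.

In stage 1, home firm $j$ foresees the equilibrium output functions in the second stage and chooses $\sigma^{j}$ to maximize total profit

$$\begin{aligned}\Pi^{j}_{ih}(\sigma^{j}) - I(\sigma^{j}) ={}& \pi^{j}_{ih1} + e_1\pi^{j*}_{ih1} + \beta\pi^{j}_{ih2}(\sigma^{j}) + \beta e_2\pi^{j*}_{ih2}(\sigma^{j}) - I(\sigma^{j})\\ ={}& \pi^{j}_{ih1} + e_1\pi^{j*}_{ih1} - I(\sigma^{j})\\ &+ \beta\left(\frac{1}{\sum_{k=1}^{n_i} x^{k}_{ih2}(\vec{\sigma}) + \sum_{k=1}^{n_i} x^{k}_{if2}(\vec{\sigma})} - \frac{c_h}{\sigma^{j}}\right)x^{j}_{ih2}(\vec{\sigma})\\ &+ \beta e_2\left(\frac{1}{\sum_{k=1}^{n_i} x^{k*}_{ih2}(\vec{\sigma}) + \sum_{k=1}^{n_i} x^{k*}_{if2}(\vec{\sigma})} - \frac{\tau_i c_h}{e_2\sigma^{j}}\right)x^{j*}_{ih2}(\vec{\sigma})\end{aligned}$$

By the Envelope Theorem, the first order condition for an interior solution is

$$\frac{\partial}{\partial\sigma^{j}}\Pi^{j}_{ih}(\sigma^{j}) = I'(\sigma^{j}) \ \Rightarrow\ \beta\,\frac{c_h}{(\sigma^{j})^{2}}\left[x^{j}_{ih2}(\vec{\sigma}) + \tau_i x^{j*}_{ih2}(\vec{\sigma})\right] = I'(\sigma^{j}) \tag{2.31}$$

Imposing symmetry among home firms' choices of $\sigma$, we have $x^{j}_{ih2}(\vec{\sigma}) = x^{k}_{ih2}(\vec{\sigma})$ for all $k$. Using this to simplify (2.30), we have

$$\begin{aligned}x^{j}_{ih2} &= \left(\frac{1}{n_i(1+\alpha_1)} - \frac{1}{n_i^{2}(1+\alpha_1)^{2}}\right)\frac{\sigma^{j}}{c_h}\\ x^{j*}_{ih2} &= \left(\frac{1}{n_i(1+\alpha_2)} - \frac{1}{n_i^{2}(1+\alpha_2)^{2}}\right)\frac{\sigma^{j}}{c_h}\\ x^{j}_{if2} &= \left(\frac{1}{n_i(1+1/\alpha_1)} - \frac{1}{n_i^{2}(1+1/\alpha_1)^{2}}\right)\frac{1}{\tau_i c_f}\\ x^{j*}_{if2} &= \left(\frac{1}{n_i(1+1/\alpha_2)} - \frac{1}{n_i^{2}(1+1/\alpha_2)^{2}}\right)\frac{1}{\tau_i c_f}\end{aligned} \tag{2.32}$$

where $\alpha_1 = \dfrac{n_i - \sigma^{j}\tau_i(n_i-1)}{n_i\sigma^{j}\tau_i - n_i + 1}$ and $\alpha_2 = \dfrac{n_i\tau_i/\sigma^{j} - n_i + 1}{n_i - (n_i-1)\tau_i/\sigma^{j}}$. Substituting (2.32) into (2.31) we obtain

$$\frac{\beta}{\sigma^{j}}\left[\left(\frac{1}{n_i(1+\alpha_1)} - \frac{1}{n_i^{2}(1+\alpha_1)^{2}}\right) + \tau_i\left(\frac{1}{n_i(1+\alpha_2)} - \frac{1}{n_i^{2}(1+\alpha_2)^{2}}\right)\right] = I'(\sigma^{j}) \tag{2.33}$$

which can be solved for the equilibrium $\sigma^{j}$.

11 The relation between $\tau$ and adoption choice is not monotonic, as some firms with $\tau_i < \hat{\tau}$ may choose not to adopt.

Proposition 4. Let $e_1 < 1$ and consider industries with the same trade cost $\tau < \hat{\tau}$ in $\Theta_\sigma(e_1)$. If all $\sigma > 1$ in some interval in $(1, \bar{\sigma})$ are profitable for firms in industries with different $n_i$, then the choice of $\sigma$ is decreasing in $n_i$ for $2 \le n_i \le \bar{n}$.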
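Proof: Using the left-hand side of (2.33) we have

$$\frac{\partial}{\partial n_i}\left(\frac{\partial}{\partial\sigma^{j}}\Pi^{j}_{ih}(\sigma^{j})\right) = \frac{\beta}{\sigma^{j}}\left[\frac{(1+\sigma^{j}\tau_i)(2 - n_i - n_i\alpha_1)}{(n_i\tau_i\sigma^{j} - n_i + 1)^{2}\,n_i^{3}(1+\alpha_1)^{3}} + \frac{\tau_i(\tau_i/\sigma^{j}+1)(\tau_i/\sigma^{j})(2 - n_i - n_i\alpha_2)}{(n_i - (n_i-1)\tau_i/\sigma^{j})^{2}\,n_i^{3}(1+\alpha_2)^{3}}\right]$$

which is negative if $n_i \ge 2$. This means the marginal benefit of $\sigma^{j}$ is bigger for industries with a smaller $n_i$, provided $n_i \ge 2$. By the Envelope Theorem again we have

$$\frac{\partial^{2}}{\partial\sigma^{j}\partial\sigma^{j}}\Pi^{j}_{ih}(\sigma^{j}) = \beta\,\frac{-2c_h}{(\sigma^{j})^{3}}\left[x^{j}_{ih2}(\vec{\sigma}) + \tau_i x^{j*}_{ih2}(\vec{\sigma})\right] < 0$$

Thus $\Pi^{j}_{ih}(\sigma^{j})$ is a strictly concave function. Let $n'$ and $n''$ be the numbers of firms in two industries with the same trade cost $\tau < \hat{\tau}$ and $2 \le n' < n'' \le \bar{n}$. Denote the firms' optimal choices of technology levels as $\sigma_{n'}$ and $\sigma_{n''}$. Suppose $\sigma_{n''} \ge \sigma_{n'}$. Then we have

$$\left.\frac{\partial\Pi^{j}_{ih}(\sigma^{j}, n_i = n')}{\partial\sigma^{j}}\right|_{\sigma_{n''}} > \left.\frac{\partial\Pi^{j}_{ih}(\sigma^{j}, n_i = n'')}{\partial\sigma^{j}}\right|_{\sigma_{n''}} = \left.\frac{\partial I(\sigma^{j})}{\partial\sigma^{j}}\right|_{\sigma_{n''}}$$

which means the profit of firm $j$ in the $n'$-firm industry, $\Pi^{j}_{ih}(\sigma^{j}, n_i = n') - I(\sigma^{j})$, is increasing at some level no smaller than $\sigma_{n'}$. This contradicts the assumption that $\sigma_{n'}$ is the optimal choice for firms in the industry with $n'$ firms, unless there is another local maximizer $\sigma^{*}$ with $\sigma^{*} > \sigma_{n'}$. However, since $\Pi^{j}_{ih}(\sigma^{j})$ is strictly concave and $I(\sigma)$ is strictly convex, there are no other local maximizers. Thus we conclude that $\sigma_{n''} < \sigma_{n'}$ for all $2 \le n' < n'' \le \bar{n}$. ■

Note that when $\sigma$ is greater than $\left(1 + \frac{1}{n_i-1}\right)\frac{1}{\tau_i}$, all foreign firms in industry $i$ are forced out of the home market. Given a $\tau_i$, we can make $I(\sigma)$ rise fast enough so that it will exceed the benefit of adoption at $\sigma = \left(1 + \frac{1}{\bar{n}-1}\right)\frac{1}{\tau_i}$. This ensures that all home firms will have interior choices of $\sigma$, i.e. the foreign firms will not be driven out of the home and foreign markets. Figure 2.2 illustrates this point.

[Figure 2.2 plot: vertical axis $\Pi^{j}_{ih}(\sigma) - \Pi^{j}_{ih}(sq)$ and $I(\sigma)$; horizontal axis $\sigma$; curves $I(\sigma)$ and $\Pi^{j}_{ih}(\sigma) - \Pi^{j}_{ih}(sq)$; marked points $(1, 0)$, the optimal $\sigma$, and $\left(1 + \frac{1}{\bar{n}-1}\right)\frac{1}{\tau_i}$.]

$\Pi^{j}_{ih}(\sigma) - \Pi^{j}_{ih}(sq)$ is the benefit of adopting technology of level $\sigma$ and $I(\sigma)$ is the fixed cost.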
For a given $\tau_i$, when the improvement in home productivity $\sigma$ is larger than $\left(1 + \frac{1}{\bar{n}-1}\right)\frac{1}{\tau_i}$, foreign firms in industries with $\bar{n}$ firms begin to drop out of the market and home firms have a jump in profit as they are competing only against each other. When $I(\sigma)$ rises fast enough, choosing $\sigma > \left(1 + \frac{1}{\bar{n}-1}\right)\frac{1}{\tau_i}$ is not optimal and home firms will choose an interior $\sigma$. For industries with $n_i < \bar{n}$, the jump points in profits are bigger than $\left(1 + \frac{1}{\bar{n}-1}\right)\frac{1}{\tau_i}$. Firms in these industries will choose an interior $\sigma$ as well, as long as this is the case in the $\bar{n}$-firm industry.

Figure 2.2: The Benefit and Cost of Adopting the Disruptive Technology

The key for the proof is that among industries with the same $\tau$, the profit of firms in industries with lower $n_i$ is more responsive to $\sigma$. Thus the marginal profit with respect to $\sigma$ is equal to the marginal cost $I'(\sigma)$ at a bigger value. Figure 2.3 demonstrates the argument graphically.

[Figure 2.3 plot: vertical axis $\Pi^{j}_{ih}(\sigma) - \Pi^{j}_{ih}(sq)$ and $I(\sigma)$; horizontal axis $\sigma$; curves $I(\sigma)$, $\Pi^{j}_{ih}(\sigma) - \Pi^{j}_{ih}(sq)$ for $n_i = n$, and $\Pi^{j}_{ih}(\sigma) - \Pi^{j}_{ih}(sq)$ for $n_i = n + 1$; marked points $(1, 0)$, the optimal $\sigma$ for the industry with $n_i = n$, and the optimal $\sigma$ for the industry with $n_i = n + 1$.]

$\Pi^{j}_{ih}(\sigma) - \Pi^{j}_{ih}(sq)$ is the benefit of adopting technology of level $\sigma$ and $I(\sigma)$ is the fixed cost. Proposition 4 shows that the benefit of adoption for industries with $n$ firms increases faster in $\sigma$ than for industries with $n + 1$ firms. Given the same fixed cost $I(\sigma)$, the optimal choice of $\sigma$ for firms in industries with fewer firms is larger.

Figure 2.3: Illustration of the Relation between $n_i$ and Choice of $\sigma$

Putting Propositions 2, 3 and 4 together yields the following predictions. First, among industries with trade cost lower than the threshold $\hat{\tau}$ there is a negative correlation between the number of firms per industry and the choice of $\sigma$ if $n_i \ge 2$. Since the concentration level of an industry is inversely related to the number of firms, if we regress $\sigma$ on concentration for the set
of industries with $\tau_i < \hat{\tau}$, OLS is predicted to find a positive relation. Second, industries with trade costs greater than $\hat{\tau}$ will not adopt the disruptive technology. For these industries, a regression of $\sigma$ on concentration will yield a zero slope coefficient. Figure 2.4 illustrates the adoption choices for firms in different industries. Overall, if we simply pool all industries together and regress $\sigma$ on concentration, we are likely to find a positive relation.

[Figure 2.4 plot: vertical axis $\sigma$ (starting at 1); horizontal axis $n_i$ (1 to 8); two series: choices of $\sigma$ for industries with trade cost below $\hat{\tau}$, and choices of $\sigma$ for industries with trade cost above $\hat{\tau}$.]

The model suggests that 1) for industries with trade cost lower than $\hat{\tau}$, the choice of productivity improvement $\sigma$ is negatively correlated with the number of firms per industry $n_i$, and 2) industries with trade cost greater than $\hat{\tau}$ will not adopt the disruptive technology (denoted as choosing $\sigma = 1$ in the figure).

Figure 2.4: Level of Technology Adoption and Number of Firms in the Industry

Compared to Holmes et al. (2008) and other previous theoretical papers, which focus on the question of whether firms will adopt a new technology when there is more competition, this chapter studies both the conditions for adoption and the intensity of adoption. The model presented here differentiates between two types of competition: the competitive pressure from appreciations, and market concentration. The competitive pressure from appreciations is predicted to provide an incentive for adopting new technologies, consistent with the findings of previous papers. However, firms in highly concentrated industries, i.e. those subject to less competition in this dimension, are likely to invest more to achieve bigger productivity improvements. Thus, in this model the effect of competition on adoption of new technologies is subtle.
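The negative relation between $n_i$ and the chosen $\sigma$ can be illustrated numerically. The following Python sketch evaluates the left-hand side of (2.33) with $e_2 = 1$ and $c_h = c_f$, and searches a grid for the first $\sigma$ at which the marginal benefit no longer exceeds the marginal cost. The cost function $I(\sigma) = \kappa(\sigma-1)^2/(\bar{\sigma}-\sigma)$, the parameter values ($\beta = 0.95$, $\tau_i = 1.05$, $\bar{\sigma} = 1.08$, $\kappa = 0.05$) and all function names are illustrative assumptions that do not come from the chapter. The assumed $I(\sigma)$ is strictly convex and unbounded as $\sigma \to \bar{\sigma}$, but it omits the discrete fixed cost near $\sigma = 1$, which does not enter an interior first order condition.

```python
def marginal_benefit(sigma, n, tau, beta=0.95):
    """Left-hand side of (2.33), evaluated with e2 = 1 and c_h = c_f."""
    alpha1 = (n - sigma * tau * (n - 1)) / (n * sigma * tau - n + 1)
    alpha2 = (n * tau / sigma - n + 1) / (n - (n - 1) * tau / sigma)

    def g(alpha):
        u = n * (1 + alpha)
        return 1 / u - 1 / u ** 2

    return beta / sigma * (g(alpha1) + tau * g(alpha2))


def marginal_cost(sigma, sigma_bar, kappa):
    """I'(sigma) for the assumed I(sigma) = kappa*(sigma-1)^2/(sigma_bar-sigma)."""
    x, y = sigma - 1, sigma_bar - sigma
    return kappa * (2 * x * y + x * x) / (y * y)


def optimal_sigma(n, tau=1.05, sigma_bar=1.08, kappa=0.05, steps=8000):
    """First grid point where marginal benefit no longer exceeds marginal cost.

    The assumed tau and sigma_bar keep every (n, tau, sigma) with 2 <= n <= 8
    inside the region where both alphas are positive.
    """
    for s in range(1, steps):
        sigma = 1 + (sigma_bar - 1) * s / steps
        if marginal_benefit(sigma, n, tau) <= marginal_cost(sigma, sigma_bar, kappa):
            return sigma
    raise ValueError("no interior solution found on the grid")


if __name__ == "__main__":
    for n in range(2, 9):
        print(n, round(optimal_sigma(n), 4))
```

Under these assumed values the computed $\sigma$ should fall as $n_i$ rises from 2 to 8, which is the pattern Proposition 4 and Figure 2.4 describe for industries with trade cost below $\hat{\tau}$.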
2.4 Manufacturing Productivities in Canada Over the Last Decade

When the home country experiences an appreciation, the model developed in sections 2 and 3 offers the following two key predictions. First, in general appreciations provide incentives for firms to improve productivity. Second, among industries with low trade costs, the highly concentrated ones will implement bigger improvements to productivity, as the profits of firms with a bigger market share will be more responsive to changes in productivity. Industries with high trade costs will have no incentive to improve productivity regardless of the concentration level, as the high trade cost will limit competition from foreign industries.¹²

In the model I assume that in the country that experiences a depreciation, productivity will not respond to the depreciation. The assumption is needed to simplify the analysis when industries in the other country are allowed to choose the level of productivity improvement. If this assumption is a reasonable approximation of firms' behaviour during a depreciation, we would see the firms' productivity fall relative to their counterparts in the other countries, as the latter group of firms have an incentive to improve productivity to counter the movement of the exchange rate.

12 It should be recognized that an important alternative mechanism can potentially give rise to similar predictions. That is, when the exchange rate appreciates, foreign capital goods and intermediate goods that embody better technology will become cheaper. Such a mechanism will predict an increase in purchases of capital or intermediate goods. Without access to detailed data on capital investment and intermediate goods trade for Canadian manufacturing industries, I am currently unable to differentiate between the two hypotheses in the empirical section.
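The first prediction works through the first-period profit loss (2.27), which is bounded by the status quo profit and therefore shrinks as the appreciation deepens. A minimal Python sketch of this mechanism follows. It uses the per-firm profit $1/(n_i + n_i\alpha)^2$ from (2.16) and (2.20) with $c_h = c_f = 1$. The parameter values ($n_i = 2$, $\tau_i = 1.1$, $\gamma = 1.1$) and the function names are hypothetical and chosen only for illustration.

```python
def cournot_profit(n, own_cost, rival_cost):
    """Per-firm profit 1/(n + n*alpha)^2, with alpha as in (2.10).

    Costs are expressed in the currency of the market being served.
    Returns 0.0 when the firm's own output would not be positive.
    """
    num = n * own_cost - (n - 1) * rival_cost   # positive iff rivals produce
    den = n * rival_cost - (n - 1) * own_cost   # positive iff own firms produce
    if den <= 0:
        return 0.0
    if num <= 0:
        raise ValueError("rival firms are priced out; (2.16) does not apply")
    alpha = num / den
    return 1.0 / (n + n * alpha) ** 2


def first_period_profit(e1, n, tau, gamma):
    """Home firm's period 1 profit in home currency, home plus foreign market."""
    home = cournot_profit(n, gamma, e1 * tau)
    # Foreign-market profit is earned in foreign currency and converted at e1.
    foreign = e1 * cournot_profit(n, tau * gamma / e1, 1.0)
    return home + foreign


def first_period_loss(e1, n=2, tau=1.1, gamma=1.1):
    """|L1|: status quo profit (gamma = 1) minus profit when adopting."""
    return first_period_profit(e1, n, tau, 1.0) - first_period_profit(e1, n, tau, gamma)


if __name__ == "__main__":
    for e1 in (1.0, 0.9, 0.8, 0.7, 0.6, 0.52):
        print(e1, round(first_period_loss(e1), 5))
```

With these assumed values the loss at $e_1 = 0.52$ is a small fraction of the loss at $e_1 = 1$, so a given second-period gain net of $I(\sigma)$ is more likely to cover it after a large appreciation.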
To test the predictions of the model, I analyze how the productivity of Canadian manufacturing industries responded between 1997 and 2006 to the interactions between exchange rate movements, trade costs, and concentration. There are a few advantages to using Canadian manufacturing data. First, Canada is a highly open economy, and its manufacturing industries are exposed to a substantial amount of trade. In particular, because of the Free Trade Agreement with the US, Canada's main trading partner, we may consider that the trade costs of Canadian industries reflect mostly exogenous factors. Second, during the sample period the Canadian dollar experienced first a moderate depreciation and then a major appreciation. Since there is evidence (see for instance Maier and DePratto (2007)) that the recent exchange rate movements are partly driven by movements in commodity prices, it is reasonable to suggest that the movements are exogenous to manufacturing industries. Although the productivity of manufacturing industries may contribute to the movements in exchange rates, such effects are likely to be dominated by the commodity factor. Third, since both Canada and the US have adopted the North American Industry Classification System (NAICS), I am able to use productivity growth in the US manufacturing industries to control for some of the unobserved industry characteristics. Among others, this would capture technological spillovers from US industries.

2.4.1 Specification and Data

The sample used in this study consists of annual data on 237 6-digit NAICS Canadian manufacturing industries from 1997 to 2006.¹³ The data sources are the Annual Survey of Manufacturers (ASM) published by Statistics Canada, the Canadian Socioeconomic Information Management (CANSIM) Database, the Bank of Canada, the Annual Survey of Manufactures (ASM) published by the US Census Bureau, and the Basic Economics database (DRI/McGraw-Hill).
The specification is

$$\begin{aligned} d\ln(\text{productivity})_{it} ={}& \beta_0 + \beta_1\cdot d\ln(\text{exchange rate})_{t-1} + \beta_2\cdot \text{concentration}_{it-1}\\ &+ \beta_3\cdot d\ln(\text{exchange rate})_{t-1}\cdot \text{concentration}_{it-1}\\ &+ \beta_4\cdot d\ln(\text{exchange rate})_{t-1}\cdot \text{concentration}_{it-1}\cdot \text{Trade Dummy}_{i}\\ &+ \beta_5\cdot(\text{other controls}) + u_i + \epsilon_{it}\end{aligned} \tag{2.34}$$

where $i$ is the index for industries and $t$ for years. $u_i$ is the industry-specific effect and $\epsilon_{it}$ is the error term, assumed to be i.i.d. across industries and time.

13 The total number of 6-digit NAICS manufacturing industries is 262. 25 industries are missing from the sample.

In the specification, I use the last-period exchange rate movement as a regressor. In the model, at the beginning of the first period firms decide whether to improve productivity conditional on an expected appreciation. Since exchange rate movements are highly persistent, an appreciation in the last period is a good predictor that the current-period exchange rate will stay at an appreciated level. The interaction between the exchange rate and concentration corresponds to the model prediction that, in general, there is a positive relation between market concentration and productivity growth during appreciations. The triple interaction term reflects the model prediction that, during an appreciation, the market concentration level is positively associated with productivity gains within the group of highly traded industries. Since an appreciation is defined as a decrease in the exchange rate, a negative $\beta_4$ supports the prediction.

The traditional measure of productivity, total factor productivity (TFP), is not available for Canadian manufacturing industries, as Statistics Canada does not provide the data on capital stock or investment necessary for the computation of TFP. Thus I use labour productivity instead, and the main measure is value added per production worker. In robustness checks I also explore manufacturing revenue per production worker as an alternative measure of labour productivity.
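Specification (2.34) is a panel regression with industry effects, an interaction term and a triple interaction term. The Python sketch below shows one way such a specification could be estimated, by demeaning all variables within industry (the within transformation) and solving the normal equations. It runs on purely synthetic data: the coefficient values, the lagged exchange rate series, the rule that assigns the Trade Dummy, the omission of the other controls, and the function names are all illustrative assumptions, not the data or estimation code used in this chapter.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def within_ols(y, X, groups):
    """Fixed-effects estimator: demean y and X by group, then run OLS."""
    k = len(X[0])
    sums, counts = {}, {}
    for yi, xi, g in zip(y, X, groups):
        s = sums.setdefault(g, [0.0] * (k + 1))
        s[0] += yi
        for j in range(k):
            s[j + 1] += xi[j]
        counts[g] = counts.get(g, 0) + 1
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for yi, xi, g in zip(y, X, groups):
        mean = [v / counts[g] for v in sums[g]]
        yd = yi - mean[0]
        xd = [xi[j] - mean[j + 1] for j in range(k)]
        for r in range(k):
            xty[r] += xd[r] * yd
            for c in range(k):
                xtx[r][c] += xd[r] * xd[c]
    return solve(xtx, xty)


def simulate(n_ind=200, seed=0, true_beta=(-0.2, 0.05, -0.5, -0.3)):
    """Synthetic panel; every number here is made up for illustration."""
    rng = random.Random(seed)
    # Hypothetical lagged dln(exchange rate), common to all industries.
    de = [0.05, -0.03, 0.02, -0.06, 0.04, -0.08, 0.01, 0.07, -0.05]
    y, X, groups = [], [], []
    for i in range(n_ind):
        u_i = rng.gauss(0.0, 0.02)             # industry-specific effect
        base = rng.uniform(0.2, 0.8)           # typical concentration (share)
        traded = 1.0 if i % 2 == 0 else 0.0    # hypothetical Trade Dummy
        for d in de:
            conc = base + rng.gauss(0.0, 0.05)
            x = [d, conc, d * conc, d * conc * traded]
            eps = rng.gauss(0.0, 0.002)
            y.append(u_i + sum(b * v for b, v in zip(true_beta, x)) + eps)
            X.append(x)
            groups.append(i)
    return y, X, groups


if __name__ == "__main__":
    y, X, groups = simulate()
    print([round(b, 3) for b in within_ols(y, X, groups)])
```

The four estimated coefficients correspond to $\beta_1$ through $\beta_4$ in (2.34). Because the Trade Dummy does not vary over time, it is absorbed by the industry effect and enters only through the triple interaction, which is also how it appears in (2.34).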
Value added per production worker is often used to measure labour productivity in the international trade literature, for instance in Bernard and Jensen (1999). Trefler (2004) uses “value added in production activities per hour worked by production workers” as the measure of productivity. While the analysis of Trefler (2004) is based on 3-digit SIC manufacturing industries, this chapter is based on the 6-digit NAICS classification of industries. As hours worked are not reported by Statistics Canada for the 6-digit NAICS industries, it is not possible for this chapter to use the same measure.

The measure of the exchange rate is the Canadian-dollar effective exchange rate index (CERI) created by the Bank of Canada. It is defined by the Bank of Canada as “a weighted average of bilateral exchange rates for the Canadian dollar against the currencies of Canada’s major trading partners” [14]. Since the US dollar carries a weight of 0.7618, the movement of the CERI closely mimics the movement of the Canada/US exchange rate, as shown in Figure 2.5. I deflate the CERI by the inflation rate in Canada and the weighted inflation rate of the major trading partners to obtain movements in the real exchange rate [15].

[14] These currencies are the US dollar, the European Union euro, the Japanese yen, the UK pound, the Chinese yuan, and the Mexican peso. Details can be found at http://www.bank-banque-canada.ca/en/rates/ceri.pdf.
[15] In unreported regressions, I use the Canada/US real exchange rate and find that the results are not sensitive to this treatment.

[Figure: vertical axis, nominal exchange rate (0.8 to 1.8); horizontal axis, year (1990 to 2006); series, Canada/US and trade-weighted $C.]

The solid line is the Canada/US nominal exchange rate and the dashed line is the Canadian-dollar effective exchange rate (CERI). Both are measured at annual frequency.
Note the original CERI has a base value of 100 and is deﬁned as the price of Canadian dollar in terms of the basket of foreign currencies. To make it compatible with the deﬁnition in the chapter, I divide the original CERI by 100 and take the inverse. The dashed line plots the edited CERI series. Figure 2.5: Movements of Canadian Dollar Exchange Rate Since 1990 42 Table 2.1: Means of Key Variables between 1997 and 2006 Whole period Depr(1997-2002) Appr(2002-2006) 1.4% (0.39%) 1.0% (0.53%) 2.0% (0.56%) dln(value added per production worker), US 3.0% (0.31%) 1.3% (0.42%) 5.2% (0.42%) dln(revenue per production worker), CND 1.5% (0.29%) 0.6% (0.02%) 2.5% (0.39%) dln(eﬀective exchange rate) -1.4% trade to revenue ratio 1.39 (4.9%) (0.04) 1.6% (2.6%) -5.6% 1.36 (0.05) 1.43 4-ﬁrm concentration ratio 48.3% dln(manufacturing revenue) 0.8% (0.31%) 2.3% dln(value of export) 0.9% (0.48%) dln(value of import) 2.8% dln(number of production workers) (0.05) 48.0% (0.80%) (0.45%) -1.1% (0.43%) 4.0% (0.73%) -3.0% (0.57%) (0.44%) 4.2% (0.68%) 1.1% (0.49%) -0.7% (0.33%) 1.7% (0.43%) -3.7% (0.48%) dln(R&D expenditure) 7.9% (0.36%) 12.3% 2.5% (0.79%) dln(energy per production worker) 5.2% (0.37%) 2.7% (0.49%) 8.2% (0.55%) dln(material per production worker) 1.5% (0.34%) 1.5% (0.45%) 1.6% (0.51%) establishment size 60 (0.52%) (2.53) 48.3% (4.4%) 66 (0.69%) (0.54%) (3.90) 52 (2.89) Notes: 1) The numbers are the means of 237 6-digit NAICS industries over the time period indicated, except for the case of R&D expenditure where the means are calculated from 4-digit NAICS industries. 2) “dln” denotes ﬁrst diﬀerences in log, as approximations for growth rates. 3) The numbers in the parenthesis are standard errors. 2.4. Manufacturing Productivities in Canada . . . dln(value added per production worker), CND 43 2.4. Manufacturing Productivities in Canada . . . The concentration of production in each industry is measured by the 4ﬁrm concentration ratio (CR4) reported by Statistics Canada. 
In the model firms are symmetric, so CR4 has an inverse relation with the number of firms in the industry. In reality firms differ in size, so CR4 might be a better measure of concentration than the number of firms [16]. Since data on CR4 are not available beyond 2003, I use the 2003 values for the years 2004 and 2005 [17]. Trade costs of industries are not observed, but in the model they have an inverse relation with the trade-to-sales ratio. I construct the ratio for an industry as the value of total imports plus exports divided by the manufacturing revenue of the industry between 1997 and 2006 [18].

[16] CR4 is used by MacDonald (1994) to study how the change in productivity varies with market power after an import surge. This chapter is similar in that it also studies how the effect of competition differs with cross-sectional differences in CR4.
[17] Note that CR4 enters the regression model with a one-period lag. Using 2003 values for 2004 and 2005 does not have a major impact on the results, since CR4 is stable over time (see Table 2.1) and most of the variation in CR4 comes from the cross-section. In the robustness check subsection, I show that the main results hold even if I use the 1990 CR4 values as the measure of concentration between 1997 and 2006.
[18] The construction yields a constant trade-to-revenue ratio that does not change over the years for a particular industry. I choose this ratio because I estimate a threshold regression model in which the trade-to-revenue ratio is used as a measure of trade cost. If one allows the trade-to-revenue ratio of an industry to vary across years, the industry can be classified as a high-trade-cost industry in one year and a low-trade-cost one in another, which is probably not desirable.

Other control variables included in the regressions are growth in energy per production worker, growth in material per production worker, growth in R&D expenditure, average establishment size, productivity growth in the corresponding US industry, and GDP growth in Canada and the US. Lastly, industry fixed effects and year effects are also used in most of the regressions. The inclusion of the year effects of course precludes GDP growth rates.

As mentioned before, there are no direct measures of the capital stock, its utilization variation, and changes in hours worked per worker. Including energy and material use provides a limited remedy. The model in the chapter focuses on the adoption of a known technology, and the inclusion of R&D expenditure helps to control for the improvement in productivity due to firms’ search for new technologies. However, R&D is available only for 3-digit or 4-digit NAICS industries, at a higher level of aggregation than 6-digit NAICS industries. Average establishment size is computed as the number of production workers per establishment in an industry. It is included to control for return-to-scale effects. Since it is possible for Canadian industries to benefit from technological spillovers from foreign industries, especially US industries, I include productivity growth in the corresponding US industry to capture such learning opportunities. Adding real GDP growth rates of Canada and the US controls for the effects of macroeconomic productivity and demand shocks.

Before turning to regression results, it is useful to have a brief look at a number of key variables during the depreciation sub-period (1997-2002) and the appreciation sub-period (2002-2006) in Table 2.1. We can see that during the appreciation, export growth and employment of production workers dropped. Meanwhile, Canadian manufacturing labour productivity growth, measured by both value added per production worker and manufacturing revenue per production worker, increased, although it was outpaced by US labour productivity growth.
Judging from the means reported, we cannot rule out the possibility that the higher labour productivity growth in Canada came from spillovers from the 5.2% growth in US labour productivity. It could also be the case that higher energy use per production worker contributed to the labour productivity growth. Growth in R&D expenditures and scale effects as measured by establishment sizes, on the other hand, appear to be poor explanations for the higher productivity growth in the appreciation sub-period, as the two variables were lower during the appreciation. Lastly, it is worth noting that there was little change in the average concentration ratio.

2.4.2 Main Results

I estimate all specifications with the linear model with industry fixed effects. The only complication comes from the threshold effect of trade. Depending on whether trade exceeds a threshold level, the model predicts different relations between concentration of production and productivity gains during an appreciation. The trade threshold is unknown and has to be estimated. The estimation of the threshold follows Hansen (2000) and is based on least-squares regressions. I first construct a grid of trade ratios with a step size of 0.5 of a centile and then search the grid for a threshold at which the effect of the concentration-exchange-rate interaction changes significantly. The estimated threshold is located at the 83.5th centile, which translates to a trade-to-revenue ratio of 1.89. There are 39 industries with a trade ratio above the threshold. The 95% confidence interval for the threshold is between the 76th and 92.5th centiles, or [1.63, 2.80] in terms of the trade-to-revenue ratio. Using the threshold estimate, I estimate the threshold regression model specified in (2.34). Standard errors are computed with the methods suggested in Hansen (2000).
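The grid search described above can be sketched in a few lines. The sketch is a deliberately stripped-down sample-splitting example in the spirit of Hansen (2000): for each candidate centile of a threshold variable `q` (standing in for the trade-to-revenue ratio) it fits a separate slope on a single regressor `x` (standing in for the concentration-exchange-rate interaction) below and above the candidate, and keeps the candidate with the smallest total sum of squared residuals. The function names, the 10th to 90th centile search range, the through-origin regressions, and the simulated data are all assumptions for illustration; fixed effects, other controls, and the confidence interval construction are omitted.

```python
import random

def ssr_through_origin(x, y):
    """Sum of squared residuals from regressing y on x without an intercept."""
    sxx = sum(v * v for v in x)
    if sxx == 0.0:
        return sum(v * v for v in y)
    b = sum(a * c for a, c in zip(x, y)) / sxx
    return sum((c - b * a) ** 2 for a, c in zip(x, y))

def percentile(sorted_vals, p):
    """Linearly interpolated p-th percentile (0 <= p <= 100) of sorted data."""
    pos = (len(sorted_vals) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def estimate_threshold(y, x, q, lo=10.0, hi=90.0, step=0.5):
    """Grid search over centiles of q for the split that minimises total SSR."""
    q_sorted = sorted(q)
    best = None
    n_steps = int(round((hi - lo) / step))
    for s in range(n_steps + 1):
        centile = lo + s * step
        gamma = percentile(q_sorted, centile)
        low = [(a, c) for a, c, z in zip(x, y, q) if z <= gamma]
        high = [(a, c) for a, c, z in zip(x, y, q) if z > gamma]
        ssr = (ssr_through_origin([a for a, _ in low], [c for _, c in low])
               + ssr_through_origin([a for a, _ in high], [c for _, c in high]))
        if best is None or ssr < best[0]:
            best = (ssr, centile, gamma)
    return best[1], best[2]

def simulate_split(n=400, true_centile=80.0, seed=1):
    """Noise-free toy data whose slope on x changes at a known centile of q."""
    rng = random.Random(seed)
    q = [rng.uniform(0.0, 3.0) for _ in range(n)]
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    gamma0 = percentile(sorted(q), true_centile)
    y = [(-1.5 if z > gamma0 else 0.5) * a for a, z in zip(x, q)]
    return y, x, q, gamma0
```

On the toy data from `simulate_split`, `estimate_threshold` should recover the centile at which the slope was made to change. Searching over centiles rather than raw values of the threshold variable keeps the grid evenly populated even when the trade ratio is heavily skewed, which may be one reason the text states the step size in centiles.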
Though not predicted by the theory, it is plausible that the effect of the exchange rate on labour productivity growth may also change with a trade threshold. Applying the threshold estimation method to the interaction between trade and exchange rate movement indicates there is no statistically significant threshold effect [19]. In essence, I have used the method in Hansen (2000) to guide the empirical specification. In one of the robustness checks, the interaction between the trade dummy and exchange rate movement is included to show that the key results are insensitive to its inclusion.

[19] In unreported regressions, the interactions between the trade-to-revenue ratio and other variables, such as the concentration ratio, are also included as regressors. Such interactions are always highly insignificant.

The three columns of Table 2.2 report the benchmark regression results. The specification in column (1) includes year dummies, thus precluding variables that do not vary across industries, in particular the last-period exchange rate movement. Columns (2) and (3) estimate the same specification on the appreciation and depreciation subsamples respectively.

Table 2.2: Benchmark Fixed Effect Estimations
Dependent variable: dln(productivity)

                        Full sample   Appr.         Depr.
                        1997-2006     2002-2006     1997-2002
CR4                     -0.001        -0.003        -0.001
                        (0.006)       (0.003)       (0.001)
dln(RER)·CR4            -0.009**      -0.005        -0.013
                        (0.003)       (0.004)       (0.010)
dln(RER)·CR4·TradeD     -0.005        -0.011**      0.001
                        (0.003)       (0.004)       (0.009)
dln(R&D)                0.002         -0.013        -0.018
                        (0.027)       (0.046)       (0.044)
Estab size              0.004***      0.001         0.001**
                        (0.001)       (0.001)       (0.0002)
dln(Energy)             0.199***      0.106**       0.326**
                        (0.028)       (0.040)       (0.049)
dln(Material)           0.250**       0.207**       0.276**
                        (0.028)       (0.039)       (0.050)
dln(Productivity US)    0.159**       0.223*        0.165
                        (0.074)       (0.098)       (0.151)
year dummies            included      included      included
industry fixed effects  included      included      included
R^2                     0.11          0.09          0.10
Observations            2068          906           1162
Industries              237           231           237

Notes: 1) ***, ** and * indicate significance levels of 1%, 5% and 10%. 2) “dln” denotes first differences in log, as approximations for growth rates. 3) The dependent variable is labour productivity, measured as value added per production worker. RER, CR4, TradeD, R&D, Estab size, Energy, Material, and Productivity US denote respectively the real exchange rate, the 4-firm concentration ratio, a dummy variable for highly traded industries, R&D expenditure, average establishment size, energy used per production worker, material used per production worker, and growth in value added per production worker in the corresponding US industry.

In column (1) of Table 2.2, the level of the concentration ratio is not significant, consistent with the theoretical prediction that concentration should not matter independently of the exchange rate. The interaction between concentration and exchange rate is negative and significant, with a coefficient of -0.009. This estimate implies that during a 5% appreciation [20] an industry with a concentration ratio 20 percentage points higher will experience labour productivity growth that is 0.9 percentage points higher. Since the average labour productivity growth rate between 1997 and 2006 was 1.4% and the standard deviation of the concentration ratio was 24%, this is an economically significant effect. Meanwhile, the coefficient on the triple interaction of exchange rate, concentration and trade dummy is -0.005, which is economically large but not statistically significant. The growth rate of R&D expenditure appears to have had no effect on labour productivity growth. While establishment size did have an impact on labour productivity growth, the magnitude was small: a coefficient of 0.004 means that an increase in establishment size of 100 workers raised labour productivity growth by only 0.04 percentage points [21].

[20] Note again that an appreciation is defined as a decrease in the exchange rate.
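The magnitudes quoted for column (1) can be checked with a line of arithmetic. The sketch below assumes that growth rates and CR4 enter the regression in percentage points, which is inferred from the reported numbers rather than stated in the text, and that establishment size is measured in units of 10 workers, as the footnote on that variable indicates.

```latex
\begin{align*}
\text{Concentration effect:}\quad
  & \hat\beta_{d\ln(RER)\cdot CR4} \times d\ln(RER) \times \Delta CR4
    = (-0.009) \times (-5) \times 20 = 0.9 \\
\text{Establishment size effect:}\quad
  & \hat\beta_{Estab\ size} \times \frac{100\ \text{workers}}{10\ \text{workers per unit}}
    = 0.004 \times 10 = 0.04
\end{align*}
```

The two negative signs in the first line cancel because an appreciation is a decrease in the exchange rate, so a negative interaction coefficient translates into higher productivity growth for more concentrated industries.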
The coefficients on the energy and material variables suggest that the energy and material elasticities of productivity are 0.199 and 0.250 respectively. Both are highly significant. Lastly, labour productivity growth in Canadian industries was positively correlated with growth in the US. A 1% increase in productivity in a US industry is associated with a 0.159% increase in the corresponding Canadian industry.

[21] The unit of measurement for establishment size is scaled up to 10 workers to facilitate the presentation of results, i.e. to avoid many fractions with four digits after the decimal point.

Column (2) is estimated with the subsample between 2002 and 2006, i.e. the appreciation period, while column (3) is estimated with the subsample of the depreciation period. The discussion focuses on the interaction terms, as the estimates of the other coefficients are similar to those in column (1). In column (2), the interaction between concentration and exchange rate becomes insignificant while the triple interaction term becomes significant. A coefficient of -0.011 on the triple interaction term implies that during a 5% appreciation an industry with a concentration ratio 20 percentage points higher will experience labour productivity growth that is 1.1 percentage points higher. The estimates are more in line with the predictions of the theory, i.e. we expect to see a positive correlation between concentration and labour productivity growth only for the high-trade industries. On the other hand, the estimation on the depreciation subsample indicates no threshold effect, and the effect of the concentration-exchange-rate interaction is large but not statistically significant.

It is worth noting that most of the variation in the concentration ratio comes from the cross-section rather than from the time dimension. Over the sample period, 98% of the variance in concentration is accounted for by the variance in the industry average concentration ratio.
That is, within most industries, concentration levels changed very little. Therefore, in interpreting the results, we can roughly view the concentration level as fixed over time and regard the regression coefficients on the concentration-exchange-rate interactions as reflecting the different effects of exchange rate movements on industries with different pre-determined concentration levels.

2.4.3 Robustness Checks

In this subsection, I conduct several robustness checks. Table 2.3 reports the results with alternative dependent variables. The dependent variable in columns (1) through (3) is the difference between Canadian and US labour productivity growth rates. Adopting this dependent variable is equivalent to imposing the restriction that the coefficient on US productivity growth is 1 in the regressions in Table 2.2. A careful comparison between the first three columns of Table 2.3 and Table 2.2 suggests they are very similar. In the last three columns, the dependent variable is manufacturing revenue per production worker, arguably a poorer measure of labour productivity as it does not account for the costs of other inputs. Although the overall fit of the three regressions is much better, we can only find a weak relation between concentration and labour productivity growth, and there is no evidence of a threshold effect.

In the baseline estimations, I look at the effect of the exchange rate change between year $t-1$ and $t$ on productivity growth between $t$ and $t+1$. Since firms may consider exchange rate changes over a longer period in the past when making decisions, and the change in productivity may also materialize over a longer period, I also estimate equations with alternative assumptions about the length of periods. In Table 2.4, the first three columns present the effects of the exchange rate change between $t-2$ and $t$ on productivity between $t$ and $t+2$.
The last three columns show the effects of the exchange rate change between $t-3$ and $t$ on productivity between $t$ and $t+1$. While there are some changes in parameter estimates, the coefficients on the triple interaction term for the appreciation period are very similar to the benchmarks in Table 2.2.

After a major appreciation, productivity can improve because firms upgrade their technologies, as suggested in this chapter. However, productivity increases can also result from exits of less efficient firms. In Table 2.5, I present results from specifications augmented by the change in the number of establishments. We can see that the coefficients on the interaction terms are similar to the benchmarks. However, adding the change in the number of establishments is a crude way to control for the effect of entries and exits. Ideally one should control for the size of entrants and exiting firms, but these data have not been publicly available.

Table 2.3: Alternative Dependent Variables

                        Full sample  Appr.        Depr.        Full sample  Appr.        Depr.
                        1997-2006    2002-2006    1997-2002    1997-2006    2002-2006    1997-2002
                        (1)          (2)          (3)          (4)          (5)          (6)
CR4                     -0.001       -0.002       -0.001       -0.001**     -0.001       -0.001
                        (0.001)      (0.001)      (0.003)      (0.001)      (0.0003)     (0.001)
dln(RER)·CR4            -0.006*      -0.002       -0.014       -0.003*      -0.002       -0.002
                        (0.004)      (0.003)      (0.011)      (0.002)      (0.003)      (0.007)
dln(RER)·CR4·TradeD     -0.006*      -0.013***    -0.001       -0.001       -0.002       0.003
                        (0.004)      (0.005)      (0.010)      (0.002)      (0.002)      (0.006)
dln(R&D)                0.056**      0.073        0.020        0.002        0.040**      -0.043
                        (0.027)      (0.047)      (0.044)      (0.015)      (0.019)      (0.029)
Estab size              0.005***     0.003        0.008***     0.001*       0.002        0.002*
                        (0.001)      (0.003)      (0.002)      (0.0001)     (0.002)      (0.001)
dln(Energy)             0.207***     0.092**      0.352***     0.118***     0.139***     0.102***
                        (0.029)      (0.042)      (0.050)      (0.015)      (0.017)      (0.032)
dln(Material)           0.238***     0.211***     0.252***     0.584***     0.536***     0.637***
                        (0.029)      (0.040)      (0.050)      (0.079)      (0.016)      (0.032)
dln(Productivity US)    -            -            -            0.079*       0.165***     -0.015
                                                               (0.041)      (0.041)      (0.099)
year dummies            included     included     included     included     included     included
industry fixed effects  included     included     included     included     included     included
R^2                     0.09         0.06         0.09         0.05         0.06         0.04
Observations            2068         906          1162         2068         906          1162
Industries              237          231          237          237          231          237

Notes: 1) ***, ** and * indicate significance levels of 1%, 5% and 10%. 2) “dln” denotes first differences in log, as approximations for growth rates. 3) The dependent variable in the first three columns is the difference between the Canadian and US growth rates of value added per production worker. 4) The dependent variable in columns (4) through (6) is manufacturing revenue per production worker. 5) RER, CR4, TradeD, R&D, Estab size, Energy, Material, and Productivity US denote respectively the real exchange rate, the 4-firm concentration ratio, a dummy variable for highly traded industries, R&D expenditure, average establishment size, energy used per production worker, material used per production worker, and growth in value added per production worker in the corresponding US industry.

Table 2.4: Alternative Specification of Lags
Dependent variable: dln(productivity)

                        Full sample  Appr.        Depr.        Full sample  Appr.        Depr.
                        1997-2006    2002-2006    1997-2002    1997-2006    2002-2006    1997-2002
                        (1)          (2)          (3)          (4)          (5)          (6)
CR4                     -0.001       -0.007*      -0.001       -0.001       -0.003       -0.001
                        (0.001)      (0.004)      (0.001)      (0.001)      (0.003)      (0.001)
dln(RER)·CR4            -0.005       0.002        -0.011       0.001        0.002        -0.005
                        (0.003)      (0.004)      (0.009)      (0.002)      (0.002)      (0.014)
dln(RER)·CR4·TradeD     -0.005       -0.011***    0.013        -0.004**     -0.008***    0.019*
                        (0.003)      (0.004)      (0.008)      (0.002)      (0.002)      (0.013)
dln(R&D)                0.003        0.073        -0.038       0.008        -0.015       0.023
                        (0.024)      (0.042)      (0.043)      (0.017)      (0.028)      (0.041)
Estab size              0.001***     0.001***     0.002**      0.001***     0.003**      -0.001
                        (0.0002)     (0.001)      (0.0002)     (0.0001)     (0.001)      (0.0002)
dln(Energy)             -0.022       -0.014       -0.067       0.057***     0.037        0.161***
                        (0.031)      (0.046)      (0.054)      (0.022)      (0.030)      (0.049)
dln(Material)           0.103***     -0.033       0.191***     0.089***     0.010        0.197***
                        (0.031)      (0.047)      (0.053)      (0.021)      (0.031)      (0.048)
dln(Productivity US)    0.299***     0.408***     0.441**      0.188**      0.253***     0.200
                        (0.097)      (0.125)      (0.076)      (0.097)      (0.160)
year dummies            included     included     included     included     included     included
industry fixed effects  included     included     included     included     included     included
R^2                     0.05         0.06         0.09         0.06         0.09         0.04
Observations            1818         903          915          2048         906          1162
Industries              237          233          235          239          231          237

Notes: 1) ***, ** and * indicate significance levels of 1%, 5% and 10%. 2) “dln” denotes first differences in log, as approximations for growth rates. 3) The dependent variable in the first three columns is the productivity growth rate in Canada between year $t$ and $t+2$. All independent variables are also measured between $t$ and $t+2$, except that RER measures the exchange rate change between $t-2$ and $t$. 4) In columns (4) through (6), the dependent variable and all independent variables are measured between year $t$ and $t+1$, except that RER measures the exchange rate change between $t-3$ and $t$. 5) RER, CR4, TradeD, R&D, Estab size, Energy, Material, and Productivity US denote respectively the real exchange rate, the 4-firm concentration ratio, a dummy variable for highly traded industries, R&D expenditure, average establishment size, energy used per production worker, material used per production worker, and growth in value added per production worker in the corresponding US industry.

Table 2.5: Effects of Entry and Exit of Establishments
Dependent variable: dln(labour productivity, Canada)

                        Whole sample  Appr.         Depr.
                        1997-2006     2002-2006     1997-2002
                        (1)           (2)           (3)
CR4                     -0.001        -0.003        -0.001
                        (0.001)       (0.003)       (0.001)
dln(RER)·CR4            0.005         -0.006        -0.012
                        (0.003)       (0.004)       (0.010)
dln(RER)·CR4·TradeD     -0.005        -0.011***     -0.005
                        (0.003)       (0.004)       (0.009)
dln(R&D)                0.011         -0.012        0.002
                        (0.027)       (0.046)       (0.043)
Estab size              0.001***      0.001***      0.001
                        (0.0001)      (0.001)       (0.002)
dln(Energy)             0.197***      0.100**       0.319***
                        (0.028)       (0.040)       (0.048)
dln(Material)           0.242***      0.216***      0.244***
                        (0.028)       (0.039)       (0.049)
dln(Productivity, US)   -0.854***     -0.785***     -0.812***
                        (0.075)       (0.093)       (0.156)
dln(Establishments)     -0.005        0.062**       -0.107***
                        (0.020)       (0.025)       (0.039)
year dummies            excluded      excluded      included
industry fixed effects  included      included      included
R^2                     0.16          0.15          0.13
Observations            2068          906           1162
Industries              237           231           237

Notes: 1) ***, ** and * indicate significance levels of 1%, 5% and 10%. 2) “dln” denotes first differences in log, as approximations for growth rates. 3) The dependent variable is labour productivity, measured as value added per production worker. RER, CR4, TradeD, R&D, Estab size, Energy, Material, Productivity US, and Establishments denote respectively the real exchange rate, the 4-firm concentration ratio, a dummy variable for highly traded industries, R&D expenditure, average establishment size, energy used per production worker, material used per production worker, growth in value added per production worker in the corresponding US industry, and the number of establishments.

In columns (1) and (2) of Table 2.6, I allow for an interaction between the trade dummy and exchange rate movement, with the triple interaction absent in column (1). This interaction is not always significant. In column (1) we see a significant effect of the concentration-exchange-rate interaction, and in column (2) there is a threshold effect, significant at the 10% level.
Lastly, it is reasonable to suspect that the concentration ratio may affect labour productivity growth one period later through channels other than its interaction with the exchange rate. For example, the consolidation of firms in the current period can raise concentration, and the resulting synergy can lead to productivity gains in future periods. To show that this channel is unlikely to drive the results, I interact CR4 in 1990 with exchange rate movements and trade between 1997 and 2006. In this case, only the lagged cross-sectional variation in CR4 is used in estimation. The results are reported in column (3) of Table 2.6. We can still see a positive relation between concentration and labour productivity growth, and a trade threshold effect, although the interaction terms are only significant at the 10% level.

On balance, the evidence suggests that the appreciation provided an incentive for Canadian manufacturing industries to improve productivity. In particular, highly concentrated industries experienced higher labour productivity growth during the appreciation. On the other hand, the theoretical model does not offer a direct prediction for periods of depreciation, and the evidence during the 1997-2002 sub-period is inconclusive. The lack of productivity responses during the depreciation sub-period could be because the depreciation between 1997 and 2002 was too moderate to trigger responses from competitors of Canadian firms.
Table 2.6: Other Robustness Checks
Dependent variable: dln(labour productivity, Canada)

                        Whole sample  Whole sample  Whole sample
                        1997-2006     1997-2006     1997-2006
                        (1)           (2)           (3)
CR4                     -0.001        -0.001        -
                        (0.001)       (0.001)
dln(RER)·CR4            -0.010***     -0.006        -0.007*
                        (0.003)       (0.004)       (0.004)
dln(RER)·TradeD         -0.153        0.555         -
                        (0.191)       (0.444)
dln(RER)·CR4·TradeD     -             -0.014*       -0.006*
                                      (0.008)       (0.003)
dln(R&D)                0.002         0.002         -0.009
                        (0.027)       (0.027)       (0.028)
Estab size              0.004***      0.004***      0.004***
                        (0.001)       (0.001)       (0.001)
dln(Energy)             0.200***      0.198***      0.167***
                        (0.028)       (0.028)       (0.028)
dln(Material)           0.250***      0.250***      0.269***
                        (0.028)       (0.028)       (0.030)
dln(Productivity, US)   0.158**       0.157**       0.162**
                        (0.074)       (0.074)       (0.076)
year dummies            excluded      excluded      included
industry fixed effects  included      included      included
R^2                     0.11          0.11          0.10
Observations            2068          2068          1987
Industries              237           237           224

Notes: 1) ***, ** and * indicate significance levels of 1%, 5% and 10%. 2) “dln” denotes first differences in log, as approximations for growth rates. 3) The dependent variable is labour productivity, measured as value added per production worker. RER, CR4, TradeD, R&D, Estab size, Energy, Material, and Productivity US denote respectively the real exchange rate, the 4-firm concentration ratio, a dummy variable for highly traded industries, R&D expenditure, average establishment size, energy used per production worker, material used per production worker, and growth in value added per production worker in the corresponding US industry.

2.5 Conclusion

This chapter is motivated by the question of how productivity responds to major real exchange rate movements. Drawing on observations of disruptive technological changes documented in Holmes et al. (2008), I have built a partial equilibrium model to clarify how the productivity responses of industries vary with trade costs and market concentration during an appreciation.
Similar to results in the previous literature, I find that competitive pressure resulting from appreciations increases incentives to improve productivity, as the appreciation lowers the profit loss during costly transitions. Meanwhile, higher trade costs reduce the incentives by diminishing the competitive pressure of appreciations. In addition, this chapter contributes to the theoretical literature by studying the intensity of technology adoption, suggesting a positive relation between market concentration and the intensity of adoption. Firms in highly concentrated industries invest more in productivity improvements, as their marginal benefits from adopting better technologies are greater.

Empirical analysis of 237 6-digit Canadian manufacturing industries between 1997 and 2006 supports the theoretical model’s predictions. During the appreciation period between 2002 and 2006, labour productivity growth was on average higher after controlling for industry fixed effects and for growth in energy use, material use, R&D expenditure, productivity in the corresponding US industries, and GDP in Canada and the US. Highly concentrated industries experienced higher productivity growth, conditional on their exposure to a substantial amount of trade. The theoretical model does not offer predictions for the productivity response to depreciations, and during the depreciation period between 1997 and 2002 there is little empirical evidence that labour productivity growth was correlated with exchange rate movements or concentration. The empirical analysis is the first to study the productivity response of surviving firms to real exchange rate movements, and it also adds to the evidence of a positive relationship between competitive pressure and productivity improvement.

A logical next step would be to investigate firm-level data. The theoretical model of this chapter conjectures about firm behaviour, and the empirics test the implications at the industry level.
Although industry-level evidence suggests that adjustments have been made to counteract an appreciation, it is only natural to ask what exactly these firms did.

Bibliography

Baily, Martin Neil, Hans Gersbach, F. M. Scherer, and Frank R. Lichtenberg, “Efficiency in Manufacturing and the Need for Global Competition,” Brookings Papers on Economic Activity. Microeconomics, 1995, 1995, 307–358.
Bernard, Andrew B. and J. Bradford Jensen, “Exceptional exporter performance: cause, effect, or both?,” Journal of International Economics, February 1999, 47 (1), 1–25.
Brander, James and Paul Krugman, “A ‘reciprocal dumping’ model of international trade,” Journal of International Economics, November 1983, 15 (3-4), 313–321.
Fung, Loretta, “Large real exchange rate movements, firm dynamics, and productivity growth,” Canadian Journal of Economics, May 2008, 41 (2), 391–424.
Galdon-Sanchez, Jose E. and James A. Schmitz Jr., “Competitive Pressure and Labor Productivity: World Iron-Ore Markets in the 1980’s,” The American Economic Review, 2002, 92 (4), 1222–1235.
Hansen, Bruce E., “Sample Splitting and Threshold Estimation,” Econometrica, May 2000, 68 (3), 575–604.
Harris, Richard G., “Is There a Case for Exchange-Rate-Induced Productivity Changes,” Canadian Institute for Advanced Research, 2001, Working Paper No. 164.
Hart, Oliver D., “The Market Mechanism as an Incentive Scheme,” The Bell Journal of Economics, 1983, 14 (2), 366–382.
Holmes, Thomas J., David K. Levine, and James A. Schmitz, “Monopoly and the Incentive to Innovate When Adoption Involves Switchover Disruptions,” NBER Working Paper, 2008, No. W13864.
Jones, Charles I., Introduction to Economic Growth, 2nd edition, W. W. Norton, 2001.
MacDonald, James M., “Does Import Competition Force Efficient Production?,” The Review of Economics and Statistics, 1994, 76 (4), 721–727.
Maier, Philipp and Brian DePratto, “The Canadian dollar and commodity prices: Has the relationship changed over time?,” Bank of Canada Discussion Paper Series, November 2007.
Nickell, Stephen J., “Competition and Corporate Performance,” The Journal of Political Economy, 1996, 104 (4), 724–746.
Porter, Michael E., The Competitive Advantage of Nations, New York: Free Press, 1990.
Raith, Michael, “Competition, Risk, and Managerial Incentives,” The American Economic Review, 2003, 93 (4), 1425–1436.
Schmitz, James A., “What Determines Productivity? Lessons from the Dramatic Recovery of the U.S. and Canadian Iron Ore Industries Following Their Early 1980s Crisis,” Journal of Political Economy, June 2005, 113 (3), 582–625.
Symeonidis, George, “The Effect of Competition on Wages and Productivity: Evidence from the United Kingdom,” Review of Economics and Statistics, 2008, 90 (1), 134–146.
Syverson, Chad, “Market Structure and Productivity: A Concrete Example,” Journal of Political Economy, 2004, 112 (6), 1181–1222.
Tang, Yao, “Exchange Rate Appreciation and Productivity,” Mimeo, May 2008.
Trefler, Daniel, “The Long and Short of the Canada-U.S. Free Trade Agreement,” American Economic Review, September 2004, 94 (4), 870–895.
Vives, Xavier, “Innovation and Competitive Pressure,” Journal of Industrial Economics, forthcoming.

Chapter 3

Comparison of Misspecified Calibrated Models: The Minimum Distance Approach²²

²² A version of this chapter has been submitted for publication. Hnatkovska, V., Marmer, V. and Tang, Y., “Comparison of Misspecified Calibrated Models: The Minimum Distance Approach”.

3.1 Introduction

This chapter presents a method for the comparison of calibrated models.
While calibration is now an essential tool of quantitative analysis in macroeconomics, there is, surprisingly, no generally accepted definition of calibration; it is instead viewed as a research style characterized by a certain attitude toward modelling, assigning parameter values, and model assessment (Kim and Pagan, 1995). A number of authors define calibration as a sequence of steps allowing one to reduce the general theoretical framework to a quantitative relationship between variables. For instance, Cooley and Prescott (1995) outline three such steps: imposing parametric restrictions; constructing a set of measurements consistent with the parametric class of models; and assigning values to the model parameters. Canova and Ortega (1996) adopt a broader definition of calibration by including model evaluation in the list of steps.

The calibration approach takes an explicitly instrumental view of economic models: a calibrationist acknowledges that the model is false and will be rejected by the data (Canova, 1994). The objective of the calibrationist is not to assess whether the model of interest is true, but rather to determine which features of the data it can be used to capture. Furthermore, a calibrationist may be interested in learning which of the competing but “false” models provides a better fit to the data.

In a typical calibration exercise, the calibrationist selects values for the parameters in order to match some characteristics of the observed data with those implied by the theoretical model. For example, a model can be calibrated to match empirical moments, cross-correlations, impulse responses, and stylized facts. Such characteristics will be referred to as the properties of a reduced-form model or the reduced-form parameters, since they can be consistently estimated from the data regardless of the true data generating process (DGP).
Calibrated parameters can be obtained using informal moment matching, the generalized method of moments (GMM), the simulated method of moments (SMM), or maximum likelihood (ML) estimation (Kim and Pagan, 1995). Calibration was also formalized as an example of minimum distance estimation in Gregory and Smith (1990, 1993). If the structural model is correctly specified, the calibrated parameters are consistent and asymptotically normal estimators of the structural (or deep) parameters, and statistical inference can be performed using the standard asymptotic results (see, for example, Newey and McFadden (1994)). However, if the structural model is misspecified, the asymptotic distribution of the calibrated parameters has to be corrected for misspecification.

In this chapter, we explicitly consider the case of misspecified structural models. Our methodology uses a classical minimum distance (CMD) estimation procedure to calibrate model parameters. We then show that, under some regularity conditions, the calibrated parameters converge in probability to the values of the structural parameters that minimize the distance between the population characteristics of the data and those implied by the structural model (pseudo-true values). Further, the CMD estimator is asymptotically normal; however, due to misspecification, some adjustments to the asymptotic variance matrix are required. Gallant and White (1988) and Hall and Inoue (2003) established such results for GMM estimators.

After choosing parameter values, the calibration exercise continues with evaluation of the structural model. This is usually done by comparing model-implied reduced-form characteristics with those of the actual data (Gregory and Smith, 1991, 1993; Cogley and Nason, 1995; Kim and Pagan, 1995).
However, according to the calibrationist's approach, while evaluating a model, one should keep in mind that it is only an approximation and therefore should not be regarded as a null hypothesis to be statistically tested (Prescott, 1991).

There is a large literature in econometrics that considers misspecified models. For example, Watson (1993) and Diebold et al. (1998) propose measures of fit for calibrated models that take into account possible misspecification. Many papers also advocate evaluating a misspecified structural model against another misspecified benchmark model (Diebold and Mariano, 1995; West, 1996; Schorfheide, 2000; White, 2000; Corradi and Swanson, 2007).

In this chapter, we compare misspecified models by means of an asymptotic test. Under the null hypothesis of this test, the two misspecified models provide an equivalent approximation to the data in terms of the characteristics of the reduced-form model. Our approach is related most closely to Vuong (1989) and Rivers and Vuong (2002). Vuong (1989) proposed such a test in the maximum likelihood framework, and Rivers and Vuong (2002) discussed it in a more general setting allowing for a broad class of lack-of-fit criteria, including that of GMM.

The contribution of our chapter relative to Rivers and Vuong (2002) is threefold. First, Rivers and Vuong (2002) focused solely on non-nested models. Our CMD framework allows us to analyze both non-nested and nested cases. The nested case is particularly important because, if the null of model equivalence is not rejected, one can replace the bigger model with a more parsimonious one. Furthermore, many hypotheses can be expressed in terms of parameter restrictions and thus fall into the nested category.²³ The nested case also differs from the non-nested case in terms of the asymptotic null distribution.
Rivers and Vuong (2002) show that in the non-nested case the difference between the sample lack-of-fit criteria of the two models is asymptotically normal. We derive a similar result for the non-nested case, but also show that our test statistic has a mixed $\chi^2$ distribution in the nested case, similarly to Vuong (1989).²⁴

Second, we analyze the situation where the models are estimated using one set of reduced-form characteristics and are compared using another. This is a very common approach in the calibration literature. For instance, a structural model can be estimated to match the first moments of the data, and evaluated in terms of its ability to match the second moments. In this case, we show that the asymptotic null distribution is always normal, regardless of whether the models are nested or non-nested. This fact substantially simplifies the testing procedure in such situations. The reason for asymptotic normality in the case of nested models is that there is no selection criteria minimization when the models are estimated and evaluated on different sets of reduced-form parameters.

²³ Note, however, that in our framework the nested case does not necessarily correspond to restrictions on structural or deep parameters.

²⁴ Also, by considering the framework of MD estimation, we provide more specific assumptions and asymptotic results than in Rivers and Vuong (2002); and while, as a result, our treatment of the problem is less general than in Rivers and Vuong (2002), it covers the important case of calibration.

Third, we address the issue of choosing weights for reduced-form characteristics when comparing models. When models are misspecified, the pseudo-true values of their parameters and the ranking of the models depend on the choice of the weighting scheme. In particular, the null hypothesis changes when one applies different weights (see, for example, Hall and Inoue (2003) and Hall and Pelletier (2007)).
In this chapter, we relax the dependence of the model ranking on the choice of the weighting scheme by suggesting procedures that take into account the models' relative performance for various choices of the weight matrix. We propose averaged and sup procedures for model comparison. The averaged test corresponds to the null hypothesis that the two models have equal lack-of-fit on average. The null hypothesis of the sup test says that one model cannot outperform another for any choice of the weighting matrix. We also propose a simple procedure for constructing confidence sets for the weighting schemes favorable for one of the models.

The problem of comparison of misspecified models should be distinguished from non-nested hypothesis testing problems (Davidson and MacKinnon, 1981; MacKinnon, 1983; Smith, 1992). Suppose that the two alternative models are non-nested and therefore cannot both be true at the same time. According to our model comparison null hypothesis, the models have equal measures of fit and, consequently, the null hypothesis implies that they are both misspecified. However, in the literature on non-nested hypothesis testing, the null hypothesis is that one of the models is true. Thus, the two approaches, non-nested testing and model comparison testing of misspecified models in the spirit of Vuong (1989), are not competing but rather complementary. The first approach can be used in a search for the true specification, while the latter approach can be adopted when the econometrician believes that all alternative models are misspecified or when they have all been rejected by overidentifying restrictions or non-nested tests.

Comparison of misspecified calibrated models has also been studied from the Bayesian perspective by Schorfheide (2000). Our method can be viewed as a frequentist counterpart of the Schorfheide (2000) procedure.²⁵ Corradi and Swanson (2007) designed a Kolmogorov-type test for comparison of misspecified calibrated models.
In their paper, the models are compared in terms of the distances between the historical empirical cumulative distribution function (CDF) and the CDFs implied by the models. Thus, their approach is similar to that of Vuong (1989) and Kitamura (2000), who use the Kullback-Leibler Information Criterion.²⁶ We, on the other hand, focus on the ability of a model to approximate some reduced-form characteristics that do not require knowledge of the CDF. Recall that when the models are misspecified, different measures can lead to different rankings of the models.

²⁵ While in Schorfheide (2000) a structural model that achieves the lowest average posterior loss is selected, we follow the approach of Vuong (1989) and suggest a test for the null hypothesis that the two models have equal losses.

²⁶ Under the approach of Corradi and Swanson (2007), the model description must include the assumptions that allow one to simulate the data. Our approach allows us to compare and evaluate structural models that do not necessarily provide a complete distribution for the data.

The issue of misspecified calibrated models was also addressed by Dridi et al. (2007) using indirect inference.²⁷ The main focus of their paper is consistent estimation of some deep parameters when the model is misspecified with respect to some nuisance parameters. They also emphasize the necessity of correcting asymptotic variance formulas when there is a possibility of misspecification; such corrections are discussed in our chapter for the CMD estimators. In a recent paper, Kan and Robotti (2008) use the Hansen-Jagannathan distance in a Vuong-type test to compare potentially misspecified asset pricing models.²⁸

We apply our methodology to compare two standard monetary business cycle models. The first model is a cash-in-advance (CIA) model, while the second model is the Lucas (1990) and Fuerst (1992) model with portfolio adjustment costs (PAC).
The two models have the same underlying structure except for the information sets that agents possess when making their decisions. In particular, we assume that the portfolio decisions must be made before the current period shocks are realized. We judge the performance of the models based on their ability to replicate the dynamics of the business cycles in the US. As our comparison criteria or reduced-form characteristics, we use the response of inflation and the growth rate of output to an unanticipated monetary shock. A structural vector autoregression (SVAR) is employed to obtain model-free estimates of the impulse responses against which we judge the performance of the two structural models. The structural shocks are identified using the Blanchard and Quah (1989) decomposition under the restriction of the long-run monetary neutrality of output. According to our results, the null hypothesis that the two models have the same lack of fit cannot be rejected on the basis of equally weighted twenty-period output and inflation impulse responses. We conclude that the assumed rigidity in portfolio choice and the adjustment costs of the PAC model do not play a significant role in approximating inflation and output impulse response dynamics.

²⁷ As a matter of fact, our method can be viewed as an example of indirect inference without simulations.

²⁸ The Hansen-Jagannathan distance uses the second moments of returns as weights for the vector of pricing errors derived from the model.

The chapter proceeds as follows. Section 3.2 introduces the framework. Section 3.3 describes the asymptotic properties of CMD estimators under misspecification. Section 3.4 suggests a QLR-type statistic for model comparison. We discuss the distribution of the suggested statistic in the cases of nested, strictly non-nested and overlapping models.
In Section 3.5, we consider the situation when a model is estimated using one set of reduced-form parameters and evaluated with respect to another. Section 3.6 discusses the averaged and sup tests, and confidence sets for weighting schemes. Section 3.7 illustrates the technique with an empirical application. All proofs are in the Appendix.

3.2 Definitions

This section formally defines calibration as CMD estimation and introduces the framework for comparison of two calibrated models. The definition of calibration is similar to that of Gregory and Smith (1990); it is viewed as an example of CMD estimation (for a discussion of CMD see Newey and McFadden (1994)). CMD estimation of optimization-based models was considered recently in the econometrics literature by Moon and Schorfheide (2002).

Let $Y_n(\omega)$ be a data matrix of sample size $n$ defined on the probability space $(\Omega, \mathcal{F}, P)$. All random quantities in this chapter are functions of the data $Y_n$. We use $h$ to denote an $m$-vector of parameters of some reduced-form model. Its true value, $h_0 \in R^m$, depends on the true unknown structural model of the economy and its parameters. For example, $h$ can be a vector of moments, cross-correlations, impulse responses, etc. While the true structural model is unknown, we will assume that the reduced-form parameter $h_0$ can be estimated consistently from the data. Let $\hat h_n$ denote an estimator of $h$. We assume that $\hat h_n$ has the following properties.

Assumption 1.
(a) $\hat h_n \to_p h_0 \in R^m$.
(b) $n^{1/2}(\hat h_n - h_0) \to_d N(0, \Lambda_0)$, where $\Lambda_0$ is a positive definite $m \times m$ matrix.
(c) There is $\hat\Lambda_n$ such that $\hat\Lambda_n \to_p \Lambda_0$.

According to Assumptions 1(a) and (b), $\hat h_n$ is a consistent and asymptotically normal estimator of $h_0$. Similarly to $h_0$, its asymptotic variance, $\Lambda_0$, depends on the unknown true structural model and its parameters. Part (c) of the assumption requires that $\Lambda_0$ can also be estimated consistently from the data.
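To make Assumption 1 concrete, the following minimal sketch computes $\hat h_n$ and $\hat\Lambda_n$ in the simplest possible setting. It assumes i.i.d. scalar data and takes the reduced-form parameters to be the first two moments, $h = (E y_t, E y_t^2)'$; the function name `reduced_form_estimates` and the simulated sample are hypothetical and not part of the chapter. With serially dependent data, a long-run (HAC-type) variance estimator would be needed in place of the plain sample covariance used here.

```python
import random


def reduced_form_estimates(y):
    """Return (h_hat, lambda_hat) for h = (E[y], E[y^2]) assuming i.i.d. data.

    h_hat is the vector of sample moments; lambda_hat is the sample
    covariance matrix of (y, y^2), which estimates the asymptotic
    variance of sqrt(n) * (h_hat - h_0) in the i.i.d. case.
    """
    n = len(y)
    rows = [(v, v * v) for v in y]
    h_hat = [sum(r[j] for r in rows) / n for j in range(2)]
    lambda_hat = [
        [sum((r[i] - h_hat[i]) * (r[j] - h_hat[j]) for r in rows) / n
         for j in range(2)]
        for i in range(2)
    ]
    return h_hat, lambda_hat


# Hypothetical sample: draws from N(1, 2^2), so h_0 = (1, 5).
random.seed(0)
sample = [random.gauss(1.0, 2.0) for _ in range(5000)]
h_hat, lambda_hat = reduced_form_estimates(sample)
```

In this sketch the estimates depend only on the data, not on any structural model, which is the sense in which the reduced-form parameters can be estimated regardless of the true DGP.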
The above assumptions are high-level; they can be verified under more primitive conditions. For example, Assumption 1 holds when $h_0$ and $\hat h_n$ are functions of the first two population and sample moments of $Y_n$ respectively, $Y_n = (y_1', \ldots, y_n')'$, such that $\{y_t\}$ is a stationary mixing sequence with $\phi$ of size $-r/(r-1)$, $r \ge 2$, or $\alpha$ of size $-2r/(r-2)$, $r > 2$, $E\|y_t\|^{4r+\delta} < \infty$, where $\|\cdot\|$ denotes the Euclidean norm, and $\mathrm{Var}\left(n^{-1/2}\sum_{t=1}^{n} y_t\right)$ is uniformly positive definite (White, 2001); alternatively, Assumption 1 can be verified using a linear process structure under the conditions of Phillips and Solo (1992).

In our framework, $m$ is fixed by the calibrationist and independent of the data. We assume that the calibrationist chooses $h$ and $m$ according to the economic importance of the reduced-form characteristics that a model is used to explain; note, however, that there are recent methods allowing one to choose $m$ using the data and a statistical information criterion (Hall et al., 2007).

Let $\theta \in \Theta \subset R^k$ be a vector of deep parameters corresponding to a structural model specified by the calibrationist. We assume that one can compute analytically the value of the reduced-form parameters $h$ given the model and a value of $\theta$. The mapping from the space of $\theta$ to the space of reduced-form parameters is given by the function $f : \Theta \to R^m$, which we call the binding function using the terminology of indirect inference (Gouriéroux et al., 1993; Dridi et al., 2007). In the remainder of the chapter, structural models are referred to by their binding functions.

Vector $\theta$ denotes only the free parameters that are estimated using the sample information; any preset parameters are included as constants in the binding function $f$. Such parameters are usually assigned values based on extra-sample information, and we assume that model comparison and evaluation are performed conditional on the choice of preset parameters.
This is a common practice in the calibration literature (Gregory and Smith, 1990). A calibrationist distinguishes between free parameters that must be estimated, and other parameters with values chosen on the basis of what are considered to be reasonable values in the literature, or by other methods independent of the data used to calibrate the free parameters. The presence of preset parameters is only a problem if one wants to treat the structural model as a true DGP, since such parameters are likely to be set to wrong values (see, for example, a Monte Carlo experiment in Gregory and Smith (1990)). On the other hand, when the model is treated as misspecified, the preset parameters do not pose an additional challenge.

Vector $\theta$ is chosen to minimize the distance between the sample reduced-form characteristics of the data, $\hat h_n$, and those implied by the chosen structural model, $f(\theta)$. Let $A_n$ be a possibly random $m \times m$ weight matrix. The weight matrix can be nonrandom and chosen by the calibrationist to put more weight on the relatively more important reduced-form parameters. Alternatively, it can be data dependent and, therefore, random. For example, in the spirit of GMM estimation, $A_n$ can be set such that $A_n' A_n = \hat\Lambda_n^{-1}$, which exists with probability approaching one due to Assumption 1(c).

Assumption 2. $A_n \to_p A$, where $A$ is of full rank.

The calibrated $\theta$, or the CMD estimator of $\theta$, is given by the value that minimizes the weighted distance function:

$$\hat\theta_n(A_n) = \arg\min_{\theta \in \Theta} \left\| A_n\left(\hat h_n - f(\theta)\right) \right\|^2. \tag{3.1}$$

The structural model is said to be correctly specified if for some value $\theta_0 \in \Theta$ the binding function $f$ produces exactly the true value of the reduced-form parameter. The following definition is similar to Definitions 1 and 2 of Hall and Inoue (2003).

Definition 1. The structural model $f$ is said to be correctly specified if there exists some $\theta_0 \in \Theta$ such that $f(\theta_0) = h_0$; $f$ is said to be misspecified if $\inf_{\theta \in \Theta} \| h_0 - f(\theta) \| > 0$.
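The CMD estimator in (3.1) can be illustrated with a deliberately simple sketch. It assumes a hypothetical one-parameter binding function $f(\theta) = \theta b$ for a known $m$-vector $b$ and a diagonal weight matrix, so that the minimizer has a closed form; the name `cmd_linear_scalar`, the vector `b`, and all numbers are illustrative assumptions rather than anything taken from the chapter.

```python
def cmd_linear_scalar(h_hat, b, weights):
    """CMD estimate for the binding function f(theta) = theta * b.

    weights[i] is the i-th diagonal element of a diagonal weight matrix A,
    so the criterion is sum_i (weights[i] * (h_hat[i] - theta * b[i]))**2.
    Returns the minimizer and the minimized squared distance.
    """
    num = sum(w * w * bi * hi for w, bi, hi in zip(weights, b, h_hat))
    den = sum(w * w * bi * bi for w, bi in zip(weights, b))
    theta_hat = num / den
    distance_sq = sum((w * (hi - theta_hat * bi)) ** 2
                      for w, bi, hi in zip(weights, b, h_hat))
    return theta_hat, distance_sq


# Hypothetical reduced-form estimates (m = 3) and a model that forces all
# three characteristics to equal the single parameter theta (k = 1).
h_hat = [1.0, 2.0, 4.0]
b = [1.0, 1.0, 1.0]
theta_hat, dist_sq = cmd_linear_scalar(h_hat, b, [1.0, 1.0, 1.0])
```

Here $m = 3 > k = 1$ and the minimized distance is strictly positive: no value of $\theta$ reproduces all three characteristics, which is the sample analogue of misspecification in the sense of Definition 1.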
Naturally, the structural model chosen by the calibrationist is correctly specified in the sense of Definition 1 in the unlikely situation that $f$ is the true data generating process. Also, the model $f$ is correctly specified according to Definition 1 in the case of exact identification, i.e. when $m = k$, even if the structural model and its binding function describe an incorrect DGP. Thus, the structural model is misspecified if it is overidentified and, for no value of $\theta$, it can replicate the reduced-form characteristics. The requirement of overidentification is a crucial one, since an exactly identified model is never misspecified according to Definition 1 (see Hall and Inoue (2003) for a discussion of overidentification and misspecification). Typically, the number of reduced-form parameters available for calibration exceeds $k$ and, therefore, the calibrationist can always choose the binding function and reduced-form parameters so that the model is overidentified.

We assume that the calibrationist considers two competing structural models. The second structural model is given by the binding function $g$ and the vector of deep parameters $\gamma \in \Gamma \subset R^l$. Let $\hat\gamma_n$ be the calibrated value of $\gamma$, where $\hat\gamma_n$ is constructed similarly to $\hat\theta_n$ in (3.1):

$$\hat\gamma_n(A_n) = \arg\min_{\gamma \in \Gamma} \left\| A_n\left(\hat h_n - g(\gamma)\right) \right\|^2.$$

We assume that $f$ and $g$ are both overidentified and misspecified in the sense of Definition 1.

Assumption 3. $f$ and $g$ are misspecified according to Definition 1.

Next, we define the pseudo-true values of the structural parameters $\theta$ and $\gamma$. The pseudo-true value minimizes the distance between $h_0$ and the binding function for a given weight matrix $A$.

Assumption 4.
(a) There exists a unique $\theta_0(A) \in \Theta$ such that for all $\theta \in \Theta$, $\| A(h_0 - f(\theta_0(A))) \| \le \| A(h_0 - f(\theta)) \|$.
(b) There exists a unique $\gamma_0(A) \in \Gamma$ such that for all $\gamma \in \Gamma$, $\| A(h_0 - g(\gamma_0(A))) \| \le \| A(h_0 - g(\gamma)) \|$.
The pseudo-true value is written as a function of $A$ to emphasize that different choices of the weight matrix may lead to different minimizers of $\| A(h_0 - f(\theta)) \|$ (see Maasoumi and Phillips (1982) and Hall and Inoue (2003)). For notational brevity, we may suppress the dependence on $A$ if there is no ambiguity regarding the choice of $A$.

The uniqueness of $\theta_0$ and $\gamma_0$ is usually assumed in the literature on misspecified models (see Assumption 3 of Rivers and Vuong (2002) and Assumption 3 of Hall and Inoue (2003)). The uniqueness of the pseudo-true value can be verified with probability approaching one since the binding functions are known, $A_n \to_p A$, and $\hat h_n$ is a consistent estimator of $h_0$ by Assumption 1(a). When the pseudo-true value lies in the interior of $\Theta$, it uniquely solves the following equation, provided that $f$ is differentiable:

$$\frac{\partial f(\theta_0(A))'}{\partial\theta} A'A \left(h_0 - f(\theta_0(A))\right) = 0. \tag{3.2}$$

Due to Assumption 2 on $A$, $\partial f(\theta_0(A))/\partial\theta'$ must have rank $k$ for $\theta_0(A)$ to be unique.

The calibrationist's objective is to choose, between the two wrong models, the one that provides a better $A$-weighted fit to the reduced-form parameters $h_0$. We suggest a testing procedure for the null hypothesis that the two models are equally wrong,

$$H_0 : \| A(h_0 - f(\theta_0(A))) \| = \| A(h_0 - g(\gamma_0(A))) \|, \tag{3.3}$$

against the alternatives in which one of the models provides a better fit. The calibrationist prefers the model $f$ if the following alternative is true:

$$H_f : \| A(h_0 - f(\theta_0(A))) \| < \| A(h_0 - g(\gamma_0(A))) \|. \tag{3.4}$$

Similarly, the calibrationist prefers the model $g$ when

$$H_g : \| A(h_0 - f(\theta_0(A))) \| > \| A(h_0 - g(\gamma_0(A))) \|$$

is true. The hypotheses are analogous to those of Vuong (1989) and Rivers and Vuong (2002). Note that, in the current framework, the decision depends on the choice of the weight matrix $A$. Thus, under the null, the two structural models provide equivalent fit for the reduced-form characteristics for a given weighting scheme $A$.
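The dependence of both the pseudo-true values and the hypotheses on $A$ can be seen in a small numerical sketch. It assumes two hypothetical one-parameter models with $m = 2$: $f(\theta) = (\theta, \theta)'$ and $g(\gamma) = (c, \gamma)'$ with a preset constant $c = 1.2$, a diagonal weight matrix $A = \mathrm{diag}(a_1, a_2)$, and population characteristics $h_0 = (1, 0.5)'$. All of these choices, and the function names, are illustrative assumptions and do not come from the chapter.

```python
def fit_f(h0, a):
    """Pseudo-true theta and squared A-weighted distance for f(theta) = (theta, theta)."""
    w1, w2 = a[0] ** 2, a[1] ** 2
    theta0 = (w1 * h0[0] + w2 * h0[1]) / (w1 + w2)
    dist_sq = w1 * (h0[0] - theta0) ** 2 + w2 * (h0[1] - theta0) ** 2
    return theta0, dist_sq


def fit_g(h0, a, c=1.2):
    """Pseudo-true gamma and squared distance for g(gamma) = (c, gamma), c preset."""
    gamma0 = h0[1]                       # second characteristic is matched exactly
    dist_sq = (a[0] * (h0[0] - c)) ** 2  # first characteristic is always missed
    return gamma0, dist_sq


def preferred(h0, a):
    """Return which model has the smaller A-weighted population distance."""
    _, dist_f = fit_f(h0, a)
    _, dist_g = fit_g(h0, a)
    if dist_f < dist_g:
        return "f"
    if dist_f > dist_g:
        return "g"
    return "tie"


h0 = (1.0, 0.5)
equal_weights = preferred(h0, (1.0, 1.0))    # both characteristics weighted equally
downweighted = preferred(h0, (1.0, 0.2))     # second characteristic down-weighted
```

With equal weights, $\theta_0 = 0.75$ and the squared distances are $0.125$ for $f$ against $0.04$ for $g$, so $H_g$ holds. When the second characteristic is down-weighted ($a_2 = 0.2$), $\theta_0$ moves to about $0.981$ and the squared distances are about $0.0096$ for $f$ against $0.04$ for $g$, so $H_f$ holds. Ranking by squared distances is equivalent to ranking by the norms in (3.3) and (3.4).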
Naturally, different weighting schemes may lead to different rankings of $f$ and $g$.

In order to test the null hypothesis in (3.3), it is natural to consider a sample counterpart of the difference in fit between the two competing models, which is given by the following QLR statistic:

$$QLR_n\left(\hat\theta_n(A_n), \hat\gamma_n(A_n)\right) = -\left\| A_n\left(\hat h_n - f(\hat\theta_n)\right) \right\|^2 + \left\| A_n\left(\hat h_n - g(\hat\gamma_n)\right) \right\|^2. \tag{3.5}$$

Given our assumptions, $QLR_n$ consistently estimates the difference in the population measures of fit, $-\| A(h_0 - f(\theta_0(A))) \|^2 + \| A(h_0 - g(\gamma_0(A))) \|^2$, as implied by the results presented in the next section.

3.3 Properties of the CMD Estimators of Structural Parameters

In this section, we discuss the asymptotic properties of the CMD estimators defined in the previous section. We make the following assumptions about the binding functions $f$ and $g$ and their parameter spaces $\Theta$ and $\Gamma$.

Assumption 5.
(a) $\Theta$ and $\Gamma$ are compact.
(b) $\theta_0$ lies in the interior of $\Theta$; $\gamma_0$ lies in the interior of $\Gamma$.
(c) $f$ is continuous on $\Theta$; $g$ is continuous on $\Gamma$.

The following theorem gives consistency of the CMD estimators of $\theta$ and $\gamma$.

Theorem 5. Suppose that Assumptions 1, 2, 4, and 5 hold. Then, $\hat\theta_n \to_p \theta_0$ and $\hat\gamma_n \to_p \gamma_0$.

As usual, the asymptotic distribution of the CMD estimators centered around their pseudo-true values can be derived from the mean value expansion of the sample first-order conditions for the minimization problem in (3.1). We make the following assumption.

Assumption 6. The binding function $f$ is twice continuously differentiable in the neighborhood of $\theta_0$; the binding function $g$ is twice continuously differentiable in the neighborhood of $\gamma_0$.

It follows from Theorem 5 and Assumption 6 that the binding functions evaluated at the corresponding CMD estimators are twice continuously differentiable with probability approaching one.
Thus, the CMD estimator of $\theta$ must satisfy the first-order conditions:

$$\frac{\partial f(\hat\theta_n)'}{\partial\theta} A_n'A_n \left(\hat h_n - f(\hat\theta_n)\right) = 0.$$

Using the mean value theorem twice to expand $f(\hat\theta_n)$ around $f(\theta_0)$ and $\partial f(\hat\theta_n)/\partial\theta'$ around $\partial f(\theta_0)/\partial\theta'$, and taking into account the population first-order conditions (3.2), we obtain the following equation determining the asymptotic distribution of the CMD estimators in the misspecified case:

$$\hat\theta_n - \theta_0 = F_n^{-1} \frac{\partial f(\hat\theta_n)'}{\partial\theta} \left( A_n'A_n\left(\hat h_n - h_0\right) + \left(A_n'A_n - A'A\right)\left(h_0 - f(\theta_0)\right) \right), \tag{3.1}$$

where

$$F_n = \frac{\partial f(\hat\theta_n)'}{\partial\theta} A_n'A_n \frac{\partial f(\tilde\theta_n)}{\partial\theta'} - M_{f,n}, \qquad M_{f,n} = \left( I_k \otimes (h_0 - f(\theta_0))' A'A \right) \frac{\partial}{\partial\theta'} \mathrm{vec}\left( \frac{\partial f(\bar\theta_n)}{\partial\theta'} \right).$$

In the above equations, $\tilde\theta_n$ and $\bar\theta_n$ denote the mean values between $\theta_0$ and $\hat\theta_n$, and $\mathrm{vec}(\cdot)$ denotes column vectorization of a matrix. The term $M_{f,n}$, which involves the second derivatives of $f$, reflects the fact that the model is misspecified; $M_{f,n}$ and the second summand in (3.1) are zero if the model is correctly specified. The analogous result holds for $g$ and $\hat\gamma_n$.

This expansion is similar to equation (9) of Hall and Inoue (2003); however, in the case of CMD it involves one term fewer than in the GMM case. This is due to the fact that, in the case of CMD, the data and the parameters are additively separated: the data enter through $\hat h_n$, and the parameters through the binding function. The result in (3.1) is also similar to Assumptions 10 and 13 of Rivers and Vuong (2002).

The expansion in (3.1) shows that the convergence rate of the CMD estimators of the structural parameters in the misspecified case depends on that of the reduced-form parameters and the weight matrices. In many situations, it is natural to assume that $n^{1/2}(\hat h_n - h_0)$ is asymptotically normal, as we do in Assumption 1(b).
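As a worked special case, offered only as an illustration and not as part of the original derivation, suppose the binding function is linear, $f(\theta) = B\theta$, where $B$ is a hypothetical known $m \times k$ matrix of full column rank, and write $W_n = A_n'A_n$ and $W = A'A$. Then $\partial f/\partial\theta' = B$ does not depend on $\theta$, the second derivatives vanish, and the expansion of the first-order conditions reduces to

$$\hat\theta_n - \theta_0 = \left(B'W_nB\right)^{-1} B' \left( W_n\left(\hat h_n - h_0\right) + \left(W_n - W\right)\left(h_0 - B\theta_0\right) \right).$$

This can be checked directly: the sample and population first-order conditions give $\hat\theta_n = (B'W_nB)^{-1}B'W_n\hat h_n$ and $B'W(h_0 - B\theta_0) = 0$, and substituting the latter into the right-hand side above leaves $(B'W_nB)^{-1}B'W_n\hat h_n - \theta_0$. With a fixed weight matrix, $W_n = W$, the second summand disappears even though $h_0 - B\theta_0 \neq 0$, so under Assumption 1(b)

$$n^{1/2}\left(\hat\theta_n - \theta_0\right) \to_d N\left(0, \left(B'WB\right)^{-1} B'W\Lambda_0WB \left(B'WB\right)^{-1}\right).$$

In this linear, fixed-weight case misspecification therefore changes the centering (the pseudo-true value $\theta_0$ depends on $W$) but not the form of the asymptotic variance; the extra terms matter when the binding function is nonlinear or the weights are estimated.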
With regard to the weight matrices, Hall and Inoue (2003) distinguish several cases: (i) fixed weight matrices, (ii) $n^{1/2}\,\mathrm{vec}(A_n'A_n - A'A)$ being asymptotically normal, and (iii) $A_n'A_n$ being the inverse of a centered or uncentered HAC estimator. In the current framework, the case of fixed weight matrices plays an important role, since the weight matrix defines the relative importance of various reduced-form characteristics of the data.

Consider the correctly specified case: $h_0 - f(\theta_0) = 0$. In this case, Assumptions 1, 2, 4, 5, and 6 and Theorem 5 imply that $n^{1/2}(\hat\theta_n - \theta_0)$ has an asymptotically normal distribution with the variance matrix

$$\left( \frac{\partial f(\theta_0)'}{\partial\theta} A'A \frac{\partial f(\theta_0)}{\partial\theta'} \right)^{-1} \frac{\partial f(\theta_0)'}{\partial\theta} A'A \Lambda_0 A'A \frac{\partial f(\theta_0)}{\partial\theta'} \left( \frac{\partial f(\theta_0)'}{\partial\theta} A'A \frac{\partial f(\theta_0)}{\partial\theta'} \right)^{-1}.$$

As usual, in the correctly specified case, the efficient CMD estimator corresponds to $A_n'A_n = \hat\Lambda_n^{-1}$. However, when the model is misspecified, such a choice no longer leads to statistical efficiency. Furthermore, when $\hat\Lambda_n$ is a HAC estimator and the model is misspecified, $\hat\theta_n$ has a convergence rate slower than $n^{1/2}$, as shown in Hall and Inoue (2003). In this chapter, we focus on cases (i) and (ii). Case (i) corresponds, for example, to a situation where the calibrationist knows the relative importance of different reduced-form characteristics of the data. In case (ii), the matrix $A_n'A_n$ can be given, for example, by the matrix of second moments of the data, as in the case of the Hansen-Jagannathan distance (Kan and Robotti, 2008).

Define

$$F_0 = \frac{\partial f(\theta_0)'}{\partial\theta} A'A \frac{\partial f(\theta_0)}{\partial\theta'} - M_{f,0}, \quad \text{where } M_{f,0} = \left( I_k \otimes (h_0 - f(\theta_0))' A'A \right) \frac{\partial}{\partial\theta'} \mathrm{vec}\left( \frac{\partial f(\theta_0)}{\partial\theta'} \right), \tag{3.2}$$

and

$$G_0 = \frac{\partial g(\gamma_0)'}{\partial\gamma} A'A \frac{\partial g(\gamma_0)}{\partial\gamma'} - M_{g,0}, \quad \text{where } M_{g,0} = \left( I_l \otimes (h_0 - g(\gamma_0))' A'A \right) \frac{\partial}{\partial\gamma'} \mathrm{vec}\left( \frac{\partial g(\gamma_0)}{\partial\gamma'} \right). \tag{3.3}$$

Assumption 7. $F_0$ and $G_0$ are non-singular.

The above assumption is similar to Assumption 5 of Hall and Inoue (2003). Theorem 5 implies that $F_n \to_p F_0$. We define further

$$V_{ff,0} = F_0^{-1} \frac{\partial f(\theta_0)'}{\partial\theta} A'A \Lambda_0 A'A \frac{\partial f(\theta_0)}{\partial\theta'} F_0'^{-1},$$

$$V_{fg,0} = F_0^{-1} \frac{\partial f(\theta_0)'}{\partial\theta} A'A \Lambda_0 A'A \frac{\partial g(\gamma_0)}{\partial\gamma'} G_0'^{-1},$$

$$V_{gg,0} = G_0^{-1} \frac{\partial g(\gamma_0)'}{\partial\gamma} A'A \Lambda_0 A'A \frac{\partial g(\gamma_0)}{\partial\gamma'} G_0'^{-1}, \quad \text{and}$$

$$V_0 = \begin{pmatrix} V_{ff,0} & V_{fg,0} \\ V_{fg,0}' & V_{gg,0} \end{pmatrix}. \tag{3.4}$$

The following theorem describes the asymptotic distribution of the CMD estimators in the fixed weight matrix case.

Theorem 6. Suppose that $A_n = A$ for all $n \ge 1$. Under Assumptions 1, 2, 4, and 5-7,

$$n^{1/2} \begin{pmatrix} \hat\theta_n - \theta_0 \\ \hat\gamma_n - \gamma_0 \end{pmatrix} \to_d N\left( 0_{(k+l) \times 1}, V_0 \right).$$

When the weight matrix depends on the data, we extend Assumption 1 with Assumption 8 below, which assumes that the elements of $A_n'A_n$ are root-$n$ consistent and asymptotically normal estimators of the elements of $A'A$, and that they can be correlated with $\hat h_n$.

Assumption 8.
(a) $n^{1/2} \left( (\hat h_n - h_0)', \mathrm{vec}(A_n'A_n - A'A)' \right)' \to_d N(0, \Lambda_0^A)$, where $\Lambda_0^A$ is a positive definite $m(m+1) \times m(m+1)$ matrix

$$\Lambda_0^A = \begin{pmatrix} \Lambda_0 & \Lambda_{0A} \\ \Lambda_{0A}' & \Lambda_{AA} \end{pmatrix}.$$

(b) $A$ has full rank.
(c) There is $\hat\Lambda_0^A$ such that $\hat\Lambda_0^A \to_p \Lambda_0^A$.

This assumption is similar to condition (12) of Theorem 2 in Hall and Inoue (2003). In particular, it allows $A_n'A_n$ to depend on $\hat h_n$, $\hat\theta_n$, and $\hat\gamma_n$. However, as discussed above, Assumption 8 rules out HAC-based estimators of $A'A$.

Now, in view of expansion (3.1) and a similar expansion for $\hat\gamma_n$, the asymptotic distribution of $\hat\theta_n$ and $\hat\gamma_n$ depends on that of $A_n'A_n$.
Define $V^A_{ff,0}$, $V^A_{gg,0}$, and $V^A_{fg,0}$ to be the asymptotic variance of $\hat\theta_n$, the asymptotic variance of $\hat\gamma_n$, and the asymptotic covariance of $\hat\theta_n$ and $\hat\gamma_n$ respectively:
\[
V^A_{ff,0} = F_0^{-1} \frac{\partial f(\theta_0)'}{\partial\theta} D^A_{f,0} \Lambda_0^A D^{A\prime}_{f,0} \frac{\partial f(\theta_0)}{\partial\theta'} F_0'^{-1}, \quad\text{where}\quad
D^A_{f,0} = \left(A'A,\; I_m \otimes (h_0 - f(\theta_0))'\right); \tag{3.5}
\]
\[
V^A_{gg,0} = G_0^{-1} \frac{\partial g(\gamma_0)'}{\partial\gamma} D^A_{g,0} \Lambda_0^A D^{A\prime}_{g,0} \frac{\partial g(\gamma_0)}{\partial\gamma'} G_0'^{-1}, \quad\text{where}\quad
D^A_{g,0} = \left(A'A,\; I_m \otimes (h_0 - g(\gamma_0))'\right);
\]
\[
V^A_{fg,0} = F_0^{-1} \frac{\partial f(\theta_0)'}{\partial\theta} D^A_{f,0} \Lambda_0^A D^{A\prime}_{g,0} \frac{\partial g(\gamma_0)}{\partial\gamma'} G_0'^{-1}, \quad\text{and}\quad
V_0^A = \begin{pmatrix} V^A_{ff,0} & V^A_{fg,0} \\ V^{A\prime}_{fg,0} & V^A_{gg,0} \end{pmatrix}.
\]
The joint asymptotic distribution of $\hat\theta_n$ and $\hat\gamma_n$ is given in the next theorem.

Theorem 7. Under Assumptions 2, 4, and 5-8,
\[
n^{1/2} \begin{pmatrix} \hat\theta_n - \theta_0 \\ \hat\gamma_n - \gamma_0 \end{pmatrix} \to_d N\left(0_{(k+l)\times 1}, V_0^A\right).
\]

The asymptotic variance of $\hat\theta_n$ and $\hat\gamma_n$ can be consistently estimated by the plug-in method, i.e. by replacing $h_0 - f(\theta_0)$ with $\hat h_n - f(\hat\theta_n)$ and so on.

3.4 Model Comparison

The distribution of the $QLR_n$ statistic in (3.5) depends on the relationship between the two models. Similarly to Vuong (1989), we consider the following three cases: nested, strictly non-nested, and overlapping models $f$ and $g$. Define
\[
\mathcal{F} = \{h \in R^m : h = f(\theta),\ \theta \in \Theta\}, \qquad
\mathcal{G} = \{h \in R^m : h = g(\gamma),\ \gamma \in \Gamma\}.
\]
The subsets of $R^m$, $\mathcal{F}$ and $\mathcal{G}$, represent the spaces for the reduced-form parameter $h$ that are spanned by the structural models $f$ and $g$ respectively. The relationship between the two structural models can be defined in terms of $\mathcal{F}$ and $\mathcal{G}$.

Definition 2. The two structural models $f$ and $g$ are said to be
(a) nested if $\mathcal{F} \subset \mathcal{G}$ or $\mathcal{G} \subset \mathcal{F}$,
(b) strictly non-nested if $\mathcal{F} \cap \mathcal{G} = \emptyset$,
(c) overlapping if $\mathcal{F} \cap \mathcal{G} \neq \emptyset$, $\mathcal{F} \not\subset \mathcal{G}$, and $\mathcal{G} \not\subset \mathcal{F}$.

Note that the nested case does not necessarily correspond to zero restrictions on the elements of structural parameters. The two models can be totally different in terms of their construction, so that $\theta$ and $\gamma$ are not directly comparable, and still be nested with respect to the spaces they span for $h$.
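To make the three cases of Definition 2 concrete, consider a purely illustrative example. It is not taken from the original analysis; the functional forms and parameter sets below are assumptions chosen only for simplicity, with $m = 2$.

```latex
% Hypothetical example: f and g share no parameters, yet can be nested.
\[
f(\theta) = \begin{pmatrix} e^{\theta} \\ 0 \end{pmatrix},\ \theta \in \Theta = [0,1],
\qquad
g(\gamma) = \begin{pmatrix} \gamma_1 \\ \gamma_2 \end{pmatrix},\ \gamma \in \Gamma .
\]
% The space spanned by f is a segment on the horizontal axis:
\[
\mathcal{F} = \{(x, 0)' : x \in [1, e]\}, \qquad \mathcal{G} = \Gamma .
\]
% (a) Nested:              \Gamma = [0,3] \times [-1,1]  gives  \mathcal{F} \subset \mathcal{G}.
% (b) Strictly non-nested: \Gamma = [0,3] \times [1,2]   gives  \mathcal{F} \cap \mathcal{G} = \emptyset.
% (c) Overlapping:         \Gamma = [2,3] \times [-1,1]  gives
%     \mathcal{F} \cap \mathcal{G} = \{(x,0)' : x \in [2,e]\} \neq \emptyset,
%     while (1,0)' \in \mathcal{F} \setminus \mathcal{G} and (3,0)' \in \mathcal{G} \setminus \mathcal{F}.
```

In case (a), $\theta$ is not a sub-vector of $\gamma$ and no zero restriction links the two parameterizations, yet the models are nested in terms of the values of $h$ they can generate.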
The asymptotic behavior of the QLR statistic and the resulting inference procedure depend on whether $f$ and $g$ are nested, strictly non-nested, or overlapping.

Further, note that in the strictly non-nested case, the two models provide different predictions for the reduced-form characteristics for any values of the structural parameters. It appears therefore that, in the calibration context, the strictly non-nested case is less realistic than the nested and overlapping cases.

3.4.1 Nested Models

Suppose that $\mathcal{G} \subset \mathcal{F}$. In this case, model $g$ cannot provide a better fit than model $f$. Thus, the calibrationist is interested in testing $H_0$ against $H_f$, i.e. whether the approximation to the reduced-form characteristics of the data obtained from the smaller model is equivalent to that from the bigger model. Since the models are nested, and under Assumption 4 of unique pseudo-true values, the null hypothesis can be equivalently stated as $f(\theta_0) = g(\gamma_0)$. Indeed, let $h_{f,0} = f(\theta_0)$ and $h_{g,0} = g(\gamma_0)$. Then under the null of model equivalence we have $\|A(h_0 - h_{f,0})\| = \|A(h_0 - h_{g,0})\|$. However, since the models are nested, $h_{g,0} \in \mathcal{F}$, and there should be some $\tilde\theta_0 \in \Theta$ such that $h_{g,0} = f(\tilde\theta_0)$, which violates Assumption 4 if $h_{f,0} \neq h_{g,0}$. We have the following result.

Lemma 1. Suppose that Assumption 4 holds, and the models $f$ and $g$ are nested according to Definition 2. Then, under $H_0$ in (3.3), $f(\theta_0) = g(\gamma_0)$.

The QLR statistic depends on the weight matrix explicitly and through the estimators $\hat\theta_n$ and $\hat\gamma_n$. The following theorem establishes the distribution of the QLR statistic in the case of fixed weight matrices.

Theorem 8. Suppose that $A_n = A$ for all $n \geq 1$, $A$ is of full rank, Assumptions 1, 3, 4, 5-7 hold, and $\mathcal{G} \subset \mathcal{F}$.
(a) Under $H_0$,
\[
n\,QLR_n(\hat\theta_n, \hat\gamma_n) \to_d Z' \Lambda_0^{1/2\prime} A'A \left(W_{g,0} - W_{f,0}\right) A'A \Lambda_0^{1/2} Z,
\]
where $Z \sim N(0, I_m)$,
\[
W_{f,0} = W_{f,0}(1) - W_{f,0}(2) - W_{f,0}(3),
\]
\[
W_{f,0}(1) = \frac{\partial f(\theta_0)}{\partial\theta'} F_0'^{-1} \frac{\partial f(\theta_0)'}{\partial\theta} A'A \frac{\partial f(\theta_0)}{\partial\theta'} F_0^{-1} \frac{\partial f(\theta_0)'}{\partial\theta},
\]
\[
W_{f,0}(2) = \frac{\partial f(\theta_0)}{\partial\theta'} \left(F_0'^{-1} + F_0^{-1}\right) \frac{\partial f(\theta_0)'}{\partial\theta},
\]
\[
W_{f,0}(3) = \frac{\partial f(\theta_0)}{\partial\theta'} F_0'^{-1} \left(M_{f,0}' + M_{f,0}\right) F_0^{-1} \frac{\partial f(\theta_0)'}{\partial\theta},
\]
and $W_{g,0}$ is defined analogously to $W_{f,0}$ with $\theta_0$, $\partial f/\partial\theta$, $F_0$, and $M_{f,0}$ replaced by $\gamma_0$, $\partial g/\partial\gamma$, $G_0$, and $M_{g,0}$ respectively.
(b) Under $H_f$, $n\,QLR_n(\hat\theta_n, \hat\gamma_n) \to +\infty$ with probability one.

According to Theorem 8, the re-scaled QLR statistic has a mixed $\chi^2$ distribution under the null. This result is similar to the one established by Vuong (1989) for MLE in the case of nested models. According to part (a) of the theorem, one should reject the null when
\[
n\,QLR_n(\hat\theta_n, \hat\gamma_n) > c_{1-\alpha},
\]
where $c_{1-\alpha}$ is the critical value satisfying $P\left(n\,QLR_n(\hat\theta_n, \hat\gamma_n) > c_{1-\alpha} \mid H_0\right) \to \alpha$ as $n \to \infty$. Under the null, the distribution of the statistic is nonstandard and depends on the unknown parameters $h_0$, $\theta_0$, $\gamma_0$, and $\Lambda_0$. However, its asymptotic distribution can be approximated by simulations using consistent estimators of the unknown parameters. First, let $\hat W_{f,n}$ and $\hat W_{g,n}$ be the plug-in estimators of $W_{f,0}$ and $W_{g,0}$ defined in part (a) of Theorem 8. To construct $\hat W_{f,n}$ and $\hat W_{g,n}$, one replaces $h_0$, $\theta_0$, $\gamma_0$, and $\Lambda_0$ by $\hat h_n$, $\hat\theta_n$, $\hat\gamma_n$, and $\hat\Lambda_n$ respectively. Next, simulate a vector of $N(0, I_m)$ random variables, $Z_r$, and calculate
\[
QLR_{nr} = Z_r' \hat\Lambda_n^{1/2\prime} A'A \left(\hat W_{g,n} - \hat W_{f,n}\right) A'A \hat\Lambda_n^{1/2} Z_r.
\]
As $n \to \infty$, the asymptotic distribution of $QLR_{nr}$ is the one given in part (a) of Theorem 8. Repeating this for $r = 1, \ldots, R$ with $Z_r$ drawn independently across $r$, the simulated critical value $c_{1-\alpha,n,R}$ is the $1 - \alpha$ quantile of $\{QLR_{nr} : r = 1, \ldots, R\}$. Hence, in practice, in the case of nested models, one rejects the null when
\[
n\,QLR_n(\hat\theta_n, \hat\gamma_n) > c_{1-\alpha,n,R}.
\]
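The simulation step just described can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the chapter: it assumes that the estimated mixing matrix $\hat\Lambda_n^{1/2\prime} A'A (\hat W_{g,n} - \hat W_{f,n}) A'A \hat\Lambda_n^{1/2}$ has already been computed and is passed in as `M_hat`, and the function names, the default number of replications, and the seed are hypothetical choices.

```python
import math
import random


def quadratic_form(z, M):
    """Return z' M z for a vector z and a square matrix M given as a list of rows."""
    size = len(z)
    return sum(z[i] * M[i][j] * z[j] for i in range(size) for j in range(size))


def simulated_critical_value(M_hat, alpha=0.05, R=10000, seed=0):
    """Empirical (1 - alpha) quantile of Z' M_hat Z over R independent N(0, I) draws.

    M_hat is assumed to be the plug-in estimate of the mixing matrix
    (computed elsewhere); it is not constructed here.
    """
    rng = random.Random(seed)
    m = len(M_hat)
    draws = sorted(
        quadratic_form([rng.gauss(0.0, 1.0) for _ in range(m)], M_hat)
        for _ in range(R)
    )
    # Position of the (1 - alpha) empirical quantile in the sorted draws.
    index = min(R - 1, max(0, math.ceil((1.0 - alpha) * R) - 1))
    return draws[index]


def reject_nested_null(n, qlr, M_hat, alpha=0.05, R=10000, seed=0):
    """Reject H0 when n * QLR_n exceeds the simulated critical value."""
    return n * qlr > simulated_critical_value(M_hat, alpha, R, seed)
```

When `M_hat` is the $1 \times 1$ identity, the simulated distribution is a $\chi^2_1$, which gives a quick sanity check of the routine before applying it to an estimated mixing matrix.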
When the weight matrix is data dependent, the following result establishes the null distribution of the QLR statistic.

Theorem 9. Suppose that Assumptions 3, 4, and 5-8 hold, and $\mathcal{G} \subset \mathcal{F}$. Then, under $H_0$,
\[
n\,QLR_n(\hat\theta_n, \hat\gamma_n) \to_d Z' \left(\Lambda_0^A\right)^{1/2\prime} \left(W^A_{g,0} - W^A_{f,0}\right) \left(\Lambda_0^A\right)^{1/2} Z,
\]
where $Z \sim N\left(0, I_{m(m+1)}\right)$,
\[
W^A_{f,0} = W^A_{f,0}(1) - W^A_{f,0}(2) - W^A_{f,0}(3) - W^A_{f,0}(4),
\]
\[
W^A_{f,0}(1) = D^{A\prime}_{f,0} \frac{\partial f(\theta_0)}{\partial\theta'} F_0'^{-1} \frac{\partial f(\theta_0)'}{\partial\theta} A'A \frac{\partial f(\theta_0)}{\partial\theta'} F_0^{-1} \frac{\partial f(\theta_0)'}{\partial\theta} D^A_{f,0},
\]
\[
W^A_{f,0}(2) = \left(A'A,\; 0\right)' \frac{\partial f(\theta_0)}{\partial\theta'} F_0^{-1} \frac{\partial f(\theta_0)'}{\partial\theta} D^A_{f,0}
+ D^{A\prime}_{f,0} \frac{\partial f(\theta_0)}{\partial\theta'} F_0'^{-1} \frac{\partial f(\theta_0)'}{\partial\theta} \left(A'A,\; 0\right),
\]
\[
W^A_{f,0}(3) = D^{A\prime}_{f,0} \frac{\partial f(\theta_0)}{\partial\theta'} F_0'^{-1} \left(M_{f,0}' + M_{f,0}\right) F_0^{-1} \frac{\partial f(\theta_0)'}{\partial\theta} D^A_{f,0},
\]
\[
W^A_{f,0}(4) = \left(0,\; I_m \otimes (h_0 - f(\theta_0))'\right)' \frac{\partial f(\theta_0)}{\partial\theta'} F_0^{-1} \frac{\partial f(\theta_0)'}{\partial\theta} D^A_{f,0}
+ D^{A\prime}_{f,0} \frac{\partial f(\theta_0)}{\partial\theta'} F_0'^{-1} \frac{\partial f(\theta_0)'}{\partial\theta} \left(0,\; I_m \otimes (h_0 - f(\theta_0))'\right).
\]
Here $D^A_{f,0}$ is defined in (3.5) and $W^A_{g,0}$ is defined similarly to $W^A_{f,0}$.

As in the fixed $A$ case, the null asymptotic distribution is mixed $\chi^2$. The mixing matrices $W^A_{f,0}$ and $W^A_{g,0}$ depend on the unknown parameters; however, they can be consistently estimated by the plug-in method, and the critical values can be obtained by simulations as outlined above.

3.4.2 Strictly Non-nested Models

In the case of strictly non-nested models, the spaces of reduced-form parameters generated under $f$ and $g$ do not have any common points. Either one of the models can be chosen as providing a better fit to the reduced-form parameters. In this case, consistent with Vuong (1989) and Rivers and Vuong (2002), under the null, the asymptotic distribution of the re-scaled QLR statistic is normal.

Theorem 10. Suppose that $A_n = A$ for all $n \geq 1$, and $A$ has full rank. Suppose that Assumptions 1, 3, 4, 5-7 hold, and $\mathcal{F} \cap \mathcal{G} = \emptyset$. Then,
(a) $n^{1/2}\,QLR_n(\hat\theta_n, \hat\gamma_n) \to_d N(0, \omega_0^2)$ under $H_0$, where
\[
\omega_0 = 2\left\|\Lambda_0^{1/2} A'A \left(f(\theta_0) - g(\gamma_0)\right)\right\|.
\]
(b) $n^{1/2}\,QLR_n(\hat\theta_n, \hat\gamma_n) \to \infty$ with probability one under $H_f$; $n^{1/2}\,QLR_n(\hat\theta_n, \hat\gamma_n) \to -\infty$ with probability one under $H_g$.

Asymptotic normality of the QLR statistic in the non-nested case is due to the fact that, in the asymptotic expansion, there appears a dominating term of order $O_p(n^{-1/2})$, namely $(f(\theta_0) - g(\gamma_0))' A'A (\hat h_n - h_0)$, as we show in the proof of the theorem in the appendix. When the models are nested, this term disappears because $f(\theta_0) = g(\gamma_0)$ under the null.

In practice, the null should be rejected in favor of $H_f$ when
\[
n^{1/2}\,QLR_n(\hat\theta_n, \hat\gamma_n)/\hat\omega_n > z_{1-\alpha/2},
\]
where $z_\alpha$ is the $\alpha$ quantile of the standard normal distribution, and
\[
\hat\omega_n = 2\left\|\hat\Lambda_n^{1/2} A'A \left(f(\hat\theta_n) - g(\hat\gamma_n)\right)\right\|.
\]
The null should be rejected in favor of $H_g$ when
\[
n^{1/2}\,QLR_n(\hat\theta_n, \hat\gamma_n)/\hat\omega_n < -z_{1-\alpha/2}.
\]
One can see from part (a) of Theorem 10 that $\omega_0^2 > 0$ whenever $f(\theta_0) \neq g(\gamma_0)$. This condition always holds when the models are non-nested in the sense of Definition 2 [29]. When $\omega_0^2 = 0$, $QLR_n$ has a mixed $\chi^2$ distribution as described in the previous subsection.

[29] This agrees with the conclusions in Section 6 of Rivers and Vuong (2002).

In the case of a data-dependent weight matrix, the following result provides the null asymptotic distribution of the QLR statistic.

Theorem 11. Suppose that Assumptions 3, 4, and 5-8 hold, and $\mathcal{F} \cap \mathcal{G} = \emptyset$. Then, under $H_0$, $n^{1/2}\,QLR_n(\hat\theta_n, \hat\gamma_n) \to_d N(0, \omega_{A,0}^2)$, where $\omega_{A,0}$ is given by
\[
\omega_{A,0} = \left\| \left(\Lambda_0^A\right)^{1/2}
\begin{pmatrix}
2A'A\left(f(\theta_0) - g(\gamma_0)\right) \\
\left[(h_0 - g(\gamma_0))'\left(I_m \otimes (h_0 - g(\gamma_0))'\right) - (h_0 - f(\theta_0))'\left(I_m \otimes (h_0 - f(\theta_0))'\right)\right]'
\end{pmatrix} \right\|.
\]

Again, as in the case of fixed weight matrices, $\omega_{A,0}$ is strictly positive unless the models are nested ($f(\theta_0) = g(\gamma_0)$), and the asymptotic variance $\omega_{A,0}^2$ can be consistently estimated by the plug-in method.

3.4.3 Overlapping Models

The models are overlapping when the intersection of $\mathcal{F}$ and $\mathcal{G}$ is nonempty but neither model nests the other.
One has to consider two possibilities when the models are overlapping. First, if $f(\theta_0) = g(\gamma_0)$, then $\omega_0^2 = 0$ and $n\,QLR_n$ has an asymptotic mixed $\chi^2$ distribution. Second, if $f(\theta_0) \neq g(\gamma_0)$ but $\|A(h_0 - f(\theta_0(A)))\| = \|A(h_0 - g(\gamma_0(A)))\|$, then $n^{1/2}\,QLR_n$ is asymptotically normal. In order to test the null hypothesis, one has to determine which of the two possibilities applies.

Vuong (1989) proposed the following sequential procedure when the models are overlapping. In the first step, one tests whether $f(\theta_0) = g(\gamma_0)$. This hypothesis is rejected when $n\,QLR_n(\hat\theta_n, \hat\gamma_n)$ exceeds a critical value from the mixed $\chi^2$ distribution, say $c_{1-\alpha_1}$, where $\alpha_1$ denotes the significance level used in step one. If it is not rejected, then one concludes that $f(\theta_0) = g(\gamma_0)$ and the two models have the same lack of fit. The hypothesis can be rejected either because $f(\theta_0) \neq g(\gamma_0)$ but the models have the same lack of fit ($H_0$ is true), or because one of the models has a better fit ($H_f$ or $H_g$ is true). If $f(\theta_0) = g(\gamma_0)$ is rejected in the first step, one continues to the second step.

In the second step, $H_0 : \|A(h_0 - f(\theta_0(A)))\| = \|A(h_0 - g(\gamma_0(A)))\|$ is rejected when $n^{1/2}\,QLR_n(\hat\theta_n, \hat\gamma_n)/\hat\omega_n > z_{1-\alpha_2/2}$, in which case $f$ is the preferred model, or when $n^{1/2}\,QLR_n(\hat\theta_n, \hat\gamma_n)/\hat\omega_n < -z_{1-\alpha_2/2}$, in which case $g$ is preferred. Here $\alpha_2$ denotes the significance level in step two. When the weight matrix is data dependent, one should use a consistent estimator of $\omega_{A,0}$ in place of $\hat\omega_n$. If $H_0$ is not rejected in the second step, one concludes that the two models are equivalent. Vuong (1989) shows that the asymptotic significance level of the sequential procedure is $\max(\alpha_1, \alpha_2)$.
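The two-step decision rule above can be written as a short function. The sketch below follows the rule exactly as stated in the text and assumes that the QLR statistic, the variance estimate, and the step-one mixed $\chi^2$ critical value (for example, from simulation) are computed elsewhere; the function name, argument names, and return labels are hypothetical.

```python
from statistics import NormalDist


def sequential_overlapping_test(n, qlr, omega_hat, c_step1, alpha2=0.05):
    """Two-step procedure for overlapping models.

    n         : sample size
    qlr       : value of QLR_n at the CMD estimates
    omega_hat : consistent estimate of omega_0 (or of omega_{A,0})
    c_step1   : mixed chi-square critical value c_{1 - alpha1} for step one
    alpha2    : significance level for step two
    Returns "equivalent", "f" (f preferred) or "g" (g preferred).
    """
    # Step 1: test f(theta_0) = g(gamma_0) with the n-scaled statistic.
    if n * qlr <= c_step1:
        return "equivalent"

    # Step 2: two-sided normal test with the root-n scaled statistic.
    z = NormalDist().inv_cdf(1.0 - alpha2 / 2.0)
    t_stat = (n ** 0.5) * qlr / omega_hat
    if t_stat > z:
        return "f"
    if t_stat < -z:
        return "g"
    return "equivalent"
```

The overall asymptotic significance level of the procedure is $\max(\alpha_1, \alpha_2)$, so the level used to obtain `c_step1` and `alpha2` would typically be chosen together.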
3.5 Model comparison with estimation and evaluation on different sets of reduced-form parameters

In the calibration literature, model parameters are often estimated or calibrated using one set of reduced-form characteristics, while the model evaluation is conducted on another. For example, a structural model can be estimated to match first moments and evaluated with respect to second moments. Such a case is discussed in this section; it is analogous to out-of-sample model evaluation in the forecasting literature [30]; it also corresponds to the case of model comparison without lack-of-fit minimization in Rivers and Vuong (2002).

[30] See, for example, West and McCracken (1998).

We find that when a model is estimated and evaluated on different sets of reduced-form parameters, the QLR statistic has an asymptotically normal distribution regardless of whether $f$ and $g$ are nested or non-nested. The reason is that even when the models are nested, a bigger model does not necessarily provide a better fit, since the deep parameters are not calibrated to minimize the distance between the truth and the part of the model used for evaluation. This conclusion is in agreement with the results in Section 6 of Rivers and Vuong (2002).

Next, we introduce the notation and assumptions of this section. We partition $h_0 = (h_{1,0}', h_{2,0}')'$, where $h_{1,0}$ is an $m_1$-vector, $h_{2,0}$ is an $m_2$-vector, and $m_1 + m_2 = m$. Similarly, we partition $\hat h_n = (\hat h_{1,n}', \hat h_{2,n}')'$, $f(\theta) = (f_1(\theta)', f_2(\theta)')'$, and $g(\gamma) = (g_1(\gamma)', g_2(\gamma)')'$. Next, consider the weight matrices $A_1$ and $A_2$, where $A_i$ is $m_i \times m_i$, $i = 1, 2$. At the estimation stage, the parameters are calibrated using only the first $m_1$ reduced-form characteristics and the weight matrix $A_1$:
\[
\hat\theta_n(A_{1,n}) = \arg\min_{\theta \in \Theta} \left\|A_{1,n}\left(\hat h_{1,n} - f_1(\theta)\right)\right\|^2, \quad\text{and}\quad
\hat\gamma_n(A_{1,n}) = \arg\min_{\gamma \in \Gamma} \left\|A_{1,n}\left(\hat h_{1,n} - g_1(\gamma)\right)\right\|^2.
\]
At the evaluation stage, the models are compared using the remaining $m_2$ reduced-form characteristics and the weight matrix $A_2$:
\[
H_0 : \|A_2(h_{2,0} - f_2(\theta_0(A_1)))\| = \|A_2(h_{2,0} - g_2(\gamma_0(A_1)))\|. \tag{3.1}
\]
\[
H_f : \|A_2(h_{2,0} - f_2(\theta_0(A_1)))\| < \|A_2(h_{2,0} - g_2(\gamma_0(A_1)))\|. \tag{3.2}
\]
\[
H_g : \|A_2(h_{2,0} - f_2(\theta_0(A_1)))\| > \|A_2(h_{2,0} - g_2(\gamma_0(A_1)))\|. \tag{3.3}
\]
We make the following assumption.

Assumption 9. (a) $f_2$ and $g_2$ are misspecified according to Definition 1.
(b) $A_{1,n} \to_p A_1$, $A_{2,n} \to_p A_2$; $A_1$ and $A_2$ have full ranks.
(c) Assumptions 4 and 7 hold for $A_1$, $f_1$, and $g_1$.
(d) $\dfrac{\partial f_2(\theta_0(A_1))'}{\partial\theta} A_2'A_2 \left(h_{2,0} - f_2(\theta_0(A_1))\right) \neq 0$; $\dfrac{\partial g_2(\gamma_0(A_1))'}{\partial\gamma} A_2'A_2 \left(h_{2,0} - g_2(\gamma_0(A_1))\right) \neq 0$.

According to part (a) of the assumption, the models are misspecified with respect to the second set of reduced-form parameters $h_2$. Note that the pseudo-true values of the parameters are defined with respect to $A_1$ and the first $m_1$ reduced-form characteristics. Consequently, the first-order condition (3.2) does not hold for $f_2$, $g_2$, $h_2$, and $A_2$, since $\theta_0(A_1)$ and $\gamma_0(A_1)$ are not the minimizers of the CMD criterion for the remaining $m_2$ reduced-form characteristics, as described in part (d).

The QLR statistic is now defined as
\[
QLR_n\left(\hat\theta_n(A_{1,n}), \hat\gamma_n(A_{1,n}), A_{2,n}\right)
= \left\|A_{2,n}\left(\hat h_{2,n} - g_2(\hat\gamma_n(A_{1,n}))\right)\right\|^2
- \left\|A_{2,n}\left(\hat h_{2,n} - f_2(\hat\theta_n(A_{1,n}))\right)\right\|^2.
\]
Define further
\[
J_{f,0} = \left(-\frac{\partial f_2(\theta_0(A_1))}{\partial\theta'} F_{1,0}^{-1} \frac{\partial f_1(\theta_0(A_1))'}{\partial\theta} A_1'A_1,\;\; I_{m_2}\right),
\]
\[
J_{g,0} = \left(-\frac{\partial g_2(\gamma_0(A_1))}{\partial\gamma'} G_{1,0}^{-1} \frac{\partial g_1(\gamma_0(A_1))'}{\partial\gamma} A_1'A_1,\;\; I_{m_2}\right),
\]
where $F_{1,0}$ and $G_{1,0}$ are defined similarly to $F_0$ and $G_0$ in (3.2) and (3.3) respectively, but using $A_1$, $h_{1,0}$, $f_1$, and $g_1$. In the case of fixed weight matrices, we have the following result.

Theorem 12. Suppose that Assumptions 1 and 9 hold, and $A_{1,n} = A_1$, $A_{2,n} = A_2$ for all $n$.
(a) Under $H_0$ in (3.1), $n^{1/2}\,QLR_n\left(\hat\theta_n(A_1), \hat\gamma_n(A_1), A_2\right) \to_d N(0, \omega_{21,0}^2)$, where
\[
\omega_{21,0} = 2\left\|\Lambda_0^{1/2}\left(J_{g,0}' A_2'A_2 \left(h_{2,0} - g_2(\gamma_0(A_1))\right) - J_{f,0}' A_2'A_2 \left(h_{2,0} - f_2(\theta_0(A_1))\right)\right)\right\|. \tag{3.4}
\]
(b) Under $H_f$ in (3.2), $n^{1/2}\,QLR_n\left(\hat\theta_n(A_1), \hat\gamma_n(A_1), A_2\right) \to \infty$ with probability one; under the alternative $H_g$ in (3.3), $n^{1/2}\,QLR_n\left(\hat\theta_n(A_1), \hat\gamma_n(A_1), A_2\right) \to -\infty$ with probability one.

As before, the QLR statistic is asymptotically normal when the models are non-nested. Now, however, it is asymptotically normal also in the nested case. This is because there is no minimization of the lack-of-fit functions in (3.4). Thus, when the models are estimated using one set of reduced-form parameters and evaluated using another, one follows the same rule regardless of whether the models are nested, non-nested, or overlapping. One should reject the null of equivalent models when
\[
\left|n^{1/2}\,QLR_n\left(\hat\theta_n(A_1), \hat\gamma_n(A_1), A_2\right)/\hat\omega_{21,n}\right| > z_{1-\alpha/2},
\]
where $\hat\omega_{21,n}$ is a consistent estimator of $\omega_{21,0}$. Such an estimator can be obtained by the plug-in method, since all the elements of $\omega_{21,0}$ can be consistently estimated. Note that, when $f_2(\theta_0(A_1)) = g_2(\gamma_0(A_1))$, which can occur if the models are nested or overlapping, the columns corresponding to $I_{m_2}$ in $J_{f,0}$ and $J_{g,0}$ do not contribute to the asymptotic variance; however, this will be reflected automatically by any consistent estimator $\hat\omega_{21,n}$.

When the weight matrices are data dependent, one can adjust the asymptotic variance of the QLR statistic in a manner similar to that in Theorem 11.

3.6 Averaged and Sup tests for model comparison, and confidence sets for weight matrices

The choice of the weight matrix $A$ plays a crucial role when the models are misspecified: the null hypothesis (3.3) changes with different weight matrices, and as a result different weighting schemes can lead to different rankings of the models.
One way to relax this dependence is to consider a procedure that takes into account the models' performance for various weighting schemes.

3.6.1 Averaged and Sup Tests

In this section, we propose averaged and sup procedures for model comparison. We assume that the models are estimated and evaluated on the same set of reduced-form parameters. Let $\mathbb{A}$ be a sub-space of $m \times m$ full-rank matrices, $\|A\| = tr(A'A)^{1/2}$, $\mathcal{B}(\mathbb{A})$ be a $\sigma$-field generated by open subsets of $\mathbb{A}$, and $\pi$ be a probability measure on $\mathcal{B}(\mathbb{A})$. We make the following assumption.

Assumption 10. (a) $\mathbb{A}$ is compact.
(b) Assumption 4 holds for all $A \in \mathbb{A}$.

The null hypothesis of the averaged procedure is stated as
\[
H_0^a : \int_{\mathbb{A}} \left(\|A(h_0 - g(\gamma_0(A)))\|^2 - \|A(h_0 - f(\theta_0(A)))\|^2\right) \pi(dA) = 0.
\]
According to $H_0^a$, the two models $f$ and $g$ provide equivalent approximations to the true $h_0$ on average, where the average is taken in the class $\mathbb{A}$ with respect to the probability measure $\pi$. For example, $\mathbb{A}$ may consist of a finite number of matrices $A$, with $\pi$ assigning equal weights to all $A$'s. Note that the pseudo-true values $\theta_0(A)$ and $\gamma_0(A)$ continue to depend on $A$. The null hypothesis $H_0^a$ will be tested against the alternatives
\[
H_f^a : \int_{\mathbb{A}} \left(\|A(h_0 - g(\gamma_0(A)))\|^2 - \|A(h_0 - f(\theta_0(A)))\|^2\right) \pi(dA) > 0, \quad\text{or}
\]
\[
H_g^a : \int_{\mathbb{A}} \left(\|A(h_0 - g(\gamma_0(A)))\|^2 - \|A(h_0 - f(\theta_0(A)))\|^2\right) \pi(dA) < 0.
\]
The null hypothesis of the sup procedure is given by
\[
H_0^s : \sup_{A \in \mathbb{A}} \left(\|A(h_0 - g(\gamma_0(A)))\|^2 - \|A(h_0 - f(\theta_0(A)))\|^2\right) \leq 0.
\]
According to $H_0^s$, the model $f$ cannot outperform the model $g$ for any considered weight matrix $A \in \mathbb{A}$. Thus, $H_0^s$ imposes a much stronger restriction than $H_0^a$. The null $H_0^s$ will be tested against the following alternative:
\[
H_f^s : \sup_{A \in \mathbb{A}} \left(\|A(h_0 - g(\gamma_0(A)))\|^2 - \|A(h_0 - f(\theta_0(A)))\|^2\right) > 0.
\]
According to $H_f^s$, there is a weight matrix $A$ such that model $f$ outperforms model $g$.
Again, we consider the QLR statistic defined in (3.5); however, it is now explicitly indexed by $A$:
\[
QLR_n\left(\hat\theta_n(A), \hat\gamma_n(A), A\right) = \left\|A\left(\hat h_n - g(\hat\gamma_n(A))\right)\right\|^2 - \left\|A\left(\hat h_n - f(\hat\theta_n(A))\right)\right\|^2. \tag{3.1}
\]
The averaged and sup statistics are given by
\[
AQLR_n = \int_{\mathbb{A}} QLR_n\left(\hat\theta_n(A), \hat\gamma_n(A), A\right) \pi(dA), \qquad
SQLR_n = \sup_{A \in \mathbb{A}} QLR_n\left(\hat\theta_n(A), \hat\gamma_n(A), A\right).
\]
The asymptotic null distributions and ranking of the models according to $AQLR_n$ or $SQLR_n$ depend on the choice of the measure $\pi$. The asymptotic null distributions of the averaged and sup statistics also depend on whether $f$ and $g$ are nested or non-nested.

When the models are nested, $\mathcal{G} \subset \mathcal{F}$, the model $g$ cannot outperform the model $f$, and the inequality in $H_0^s$ holds as an equality. We have the following result.

Theorem 13. Suppose that Assumptions 1, 3, 5-7, 10 hold, and $\mathcal{G} \subset \mathcal{F}$. Let $Z \sim N(0, I_m)$, and, for a given $A \in \mathbb{A}$, define the matrices $W_{f,0}(A)$, $W_{g,0}(A)$ as $W_{f,0}$, $W_{g,0}$ in Theorem 8.
(a) Under $H_0^a$,
\[
n\,AQLR_n \to_d Z' \Lambda_0^{1/2\prime} \left(\int_{\mathbb{A}} A'A \left(W_{g,0}(A) - W_{f,0}(A)\right) A'A\, \pi(dA)\right) \Lambda_0^{1/2} Z.
\]
Under $H_f^a$, $n\,AQLR_n \to \infty$ with probability one; under $H_g^a$, $n\,AQLR_n \to -\infty$ with probability one.
(b) Under $H_0^s$,
\[
n\,SQLR_n \to_d \sup_{A \in \mathbb{A}} \left(Z' \Lambda_0^{1/2\prime} A'A \left(W_{g,0}(A) - W_{f,0}(A)\right) A'A \Lambda_0^{1/2} Z\right).
\]
Under $H_f^s$, $n\,SQLR_n \to \infty$ with probability one.

According to Theorem 13, when the models are nested, the asymptotic distribution of the averaged statistic is mixed $\chi^2$. However, the weights are now given by the average of the matrices $W_{f,0}$ and $W_{g,0}$. Note that $W_{f,0}$, $W_{g,0}$, $F_0$, and $M_{f,0}$ depend on $A$. Since $W_{f,0}(A)$ and $W_{g,0}(A)$ can be estimated consistently by the plug-in method, the critical values of the mixed $\chi^2$ distribution can be computed by simulations as described in Section 3.4.1. The asymptotic null distribution of the sup statistic depends on the sup transformation of the mixed $\chi^2$ distribution. Its critical values can be obtained by simulations as well.
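For a finite collection of weight matrices, the averaged and sup statistics reduce to a weighted mean and a maximum of the individual QLR values. The following sketch illustrates this under stated assumptions: the callables `fit_f` and `fit_g` are hypothetical stand-ins that return the fitted vectors $f(\hat\theta_n(A))$ and $g(\hat\gamma_n(A))$ for a given $A$ (the CMD estimation itself is not shown), and matrices are plain lists of rows.

```python
def weighted_sq_norm(A, v):
    """Return ||A v||^2 for a matrix A (list of rows) and a vector v."""
    Av = [sum(a * x for a, x in zip(row, v)) for row in A]
    return sum(y * y for y in Av)


def qlr(A, h_hat, f_fit, g_fit):
    """QLR_n for one weight matrix: ||A(h - g)||^2 - ||A(h - f)||^2."""
    resid_g = [h - g for h, g in zip(h_hat, g_fit)]
    resid_f = [h - f for h, f in zip(h_hat, f_fit)]
    return weighted_sq_norm(A, resid_g) - weighted_sq_norm(A, resid_f)


def averaged_and_sup_qlr(weight_matrices, h_hat, fit_f, fit_g, probs=None):
    """Return (AQLR_n, SQLR_n) over a finite set of weight matrices.

    probs holds the probabilities pi assigns to each matrix; equal
    weights are used when it is omitted.
    """
    values = [qlr(A, h_hat, fit_f(A), fit_g(A)) for A in weight_matrices]
    if probs is None:
        probs = [1.0 / len(values)] * len(values)
    aqlr = sum(p * v for p, v in zip(probs, values))
    sqlr = max(values)
    return aqlr, sqlr
```

A positive value of either statistic favors model $f$, in line with the sign convention of the QLR statistic; inference still requires the critical values discussed in the text.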
In the case of non-nested models, the asymptotic null distribution is a functional of a Gaussian process. Note that when the models are non-nested, $H_0^s$ does not determine the null distribution uniquely. It is a composite hypothesis, and the null distribution depends on whether the restriction is binding or not; the least favorable alternative, as usual, corresponds to the case when the restriction is binding.

Theorem 14. Suppose that Assumptions 1, 3, 5-7, 10 hold, and $\mathcal{F} \cap \mathcal{G} = \emptyset$. Let $\{X(A) \in R : A \in \mathbb{A}\}$ be a mean zero Gaussian process such that the covariance of $X(A_1)$ and $X(A_2)$, $A_1, A_2 \in \mathbb{A}$, is $\omega_0(A_1, A_2)$, where
\[
\omega_0(A_1, A_2) = 4\left(f(\theta_0(A_1)) - g(\gamma_0(A_1))\right)' A_1'A_1 \Lambda_0 A_2'A_2 \left(f(\theta_0(A_2)) - g(\gamma_0(A_2))\right).
\]
(a) Under $H_0^a$, $n^{1/2}\,AQLR_n \to_d N\left(0, \int_{\mathbb{A}}\int_{\mathbb{A}} \omega_0(A_1, A_2)\, \pi(dA_1)\, \pi(dA_2)\right)$. Under $H_f^a$, $n^{1/2}\,AQLR_n \to \infty$ with probability one; under $H_g^a$, $n^{1/2}\,AQLR_n \to -\infty$ with probability one.
(b) Under $H_0^s$, $\lim_{n\to\infty} P\left(n^{1/2}\,SQLR_n > c\right) \leq P\left(\sup_{A \in \mathbb{A}} X(A) > c\right)$. Under $H_f^s$, $n^{1/2}\,SQLR_n \to \infty$ with probability one.

According to Theorem 14, the averaged statistic has an asymptotically normal distribution. The variance is given by the weighted average of variances and covariances of the QLR statistics for different $A$'s; it can be estimated consistently by the plug-in method. For the sup statistic, the asymptotic distribution is that of the sup of the Gaussian process, and the critical values for a test based on $SQLR_n$ can be obtained by simulations.

In the case of overlapping models, one can apply a sequential procedure similar to the one discussed in Section 3.4.3.

3.6.2 Confidence Sets for Weight Matrices

When all the considered models are misspecified, it is possible that model $g$ provides a better approximation to one set of reduced-form characteristics, say $h_1$, and model $f$ performs better on another set of $h$.
In such a case, it might be of interest to see how large the weight of $h_1$ has to be for model $g$ to be preferred to $f$ overall. Let $\mathcal{A}_0$ be a collection of weighting schemes under which $g$ is preferred to $f$:
\[
\mathcal{A}_0 = \{A \in \mathbb{A} : \|A(h_0 - g(\gamma_0(A)))\| - \|A(h_0 - f(\theta_0(A)))\| \leq 0\}.
\]
In this section, we discuss the construction of a confidence set (CS) for $\mathcal{A}_0$. The CS for $\mathcal{A}_0$, $CS_{n,1-\alpha}$, is defined by the requirement
\[
\lim_{n\to\infty} P\left(A \in CS_{n,1-\alpha}\right) \geq 1 - \alpha \quad\text{for all } A \in \mathcal{A}_0,
\]
and can be constructed by inversion of the basic QLR test discussed in Section 3.4. First, given $A \in \mathbb{A}$, compute $QLR_n(A)$. Next, test $H_0 : A \in \mathcal{A}_0$ as follows: if the models are non-nested, reject $H_0$ when $QLR_n(A) > z_{1-\alpha}\hat\omega_n/\sqrt{n}$. If the models are nested, assuming that $\mathcal{G} \subset \mathcal{F}$, one can use the mixed $\chi^2$ critical values as described in Section 3.4.1 to test $H_0$. If the models are overlapping, one can apply the sequential procedure of Section 3.4.3. The confidence set $CS_{n,1-\alpha}$ is given by the collection of all $A$ for which $H_0 : A \in \mathcal{A}_0$ cannot be rejected.

3.7 Application

In this section we apply our proposed test to two monetary macroeconomic models, the cash-in-advance (CIA) model and the portfolio adjustment cost (PAC) model. Detailed discussions of these models can be found in Christiano (1991) and Christiano and Eichenbaum (1992). We compare the performance of the two models based on their ability to match the responses of output and inflation to a monetary growth shock. These impulse responses therefore comprise $h_0$, the vector of reduced-form characteristics of interest in our application. We obtain a consistent estimate of $h_0$ from a structural vector autoregression (SVAR) model of GDP and inflation. The identification scheme employed for the SVAR follows Blanchard and Quah (1989). The particular restriction applied to identify the SVAR model is that money is neutral in the long run, which is satisfied by both the CIA and PAC models.
Since both the CIA and PAC models are standard in the literature, we outline them only briefly below. We also want to compare the results of our testing procedure with those obtained by Schorfheide (2000). For this purpose we follow his model specifications closely.

3.7.1 CIA Model

The model economy is populated by a representative household, a firm, and a financial intermediary. At the beginning of period $t$ the household owns the economy's entire money stock $M_t$ and decides how to allocate it between purchases of consumption goods and deposits in the financial intermediary, $M_t - Q_t$, where $Q_t$ is money allocated to purchases of consumption goods. Consumption purchases must be financed with $Q_t$ and wage earnings. Thus, the objective of the household is to choose real consumption, $C_t$, working hours, $H_t$, and nominal deposits, $M_t - Q_t$, to solve the following problem:
\[
\max_{\{C_t, H_t, M_{t+1}, Q_t\}} \mathbb{E}_0\left[\sum_{t=0}^{\infty} \beta^t \left[(1 - \phi)\ln C_t + \phi \ln(1 - H_t)\right]\right],
\]
subject to
\[
P_t C_t \leq Q_t + W_t H_t, \qquad Q_t \leq M_t,
\]
\[
M_{t+1} = (Q_t + W_t H_t - P_t C_t) + R_{H,t}(M_t - Q_t) + F_t + B_t.
\]
Here $\mathbb{E}_t$ denotes conditional expectation at date $t$, $\beta$ is the subjective discount factor, and $\phi$ is the share of leisure in per-period utility. $P_t$ denotes the economy's price level, while $W_t$ and $R_{H,t}$ denote the nominal wage rate and the return on deposits. The household also receives nominal profits paid by the firm, $F_t$, and by the financial intermediary, $B_t$.

The production technology in the economy is $Y_t = K_t^{\alpha}(\mathcal{A}_t N_t)^{1-\alpha}$, where $K_t$, $N_t$, and $\mathcal{A}_t$ are capital stock, labor input, and labor-augmenting technology, respectively. Firms must pay the total wage bill up-front to workers, so they borrow $W_t N_t$ from the financial intermediary. Loans must be repaid at the end of period $t$. The representative firm's problem is
\[
\max_{\{F_t, K_{t+1}, N_t, L_t\}} \mathbb{E}_0\left[\sum_{t=0}^{\infty} \beta^{t+1} \frac{F_t}{C_{t+1} P_{t+1}}\right],
\]
subject to
\[
F_t \leq L_t + P_t\left[Y_t - K_{t+1} + (1 - \delta)K_t\right] - W_t N_t - L_t R_{F,t}, \qquad W_t N_t \leq L_t.
\]
The objective of the financial intermediary in this economy is simple.
At the beginning of each period, it loans out the household's deposit $M_t - Q_t$ and the money injection $X_t$ received from the central bank to the firm. At the end of the period, it collects the loan plus interest, $L_t R_{F,t}$, and pays the amount to the household.

The household, firm, and financial intermediary all take prices as given. Technology $\mathcal{A}_t$ and the money growth rate $m_t = M_{t+1}/M_t$ follow the stochastic processes
\[
\ln \mathcal{A}_t = \psi + \ln \mathcal{A}_{t-1} + \epsilon_{\mathcal{A},t}, \quad\text{with } \epsilon_{\mathcal{A},t} \sim N\left(0, \sigma_{\mathcal{A}}^2\right),
\]
\[
\ln m_t = (1 - \rho)\ln m_{ss} + \rho \ln m_{t-1} + \epsilon_{M,t}, \quad\text{with } \epsilon_{M,t} \sim N\left(0, \sigma_M^2\right).
\]
Here $m_{ss}$ is the steady-state money growth rate, and $\psi$, $\rho$ are parameters.

To solve the model, we first re-scale all real variables by the technology level $\mathcal{A}_t$, prices by $M_t/\mathcal{A}_t$, and nominal variables by $M_t$. Then we log-linearize the equilibrium conditions around the deterministic steady state and solve the resulting system of linear difference equations. The state space representation for the exogenous and endogenous state variables is
\[
\hat k_{t+1} = \eta_1 \hat k_t + \eta_2 \hat a_t + \eta_3 \hat m_t, \qquad
\hat a_{t+1} = \epsilon_{\mathcal{A},t+1}, \qquad
\hat m_{t+1} = \rho \hat m_t + \epsilon_{M,t+1},
\]
where a hat over a variable denotes the log deviation of that (re-scaled) variable from its steady state value. The coefficients $\eta_1$, $\eta_2$, and $\eta_3$ are functions of the model parameters $m_{ss}$, $\alpha$, $\beta$, $\delta$, $\psi$, $\phi$, and $\rho$. We use this state space representation to obtain the theoretical impulse responses of the model. These impulse responses are conditional on the structural model parameters $\theta = [m_{ss}, \alpha, \beta, \delta, \psi, \phi, \rho, \sigma_{\mathcal{A}}^2, \sigma_M^2]'$. In the minimum distance estimation, we look for the vector $\hat\theta$ that minimizes the distance between the theoretical impulse responses for output growth and inflation and the impulse responses generated by the data.

3.7.2 PAC Model

The production function and stochastic processes governing technology and money growth in the PAC model are the same as in the CIA model. The key difference between the two models is in the information sets that the household faces.
In particular, in the PAC model the household's contingency plan for deposit holdings is not a function of period-$t$ realizations of shocks. This rigidity of $Q_t$ implies that any positive money shock must be absorbed by firms. For firms to be willing to do so voluntarily, the interest rate must fall. To make this liquidity effect persistent, Christiano (1991) introduces the second distinct feature of the PAC model, the existence of an adjustment cost $\tilde p_t$ given by
\[
\tilde p_t = \alpha_1\left[\exp\left(\alpha_2\left(\frac{Q_t}{Q_{t-1}} - m_{ss}\right)\right) + \exp\left(-\alpha_2\left(\frac{Q_t}{Q_{t-1}} - m_{ss}\right)\right) - 2\right].
\]
The household's problem in the PAC model is
\[
\max_{\{C_t, H_t, M_{t+1}, Q_{t+1}\}} \mathbb{E}_0\left[\sum_{t=0}^{\infty} \beta^t \left[(1 - \phi)\ln C_t + \phi \ln(1 - H_t - \tilde p_t)\right]\right],
\]
subject to
\[
P_t C_t \leq Q_t + W_t H_t, \qquad Q_t \leq M_t,
\]
\[
M_{t+1} = (Q_t + W_t H_t - P_t C_t) + R_{H,t}(M_t - Q_t) + F_t + B_t.
\]
The firm's problem and the financial intermediary's problem are identical to those in the CIA model. We solve this model and calculate its theoretical impulse responses using the same procedure as for the CIA model. These impulse responses are conditional on the set of structural model parameters $\gamma = [m_{ss}, \alpha, \beta, \delta, \psi, \phi, \rho, \sigma_{\mathcal{A}}^2, \sigma_M^2, \alpha_1, \alpha_2]'$. As before, we use minimum distance estimation to find the vector $\hat\gamma$ that minimizes the distance between the theoretical impulse responses for output growth and inflation and the impulse responses generated by the data.

3.7.3 Model Estimation and Comparison Results

The CIA and PAC models both provide predictions for the evolution of multiple time series. In this application we focus on the growth rates of GDP per capita and the price level. Thus, the vector $h_0$ consists of 20-period output growth impulse responses and 20-period inflation impulse responses to a money growth shock. We use an SVAR model on the GDP per capita growth and inflation series to obtain a consistent estimate of $h_0$, which we denote $\hat h_n$. The data used in the empirical analysis are the US GDP per capita growth rate and inflation rate available from the Basic Economics database produced by DRI/McGraw-Hill [31]. Our sample covers the period 1947:Q2 to 2003:Q3.

[31] The GDP per capita series are obtained as a ratio of the GDP series (GDP215 in the DRI database) and the population series (POP in the DRI database). For the price level, we choose the GDP deflator series (GDPD15 in the DRI database).

To conduct our testing procedure, we search for (a) values of the parameter vector $\theta = [m_{ss}, \alpha, \beta, \delta, \psi, \phi, \rho, \sigma_{\mathcal{A}}^2, \sigma_M^2]'$ that minimize the distance between the theoretical impulse responses in the CIA model, $f(\theta)$, and the empirical impulse responses from the SVAR model, $\hat h_n$; and (b) values of the parameter vector $\gamma = [m_{ss}, \alpha, \beta, \delta, \psi, \phi, \rho, \sigma_{\mathcal{A}}^2, \sigma_M^2, \alpha_1, \alpha_2]'$ that minimize the corresponding distance for the PAC model, $g(\gamma)$.

To reduce the computation time, in our calibration exercise we fix the values of some parameters. Following Christiano and Eichenbaum (1992), we set $\alpha = 0.36$, $\beta = (1.03)^{-0.25}$, $\phi = 0.797$, and $\delta = 0.012$. We borrow the value of $\sigma_{\mathcal{A}} = 0.014$ from Christiano (1991). We calibrate the rest of the parameters. Because our procedure requires that the parameter vectors are defined on compact sets, we restrict the ranges of the model parameters as follows. We assume that the steady state growth rate of money, $m_{ss}$, belongs to $[0, 0.05]$, and that the steady state growth rate of productivity, $\psi$, belongs to $[0, 0.1]$. The persistence of the money shock is between 0 and 1, $\rho \in [0, 1]$. We assume that $\sigma_M \in [0.0001, 0.004]$. Finally, we restrict the range for the parameters in the adjustment cost technology. Note that in the log-linearized PAC model the parameters $\alpha_1$ and $\alpha_2$ only enter through the combination $\alpha_1\alpha_2^2$. Therefore, we can only identify $\alpha_1\alpha_2^2$. In the calibration procedure we draw on Christiano and Eichenbaum (1992), who set $\alpha_1 = 0.00005$ and $\alpha_2 = 1000$, and restrict $\alpha_1\alpha_2^2 \in [10, 90]$. Table 3.1 summarizes the ranges for the model parameters and their estimates with the weighting matrix being an identity matrix ($A_n = A = I$).
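The logic of minimum distance estimation over a compact parameter range with identity weights can be illustrated with a deliberately small toy problem. The sketch below is not the estimation code behind Table 3.1: it uses only the money growth recursion $\hat m_{t+1} = \rho \hat m_t$ as a stand-in "model", treats $\rho$ as the single free parameter, and replaces whatever numerical optimizer was actually used (not specified in the text) with a grid search. All function names are hypothetical.

```python
def money_growth_irf(rho, horizon):
    """Response of m_hat to a unit innovation at date 0, using m_hat_{t+1} = rho * m_hat_t."""
    response, m_hat = [], 1.0
    for _ in range(horizon):
        response.append(m_hat)
        m_hat = rho * m_hat
    return response


def cmd_objective(h_hat, fitted):
    """Squared distance ||h_hat - f(theta)||^2 with identity weight matrix (A = I)."""
    return sum((h - f) ** 2 for h, f in zip(h_hat, fitted))


def cmd_grid_search(h_hat, model, grid):
    """Return the grid point that minimizes the CMD objective.

    model maps a parameter value to the vector of theoretical impulse responses;
    grid is a finite set of candidate values inside the compact parameter range.
    """
    return min(grid, key=lambda theta: cmd_objective(h_hat, model(theta)))


# Toy usage: recover rho from "empirical" responses generated with rho = 0.85,
# searching over [0, 1] in steps of 0.01.
empirical_irf = money_growth_irf(0.85, 20)
rho_grid = [i / 100 for i in range(101)]
rho_hat = cmd_grid_search(empirical_irf, lambda r: money_growth_irf(r, 20), rho_grid)
```

In the actual application the map from parameters to impulse responses comes from the solved log-linearized model, and several parameters are estimated jointly, so a grid search of this kind would typically be replaced by a numerical optimizer constrained to the ranges reported in Table 3.1.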
Table 3.1: CIA and PAC Parameters' Estimates and Their Standard Errors

Parameter               Description                               Range             CIA estimate          PAC estimate
$\alpha$                capital share                             -                 0.36 (fixed)          0.36 (fixed)
$\beta$                 discount factor                           -                 0.9926 (fixed)        0.9926 (fixed)
$\delta$                depreciation                              -                 0.012 (fixed)         0.012 (fixed)
$\phi$                  leisure share in utility                  -                 0.797 (fixed)         0.797 (fixed)
$\sigma_{\mathcal{A}}$  std. dev. of productivity innovations     -                 0.014 (fixed)         0.014 (fixed)
$\psi$                  steady state productivity growth          [0, 0.1]          0.0001 (3.71E-07)     0.1 (2.58E-05)
$\rho$                  money shock persistence                   [0, 1]            0.89 (2.35E-05)       0.85 (4.84E-04)
$\sigma_M$              std. dev. of money growth innovations     [0.0001, 0.004]   0.0024 (6.90E-03)     0.0032 (8.80E-03)
$m_{ss}$                steady state money growth                 [0, 0.05]         0.001 (2.49E-05)      0.05 (2.89E-04)
$\alpha_1 \alpha_2^2$   adjustment cost parameter                 [10, 90]          -                     32.86 (130.20)

Note: Standard errors are in parentheses.

Figures 3.1 and 3.2 plot the models' prediction errors for the impulse responses of inflation and output, $\hat{h}_n - f(\hat{\theta}_n)$ and $\hat{h}_n - g(\hat{\gamma}_n)$. Figure 3.1 suggests that both the CIA and PAC models attain some success in replicating inflation dynamics, while Figure 3.2 indicates that both models fall short in matching the real-side dynamics. These results agree with the findings of Nason and Cogley (1994). At the same time, the PAC model has a marginally better fit to the data than the CIA model in terms of the output impulse responses.$^{32}$ The reason is that the CIA model generates virtually no output dynamics. In the CIA model, households can always rebalance their money holdings to nullify the real effect of a money growth shock. In contrast, the PAC model generates a positive output response to a money shock, although the magnitude is much smaller than that of the SVAR. The better fit provided by the PAC model is not surprising, since it is richer and nests the CIA model. In order to determine whether the better performance of the PAC model in approximating the impulse responses is statistically significant, we compute the test statistic proposed in Section 3.4.1. The value of the $QLR_n$ statistic equals 0.0008.
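When a test statistic has a mixed (weighted) chi-square limiting distribution, its critical values are typically obtained by simulation. The sketch below illustrates that idea only; it is not the chapter's code. The weights are hypothetical placeholders (in practice they would come from estimated quantities), and the number of draws and the seed are arbitrary assumptions.

```python
import random


def simulate_weighted_chi2(weights, draws=20000, seed=0):
    """Sorted draws of sum_i w_i * Z_i^2 with independent standard normal Z_i."""
    rng = random.Random(seed)
    return sorted(
        sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights) for _ in range(draws)
    )


def critical_value(sorted_draws, level):
    """Upper-tail critical value: the (1 - level) empirical quantile."""
    index = min(len(sorted_draws) - 1, int((1.0 - level) * len(sorted_draws)))
    return sorted_draws[index]


def p_value(sorted_draws, statistic):
    """Share of simulated draws at least as large as the observed statistic."""
    return sum(1 for d in sorted_draws if d >= statistic) / len(sorted_draws)


# Hypothetical weights; some may be negative when the limit is a
# difference of two quadratic forms.
weights = [0.05, 0.03, 0.01, -0.02, -0.04]
sims = simulate_weighted_chi2(weights)
cv5 = critical_value(sims, 0.05)
cv10 = critical_value(sims, 0.10)
print(cv5, cv10, p_value(sims, 0.01))
```

The null is rejected at a given level when the observed statistic exceeds the corresponding simulated critical value, which is equivalent to the simulated p-value falling below that level.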
Since the test statistic has a mixed $\chi^2_{40}$ distribution, we simulate its critical values. The 5% and 10% critical values are 0.1192 and 0.1050, respectively, both larger than 0.0008. The p-value of the test is 0.4905. Therefore we fail to reject the hypothesis that the CIA model fits the data as well as the PAC model. We conclude that the CIA and PAC models provide an equally poor fit to the output and inflation responses to money supply shocks in the data. Our findings indicate that the rigidities underlying the persistent liquidity effect in the PAC model do not play a significant role in approximating the inflation and output impulse response dynamics.

$^{32}$ This is in agreement with the finding of Schorfheide (2000) that the PAC impulse response dynamics provide a better approximation to the posterior mean impulse response function than the CIA model.

Figure 3.1: Model Prediction Errors of the Inflation Impulse Responses with 95% Confidence Bands (left panel: PAC, right panel: CIA; horizontal axis: periods 0 to 20)

Figure 3.2: Model Prediction Errors of the Output Impulse Responses with 95% Confidence Bands (left panel: PAC, right panel: CIA; horizontal axis: periods 0 to 20)

3.8 Proofs of Theorems

Proof of Theorem 5. For consistency of $\hat{\theta}_n$, it is sufficient to show uniform convergence of $\|A_n(\hat{h}_n - f(\theta))\|^2$ to $\|A(h_0 - f(\theta))\|^2$ on $\Theta$. The desired result will follow from Assumptions 4 and 5 by the usual argument for extremum estimators (see, for example, Theorem 2.1 in Newey and McFadden (1994)). Write

\[
\|A_n(\hat{h}_n - f(\theta))\|^2 - \|A(h_0 - f(\theta))\|^2 = R_{1,n} - 2R_{2,n}(\theta) + R_{3,n}(\theta),
\]

where

\[
R_{1,n} = \hat{h}_n' A_n' A_n \hat{h}_n - h_0' A_n' A_n h_0,
\]
\[
R_{2,n}(\theta) = (\hat{h}_n - h_0)' A_n' A_n f(\theta),
\]
\[
R_{3,n}(\theta) = (h_0 - f(\theta))' \left(A_n' A_n - A'A\right) (h_0 - f(\theta)).
\]

By Assumptions 1(a) and 2, $|R_{1,n}| \to_p 0$. Let $\|A\| = [\mathrm{tr}(A'A)]^{1/2}$.
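Due to Assumption 5(a) and (c), $f$ is bounded on $\Theta$ (Davidson, 1994, Theorem 2.19), and, therefore,

\[
\sup_{\theta \in \Theta} |R_{2,n}(\theta)| \le \|A_n\|^2 \, \|\hat{h}_n - h_0\| \, \sup_{\theta \in \Theta} \|f(\theta)\| \to_p 0,
\]

by Assumptions 1(a) and 2. Further,

\[
\sup_{\theta \in \Theta} |R_{3,n}(\theta)| \le \|A_n'A_n - A'A\| \sup_{\theta \in \Theta} \|h_0 - f(\theta)\|^2 \le \|A_n'A_n - A'A\| \left( \|h_0\| + \sup_{\theta \in \Theta} \|f(\theta)\| \right)^2 \to_p 0.
\]

The proof of $\hat{\gamma}_n \to_p \gamma_0$ is identical, with $f$ and $\theta$ replaced by $g$ and $\gamma$. ■

Proof of (3.1). First, applying the mean value expansion to $f(\hat{\theta}_n)$,

\[
0 = \frac{\partial f(\hat{\theta}_n)'}{\partial \theta} A_n'A_n \left( \hat{h}_n - f(\hat{\theta}_n) \right)
\]
\[
= \frac{\partial f(\hat{\theta}_n)'}{\partial \theta} A_n'A_n \left( \hat{h}_n - f(\theta_0) - \frac{\partial f(\tilde{\theta}_n)}{\partial \theta'} (\hat{\theta}_n - \theta_0) \right)
\]
\[
= \frac{\partial f(\hat{\theta}_n)'}{\partial \theta} \left( A_n'A_n (\hat{h}_n - h_0) + (A_n'A_n - A'A)(h_0 - f(\theta_0)) \right)
\]
\[
\quad + \frac{\partial f(\hat{\theta}_n)'}{\partial \theta} A'A (h_0 - f(\theta_0)) - \frac{\partial f(\hat{\theta}_n)'}{\partial \theta} A_n'A_n \frac{\partial f(\tilde{\theta}_n)}{\partial \theta'} (\hat{\theta}_n - \theta_0),
\]

where $\tilde{\theta}_n$ is the mean value. Next,

\[
\frac{\partial f(\hat{\theta}_n)'}{\partial \theta} A'A (h_0 - f(\theta_0)) = \left( I_k \otimes (h_0 - f(\theta_0))' A'A \right) \mathrm{vec}\left( \frac{\partial f(\hat{\theta}_n)}{\partial \theta'} \right)
\]
\[
= \frac{\partial f(\theta_0)'}{\partial \theta} A'A (h_0 - f(\theta_0)) + \left( I_k \otimes (h_0 - f(\theta_0))' A'A \right) \frac{\partial}{\partial \theta'} \mathrm{vec}\left( \frac{\partial f(\bar{\theta}_n)}{\partial \theta'} \right) (\hat{\theta}_n - \theta_0)
\]
\[
= M_{f,n} (\hat{\theta}_n - \theta_0), \qquad (3.1)
\]

where $\bar{\theta}_n$ is the mean value; note that the last equality follows from the population first-order condition (3.2). ■

Proof of Theorem 6. One can expand the first-order conditions for $\hat{\gamma}_n$ similarly to that of $\hat{\theta}_n$, equation (3.1). Taking into account that $A_n = A$ for all $n$,

\[
n^{1/2} \begin{pmatrix} \hat{\theta}_n - \theta_0 \\ \hat{\gamma}_n - \gamma_0 \end{pmatrix} = \begin{pmatrix} F_n^{-1} \dfrac{\partial f(\hat{\theta}_n)'}{\partial \theta} \\ G_n^{-1} \dfrac{\partial g(\hat{\gamma}_n)'}{\partial \gamma} \end{pmatrix} A'A \, n^{1/2} (\hat{h}_n - h_0),
\]

where

\[
G_n = \frac{\partial g(\hat{\gamma}_n)'}{\partial \gamma} A_n'A_n \frac{\partial g(\tilde{\gamma}_n)}{\partial \gamma'} - M_{g,n},
\]
\[
M_{g,n} = \left( I_l \otimes (h_0 - g(\gamma_0))' A'A \right) \frac{\partial}{\partial \gamma'} \mathrm{vec}\left( \frac{\partial g(\bar{\gamma}_n)}{\partial \gamma'} \right),
\]

and $\tilde{\gamma}_n$, $\bar{\gamma}_n$ are between $\hat{\gamma}_n$ and $\gamma_0$. The result follows from Theorem 5 and Assumptions 1(b) and 7. ■

Proof of Theorem 7. The result follows immediately from (3.1), a similar expansion for $\hat{\gamma}_n$, and the assumptions of the theorem by writing

\[
\begin{pmatrix} \hat{\theta}_n - \theta_0 \\ \hat{\gamma}_n - \gamma_0 \end{pmatrix} = \begin{pmatrix} F_n^{-1} \dfrac{\partial f(\hat{\theta}_n)'}{\partial \theta} \begin{pmatrix} A_n'A_n & I_m \otimes (h_0 - f(\theta_0))' \end{pmatrix} \\ G_n^{-1} \dfrac{\partial g(\hat{\gamma}_n)'}{\partial \gamma} \begin{pmatrix} A_n'A_n & I_m \otimes (h_0 - g(\gamma_0))' \end{pmatrix} \end{pmatrix} \begin{pmatrix} \hat{h}_n - h_0 \\ \mathrm{vec}(A_n'A_n - A'A) \end{pmatrix}.
\]
■

Proof of Theorem 8. In the case of the fixed weight matrix, using (3.1) the following expansion is obtained:

\[
\|A(\hat{h}_n - f(\hat{\theta}_n))\|^2 = \|A(\hat{h}_n - f(\theta_0))\|^2 + (\hat{h}_n - h_0)' A'A \, W_{f,n} \, A'A (\hat{h}_n - h_0) + o_p(n^{-1}), \qquad (3.2)
\]

where $W_{f,n} = W_{f,n}(1) - W_{f,n}(2) - W_{f,n}(3)$,

\[
W_{f,n}(1) = \frac{\partial f(\hat{\theta}_n)}{\partial \theta'} F_n'^{-1} \frac{\partial f(\tilde{\theta}_n)'}{\partial \theta} A'A \frac{\partial f(\tilde{\theta}_n)}{\partial \theta'} F_n^{-1} \frac{\partial f(\hat{\theta}_n)'}{\partial \theta},
\]
\[
W_{f,n}(2) = \frac{\partial f(\hat{\theta}_n)}{\partial \theta'} F_n'^{-1} \frac{\partial f(\tilde{\theta}_n)'}{\partial \theta} + \frac{\partial f(\tilde{\theta}_n)}{\partial \theta'} F_n^{-1} \frac{\partial f(\hat{\theta}_n)'}{\partial \theta},
\]
\[
W_{f,n}(3) = \frac{\partial f(\hat{\theta}_n)}{\partial \theta'} F_n'^{-1} \left( M_{f,n} + M_{f,n}' \right) F_n^{-1} \frac{\partial f(\hat{\theta}_n)'}{\partial \theta}.
\]

To show (3.2), write

\[
\|A(\hat{h}_n - f(\hat{\theta}_n))\|^2 = \|A(\hat{h}_n - f(\theta_0))\|^2 + S_{1,n} + S_{2,n} + S_{3,n}, \qquad (3.3)
\]

where

\[
S_{1,n} = \left( f(\hat{\theta}_n) - f(\theta_0) \right)' A'A \left( f(\hat{\theta}_n) - f(\theta_0) \right),
\]
\[
S_{2,n} = -2 (\hat{h}_n - h_0)' A'A \left( f(\hat{\theta}_n) - f(\theta_0) \right),
\]
\[
S_{3,n} = -2 (h_0 - f(\theta_0))' A'A \left( f(\hat{\theta}_n) - f(\theta_0) \right).
\]

Now, one obtains (3.2) by expanding $f(\hat{\theta}_n)$ in $S_{1,n}$, $S_{2,n}$, and $S_{3,n}$ around $f(\theta_0)$ and using (3.1); in the case of $S_{3,n}$, after expanding $f(\hat{\theta}_n)$, one can apply the result in (3.1) to $(h_0 - f(\theta_0))' A'A \, \partial f(\tilde{\theta}_n)/\partial \theta'$, which leads to $M_{f,n}$ in the expression for $W_{f,n}(3)$.

An expansion similar to (3.2) is available for $\|A(\hat{h}_n - g(\hat{\gamma}_n))\|^2$ with $f$, $\theta$, and $F$ replaced by $g$, $\gamma$, and $G$. Hence,

\[
QLR_n(\hat{\theta}_n, \hat{\gamma}_n) = -\|A(\hat{h}_n - f(\theta_0))\|^2 + \|A(\hat{h}_n - g(\gamma_0))\|^2 + (\hat{h}_n - h_0)' A'A \left( W_{g,n} - W_{f,n} \right) A'A (\hat{h}_n - h_0). \qquad (3.4)
\]

Under the null, the first summand on the right-hand side of (3.4) is zero by Lemma 1, and, due to Assumption 1(c) and Theorem 6, the second summand, when multiplied by $n$, converges in distribution to the random variable defined in part (a) of the theorem. Since under $H_f$, $\|A(h_0 - f(\theta_0))\|^2 \le \|A(h_0 - g(\gamma_0))\|^2$, part (b) of the theorem follows. ■

Proof of Theorem 9.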
As in the proof of Theorem 8, write

\[
\|A_n(\hat{h}_n - f(\hat{\theta}_n))\|^2 = \|A_n(\hat{h}_n - f(\theta_0))\|^2 + S^A_{1,n} + S^A_{2,n} + S^A_{3,n}, \qquad (3.5)
\]

where

\[
S^A_{1,n} = \left( f(\hat{\theta}_n) - f(\theta_0) \right)' A_n'A_n \left( f(\hat{\theta}_n) - f(\theta_0) \right),
\]
\[
S^A_{2,n} = -2 (\hat{h}_n - h_0)' A_n'A_n \left( f(\hat{\theta}_n) - f(\theta_0) \right),
\]
\[
S^A_{3,n} = -2 (h_0 - f(\theta_0))' A_n'A_n \left( f(\hat{\theta}_n) - f(\theta_0) \right).
\]

Under the null in the nested case, $f(\theta_0) = g(\gamma_0)$, and therefore $\|A_n(\hat{h}_n - f(\theta_0))\|^2 = \|A_n(\hat{h}_n - g(\gamma_0))\|^2$. Define

\[
D^A_{f,n} = \begin{pmatrix} A_n'A_n & I_m \otimes (h_0 - f(\theta_0))' \end{pmatrix}.
\]

By expanding $f(\hat{\theta}_n)$ around $f(\theta_0)$ and using (3.1), we obtain the following expression for $S^A_{1,n}$:

\[
\begin{pmatrix} \hat{h}_n - h_0 \\ \mathrm{vec}(A_n'A_n - A'A) \end{pmatrix}' D^{A\prime}_{f,n} \frac{\partial f(\hat{\theta}_n)}{\partial \theta'} F_n'^{-1} \frac{\partial f(\tilde{\theta}_n)'}{\partial \theta} A_n'A_n \frac{\partial f(\tilde{\theta}_n)}{\partial \theta'} F_n^{-1} \frac{\partial f(\hat{\theta}_n)'}{\partial \theta} D^A_{f,n} \begin{pmatrix} \hat{h}_n - h_0 \\ \mathrm{vec}(A_n'A_n - A'A) \end{pmatrix},
\]

where $\tilde{\theta}_n$ is the mean value. Similarly, for $-S^A_{2,n}$ we obtain

\[
\begin{pmatrix} \hat{h}_n - h_0 \\ \mathrm{vec}(A_n'A_n - A'A) \end{pmatrix}' \begin{pmatrix} A_n'A_n \\ 0 \end{pmatrix} \frac{\partial f(\tilde{\theta}_n)}{\partial \theta'} F_n^{-1} \frac{\partial f(\hat{\theta}_n)'}{\partial \theta} D^A_{f,n} \begin{pmatrix} \hat{h}_n - h_0 \\ \mathrm{vec}(A_n'A_n - A'A) \end{pmatrix}
\]
\[
+ \begin{pmatrix} \hat{h}_n - h_0 \\ \mathrm{vec}(A_n'A_n - A'A) \end{pmatrix}' D^{A\prime}_{f,n} \frac{\partial f(\hat{\theta}_n)}{\partial \theta'} F_n'^{-1} \frac{\partial f(\tilde{\theta}_n)'}{\partial \theta} \begin{pmatrix} A_n'A_n \\ 0 \end{pmatrix}' \begin{pmatrix} \hat{h}_n - h_0 \\ \mathrm{vec}(A_n'A_n - A'A) \end{pmatrix}.
\]

Next, for $S^A_{3,n}$ write

\[
-\tfrac{1}{2} S^A_{3,n} = (h_0 - f(\theta_0))' A'A \left( f(\hat{\theta}_n) - f(\theta_0) \right) + (h_0 - f(\theta_0))' \left( A_n'A_n - A'A \right) \left( f(\hat{\theta}_n) - f(\theta_0) \right). \qquad (3.6)
\]

For the first summand on the right-hand side of (3.6), applying the mean-value expansion to $f(\hat{\theta}_n)$ around $f(\theta_0)$ and by (3.1), we obtain

\[
(h_0 - f(\theta_0))' A'A \left( f(\hat{\theta}_n) - f(\theta_0) \right) = (h_0 - f(\theta_0))' A'A \frac{\partial f(\tilde{\theta}_n)}{\partial \theta'} (\hat{\theta}_n - \theta_0)
\]
\[
= (\hat{\theta}_n - \theta_0)' M_{f,n} (\hat{\theta}_n - \theta_0)
\]
\[
= \begin{pmatrix} \hat{h}_n - h_0 \\ \mathrm{vec}(A_n'A_n - A'A) \end{pmatrix}' D^{A\prime}_{f,n} \frac{\partial f(\hat{\theta}_n)}{\partial \theta'} F_n'^{-1} M_{f,n} F_n^{-1} \frac{\partial f(\hat{\theta}_n)'}{\partial \theta} D^A_{f,n} \begin{pmatrix} \hat{h}_n - h_0 \\ \mathrm{vec}(A_n'A_n - A'A) \end{pmatrix}.
\]

For the second summand on the right-hand side of (3.6), write

\[
(h_0 - f(\theta_0))' \left( A_n'A_n - A'A \right) \left( f(\hat{\theta}_n) - f(\theta_0) \right) = \left( \mathrm{vec}(A_n'A_n - A'A) \right)' \left( I_m \otimes (h_0 - f(\theta_0)) \right) \left( f(\hat{\theta}_n) - f(\theta_0) \right)
\]
\[
= \begin{pmatrix} \hat{h}_n - h_0 \\ \mathrm{vec}(A_n'A_n - A'A) \end{pmatrix}' \begin{pmatrix} 0 \\ I_m \otimes (h_0 - f(\theta_0)) \end{pmatrix} \frac{\partial f(\tilde{\theta}_n)}{\partial \theta'} F_n^{-1} \frac{\partial f(\hat{\theta}_n)'}{\partial \theta} D^A_{f,n} \begin{pmatrix} \hat{h}_n - h_0 \\ \mathrm{vec}(A_n'A_n - A'A) \end{pmatrix}.
\]

Now, collecting the above expressions for $S^A_{1,n}$, $S^A_{2,n}$, and $S^A_{3,n}$, and establishing similar expansions for $\|A_n(\hat{h}_n - g(\hat{\gamma}_n))\|^2$, the result follows by Assumption 8. ■

Proof of Theorem 10. From (3.2), by adding and subtracting $h_0$, we obtain

\[
\|A(\hat{h}_n - f(\hat{\theta}_n))\|^2 = \|A(h_0 - f(\theta_0))\|^2 + 2 (h_0 - f(\theta_0))' A'A (\hat{h}_n - h_0) + O_p(n^{-1}), \qquad (3.7)
\]

with a similar expression for $\|A(\hat{h}_n - g(\hat{\gamma}_n))\|^2$. Hence,

\[
QLR_n(\hat{\theta}_n, \hat{\gamma}_n) = -\|A(h_0 - f(\theta_0))\|^2 + \|A(h_0 - g(\gamma_0))\|^2 + 2 (f(\theta_0) - g(\gamma_0))' A'A (\hat{h}_n - h_0) + O_p(n^{-1}). \qquad (3.8)
\]

Since $\mathcal{F} \cap \mathcal{G} = \emptyset$, we have that $f(\theta_0) \ne g(\gamma_0)$, and the result follows from Assumption 1(b). ■

Proof of Theorem 11. From (3.5) we have

\[
\|A_n(\hat{h}_n - f(\hat{\theta}_n))\|^2 - \|A(h_0 - f(\theta_0))\|^2 = (h_0 - f(\theta_0))' \left( A_n'A_n - A'A \right) (h_0 - f(\theta_0)) + 2 (h_0 - f(\theta_0))' A_n'A_n (\hat{h}_n - h_0) + O_p(n^{-1})
\]
\[
= \begin{pmatrix} 2 (h_0 - f(\theta_0))' A_n'A_n & (h_0 - f(\theta_0))' \left( I_m \otimes (h_0 - f(\theta_0))' \right) \end{pmatrix} \begin{pmatrix} \hat{h}_n - h_0 \\ \mathrm{vec}(A_n'A_n - A'A) \end{pmatrix} + O_p(n^{-1}).
\]

Using a similar expansion for $\|A_n(\hat{h}_n - g(\hat{\gamma}_n))\|^2$, the result follows by Assumption 8. ■

Proof of Theorem 12. From (3.3), $\|A_2(\hat{h}_{2,n} - f_2(\hat{\theta}_n(A_1)))\|^2$ can be expanded as

\[
\|A_2(h_{2,0} - f_2(\theta_0(A_1)))\|^2 + 2 (h_{2,0} - f_2(\theta_0(A_1)))' A_2'A_2 (\hat{h}_{2,n} - h_{2,0})
\]
\[
\quad - 2 (h_{2,0} - f_2(\theta_0(A_1)))' A_2'A_2 \frac{\partial f_2(\theta_0(A_1))}{\partial \theta'} \left( \hat{\theta}_n(A_1) - \theta_0(A_1) \right) + o_p(n^{-1/2})
\]
\[
= \|A_2(h_{2,0} - f_2(\theta_0(A_1)))\|^2 - 2 (h_{2,0} - f_2(\theta_0(A_1)))' A_2'A_2 \frac{\partial f_2(\theta_0(A_1))}{\partial \theta'} F_{1,0}^{-1} \frac{\partial f_1(\theta_0(A_1))'}{\partial \theta} A_1'A_1 (\hat{h}_{1,n} - h_{1,0})
\]
\[
\quad + 2 (h_{2,0} - f_2(\theta_0(A_1)))' A_2'A_2 (\hat{h}_{2,n} - h_{2,0}) + o_p(n^{-1/2})
\]
\[
= \|A_2(h_{2,0} - f_2(\theta_0(A_1)))\|^2 + 2 (h_{2,0} - f_2(\theta_0(A_1)))' A_2'A_2 J_{f,0} (\hat{h}_n - h_0) + o_p(n^{-1/2}).
\]
■

Proof of Theorem 13. First, note that in the case of nested models, for all $A \in \mathbb{A}$, $\|A(h_0 - g(\gamma_0(A)))\|^2 \ge \|A(h_0 - f(\theta_0(A)))\|^2$, and thus, under $H_0^a$, we have that for all $A \in \mathbb{A}$, $\|A(h_0 - g(\gamma_0(A)))\|^2 = \|A(h_0 - f(\theta_0(A)))\|^2$. We show next that under $H_0^a$, $n\,QLR_n(\hat{\theta}_n(A), \hat{\gamma}_n(A), A)$ converges weakly to a stochastic process indexed by $A$. According to Theorem 10.2 of Pollard (1990), for weak convergence one needs to show finite dimensional convergence and stochastic equicontinuity of $n\,QLR_n(\hat{\theta}_n(A), \hat{\gamma}_n(A), A)$ with respect to $A$. Finite dimensional convergence follows by the same arguments as in the proof of Theorem 8. For stochastic equicontinuity, from (3.2) one can show that

\[
n \left| QLR_n(\hat{\theta}_n(A_1), \hat{\gamma}_n(A_1), A_1) - QLR_n(\hat{\theta}_n(A_2), \hat{\gamma}_n(A_2), A_2) \right| \le n \|\hat{h}_n - h_0\|^2 K_n \|A_1 - A_2\|^{\delta} + o_p(1),
\]

where $\delta > 0$, $K_n = O_p(1)$ and is independent of $(A_1 - A_2)$, and the $o_p(1)$ term is uniform in $A$; this is because $W_{f,n}$ and $W_{g,n}$ are continuous in $A$, and the $o_p(n^{-1})$ term is uniform in $A$. Stochastic equicontinuity of $n\,QLR_n(\hat{\theta}_n(A), \hat{\gamma}_n(A), A)$ follows from Lemma 2(a) of Andrews (1992). The results of the theorem now follow from weak convergence by the continuous mapping theorem (CMT). ■

Proof of Theorem 14. Convergence of finite dimensional distributions and stochastic equicontinuity can be established from (3.8). The results of the theorem will follow by the CMT. ■

Bibliography

Andrews, Donald W.
K., “Generic Uniform Convergence,” Econometric Theory, 1992, 8 (2), 241–257.
Blanchard, Olivier Jean and Danny Quah, “The Dynamic Effects of Aggregate Demand and Supply Disturbances,” American Economic Review, September 1989, 79 (4), 655–73.
Canova, F., “Statistical Inference in Calibrated Models,” Journal of Applied Econometrics, 1994, 9, 123–144.
Canova, Fabio and Eva Ortega, “Testing Calibrated General Equilibrium Models,” 1996. Working Paper 166, Universitat Pompeu Fabra.
Christiano, Lawrence J., “Modeling the Liquidity Effect of a Money Shock,” Federal Reserve Bank of Minneapolis Quarterly Review, 1991, (Winter), 3–34.
Christiano, Lawrence J. and Martin Eichenbaum, “Liquidity Effects and the Monetary Transmission Mechanism,” American Economic Review, May 1992, 82 (2), 346–53.
Cogley, Timothy and James M. Nason, “Output Dynamics in Real-Business-Cycle Models,” American Economic Review, June 1995, 85 (3), 492–511.
Cooley, T. F. and E. C. Prescott, “Economic Growth and Business Cycles,” in T. F. Cooley, ed., Frontiers of Business Cycle Research, Princeton University Press, 1995, pp. 1–38.
Corradi, Valentina and Norman R. Swanson, “Evaluation of Dynamic Stochastic General Equilibrium Models Based on Distributional Comparison of Simulated and Historic Data,” Journal of Econometrics, 2007, 136 (2), 699–723.
Davidson, James, Stochastic Limit Theory, New York: Oxford University Press, 1994.
Davidson, R. and J. G. MacKinnon, “Several Tests for Model Specification in the Presence of Alternative Hypotheses,” Econometrica, 1981, 49 (3), 781–793.
Diebold, F. X. and R. S. Mariano, “Comparing Predictive Accuracy,” Journal of Business & Economic Statistics, 1995, 13 (3), 253–263.
Diebold, Francis X., Lee E. Ohanian, and Jeremy Berkowitz, “Dynamic Equilibrium Economies: A Framework for Comparing Models and Data,” Review of Economic Studies, July 1998, 65 (3), 433–51.
Dridi, R., A. Guay, and E. Renault, “Indirect Inference and Calibration of Dynamic Stochastic General Equilibrium Models,” Journal of Econometrics, 2007, 136 (2), 397–430.
Fuerst, Timothy S., “Liquidity, Loanable Funds, and Real Activity,” Journal of Monetary Economics, February 1992, 29 (1), 3–24.
Gallant, A. Ronald and Halbert White, A Unified Theory of Estimation and Inference for Nonlinear Dynamic Models, Oxford: Basil Blackwell, 1988.
Gouriéroux, Christian, Alain Monfort, and E. Renault, “Indirect Inference,” Journal of Applied Econometrics, 1993, 8, S85–S118.
Gregory, A. W. and G. W. Smith, “Calibration as Estimation,” Econometric Reviews, 1990, 9 (1), 57–89.
Gregory, A. W. and G. W. Smith, “Calibration as Testing: Inference in Simulated Macroeconomic Models,” Journal of Business & Economic Statistics, 1991, 9 (3), 297–303.
Gregory, Allan W. and Gregor W. Smith, “Statistical Aspects of Calibration in Macroeconomics,” in G. S. Maddala, C. R. Rao, and H. D. Vinod, eds., Handbook of Statistics, Vol. 11, Amsterdam: North-Holland, 1993, chapter 25, pp. 703–719.
Hall, A., A. Inoue, J. M. Nason, and B. Rossi, “Information Criteria For Impulse Response Function Matching Estimation of DSGE Models,” 2007. Federal Reserve Bank of Atlanta Working Paper 2007-10.
Hall, Alastair R. and Atsushi Inoue, “The Large Sample Behavior of the Generalized Method of Moments Estimator in Misspecified Models,” Journal of Econometrics, 2003, 114 (2), 361–394.
Hall, Alastair R. and Denis Pelletier, “Non-Nested Testing in Models Estimated via Generalized Method of Moments,” 2007. Working Paper 011, North Carolina State University.
Hnatkovska, Viktoria, Vadim Marmer, and Yao Tang, “Comparison of Misspecified Calibrated Models: The Minimum Distance Approach,” Working Papers, University of British Columbia, Oct 2008.
Kan, Raymond and Cesare Robotti, “Model Comparison Using the Hansen-Jagannathan Distance,” Review of Financial Studies, 2008, forthcoming.
Kim, K. and A. R. Pagan, “The Econometric Analysis of Calibrated Macroeconomic Models,” in M. Hashem Pesaran and Mike R. Wickens, eds., Handbook of Applied Econometrics: Macroeconomics, Cambridge, Massachusetts: Blackwell, 1995, chapter 7, pp. 356–390.
Kitamura, Yuichi, “Comparing Misspecified Dynamic Econometric Models Using Nonparametric Likelihood,” 2000. Working Paper, University of Pennsylvania.
Lucas, Robert Jr., “Liquidity and Interest Rates,” Journal of Economic Theory, April 1990, 50 (2), 237–264.
Maasoumi, E. and P. C. B. Phillips, “On the Behavior of Inconsistent Instrumental Variables Estimators,” Journal of Econometrics, 1982, 19, 183–201.
MacKinnon, J. G., “Model Specification Tests Against Non-Nested Alternatives,” Econometric Reviews, 1983, 2 (1), 85–110.
Moon, Hyungsik Roger and Frank Schorfheide, “Minimum Distance Estimation of Nonstationary Time Series Models,” Econometric Theory, 2002, 18, 1385–1407.
Nason, J. M. and T. Cogley, “Testing the Implications of Long-Run Neutrality for Monetary Business Cycle Models,” Journal of Applied Econometrics, 1994, 9 (S), S37–70.
Newey, Whitney K. and Daniel L. McFadden, “Large Sample Estimation and Hypothesis Testing,” in Robert F. Engle and Daniel L. McFadden, eds., Handbook of Econometrics, Vol. IV, Amsterdam: Elsevier, 1994, chapter 36, pp. 2111–2245.
Phillips, Peter C. B. and Victor Solo, “Asymptotics For Linear Processes,” Annals of Statistics, 1992, 20, 971–1001.
Pollard, David, Empirical Processes: Theory and Applications, Vol. 2 of NSF-CBMS Regional Conference Series in Probability and Statistics, Hayward, California: Institute of Mathematical Statistics, 1990.
Prescott, E. C., “Real Business Cycle Theory: What Have We Learned?,” Revista de Análisis Económico, 1991, 6 (2), 3–19.
Rivers, D. and Q. Vuong, “Model Selection Tests For Nonlinear Dynamic Models,” Econometrics Journal, 2002, 5 (1), 1–39.
Schorfheide, Frank, “Loss Function-Based Evaluation of DSGE Models,” Journal of Applied Econometrics, 2000, 15, 645–670.
Smith, R. J., “Non-Nested Tests for Competing Models Estimated by Generalized Method of Moments,” Econometrica, 1992, 60 (4), 973–980.
Vuong, Quang H., “Likelihood Ratio Tests For Model Selection and Non-Nested Hypotheses,” Econometrica, 1989, 57 (2), 307–333.
Watson, Mark W., “Measures of Fit for Calibrated Models,” Journal of Political Economy, December 1993, 101 (6), 1011–1041.
West, Kenneth D., “Asymptotic Inference About Predictive Ability,” Econometrica, 1996, 64, 1067–1084.
West, Kenneth D. and Michael W. McCracken, “Regression-Based Tests of Predictive Ability,” International Economic Review, 1998, 39, 817–840.
White, Halbert, “A Reality Check For Data Snooping,” Econometrica, 2000, 68 (5), 1097–1126.
White, Halbert, Asymptotic Theory For Econometricians, San Diego: Academic Press, 2001.

Chapter 4

An Exploration of the Role of Net Worth in Business Cycles$^{33}$

$^{33}$ A version of this chapter will be submitted for publication. Tang, Yao, “An Exploration of the Role of Net Worth in Business Cycles”.

4.1 Introduction

The net worth of a firm, defined as the difference between its assets and liabilities, plays an important role in the firm's effort to secure external finance in financial markets characterized by asymmetric information about the return to capital. A firm with higher net worth can provide more internal finance for a project of a given size. Because such a firm has a stronger interest in the project, it can be subject to a lower risk premium, which is the excess return over the return to risk-free assets requested by the creditor. Alternatively, a firm with higher net worth can post more collateral and apply for a bigger loan. Therefore it is plausible that net worth is closely related to investment activity and the risk premium. The relationships between net worth, investment, and risk premium can manifest themselves at an aggregate level.
In the aggregate economy, when total factor productivity (TFP) growth is above its trend level, the return to capital is higher, and so is the value of net assets. The increase in the net worth of the corporate sector can stimulate a higher level of investment and further expansion of aggregate output in the future. This chapter documents the cyclical properties of the net worth of non-financial sectors and examines whether leading dynamic stochastic general equilibrium (DSGE) models that incorporate the role of net worth can replicate these features quantitatively.

Since it is difficult to provide an appropriate survey of the large literature on net worth and financial frictions, I elect to mention three papers that examine the role of net worth in business cycles: Kiyotaki and Moore (1997), Carlstrom and Fuerst (1997), and Bernanke, Gertler and Gilchrist (1999). Kiyotaki and Moore (1997) explore a partial equilibrium model with limited enforcement of debt contracts. In their setup, a firm must post fixed assets as collateral, and the loan it obtains in equilibrium may not exceed the value of the collateral. Positive TFP shocks are shown to generate business-cycle-like fluctuations. Carlstrom and Fuerst (1997) and Bernanke et al. (1999) embed asymmetric information about entrepreneurs' returns into a real business cycle model and a new-Keynesian model, respectively, and evaluate the quantitative performance of the models. In equilibrium, the amount of loans obtained by entrepreneurs is increasing in their net worth. Carlstrom and Fuerst (1997) find that the financial friction helps to generate hump-shaped output impulse responses and autocorrelation in output growth, as observed in the data. In Bernanke et al. (1999), a similar financial friction strengthens the propagation of TFP and monetary shocks, and this amplification effect is termed the “financial accelerator”.

While net worth plays a critical role in Carlstrom and Fuerst (1997) and Bernanke et al. (1999), neither paper studies whether such models can match the cyclical properties of aggregate net worth. Rather, they examine other quantitative aspects of DSGE models, such as whether models with financial frictions can propagate productivity and monetary shocks. Relative to these papers, this chapter contributes to the literature on financial frictions in the aggregate economy by employing the econometric technique developed in Chapter 3 to examine explicitly whether the models of Carlstrom and Fuerst (1997) and Bernanke et al. (1999) can capture the cyclical properties of net worth.

In the chapter, I document that the cyclical properties of net worth have three main features: 1) net worth is procyclical, 2) the volatility of net worth is higher in the later part of the sample, starting in the first quarter of 1983, and 3) the co-movement between net worth and macroeconomic variables is strong in the later part of the sample. Applications of the econometric procedures suggest that there is no significant difference in the performance of the two models in matching the cyclical properties of net worth quantitatively, and that price rigidity plays an important role in Bernanke et al. (1999). However, both models only partially capture the positive correlation between net worth and risk premium.

The next section describes the cyclical properties of net worth. Overviews of the models used in Carlstrom and Fuerst (1997) and Bernanke et al. (1999) are provided in Section 4.3. In Section 4.4, I carry out the econometric test from Chapter 3 to compare the two models formally. Section 4.5 concludes.

4.2 Cyclical Properties of Net Worth

In this section, I document the cross-correlations between net worth and GDP, consumption, hours worked, investment, the T-Bill interest rate, the prime interest rate, and the risk premium in quarterly U.S. data since the first quarter of 1952.
Except for the interest rates, which are simply HP-filtered series, all series are real per capita variables in logs and HP-filtered.

In this chapter, I focus on the net worth of non-financial sectors$^{34}$ as it corresponds best to the net worth in the models of Carlstrom and Fuerst (1997) and Bernanke et al. (1999). There are two measures of net worth. The first is net worth at market value, which is the difference between total assets at market value and total financial liabilities. The second is net worth at historical (purchase) cost, which is the difference between total assets at historical value and total financial liabilities. Among the total assets at market value, on average about 43% are financial assets, with the rest being mostly equipment, software, and real estate. Almost all liabilities are financial liabilities. To provide rough measures of the relative prices of net assets, I compute the ratio between total asset worth at market value and total asset worth at historical cost, and the ratio between net worth at market value and net worth at historical cost.

$^{34}$ Bernanke and Gertler (1995) discuss in general how the balance sheets of banks and non-financial companies affect the transmission of monetary policy. Chen (2001) and Meh and Moran (2008) are examples of models that examine the role of the net worth of banks.

The investment series used in the chapter is non-residential investment net of investment in inventory. The T-Bill interest rate is calculated from the 3-month treasury bill secondary market rate, and the prime interest rate is the average prime rate charged by banks on short-term loans to business. Risk premium is defined as the difference between the two interest rates. The series on total asset values and total liabilities are extracted from the Federal Reserve Bank's Flow of Funds Accounts database. Other variables are from the Basic Economics database produced by DRI/McGraw-Hill.

Figure 4.1: Net Worth and GDP, Consumption, Hours Worked and Investment (four panels plotting HP-filtered net worth at market value against GDP, consumption, hours worked, and investment, 1950 to 2005)

In Figure 4.1, the HP-filtered series of net worth at market value is plotted against those of GDP, consumption, hours worked and investment. An “eyeball” examination suggests that net worth is procyclical. In addition, its volatility increased substantially in the latter part of the sample, getting close to the volatility of investment, the most volatile component of GDP. In particular, the net worth at market value series has tracked the investment series quite closely since the late 1980s.

Figure 4.2: Net Worth and Investment in Greater Details (four panels plotting investment against net worth at market value, net worth at historical cost, the ratio of assets at market value to assets at cost, and the ratio of net worth at market value to net worth at cost, 1950 to 2005)

The change in net worth at market value, as plotted in the upper left graph in Figure 4.2, can be roughly decomposed into two parts: the change in the quantities of net assets and the change in the price of net assets. Since the measure of net worth at historical cost plotted in the upper right graph is net of changes in prices, we can see that there has been procyclical change in the quantities of net assets.
Meanwhile, if we look at the bottom two graphs, which plot two rough measures of asset prices, we can see that the prices are also procyclical. As in Figure 4.1, we can also see higher volatility of the quantities and prices of net assets in the later part of the sample.

Figure 4.3: Net Worth and Interest Rates (four panels plotting HP-filtered net worth at market value against investment, the T-Bill rate, the prime rate, and the risk premium, 1950 to 2005)

Judging from Figure 4.3, we can argue that the synchronizations between net worth and the T-Bill rate, and between net worth and the prime rate, are more salient. Overall, visual inspection of the three figures above suggests that 1) net worth is procyclical, 2) the volatility of net worth is higher in the later part of the sample, and 3) the co-movement between net worth and macroeconomic variables is strong in the later part of the sample. The next two tables provide a more formal description of these features.

In tabulating the cross-correlations between net worth and other variables, I split the sample at the end of the 1981/1982 recession, i.e. the last quarter of 1982, to see if the correlations differ in the two samples. Although the choice seems ad hoc, splitting the sample at the end of the 1973/1975 recession or the end of the 1990/1991 recession produces similar results. Notably, the standard deviation of net worth relative to GDP increased from 0.63 in the first period to 2.96 in the second.

Table 4.1: Net Worth, GDP, Consumption, and Investment

                                 Corr(NW, X)
X              σ(X)/σ(GDP)   X(t-4)   X(t-2)   X(t-1)   X(t)     X(t+1)   X(t+2)   X(t+4)
1983Q1-2008Q4
NW (market)    2.96          0.45*    0.80*    0.92*    1        0.92*    0.80*    0.45*
GDP            1             0.53*    0.50*    0.47*    0.51*    0.43*    0.32*    0.03
Investment     6.11          0.46*    0.45*    0.41*    0.39*    0.31*    0.22*    -0.02
1952Q1-1982Q4
NW (market)    0.63          0.23*    0.58*    0.76*    1        0.76*    0.58*    0.23*
GDP            1             0.50*    0.48*    0.35*    0.17     -0.04    -0.21*   -0.39*
Investment     4.90          0.48*    0.53*    0.44*    0.31*    0.05     -0.15    -0.39*

Note: * denotes statistical significance at 5% level.

From Table 4.1, we can see that the autocorrelation of net worth is higher in the later sample. More importantly, the positive correlations between net worth and GDP and investment are stronger. In particular, net worth only has positive and statistically significant correlations with leads of GDP and investment after the last quarter of 1982. This feature of the latter part of the sample is more consistent with the premise of Carlstrom and Fuerst (1997) and Bernanke et al. (1999) that net worth helps to propagate the expansion of output.

Table 4.2: Net Worth and Investment in Greater Details

                             Corr(NW, X)
X              σ(X)      X(t-4)   X(t-2)   X(t-1)   X(t)     X(t+1)   X(t+2)   X(t+4)
1983Q1-2008Q4
T-Bill rate    0.011     0.22*    0.54*    0.68*    0.68*    0.61*    0.52*    0.29*
Prime rate     0.012     0.18     0.53*    0.69*    0.70*    0.64*    0.56*    0.33*
Risk premium   0.003     -0.07    0.07     0.20*    0.26*    0.29*    0.30*    0.25*
1952Q1-1982Q4
T-Bill rate    0.017     0.01     -0.05    -0.06    -0.13    -0.02    -0.00    0.12
Prime rate     0.018     -0.09    -0.01    -0.01    0.12     0.10     0.17     0.26
Risk premium   0.007     -0.10    0.10     0.23*    0.27*    0.33*    0.34*    0.28*

Note: * denotes statistical significance at 5% level.
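The entries of the two tables are correlations between net worth at date $t$ and another variable at date $t+k$, together with relative standard deviations. The sketch below shows one way such statistics could be computed from two already filtered series; it is an illustration under assumptions, not the code behind the tables. The series here are synthetic, and the function names are hypothetical.

```python
import math


def corr(a, b):
    """Sample correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def cross_corr(nw, x, k):
    """Corr(nw_t, x_{t+k}): k > 0 uses leads of x, k < 0 uses lags of x."""
    if k >= 0:
        return corr(nw[:len(nw) - k], x[k:])
    return corr(nw[-k:], x[:len(x) + k])


def relative_std(x, reference):
    """Standard deviation of x relative to that of a reference series."""
    def std(s):
        m = sum(s) / len(s)
        return math.sqrt(sum((v - m) ** 2 for v in s) / len(s))
    return std(x) / std(reference)


# Synthetic example: net worth moves two periods ahead of x.
base = [math.sin(0.3 * t) for t in range(62)]
x = base[:60]
nw = base[2:62]
for k in (-4, -2, -1, 0, 1, 2, 4):
    print(k, round(cross_corr(nw, x, k), 2))
```

In this synthetic example the correlation peaks at $k = 2$, the pattern one would expect when net worth leads the other variable.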
Judging from Table 4.2, there were no significant correlations between net worth and either the T-Bill rate or the prime rate before the last quarter of 1982. Afterward, positive and statistically significant correlations emerged. Meanwhile, the correlations between net worth and risk premium are stable over the two subsamples.$^{35}$

$^{35}$ The measure of risk premium presented in Table 4.2 follows that of Carlstrom and Fuerst (1997) and Bernanke et al. (1999). I also explore an alternative measure of risk premium defined as the difference between Moody's yield on AAA seasoned corporate bonds of all industries and BAA seasoned corporate bonds of all industries. When this measure is used, net worth is negatively related to lags of risk premium and its correlations with future risk premiums are positive. With this measure, I obtain model comparison results similar to those in Section 4.4.

The change in volatility and cyclical properties after 1982 is certainly intriguing, but exploring the cause would require a separate study. In the rest of the chapter I will take the cyclical properties after 1982 as given and examine whether DSGE models can quantitatively replicate these features.

4.3 Overviews of Two Competing Models

In this section, I reproduce the setups of the two models to facilitate comparison and discussion in Section 4.4. In essence, Carlstrom and Fuerst (1997) add to the benchmark RBC model a capital production sector where capital goods producers face stochastic returns and have to rely partially on external finance. In comparison, Bernanke et al. (1999) is based on a new-Keynesian model, and the financial friction occurs in the production of consumption goods.

4.3.1 The Carlstrom and Fuerst (1997) Model

There is a continuum of agents of measure one, among which there are households of measure $1 - \eta$ and entrepreneurs of measure $\eta$. There are two goods, consumption goods and capital goods.
A typical household supplies labour $l_t$ to consumption good firms and makes investment decisions $i_t$ to maximize life-time utility
\[ E_0 \sum_{t=0}^{\infty} \beta^t U(c_t, l_t) = E_0 \sum_{t=0}^{\infty} \beta^t \left[ \ln(c_t) + \nu(1 - l_t) \right] \]
subject to
\[ c_t + i_t \le q_t k_t^h (1-\delta) + r_t k_t^h + w_t l_t \]
where $q_t$ is the end of period price of capital, $k_t^h$ is the beginning of period capital stock held by a household,$^{36}$ $\delta$ is the depreciation rate of capital, $r_t$ is the rental rate of capital, and $w_t$ is the wage rate. Note the households are also owners of consumption good firms, which make zero profit in equilibrium since they have constant returns to scale technologies and face perfect competition.

$^{36}$ For the quantity variables, upper cases denote the aggregates. For instance, $K_t^h$ denotes the aggregate capital stock held by all households.

Consumption goods producers are standard. They hire labour and rent capital from households and entrepreneurs. The production technology is
\[ y_t = A_t k_t^{\alpha_1} l_t^{\alpha_2} (l_t^e)^{1-\alpha_1-\alpha_2} \]
where $A_t$ is the stochastic TFP, which follows an AR(1) process with autocorrelation coefficient $\rho_A$.

The crucial difference of the model from RBC lies in the production of capital goods. In a benchmark RBC model, consumption goods can be converted one-to-one into investment goods, while in the current model, households must purchase capital goods from entrepreneurs via financial intermediaries. In the model, a continuum of financial intermediaries receive the savings from households and make loans to entrepreneurs, who are the producers of capital goods. The entrepreneurs have access to a stochastic technology that transforms $i_t^e$ units of consumption goods into $\omega i_t^e$ units of capital goods, where $\omega$ is a random variable with pdf $\phi(\omega)$ and cdf $\Phi(\omega)$. The entrepreneurs also supply labour inelastically, $l_t^e = 1$, to consumption good producers so they can have positive net wealth $n_t$.
However, to finance a project of size $i_t^e$, they borrow $i_t^e - n_t$ from the financial intermediaries. Entrepreneurs can observe the realization of $\omega$ for free, but financial intermediaries can only observe it by paying a fee which is proportional to the size of the investment.$^{37}$ Contracts last for one period and both sides are anonymous, therefore there is no reputation building from repeated games. In this circumstance, a debt contract is optimal.$^{38}$ The contract specifies an interest rate $r_t^k$ on the loans. When the realization of $\omega$ is above some threshold $\bar{\omega}$, the entrepreneurs will repay the principal and interest; otherwise, they will default. In case of default, the financial intermediaries will pay the verification fee $\mu i_t^e$, which can be interpreted as bankruptcy cost, and capture the residual value of the investment $\omega i_t^e$. The threshold is given by
\[ \bar{\omega} = \frac{(1 + r_t^k)(i_t^e - n_t)}{i_t^e} \qquad (4.1) \]
since if $\omega$ is lower than this level, even using the whole return on the project to repay the financial intermediary will still fail to meet the specified interest rate, i.e.
\[ \frac{\omega i_t^e}{i_t^e - n_t} < 1 + r_t^k \]

$^{37}$ This is the costly state verification assumption introduced by Townsend (1979).
$^{38}$ See Gale and Hellwig (1985) and Williamson (1987).

Hence, the expected income to an entrepreneur is
\[ q_t \left[ \int_{\bar{\omega}}^{\infty} \omega i_t^e \, d\Phi(\omega) - (1 - \Phi(\bar{\omega}))(1 + r_t^k)(i_t^e - n_t) \right] \]
which by using (4.1) is rewritten as
\[ q_t i_t^e \left( \int_{\bar{\omega}}^{\infty} \omega \, d\Phi(\omega) - (1 - \Phi(\bar{\omega}))\bar{\omega} \right) = q_t i_t^e f(\bar{\omega}) \qquad (4.2) \]
The expected income to a creditor is
\[ q_t \left[ \int_0^{\bar{\omega}} \omega i_t^e \, d\Phi(\omega) - \Phi(\bar{\omega}) \mu i_t^e + (1 - \Phi(\bar{\omega}))(1 + r_t^k)(i_t^e - n_t) \right] \]
which by using (4.1) is rewritten as
\[ q_t i_t^e \left( \int_0^{\bar{\omega}} \omega \, d\Phi(\omega) - \Phi(\bar{\omega})\mu + (1 - \Phi(\bar{\omega}))\bar{\omega} \right) = q_t i_t^e g(\bar{\omega}) \qquad (4.3) \]
The optimal contract maximizes the entrepreneur's income$^{39}$
\[ \max_{i_t^e, \bar{\omega}} \; q_t \left[ \int_{\bar{\omega}}^{\infty} \omega i_t^e \, d\Phi(\omega) - (1 - \Phi(\bar{\omega}))(1 + r_t^k)(i_t^e - n_t) \right] \]
subject to the participation constraint of the creditor
\[ q_t \left[ \int_0^{\bar{\omega}} \omega i_t^e \, d\Phi(\omega) - \Phi(\bar{\omega}) \mu i_t^e + (1 - \Phi(\bar{\omega}))(1 + r_t^k)(i_t^e - n_t) \right] \ge i_t^e - n_t \]
where the alternative gross return to the creditor's funds is 1. The solution is characterized by the first order conditions:
\[ q_t \left\{ (1 - \Phi(\bar{\omega})\mu) + \phi(\bar{\omega})\mu \, \frac{\int_{\bar{\omega}}^{\infty} \omega \, d\Phi(\omega) - (1 - \Phi(\bar{\omega}))\bar{\omega}}{\Phi(\bar{\omega}) - 1} \right\} = 1 \qquad (4.4) \]
\[ q_t \left[ \int_0^{\bar{\omega}} \omega i_t^e \, d\Phi(\omega) - \Phi(\bar{\omega}) \mu i_t^e + (1 - \Phi(\bar{\omega}))(1 + r_t^k)(i_t^e - n_t) \right] \ge i_t^e - n_t \qquad (4.5) \]
The last two equations define the entrepreneur's equilibrium project size $i_t^e$ as a function of capital price $q_t$ and net worth $n_t$. Carlstrom and Fuerst (1997) show that $i_t^e$ is increasing in both $q_t$ and $n_t$.

$^{39}$ This formulation assumes the entrepreneur captures the whole surplus generated by the contract.

The entrepreneurs are risk neutral and maximize
\[ E_0 \sum_{t=0}^{\infty} (\beta\gamma)^t c_t^e \]
$\gamma$ is assumed to be less than 1, so entrepreneurs are impatient and will not accumulate a level of net worth large enough to self-finance their investment projects. The entrepreneur's net worth is
\[ n_t = w_t l_t^e + k_t^e \left[ q_t(1-\delta) + r_t \right] \]
where $k_t^e$ is the holding of capital by entrepreneurs at the beginning of the period. As the return to net worth (see equation (4.6) below) is higher than 1, and entrepreneurs are risk neutral, they will invest all the net worth in the project. At the end of the period, entrepreneurs who stay solvent make their consumption and savings decision subject to the budget constraint
\[ c_t^e + q_t k_{t+1}^e \le q_t \left[ i_t^e \omega - (1 + r_t^k)(i_t^e - n_t) \right] \qquad (4.6) \]
where $i_t^e \omega - (1 + r_t^k)(i_t^e - n_t)$ is the realized return on the project of size $i_t^e$.
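The shares $f(\bar{\omega})$ and $g(\bar{\omega})$ in (4.2) and (4.3) can be evaluated in closed form once a distribution for $\omega$ is chosen. The sketch below assumes, purely for illustration, that $\ln\omega$ is normal with mean $-s^2/2$ and variance $s^2$, so that $E[\omega] = 1$; the dispersion value $s = 0.2$ and all function names are hypothetical, while $\mu = 0.12$ is the bankruptcy cost reported in Table 4.4. Under these assumptions the two shares sum to $1 - \mu\Phi(\bar{\omega})$: the only output lost to the contract is the expected verification cost.

```python
import math

def norm_cdf(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def omega_cdf(x, s):
    """Phi(x) when ln(omega) ~ N(-s^2/2, s^2), an assumed distribution."""
    return norm_cdf((math.log(x) + 0.5 * s * s) / s)

def partial_mean(x, s):
    """Integral of omega dPhi(omega) over [0, x] for the same lognormal."""
    return norm_cdf((math.log(x) - 0.5 * s * s) / s)

def threshold(r_k, i_e, n):
    """Default threshold from equation (4.1)."""
    return (1.0 + r_k) * (i_e - n) / i_e

def f_share(wbar, s):
    """Entrepreneur's expected share of project output, equation (4.2)."""
    upper_mean = 1.0 - partial_mean(wbar, s)  # uses E[omega] = 1
    return upper_mean - (1.0 - omega_cdf(wbar, s)) * wbar

def g_share(wbar, s, mu):
    """Creditor's expected share of project output, equation (4.3)."""
    p_default = omega_cdf(wbar, s)
    return partial_mean(wbar, s) - p_default * mu + (1.0 - p_default) * wbar

# Hypothetical numbers: a project of size 1 financed with net worth 0.4
wbar = threshold(r_k=0.02, i_e=1.0, n=0.4)
s, mu = 0.2, 0.12
print(wbar, omega_cdf(wbar, s), f_share(wbar, s), g_share(wbar, s, mu))
```

Because $f'(\bar{\omega}) = \Phi(\bar{\omega}) - 1 < 0$, a higher threshold always lowers the entrepreneur's share, which is the term appearing in the denominator of (4.4).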
Since the expected return on capital in period $t+1$ is
\[ \left[ q_{t+1}(1-\delta) + r_{t+1} \right] \left[ \frac{q_{t+1} f(\bar{\omega}_{t+1})}{1 - q_{t+1} g(\bar{\omega}_{t+1})} \right], \]
where the fraction in the expression is the expected return to net worth invested in the time $t+1$ project derived from equation (4.5), the Euler equation that governs the consumption and savings decision for the entrepreneur is
\[ q_t = E_t \, \beta\gamma \left[ q_{t+1}(1-\delta) + r_{t+1} \right] \left[ \frac{q_{t+1} f(\bar{\omega}_{t+1})}{1 - q_{t+1} g(\bar{\omega}_{t+1})} \right] \]
Note that net worth does not show up in the last equation, indicating that the decision rules of entrepreneurs are independent of their net worth. The aggregation of entrepreneurs' budget constraints yields the equation describing the evolution of $K_t^e$, the capital stock held by entrepreneurs:
\[ K_{t+1}^e = \left( \eta w_t l_t^e + K_t^e \left[ q_t(1-\delta) + r_t \right] \right) \left[ \frac{q_{t+1} f(\bar{\omega}_{t+1})}{1 - q_{t+1} g(\bar{\omega}_{t+1})} \right] - \eta c_t^e / q_t = \eta n_t \left[ \frac{q_{t+1} f(\bar{\omega}_{t+1})}{1 - q_{t+1} g(\bar{\omega}_{t+1})} \right] - \eta c_t^e / q_t \]
The timing of events in a period is

1. The TFP shock $A_t$ is realized.
2. Consumption goods firms hire labour and capital for production.
3. Households make consumption and investment decisions.
4. The financial intermediaries use investment funds from households to make loans to entrepreneurs.
5. Entrepreneurs produce capital from consumption goods with the stochastic technology.
6. Entrepreneur-specific technology shock $\omega$ is realized. Entrepreneurs decide whether to repay the loan or default.
7. Solvent entrepreneurs decide on consumption and capital holding.

Given the state variables $(K_t, K_t^e, A_t)$,$^{40}$ a recursive competitive equilibrium is characterized by decision rules for $K_{t+1}$, $K_{t+1}^e$, $L_t$, $q_t$, $n_t$, $i_t^e$, $\bar{\omega}_t$, $c_t^e$ and $c_t$.

$^{40}$ Note $K_t = K_t^h + K_t^e$.

They are implicitly defined by
\[ U_L(t) = U_C(t) A_t \alpha_2 K_t^{\alpha_1} L_t^{\alpha_2 - 1} (L_t^e)^{1-\alpha_1-\alpha_2} \]
\[ q_t U_C(t) = \beta E_t U_C(t+1) \left[ q_{t+1}(1-\delta) + A_{t+1} \alpha_1 K_{t+1}^{\alpha_1 - 1} L_{t+1}^{\alpha_2} (L_{t+1}^e)^{1-\alpha_1-\alpha_2} \right] \]
\[ K_{t+1} = (1-\delta)K_t + \eta i_t^e \left[ 1 - \Phi(\bar{\omega}_t)\mu \right] \]
\[ Y_t = (1-\eta)c_t + \eta c_t^e + \eta i_t^e \]
\[ q_t = \left[ 1 - \Phi(\bar{\omega}_t)\mu + \phi(\bar{\omega}_t)\mu \frac{f(\bar{\omega}_t)}{f'(\bar{\omega}_t)} \right]^{-1} \]
\[ i_t^e = \frac{n_t}{1 - q_t g(\bar{\omega}_t)} \]
\[ n_t = A_t (1-\alpha_1-\alpha_2) K_t^{\alpha_1} L_t^{\alpha_2} (L_t^e)^{-\alpha_1-\alpha_2} + \frac{K_t^e}{\eta} \left[ q_t(1-\delta) + A_t \alpha_1 K_t^{\alpha_1 - 1} L_t^{\alpha_2} (L_t^e)^{1-\alpha_1-\alpha_2} \right] \]
\[ K_{t+1}^e = \eta n_t \left[ \frac{q_{t+1} f(\bar{\omega}_{t+1})}{1 - q_{t+1} g(\bar{\omega}_{t+1})} \right] - \eta c_t^e / q_t \]
\[ q_t = E_t \, \beta\gamma \left[ q_{t+1}(1-\delta) + A_{t+1} \alpha_1 K_{t+1}^{\alpha_1 - 1} L_{t+1}^{\alpha_2} (L_{t+1}^e)^{1-\alpha_1-\alpha_2} \right] \left[ \frac{q_{t+1} f(\bar{\omega}_{t+1})}{1 - q_{t+1} g(\bar{\omega}_{t+1})} \right] \qquad (4.7) \]
In addition, markets clear for two types of labour, consumption goods, and capital goods:
\[ L_t = (1-\eta) l_t \]
\[ L_t^e = \eta \]
\[ Y_t = (1-\eta)c_t + \eta c_t^e + \eta i_t^e \]
\[ K_{t+1} = (1-\delta)K_t + \eta i_t^e \left[ 1 - \Phi(\bar{\omega}_t)\mu \right] \]
The model is solved by linearizing the system of equations (4.7) near its steady state.

4.3.2 The Bernanke, Gertler and Gilchrist (1999) Model

To facilitate comparison, I modify the model of Bernanke et al. (1999) slightly and present it in a fashion similar to subsection 4.3.1. There is a continuum of agents of measure one, among which there are households of measure $1-\eta$ and entrepreneurs of measure $\eta$. There are three types of goods: wholesale goods, differentiated goods and final goods. The representative household supplies labour $L_t$ to wholesale good producers and makes decisions on consumption $C_t$, deposit $D_{t+1}$ and real balance $M_t/P_t$ to maximize life-time utility
\[ E_0 \sum_{t=0}^{\infty} \beta^t \left[ \ln(C_t) + \zeta \ln\left(\frac{M_t}{P_t}\right) + \nu \ln(1 - L_t) \right] \]
subject to the budget constraint
\[ C_t \le W_t L_t - T_t + \Pi_t + R_t D_t - D_{t+1} + \frac{M_{t-1} - M_t}{P_t} \]
where $T_t$ is lump sum taxes, $\Pi_t$ is dividends from ownership of retail firms, and $R_t$ is the return on deposit. The solution is characterized by the standard first order conditions
\[ \frac{1}{C_t} = E_t \, \beta R_{t+1} \frac{1}{C_{t+1}} \]
\[ \frac{1}{C_t} W_t = \nu \frac{1}{1 - L_t} \]
\[ \frac{M_t}{P_t} = \zeta C_t \left( \frac{R_{t+1}^n - 1}{R_{t+1}^n} \right)^{-1} \]
where $R_{t+1}^n$ is the gross nominal interest rate.
A continuum of financial intermediaries receive the deposits from households and make loans to entrepreneurs. Entrepreneur $j$ uses a stochastic technology to produce homogeneous wholesale goods
\[ Y_t^j = \omega^j A_t (K_t^j)^{\alpha_1} (L_t^j)^{\alpha_2} (L_t^{ej})^{1-\alpha_1-\alpha_2} \qquad (4.8) \]
where $A_t$ is the stochastic TFP, which follows an AR(1) process
\[ A_t = \rho_A A_{t-1} + \epsilon_t^A \qquad (4.9) \]
Since in equilibrium all entrepreneurs will choose the same capital-labour ratio, we can write the return to capital for entrepreneur $j$ as
\[ \omega^j \alpha_1 A_t (K_t^j)^{\alpha_1 - 1} (L_t^j)^{\alpha_2} (L_t^{ej})^{1-\alpha_1-\alpha_2} = \omega^j \alpha_1 A_t (K_t)^{\alpha_1 - 1} (L_t)^{\alpha_2} (L_t^e)^{1-\alpha_1-\alpha_2} = \omega^j R_t^k \]
where $R_t^k$ is the aggregate return to capital. The price of the wholesale good relative to the final good is denoted as $1/X_t$.

The entrepreneurs borrow from the financial intermediaries to finance the purchase of capital goods made from the final good. With the purchased capital, an entrepreneur hires labour from both households and entrepreneurs to produce wholesale goods. Entrepreneurs also supply labour $L_t^e$ on the market so they can have positive net wealth. Again, entrepreneur $j$ can observe the realization of $\omega^j$ for free, but financial intermediaries can only observe it by paying a fee which is proportional to the gross return to capital, $\omega^j R_t^k q_t K_t^j$. The optimal contract is very similar to that of subsection 4.3.1, and so is the entrepreneurs' maximization problem.$^{41}$ It can be shown that the desired capital level is given by
\[ q_t K_{t+1}^j = \iota\left( E_t \frac{R_{t+1}^k}{R_{t+1}} \right) N_{t+1}^j \qquad (4.10) \]
where $\iota(\cdot)$ is an increasing function with $\iota(1) = 1$, $q_t$ is the price of capital, and $N_{t+1}^j$ is the net worth of entrepreneur $j$.

$^{41}$ In the original setup of Bernanke et al. (1999), there are constant birth and death of entrepreneurs, with the survival probability in the next period being $\gamma$. The entrepreneurs maximize their wealth level and consume their wealth in the period of death. Both setups yield the same aggregate behaviour of the entrepreneurs.
Note the equation implies the aggregate capital evolution is linear in aggregate net worth and independent of the distribution of net worth. The functional form of $\iota(\cdot)$ depends on the distribution function of $\omega$. The threshold value for return, $\bar{\omega}_t^j$, is determined by
\[ R_{t+1} = \left\{ [1 - \Phi(\bar{\omega}_t^j)] \bar{\omega}_t^j - (1 - \mu) \int_0^{\bar{\omega}_t^j} \omega \, d\Phi(\omega) \right\} \frac{R_{t+1}^k q_t K_{t+1}^j}{q_t K_{t+1}^j - N_{t+1}^j} \qquad (4.11) \]
Since (4.10) implies $K_{t+1}^j$ is proportional to $N_{t+1}^j$, the fraction on the right hand side of (4.11) is a constant. Therefore the threshold value is determined only by the aggregate variables $R_{t+1}$ and $q_t$, i.e. all entrepreneurs will choose the same $\bar{\omega}_t^j$.

The entrepreneurs sell the wholesale goods to retailers, who differentiate the wholesale goods effortlessly. The differentiated goods are aggregated into a final good by a CES aggregator. The final good can be used as either consumption or capital. Let $Y_t(z)$ denote the amount of output sold by retailer $z$ in terms of wholesale goods; then the total final good $Y_t^f$ is given by
\[ Y_t^f = \left[ \int_0^1 Y_t(z)^{(\epsilon-1)/\epsilon} \, dz \right]^{\epsilon/(\epsilon-1)} \qquad (4.12) \]
with its nominal price defined as
\[ P_t = \left[ \int_0^1 P_t(z)^{1-\epsilon} \, dz \right]^{1/(1-\epsilon)} \qquad (4.13) \]
Given the relation between wholesale goods and the final good specified in (4.12), the demand for goods of retailer $z$ is
\[ Y_t(z) = \left( \frac{P_t(z)}{P_t} \right)^{-\epsilon} Y_t^f \]
The retailers take the demand and the price of wholesale goods $P_t^w$ as given and set $P_t(z)$ to maximize profit. The retailers engage in monopolistic competition and adjust nominal prices in a Calvo type price setting environment. In each period, a retailer can adjust its price with probability $1-\theta$. Let $P_t^*$ and $Y_t^*(z)$ be the optimal price and quantity for a retailer who can reset its price. The retailer $z$ maximizes expected profit
\[ \sum_{t=0}^{\infty} \theta^t E_0 \left[ \frac{\beta^t C_0}{C_t} \, \frac{P_0^* - P_t^w}{P_t} \, Y_t^*(z) \right] \]
where $\beta^t C_0 / C_t$ is the intertemporal marginal rate of substitution of the household, who is the owner.
The optimal price setting rule is given by the first order condition
\[ \sum_{t=0}^{\infty} \theta^t E_0 \left\{ \frac{\beta^t C_0}{C_t} \left( \frac{P_0^*}{P_t} \right)^{-\epsilon} Y_t^*(z) \left[ \frac{P_0^*}{P_t} - \frac{\epsilon}{\epsilon - 1} \frac{P_t^w}{P_t} \right] \right\} = 0 \]
Then rewriting (4.13) yields the aggregate price evolution
\[ P_t = \left[ \theta P_{t-1}^{1-\epsilon} + (1-\theta)(P_t^*)^{1-\epsilon} \right]^{1/(1-\epsilon)} \]
Lastly, the government conducts fiscal and monetary policies. Its expenditures $G_t$ are financed by lump sum taxes and revenue from money printing. The government's budget constraint is
\[ G_t = \frac{M_t - M_{t-1}}{P_t} + T_t \]
The monetary policy is characterized by the policy rule
\[ R_t^n = \rho R_{t-1}^n + \zeta \Pi_{t-1} + \epsilon_t^{rn} \qquad (4.14) \]
where $R_t^n$, $\Pi_{t-1}$, and $\epsilon_t^{rn}$ are the nominal interest rate, inflation and the i.i.d. monetary policy shock.

Given the state variables $(K_t, A_t, N_t)$ and the government policies $G_t$, $R_t^n$, a recursive equilibrium is characterized by decision rules for $I_t$, $C_t$, $C_t^e$, $R_t^k$, $R_t$, $X_t$, $q_t$, $Y_t$, $L_t$, $L_t^e$, $\Pi_t$, $K_{t+1}$, $N_{t+1}$ and $\bar{\omega}_t$. They are implicitly defined by
\[ Y_t = C_t + C_t^e + I_t + G_t + \mu \int_0^{\bar{\omega}_t} \omega \, d\Phi(\omega) \, R_t^k q_{t-1} K_t \]
\[ \frac{1}{C_t} = \beta E_t R_{t+1} \frac{1}{C_{t+1}} \]
\[ C_t^e = (1-\gamma) \left[ R_t^k q_{t-1} K_t - \left( R_t + \frac{\mu \int_0^{\bar{\omega}_t} \omega \, d\Phi(\omega) \, R_t^k q_{t-1} K_t}{q_{t-1} K_t - N_{t-1}} \right) (q_{t-1} K_t - N_{t-1}) \right] \]
\[ q_t K_{t+1} = \iota\left( E_t \frac{R_{t+1}^k}{R_{t+1}} \right) N_{t+1} \]
\[ R_t = R_{t+1}^n / \Pi_t \]
\[ E_t(R_{t+1}^k) = E_t \left( \frac{\frac{1}{X_{t+1}} \frac{\alpha_1 Y_{t+1}}{K_{t+1}} + q_{t+1}(1-\delta)}{q_t} \right) \]
\[ q_t = \left[ \Gamma'\left( \frac{I_t}{K_t} \right) \right]^{-1} \quad \text{where } \Gamma(\cdot) \text{ is concave} \]
\[ Y_t = A_t K_t^{\alpha_1} L_t^{\alpha_2} (L_t^e)^{1-\alpha_1-\alpha_2} \]
\[ L_t^e = 1 \]
\[ \frac{1}{C_t} = \nu \, \frac{1}{1 - L_t} \, \frac{1}{\alpha_2 A_t K_t^{\alpha_1} L_t^{\alpha_2 - 1} (L_t^e)^{1-\alpha_1-\alpha_2}} \]
\[ \pi_t = E_{t-1} \left( \frac{\beta \pi_{t+1}}{\kappa X_t} \right) \]
\[ K_{t+1} = \Gamma\left( \frac{I_t}{K_t} \right) K_t + (1-\delta) K_t \]
\[ N_{t+1} = \gamma \left[ R_t^k q_{t-1} K_t - \left( R_t + \frac{\mu \int_0^{\bar{\omega}_t} \omega \, d\Phi(\omega) \, R_t^k q_{t-1} K_t}{q_{t-1} K_t - N_{t-1}} \right) (q_{t-1} K_t - N_{t-1}) \right] + \frac{1}{X_t} (1-\alpha_1) \left( 1 - \frac{\alpha_2}{1-\alpha_1-\alpha_2} \right) A_t K_t^{\alpha_1} L_t^{\alpha_2} \]
\[ R_{t+1} = \left\{ [1 - \Phi(\bar{\omega}_t)] \bar{\omega}_t - (1-\mu) \int_0^{\bar{\omega}_t} \omega \, d\Phi(\omega) \right\} \frac{R_{t+1}^k q_t K_{t+1}}{q_t K_{t+1} - N_{t+1}} \qquad (4.15) \]
The solution of the model is obtained by linearizing the system of equations (4.15) around the steady state.

4.4 Comparison of the Models

In this section, I address the question of which of the two models captures the cyclical properties of net worth better quantitatively. I present an informal assessment before presenting results from applying the econometric method developed in chapter 3, which provides a likelihood-ratio type test that gauges whether two or more competing models capture equally well the same set of statistics observed in the data. One crucial advantage of this framework is that the models are allowed to be misspecified, i.e. they need not be the true data-generating process. An econometrician would vary the parameters of each model and find the parameter values that minimize the weighted distance between the model moments and the moments observed in the data. Then the econometrician can take as the test statistic the difference between the two distances generated by the two estimated models, and compare it with a desired critical value. If the test statistic is greater than the specified critical value, the econometrician rejects the null that both models provide equally good fits for the data.

4.4.1 Informal Comparison of the Models

In Table 4.3, I present the moments from the data (the first four rows), the calibrated Carlstrom and Fuerst (1997) model (rows 5 to 8) and the calibrated Bernanke et al. (1999) model (the last four rows). The parameters used in the simulations are similar to the ones used in the papers. We can see that both calibrated models predict overly strong correlations between net worth and GDP, and between net worth and investment. Meanwhile, they erroneously predict a negative correlation between net worth and risk premium.

4.4.2 Overview of the Formal Comparison Methodology

In evaluating the quantitative performances of the models, both Carlstrom and Fuerst (1997) and Bernanke et al. (1999) rely on calibration.
Calibration is a technique often applied in macroeconomics and also in other fields of economics. Typically, researchers calibrate (a subset of) the model parameters such that certain features of the calibrated model 'match' those in the observed data. For instance, researchers may want to match moments, correlations, impulse responses or other stylized facts of interest. In applications, different theoretical models are often calibrated to match features of the same data set. Comparison among competing models naturally requires metrics that have desirable properties.

If the theoretical (structural) model is correctly specified, calibration can often be viewed as an example of minimum distance estimation (MD) (Gregory and Smith (1993)). For such scenarios, one can apply the usual statistical tools to judge the goodness of fit. However, researchers are often aware that their models are very unlikely to represent the true data generating process. Frequently the theoretical models contain a small number of parameters, while the stylized facts of the data involve many more parameters. In such cases, it is impossible for the theoretical model to match all stylized facts, as it is restricted.

Chapter 3 proposes a formal test for comparison of two misspecified calibrated models. The test is of the likelihood ratio type and is based on the difference of the MD criterion functions corresponding to the two competing models. The argument is that, between two misspecified models, the econometrician should prefer the one that better matches the reduced form characteristics of the data. The procedure is an asymptotic test of the null hypothesis that the two misspecified models provide an equivalent approximation to the data in terms of characteristics of the reduced form model. Vuong (1989) proposed such tests for misspecified models in the maximum likelihood framework. He showed that a preferred model has a smaller Kullback-Leibler distance from the true distribution.

Table 4.3: Moments of Calibrated Models

                             Corr(NW, X)
X             σ(X)/σ(GDP)  X_{t-4}  X_{t-2}  X_{t-1}  X_t    X_{t+1}  X_{t+2}  X_{t+4}
1983Q1-2008Q4 data
NW(market)    2.96         0.45*    0.80*    0.92*    1      0.92*    0.80*    0.45*
GDP           1            0.53*    0.50*    0.47*    0.51*  0.43*    0.32*    0.03
Investment    6.11         0.46*    0.45*    0.41*    0.39*  0.31*    0.22*    -0.02
Risk premium  0.003        -0.07    0.07     0.20*    0.26*  0.29*    0.30*    0.25*
calibrated CF
NW(market)    3.30         0.76     0.87     0.94     1      0.94     0.87     0.76
GDP           1            0.85     0.87     0.88     0.88   0.88     0.89     0.90
Investment    6.66         0.94     0.93     0.95     0.98   0.99     0.94     -0.05
Risk premium  0.05         -0.43    -0.97    -0.99    -0.99  -0.99    -0.99    -0.99
calibrated BGG
NW(market)    1.35         0.98     0.99     0.99     1      0.99     0.98
GDP           1            0.97     0.99     0.99     0.99   0.97     0.96     0.94
Investment    1.14         0.96     0.94     0.91     0.84   0.54     -0.03    -0.82
Risk premium  0.08         -0.45    -0.94    -0.97    -0.95  -0.96    -0.95    -0.93
Note: * denotes significance at 5% level.

The definition of calibration here is similar to Gregory and Smith (1993) and it is viewed as classic minimum distance (CMD) estimation. $Y_n(\omega)$ is a data matrix of sample size $n$ defined on the probability space $(\Omega, \mathcal{F}, P)$. All random quantities are functions of the data $Y_n$. $\kappa \in \mathcal{K} \subset \mathbb{R}^k$ is a $k$-vector of parameters of a structural model. We use $h$ to denote an $m$-vector of parameters of some reduced form model. Its true value $h_0$ depends on the true unknown structural model of the economy and its parameters.
For example, $h$ can be moments, correlations, impulse responses, etc. While the true structural model is unknown, we will assume that the reduced form parameters $h_0$ can be estimated consistently from the data. Let $h_n$ denote a consistent estimator of $h_0$. Given a structural model and a value of $\kappa$, we assume that one can compute analytically the value of the reduced form parameters. Let $\psi \in \Psi \subset \mathbb{R}^j$ be a vector of parameters that corresponds to the second structural model specified by the econometrician. Formally, the competing models $F: \mathcal{K} \to \mathbb{R}^m$ and $G: \Psi \to \mathbb{R}^m$ are mappings from the parameter spaces $\mathcal{K}$ and $\Psi$ into the space of reduced form parameters. We assume $m \ge k$ and $m \ge j$, i.e. the structural models are more restrictive than the reduced form model. $F$ and $G$ are misspecified in the sense that
\[ \inf_{\kappa \in \mathcal{K}} \| h_0 - F(\kappa) \| > 0 \qquad (4.16) \]
and
\[ \inf_{\psi \in \Psi} \| h_0 - G(\psi) \| > 0. \qquad (4.17) \]
The estimated $\kappa$, i.e. the CMD estimator under $F$, is given by the value that minimizes the weighted distance function:
\[ \hat{\kappa}_n(A_n) = \arg\min_{\kappa \in \mathcal{K}} \| A_n (h_n - F(\kappa)) \|^2 \]
where $\| \cdot \|$ is the Euclidean norm and $\{A_n\}$ is a sequence of positive definite weighting matrices that converges in probability to some matrix $A$ with full rank. Similarly the estimated $\psi$, i.e. the CMD estimator under $G$, is given by the value that minimizes the weighted distance function:
\[ \hat{\psi}_n(A_n) = \arg\min_{\psi \in \Psi} \| A_n (h_n - G(\psi)) \|^2 \]
Under suitable assumptions, a Quasi-Likelihood-Ratio (QLR) statistic is defined as
\[ QLR_n(\hat{\kappa}_n(A_n), \hat{\psi}_n(A_n)) = - \| A_n (h_n - F(\hat{\kappa}_n(A_n))) \|^2 + \| A_n (h_n - G(\hat{\psi}_n(A_n))) \|^2 \]
It is shown in chapter 3 that $n \, QLR_n(\hat{\kappa}_n(A_n), \hat{\psi}_n(A_n))$ has a mixed $\chi^2$ distribution asymptotically. When $n \, QLR_n$ is greater than the critical value of a specified significance level, the econometrician will reject the null that both models provide equally good fits for the data.

4.4.3 Formal Comparison of the Models

In this application, I take the target moments $h_0$ to be the cross-correlation coefficients documented in the first four rows of Table 4.3.$^{42}$
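The mechanics of the CMD estimators and the QLR statistic can be illustrated with a toy example. The sketch below is not the procedure of chapter 3: it uses a grid search instead of a numerical optimizer, an identity weighting matrix, and two hypothetical one-parameter "models" (an AR(1)-type autocorrelation pattern and a flat pattern) fitted to the net worth autocorrelations at horizons 1, 2 and 4 from Table 4.1. All function and variable names are assumptions introduced for illustration, and no critical value for the mixed $\chi^2$ distribution is computed.

```python
def weighted_sq_dist(h, m, A):
    """|| A (h - m) ||^2, with A given as a list of rows."""
    d = [hi - mi for hi, mi in zip(h, m)]
    Ad = [sum(a * dj for a, dj in zip(row, d)) for row in A]
    return sum(v * v for v in Ad)

def cmd_estimate(h, model, A, grid):
    """Grid-search CMD estimator: parameter minimizing ||A (h - model(p))||^2."""
    best = min(grid, key=lambda p: weighted_sq_dist(h, model(p), A))
    return best, weighted_sq_dist(h, model(best), A)

def qlr(h, model_f, model_g, A, grid_f, grid_g):
    """QLR statistic: minus the fitted distance of F plus that of G."""
    _, dist_f = cmd_estimate(h, model_f, A, grid_f)
    _, dist_g = cmd_estimate(h, model_g, A, grid_g)
    return -dist_f + dist_g

def model_f(kappa):
    """Hypothetical model F: autocorrelations kappa^1, kappa^2, kappa^4."""
    return [kappa, kappa ** 2, kappa ** 4]

def model_g(psi):
    """Hypothetical model G: the same correlation at every horizon."""
    return [psi, psi, psi]

# Target moments: net worth autocorrelations at horizons 1, 2, 4 (Table 4.1)
h_n = [0.92, 0.80, 0.45]
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # identity weights
grid = [i / 1000 for i in range(1001)]                    # parameter space [0, 1]

kappa_hat, dist_f = cmd_estimate(h_n, model_f, A, grid)
psi_hat, dist_g = cmd_estimate(h_n, model_g, A, grid)
stat = qlr(h_n, model_f, model_g, A, grid, grid)
print(kappa_hat, psi_hat, stat)
```

A positive value of the statistic means that model F lies closer to the target moments than model G; whether the gap is statistically significant depends on the sample size and on the limiting distribution derived in chapter 3.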
The distance metric is the Euclidean norm. To reduce computation time, I fix a number of parameters at values used by Bernanke et al. (1999), as the literature provides relatively good information about their values. To check whether sticky prices are an important factor in the replication of the cyclical properties of net worth, I estimate two versions of the Bernanke et al. (1999) model. In the first one I fix the Calvo price setting parameter $\theta$ at 0, meaning nominal prices are fully flexible. In the second one, I allow $\theta$ to vary in $[0, 1]$ to minimize the distance from the target moments. Therefore the difference between the first version and Carlstrom and Fuerst (1997) mainly lies in the sector in which the financial friction occurs. The second version adds the additional factor of price rigidity. Table 4.4 summarizes the parameter estimates and distances from the target moments. Table 4.5 presents the moments from the estimated models. Application of the econometric test of chapter 3 suggests that 1) the difference in fits (measured as distances) of the Carlstrom and Fuerst (1997) model and the second version of the Bernanke et al. (1999) model is statistically insignificant at the 10% level; and 2) the fit of the first estimated version of Bernanke et al. (1999) is significantly worse than that of the other two estimated models, at the 1% level. By this standard, it appears that price rigidity plays an important role in the Bernanke et al. (1999) model.

$^{42}$ In general it is possible to add the relative volatilities as targets, but it is not clear how to weight them against the correlation coefficients in the distance function.

From Table 4.5 we can see that the first estimated version of the Bernanke et al. (1999) model fails to capture the positive correlations between net worth and leads of GDP, and the positive correlations between net worth and risk premium. The other two models have reasonable performance in replicating the positive co-movement of net worth with GDP and investment.
However, they only partially reproduce the correlation between net worth and risk premium. This difficulty may stem from the built-in mechanism of both papers through which an increase in net worth lowers the risk premium, holding other factors constant. In a general equilibrium framework, a lower risk premium will face pressure of upward adjustment as firms have an incentive to borrow more. However, for an increase in net worth to cause an increase in the risk premium, investment may have to go up substantially more than net worth, which is not very plausible in either model.

Table 4.4: Parameters Estimates of CF and BGG

Name   Meaning                                          Range   CF        BGG 1     BGG 2
α_1    capital share                                    -       0.360     0.360     0.360
                                                                fixed     fixed     fixed
α_2    share of non-entrepreneur labour                 -       0.630     0.630     0.630
                                                                fixed     fixed     fixed
β      discount factor                                  -       0.990     0.990     0.990
                                                                fixed     fixed     fixed
δ      depreciation rate                                -       0.020     0.020     0.020
                                                                fixed     fixed     fixed
μ      bankruptcy cost as a fraction of revenue         -       0.120     0.120     0.120
θ      fraction of firms that can't reset prices        [0,1]   -         0         0.792
                                                                -         fixed     (0.194)
γ      extra discount factor of entrepreneurs           [0,1]   0.982     0.710     0.798
                                                                (0.527)   (0.131)   (0.258)
ν      coefficient on leisure in preference             [0,1]   0.601     0.590     0.590
                                                                (0.153)   (0.189)   (0.276)
ρ_A    persistence of TFP shock                         [0,1]   0.696     0.974     0.965
                                                                (0.073)   (0.253)   (0.337)
σ_A    std of TFP shock                                 -       0.010     0.010     0.010
                                                                fixed     fixed     fixed
ρ      persistence of nominal interest rate             -       -         0.9       0.9
                                                                -         fixed     fixed
ζ      responsiveness of nominal interest rate          -       -         0.11      0.11
       to inflation in Taylor rule                              -         fixed     fixed
d      Euclidean distance between model                 -       2.1375    3.2466    2.1871
       and data moments
Note: the numbers in parentheses are standard errors.

Table 4.5: Moments of Estimated Models

                             Corr(NW, X)
X             σ(X)/σ(GDP)  X_{t-4}  X_{t-2}  X_{t-1}  X_t    X_{t+1}  X_{t+2}  X_{t+4}
1983Q1-2008Q4 data
NW(market)    2.96         0.45*    0.80*    0.92*    1      0.92*    0.80*    0.45*
GDP           1            0.53*    0.50*    0.47*    0.51*  0.43*    0.32*    0.03
Investment    6.11         0.46*    0.45*    0.41*    0.39*  0.31*    0.22*    -0.02
Risk premium  0.003        -0.07    0.07     0.20*    0.26*  0.29*    0.30*    0.25*
Estimated CF
NW(market)    3.07         0.26     0.59     0.64     1      0.64     0.59     0.26
GDP           1            0.62     0.93     0.96     0.73   0.70     0.70     0.69
Investment    6.65         0.79     0.96     0.96     0.42   0.64     0.59     0.24
Risk premium  0.0001       -0.71    -0.73    -0.70    0.08   0.61     0.52     -0.07
Estimated BGG 1
NW(market)    1.35         0.22     0.34     0.33     1      0.33     0.34     0.22
GDP           1            -0.45    -0.43    -0.47    0.64   0.85     0.09     0.22
Investment    1.12         0.45     0.43     0.48     0.64   0.78     0.08     -0.22
Risk premium  0.56         -0.88    -0.90    -0.90    -0.44  0.05     -0.35    -0.22
Estimated BGG 2
NW(market)    0.98         0.78     0.89     0.63     1      0.63     0.89     0.78
GDP           1            0.99     0.99     0.87     0.86   0.84     0.82     0.75
Investment    9.53         0.64     0.67     0.12     0.60   0.81     0.83     0.75
Risk premium  0.39         0.51     0.50     -0.12    0.40   -0.28    0.25     -0.33
Note: * denotes significance at 5% level.

4.5 Conclusion

This chapter is an exploration of the role of net worth in business cycles in the U.S. since 1952. It documents that 1) net worth is pro-cyclical; 2) net worth has had higher volatility since 1983; and 3) the synchronization of net worth with GDP, investment and interest rates has been stronger since 1983. Application of the formal econometric test developed in chapter 3 suggests that both the Carlstrom and Fuerst (1997) model and the Bernanke et al. (1999) model can capture the cyclical properties of net worth reasonably well. In addition, price rigidity seems to play an important role in the Bernanke et al. (1999) model, as it improves the quantitative performance of the model significantly. However, both models can only partially capture the positive correlation between risk premium and net worth.
Given that both models have a built-in mechanism which tends to predict a negative or at best zero correlation, it remains to be explored whether other omitted factors contribute to such a positive relationship.

Bibliography

Bernanke, Ben S. and Mark Gertler, “Inside the Black Box: The Credit Channel of Monetary Policy Transmission,” Journal of Economic Perspectives, Fall 1995, 9 (4), 27–48.

Bernanke, Ben S., Mark Gertler, and Simon Gilchrist, “The financial accelerator in a quantitative business cycle framework,” in J. B. Taylor and M. Woodford, eds., Handbook of Macroeconomics, Vol. 1 of Handbook of Macroeconomics, Elsevier, 1999, chapter 21, pp. 1341–1393.

Carlstrom, Charles T. and Timothy S. Fuerst, “Agency Costs, Net Worth, and Business Fluctuations: A Computable General Equilibrium Analysis,” American Economic Review, December 1997, 87 (5), 893–910.

Chen, Nan-Kuang, “Bank net worth, asset prices and economic activity,” Journal of Monetary Economics, 2001, 48 (2), 415–436.

Gale, Douglas and Martin Hellwig, “Incentive-Compatible Debt Contracts: The One-Period Problem,” Review of Economic Studies, October 1985, 52 (4), 647–63.

Gregory, Allan W. and Gregor W. Smith, “Statistical Aspects of Calibration in Macroeconomics,” in G. S. Maddala, C. R. Rao, and H. D. Vinod, eds., Handbook of Statistics, Vol. 11, Amsterdam: North-Holland, 1993, chapter 25, pp. 703–719.

Kiyotaki, Nobuhiro and John Moore, “Credit Cycles,” Journal of Political Economy, April 1997, 105 (2), 211–48.

Meh, Césaire and Kevin Moran, “The Role of Bank Capital in the Propagation of Shocks,” Working Papers 08-36, Bank of Canada, October 2008.

Townsend, Robert M., “Optimal contracts and competitive markets with costly state verification,” Journal of Economic Theory, October 1979, 21 (2), 265–293.

Vuong, Quang H., “Likelihood Ratio Tests For Model Selection and Non-Nested Hypotheses,” Econometrica, 1989, 57 (2), 307–333.

Williamson, Stephen D., “Costly Monitoring, Loan Contracts, and Equilibrium Credit Rationing,” The Quarterly Journal of Economics, February 1987, 102 (1), 135–45.

Chapter 5

Conclusion

The second chapter of the dissertation studies whether exchange rate appreciations affect productivity growth. In the theoretical part, I introduce industry heterogeneity to the industrial organization environment of Dornbusch (1987) and adopt the assumption of disruptive technological change of Holmes, Levine and Schmitz (2008) to study the effect of increased competition due to an exchange rate appreciation. Exchange rate appreciation can provide an incentive to upgrade technology, since when the exchange rate appreciates, the profit loss due to adjustment to the new technology, which is part of the opportunity cost of a technology upgrade, is low. In addition, industry heterogeneity plays an important role, in the sense that different industries are affected by the same exchange rate appreciation in different manners. Firms in industries shielded by high trade costs are less likely to respond. Meanwhile, among firms in highly traded industries, those in highly concentrated industries will invest more in technology upgrades, as their profit is more responsive to changes in technology.

The empirical section of the chapter adds to the limited empirical knowledge on the topic by looking into the productivity performance of Canadian manufacturing industries between 1997 and 2006. In essence, this chapter regards the major Canadian dollar appreciation between 2002 and 2006 as driven mostly by commodity prices and exogenous to the manufacturing industries.
After examining the path of productivity growth after the appreciation of the Canadian dollar, I find evidence that, compared to industries that trade less, highly traded industries experienced faster productivity growth, and that among these industries there is a positive relationship between concentration and productivity improvement.

However, when studying productivity growth with the publicly available data, it is impossible to control perfectly for the effect of entry and exit of firms. In the regression analysis, I control for the change in the number of establishments, but have no information about the size of the entering and exiting firms. In the future, the empirical evidence can be advanced in two directions. Firstly, access to firm-level datasets in Canada would help to get a better picture of what firms did after appreciations, and to gain information on the size of establishments moving in and out of an industry. Secondly, it would be interesting to look into evidence from other countries with similar appreciation experiences.

To tackle the issue of comparing different calibrations, a popular quantitative practice in economics, the third chapter proposes econometric testing procedures for comparison of misspecified calibrated models. Relative to the previous literature, we explicitly allow the models of interest to be misspecified in a frequentist framework. The test here is similar to that of Vuong (1989) and Rivers and Vuong (2002), and can be viewed as the frequentist counterpart of Schorfheide (2000). The null hypothesis of our test is that both models provide equal fit to some characteristics of the data, against the alternative that one performs better. The fit of a model can be interpreted as an in-sample forecast performance. We consider the cases where models are nested, non-nested, and overlapping.
We also extend the test to the case where the model parameters are estimated to match one set of data characteristics while the model evaluation is based on another set of data characteristics. Since the model comparison often depends on the weights associated with the data characteristics, this chapter considers averaged and sup tests to alleviate the dependence on the choice of weighting matrix. Overall, the proposed method is applicable to many calibration practices where researchers are interested in the fits of different models.

The fourth chapter applies the method developed in Chapter 3 to examine the fits of two leading macroeconomic models with financial frictions, the Carlstrom and Fuerst (1997) model and the Bernanke, Gertler and Gilchrist (1999) model. In both models, due to financial frictions, a higher level of net worth helps a firm to obtain external finance, as it allows the firm to post more collateral and to better align its interests with those of the creditor. The two papers have shown that net worth can propagate technology shocks and monetary shocks, as positive shocks lead to a higher level of net worth, which facilitates more borrowing and investment in the future. Since net worth plays a critical role in both models, this chapter addresses the question of which model better replicates the cyclical properties of net worth. The econometric test results indicate that both perform reasonably well. Interestingly, price rigidity seems to play an important role in the Bernanke et al. (1999) model, as it improves the quantitative performance of the model. However, the models can only partially account for the positive correlation between the risk premium and net worth.

Bibliography

Bernanke, Ben S., Mark Gertler, and Simon Gilchrist, “The financial accelerator in a quantitative business cycle framework,” in J. B. Taylor and M. Woodford, eds., Handbook of Macroeconomics, Vol. 1 of Handbook of Macroeconomics, Elsevier, 1999, chapter 21, pp. 1341–1393.

Carlstrom, Charles T and Timothy S Fuerst, “Agency Costs, Net Worth, and Business Fluctuations: A Computable General Equilibrium Analysis,” American Economic Review, December 1997, 87 (5), 893–910.

Dornbusch, Rudiger, “Exchange Rates and Prices,” The American Economic Review, 1987, 77 (1), 93–106.

Holmes, Thomas J., David K. Levine, and James A. Schmitz, “Monopoly and the Incentive to Innovate When Adoption Involves Switchover Disruptions,” NBER Working Paper, 2008, No. W13864.

Rivers, D. and Q. Vuong, “Model Selection Tests For Nonlinear Dynamic Models,” Econometrics Journal, 2002, 5 (1), 1–39.

Schorfheide, Frank, “Loss Function-Based Evaluation of DSGE Models,” Journal of Applied Econometrics, 2000, 15, 645–670.

Vuong, Quang H., “Likelihood Ratio Tests For Model Selection and Non-Nested Hypotheses,” Econometrica, 1989, 57 (2), 307–333.
Three essays in macroeconomics and international economics Tang, Yao 2009
Item Metadata
Title | Three essays in macroeconomics and international economics |
Creator | Tang, Yao |
Publisher | University of British Columbia |
Date Issued | 2009 |
Description | This dissertation examines two issues in international economics and macroeconomics. The first is to understand the response of productivity to major real exchange rate appreciations and the second concerns how to compare the fits of different calibrated macroeconomic models. In the first chapter, I construct a model to clarify how the increased competition due to an exchange rate appreciation provides incentive for firms to improve productivity. However, if a firm is in an industry shielded by a high trade cost, then the incentive is weaker. In industries with fewer firms, profits are more responsive to productivity improvements, therefore, firms are more likely to invest more heavily in productivity improvement. Empirical analysis of Canadian manufacturing data from 1997 to 2006 finds evidence consistent with the model predictions. The second chapter presents testing procedures for comparison of misspecified calibrated models. The proposed tests are of the Vuong-type (Vuong, 1989; Rivers and Vuong, 2002). In the framework here, an econometrician selects values for the parameters in order to match some characteristics of the data with those implied by the theoretical model. We assume that all competing models are misspecified, and suggest a test for the null hypothesis that all considered models provide equal fit to the data characteristics, against the alternative that one of the models is a better approximation. The Carlstrom and Fuerst (1997) model and the Bernanke, Gertler and Gilchrist (1999) model are two leading models that study financial frictions in macroeconomic models. In particular, these models show that due to financial frictions, net worth plays an important role in obtaining external finance, and that at an aggregate level, net worth can propagate technology shocks and monetary shocks. However, neither paper examines whether the models can reproduce cyclical properties of net worth. 
The third chapter addresses this issue by applying the comparison method developed in the second chapter. Results indicate both models do reasonably well. In addition, price rigidity seems to play an important role in the latter model. However, both models can only partially capture the positive correlation between risk premium and net worth. |
Extent | 966674 bytes |
Genre | Thesis/Dissertation |
Type | Text |
File Format | application/pdf |
Language | eng |
Date Available | 2009-11-09 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0068114 |
URI | http://hdl.handle.net/2429/14709 |
Degree | Doctor of Philosophy - PhD |
Program | Economics |
Affiliation | Arts, Faculty of; Vancouver School of Economics |
Degree Grantor | University of British Columbia |
Graduation Date | 2010-05 |
Campus | UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
Aggregated Source Repository | DSpace |